Submitted by Richard Cave on Mon, 2008-07-14 15:38.
We experienced a hardware malfunction yesterday that took the TOPAZ-hosted journals offline from 4pm to 10pm PST.
James sent an email late yesterday afternoon reporting site errors on the PLoS journal websites. Soon after his email, the IT team started receiving SMS alerts. I assumed that something had gone wrong with the Topaz framework and started looking at the appropriate log files, but couldn't find anything. I spent the requisite amount of time banging my head against the "site error" wall without success, then called Russ for assistance. After a bit of digging through the server logs, he found the culprit: a drive had failed on the Mulgara server. This drive is part of a RAID 0 configuration, so we didn't lose any data, but we also mysteriously lost the connection from the Mulgara server to the DAS array (the disk storage for the Mulgara data).