Submitted by Richard Cave on Fri, 2008-03-07 13:01.

We had a "cascade event" at about 5:45am that backed up Mulgara. A cascade event happens when a single query pulls so much data from Mulgara that all other queries have to wait for it to finish. This happens infrequently and, depending on site traffic, Mulgara is able to gracefully work through the backlog of requests. This morning, however, Mulgara was unable to clear the queue. The offending query is hard to track down (the saying "needle in a haystack" is appropriate), so we've raised the logging levels to trace all queries sent to Mulgara.

Unfortunately, we also had a human error this morning. When I restarted the applications after the cascade event, a new article XSL file caused articles to throw errors while parsing the article XML. Articles were serving "page not found", which resulted in more cascade events until I could revert the XSL file. We had tested the XSL file on dev/stage without errors, so something else must have snuck into the release.

Now that the XSL file is fixed, the sites are purring. We've also identified another cache fix that will significantly help performance, and a patch should be up today.
