Relevant Technology: SQL Server 2008 R2 RAID 5 (4 disk) Windows Server 2008
To preface, our RAID 5 array had a disk partially fail. No failure was detected, but the drive light flashed amber periodically (solid amber indicates drive failure) after an unplanned power outage over the weekend and failure of the UPS. The outage was on a Saturday and I found the light Tuesday after noticing a "PAGEIOLATCH_SH" error and reading the post What is PAGEIOLATCH_SH wait type in SQL Server? (among others). We've replaced the drive and let it rebuild, but I'm still seeing the error.
The query is against a large table through a view which has several indexes on the underlying tables. I've rebuilt the indexes, re-saved the view in hopes of a better execution path, and simplified the query. Nothing has fixed the problem. The query has run without issue since 2006 and had no issues with the upgrade to SQL Server 2008 or to R2, both of which were applied when they first became available.
Originally the execution plan showed a fairly even distribution, but now it shows the majority on the second item, "Sort (Distinct Sort)", with about 30% split among Index Seeks. The time used to be between 2 and 10 seconds but now is upwards of 2 minutes.
At this point I'm not sure how to isolate what is causing the issue. I assume it's either corrupt data that I'm not finding or that the query has re-optimized itself to something that is far from optimal, or there's something wrong with the RAID that doesn't set off any lights or warnings.
I've done what PAGEIOLATCH_SH and similar issues usually require, and the indexes not only seem right, but have worked for years up to this point. I've also done everything I know to do to make sure the drive is working. My question is basically how do I go about diagnosing the root of the problem in this situation?
EDIT: Found that the server did not actually go down with the power outage, but the rack beside it did. Not sure why the drive partially failed, but at this point it seems to be coincidental to the outage.