Monday, March 30, 2009

Best practices for installing service packs, cumulative updates and hotfixes for SQL Server

Here is the short version:

  1. Test on a test/dev SQL Server first and only after you have confirmed that all applications are working as expected then install it on a production SQL Server.
  2. Review the Readme for Service Pack/cumulative update/hotfix. Any concerns/recommendations will be found in the readme.
  3. Run DBCC CHECKDB on ALL databases (user and system databases) and ensure that there were no errors reported.
  4. Backup ALL databases (user and system databases) and full-text catalogs (if applicable). This is NOT required but highly recommended.
  5. Stop Monitoring and Anti-virus services
  6. Make sure you have the proper permissions to install (administrative privilege on server/cluster node)
  7. The below points are for clustered SQL Server instance

· Make sure ALL SQL resources come online on ALL cluster nodes

· Make sure that ALL disk resources (even the ones that not being used by SQL Server) are online and not in failed state.

· Verify that there are no dependencies other than those created by the SQL Server setup on any SQL Server cluster resources.

· Run MPSRPT_SQL.exe on all cluster nodes (not required but recommended)

· LOG OFF of all other nodes except the one from which you are running setup. Make sure all remote desktop connections are closed. You can connect to the node (you are running the setup from) using remote desktop connections but you should disconnect any remote connections to other cluster nodes.

· Make sure SQLSERVER.EXE and its disks are on the node from which you are running setup. We can run hotfix, Cumulative Updates, and Service Pack installs on a passive node (i.e a node where sqlservr.exe is not running). There is a switch called /passive that be used to do this.

 

For the long version, check out Installing SQL Server 2005.

 

Thanks to Uttam Parui for starting the email thread that resulted in this list.

Tuesday, March 3, 2009

Look for 825 Errors = Save yourself from Doom

SQL Server throws and 825 error when it could not read a page between one a three times, but only at level 10. Add a specific Agent alert for error 825 to get a jump on IO system problems.

From a post on SQLSkills.com by Paul S. Randal - http://www.sqlskills.com/BLOGS/PAUL/post/A-little-known-sign-of-impending-doom-error-825.aspx for more details

Monday, March 2, 2009

Testing Your New I/O Hardware

SQLIOStress simulates a very heavy SQL Server workload and should do a far better job of discovering flaws in your hardware setup than individual hardware vendors' diagnostics will.

Trace flag 806 will cause 'DBCC-style' page auditing to be performed whenever a database page is read into the buffer pool.

Trace flag 3422 will cause auditing of transaction log records as they're read win

 

Thanks to Paul S. Randal from SQLSkills for these tips.  Details at

http://www.sqlskills.com/blogs/paul/post/How-to-tell-if-the-IO-subsystem-is-causing-corruptions.aspx

Sunday, March 1, 2009

Checksum

Checksum is only applied to an updated database when pages are read into cache, changed and written back to disk. Just turning on the CheckSum option makes no change to the pages. The checksum value is only calculated and applied once the page is read into cache, changed and written back to disk.

The only fix for a corrupt page is to delete the page and fix the links

Always use the WITH CHECKSUM option when taking backups if checksum option is on

CONTINUE_AFTER_ERROR option will force backup to do just that. CONTINUE_AFTER_ERROR will also allow a corrupt backup to restore.

Thanks to Paul S. Randal from SQLSkills for these tips. Details at

http://www.sqlskills.com/blogs/paul/post/Example-20002005-corrupt-databases-and-some-more-info-on-backup-restore-page-checksums-and-IO-errors.aspx