A DBA Guide to SQL Server Performance Troubleshooting

Troubleshooting SQL Server performance is a multi-faceted task. Performance is influenced by a myriad of hardware and software factors. Proactive monitoring of these parameters is key to preventing potential performance degradation. However, issues inevitably arise. When performance degradation occurs, the initial step is to identify the problem, pinpoint its root cause, and implement a solution. While severe performance problems demand immediate attention, less critical issues can be addressed with a more measured approach. In all cases, DBAs should aim to prevent future occurrences.

It’s important to distinguish between a temporary slowdown and a genuine performance problem. Resource-intensive actions are expected to occasionally stress the server. These should not be confused with chronic performance issues. A key aspect of being a DBA is understanding the baseline behavior of the system under normal conditions. This baseline information, gathered by monitoring resource usage, allows for quicker identification and diagnosis of performance deviations.

To effectively manage your SQL Server environment, consistent database monitoring is crucial. Track performance metrics over time to establish baselines and identify trends during normal operation. Isolate processes that consume excessive resources. This enables effective debugging and resolution of performance bottlenecks.

Diagnosing SQL Server Performance Problems

The first step in troubleshooting is accurately defining the symptoms. Common SQL Server performance indicators include CPU, memory, network, and I/O bottlenecks, as well as slow-running queries.

CPU Bottlenecks: These are often caused by insufficient hardware resources. The troubleshooting process begins by identifying the processes that consume the most CPU resources. While occasional spikes in CPU usage are usually not a cause for concern, consistently high CPU utilization requires investigation. Simply adding more processors or using a more powerful CPU may not always resolve the problem, especially if poorly designed processes are the culprit. Optimizing queries, improving execution plans, and reconfiguring the system can be more effective solutions. A best practice to minimize bottlenecks is to dedicate a server exclusively to SQL Server and migrate all other software to a separate machine.

Memory Bottlenecks: These can manifest as slow application responsiveness, overall system slowdown, or even application crashes. It’s important to determine when the system is running with insufficient memory, which applications are the largest memory consumers, and whether there are bottlenecks affecting other system resources. Optimizing queries, reconfiguring memory settings, and adding physical memory can often improve performance.

Network Bottlenecks: Network issues can sometimes be mistaken for SQL Server performance problems caused by other resources. For example, delays in data transmission over the network can appear as slow SQL Server response times.

I/O Bottlenecks: These occur when there is excessive reading and writing of database pages from and onto the disk. Symptoms include long response times, application slowdowns, and task timeouts. If other applications are also heavily using disk resources, SQL Server may not have sufficient resources for its normal operation and will be forced to wait.

Slow-Running Queries: These can stem from various factors, including missing indexes, inefficient execution plans, poor application and schema design, and more.

Common Root Causes of SQL Server Performance Issues

Performance problems often arise from a number of common causes:

  • Poorly Designed Database: A poorly designed database schema can lead to inefficient queries and slow performance.
  • Incorrectly Configured System: Improper system configuration can create bottlenecks and hinder SQL Server performance.
  • Insufficient Disk Space or System Resources: Lack of adequate disk space or other system resources can significantly impact performance.
  • Excessive Query Compilation and Recompilation: Frequent query compilation and recompilation can consume significant resources and slow down performance.
  • Bad Execution Plans: Missing or outdated statistics can lead to poor execution plans, resulting in slow query execution.
  • Queries or Stored Procedures with Long Execution Times: Inefficiently designed queries or stored procedures can take an excessive amount of time to execute.
  • Memory Bottlenecks: Limited available memory and memory pressure caused by SQL Server, the operating system, or other applications can lead to performance degradation. Poor indexing can also contribute to memory bottlenecks by requiring full table scans, which involve reading a large number of rows from disk into memory.
  • Network Bottlenecks: Overloads on the server or network infrastructure can impede data flow and cause performance issues.
  • I/O Issues: These can be caused by slow hardware, poor storage solution design, and configuration problems. In addition to hardware factors like disk types, disk array types, and RAID configurations, unnecessary database requests can also contribute to I/O bottlenecks. Frequent index scans, inefficient queries, and outdated statistics can all increase I/O workload.

Key Performance Metrics to Monitor

To effectively troubleshoot SQL Server performance, it’s essential to monitor specific metrics:

  • % Processor Time: This Performance Monitor counter indicates the percentage of time the processor is busy. If this value consistently exceeds 80%, the processor is under pressure.

The following counters help identify the specific causes of processor pressure:

  • Batch Requests/sec: This counter measures the number of batch requests processed per second.
  • SQL Compilations/sec: This counter tracks the number of SQL compilations per second.
  • SQL Re-Compilations/sec: This counter measures the number of SQL recompilations per second.

These counters are available in Performance Monitor and in the sys.dm_os_performance_counters view.

SELECT *
FROM sys.dm_os_performance_counters
WHERE counter_name IN ('Batch Requests/sec', 'SQL Compilations/sec', 'SQL Re-Compilations/sec');

Alt text: SQL Server Performance Monitor showing key metrics for troubleshooting processor pressure, including Batch Requests/sec, SQL Compilations/sec, and SQL Re-Compilations/sec.

Note that the counter type for these is 272696576 and the values are cumulative since the last SQL Server start. You will need to calculate the difference between samples taken at different times to get the actual values. For example, this script shows how to calculate Batch Requests/sec:

DECLARE @BatchRequests BIGINT;

SELECT @BatchRequests = cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Batch Requests/sec';

WAITFOR DELAY '00:00:10';

SELECT (cntr_value - @BatchRequests) / 10 AS 'Batch Requests/sec'
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Batch Requests/sec';

The Batch Requests/sec value depends on the hardware, but it should generally be under 1000. For SQL Compilations/sec, the recommended value is less than 10% of Batch Requests/sec, and for SQL Re-Compilations/sec, it should be less than 10% of SQL Compilations/sec.

For monitoring memory issues, use the following counters:

  • Memory Available KB: This Performance Monitor counter shows the amount of available memory in kilobytes. Normal values should be above 200 MB. If this value remains below 100 MB for an extended period, it indicates insufficient memory on the server.

The following counters are also valuable and are available via the sys.dm_os_performance_counters view:

  • Total Server Memory (KB)
  • Target Server Memory (KB)

Another useful counter is:

  • Pages/sec: This Performance Monitor counter displays the rate at which pages are written to and read from disk. Values exceeding 50 indicate intensive memory activity and potential memory pressure, which can negatively impact SQL Server performance.

The following counters can help determine if dirty pages are being flushed to disk too frequently:

  • Checkpoint pages/sec
  • Lazy writes/sec

Dirty pages are automatically flushed to disk at a checkpoint. If available free space in the buffer cache between checkpoints is low, a lazy write will occur to free up memory. The Lazy Writes/sec value should ideally be below 20. Both counters are accessible in Performance Monitor and the sys.dm_os_performance_counters view. Because the counter type is 272696576, the values returned by the view must be calculated.

If the Lazy Writes/sec value consistently exceeds the threshold, examine the Page Life Expectancy value. Values below 300 seconds suggest memory pressure. This counter is available in Performance Monitor and the sys.dm_os_performance_counters view and does not require additional calculation.

  • Buffer Cache Hit Ratio: This counter represents the percentage of data pages found in the SQL Server buffer cache compared to all data page requests. If a page is not found in the buffer cache, it must be read from disk, which slows performance. The recommended value is above 90%. This counter is available in Performance Monitor and the sys.dm_os_performance_counters view.

Because the counter type is 537003264, the value returned by the view must be calculated using the Buffer Cache Hit Ratio Base value:

Buffer Cache Hit Ratio % = 100 * Buffer Cache Hit Ratio / Buffer Cache Hit Ratio Base = 100 * 1,797 / 1,975 = 90.98%

To diagnose network issues, begin by identifying queries, functions, and stored procedures with slow response times. If these execute quickly individually, but there are long delays between calls, a network problem may be indicated. SQL Server Profiler can be used to pinpoint the specific queries, functions, and stored procedures being executed.

For I/O problems, monitor the following disk-related counters in Performance Monitor: Average Disk Queue Length, Average Disk Sec/Read, Average Disk Sec/Write, %Disk Time, Average Disk Reads/Sec, and Average Disk Writes/Sec.

  • Average Disk Queue Length: Shows the average number of I/O operations waiting to be written to or read from disk, including currently processed reads and writes. The recommended value is below 2 per individual disk. Higher values suggest I/O bottlenecks.

  • Average Disk Sec/Read: Shows the average time in seconds required to read data from the disk. Under 8ms is excellent performance, while above 20ms indicates a serious I/O issue.

  • Average Disk Sec/Write: Shows the average time in seconds required to write data to the disk. Excellent performance is below 1ms, while above 4ms is considered poor.

  • Average Disk Reads/Sec and Average Disk Writes/Sec: Show the rate of read and write operations on the disk, respectively. Low values indicate slow disk I/O processing, and it is recommended to check processor usage and disk-intensive queries. Normal values depend on disk specifications and server configuration. Monitor these metrics over time to determine trends and establish a baseline, as there is no fixed threshold for these counters.

SQL Server performance can be impacted by many factors. Effective troubleshooting requires a clear starting point, understanding of normal performance counter values, and selection of appropriate tools for analysis.

Resources

Table of contents

A Dba Guide To Sql Server Performance Troubleshooting – Part 1 – Problems and performance metrics
A DBA guide to SQL Server performance troubleshooting – Part 2 – Monitoring utilities

Milena Petrovic

Milena is a SQL Server professional with more than 20 years of experience in IT. She has started with computer programming in high school and continued at University. She has been working with SQL Server since 2005 and has experience with SQL 2000 through SQL 2014.

Her favorite SQL Server topics are SQL Server disaster recovery, auditing, and performance monitoring.

View all posts by Milena “Millie” Petrovic

Latest posts by Milena Petrovic (see all)

Comments

No comments yet. Why don’t you start the discussion?

Leave a Reply

Your email address will not be published. Required fields are marked *