SQL Server
Architecture
DATABASE
TRANSACTIONS
The most well-known feature of a transaction is that it must complete all of its commands in their entirety or none of them. This feature, called atomicity, is just one of four properties defined in the early days of database theory as requirements for a database transaction, collectively known as the ACID properties.
ACID Properties
1. Atomicity
2. Consistency
3. Isolation
4. Durability
Atomicity
All the effects of the transaction must complete successfully or the changes are rolled back. An example of an atomic transaction is a withdrawal from an ATM; the machine must both dispense the cash and debit your bank account.
Consistency
The consistency requirement ensures that the transaction
cannot break the integrity rules of the database; it must leave the database in
a consistent state.
Examples:
1. Stock levels cannot be a negative value
2. A spare part cannot exist without a parent object
3. A Sex field must be Male / Female
In order to be
consistent, a transaction must not break any of the constraints or rules
defined for the data.
Isolation
Isolation means keeping the changes of incomplete transactions that are running at the same time separate from one another. The changes a transaction makes must not be readable by any other transaction, although SQL Server does allow you to control the degree of isolation in order to find a balance between business and performance requirements.
Durability
Once a transaction is committed, it must persist even if
there is a system failure – that is, it must be durable.
The information needed to replay changes made in a transaction is
written to the transaction log before the transaction is considered to be
committed.
SQL SERVER TRANSACTIONS
There are two types of transactions in SQL Server, differentiated only by the way they are created: implicit and explicit.
Implicit transactions are used automatically by SQL Server to guarantee the ACID properties of single commands. For example, for an UPDATE statement that modifies 10 rows, SQL Server runs it as an implicit transaction so that the ACID properties apply, and either all 10 rows are updated or none of them are.
Explicit transactions are started by using the BEGIN TRANSACTION command and are stopped by using the COMMIT TRANSACTION or ROLLBACK TRANSACTION commands. Explicit transactions are used to group together changes to which you want to apply the ACID properties as a whole, which also enables you to roll back the changes at any point if your business logic determines that you should cancel the change.
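As a minimal sketch (the Accounts table and its columns are hypothetical, used only for illustration), an explicit transaction groups two updates so both succeed or both are rolled back:

BEGIN TRANSACTION;
BEGIN TRY
    UPDATE Accounts SET Balance = Balance - 100 WHERE AccountID = 1;  -- debit
    UPDATE Accounts SET Balance = Balance + 100 WHERE AccountID = 2;  -- credit
    COMMIT TRANSACTION;   -- make both changes permanent
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;   -- undo both changes if anything failed
END CATCH;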
THE LIFE
CYCLE OF A QUERY (fig 1:1)
To introduce the high-level components of SQL Server's architecture, this section uses the example of a query's life cycle to put each component into context, in order to foster your understanding and create a foundation for the rest of the book.
The basic SELECT query life cycle
The Relational & Storage Engines
SQL Server is split into two main engines:
1. The Relational Engine
2. The Storage Engine
The Relational engine is also sometimes called the query
processor because its primary function is query optimization and execution.
It contains a command parser to
check query syntax and prepare query trees, a Query Optimizer that is arguably
the crown jewel of any database system, and a Query Executor responsible for
execution.
The Storage Engine is responsible for managing all I/O to the data. It contains the Access Methods code, which handles I/O requests for rows, indexes, pages, allocations, and row versions, and a Buffer Manager, which deals with SQL Server's main memory consumer, the buffer pool.
THE BUFFER
POOL
The largest consumer of memory in SQL Server, the buffer pool contains all the different caches in SQL Server, including the plan cache and the data cache, which are covered as the following sections trace the query through its life cycle.
A BASIC
SELECT QUERY
We start at the client, where the first component you touch is the SQL Server Network Interface (SNI).
SQL Server Network Interface (SNI)
SNI is a protocol layer that establishes the network connection between the client and the server. SNI does not connect by itself; you need to configure a network protocol on both the client and the server.
- Shared memory: Simple and fast, shared memory is the default protocol used to connect from a client running on the same computer as SQL Server.
- TCP/IP: TCP/IP is the most commonly used access protocol for SQL Server. It enables you to connect to SQL Server by specifying an IP address and a port number. Typically, this happens automatically when you specify an instance to connect to. The default TCP port is 1433, or the SQL Browser service will find the right port for a named instance using UDP port 1434 (see the sqlcmd sketch after this list).
- Named Pipes: TCP/IP and Named Pipes are comparable protocols in the architecture in which they can be used. Named Pipes was developed for LANs, but it can be inefficient across slower networks such as WANs.
- To use Named Pipes you first need to enable it in SQL Server Configuration Manager (when connecting remotely) and then create a SQL Server alias, which connects to the server using Named Pipes as the protocol.
- Named Pipes uses TCP port 445, so ensure that the port is open on any firewalls between the two computers, including the Windows Firewall.
- VIA: Virtual Interface Adapter is a protocol that enables high-performance communications between two systems. It requires specialized hardware at both ends and a dedicated connection. Like Named Pipes, to use the VIA protocol you first need to enable it in SQL Server Configuration Manager and then create a SQL Server alias that connects to the server using VIA as the protocol.
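A quick sketch (the server name MYSERVER is hypothetical) of forcing a particular protocol from the sqlcmd command line by using a protocol prefix:

REM Connect over TCP/IP to the default port
sqlcmd -S tcp:MYSERVER,1433

REM Connect using Named Pipes
sqlcmd -S np:\\MYSERVER\pipe\sql\query

REM Connect using shared memory (local connections only)
sqlcmd -S lpc:MYSERVER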
- TDS (Tabular Data Stream) Endpoints
TDS is a Microsoft-proprietary protocol, originally designed by Sybase, that is used to interact with a database server. Once a connection has been made using a network protocol such as TCP/IP, a link is established to the relevant TDS endpoint, which then acts as the communication point between the client and the server.
There is one TDS endpoint for each network protocol and an additional one reserved for use by the dedicated administrator connection (DAC).
Once connectivity is established, TDS
messages are used to communicate between the client and the server.
The SELECT statement is sent to the SQL
Server as a TDS message across a TCP/IP connection (TCP/IP is the default
protocol).
- Protocol Layer
- When the protocol layer in SQL Server receives your TDS packet, it has to reverse the work of the SNI at the client and unwrap the packet to find out what request it contains.
- The protocol layer is also responsible for packaging up results and status messages to send back to the client as TDS messages.
- Our SELECT statement is marked in the TDS packet as a message of type "SQL Command," so it's passed on to the next component, the Command Parser, to begin the path toward execution.
- Fig 1-2 shows where our query has gone so far. At the client, the statement was wrapped in a TDS packet by the SQL Server Network Interface and sent to the protocol layer on the SQL Server, where it was unwrapped, identified as a SQL command, and the code sent to the Command Parser by the SNI.
Command Parser
The Command Parser's role is to handle T-SQL language events.
It first checks the syntax and
returns any errors back to the protocol layer to send to the client.
If the syntax is valid, then the next step is to generate a query plan or find an existing plan.
The query plan contains the details
about how SQL Server is going to execute a piece of code.
It is commonly referred to as an execution
plan.
To check for a query plan, the Command
Parser generates a hash of the T-SQL and checks it against the plan cache to
determine whether a suitable plan already exists.
The plan cache is an area in the
buffer pool used to cache query plans.
If it finds a match, then the plan
is read from cache and passed on to the Query Executor for execution.
Plan cache
Creating execution plans can be time consuming and resource intensive, so it makes sense that if SQL Server has already found a good way to execute a piece of code, it should try to reuse it for subsequent requests.
The plan cache, part of
SQL Server’s buffer pool, is used to store execution plans in case they are
needed later.
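As an illustrative aside (not part of the original text), you can see what is in the plan cache and how often each plan has been reused with the standard plan cache DMVs:

SELECT TOP (20)
       cp.usecounts,   -- how many times the plan has been reused
       cp.objtype,     -- Adhoc, Prepared, Proc, etc.
       st.text         -- the T-SQL the plan was compiled for
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
ORDER BY cp.usecounts DESC;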
If no cached plan is
found, then the command parser generates a query tree based on the T-SQL.
Our basic query didn’t
have an existing plan so a query tree was created and passed to the Query
Optimizer.
Optimizer
The Optimizer is one of the most prized possessions of the SQL Server team and one of the most complex and secretive parts of the product.
It is a "cost-based" optimizer: it evaluates multiple ways to execute a query and then picks the method that it deems will have the lowest cost to execute.
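For illustration, you can ask SQL Server to return the optimizer's chosen plan without executing the statement (dbo.Orders is a hypothetical table used only for this sketch):

SET SHOWPLAN_XML ON;
GO
-- Returns the estimated plan as XML instead of running the query
SELECT OrderID, OrderDate FROM dbo.Orders WHERE OrderID = 42;
GO
SET SHOWPLAN_XML OFF;
GO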
Fig 1-4 shows the addition of the query plan as
the output from the optimizer being passed to the query executor.
Also introduced is the Storage Engine, which the Query Executor interfaces with via OLE DB to reach the Access Methods (coming up next).
Access Methods
It is a collection of
code that provides the storage structure for your data and indexes as well as
the interface through which data is retrieved and modified.
Our SELECT statement needs to read just a few rows that are all on
a single page. The access methods code will ask the Buffer Manager to retrieve
the page so that it can prepare an OLE DB rowset to pass back to the Relational
Engine.
Buffer Manager
The Buffer Manager manages the buffer pool, which accounts for the majority of SQL Server's memory usage. SQL Server needs to maintain a minimum level of free pages in memory.
Data Cache
The data cache is the largest part of the buffer pool and therefore the largest memory consumer within SQL Server. Every data page that is read from disk is written to the data cache before being used.
The sys.dm_os_buffer_descriptors
DMV contains one row for every data page currently held in cache.
SELECT COUNT(*)*8/1024 AS 'Cached Size (MB)',
       CASE database_id
            WHEN 32767 THEN 'ResourceDB'
            ELSE DB_NAME(database_id)
       END AS 'Database'
FROM sys.dm_os_buffer_descriptors
GROUP BY DB_NAME(database_id), database_id
ORDER BY 'Cached Size (MB)' DESC
OUTPUT:
Cached Size (MB) | Database
3287 | People
34 | tempdb
12 | ResourceDB
4 | msdb
The amount of time that pages stay in cache is determined by a least
recently used (LRU) policy.
Can view how long SQL Server expects to be able to keep a page in
cache by looking at the MSSQL$<instance>: Buffer Manager\Page Life Expectancy counter
in Performance Monitor.
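Purely as an illustrative sketch (not from the original text), the same Page Life Expectancy value can also be read from inside SQL Server via the performance-counter DMV:

SELECT object_name, counter_name, cntr_value AS 'Page Life Expectancy (s)'
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page life expectancy'
  AND object_name LIKE '%Buffer Manager%';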
A Basic select Statement Life Cycle Summary (Fig1-5)
Buffer
Manager
The key point here is that the UPDATE statement has changed the data in the data cache, not in the actual database file on disk. This is done for performance reasons, and the page is now what's called a dirty page because it's different in memory than it is on disk.
The page that needs to be modified
is already in cache, so all the buffer manager needs to do is modify the page
as requested by the Access Methods.
Recovery
Modifications are written to the
transaction log first and are then actioned in memory only.
This is done for performance reasons and because you can recover the changes from the transaction log should you need to.
Dirty pages
Once the page has been modified in memory it is marked as a
dirty page.
Clean pages can be flushed from cache using DBCC DROPCLEANBUFFERS, which can be handy when you're troubleshooting development and test environments because it forces subsequent reads to be fulfilled from disk rather than cache, but it doesn't touch any dirty pages.
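A small usage sketch (test environments only): running CHECKPOINT first writes dirty pages to disk so that DBCC DROPCLEANBUFFERS can then empty as much of the cache as possible:

-- Flush dirty pages to disk so they become clean pages
CHECKPOINT;
-- Remove clean pages from the data cache (do not run on a production server)
DBCC DROPCLEANBUFFERS;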
The sys.dm_os_buffer_descriptors DMV can be used to display how many dirty pages exist in each database:
SELECT DB_NAME(DATABASE_ID) AS 'DBNAME', COUNT(PAGE_ID) AS 'DIRTY PAGES'
FROM SYS.DM_OS_BUFFER_DESCRIPTORS
WHERE IS_MODIFIED =1
GROUP BY DB_NAME(DATABASE_ID)
ORDER BY COUNT(PAGE_ID) DESC
Dirty pages
will be written back periodically whenever the free buffer list
is low or a checkpoint occurs.
Lazy writer
The Lazywriter is a thread
that periodically checks the size of the free buffer list.
When it’s low, it scans the whole
data cache to age-out any pages that haven’t been used for a while.
If it finds any dirty pages that
haven’t been used for a while, they are flushed to disk before being marked as
free in memory.
Checkpoint
Process
A checkpoint is a point in time
created by the checkpoint process at which SQL Server can be sure that any committed
transactions have had all their changes written to disk.
By default, on a busy server, SQL Server will issue a checkpoint roughly every minute, which is marked in the transaction log.
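For illustration, a checkpoint can also be requested manually for the current database, optionally passing a target duration in seconds:

-- Issue a manual checkpoint in the current database
CHECKPOINT;
-- Or ask SQL Server to try to complete the checkpoint within roughly 10 seconds
CHECKPOINT 10;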
ALL ABOUT
TRACE FLAGS
Trace flags provide a way to
change the behavior of SQL Server temporarily and are generally used to help
with troubleshooting or for enabling and disabling certain features for
testing.
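A brief sketch of the standard DBCC commands for working with trace flags; trace flag 1222 (deadlock detail written to the error log) is used here only as an example:

-- Enable a trace flag globally (-1) for the running instance
DBCC TRACEON (1222, -1);
-- List all trace flags currently enabled
DBCC TRACESTATUS (-1);
-- Disable it again
DBCC TRACEOFF (1222, -1);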
Recovery
Interval
By default the recovery interval is
set to 0, which allows SQL Server to choose an appropriate interval, which
usually equates to roughly one minute between automatic checkpoints.
Recovery Models
SQL Server has three database recovery models:
1. FULL
2. BULK-LOGGED
3. SIMPLE
Full
All operations are fully logged in the transaction log, and you must have a backup strategy that includes full backups and transaction log backups. SQL Server databases that require the highest level of recoverability should use the FULL recovery model.
Bulk-logged
This model is used temporarily to improve the performance of certain bulk operations by minimally logging them. Redo information for those operations is not logged, which means we also lose point-in-time recovery. All other operations are fully logged, just like the full recovery model.
Simple
When the simple recovery
model is set on a database, all committed transactions are truncated from the
transaction log every time a checkpoint occurs.
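As a small illustration (MyDatabase is a hypothetical name), the recovery model is set with ALTER DATABASE and can be checked in sys.databases:

-- Switch a database to the SIMPLE recovery model
ALTER DATABASE MyDatabase SET RECOVERY SIMPLE;

-- Check the recovery model of every database on the instance
SELECT name, recovery_model_desc
FROM sys.databases;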
Another benefit to having everything in one place is that we can now get better visibility of what's happening at that level than was possible prior to SQLOS.
We can access all this information through dynamic management views (DMVs).
Any DMV that starts with sys.dm_os_ provides an insight into the workings of SQLOS.
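For example, here is one sketch of many possible sys.dm_os_ queries, taking a quick look at the SQLOS schedulers:

-- One row per SQLOS scheduler; runnable_tasks_count hints at CPU pressure
SELECT scheduler_id,
       cpu_id,
       current_tasks_count,
       runnable_tasks_count
FROM sys.dm_os_schedulers
WHERE scheduler_id < 255;   -- exclude hidden/system schedulers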
Note: the SQLOS doesn’t replace Windows. Ultimately, everything
ends up using the documented Windows system services; SQL Server just uses them
in such a way as to optimize for its own specific scenarios.
SQLOS is not a way to port the SQL Server architecture to other platforms like Linux or Mac OS, so it's not an OS abstraction layer.
It doesn’t wrap all the OS APIs like other frameworks such as .NET,
which is why it’s referred to as a “thin” user-mode layer. Only the things that
SQL Server really needs have been put into SQLOS.
SUMMARY
o The Query Optimizer's job is to find a good plan in a reasonable amount of time, not the BEST plan.
o Anything you want to read or update will need to be read into memory first.
o Any updates to data will be written to the transaction log on disk before being updated in memory, so transaction log performance is critical; the update isn't written directly to the data file.
o A database page that is changed in memory but not on disk is known as a dirty page.
o Dirty pages are flushed to disk by the checkpoint process and the lazywriter.
o Checkpoints occur automatically, roughly every minute, and provide the starting point for recovery.
o The lazywriter keeps space available in cache by flushing dirty pages to disk and keeping only recently used pages in cache.
o When a database is using the FULL recovery model, full backups will not truncate the transaction log. You must configure transaction log backups.
The SQLOS is a framework used by components in SQL Server for scheduling, I/O, and memory management.
ShortCut | Details
Ctrl + D | Results in Grid
Ctrl + T | Results in Text
Ctrl + U | Change database
Ctrl + E or F5 | Execute Query
F8 | Show/Hide Object Browser
Ctrl + Q | New Query Window
Ctrl + N | New Query with New Connection
Ctrl + O | Open a .SQL script file
Shift + Alt + Enter | Full Screen
Ctrl + F5 | Parse the query
Ctrl + R | Show/Hide Results Pane
F6 | Switch between query and results panes
Alt + F1 | Information about all the objects in the current database
CHAP 2: UNDERSTANDING MEMORY
o Understanding physical memory and how
virtual memory addressing is used.
o Optimizing 32-bit systems with large amounts of memory
o Optimizing the memory configuration on 64-bit systems
o AWE and its uses on 32-bit and 64-bit
environments
o Explaining MemToLeave
o SQL Server’s memory clerks, caches, and
pools
o Looking at SQL Server’s plan cache
o Determining a setting for Max Server
memory
o An in-depth look at Query / Workspace
memory
Memory, disk, and CPU are the holy trinity of resources in a computer system, and memory comes first because it's the area that you're most likely to have an issue with.
Memory issues can cause both disk and CPU saturation, so when troubleshooting a server issue (or at least a performance issue) you need to start by looking at the memory profile of the system.
Physical Memory
Physical memory is most commonly referred to as RAM, and is also called primary storage, main memory, or system memory, because it is directly addressable by the CPU.
The peak transfer rate for a mainstream
RAM module would be about 5GB/sec.
CHAP 8: DEFINING YOUR APPROACH TO TROUBLESHOOTING
o How to approach troubleshooting
o Defining the problem you're working on
o Knowing your SLA and baselines
o Defining the exit criteria
o Choosing how to gather data
o Your options for analyzing a data collection
o Troubleshooting components like Failover Clustering, Replication, and Analysis Services
Tips for Identifying a Problem
o Take screenshots or cut and paste pop-up screens indicating errors.
o Obtain copies of all relevant logs, such as the Event Viewer logs, the SQL Server Error Log, dump files, and application-specific logs. Sometimes different logs will have different levels of detail or even different information captured.
o Understand the steps necessary to reproduce the problem.
o When you gather your information, make sure you understand the context. Know whether the logs are from a client workstation, a middle-tier server, or a server running SQL Server.
If it is an existing system, look back at your change control documents to see what has changed recently on the system. Any change, even if seemingly unrelated, should be reviewed: a Windows or SQL Server patch, a new policy or removed permission, a configuration option, or an application or database schema change.
Bite-Size Chunks
Once you understand the overall problem, you can break it down into manageable pieces and isolate the various pieces. The following is a list of troubleshooting areas by category.
- Connectivity Issues
Check whether the problem occurs with only one protocol, such as Named Pipes or TCP/IP. Can you connect using the dedicated administrator connection (DAC)?
- Performance Issues
For a performance problem you need to determine whether the issue is on the client, the middle tier, the server on which SQL Server runs, or the network. Another application on the server may be using all the available CPU resources and leaving nothing for SQL Server to use; this can make SQL Server seem unresponsive or slow.
- Memory or Compilation Issues
If you are able to identify one user query that is slow, the most
common causes are insufficient resources (network, memory, I/O, CPU), blocking,
out-of-date statistics, or inefficient indexes.
Sometimes the problem is not with the query execution but with the
compilation. Often, analyzing the plan cache can lead to resolving the problem.
A basic data
collection will capture the following counters:
- LogicalDisk and PhysicalDisk:
- Avg. Disk Bytes / {Read, Write, or Transfer}
- Disk Bytes/sec
- Disk Transfers/sec
- Avg. Disk sec/Read
- Avg. Disk sec/Write
- Memory
- Available Mbytes
- Free System Page Table Entries
- SQL Server Buffer Node:
- Page life Expectancy
- Target Pages
- Total Pages
- SQL Server Databases:
- Active Transactions
- Log File(s) Size (KB)
- Log File(s) Used Size (KB)
- Percent Log Used
- Transactions / sec
- SQL Server General Statistics:
- Processes Blocked
- User Connections
- SQL Server Latches:
- Average Latch Wait Time (ms)
- Latch Waits /sec
- SQL Server Locks:
- Average wait Time (ms)
- Lock Requests/sec
- Lock Timeouts/sec
- Lock Waits/sec
- Number of Deadlocks/sec
- SQL Server Memory Manager:
- Optimizer Memory (KB)
- SQL Cache Memory (KB)
- Target Server Memory (KB)
- Total Server Memory (KB)
- SQL SERVER SQL Statistics:
- All counters
- SQL SERVER Transactions:
- Free Space in tempdb (KB)
- Longest Transaction Running Time
- The other counters if you are using any type of snapshots, including triggers
- SQL Server Wait Statistics:
- All counters
- Process (Choose at least your SQL Server instance(s) as well as any other applications/process that might compete with SQL Server.):
- % Processor Time
- Private Bytes
- Virtual Bytes
- Working Set
- Processor (“<All instances>” and Total):
- % Processor Time
Storage vendor
I/O counters and PerfMon I/O counters rarely match and you have to understand
the purpose and scope of each.
A SQLTrace may
have different levels of detail. The more events you choose, the bigger the
trace file and the greater the overhead of capturing the data. A big trace file
gives you more data to analyze but may become too large to realistically manage
even with rollover files.
Some events, such as locks and even
some of the performance events, are much more intensive than others.
Some of the
events you will commonly see in a baseline are:
o Some or all of the "completed" events from the RPC and T-SQL categories
o Most or all of the events from the "Errors and Warnings" category
o All of the "Database" category events (grow, shrink, etc.)
o The "Locks" category Deadlock Graph event, and possibly the escalation and timeout events, but generally NOT any of the other lock-related events. There are other ways to troubleshoot blocking that are much less intensive.
o The "Progress Report" category events
o Some, all, or none of the Performance events. These events can take a lot of space, so consider the trade-offs carefully.
Events and Alerts
Windows has built-in event logs, SQL Server has an Error Log, and your application may have a log as well.
For example:
- I/O errors from SQL Server are often accompanied by hardware errors in the windows event logs.
- The cause of a cluster failover might be apparent from looking in the cluster log or the Windows event logs.
- A series of monitoring tool alerts warning that a disk is approaching capacity might indicate whether an out of space error is due to a sudden problem or steady growth over time.
If you turn off any monitoring, make sure you turn it back on once the problem is resolved.
TOOLS AND UTILITIES
The following is a list of some information-gathering tools and utilities that are available.
SQLDiag
SQLDiag can be used to gather basic environmental information such as the SQL Server Error Logs, event logs, and SQL Server configuration settings with the /X option, and it can also capture SQLTrace and PerfMon information.
PSSDiag
This is a wrapper around SQLDiag with some extensions that Microsoft Support configures and sends to customers to help troubleshoot a specific issue.
Windows
Event logs (Application, System, and Security)
These logs are often useful to show you what errors,
warnings, and informational messages have occurred in the recent past.
Custom
Application Logs
These are useful when the application is instrumented to write errors or diagnostic information.
User Dumps
If you see an exception in the SQL Server Error Log, you should also see a mini-dump file with the extension .mdmp in the same directory.
PerfMon
Performance Monitor ships with Windows and can be used to
gather information on the resources used on the server.
It can track system information such
as CPU and memory usage, I/O statistics, and network activity.
NetMon
Network Monitor is a free network packet analyzer tool that
is used to look at data as it is sent over the network.
Management
Data Warehouse
This is used for performance trending and can be used to
see how performance has changed over time.
TROUBLESHOOTING OTHER COMPONENTS
Failover
Clustering
- When you collect a full user dump from SQL Server it will appear that SQL Server is unresponsive. If you need to gather a user dump from a clustered instance, you need to increase the time that the cluster service waits before it decides to do an automatic failover.
- Any files you expect to be available must be accessible regardless of which node owns the clustered instance.
- To start the instance in single-user mode, you need to start it outside the cluster so that the cluster service does not take the only available connection.
- Take the SQL Server, SQL Agent, and Full-Text resources offline.
- Don't take the SQL Server name, IP, or disks offline.
- After this you can start SQL Server from the command line using the -m parameter and whatever other parameters are necessary for your troubleshooting steps.
- Finally, stop the command-line version of SQL Server and then bring SQL Server, SQL Agent, and Full-Text online.
- The SQL Server service depends on an IP address, a name (DNS entry), and at least one disk.
- If any of these dependent resources are offline, SQL Server cannot be brought online (started).
- Do other applications (such as other instances of SQL Server) already have resources in use that this instance cannot use?
- Are the permissions to directories, files, and Registry keys the same on all nodes?
Some queries you may want to use when troubleshooting a clustered instance:
o Check to see if the instance is clustered:
SELECT SERVERPROPERTY('ISCLUSTERED')
o Use the following DMV to list the drives that can be used by this clustered instance of SQL Server for database files, transaction logs, and SQL Server backups:
SELECT * FROM sys.dm_io_cluster_shared_drives
o Get a list of possible owners of this SQL Server instance:
SELECT * FROM sys.dm_os_cluster_nodes
Replication
In many respects you can treat replication as a third-party product that happens to use SQL Server as a back end.
Replication activity occurs via "agents", which are SQL Agent jobs that call replication executables. Each agent makes calls to stored procedures that result in data movement between replicated databases, so you can trace and troubleshoot those stored procedure calls just like you would any other application.
You can use SQLTrace or Profiler to find out what procedure calls replication is making at any given time. In most cases you have at least some network traffic to deal with, as most production systems involve at least two or three different servers.
If you have a performance problem, you can look at SLAs, baselines, and network traces just like with any other application.
If any of the replication databases are on a clustered instance,
replication works the same way as it does on a standalone instance; however,
just as with any other clustered application, the files that replication
references must be on a clustered shared drive. For replication those files are
specified in snapshot properties for
each type of replication.
For transactional replication latency problems you can use tracer tokens.
A tracer token is basically just a simple bit of data sent through the system. It is used to measure the elapsed time from the publisher to the distributor and from the distributor to the subscriber(s).
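A minimal sketch, assuming a hypothetical transactional publication named MyPublication, of posting a tracer token from the publication database:

-- Run in the publication database at the Publisher; latency for the token
-- can then be viewed in Replication Monitor
EXEC sys.sp_posttracertoken @publication = N'MyPublication';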
Break down the
problem into where it occurs and which agent is running when the problem
occurs.
The executable
called by the replication agent (a replication job in SQL Server Agent)
initiates the connection to SQL Server and passes it data and parameters.
The parameters
can be specified directly in the agent job or in an agent profile. The
parameters passed into each replication agent can greatly impact performance.
Most
replication performance can be handled with normal SQL Server performance
troubleshooting.
Replication is
in many ways just another application that runs on top of SQL server.
It is important to understand how and when each agent runs; once you do so, it becomes easier to narrow down the problem and concentrate on the correct instance of SQL Server.
To help you understand the role of each agent, here are some descriptions:
- SNAPSHOT AGENT
Reads data on the publisher and writes the output to a
file. It is used in all types of replication.
Sometimes it can cause blocking or performance problems on the
publisher.
The job that initiates the snapshot has to be able to read the data
in the database and write to the designated snapshot folder.
If the snapshot folder is not local to the subscriber, you have to consider whether network bandwidth impacts any performance issues you may see.
- LOG READER AGENT
It is used only for transactional replication but this includes
peer-to-peer replication.
It reads the transaction log on the publishing database and sends
the data to the distribution database.
This adds some load to the publisher
and in particular adds load to the directory where the log file of the
publishing database resides.
When the log reader is running it
changes the log from a purely sequential operation to a semi-random operation.
On a system that is already stressed, the additional movement of the disk read heads may cause increased response times for that drive.
The log reader is writing to the
distribution database which may be on a different server (a remote
distributor) from the publisher so once again you may need to consider
network bandwidth in performance problems.
- DISTRIBUTION AGENT
It is used in transactional (including peer-to-peer) and snapshot
replication. It moves data from the distribution database to the subscriber(s).
Initially this data movement is through the application of a snapshot. If the article is configured so that the object is recreated when a snapshot is applied, applying the snapshot through the Distribution Agent will read the snapshot folder and find the CREATE statements that were stored there in various files when the Snapshot Agent ran.
If the article is configured to replace the data on the subscribers
(this is almost always true) then the Distribution Agent will Bulk Copy the
data from the snapshot folder into the subscriber tables.
The majority of the time the Distribution Agent is not applying a
snapshot, it is applying changes captured by the log reader. Network bandwidth
can be an issue with this agent as well.
- Merge Agent
It is used only in merge replication; it moves data between the publisher and subscribers. Merge replication is not as straightforward as the other types of replication.
The distribution database plays a much smaller role as data is
moved directly between the publisher and the various subscribers.
Triggers on the tables involved in merge replication populate these
metadata tables on the local database.
Network bandwidth can sometimes contribute to performance issues.
- Cleanup Agents
These remove metadata that is no longer needed by replication. The cleanup agents may modify large amounts of data at once, which can cause a surge in log usage and possibly blocking.
These agents run on the local instance, so network bandwidth is not an issue.
- Replication Monitor
SSMS has a Replication Monitor that is used to monitor the health
of the replication topology and indicate the status of each replication job.
This is a very useful tool that can provide immediate feedback about the health of your replication topology.
You can use SQLTrace/Profiler to see what commands are issued by Replication Monitor and reuse those commands to create your own custom monitoring or reports if necessary.
Analysis Services
The same principles apply to both SQL Server and Analysis Services troubleshooting. For performance troubleshooting you need baselines to compare against, for test systems you need data that is as close to production as possible, and Profiler and PerfMon are two of your main troubleshooting tools.
XMLA Queries
XMLA is the language used to submit administrative queries to Analysis Services. Processing, for example, is a collection of XMLA commands.
The key Profiler events to capture for an XMLA query problem are:
o Command events: ALL
o Errors and Warnings: ALL
o Progress Reports: ALL
CHAP 9:
Viewing Server Performance with PerfMon and
the PAL Tool
WHAT’S IN
THIS CHAPTER
- When and how to use Performance Monitor
- How to baseline server performance
- Using sample thresholds to identify performance problems
- Automating data analysis with Performance Analysis for Logs (PAL)
PerfMon:
It is a Windows tool for measuring
real-time server performance and recording system performance to a file over
time.
Performance monitor can assist with
resolving complex problems, some of which may seem random or intermittent, by
narrowing the focus to the root cause.
This chapter
addresses these questions in three main ways:
- It provides key counters and thresholds for issues
- It helps you gather a baseline from a healthy server
- It demonstrates tools to assist with analyzing performance logs.
PERFORMANCE
MONITOR OVERVIEW
PerfMon provides server-wide, real-time and logged performance monitoring. In Windows Server 2008, PerfMon has been incorporated into the Reliability and Performance Monitor.
Use Performance
Monitor for the following common tasks:
- View real-time performance data on your server.
- See performance data represented visually
- Record performance data over an extended time frame
- Quantify the performance impact of HW or SW changes
- Save and export performance data
- Fire alerts based on performance thresholds
- Compare performance data from different servers
Reliability and Performance Monitor in Windows Server 2008
Reliability and Performance Monitor comprises three main components:
1. Monitoring Tools
2. Data Collector Sets
3. Reports
Data Collector Sets
Data collector sets combine all the information necessary for common problem diagnostics, including event tracing, performance counters, and configuration data. Three predefined system data collector sets are included: LAN Diagnostics, System Diagnostics, and System Performance.
Reliability Monitor
Reliability Monitor provides a system stability chart. Here, events such as hardware failures, application failures, or Windows failures are tracked against a timeline. The chart displayed by Reliability Monitor plots the System Stability Index, a rating system for reliability whereby 10 reflects a stable server and 1 reflects an unstable server.
NEW COUNTERS FOR SQL SERVER 2008 IN PERFMON
SQL Server 2008 has some new PerfMon counters to provide performance logging for new features such as Resource Governor, and there are new counter instances for improved features, such as for monitoring compression in database mirroring.
There are two new counter objects for Resource Governor, (i) Resource Pool Stats and (ii) Workload Group Stats, each containing more than ten instances relating to the new Resource Governor functionality.
New PerfMon Counters for SQL Server 2008
New Counters for Database Mirroring
TROUBLESHOOTING
SQL SERVER PROBLEMS
When accepting
a problem to investigate, it’s useful to know the following information:
- The problem statement, in order to understand the user experience and expected application behavior.
- What troubleshooting steps have been carried out already (by the users and first- or second-line support engineers)
- What data has already been gathered about the problem
REPRODUCING
PROBLEMS
If possible, take the time to reproduce the problem. When you can reproduce a problem, you're halfway to fixing it, as recreating the exact scenario in which a problem condition occurred requires a thorough understanding of the problem itself.
PerfMon is a great tool to start troubleshooting most kinds of problems. It can be used to do either of the following:
- Prove a problem condition
- Narrow the problem scope
If practical, establish and prove the problem condition. PerfMon can then be used to eliminate areas of doubt or identify specific areas where further investigation is required.
PerfMon is vital in situations where SQL Server is co-resident with another application or a Windows feature such as IIS.
DATA-DRIVEN
TROUBLESHOOTING
It refers to an iterative approach to troubleshooting problems
whereby each decision or recommendation is based on data gathered about the
problem.
Data-driven
troubleshooting relies on three components:
- Tools to collect data
- Engineers to read data and understand the problem
- Any one of a number of people to assist with problem resolution (server h/w engineers, storage engineers, or developers)
Note: PerfMon will help you identify the problem area. For example, if you note that CPU is higher than usual, or disk reads are slow, you can focus on high-CPU queries or queries with expensive disk access as necessary.
Here's a (non-exhaustive) list of tools available for troubleshooting SQL Server problems:
1. SQL Server Error Log
2. Windows event log
3. SQL Profiler
4. PerfMon
5. Stored procedures
6. DMVs
7. DBCC
8. Replication Monitor
9. Wait stats
10. Performance Data Warehouse
11. SQL Nexus
12. Database Tuning Advisor
13. PSSBlocker script
14. PerfStats script
15. SQL Server Management Studio reports
16. Trace flags
17. Execution plans
18. Stack dumps
PerfMon provides a server-level view of hardware utilization. It is a Windows monitoring tool; as such, it can be used to monitor any application.
MONITORING REAL-TIME SERVER ACTIVITY
Within the Reliability and Performance Monitor, the following steps provide an overview of server activity:
1. Select Performance Monitor from the Monitoring Tools folder.
2. Right-click anywhere in the chart area and choose Add Counters.
3. Select "Memory" and click to expand it.
4. Select "Available MBytes" and click Add.
5. Scroll to locate the "LogicalDisk" counter and select "Avg. Disk sec/Transfer". If you have multiple disks, select each disk individually from the instance list, then click OK.
A couple of pointers to help out:
1. Organize the columns so you can read each row more easily.
2. Press Ctrl+H to enable highlighting for the selected counter, which means the line graph is highlighted.
3. Select all counters, right-click, and choose Scale Selected Counters to ensure that they are all displayed within the screen.
Mostly you'll want to monitor a server remotely, which you can do by typing the server name in the Add Counters dialog.
STARTING OUT WITH DATA COLLECTOR SETS
Data collector sets are groups of data
gathering tools, and can include kernel tracing, performance logs, and
configuration data.
To use a predefined data collector set, expand Data Collector Sets > System, right-click System Performance, and choose Start.
The System Performance
collector will run for 60 seconds. Once collection has finished, navigate to Reports
> System > System Performance, and choose the most recent
report.
Brief summary of the three types of data that can be captured in the data log:
LOG TYPE | DESCRIPTION
Performance counter | PerfMon counters provide performance data for most aspects of Windows and SQL Server
Event trace data | Using Event Tracing for Windows provides low-level operating system tracing
System configuration information | This captures Registry keys
To create a new data collector set, give it a useful name and choose to create it manually. Click Add, and select counters as shown below.
WHAT TO BE
AWARE OF WHEN RUNNING PerfMon
When monitoring, consider performance implications with a
view to reducing overhead and minimizing risk of the following:
o
Making
problems worse
o
Affecting
monitoring results
THE IMPACT
OF RUNNING PERFMON
PerfMon is a lightweight tool and its impact on any given
server is partly related to how PerfMon is configured and partly dependent on
the workload of that server while PerfMon is running.
One of the challenges with many performance problems is
that we must obtain a Perfmon log to identify the cause of the
problem. Without a log, engineers and managers can observe poor application performance and hypothesize about potential causes and remedies; but in order to diagnose the problem and take remedial action, we must obtain performance data.
SERVER WITH
VERY POOR PERFORMANCE
- Run PerfMon remotely
- Reduce the sampling frequency (take samples less often)
- Include as few counters as possible
- Log to disk
COMMON
PerfMon PROBLEMS
In some cases you may have problems with PerfMon itself; specifically, counters could be missing or not displayed correctly, or there could be problems connecting to servers remotely.
If you are running SQL Server in Windows-on-Windows (WOW) mode, i.e., x86 SQL Server on x64 Windows, you'll be unable to run PerfMon remotely from other x64 machines because the Remote Registry service is an x64 process; therefore, the counters will only be visible to x86 processes.
REMOTE
MONITORING FAILURES
If you’re trying to monitor remote servers and this is failing, the
most likely cause is permissions problems. Some common causes:
1.
Ensure
that the account is the local administrator on the target server
2.
Confirm
NetBIOS access to the target server
3.
Ensure
that the remove Registry service is running on the target server
4.
Ensure
that no local security policy or Active Directory group policy is restricting
access.
CPU
PERFORMANCE COUNTERS
High CPU usage is one of the most frequently occurring performance problems on SQL Servers.
It can be easy to identify sqlservr.exe as the CPU consumer using Task Manager; the following table describes counters that will provide further evidence to assist troubleshooting.
OBJECT | COUNTER | WHAT TO LOOK FOR | PROBLEM CONDITION
Processor | % Processor Time | Percent of total time the CPUs are busy servicing productive requests | > 80%
Processor | % Privileged Time | Percent of total CPU time spent servicing kernel-mode requests | > 30%
Process | % Processor Time (sqlservr) | Percent of total time SQL Server spent running on CPU (user mode + privileged mode) | > 80%
Process | % Privileged Time (sqlservr) | Percent of time SQL Server was executing in privileged mode | > 30% of % Processor Time (sqlservr)
COMMON CAUSES OF CPU PROBLEMS
The three common
causes of high CPU condition are:
- Missing or Outdated Statistics:
The Query Optimizer is dependent on relevant statistics to
determine a good execution plan – missing or outdated statistics could cause it
to select a sub-optimal plan, causing excessive CPU consumption.
- Missing Indexes:
A lack of useful indexes can result in a high-CPU condition. SQL Server is dependent on meaningful indexes to retrieve data efficiently, and missing indexes can often cause excessive CPU utilization.
A lack of useful indexes can result in expensive operations such as
hash joins and sorts that could otherwise be avoided with improved indexes.
- Excessive Recompilations:
Poor plan reuse can cause a high-CPU condition whereby SQL Server
consumes excessive CPU cycles while generating query plans. Recompilations can
be caused by ad-hoc or dynamic queries or by a lack of memory (procedure
cache), causing plans to be dropped from cache.
INVESTIGATING MEMORY-RELATED PROBLEMS
SQL Server performance is closely related to the availability and performance of sufficient memory.
- SP_CONFIGURE:
- Min/max server memory
- AWE Enabled
- Min memory per query
- Windows:
- /3GB,/USERVA,/PAE
- Lock Pages in Memory privilege
Windows Task Manager
doesn’t provide a representative measure of the memory consumed by SQL Server
because Task Manager only reflects the buffer pool, and SQL Server makes memory
allocations outside this space for objects such as extended stored procedures,
linked servers, and the CLR.
Additionally, when
SQL Server is using AWE on 32-bit servers, Task Manager will not reflect the
correct memory allocation.
TYPES OF MEMORY PRESSURE
SQL Server can suffer from internal or
external memory pressure, and understanding how to identify and troubleshoot
each will enable more targeted troubleshooting.
External memory
pressure occurs most often when SQL Server is running on a shared computer and
several processes are competing for memory.
The Resource
Monitor within the SQL Server Operating System (SQLOS) will receive
a signal from Windows requesting SQL Server to reduce its committed memory.
This will cause SQL Server to recalculate its target commit level, reducing it
if required.
Internal memory pressure occurs when
internal SQL Server resources compete with each other for memory.
This is typically
caused by SQL Server shrinking the buffer pool.
Use the DBCC MEMORYSTATUS command to gain visibility of SQL Server memory consumption.
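As an illustrative companion to DBCC MEMORYSTATUS (not from the original text), the memory clerk DMV shows which internal consumers hold the most memory; the column names below are the SQL Server 2008 ones:

SELECT TOP (10)
       [type],
       SUM(single_pages_kb + multi_pages_kb) AS total_kb  -- memory inside and outside the buffer pool
FROM sys.dm_os_memory_clerks
GROUP BY [type]
ORDER BY total_kb DESC;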
Disk or Storage Problems
SQL Server read/write performance is closely related to the ability of Windows to retrieve and write data pages to disk efficiently.
OBJECT | COUNTER | WHAT TO LOOK FOR | PROBLEM CONDITION
PhysicalDisk | Avg. Disk sec/Read | Average time in seconds to complete a read from disk | > 0.010 sub-optimal; > 0.020 poor
PhysicalDisk | Avg. Disk sec/Write | Average time in seconds to complete a write to disk | > 0.010 sub-optimal; > 0.020 poor
SQL Server Performance Problems
Sometimes server HW
resources do not cause bottlenecks but application performance is still bad.
In this situation,
internal SQL Server resources can become exhausted or depleted.
Key SQL Server PerfMon Counters
WAIT STATS ANALYSIS
SQL Server wait stats record how long, and on which resources, SQL Server spends time waiting. A number of these wait types are exposed as PerfMon counters:
- Lock waits
- Log write waits
- Network I/O waits
- Non-page latch waits
- Page I/O latch waits
- Page latch waits
It may be easier to access these wait stats from the DMVs
within SQL Server; however, collecting them as part of a system-wide data
gathering exercise with PerfMon minimizes the logistics effort involved in
collecting the data.
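For reference, a hedged sketch of reading the same information from the wait stats DMV inside SQL Server, ordered by total wait time:

SELECT TOP (10)
       wait_type,
       waiting_tasks_count,
       wait_time_ms,
       signal_wait_time_ms   -- time spent waiting for CPU after the resource became available
FROM sys.dm_os_wait_stats
ORDER BY wait_time_ms DESC;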
GETTING STARTED WITH PERFORMANCE ANALYSIS
FOR LOGS (PAL)
Performance Analysis for Logs (PAL) is a tool created by engineers from the product support organization within Microsoft to automate log file analysis and provide rapid identification of problems within log files. PAL works using templates to analyze performance data and will flag values outside the acceptable range as determined in the templates.
PAL was designed to reduce the
labor-intensive log analysis tasks and help engineers provide faster turnaround
in log analysis.
Since its original release, PAL has been enhanced to include templates to analyze performance logs for many popular Microsoft server products, such as IIS, BizTalk, Exchange, Active Directory, and SQL Server.
Templates are based on
thresholds used by product support teams at Microsoft when troubleshooting
customer problems.
Using PAL simplifies
the task of log analysis, providing a report that highlights key problem areas.
PAL uses
templates and widely-tested thresholds that correlate performance data from
multiple counters, which means reports provide a robust and reliable solution
for log file analysis.
PAL is not a replacement for good methodology and usual troubleshooting
steps!
There is no need to install PAL on a production server. Performance analysis is resource intensive; therefore, to avoid impacting performance of a production server, capture the logs on the server suffering the problem and copy them to a desktop or test server for analysis with PAL.
USING
SQL Server to Analyze PerfMon Logs
Analyzing PerfMon logs from within a
database has the benefit of data access through the familiar language of T-SQL,
meaning problems should be easier to identify and you can write queries looking
for specific problem conditions.
In the following example, three counters could
be combined to identify a low-memory condition:
1. Available memory less than 100 MB
2. Page life expectancy less than 60 seconds
3. Buffer cache hit ratio less than 98%
The following query could be used to
identify any instance during the data capture window when the low memory
condition existed:
SELECT * FROM subset WHERE Mem_Avail_Bytes
< 1000000 and Buff_Mgr_PLE < 60 and Buff_Cache_Hit_Ratio < 98
COMBINING PerfMon Logs and SQL Profiler
Traces
A feature first
available in SQL Server 2005 was the capability to combine PerfMon logs with
SQL Profiler traces.
Using Profiler to combine logs in this way enables you to view the T-SQL code that's running on the server, combined with the hardware impact of running that code, such as high CPU or low memory.
Extracting Performance Data for a Specific Timeframe
Were a problem to occur at 10:30 a.m. on March 15, it would be useful to extract the time frame from 10:00 a.m. to 11:00 a.m. to provide a manageable log size, without losing any data points, as shown in the following example:
Relog Server001_LOG.blg -b 15/03/2009 10:00:00 -e 15/03/2009 11:00:00 -o Server001_LogExtract.blg
The Relog parameter -c enables a subset of counters to be specified:
Relog Server001_LOG.blg -c "\Memory\*" -o Server001Memory_Log.blg
USING LogMan
LogMan is a command-line tool used to create and manage PerfMon logs. The LogMan tool can be used to schedule the starting and stopping of logs.
This can be a useful alternative to using the Windows AT scheduler or the scheduler functions available within PerfMon.
The great benefit of using LogMan is that you can centrally start and stop performance monitoring.
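A small sketch (the collection name, counter paths, and output folder are illustrative) of creating, starting, and stopping a counter log with LogMan:

REM Create a counter log sampling every 15 seconds, written to C:\PerfLogs
logman create counter SQLBaseline -c "\Memory\Available MBytes" "\PhysicalDisk(*)\Avg. Disk sec/Read" -si 00:00:15 -o C:\PerfLogs\SQLBaseline

REM Start the collection, and stop it when you are done
logman start SQLBaseline
logman stop SQLBaseline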
Using LogParser
It
is a simple to use yet powerful tool for log file analysis, popularized for
analyzing logs from IIS web servers.
This can be great when searching Windows
event logs, IIS logs, or PerfMon logs.
CHAPTER 10
TRACING SQL Server with SQL Trace and
Profiler
·
An overview of why tracing is important
·
Deep look into the Architecture of SQL Trace
·
Comprehensive coverage of the Tracing views, functions,
and stored procedures
·
An extended analysis of Security and SQL Trace
·
Profiler, Profiler and more profiler
Profiler is a GUI; SQL Trace is the underlying engine. Profiler is a very useful tool and is a great option for ad-hoc performance analysis.
By running a trace
the DBA can answer the following questions:
·
How many calls are made to the database?
·
How long are the calls taking?
·
Are the calls synchronous or asynchronous in nature?
· Has the developer used stored procedures, or are the calls all ad-hoc SQL statements?
WHAT TO TRACE
The SQL Trace architecture is present
to answer three types of questions.
1. The first, as you have seen, relates principally to performance. SQL Server can tell you via SQL Trace what was being executed, when, and by whom.
2. It can answer the "who accessed the system and changed a table" question, but also the "who changed the permission for a given user" question as well.
3. The third type of question is the "what went wrong" question. SQL Trace enables you to track errors and warnings and can guide you to answer some problems in your solutions.
You can get some of this information from the SQL Server Error Log or the Windows event log. However, the nice thing about capturing this data in SQL Trace is that it makes it easy to see those errors in context. Knowing that a certain stored procedure had just started prior to an exception being raised may be critical to diagnosing a fault or issue.
How to Trace
There are only two ways to trace
information out of SQL Server and both of these methods use the SQL Trace
architecture.
1. Use Profiler
&
2. Build and execute a
server-side trace
Profiler:
Profiler is a GUI that you can use to interactively build and execute tracing sessions. You can select any combination of events and SQL Trace will return them to the GUI so you can see what is happening in real time.
The server-side trace, however, is the main event, in my opinion, when it comes to tracing SQL Server, especially in production environments. You create a server-side trace by executing a number of system stored procedures against SQL Server.
The server-side trace then writes the events that you have asked it to capture to a file on the file system. For auditing-style traces you should always use a server-side trace.
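A minimal sketch of creating a server-side trace with the system stored procedures; the file path is hypothetical, and event 12 (SQL:BatchCompleted) with columns 1 (TextData) and 13 (Duration) is used only as an example:

DECLARE @TraceID int, @maxfilesize bigint, @on bit;
SET @maxfilesize = 50;   -- maximum size in MB per trace file
SET @on = 1;

-- Create the trace definition (it starts in the "stopped" state)
EXEC sp_trace_create @TraceID OUTPUT, 0, N'C:\Tmp\MyTrace', @maxfilesize, NULL;

-- Capture TextData and Duration for SQL:BatchCompleted events
EXEC sp_trace_setevent @TraceID, 12, 1, @on;
EXEC sp_trace_setevent @TraceID, 12, 13, @on;

-- Start the trace
EXEC sp_trace_setstatus @TraceID, 1;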
THE ARCHITECTURE OF SQL TRACE
In SQL Server, this tracking information is implemented as a series of events. The components of the database engine that generate these events are collectively known as event producers.
SQL Server provides the trace controller (fig 10-1).
The trace controller component tells the rest of the database engine,
via a bitmap, which events are being requested by a tracer.
If someone, anyone, has an open trace running and wants a given event,
then the database engine will generate that event when it occurs. An open trace
equates to a trace session.
Once an event has been generated by the event producer, the trace controller decides what to do with it. The event is held in a queue and synchronized while the trace controller determines what it needs to do with it. Because the trace controller knows what events each trace has requested, it passes the complete event down the line to each trace session that has asked for it.
The trace session can now act on the trace
data it has received from the controller. Each event received contains a
complete set of columns which can be used to further filter the data that is
captured in the trace.
For example, you might be interested in only those events generated by the actions of a single user; consequently, you may have created a filter to keep only events from a specific SPID.
After the events have been filtered, the
event can be sent to the I/O provider for the trace.
The I/O provider is the determining factor
for the endpoint of the trace event. The event is either written to a file or
sent to a rowset provider, where it is consumed by an application s.a Profiler.
EVENT CLASSIFICATION AND HIERARCHIES
Events are classified by a mini-hierarchy that consists of two levels. The top level is the event category, which logically groups the events together; an example might be Locks or Stored Procedures. Note that there is no root level that contains all event categories.
There are 21 different categories in SQL Server 2008. The second level of the hierarchy reflects the events themselves and is known as the event class.
The event class can also be sub-classified. An event subclass consists
of a subclass name and a subclass value. An event
subclass is grouped together by the event column and the event class to create
a unique list of permissible names and values for an event column.
There is one view for each of the six grey
boxes seen in fig: 10-2 above. Notice how the binding box wires everything
together.
The sys.trace_event_bindings
view holds the many-to-many relationships between events and event columns to
form the event class.
SELECT CAT.NAME AS CATEGORY_NAME
      ,EV.NAME AS EVENT_NAME
      ,COL.NAME AS COLUMN_NAME
      ,COL.IS_FILTERABLE AS COLUMN_IS_FILTERABLE
      ,COL.MAX_SIZE AS COLUMN_MAX_SIZE
      ,COL.[TYPE_NAME] AS COLUMN_TYPE_NAME
      ,SUB.SUBCLASS_NAME AS SUBCLASS_NAME
      ,SUB.SUBCLASS_VALUE AS SUBCLASS_VALUE
FROM SYS.TRACE_CATEGORIES CAT
JOIN SYS.TRACE_EVENTS EV ON CAT.CATEGORY_ID = EV.CATEGORY_ID
JOIN SYS.TRACE_EVENT_BINDINGS BI ON BI.TRACE_EVENT_ID = EV.TRACE_EVENT_ID
JOIN SYS.TRACE_COLUMNS COL ON BI.TRACE_COLUMN_ID = COL.TRACE_COLUMN_ID
LEFT JOIN SYS.TRACE_SUBCLASS_VALUES SUB
    ON SUB.TRACE_COLUMN_ID = BI.TRACE_COLUMN_ID
   AND SUB.TRACE_EVENT_ID = BI.TRACE_EVENT_ID
ORDER BY CAT.NAME, EV.NAME, COL.NAME, SUB.SUBCLASS_VALUE
SQL TRACE CATALOG VIEWS
The previous section omitted the most important view: sys.traces. This is the view that actually shows you the list of trace definitions on the server. All the other tracing views are present to provide the metadata, but sys.traces shows the actual object definitions of user-defined and system-defined traces.
SYS.TRACES
This is the catalog view to see all traces that have been configured against the SQL Server. A trace appears in this view once a SQL Trace definition has been created using sp_trace_create. A server-side trace is first created with a status of "stopped".
The stopped state is mentioned here because the status of the trace determines its visibility in sys.traces. A trace is visible in two states: stopped or started. However, using sp_trace_setstatus you can also close the trace.
If you want to audit the creation and deletion of traces on a SQL Server, you can always create a trace to do this yourself. You can't use triggers on system catalogs, so creating a trace is your best option here. You can see the mechanics of creating a trace definition with Profiler in the "Exporting a Trace Definition" section.
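For reference, these are the status values passed to sp_trace_setstatus (trace ID 2 is only an example):

EXEC sp_trace_setstatus 2, 1;  -- start the trace
EXEC sp_trace_setstatus 2, 0;  -- stop the trace
EXEC sp_trace_setstatus 2, 2;  -- close the trace and delete its definition from the server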
Is_rowset
This one is the least interesting. It is used to identify traces that are being consumed by a rowset consumer such as Profiler or the SMO trace provider.
Using the READER_SPID column, it is easy to see which SPID is
responsible for the trace and then track that user courtesy of his or her login
information.
SELECT trc.id AS Trace_id
      ,trc.[status] AS Trace_status
      ,trc.start_time AS Trace_starttime
      ,trc.last_event_time AS Trace_lasteventtime
      ,trc.event_count AS Trace_eventcount
      ,trc.dropped_event_count AS Trace_droppedeventcount
      ,ses.login_time AS Session_logintime
      ,ses.host_name AS Session_hostname
      ,ses.program_name AS Session_programname
      ,ses.login_name AS Session_loginname
      ,ses.nt_domain AS Session_ntdomain
      ,ses.nt_user_name AS Session_ntusername
      ,ses.cpu_time AS Session_cputime
      ,ses.memory_usage AS Session_memoryusage
      ,ses.reads AS Session_reads
      ,ses.writes AS Session_writes
      ,ses.logical_reads AS Session_logicalreads
      ,ses.last_request_start_time AS Session_lastrequeststarttime
      ,ses.last_request_end_time AS Session_lastrequestendtime
FROM sys.traces trc
JOIN sys.dm_exec_sessions ses ON trc.reader_spid = ses.session_id
WHERE trc.is_rowset = 1
Is_rollover
Tracing can generate a large amount of data. When
writing to a file, this can quickly translate into something large and unwieldy.
When files get beyond a certain size they become more difficult to move
and administer, making housekeeping harder, especially if you need to
move these files over unreliable network links.
By keeping files under 2GB you can also use the Windows compression feature to conserve space.
SQL Trace provides a feature in the file
provider known as rollover. When enabled, SQL Trace will generate a new
linked file when the current file reaches the size set in the MAX_SIZE parameter.
Both the IS_ROLLOVER and the MAX_SIZE values need to
be set for this to work.
If rollover isn't enabled and the file has
reached the MAX_SIZE, then SQL Server will simply stop tracing; the
trace will be stopped and closed, resulting in the definition being deleted
from the system.
Ex: when rolled over, C:\Tmp\MyTrace.trc
would become C:\Tmp\MyTrace_2.trc.
SQL Server attempts to maintain the number
of files at the MAX_FILES number, but if for some reason it is prevented from
deleting the file, then it will just ignore it.
This file would need to be cleaned up
manually. To continue our example, if MAX_FILES had been set to 2 and
C:\Tmp\MyTrace_2.trc was filled, then SQL Server would delete
C:\Tmp\MyTrace.trc before creating C:\Tmp\MyTrace_3.trc.
Is_shutdown
If this flag is set in the trace definition, then SQL Server will shut
itself down if for any reason it isn’t able to write out the trace information
successfully.
Interestingly, the sys.traces catalog view also provides
the number of events dropped by a trace. From this value you can infer just how
dangerous it would be to use this feature with a given trace configuration.
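For example, a quick check of each trace's dropped event count, alongside whether shutdown-on-error is enabled, needs nothing more than sys.traces:
SELECT id
      ,[status]
      ,is_shutdown
      ,event_count
      ,dropped_event_count
FROM sys.traces;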
Is_default
The
default trace is a SQL Trace that is always running and is designed to be a
lightweight trace that ticks along in the background.
SELECT data.*
INTO TraceTable
FROM sys.traces t
CROSS APPLY fn_trace_gettable(t.path, DEFAULT) data
WHERE t.is_default = 1
To convert this into a human-readable form, use some of the user-defined functions and
translate the integers they provide into textual descriptions:
DECLARE @vTraceID INT;
SET @vTraceID = 1;

WITH filter AS
(
    SELECT columnid
          ,CASE logical_operator
               WHEN 0 THEN 'AND'
               WHEN 1 THEN 'OR'
               ELSE 'err'
           END AS logical_operator
          ,CASE comparison_operator
               WHEN 0 THEN ' = '
               WHEN 1 THEN ' <> '
               WHEN 2 THEN ' > '
               WHEN 3 THEN ' < '
               WHEN 4 THEN ' >= '
               WHEN 5 THEN ' <= '
               WHEN 6 THEN ' LIKE '
               WHEN 7 THEN ' NOT LIKE '
           END AS comparison_operator
    FROM fn_trace_getfilterinfo(@vTraceID)
)
SELECT cat.name AS CategoryName
      ,evt.name AS EventName
      ,col.name AS ColumnName
      ,STUFF(( SELECT '|' + child_fil.logical_operator
                    + child_fil.comparison_operator
                    + CAST(child_fil.columnid AS VARCHAR(MAX))
               FROM filter child_fil
               WHERE parent_fil.columnid = child_fil.columnid
               FOR XML PATH('')
             ), 1, 1, '') AS Colfilters
FROM fn_trace_geteventinfo(@vTraceID) evi
JOIN sys.trace_events evt
    ON evi.eventid = evt.trace_event_id
JOIN sys.trace_categories cat
    ON evt.category_id = cat.category_id
JOIN sys.trace_columns col
    ON evi.columnid = col.trace_column_id
LEFT JOIN filter parent_fil
    ON col.trace_column_id = parent_fil.columnid
GROUP BY cat.name
        ,evt.name
        ,col.name
        ,parent_fil.columnid
ORDER BY cat.name
        ,evt.name
        ,col.name
TRACE CATEGORY      | TRACE EVENT
Errors and Warnings | Attention
Errors and Warnings | Exception
Stored Procedures   | RPC:Starting
T-SQL               | SQL:BatchStarting
The Attention and Exception events are generic, whereas the default trace
captures much more specific events. Furthermore, by capturing the RPC:Starting
and SQL:BatchStarting events, the black box trace
captures potentially heavily called events.
The black box can only use two files. Each file can be a maximum
size of 5MB. Therefore the maximum amount of information is constrained
to 10MB. Given that the black box is capturing RPC:Starting and
SQL:BatchStarting events, don't expect the data in these files to
hang around for long!
Sys.trace_categories
The following code demonstrates how to query this catalog
view and also provides the textual descriptions of the category types.
SELECT category_id AS Category_ID
      ,name AS Category_name
      ,[type] AS Category_TypeID
      ,CASE [type]
           WHEN 0 THEN 'Normal'
           WHEN 1 THEN 'Connection'
           WHEN 2 THEN 'Error'
           ELSE 'Unknown'
       END AS Category_TypeName
FROM sys.trace_categories
Sys.trace_events
The sys.trace_events catalog view is also
primarily used for converting the event id back to a text-based description.
SELECT evt.trace_event_id AS Event_ID
      ,evt.name AS Event_name
      ,evt.category_id AS Event_categoryID
      ,cat.name AS Category_name
FROM sys.trace_events evt
JOIN sys.trace_categories cat
    ON evt.category_id = cat.category_id
ORDER BY cat.name, evt.name
Sys.trace_columns
This view contains the
list of all possible columns. A couple of columns on the catalog view do
reveal some interesting information. I'd like to focus your attention on two columns in particular:
the is_repeatable and is_repeated_base bit flags:
SELECT trace_column_id
      ,name
      ,type_name
      ,max_size
      ,is_filterable
      ,is_repeatable
      ,is_repeated_base
FROM sys.trace_columns
WHERE is_repeatable = 1
   OR is_repeated_base = 1
Sys.trace_subclass_values
When an event is raised,
data is passed down from SQL Server’s Trace Controller to the trace session.
This data contains the values for the columns bound to that event.
SELECT cat.name AS category_name
      ,ev.name AS event_name
      ,col.name AS column_name
      ,sub.subclass_name AS subclass_name
      ,sub.subclass_value AS subclass_value
      ,DATALENGTH(sub.subclass_name) AS subclass_name_length
FROM sys.trace_categories cat
JOIN sys.trace_events ev
    ON cat.category_id = ev.category_id
JOIN sys.trace_event_bindings bi
    ON bi.trace_event_id = ev.trace_event_id
JOIN sys.trace_columns col
    ON bi.trace_column_id = col.trace_column_id
LEFT JOIN sys.trace_subclass_values sub
    ON sub.trace_column_id = bi.trace_column_id
   AND sub.trace_event_id = bi.trace_event_id
WHERE sub.subclass_name IS NOT NULL
  AND cat.name = 'Locks'
  AND ev.name = 'Lock:Acquired'
  AND col.name = 'Mode'
ORDER BY cat.name
        ,ev.name
        ,col.name
        ,sub.subclass_value;
However, subclass_value 0 returns a name of "NULL." With that information you probably realize
what the issue is. The value shown is actually the word NULL, not a NULL value
per se.
The hint is the white background, and the additional field of subclass_name_length confirms
this. The value of this field is 8, representing the byte count for the
subclass name. Clearly, then, subclass_name is a Unicode data type, hence the double-byte count for every character.
The following code snippet returns all events that have a subclass_name of
'NULL', and fig 10-8 shows all nine examples of this in SQL Server 2008.
SELECT evt.name
      ,col.name
      ,sub.subclass_name
      ,sub.subclass_value
FROM sys.trace_subclass_values sub
JOIN sys.trace_columns col
    ON sub.trace_column_id = col.trace_column_id
JOIN sys.trace_events evt
    ON sub.trace_event_id = evt.trace_event_id
WHERE sub.subclass_name = 'NULL'
name                       | name | subclass_name | subclass_value
Lock:Deadlock Chain        | Mode | NULL          | 0
Lock:Released              | Mode | NULL          | 0
Lock:Acquired              | Mode | NULL          | 0
Lock:Deadlock              | Mode | NULL          | 0
Lock:Cancel                | Mode | NULL          | 0
Lock:Timeout               | Mode | NULL          | 0
Lock:Escalation            | Mode | NULL          | 0
Lock:Timeout (timeout > 0) | Mode | NULL          | 0
Blocked process report     | Mode | NULL          | 0
It is also worth noting that only eight of the 66 columns use this
feature, as can be seen in the following code snippet. However, these eight columns are used across
111 of the available events, as shown in Figure 10-9.
SELECT COUNT(DISTINCT Sub.trace_column_id) AS distinct_trace_columns
,COUNT(DISTINCT sub.trace_event_id) AS distinct_trace_events
FROM sys.trace_subclass_values sub
distinct_trace_columns | distinct_trace_events
8                      | 111
For reference, the columns that map to specific subclass values are as
follows (the results of which can be seen in Figure 10-10):
SELECT col.trace_column_id AS trace_column_id
      ,col.name AS column_name
      ,COUNT(DISTINCT sub.subclass_value) AS subclass_value_count
FROM sys.trace_subclass_values sub
JOIN sys.trace_columns col
    ON sub.trace_column_id = col.trace_column_id
GROUP BY col.trace_column_id
        ,col.name
ORDER BY col.name
trace_column_id | column_name   | subclass_value_count
31              | Error         | 4
21              | EventSubClass | 23
55              | IntegerData2  | 1
32              | Mode          | 22
28              | ObjectType    | 80
58              | OwnerID       | 7
30              | State         | 1
57              | Type          | 12
sys.trace_event_bindings
The sys.trace_event_bindings catalog view simply provides the association between
all events and all columns to create the permissible list. The best way to
visualize this is probably through the Trace Properties dialog box in Profiler.
The duration column is
available for SQL:BatchCompleted but not available for SQL:BatchStarting. Since
SQL Server hasn’t yet achieved the feat of time travel, it makes sense that it
can only tell you the duration of a batch after it has finished.
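This is easy to confirm from the catalog views; the following illustrative query returns a Duration binding for SQL:BatchCompleted but nothing for SQL:BatchStarting:
SELECT evt.name AS event_name
      ,col.name AS column_name
FROM sys.trace_event_bindings bi
JOIN sys.trace_events evt
    ON bi.trace_event_id = evt.trace_event_id
JOIN sys.trace_columns col
    ON bi.trace_column_id = col.trace_column_id
WHERE evt.name IN ('SQL:BatchStarting', 'SQL:BatchCompleted')
  AND col.name = 'Duration';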
SQL TRACE PROCEDURES AND FUNCTIONS
Sp_trace_create
To start the process off, a trace definition needs to be created. This is
performed by using sp_trace_create. This stored procedure is
responsible for setting the ground rules for the trace.
Note that SQL Trace is able to output only to a file (i.e., the flat file
I/O provider) when invoked via this stored procedure. Moreover, this flat file should be located on
a local drive so as not to incur network I/O.
Don't forget that the SQL Server service account will need
sufficient permission in Windows to write the trace file in the target
directory.
Sp_trace_create will not overwrite a file that already
exists. If you try to do this, you will get an error.
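A minimal sketch of creating a server-side trace definition with rollover enabled; the path, size, and file count are illustrative values, not taken from the text:
DECLARE @TraceID INT;
DECLARE @maxfilesize BIGINT = 50;   -- MAX_SIZE in MB; must be passed as bigint

-- options = 2 enables TRACE_FILE_ROLLOVER; the final parameter (5) is MAX_FILES.
-- The .trc extension is added automatically and the file must not already exist.
EXEC sp_trace_create @TraceID OUTPUT, 2, N'C:\Tmp\MyTrace', @maxfilesize, NULL, 5;

SELECT @TraceID AS trace_id;        -- keep this ID for sp_trace_setevent and sp_trace_setstatus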
Sp_trace_setevent
This stored procedure is used to add events, and the columns to capture for them, to the trace definition. However,
somewhat painfully, it is necessary to call sp_trace_setevent for every column that you wish to capture for the event.
A more accurate name for this procedure would have been sp_trace_seteventcolumn.
The SPID column is always included for
each event added to a trace definition; you cannot exclude it.
If you simply trace SP:StmtCompleted while executing the preceding SELECT statement, you do
not get one row as you might expect.
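A sketch of adding one event to the trace created above, calling sp_trace_setevent once per column; the literal event and column IDs here are assumptions that can be confirmed against sys.trace_events and sys.trace_columns:
DECLARE @TraceID INT = 2;   -- assumed: the ID returned by sp_trace_create
DECLARE @on BIT = 1;        -- the on/off flag must be passed as bit

EXEC sp_trace_setevent @TraceID, 12, 1,  @on;   -- SQL:BatchCompleted, TextData
EXEC sp_trace_setevent @TraceID, 12, 13, @on;   -- SQL:BatchCompleted, Duration
EXEC sp_trace_setevent @TraceID, 12, 14, @on;   -- SQL:BatchCompleted, StartTime
EXEC sp_trace_setevent @TraceID, 12, 11, @on;   -- SQL:BatchCompleted, LoginName
-- SPID is always captured for every event and cannot be switched off.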
Sp_trace_setfilter
As the name suggests,
SP_TRACE_SETFILTER is used when building a server-side trace to filter the
results that are captured to a file.
The SYS.TRACE_COLUMNS catalog view
has a column called IS_FILTERABLE that you can use to check whether you can place
a filter against the columns:
SELECT is_filterable, COUNT(*) AS filter_count FROM sys.trace_columns GROUP BY is_filterable
The three columns that
aren’t filterable are as follows:
1. EventClass
2. EventSequence
3. ServerName
Filtering a Range
In order to create a range filter
with SP_TRACE_SETFILTER it is necessary to execute it twice. The first time
sets the lower boundary; the second time creates the upper boundary.
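For example, a sketch of a range filter on the Duration column (two calls, one per boundary), using the operator codes listed earlier (0 = AND, 4 = >=, 5 = <=); the trace ID and duration values are illustrative:
DECLARE @TraceID INT = 2;           -- assumed: the ID returned by sp_trace_create
DECLARE @lower BIGINT = 1000000;    -- lower boundary (Duration here is in microseconds)
DECLARE @upper BIGINT = 10000000;   -- upper boundary

EXEC sp_trace_setfilter @TraceID, 13, 0, 4, @lower;   -- column 13 = Duration, AND, >=
EXEC sp_trace_setfilter @TraceID, 13, 0, 5, @upper;   -- column 13 = Duration, AND, <=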
Sp_trace_setstatus
This is the stored procedure used to
start, stop, and destroy trace definitions. Its parameters are the trace ID and the status to set: 0 stops the trace, 1 starts it, and 2 closes it and deletes its definition from the server.
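A sketch of the lifecycle for the trace created earlier (the trace ID is an assumption):
DECLARE @TraceID INT = 2;               -- assumed: the ID returned by sp_trace_create

EXEC sp_trace_setstatus @TraceID, 1;    -- 1 = start the trace
EXEC sp_trace_setstatus @TraceID, 0;    -- 0 = stop the trace
EXEC sp_trace_setstatus @TraceID, 2;    -- 2 = close the trace and delete its definition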
If you are planning to open trace files with SQL Profiler,
try to ensure that the file sizes are manageable by rolling the file fairly
frequently. Profiler also tries to be helpful by offering to load in subsequent
rolled over files for you, which is nice.
Sp_trace_generateevent
It enables you to instrument your SQL
code with user-defined events. It is quite limited, as you are only given 10
distinct user-defined events to play with. Nor is it frequently used by application
developers, who normally instrument code outside of the stored procedure layer.
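A minimal sketch; event IDs 82 to 91 map to the ten UserConfigurable:0 through 9 events, and the message text is purely illustrative:
-- Raise UserConfigurable:0 (event ID 82) with a short informational message.
-- It is only captured by traces that actually include the UserConfigurable:0 event.
EXEC sp_trace_generateevent 82, N'Order import batch started';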
Fn_trace_geteventinfo
It returns all the events and
columns specified for a currently defined trace. It’s essentially an enabling
function that is used by tool providers and DBAs who want to know what events
are being captured in trace sessions.
SELECT evt.name AS EventName
      ,col.name AS ColumnName
FROM fn_trace_geteventinfo(1) inf
JOIN sys.trace_events evt
    ON inf.eventid = evt.trace_event_id
JOIN sys.trace_columns col
    ON inf.columnid = col.trace_column_id
ORDER BY inf.eventid
        ,inf.columnid
Fn_trace_gettable
It is the most interesting and
useful in everyday use. This function enables you to read a trace file even
while it is open and being written to. It can therefore be used to process trace
data collected in trace files and load it into tables in SQL Server for a more
detailed analysis.
SELECT data.*
FROM sys.traces t
CROSS APPLY fn_trace_gettable(t.path, DEFAULT) data
WHERE t.is_default = 1
ORDER BY EventSequence
SECURING SQL TRACE
SQL Server 2005 changed the security
dynamic for tracing. A new permission was created called ALTER TRACE.
GRANT ALTER TRACE TO Trace;
TRACING LOGIN CREATION / DELETION
CREATE LOGIN Trace WITH PASSWORD = 'XY', CHECK_POLICY = OFF;
CREATE LOGIN [DOMAIN\USERS] FROM WINDOWS;
DROP LOGIN [DOMAIN\USERS];
DROP LOGIN Trace;
Logins Changing Passwords
There are two ways a SQL Login password can be changed:
- SP_PASSWORD
- ALTER LOGIN
USE master;

CREATE LOGIN Trace WITH PASSWORD = 'XYZ', CHECK_POLICY = OFF;
GRANT ALTER TRACE TO Trace;

EXECUTE AS LOGIN = 'Trace';

ALTER LOGIN Trace WITH PASSWORD = 'NEW XYZ' OLD_PASSWORD = 'XYZ';

REVERT;

DROP LOGIN Trace;
Securing the Output of SQL Trace
There are two I/O
providers for SQL Trace events:
- Flat-file provider
- Rowset provider
Rowset Provider is used by Profiler
Flat-file Provider is used by Server-side tracing.
Rowset Provider Protection
·
Save to a table
·
Save to a file
(which may or may not be XML)
PROFILER
Profiler is a graphical tool (GUI) that is
often used to run traces against production systems.
It consumes CPU and makes heavy use of
the TEMP folder on your production system; it is also an implementation of a
rowset provider.
ADVANCED FEATURES OF
PROFILER
Profiler allows you to group by one
or many events. However, you can also
right-click on the events and select a different view. This view is called
grouped view.
However, an alternative method would
be to group by ClientProcessID, SPID, HostName, NTUserName, LoginName, or
SessionLoginName to help isolate a single user in the context of all other
event data.
For SQL statements these options only work for the SQL:BatchStarting,
SQL:BatchCompleted, RPC:Starting, and RPC:Completed events.
First, create a trace in Profiler that includes the
following events and start it running:
- SQL:BatchStarting
- Showplan XML for Query Compile
- Showplan XML Statistics Profile
To make it easier to extract events, you
may want to consider grouping the events by the EventClass as shown in the
previous sub-section, "Grouping Events".
The following additional events will enable you to drill
down to the statement level and compare and review execution plans, albeit at
the expense of increased overhead in the data collection:
- Performance: Show Plan Statistics
- Stored Procedures: SP:StmtCompleted
- Stored Procedures: SP:StmtStarting
- T-SQL: SQL:StmtCompleted
- T-SQL: SQL:StmtStarting
CHAP: 14
Using Management
Studio Reports and the Performance Dashboard
Cover:
·
How to get basic troubleshooting information by pointing
and clicking in SQL Server Management Studio
·
Where to get more reports from MS for even deeper
information
·
The basics of building your own reports
CHAP 14: USING MANAGEMENT STUDIO REPORTS AND THE PERFORMANCE
DASHBOARD
In SQL Server 2008, Microsoft shifted
focus away from SSMS reports in favour of the new Management Data Warehouse
(MDW).
The MDW addresses some weaknesses of the
standard reports and performance dashboard.
USING THE STANDARD REPORTS
READING THE SERVER
DASHBOARD REPORT
It’s simply a summary of the most basic
configuration options on the server and some very rudimentary connection
statistics.
CONFIGURATION
CHANGES AND SCHEMA CHANGES REPORTS
If the server dashboard report shows a
configuration change, the configuration changes history report can help the
database administrator find out when it was made, and by whom.
Likewise,
the SCHEMA CHANGES HISTORY report shows all DDL changes made in all databases on the
instance.
Both reports read their data from the default trace, so a change will no longer appear in them if either of the following has happened:
1.
the SQL Server service has been restarted five times since
the change was made; or
2.
the default trace has accumulated more than 100MB of data
since the change.
JOB monitoring: If the data capture is
performed via a SQL Agent job, then a malicious user could simply turn off the
job or disable the schedule.
Memory Consumption Report
It provides a current snapshot of how SQL Server is utilizing memory. This
report is only a one-time peek into SQL Server’s internals, and so DBAs should
not make configuration decisions based on this report.
When DBAs find themselves needing the kind of information in this report,
it's time to consider using the Management Data Warehouse (MDW) instead.
ACTIVITY REPORTS
These reports give the DBA
better insight into CPU and memory issues on the server. However, viewing all
current queries through Activity Monitor or by running sp_who can seem like drinking water from a fire
hose.
If a user
calls the IT department to complain that a server is busy, the DBA can check
the following reports:
- ACTIVITY – all Blocking Transactions: This report is similar to running an sp_who2 and looking at the BlkBy column to see if any transaction is blocking another transaction.
- ACTIVITY – all Cursors: If a slow cursor-based query or, heaven forbid, a nested cursor, is chewing up a great deal of resources on a server, it’s visible in this report. This is also useful for cursors that have not been closed, if a user leaves SSMS open without finishing their running code.
- ACTIVITY – Top Sessions: This report, shown in fig: 14-7, breaks out the oldest, most CPU-intensive, memory-intensive, read-intensive, and write-intensive sessions.
These reports get their data from DMVs such as
sys.dm_exec_sessions, sys.dm_exec_connections, and sys.dm_exec_requests.
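The report queries themselves are not reproduced here, but a rough hand-rolled equivalent of the blocking check can be built on the same DMVs; this is a sketch, not the report's actual query:
SELECT req.session_id
      ,req.blocking_session_id
      ,req.wait_type
      ,req.wait_time
      ,ses.login_name
      ,ses.host_name
      ,ses.program_name
FROM sys.dm_exec_requests req
JOIN sys.dm_exec_sessions ses
    ON req.session_id = ses.session_id
WHERE req.blocking_session_id <> 0;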
PERFORMANCE EXECUTION REPORTS:
The
performance – Object Execution Statistics report is a great example of how the
Standard Reports come very close to providing value, yet fall just a bit short.
It makes
a good example of how to build better replacement reports.
The DMV sys.dm_exec_query_stats joins to sys.dm_exec_sql_text
to get the exact SQL executed:
SELECT sql_handle
      ,sql_handle AS chart_display_option
      ,sql_handle AS chart_display_optionio
      ,master.dbo.fn_varbintohexstr(sql_handle)
      ,DENSE_RANK() OVER (ORDER BY s2.dbid, s2.objectid) AS SPrank
      ,s2.dbid
      ,s2.objectid
      ,(SELECT TOP 1 SUBSTRING(text, (s1.statement_start_offset + 2) / 2,
            (CASE WHEN s1.statement_end_offset = -1
                  THEN LEN(CONVERT(NVARCHAR(MAX), text)) * 2
                  ELSE s1.statement_end_offset
             END - s1.statement_start_offset) / 2)
        FROM sys.dm_exec_sql_text(s1.sql_handle)) AS [SQL Statement]
      ,execution_count
      ,plan_generation_num
      ,last_execution_time
      ,((total_worker_time + 0.0) / execution_count) / 1000 AS [avg_worker_time]
      ,total_worker_time / 1000.0
      ,last_worker_time / 1000.0
      ,min_worker_time / 1000.0
      ,max_worker_time / 1000.0
      ,max_worker_time / 1000.0
      ,((total_logical_reads + 0.0) / execution_count) AS [avg_logical_reads]
      ,total_logical_reads
      ,last_logical_reads
      ,min_logical_reads
      ,max_logical_reads
      ,((total_logical_writes + 0.0) / execution_count) AS [avg_logical_writes]
      ,total_logical_writes
      ,last_logical_writes
      ,min_logical_writes
      ,max_logical_writes
      ,((total_logical_writes + 0.0) / execution_count
        + (total_logical_reads + 0.0) / execution_count) AS [avg_logical_IO]
      ,total_logical_writes + total_logical_reads
      ,last_logical_writes + last_logical_reads
      ,min_logical_writes + min_logical_reads
      ,max_logical_writes + max_logical_reads
FROM sys.dm_exec_query_stats s1
CROSS APPLY sys.dm_exec_sql_text(sql_handle) AS s2
WHERE s2.objectid IS NOT NULL
  AND DB_NAME(s2.dbid) IS NOT NULL
ORDER BY s1.sql_handle;
Backup and Restore
Reports:
DBAs often troubleshoot backup job times by looking at the
runtime durations of the SQL Server Agent jobs. For more granular detail,
right-click on an individual database and choose the Backup and Restore Events
reports.
SELECT t1.database_name
      ,t1.type AS [type]
      ,(AVG(DATEDIFF(ss, backup_start_date, backup_finish_date))) / 60.0 AS AverageBackupDuration
      ,COUNT(*) AS BackupCount
FROM msdb.dbo.backupset t1
INNER JOIN msdb.sys.databases t3
    ON t1.database_name = t3.name
GROUP BY t1.database_name, t1.type
ORDER BY t1.database_name, t1.type
database_name           | type | AverageBackupDuration | BackupCount
DB_Restore              | D    | 1.3                   | 1
DB_TABLELIST            | D    | 0.05                  | 7
DB_TABLELIST            | L    | 0                     | 10
EBILL                   | D    | 10.55                 | 3
FTR_SR_MASTER           | D    | 23.633333             | 4
MASTER                  | D    | 0.016666              | 2
MIS_APR2014             | D    | 24.7                  | 1
MODEL                   | D    | 0                     | 1
MSBill_2014             | D    | 6.7                   | 1
MSDB                    | D    | 0.116666              | 1
REPEAT_APR2014          | D    | 25.283333             | 1
SelfcarePrePaid_APR2014 | D    | 0                     | 1
SRFTR_APR2014           | D    | 3.05                  | 1
In the result set, the
Type column indicates the type of backup:
TYPE | DETAILS
D    | FULL
I    | DIFFERENTIAL
L    | TRANSACTION LOG
DB-level reports can be
run for all databases by calling that T-SQL code with the system stored procedure sp_msforeachdb.
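As a sketch (sp_MSforeachdb is undocumented, so treat it accordingly), a database-level query can be pushed to every database like this, with the '?' token standing in for each database name:
EXEC sp_MSforeachdb
    N'USE [?]; SELECT DB_NAME() AS database_name, COUNT(*) AS table_count FROM sys.tables;';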
On the SSMS report,
clicking the plus (+) sign in the Device Type column yields more information
about each individual backup file. This comes in handy when troubleshooting a
broken LSN chain or problems with log shipping.
Sometimes end users or developers do their own full
backups out of sequence with the DBA's backup and recovery scheme, and the
DBA will need to track down the location of the out-of-control backup file.
Index Usage Statistics
Report
Adding indexes can be the fastest way to solve a
performance problem. Repeated applications of that solution can be a pretty
quick way to cause a performance problem.
To balance faster
reads against slower inserts, updates, and deletes, the database administrator needs
to know how each index is used and how much it is being updated.
If an index is
heavily used for weeks and then suddenly is no longer needed due to a query
change or the presence of a better index, the number of User Seeks and User
Scans will still seem high.
To mitigate this
issue, the DBA needs to check the Last User Seek, Last User Scan, and Last User
Lookup Time columns.
If they are old,
then the index has not been used recently even though the Seeks and Scans
number will be high.
If a database is
only used once per month or once per quarter, the Last User Seek, Scan, and
Lookup times won’t be updated.
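The counters described above appear to come from the sys.dm_db_index_usage_stats DMV, so a hand-rolled check along these lines (a sketch, not the report's exact query) surfaces the same seek, scan, update, and last-used figures:
SELECT OBJECT_NAME(us.[object_id]) AS table_name
      ,i.name AS index_name
      ,us.user_seeks
      ,us.user_scans
      ,us.user_lookups
      ,us.user_updates
      ,us.last_user_seek
      ,us.last_user_scan
      ,us.last_user_lookup
FROM sys.dm_db_index_usage_stats us
JOIN sys.indexes i
    ON us.[object_id] = i.[object_id]
   AND us.index_id = i.index_id
WHERE us.database_id = DB_ID()      -- current database only
ORDER BY us.user_updates DESC;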
Index Physical Statistics
Reports
As data is inserted,
updated, and deleted in a table, the table and its indexes become fragmented.
Index fragmentation
slows down sequential operations such as range scans and read-ahead because the
storage engine needs to hunt around for all the relevant pages.
This report shows
all clustered and nonclustered indexes in the database and their depth – the
number of levels in the b-tree of the index.
Microsoft’s
whitepaper on index fragmentation in SQL Server 2000 indicates that DBAs should
not be concerned with objects under 1,000 pages in size:
http://technet.microsoft.com/en-us/library/cc966523.aspx
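A sketch of checking depth and fragmentation directly with sys.dm_db_index_physical_stats, ignoring objects under 1,000 pages as the whitepaper suggests; LIMITED mode keeps the check itself cheap:
SELECT OBJECT_NAME(ps.[object_id]) AS table_name
      ,i.name AS index_name
      ,ps.index_type_desc
      ,ps.index_depth                     -- number of levels in the b-tree
      ,ps.page_count
      ,ps.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') ps
JOIN sys.indexes i
    ON ps.[object_id] = i.[object_id]
   AND ps.index_id = i.index_id
WHERE ps.page_count >= 1000               -- ignore small objects
ORDER BY ps.avg_fragmentation_in_percent DESC;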
The easiest way to stay on top of
fragmented objects is to use maintenance plans to periodically rebuild or
reorganize all indexes in a database. Unfortunately, this causes a lot of logged
and locked operations, which affects performance while the maintenance plan
runs.
For databases over 100GB, consider
writing scripts to reorganize and/or rebuild only the fragmented objects. Rebuilding
only the most fragmented objects will reduce the performance impact on database
mirroring and log shipping, which both slow down dramatically during index
rebuilds.
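A minimal sketch of that approach: generate REORGANIZE or REBUILD statements only for fragmented indexes. The 5 percent and 30 percent thresholds are common rules of thumb, not values from the text:
DECLARE @sql NVARCHAR(MAX) = N'';

SELECT @sql = @sql
    + N'ALTER INDEX ' + QUOTENAME(i.name)
    + N' ON ' + QUOTENAME(OBJECT_SCHEMA_NAME(ps.[object_id]))
    + N'.' + QUOTENAME(OBJECT_NAME(ps.[object_id]))
    + CASE WHEN ps.avg_fragmentation_in_percent >= 30
           THEN N' REBUILD;'              -- heavy fragmentation: rebuild
           ELSE N' REORGANIZE;'           -- moderate fragmentation: reorganize
      END + CHAR(13) + CHAR(10)
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') ps
JOIN sys.indexes i
    ON ps.[object_id] = i.[object_id]
   AND ps.index_id = i.index_id
WHERE ps.avg_fragmentation_in_percent >= 5
  AND ps.page_count >= 1000
  AND i.name IS NOT NULL;                 -- skip heaps

PRINT @sql;   -- review the generated statements, then run them with sp_executesql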
USING THE PERFORMANCE
DASHBOARD
Microsoft's Production Support Services (PSS) team for SQL
Server created a set of custom reports for SQL Server 2005 SP2 called the
Performance Dashboard.
These
reports gave DBAs more powerful troubleshooting information about the health of
their server.
Sadly,
the reports were not updated for SQL Server 2008, but with a few easy tweaks,
the reports will work fine. Keep in mind, however, that they won’t account for
new SQL Server 2008 features like filestream data or spatial data.
SUMMARY
In SQL
Server 2008 and 2005, MS’s new dynamic management views gave DBAs insight into
what’s happening inside the Storage Engine, and the SSMS reports helped surface
that information.
- Using the bundled Standard Reports to get a quick view of basic health metrics
- Installing the Performance Dashboard Reports to deliver more insightful information on problems
- Building their own reports modeled on the Performance Dashboard Reports for truly powerful analysis.
15: USING SQL SERVER
MANAGEMENT DATA WAREHOUSE
·
Purpose of the Management Data Warehouse (MDW)
·
Configuring the MDW
·
Creating custom collectors
·
Reporting on the MDW
MDW Architecture
The MDW
consists of the following three key components:
1.
Data collection sets
2.
Data warehouse
3.
Reports
The Data Collection Sets are defined as SQL Server Agent
jobs used to collect and upload data from target servers to the data warehouse.
The data warehouse is a centralized repository for performance data storage.
Reports are executed against the data warehouse to present the data stored in it.
DATA COLLECTION SETS
It is
necessary to retrieve performance data from the targets and store it in the
data warehouse. Performance data collection is carried out by a user-mode
process called DCEXEC.EXE that runs on each target server.
This
application is called by a SQL Server Agent job and is responsible for
collecting performance data based on the collection items defined in the
collection set.
Data
collection sets can be configured in either cached mode or non-cached mode.
The
Server Activity collection set consists of one Generic T-SQL collector
type used to snapshot the following DMVs:
- dm_os_waiting_tasks
- dm_os_latch_stats
- dm_os_process_memory
- dm_os_memory_nodes
- dm_os_schedulers
- dm_io_virtual_file_stats