Hello Dear Reader!
Last week I posted the invitation
to T-SQL Tuesday #73. The premise?
“As you work with SQL Server, look around you. Is your
environment Naughty or Nice? If it is Naughty, what’s wrong with it?
What would you do to fix it? Do you have a Scrooge that is giving you the
Christmas chills? Perhaps you have servers of past, present, and future
haunting you. Maybe you are looking at SQL Server 2016 like some bright
shining star in the east.”
I don’t have an environment of my own, not one that I get to interact with every
day. Recently, though, I’ve had several experiences where friends at various
companies were having issues, and the root cause in each case was hardware:
memory, SAN, or CPU had become a bottleneck.
Several had to do with I/O.
One had issues with their SAN, another had code causing
unnecessary I/O, and another had storage issues on their Azure VM.
“So Balls,” you say, “What does this have to do with Naughty
& Nice?”
As always Dear Reader, thanks for keeping me on track. This week I’m going to tackle how IOPS
can affect your SQL Server. I’ll combine that with pseudo code that
mimics the issue my client had, then a
little on Azure VMs, and how Premium Storage made a big difference.
THE SETUP
I’ve got a really bad query. It’s
from years past. Lots of unnecessary
row-by-row logic. Duplicate logic. It’s bad.
Suffice it to say, I’m rewriting it as pseudo code because I don’t
want to risk insulting the guilty. I
will post the stats, though.
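Here is a minimal sketch of that row-by-row shape. The table and column names are invented stand-ins, not the client’s actual code:

/* Hypothetical stand-in for the client's row-by-row logic.
   Table and column names are invented for illustration. */
DECLARE @ID INT, @Amount MONEY;

DECLARE cur CURSOR FAST_FORWARD FOR
    SELECT ID, Amount FROM dbo.StagingOrders;

OPEN cur;
FETCH NEXT FROM cur INTO @ID, @Amount;

WHILE @@FETCH_STATUS = 0
BEGIN
    /* A singleton lookup plus a singleton write per row adds up
       to a lot of unnecessary I/O. */
    IF EXISTS (SELECT 1 FROM dbo.Orders WHERE ID = @ID)
        UPDATE dbo.Orders SET Amount = @Amount WHERE ID = @ID;
    ELSE
        INSERT INTO dbo.Orders (ID, Amount) VALUES (@ID, @Amount);

    FETCH NEXT FROM cur INTO @ID, @Amount;
END

CLOSE cur;
DEALLOCATE cur;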
On the hardware side, we were on an Azure D13 Series VM, and we moved to a DS13 Series VM. This fixes nothing by itself. Click
here to see the machine specs, or see
below for a quick table. For the demo we will be
using the DS3 VM, moving from
standard storage to premium storage on the same machine to highlight our fix.
SIZE – AZURE CLASSIC PORTAL\CMDLETS & APIS | CPU CORES | MEM (GB) | NICS (MAX) | MAX. DISK SIZES – VIRTUAL MACHINE | MAX. DATA DISKS (1023 GB EACH) | CACHE SIZE (GB) | MAX. DISK IOPS & BANDWIDTH
Standard_DS3\same (OUR DEMO MACHINE) | 4 | 14 | 4 | OS = 1023 GB; Local SSD disk = 28 GB | 8 | 172 | 12,800; 128 MB per second
Standard_DS13\same | 8 | 56 | 8 | OS = 1023 GB; Local SSD disk = 112 GB | 16 | 288 | 25,600; 256 MB per second
A DS Series VM gives us the ability to use Premium Storage. We will use that to turn bad code into faster
bad code. Fixing the code comes later. The first thing I will do is run CrystalDiskMark against my disks.
Our G drive is standard storage;
our M drive is premium storage.
The M drive is a 1 TB Premium Storage disk. I have done nothing fancy here, just attached and formatted it. As you can see, we received a little
better than 128 MB per second sequential in my baseline.
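CrystalDiskMark measures the disk from the outside. You can sanity-check the same thing from inside SQL Server; here is a minimal sketch, assuming a database named DemoDB, using sys.dm_io_virtual_file_stats to see average latency per file:

/* Average read/write latency per file as SQL Server sees it.
   'DemoDB' is a placeholder name; substitute your own database. */
SELECT  DB_NAME(vfs.database_id) AS database_name,
        mf.physical_name,
        vfs.num_of_reads,
        vfs.io_stall_read_ms / NULLIF(vfs.num_of_reads, 0)   AS avg_read_ms,
        vfs.num_of_writes,
        vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_ms
FROM    sys.dm_io_virtual_file_stats(DB_ID('DemoDB'), NULL) AS vfs
JOIN    sys.master_files AS mf
        ON  mf.database_id = vfs.database_id
        AND mf.file_id     = vfs.file_id;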
Our really bad query? It runs
in 19 minutes 48 seconds on our standard storage.
When I found it in the wild it actually ran for 4 hours; I could only replicate so much. Suffice it to say, for this example it will do.
Without tuning the query, just moving the database to premium storage,
dropping all data from memory, and rerunning it, we reduced the time to
5 minutes 49 seconds. That is roughly a 71% improvement.
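For clarity, “dropping all data from memory” means flushing the buffer pool so the rerun has to read from disk again. A minimal sketch of the test harness (test systems only, never production; the procedure name is a made-up stand-in for the bad query):

/* Flush dirty pages, then empty the buffer pool so the next run
   reads from disk. Test systems only. */
CHECKPOINT;
DBCC DROPCLEANBUFFERS;

/* Time the rerun and show the I/O it generates. */
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

EXEC dbo.ReallyBadQuery;  /* hypothetical stand-in for the bad query */

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;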
With a DS14 we would have been able to go up to 512 MB per second. Premium storage on DS Series VMs is some serious good stuff.
THE WHOLE TRUTH
The query is bad. Really
bad. I can fix the whole thing by
converting it from row-by-row logic to set-based logic. I can get it down to 2 seconds using the
MERGE statement.
We did. It was a big fix. Originally we were hoping to get the process
from 4 hours down to 2 hours. After I
was done we had it down to 5 minutes.
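For comparison with the cursor sketch above, here is the set-based shape of that kind of fix, using the same invented table names rather than the client’s actual statement:

/* Set-based rewrite of the cursor sketch above: one statement,
   one pass over the data, instead of one round trip per row. */
MERGE dbo.Orders AS tgt
USING dbo.StagingOrders AS src
    ON tgt.ID = src.ID
WHEN MATCHED THEN
    UPDATE SET tgt.Amount = src.Amount
WHEN NOT MATCHED THEN
    INSERT (ID, Amount) VALUES (src.ID, src.Amount);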
I say this often when teaching:
I can throw super-fast hardware at bad code and make it run faster, but there is still room for improvement. Fixing the query is for another day. Today we keep our IO on the ball. Why is it so fast?
CPU | Memory | SSD | Spinning Disk
Nanosecond | Nanosecond | Microsecond | Millisecond
You can only go as fast as your slowest component. Premium storage uses SSDs. We are hastening our retrieval of data from disk by a factor of ten. In this case it made our query roughly 71% faster. There are other things to fix; the IO issue is just one of them.
“So Balls,” you say, “Keep our IO on the ball. Really?!”
Sorry Dear Reader, I couldn’t resist.
As always, Thank You for stopping by.
Thanks,
Brad