A little while ago, my wonderful wife Renee got a Solid State Drive (SSD) for me. Being a DBA, one of my first thoughts was to wonder what kind of effect this would have on database performance, since I use my home computer for a lot of testing and my writing projects. So, I installed the drive in my machine (a slightly more complicated process than I expected that wound up involve type III paracord…) and reinstalled my necessary software.
Then, being a father of a three as well as a database professional and a grad student, I proceeded to put that idea aside for some time. But I wanted numbers and a direct performance comparison, and thought they would be of interest to the broader database community. So, once I found some time, I began running some tests for direct comparisons.
II. A Brief Background on SSDs
Traditional harddrives store the data on a small array of platters using magnetic charges. In order to read the platters or write to them, the movable read/write heads must be properly positioned over the platters. Most solid state drives are based on flash memory and often involve no moving parts whatsoever.
These different techniques to store data create several broad differences between HDDs and SSDs:
- SSDs tend to have dramatically lower latencies since they can access the entire drive all the time without having to align the head with the part of the drive to be read.
- SSDs tend to be much more resistant to being moved around while doing read/write operations.
- HDDs tend to be substantially larger and have a much better size/price ratio.
I carefully hedged by using “tend” rather than a more definitive word. Both SSD and HDD technologies are still evolving and how large the differences are, or even if they hold true, can vary when comparing different SSDs with different HDDs and with the different circumstances they are used under. But those three broad trends will hold true the vast majority of the time.
SSDs, because of the way they write to the drive, can be subject to write-cycle exhaustion. This happens as parts of the drive are overwritten too many times and eventually cannot be written to any more. To dramatically oversimplify this topic:
- According to Tom’s Hardware, it is rarely something a consumer making standard use of their drive needs to worry about.
- Ars Technica points out that many SSDs targeted at the enterprise use SLC instead of the consumer grade MLC flash, and SLC is much less susceptible to write-cycle exhaustion, but more expensive.
- The risk of write-cycle exhaustion can be significant under certain circumstances though, such as using a consumer grade MLC drive in a write intensive enterprise-style environment
There are also hybrid drives which combine relatively large standard harddrives with a smaller cache of flash memory which is used to reduce latency.
As for my personal experience, the change in overall performance of my system when I went from using a HDD to using an SSD as my primary system drive with the HDD for bulk storage was dramatic. While I have not measured it precisely, the system boot time is now substantially shorter, as are load times for all programs. The shortened load time is particularly noticeable in Spyder, a python IDE I use that used to take nearly a minute to fully load and now takes just a few seconds. The computer as a whole is subjectively more responsive and “snappy”. I think that for many modern computers, moving to a SSD is perhaps one of the best ways to improve the overall performance of the system.
III. A Word on Methodology
Here, I am essentially looking at performance difference as they pertain to databases between the two drives in my home computer. While I think this generalizes relatively well to give an idea of the database performance implications of SSD’s generally, it is important to emphasize that in the end my sample size is one standard harddrive and one SSD. More than that, these are both consumer grade drives sitting in desktop computer. In short, while I think this comparison is instructive, it is not a rigorous comparison of enterprise class drives that would be used in corporate servers, much less a look at large scale SANs which can readily incorporate multiple types of storage medium in complex tiers.
I selected two slightly different tests to time. First, I took the Python script I wrote to compare the performance of SQL Server and SQLite and ran it with just the target drive changed. Then I ran some of the long running scripts from Jeff Moden’s excellent Performance Tuning: Concatenation Functions and Some Tuning Myths article.
In this setup, I am using the same installation of Windows and SQL Server 2012, both installed on the SSD, and only the location of the database is really changing. The relevant system specifications for reference are:
|Processor||Intel Core i7 3.4GHz|
|OS||Windows 7 Home|
|SQL||SQL Server 2012 Expres|
IV. The Tests
a. Rerunning the SQL Server and SQLite Tests
A little while ago I wanted to do a direct comparison between SQL Server and SQLite in a single user, desktop environment. Since it already had a system with speed measurements in place, I reran all the tests twice, once with the destinations for the database pointing to the HDD and once with them pointing to the SSD. The full details of the tests are in the previous article, but to briefly summarize, the test program creates a SQL Server and a SQLite table and then repeatedly inserts into the tables, sending a commit after each statement. Then it repeatedly inserts into the tables, sending a commit only at the end, and then attempts to read from both of the tables. The results are:
|0.917 secs for 5000 inserts SQL Server (commit each)10.386 secs for 5000 inserts SQLite (commit each)0.435 secs for 5000 inserts SQL Server (commit once)0.020 secs for 5000 inserts SQLite (commit once)1.556 secs for 5000 reads SQL Server
1.960 secs for 5000 reads SQLite
|2.120 secs for 5000 inserts SQL Server (commit each)538.364 secs for 5000 inserts SQLite (commit each)0.402 secs for 5000 inserts SQL Server (commit once)0.459 secs for 5000 inserts SQLite (commit once)1.518 secs for 5000 reads SQL Server
1.890 secs for 5000 reads SQLite
As expected, with a commit after each insert, which forces the server to write after every single insert, the SSD performs dramatically better. It surprised me somewhat that the HDD came out better on the SQL Server inserts with one commit and even slightly faster on the SQL server reads. Although I am not certain, I suspect this is because of the sophisticated caching that both the harddrive and SQL Server itself do. So, I cleared the cache before executing the reads and got:
|SSD (Cache cleared)||HDD (Cache cleared)|
|1.564 secs for 5000 reads SQL Server||1.655 secs for 5000 reads SQL Server|
This increased the time for reads on both drives slightly and shifted the advantage to the SSD where it was expected.
b. Jeff Moden’s Concatenation Tests
Next, I turned back to Jeff Moden’s concatenation article. This article came to mind because it helped give me more insight into performance tuning when I was much earlier in my SQL career and because it included a nice long running “Hog” Test that in his article took over 2 minutes and 50 seconds on the hardware he was using at the time. So, I ran his setup code, and then created the function in the article, each twice, once on a database sitting on the SSD and once on a database sitting on the HDD. Then I cleared the cache and ran
SELECT DISTINCT SomeID, dbo.fnConcatTest(SomeID) AS CSVString
Against each of them to get:
|SSD (Cache cleared)||HDD (Cache Cleared)|
|7 Secs||22 Secs|
Clearly the SSD helped substantially. This should not in any way to be taken to mean that improving the hardware can replace tuning the code though. The “STUFF” version he provided at the end of his article had sub-second response times on the HDD, which is a nice reminder that frequently you can get better improvements by optimizing the code than by improving the hardware.
As the numbers show, the SSD generally performs better. Though how much better varied by the circumstances, with SQL Server’s sophisticated caching system letting the HDD perform or even top the SSD under certain limited circumstances.
In most consumer and workstation systems currently in use, I suspect switching from a HDD to a SSD is one of the most effective upgrades currently available. Moving purely to SSDs is probably prohibitively expensive for most large database applications, however they can serve as an effective part of a SAN to hold the most active data or to hold selected parts of the database which particularly require low latency with the bulk data held on HDDs in a simpler storage configuration.
VI. More References
- The SSD Revolution – A series of feature articles on SSDs by ArsTechnica.
- In-depth on how SSDs really work- By Lee Hutchinson. This is part of ArsTechnica’s SSD Revolution series, but goes into great depth on SSDs work and what those inner workings imply. This is well worth reading.
- Is Your SSD More Reliable Than a Hard Drive by Andrew Ku
 It was significant to clear the cache here, without doing so the SSD and HDD times were identical.