Wednesday, March 19, 2014

HANA versus SQL Server

In previous posts, I reported some basic performance data I'd obtained from HANA.

I needed a frame of reference to understand what these numbers actually meant, so I set up a Microsoft SQL server in Azure, loaded the same weather data and timed the same query.  I repeated the query 8 times and computed the average CPU processing time.

On average, the query took 32.781 seconds.  The effective scan rate was 981,000 rows per second.

By comparison, the same query on the same data ran in 0.396 seconds in HANA.

In this experiment, HANA's scan rate was approximately 100 times greater than SQL Server.

I had expected the SQL Server to do better, so some further investigation seemed warranted.

Could differences in hardware explain the poor SQL Server result?

I used SQL Server 2012 Standard running on an Azure "Extra Large" VM. This configuration has 8 cores and 16GB RAM.  The number of cores matches HANA's configuration but has less RAM. Although my HANA instance has 60GB RAM, it has never used more than 18GB.  I monitored RAM usage and paging closely during the experiment.  The SQL Server topped out at about 12GB RAM usage and paging was negligible.

I also monitored disk utilization and noticed that the C: drive spent much of the time at 100% utilization. By default, SQL Server places database files on the C: drive, which means access to the database competes with the operating system for access to the hard drive. So, I moved the database to the D: drive and re-ran the experiment.  It didn't make any difference.

Azure's only promises about 60 mbps performance for disks attached to it's VM's (link). This is, at best a modest number and I believe significantly hampered the SQL Server.

A faster disk would make big difference.

By comparison, I found these results for Amazon EC2: they ranged from 100 - 200 mbps. The solid-state disk in my laptop ranged from 287 mbps to 476 mpbs. Building the SQL Server in Amazon may have doubled its performance.  Building a SQL Server with solid state disks or RAID (actual hardware), could quadruple performance.

Conclusion: Azure's limited disk performance significantly hampers SQL Server performance.

Would indexing make any difference?

I ran the query through the SQL Server optimizer, which suggested an index.  I created that index (which took over an hour) and repeated the experiment.  The index made the query run slower by a factor of about 3 times.

The index suggested by the query optimizer involved 5 columns. I tried two single-column indexes designed to make the join and "where clause" go faster.  These indexes also slowed the query down.

Conclusion: Indexing didn't help.  In fact, it made things worse.

No comments: