Archive for April, 2009

Access vs SQL Server: some stats (part 1)

Microsoft Access In the perspective of upsizing my current Access 2007 application, I have been trying to understand a bit more about the possible performance impact of various choices of Primary Keys.

My problem is that currently, the Access application uses autoincrement numbers as surrogate Primary Keys (PK). Since I will need to synchronise the data over multiple remote sites, including occasionally disconnected clients, I can’t use the current autoincrement PK and will need to change to GUID.

To see for myself what could be the impact, I made a series of benchmarks.
This first part is fairly simple:

  • Populate a Product table that contains 3 fields: ID, SKU and Designation with 1,000,000 records.
  • Test natively on SQL Server and Access 2007.
  • The records are inserted in transactions batches of 1000 records.
  • I collect the time taken for each of these transactions and plot it.

Test setup

Nothing much to say about that:

All tests are performed on a dedicated Windows Server 2008 x64 rack running Access 2007 and SQL Server 2008 Standard (SP1) x64.

Test database

In SQL Server, we created a database with two tables ProductGUID and ProductInt:

CREATE TABLE ProductGUID (
    ID UNIQUEIDENTIFIER DEFAULT NEWSEQUENTIALID(),
    SKU NVARCHAR(32) NOT NULL,
    Description NVARCHAR(255) NULL
);
CREATE CLUSTERED INDEX ProdGUIDix ON ProductGUID(ID);
GO
CREATE TABLE ProductINT (
    ID INT IDENTITY(1,1),
    SKU NVARCHAR(32) NOT NULL,
    Description NVARCHAR(255) NULL
);
CREATE CLUSTERED INDEX ProdINTix ON ProductINT(ID);
GO

For the table using a GUID, we use the NewSequentialID() instead of NewID() to create new keys. This is supposed to offer much better performance as the generated GUIDs are guaranteed to be sequential rather than random, resulting in better index performance on insertion.

For the Access version of the tables, we basically use the same definition, except that we used 4 tables:

  • ProductINT: let Jet/ACE autonumbering create the sequential integer Primary Key.
  • ProductINTRandom: let Jet/ACE autonumbering create the random integer Primary Key.
  • ProductGUIDRandom: let Jet/ACE use its own internal GenGUID() for the key which generates random GUIDs instead of sequential ones.
  • ProdcutGUIDSequential: call the Windows API (UuidCreateSequential) to create sequential ID instead.

SQL Server Test code

Using the SQL Server Management Studio, we performed the following test once for each table (resetting the database in-between):

SET NOCOUNT ON;
GO
DECLARE @i INT = 1;
WHILE (@i <= 1000)
BEGIN
    DECLARE @tstart DATETIME2 = SYSDATETIME();
    BEGIN TRAN
        DECLARE @a INT = 1;
        WHILE (@a <= 1000)
        BEGIN
            INSERT INTO ProductGUID (SKU,Description)
            VALUES ('PROD' + CONVERT(CHAR,@a), 'Product number ' + CONVERT(CHAR,@a));
            SELECT @a = @a + 1;
        END;
    COMMIT TRAN;
    SELECT DATEDIFF(MILLISECOND, @tstart, SYSDATETIME()) AS timespan;
SELECT @i = @i + 1;
END;
GO

Basically, we perform 1000 transactions each inserting 1000 records into the table ProductGUID or ProductINT.

Access 2007 Test code

To duplicate the same conditions, the following VBA code will perform 1000 transactions each inserting 1000 records.
Note that the recordset is opened in Append mode only.
The importance of this will be discussed in another article.

' Run this to inset 1,000,000 products in batches of 1000
' In the given table
Public Sub Benchmark(TableName as String, InsertSeqGUID  as Boolean)
    Dim i As Integer
    For i = 1 To 1000
        Insert1000Products TableName, InsertSeqGUID
    Next i
End Sub
' Insert 1000 products in a table
Public Sub Insert1000Products(TableName as String, InsertSeqGUID as boolean)
    Dim i As Long
    Dim db As DAO.Database
    Dim rs As DAO.Recordset
    Dim ws As DAO.Workspace
    Dim starttime As Long
    Dim timespan As Long
    Set ws = DBEngine.Workspaces(0)
    DoEvents
    starttime = GetClock ' Get the current time in ms
    ws.BeginTrans
    Set db = CurrentDb
    Set rs = db.OpenRecordset(TableName, dbOpenDynaset, dbAppendOnly)
    With rs
        For i = 1 To 1000
            .AddNew
                If InsertSeqGUID Then !ID = "{guid {" & CreateStringUUIDSeq() & "}"
                !SKU = "PROD" & i
                !Description = "Product number " & i
            .Update
        Next i
    End With
    ws.CommitTrans
    rs.Close
    timespan = GetClock() - CDbl(starttime)
    Set rs = Nothing
    Set db = Nothing
    ' Print Elapsed time in milliseconds
    Debug.Print timespan
    DoEvents
End Sub

We call this code to perform inserts on each of our Access tables:

  • ProductINT table: we just insert data in the ProductINT table, letting Access create autonumber IDs.
  • ProductINTRandom table: we just insert data in the ProductINTRandom table, letting Access create random autonumber IDs.
  • ProductGUIDRandom table: we let Access create the Random GUID for the primary key.
  • ProductGUIDSequential: we use the Windows API to create a sequential ID that we insert ourselves.

Test results

Without further ado, here are the raw results, showing the number of inserted record per second that we achieve for each test over the growing size of the database (here are only shown tests comapring Sequantial GUID and Autoincrement on SQL Server and Access, see next sections for the other results):

Inserts per second

What we clearly see here is that performance when using autoincrement and Sequential GUID stays pretty much constant over the whole test.
That’s good new as it means that using Sequential GUIDs do not degrade performance over time.

As a side note, in this particular test, Access offers much better raw performance than SQL Server. In more complex scenarios it’s very likely that Access’ performance would degrade more than SQL Server, but it’s nice to see that Access isn’t a sloth.

Using Sequential GUID vs Autoincrement in Access

The results show that we do take a performance hit of about 30% when inserting Sequential GUID vs just using autonumbers.
We’re still getting good results, but that’s something to keep in mind.

In terms of CPU consumption, here is what we get:

CPU load Access

Random PK, whether they are simple integer or GUID do consume substantially more CPU resources.

Using Sequential GUID vs Identity in SQL Server

Out-of-the box, SQL Server performs quite well and there is not much difference whether you’re using Sequential GUIDs or autoincrement PK.

There is however a surprising result: using Sequential GUIDs is actually slightly faster than using autoincrement!

There is obviously an explanation for this but I’m not sure what it is so please enlighten me :-)

CPU Consumption:

CPU load SQL Server

Using Random GUID vs Sequential GUID vs Random Autonumber in Access

So, what is the impact of choosing a Sequential GUID as opposed to letting Access create its own random GUIDs?

Inserts per second Random GUID vs Sequential GUID in Access

It’s clear that random GUIDs have a substantial performance impact: their randomness basically messes up indexing, resulting in the database engine having to do a lot more work to re-index the data on each insertion.
The good thing is that this degradation is pretty logarithmic so while it degrades over time, the overall performance remains pretty decent.
While GUIDs are larger than Random Integers (16 bytes vs 4 bytes) the actual performance of inserting records whose PK is a random integrer is actually worse than random GUID…

Provisional conclusions

Here we’ve check the baseline for our performance tests. In the next article, we’ll look exclusively at the performance of inserting data from a remote Access 2007 front end using our VBA code.

Having this baseline will allow us to check the performance overhead of using ODBC and letting Jet/ACE manage the dialogue with the SQL Server backend.

Feel free to leave your comments below, especially if you have any resources or information that would be useful.

Updates

  • 16APR2009: added test of random autonumber as PK in Access.
  • 13APR2009: Original Article

15 comments April 13th, 2009

Previous Posts


Most Recent Posts

Categories

Links

Posts by Month