Sunday, July 20, 2008

More on Flat File Bulk Import methods speed comparison in SQL Server 2005

1. Database recovery model was set to Bulk logged. (This improved performance by 2-3 seconds for each method)

2. I tried BULK INSERT and BCP with and without TABLOCK option.
3. I also measured general processor utilization and RAM. IO reads and writes were similar in all methods.

everything else was the same.

Method Time (seconds) CPU utilization (%) RAM change (MB)
Bulk insert with Tablock 4648 75 +5MB
SSIS FastParse 5812 75 +30Mb
BCP with Tablock 6300 100 +5MB
Bulk insert 7812 75 +5MB
OpenRowset 8750 95 +5MB
BCP 11250 100 +5MB

SSIS also has a bulk insert tablock option set to true in the SQL Server destination. So my guess is that approx 1 second overhead comes from package startup time. However SSIS uses more memory than Bulk insert.

So if you can use TABLOCK option Bulk insert is the way to go. If not SSIS is.

It must be taken into account that the the test flat file i used had only 4 integer columns and you could set the FastParse option to all of them. In real world there are no such ideal conditions. And since i had only integer type i don't have to specify collations.

Modified scripts are:

BULK INSERT testBulkInsert
FROM 'd:\work\test.txt'
WITH (
FORMATFILE='d:\work\testImport-f-n.Fmt',
TABLOCK
)

insert into testOpenRowset(c1, c2, c3, c4)
SELECT t1.c1, t1.c2, t1.c3, t1.c4
FROM OPENROWSET
(
BULK 'd:\work\test.txt',
FORMATFILE = 'd:\work\testImport-f-n.Fmt'
) AS t1(c1, c2, c3, c4);

exec master..xp_cmdshell 'bcp test.dbo.testBCP in d:\work\test.txt -T -b1000000 -fd:\work\testImport-f-n.Fmt -h"tablock"'

No comments:

Blog Archive