In the endless search for optimal performance in bulk loading data, I have written a paper that describes how to reach top performance with the SqlBulkCopy class in .NET. It looks for the best combination of the class's parameter settings to load heaps and clustered tables, with and without indexes, performing all the loads in a highly parallelized environment.
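To give a flavor of what the paper covers, here is a minimal sketch of a SqlBulkCopy call with some of the settings it tests. The destination table name and connection string are placeholders of my own, not values from the paper, and which settings actually win depends on the scenario.

```csharp
using System.Data;
using System.Data.SqlClient;

class BulkLoadSketch
{
    // Loads a DataTable into a destination table.
    // "dbo.TestHeap" and the connection string are hypothetical placeholders.
    static void Load(DataTable rows, string connectionString)
    {
        // TableLock takes a bulk update (BU) lock on a heap, which enables
        // minimal logging and still lets several loaders run in parallel.
        var options = SqlBulkCopyOptions.TableLock;

        using (var bulkCopy = new SqlBulkCopy(connectionString, options))
        {
            bulkCopy.DestinationTableName = "dbo.TestHeap";
            bulkCopy.BatchSize = 0;       // 0 = send all rows as a single batch
            bulkCopy.BulkCopyTimeout = 0; // 0 = no timeout for long-running loads
            bulkCopy.WriteToServer(rows);
        }
    }
}
```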

To produce the tests, I have written a simple solution that implements a producer/consumer pattern, where many producers and consumers run in parallel, and that can be used to perform detailed tests by simply switching parameters on and off. It is not a user-friendly program, just a tool you can use to test different bulk insert operations. Nevertheless, I found it very useful in understanding how bulk insert works and how to optimize it.
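The actual tool is more configurable than this, but the core pattern looks roughly like the following sketch: producers fill a shared buffer with batches of rows while consumers drain it with SqlBulkCopy. The producer/consumer counts, batch sizes, and table name here are arbitrary placeholders, not the values used in the paper.

```csharp
using System.Collections.Concurrent;
using System.Data;
using System.Data.SqlClient;
using System.Linq;
using System.Threading.Tasks;

class ProducerConsumerSketch
{
    static void Run(string connectionString, int producers, int consumers)
    {
        // Bounded buffer: producers block when it is full, consumers when empty.
        var buffer = new BlockingCollection<DataTable>(boundedCapacity: 10);

        var producerTasks = Enumerable.Range(0, producers).Select(_ =>
            Task.Run(() =>
            {
                for (int i = 0; i < 100; i++)        // 100 batches per producer (arbitrary)
                    buffer.Add(CreateBatch(10000));  // 10,000 rows per batch (arbitrary)
            })).ToArray();

        var consumerTasks = Enumerable.Range(0, consumers).Select(_ =>
            Task.Run(() =>
            {
                foreach (var batch in buffer.GetConsumingEnumerable())
                {
                    // On a heap, TableLock takes BU locks, so consumers can
                    // bulk load the same table in parallel.
                    using (var bulk = new SqlBulkCopy(connectionString, SqlBulkCopyOptions.TableLock))
                    {
                        bulk.DestinationTableName = "dbo.TestHeap"; // hypothetical table
                        bulk.WriteToServer(batch);
                    }
                }
            })).ToArray();

        Task.WaitAll(producerTasks);
        buffer.CompleteAdding();   // tell consumers no more batches will arrive
        Task.WaitAll(consumerTasks);
    }

    static DataTable CreateBatch(int rows)
    {
        var table = new DataTable();
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Payload", typeof(string));
        for (int i = 0; i < rows; i++)
            table.Rows.Add(i, "some payload");
        return table;
    }
}
```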

The paper does not go as deep as the Data Loading Performance Guide from Microsoft does in describing the internals of SQL Server and bulk loading. On the other hand, the Microsoft paper describes some esoteric hardware configurations that are pretty difficult to find in the real world, while my paper is much closer to a real user experience. I think that reading both might help any ETL developer better understand how to boost the performance of their code. Mine is easier; Microsoft's is far more complete.

If you are interested in the topic, please follow the link to the paper.

I am very interested in feedback and, if you find the time to run tests on your specific hardware, in any results that differ from mine. Feel free to contact me with either, so that I can add your results to the paper and make the whitepaper more complete.