Friday, March 30, 2012

Problem running package with a 'larger' amount of data

Dear all,

I created a package that gets data from files and database sources, does some transformations, retrieves dimension IDs and then inserts the results into a fact table.

Running this package with a limited amount of data (a couple of hundred thousand records) does not produce any errors and everything goes fine.

Now, running the same package (still in debug mode) with more data (about 2,000,000 rows) doesn't produce any errors either, but it just stops making progress. In fact, it doesn't really stop, but it doesn't continue either. If I had only been waiting for a few minutes or hours, I could believe it was still processing, but I have waited for about a day and it is still 'processing' the same step.

Any ideas on how to dig further into this in order to find the problem? Or is this a known problem?

Thanks for your ideas,

Jievie

Look at the BuffersSpooled perf counter. If it's really up there, you're thrashing the disk.
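
If you want to sample that counter over time while the package runs (rather than watching Performance Monitor by hand), something along these lines would work; this is a minimal Python sketch, assuming the pywin32 package and a SQL Server 2005-style counter path, which may be named differently on your installation:

import time
import win32pdh

# Assumed counter path for the SSIS 2005 pipeline; check Performance Monitor
# for the exact object name on your machine.
COUNTER_PATH = r"\SQLServer:SSIS Pipeline\Buffers spooled"

query = win32pdh.OpenQuery()
counter = win32pdh.AddCounter(query, COUNTER_PATH)

# Sample the counter every few seconds while the package is running.
for _ in range(20):
    win32pdh.CollectQueryData(query)
    _, value = win32pdh.GetFormattedCounterValue(counter, win32pdh.PDH_FMT_LONG)
    print("Buffers spooled:", value)
    time.sleep(5)

win32pdh.CloseQuery(query)

If that value stays above zero and keeps climbing, buffers are being written to disk instead of held in memory.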

Also, just look at Task Manager under the Performance tab and check how much physical memory is available. Chances are, you're running up against memory limitations.
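
If you'd rather log that than keep Task Manager open, here is a minimal sketch using the psutil package (my choice of library, not something mentioned in this thread):

import time
import psutil

# Print available physical memory every few seconds while the package runs.
for _ in range(20):
    mem = psutil.virtual_memory()
    print("Available physical memory: %.0f MB (%.1f%% used)"
          % (mem.available / (1024 ** 2), mem.percent))
    time.sleep(5)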

Depending on what your pipeline contains, you should be able to optimize out the bottlenecks, as well as shut down any running applications that may be competing for resources.

This white paper is helpful:

http://www.microsoft.com/technet/prodtechnol/sql/2005/ssisperf.mspx

K

|||

Dear Jievie,

Could it be that the package is trying to update a row that is locked by another application?
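
One way to check, assuming the tables live in SQL Server, is to query sys.dm_tran_locks while the package is hanging; here is a minimal Python/pyodbc sketch (the connection string is a hypothetical placeholder):

import pyodbc

# Hypothetical connection string - point it at the database that holds
# the fact and dimension tables.
conn = pyodbc.connect(
    "DRIVER={SQL Server};SERVER=myserver;DATABASE=mydw;Trusted_Connection=yes"
)
cursor = conn.cursor()

# List sessions holding or waiting for locks, so you can see whether
# anything other than the SSIS package is touching the tables.
cursor.execute("""
    SELECT request_session_id, resource_type, resource_database_id,
           resource_associated_entity_id, request_mode, request_status
    FROM sys.dm_tran_locks
    ORDER BY request_session_id
""")
for row in cursor.fetchall():
    print(row)

conn.close()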

Good luck,

Paul

|||

Dear KirkHaselden and Paul,

In the meantime, I did some of the checks you both suggested and can tell you the following:

I can't imagine another application is locking any rows or tables, as my sources are some flat files and two tables that are only used by my SSIS package. I also checked the system views for those tables; no locks are listed...

|||

There are three things you're not telling us ... how much memory you have, what transformations you are performing, and how big your rows are.

Your symptoms indicate you are memory constrained, but we don't know how much memory is involved.

The transformations are important: some, such as Sort, Lookup, or Aggregate, can require a lot of memory. Others, such as Data Conversion, are very lightweight.

You can see the size of your rows by double-clicking the paths between transformations and looking at the Metadata tab there; it will tell you the data type of each column, and from there you can work out the size of the row. (Remember, folks, "rows" is not a useful metric on its own; we've seen rows ranging from a few bytes to several megabytes being passed through SSIS.)
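
As a rough back-of-the-envelope check, multiply the row width from that Metadata tab by the number of rows; the width used below is an assumed example, not Jievie's actual figure:

# Assumed example values - substitute the real row width from the Metadata tab.
rows = 2000000
bytes_per_row = 500  # sum of the column widths on the path

total_bytes = rows * bytes_per_row
print("Approximate data volume in the pipeline: %.2f GB"
      % (total_bytes / (1024 ** 3)))

# At ~500 bytes per row, 2 million rows is roughly 1 GB of buffer data.
# A blocking transform such as Sort, or a full-cache Lookup against a large
# reference table, may need all of it in memory at once - which is when
# buffers start spooling to disk.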

Once you get us this additional information, we should be able to help more. And do read the performance whitepaper - it will give you many useful suggestions.

thanks

Donald
