KDC Problem: Keeping track of last time of extract

Friday, March 9, 2012

Keeping track of last time of extract

I am building an SSIS package in which I extract data from a number of transaction tables. The rows in each transaction table contain a timestamp. I would like to do the following.

* Every time I extract rows from a table I want to extract only the records added since last extract (no rows are modified or deleted - only added)
* In the extract process I want to find the maximum timestamp in a given table. This timestamp should be stored and updated in a table on the SQL Server. This way, the next time the extraction process is run only "new" rows are extracted

How would you go about doing this in SSIS? I have thought of the following approach, but I am unsure whether it is too cumbersome.

* A package variable is defined for each of the transaction tables
* A series of Execute SQL Tasks populates each of these variables with the timestamps from the table in SQL Server
* A series of data flow tasks extract the data (using the variables populated above in the WHERE-condition). Contained in the data flow tasks is a script component which records the maximum timestamp extracted and places this in yet another package variable (I am unsure how to do this, by the way...)
* A series of Execute SQL Tasks updates the table in SQL Server with the new timestamps

How would you do this?

Thanx! Smile This is how I'd do it. It isn't particularly a SSIS solution...just the way I'd do it!

Have 2 values stored in a table called (e.g.) tblConfig. They are:
LastExtractDate
ThisExtractDate

Step 1) UPDATE tblConfig SET LastExtractDate=ThisExtractDate, ThisExtractDate = GETDATE()

Step 2) SELECT * FROM <source_table> WHERE <some_tstamp_value> >= LastExtractDate AND <some_tstamp_value> < ThisExtractDate> (this would be inside a data-flow source adapter)

OK, column names may change etc...but you get the idea!!!

-Jamie|||Very nice... And simple. Smile One question though (for doing this in SSIS): Do you enlist the two tasks in the same transaction (using a sequence container for instance) so that if the data flow tasks fails, you will not "miss" any rows on the next run?

/Michael|||

Reckless wrote:
Very nice... And simple. One question though (for doing this in SSIS): Do you enlist the two tasks in the same transaction (using a sequence container for instance) so that if the data flow tasks fails, you will not "miss" any rows on the next run?
/Michael

Yeah. Good idea!

-Jamie|||Hi, folks:
Exactly how do we do this. Do we do a OLE data source > something > OLE data destination. Thanks in advance.|||

Al_chan wrote:
Hi, folks:
Exactly how do we do this. Do we do a OLE data source > something > OLE data destination. Thanks in advance.

Hi Al,
Step 1 I would do in an Exec SQL Task.
Step 2 most likely in a data flow - yes, using an OLE DB Data Source.

-Jamie

Friday, March 9, 2012

Keeping track of last time of extract

No comments:

Post a Comment

KDC Problem

Blog Archive

About Me