* Every time I extract rows from a table I want to extract only the records added since last extract (no rows are modified or deleted - only added)
* In the extract process I want to find the maximum timestamp in a given table. This timestamp should be stored and updated in a table on the SQL Server. This way, the next time the extraction process is run only "new" rows are extracted
How would you go about doing this in SSIS? I have thought of the following approach, but I am unsure whether it is too cumbersome.
* A package variable is defined for each of the transaction tables
* A series of Execute SQL Tasks populates each of these variables with the timestamps from the table in SQL Server
* A series of data flow tasks extract the data (using the variables populated above in the WHERE-condition). Contained in the data flow tasks is a script component which records the maximum timestamp extracted and places this in yet another package variable (I am unsure how to do this, by the way...)
* A series of Execute SQL Tasks updates the table in SQL Server with the new timestamps
How would you do this?
Thanx! This is how I'd do it. It isn't particularly a SSIS solution...just the way I'd do it!
Have 2 values stored in a table called (e.g.) tblConfig. They are:
LastExtractDate
ThisExtractDate
Step 1) UPDATE tblConfig SET LastExtractDate=ThisExtractDate, ThisExtractDate = GETDATE()
Step 2) SELECT * FROM <source_table> WHERE <some_tstamp_value> >= LastExtractDate AND <some_tstamp_value> < ThisExtractDate> (this would be inside a data-flow source adapter)
OK, column names may change etc...but you get the idea!!!
-Jamie|||Very nice... And simple. One question though (for doing this in SSIS): Do you enlist the two tasks in the same transaction (using a sequence container for instance) so that if the data flow tasks fails, you will not "miss" any rows on the next run?
/Michael|||
Reckless wrote:
Very nice... And simple. One question though (for doing this in SSIS): Do you enlist the two tasks in the same transaction (using a sequence container for instance) so that if the data flow tasks fails, you will not "miss" any rows on the next run?
/Michael
Yeah. Good idea!
-Jamie|||Hi, folks:
Exactly how do we do this. Do we do a OLE data source > something > OLE data destination. Thanks in advance.|||
Al_chan wrote:
Hi, folks:
Exactly how do we do this. Do we do a OLE data source > something > OLE data destination. Thanks in advance.
Hi Al,
Step 1 I would do in an Exec SQL Task.
Step 2 most likely in a data flow - yes, using an OLE DB Data Source.
-Jamie
No comments:
Post a Comment