Historical Data

**kulashaker** · 09-19-2005, 03:12 PM

Hi Everyone.

I need some advise on how to create a historical database.

What is the best way of doing this? For example, should I create a new row if a column is changed in a row and Time Stamp all record? What happens when I have child tables link to a Header table?

I have been looking on the NET for methods of creating a historical database, but I cant find any.

Thanks in advance

**rmiao** · 09-19-2005, 03:40 PM

What's your purpose? Keep all db changes in history database or just archive current db? If want to keep all db changes, then what kind of info you like to keep?

**kulashaker** · 09-20-2005, 01:38 AM

Hi rmiao, thanks for your post

For example I have a header table TableA and a child table TableB, they have a constrainst between them. Like this

TableA
------
id_key_pk
field1
field2

TableB
------
id_key_pk
tableA_id_key_fk
field1
field2

The constraint is between TableA.idkey_pk and TableB.tableA_id_key_fk.

Each row in Table A could have many rows in TableB.

If I use live database to store history I will have to remove my constrainst.

How do you recommend using a histroy database? Does this mean I need 2 tables for each table? ex. TableA and TableA_Hist ?

Thanks for you advise, I really appreciate it.

**mulhall** · 09-20-2005, 07:24 AM

As rmiao said, what are you trying to achieve.

Your post muddles between duplicating tables and 'historical' database.

Are you trying to:

Record all changes
Archive old data
???

**kulashaker** · 09-20-2005, 07:54 AM

Let me try to explain

tableA
------
id_pk_pk field1 field2 datefrom DateTo
--------- ------ ------ --------- ----------
1| data1 | data2 | 20/09/2005 | 31/12/9999

tableB
------
id_pk tableA_id_fk field1 field2 datefrom DateTo
----- ------------ ----- ----- --------- ----------
1 | 1 | data | data | 20/09/2005 | 31/12/9999
2 | 1 | data | data | 20/09/2005 | 31/12/9999

The row in tableA was in inserted on the 20/09/2005 and is valid till the 31/12/9999.

Next, a user changes the data in tableA and a new row is inserted with the new data, whilst closing the old data row by setting the DateTo to the date of the change...for ex:

id_pk_pk field1 field2 datefrom DateTo
--------- --------- ------ --------- ----------
1 | data1 | data2 | 20/09/2005 | 25/09/2005
2 | newdata1 | data2 | 25/09/2005 | 31/12/9999

Now, if you look at TableB, the child rows still points to Row 1 in TableA, and Row2 in TableA has no children.

This also goes against the constraint between the tables.

This is one method of keeping history but I see it very complicated and complex to maintain.

**rmiao** · 09-20-2005, 08:15 AM

If user inserts new row in tableA, you should have new rows in TableB to point to new row in TableA in general. This should not affect existing rows in TableB. But I don't know your business logic, can you explain why should modify existing rows in TableB to point to new row in TableA?

**kulashaker** · 09-20-2005, 06:39 PM

Hi rmiao, thanks for your comments.

No, I do not wish to add new rows to TableB because data in TableB did not change. This is exactly my problem.

Let me explain the business logic:

Health Insurance Application

Customer Table: This table stores information about a customer.

Columns of data - id_pk(Key identity column - auto generated by SQL2000), Name, Surname, date of birth.....etc

Address Table: This stores addresses of client. A client can have more than 1 address

Columns of data - id_pk(Key identity column), customer_id(this is the id_pk of the customer table), House No, Street, Locality, PostCode......etc

Now I wish to keep history of all tables. For ex. If a Customer row changes in anyway, I would like to know what it was at a point in time.

This is why I am using the DATEFROM and DATETO columns. With this method I can create a UDF or SP that will extract data by passing the POINT IN TIME as a parameter, by using the following criteria:

WHERE DATEFROM <= @POINTINTTIME AND DATETO > @POINTINTIME

The only problem I find in this logic is when I use child tables such as the Address Table. I do not wish to create redundant addresses whenever the Customer Data is changed.

To answer your question, I must some how modify existing rows in the Address Table because the id_pk of the amended Customer has changed (because it is Auto Generated by SQL2000)

I was thinking of making another Unique Identifier column for the Customer and link the Addresses using the Unique Identifier column. What do you think?

cheers

It is confusing, but I cant think of an easier way to explain it.

thanks all for your help.

**mulhall** · 09-21-2005, 03:47 AM

Okay, that's much clearer.

The tool for this job is a TRIGGER. I'd suggest using either AFTER UPDATE or INSTEAD OF UPDATE depending on whether you want to know simply that a record was changed, or if you also want to copy out the original data before the UPDATE occurs.

**kulashaker** · 09-21-2005, 07:05 AM

I have considered triggers to actually inserting row that were changed into a history database.

But still I will need to add a Unique Idendifier for each Customer and then link the Addresses to that Unique Identifier

I have attached a Work Document to this post with diagram, check it out.

**rmiao** · 09-21-2005, 08:16 AM

When you copy midified row to history db, customer id should be copied over as well. Don't know why you need to add another Unique Idendifier for each Customer in history db.

**mulhall** · 09-21-2005, 09:55 AM

Yup, I agree with rmiao; the customerid provides all the reference you need to report on the changes made to a client record. You could also insert a date column with a timestamp default to record the time the change was made.

**kulashaker** · 09-21-2005, 02:42 PM

Thanks a lot for your help guys, I will give it a try.

Thread: Historical Data

Thread Tools

Display

Historical Data

Posting Permissions