I need some advise on how to create a historical database.
What is the best way of doing this? For example, should I create a new row if a column is changed in a row and Time Stamp all record? What happens when I have child tables link to a Header table?
I have been looking on the NET for methods of creating a historical database, but I cant find any.
What's your purpose? Keep all db changes in history database or just archive current db? If want to keep all db changes, then what kind of info you like to keep?
tableB
------
id_pk tableA_id_fk field1 field2 datefrom DateTo
----- ------------ ----- ----- --------- ----------
1 | 1 | data | data | 20/09/2005 | 31/12/9999
2 | 1 | data | data | 20/09/2005 | 31/12/9999
The row in tableA was in inserted on the 20/09/2005 and is valid till the 31/12/9999.
Next, a user changes the data in tableA and a new row is inserted with the new data, whilst closing the old data row by setting the DateTo to the date of the change...for ex:
If user inserts new row in tableA, you should have new rows in TableB to point to new row in TableA in general. This should not affect existing rows in TableB. But I don't know your business logic, can you explain why should modify existing rows in TableB to point to new row in TableA?
No, I do not wish to add new rows to TableB because data in TableB did not change. This is exactly my problem.
Let me explain the business logic:
Health Insurance Application
Customer Table: This table stores information about a customer.
Columns of data - id_pk(Key identity column - auto generated by SQL2000), Name, Surname, date of birth.....etc
Address Table: This stores addresses of client. A client can have more than 1 address
Columns of data - id_pk(Key identity column), customer_id(this is the id_pk of the customer table), House No, Street, Locality, PostCode......etc
Now I wish to keep history of all tables. For ex. If a Customer row changes in anyway, I would like to know what it was at a point in time.
This is why I am using the DATEFROM and DATETO columns. With this method I can create a UDF or SP that will extract data by passing the POINT IN TIME as a parameter, by using the following criteria:
WHERE DATEFROM <= @POINTINTTIME AND DATETO > @POINTINTIME
The only problem I find in this logic is when I use child tables such as the Address Table. I do not wish to create redundant addresses whenever the Customer Data is changed.
To answer your question, I must some how modify existing rows in the Address Table because the id_pk of the amended Customer has changed (because it is Auto Generated by SQL2000)
I was thinking of making another Unique Identifier column for the Customer and link the Addresses using the Unique Identifier column. What do you think?
cheers
It is confusing, but I cant think of an easier way to explain it.
The tool for this job is a TRIGGER. I'd suggest using either AFTER UPDATE or INSTEAD OF UPDATE depending on whether you want to know simply that a record was changed, or if you also want to copy out the original data before the UPDATE occurs.
When you copy midified row to history db, customer id should be copied over as well. Don't know why you need to add another Unique Idendifier for each Customer in history db.
Yup, I agree with rmiao; the customerid provides all the reference you need to report on the changes made to a client record. You could also insert a date column with a timestamp default to record the time the change was made.