I have a large table that is completely denormalized. I'm averaging about 300,000 inserts per minute. (I didn't design this, I inherited it)

Due to the flat structure, its fast to write into, but takes up too much space so I can't maintain much before having to truncate. I want to normalize it a little bit. The nature of the data is that it contains repetitive strings, but each transaction is unique and does not have any kind of numeric "id"s that make it easy for me to normalize.

Based on repetitive sets of data, I have determine which tables I want to create, so the main table uses a single integer to reference each table/entry. The problem is, I don't know how to normalize this efficiently.

So here's an example...

temp_table:
id
time
serial
name
model
result

Needs to be normalized to:

main_table:
id
time
type_id
unit_id
status_id

type_table:
id
serial
name

unit_table:
id
model

status_table:
id
result

So I'm thinking of approaching this in 2 ways; the temp table can roll every hour, and each hour I will run my normalization routine, or I can normalize on the fly.

I was following the steps in this article:
http://www.freeopenbook.com/mysqlcoo...8-SECT-10.html

Until I realized that these examples depend on an "id" already existing and repeating. But I don't already have id's representing the data, and I don't know in advance what all the strings will be. Its too dynamic.

help?