If you're going to move forward with this, I have a few recommendations.
1) Data validation. LOTS OF IT. CMOD is an archive, and data that is improperly indexed (typos, blank fields, etc.) is effectively lost forever, because no search will ever find it.
2) Aggregate data (many files, one generic index file) and load with arsload. There are many technical advantages to this approach.
3) Wait a few days before moving data to write-once media. Corrections are bound to happen.
4) Try to prevent duplicates. Nothing is more confusing than two different files with identical index criteria. (Add a date-time field defaulted to load time!)
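To make points 1 and 4 a bit more concrete, here is a rough sketch of a pre-load check you could run before building the index file for arsload. It is not anything CMOD ships: the field names (account, doc_date, doc_type), the date format, and the helper names are all made up for illustration, so swap in your own application group fields. It rejects records with blank fields or bad dates, flags duplicates within the batch, and stamps each accepted record with the load time.

```python
from datetime import datetime, timezone

# Hypothetical index fields; replace with your application group's fields.
REQUIRED_FIELDS = ("account", "doc_date", "doc_type")


def validate(record):
    """Return a list of problems found in one index record (empty if clean)."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not str(record.get(field, "")).strip():
            problems.append(f"blank or missing field: {field}")
    doc_date = str(record.get("doc_date", "")).strip()
    if doc_date:
        try:
            # Assumed date format; adjust to whatever your feed uses.
            datetime.strptime(doc_date, "%Y-%m-%d")
        except ValueError:
            problems.append(f"bad date: {doc_date!r}")
    return problems


def prepare_batch(records, load_time=None):
    """Split records into (accepted, rejected) and stamp accepted ones."""
    load_time = load_time or datetime.now(timezone.utc)
    stamp = load_time.strftime("%Y-%m-%d %H:%M:%S")
    seen = set()
    accepted, rejected = [], []
    for record in records:
        problems = validate(record)
        key = tuple(str(record.get(f, "")).strip() for f in REQUIRED_FIELDS)
        if not problems and key in seen:
            problems.append("duplicate index values within batch")
        if problems:
            rejected.append((record, problems))
        else:
            seen.add(key)
            # The load-time field keeps otherwise identical documents apart.
            accepted.append({**record, "load_time": stamp})
    return accepted, rejected
```

Note that this only catches duplicates inside a single batch. Checking against what is already loaded would need a query against the archive itself, which is not shown here.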
I'm sure there are others, but these are the ones off the top of my head. Good luck!
-JD.