Author Topic: Bulk loading??  (Read 4990 times)

DDP021

  • Sr. Member
  • ****
  • Posts: 343
Bulk loading??
« on: March 13, 2013, 09:42:10 AM »
I know I've been hitting this support forum hard lately, and I appreciate all the responses. I have ANOTHER strange scenario. We have a user who wants to send archived reports from another system (TOPAS) to ERR for storage. They are txt files. We set up an application using the Generic Indexer, so on the View Information tab in the Application setup the Data Type is set to User Defined and the File Extension is set to DOC. The user then sends the 3 needed files (i.e. Trigger, Index, Data). The file names are formatted as ApplicationName.ApplicationGroupName.ard (Trigger), ApplicationName.ApplicationGroupName.ard.ind (Index file) and ApplicationName.ApplicationGroupName.ard.out (Data file). Under "normal" circumstances the user would be sending one file at a time. But in one case they have 20,000+ files they want to send to ERR at one time, and obviously single-threading them by sending these 3 files for each one isn't feasible. Does anyone know of a way to "bulk load" this many files more efficiently?

DDP021

Re: Bulk loading??
« Reply #1 on: March 13, 2013, 09:43:06 AM »
To add: these files are sent to our server via the NDM process into an ARSLOAD directory.

DDP021

Re: Bulk loading??
« Reply #2 on: March 13, 2013, 09:51:14 AM »
Forgot to specify: the Index file they send contains the data file path, e.g. GROUP_FILENAME:/prod/ode/cmod/arsload4/XTS00001.XTS00001.ARD.OUT. My question is, can these .OUT files be called anything, or do they have to match a certain format related to the ApplicationName and ApplicationGroupName?

DDP021

Re: Bulk loading??
« Reply #3 on: March 13, 2013, 10:16:18 AM »
I think I found an answer to my question. I finally realized the .out file can be called anything. All that needs to be done is to repeat the GROUP_OFFSET and GROUP_LENGTH lines (this report isn't doing any indexing) once per data file within the .IND (index) file, and give each GROUP_FILENAME the unique name of that file:

CODEPAGE:1252
GROUP_OFFSET:0
GROUP_LENGTH:0
GROUP_FILENAME:/prod/ode/cmod/arsload4/XTS00001.XTS00001.ARD.OUT
GROUP_OFFSET:0
GROUP_LENGTH:0
GROUP_FILENAME:/prod/ode/cmod/arsload4/XTS00002.XTS00002.ARD.OUT

I then tested by sending the two .OUT files (XTS00001.XTS00001.ARD.OUT and XTS00002.XTS00002.ARD.OUT) to our ARSLOAD directory first, and then sending the Trigger and Index files. Both .OUT files were ingested into the system successfully at the same time.
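For 20,000+ files, writing the index file by hand is not practical, so it could be generated. The following is only a minimal sketch in Python that reproduces the layout of the sample index file in this post (one CODEPAGE line, then a GROUP_OFFSET / GROUP_LENGTH / GROUP_FILENAME triple per data file). The function name build_generic_index is made up for illustration, and the sketch assumes, as in this thread, that no index fields are needed; an application that does index would need additional lines.

```python
from pathlib import PurePosixPath


def build_generic_index(data_files, codepage=1252):
    """Return index-file text with one group per data file.

    Hypothetical helper: it only mirrors the sample layout from this
    thread and writes no index field names or values.
    """
    lines = [f"CODEPAGE:{codepage}"]
    for path in data_files:
        # Offset 0 and length 0 are copied from the sample index file.
        lines.append("GROUP_OFFSET:0")
        lines.append("GROUP_LENGTH:0")
        lines.append(f"GROUP_FILENAME:{path}")
    return "\n".join(lines) + "\n"


# Directory and file names are the ones quoted in this thread.
load_dir = PurePosixPath("/prod/ode/cmod/arsload4")
names = ["XTS00001.XTS00001.ARD.OUT", "XTS00002.XTS00002.ARD.OUT"]
index_text = build_generic_index([load_dir / name for name in names])
print(index_text, end="")
```

Following the order used in the test above, the .OUT files would be placed in the ARSLOAD directory before the generated index file and the trigger file are sent.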

Alessandro Perucchi

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 1002
Re: Bulk loading??
« Reply #4 on: March 14, 2013, 03:20:43 PM »
Hello DDP021,

nice that you've found the solution :-)

just one comment, if I may.
For disk space efficiency it is better to have 1 index file referencing X out files than 1 ind per out (especially if each out is a single document), because CMOD will then store the X out files as one big chunk in TSM.

Sincerely yours,
Alessandro
Alessandro Perucchi

#Install #Migrations #Conversion #Educate #Repair #Upgrade #Migrate #Enhance #Optimize #AIX #Linux #Multiplatforms #DB2 #Windows #Oracle #TSM #Tivoli #Performance #Audits #Customizing #Availability #HA #DR #JavaApi #ContentNavigator #ICN #WEBi #ODWEK #Services #PDF #AFP #XML

DDP021

Re: Bulk loading??
« Reply #5 on: March 21, 2013, 08:09:37 AM »
Thanks Alessandro. We're still trying to determine how many .out files to send at one time. They are basically 'archiving' data from years back into CMOD. Different reports will have different retention requirements, so we are setting up different Application Groups accordingly. We're also not sure how large these files will be; I'm guessing they vary in size. We don't want to back up ingestion for other applications using this ARSLOAD directory by having them send TOO many .out files at one time. We did see that sending more than one file at a time results in one LOAD ID. I'm hoping this won't cause a problem down the road if they decide they want one version removed. With multiple files stored under one LOAD ID, it wouldn't be possible to unload one particular report/version without unloading ALL the associated reports. At least I think that's how things work?
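If each index file does end up as a single LOAD ID, one way to trade off the two concerns is to split the data files into fixed-size batches and produce one index file (and so one load) per batch. The sketch below is only an illustration in Python: split_into_loads and the batch size of 3 are hypothetical, and the right number of files per load would have to be found by testing against the real ARSLOAD directory.

```python
def split_into_loads(data_files, files_per_load):
    """Split a list of data file names into consecutive batches.

    Hypothetical helper: each batch would get its own index file,
    and therefore (per the observation above) its own LOAD ID.
    """
    if files_per_load < 1:
        raise ValueError("files_per_load must be at least 1")
    return [
        data_files[start:start + files_per_load]
        for start in range(0, len(data_files), files_per_load)
    ]


# Seven example names in the same pattern as the files in this thread.
files = [f"XTS{n:05d}.XTS{n:05d}.ARD.OUT" for n in range(1, 8)]
loads = split_into_loads(files, files_per_load=3)
for number, batch in enumerate(loads, start=1):
    print(f"load {number}: {len(batch)} file(s)")
```

Smaller batches keep an unload more selective at the cost of more loads; larger batches do the opposite.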

Alessandro Perucchi

Re: Bulk loading??
« Reply #6 on: July 03, 2013, 12:16:29 AM »
Exactly... that's how it works... you need to find the balance between too many files in 1 load, and not enough...
Alessandro Perucchi


LWagner

  • Guest
Re: Bulk loading??
« Reply #7 on: October 15, 2013, 07:07:44 AM »
If each report is indexed as a separate document within the same load ID, then individual documents can be deleted.

You would run arsdoc query with the SQL that selects the documents to delete, check the results, and then switch to arsdoc delete.

Alessandro Perucchi

Re: Bulk loading??
« Reply #8 on: November 01, 2013, 07:46:35 AM »
Hello LWagner,

This method works, but you will end up deleting ONLY the index entries, and NOT the data behind them.
So depending on how you have configured TSM and how compliance is defined for archived documents, it might be a legal problem that the document itself is still in archive storage.

Sincerely yours,
Alessandro
Alessandro Perucchi
