Topic: ARSDOC GET with complex requirement

dakun

  • Guest
ARSDOC GET with complex requirement
« on: May 16, 2014, 08:21:19 AM »
Hi Folks,

I have a requirement to retrieve multiple PDF documents together with their index (.ind) files, with each document written to a unique output file name.
Environment: z/OS, CMOD 9.0
Data facts:
- Reports were loaded as PDF by the generic indexer into 3 different application groups (AGs) with the same indexes.
- Duplicate reports exist.
e.g.
docid  docdate     doclang
ABC    01/01/2013  EN       (loaded on 01/01/2013)
ABC    01/01/2013  EN       (loaded on 01/02/2013)
ABC    01/01/2013  EN       (loaded on 01/03/2013)
DEF    02/01/2013  EN       (loaded on 02/01/2013)
DEF    02/01/2013  EN       (loaded on 02/02/2013)
DEF    02/01/2013  SP       (loaded on 02/01/2013)

Requirements:
- Need the IND and data files for each record matching a given index value.
- For duplicate records, only the most recently loaded document should be retrieved.

Expected output for the example above:
ABC    01/01/2013  EN       (loaded on 01/03/2013)
DEF    02/01/2013  EN       (loaded on 02/02/2013)
DEF    02/01/2013  SP       (loaded on 02/01/2013)
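
The selection rule in this example (keep only the last loaded copy for each docid/docdate/doclang combination) can be sketched outside of ARSDOC as follows. This is only an illustration of the rule, not an ARSDOC feature: the `latest_per_key` helper is hypothetical, and the dates are assumed to be in MM/DD/YYYY format.

```python
from datetime import datetime

# Sample rows copied from the example above:
# (docid, docdate, doclang, load date)
ROWS = [
    ("ABC", "01/01/2013", "EN", "01/01/2013"),
    ("ABC", "01/01/2013", "EN", "01/02/2013"),
    ("ABC", "01/01/2013", "EN", "01/03/2013"),
    ("DEF", "02/01/2013", "EN", "02/01/2013"),
    ("DEF", "02/01/2013", "EN", "02/02/2013"),
    ("DEF", "02/01/2013", "SP", "02/01/2013"),
]


def latest_per_key(rows):
    """Keep the most recently loaded row for each (docid, docdate, doclang)."""
    latest = {}
    for docid, docdate, doclang, loaded in rows:
        key = (docid, docdate, doclang)
        # Assumption: load dates are MM/DD/YYYY strings.
        loaded_on = datetime.strptime(loaded, "%m/%d/%Y")
        if key not in latest or loaded_on > latest[key][0]:
            latest[key] = (loaded_on, loaded)
    # Return rows sorted by key, with the original load date string appended.
    return [key + (loaded,) for key, (_, loaded) in sorted(latest.items())]


if __name__ == "__main__":
    for row in latest_per_key(ROWS):
        print(*row)
```

The three rows it returns for the sample data are the same three listed as the expected output.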


Here is what I have tested:
- arsdoc get -h ARCHIVE -f "folder name" -acg -o output file -i "where doc_id='ABC'" -N -v
 This retrieved the documents and concatenated the 3 hits into 1 file, with a .ind file. The output file name defaults to the -o value plus the AG name.
- arsdoc get -h ARCHIVE -f "folder name"  -a -o "(docid)(docdate)(doclang)".SEQ -i "where docid='ABC'" -v
 This retrieved the 3 hits as 3 separate files, without a .ind file. Each output file name is a combination of the database field values and a sequence number.

ARSDOC doesn't allow -c and -g to be combined with database field names in the output name, so I can't figure out how to generate unique output names while retrieving only the most recently loaded document.

Please share your thoughts.

LWagner

  • Guest
Re: ARSDOC GET with complex requirement
« Reply #1 on: May 23, 2014, 07:35:47 AM »
Could you identify the application group field/column names?

I have a load date field, LOAD_DATE, in all of my application groups.
If you do too, you can specify the load date value you want.
Selecting the latest one within the SQL statement itself is beyond my current SQL abilities, but it may be possible.

I think the WHERE clause would need to have a MAX(LOAD_DATE) among its selection criteria.

Also add (LOAD_DATE) to the output file name pattern.
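
As a rough sketch of the MAX(LOAD_DATE) idea, here is the general SQL pattern run against an in-memory SQLite table from Python. Everything here is an assumption for illustration: the table name `ag`, the ISO-formatted load dates, and the `latest_loads` helper are hypothetical, and I have not confirmed that the arsdoc -i option accepts a subquery like this.

```python
import sqlite3

# Hypothetical sample rows: (docid, docdate, doclang, load_date).
# Dates are stored as ISO text so that MAX() orders them correctly.
SAMPLE = [
    ("ABC", "2013-01-01", "EN", "2013-01-01"),
    ("ABC", "2013-01-01", "EN", "2013-01-02"),
    ("ABC", "2013-01-01", "EN", "2013-01-03"),
    ("DEF", "2013-02-01", "EN", "2013-02-01"),
    ("DEF", "2013-02-01", "EN", "2013-02-02"),
    ("DEF", "2013-02-01", "SP", "2013-02-01"),
]

# Correlated subquery: keep a row only if its load_date is the
# maximum among rows that share the same index values.
QUERY = """
    SELECT a.docid, a.docdate, a.doclang, a.load_date
    FROM ag AS a
    WHERE a.load_date = (
        SELECT MAX(b.load_date)
        FROM ag AS b
        WHERE b.docid = a.docid
          AND b.docdate = a.docdate
          AND b.doclang = a.doclang
    )
    ORDER BY a.docid, a.docdate, a.doclang
"""


def latest_loads(rows):
    """Return the latest-loaded row per (docid, docdate, doclang)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE ag (docid TEXT, docdate TEXT, doclang TEXT, load_date TEXT)"
        )
        conn.executemany("INSERT INTO ag VALUES (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in latest_loads(SAMPLE):
        print(row)
```

If the subquery is not accepted in the WHERE clause, the same selection could be done in two steps: query the latest load date per key first, then pass each value explicitly.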

LWagner

  • Guest
Re: ARSDOC GET with complex requirement
« Reply #2 on: July 17, 2014, 12:14:07 PM »
You may need to specify the documents individually for retrieval using a parameter file.
I recently did that, and the use of column names in the output name then works.

I retrieved a total of 19,000 PDFs in this manner. Note that -o carries any spaces in the field values into the file names. Using the Java API calls may be a more effective retrieval method.
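
One way to deal with the spaces after the fact is a small clean-up pass over the retrieved files. The sketch below is not part of ARSDOC; the `sanitize_name` and `rename_without_spaces` helpers and the example file name are hypothetical.

```python
import re
from pathlib import Path


def sanitize_name(name, replacement="_"):
    """Trim the name and collapse each run of whitespace into one replacement."""
    return re.sub(r"\s+", replacement, name.strip())


def rename_without_spaces(directory):
    """Rename files in a directory so their names contain no whitespace.

    Returns the list of new names. Note: this does not check for
    collisions, so two names that sanitize to the same value would clash.
    """
    renamed = []
    for path in sorted(Path(directory).iterdir()):
        clean = sanitize_name(path.name)
        if path.is_file() and clean != path.name:
            path.rename(path.with_name(clean))
            renamed.append(clean)
    return renamed


if __name__ == "__main__":
    # Hypothetical file name with padded field values.
    print(sanitize_name("ABC  2013-01-01 EN.1"))
```

Run `rename_without_spaces` on a copy of the output directory first, since the rename is done in place.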