OnDemand User Group

Support Forums => Report Indexing => Topic started by: DDP021 on June 13, 2012, 09:47:42 AM

Title: Ingesting an EXCEL spreadsheet into ERR
Post by: DDP021 on June 13, 2012, 09:47:42 AM
Can anyone provide the procedure for ingesting a NON indexed EXCEL spreadsheet into OnDemand8.4?....Basically all we're trying to do is 'dump' a spreadsheet into CMOD...No indexing....Currently we are using ARSLOAD to ingest both text, AFP and PDF documents...We had an inquiry from a user to load an excel (XLS) spreadsheet...any help or specific direction would be appreciated.
Title: Re: Ingesting an EXCEL spreadsheet into ERR
Post by: jeffs42885 on June 13, 2012, 09:52:21 AM
Can the file be saved as a CSV file? If so, I would assume you would just index it like a text file.
Title: Re: Ingesting an EXCEL spreadsheet into ERR
Post by: DDP021 on June 13, 2012, 09:54:02 AM
That I'm not sure....The question was just raised if we could ingest a true EXCEL spreadsheet...my guess is they don't want to do anything on there end other than providing us the spreadsheet itself.
Title: Re: Ingesting an EXCEL spreadsheet into ERR
Post by: jeffs42885 on June 13, 2012, 11:51:09 AM
I just tried doing this, It did not open properly with the indexer. (xls file)

My advice would be to load the file with a generic index file that contains rname, rdate...etc..and tailor your ARSLOAD to it.

If not, some type of third party product might need to be used..I think XENOS can do this, though I have never done it.
Title: Re: Ingesting an EXCEL spreadsheet into ERR
Post by: Alessandro Perucchi on June 13, 2012, 09:44:50 PM
Hello,

you can ingest whatever format you want in OnDemand, just be sure for such non indexed data to choose "User Defined Data" with the extension "xls" or "xlsx" in the application, and choose "Generic Index".
Meaning that you must provide a minimum index with it like, "date" and "report type" for example.

Sincerely yours,
Alessandro
Title: Re: Ingesting an EXCEL spreadsheet into ERR
Post by: DDP021 on June 14, 2012, 04:27:42 AM
Alessandro,  I had attempted previously to do what you suggested..I defined the application with the view information as 'user defined' and xls as the application...I also set the indexer as Generic..I've read in other posts when the generic indexer is used, a separate index file must be created...I'm just not sure how it is to be formatted or where it is stored...Currently we have applications who send pdf files directly to our OnDemand server under an ARSLOAD library which then automatically gets ingested into Content Manager OnDemand by ARSLOAD...A majority of the data going to OnDemand currently is coming from the mainframe and the files all end in .ARD...I believe out systems area had to update something in ARSLOAD to allow the automatic ingestion of PDF's...The format of the files are Application.ApplicationGroup.pdf..I'm assuming with the XLS files, something will need to be updated in ARSLOAD to accept a file formatted as Application.ApplicationGroup.xls...But again, the question I have is regarding how to provide this minimum index file needed, its format and where it's stored.
Title: Re: Ingesting an EXCEL spreadsheet into ERR
Post by: DDP021 on June 14, 2012, 04:34:08 AM
Alessandro, here is a copy of the 88 message I receive with just having the application defined with "user defined" and the generic indexer...It seems to point back to the indexer file but as my previous post indicated, not sure how to define that piece...

arsload: 06/06/12 09:18:36 -- Indexing started, 432128 bytes to process
arsload: Unknown Indexer ><
arsload: Output/Indexer file was not created
arsload: 06/06/12 09:18:36 Indexing failed
Title: Re: Ingesting an EXCEL spreadsheet into ERR
Post by: rstockton on June 14, 2012, 08:52:16 AM
We use the method defined by Alessandero - user defined, xls and generic indexer - with .ARD, .IND and .OUT files with the .IND containing the index inofrmation.

Regards,
Russell
Title: Re: Ingesting an EXCEL spreadsheet into ERR
Post by: DDP021 on June 14, 2012, 09:50:45 AM
Russell, do you have any examples?..Currently we have users sending PDF files right to our PROD ERR server (to an ARSLOAD) directory...They format their files as ApplicationName.ApplicationGroupName.pdf...Once the file gets to the ARSLOAD directory, it is automatically ingested...We obviously define the ApplicationName and ApplicationGroupName...The difference being with it being a PDF, the application contains the indexing information which we obtain by running the data through the report wizard on the ADMIN client..We would like to accomplish the same thing with an XLS file..(ex ApplicationName.ApplicationGroupName) minus the indexing......I'm just not sure how you incorporate all these files...Especially the .IND file that contains the indexing...I've not been able to find a true example of this kind of file and how it would get incorporated in the process...Specifically where it's defined and stored...
Title: Re: Ingesting an EXCEL spreadsheet into ERR
Post by: rstockton on June 14, 2012, 11:27:19 AM
Following is an example of how we load files to OnDemand.


Our user application sends us the following files.  The .ARD is the trigger file, the .IND is the Index file and the .OUT file is the data file.  These three files are dropped into a loading zone and picked up by arsload job for the application group.

ADVCLAIM.PD.ACLAIMINB.ACLAIMINB.D030312.T18300200901207908.ARD
ADVCLAIM.PD.ACLAIMINB.ACLAIMINB.D030312.T18300200901207908.ARD.IND
ADVCLM/ADVCLAIM.PD.ACLAIMINB.ACLAIMINB.D030312.T18300200901207908.ARD.OUT


.IND file example

CODEPAGE:850
GROUP_FIELD_NAME:InboundDataType
GROUP_FIELD_VALUE:1
GROUP_FIELD_NAME:File_type
GROUP_FIELD_VALUE:BIN
GROUP_FIELD_NAME:CustomerID
GROUP_FIELD_VALUE:xxxxxx
GROUP_FIELD_NAME:Processed_date
GROUP_FIELD_VALUE:03/03/12
GROUP_FIELD_NAME:Processed_time
GROUP_FIELD_VALUE:183000000002
GROUP_FIELD_NAME:FileID
GROUP_FIELD_VALUE:36355375
GROUP_FIELD_NAME:TransID
GROUP_FIELD_VALUE:TRB3B2F0821000526
GROUP_OFFSET:000000000
GROUP_LENGTH:213
GROUP_FILENAME:/arsacif/load3/ADVCLM/ADVCLAIM.PD.ACLAIMINB.ACLAIMINB.D030312.T18300200901207908.ARD.OUT
Title: Re: Ingesting an EXCEL spreadsheet into ERR
Post by: DDP021 on June 18, 2012, 03:16:46 AM
Thanks Russell.....What I'm attempting to do for now is just ingest a sample Excel spreadsheet with NO indexing...So on my Application Group the only fields I have defined are POSTING DATE, REPORT NAME, REPORT DESCRIPTION, LOAD DATE and PAGE COUNT.  These are all FILTERS.  For our normal text files the .ARD is the actual data file. From what you indicated, the .OUT is the actual spreadsheet data file... Forgive my ignorance, but not familiar with a trigger file..with never having used a separate index file (.IND), just not sure how/where it's defined.  And once i get past that point, will it have to be supplied to the end user who is actually sending the spreadsheet data to our server to be ingested by ARSLOAD?....
Title: Re: Ingesting an EXCEL spreadsheet into ERR
Post by: DDP021 on June 18, 2012, 06:53:19 AM
Russell,  what fields need to be defined in the .ind file?...As indicated, this is an excel spreadsheet with NO indexing...I'm basically attempting to ingest it into Content Manager OnDemand8.4 as is...Does the .ind file need to contain all the fields defined under the application group?...It appears from your example you are specifying values for these fields...The 'fields' I have defined in the application group are:
POSTING_DATE
REPORT_NAME  <  Report name defined
DESCRIPTION  <  Description of the report
LOAD_DATE
PAGE_COUNT

For the POSTING_DATE and LOAD_DATE we set the Default Value with a lower case 't' on Load Information tab on the Application definition.

I just don't know what I have to have specified on this .ind file...

Also, when you define it, do you simple use notepad and then save it in the name format needed, (ie ApplicationGroup.Application.ARD.IND).?
Title: Re: Ingesting an EXCEL spreadsheet into ERR
Post by: Stephen McNulty on June 18, 2012, 08:59:47 AM
If you look in the Indexing Reference manual (on any platform)  look to the chapters on Generic indexer reference.  You will see the parameters listed, as well as sample index files.  and yes, any editor will do.
Title: Re: Ingesting an EXCEL spreadsheet into ERR
Post by: Alessandro Perucchi on June 18, 2012, 01:24:12 PM
DDP021,

CMOD is a structure archiving system, meaning that you MUST provide an index with your file.
If your user don't provide an index, then you must provide a mechanism that will create the generic index that Russel provided as an example.

What I do sometimes with my customer is create a script that at the end launch "arsload" manually.
And before launching arsload, it will create an index automagically for the files that doesn't have indexes. But to create such index, then you must have some naming convention, index convention, etc... and then you can create your script that create the index for you.

But basically, no index, no archiving of documents.

The ONLY thing I might imagine is that, with the explanation I gave you (APP with User Type : xls  + Generic Index)
Then you give default values for "Report Type" like "Excel report"
and for the date you give the value "t"   <- not a typo, really the letter "t", which means "today"
If you want for the date, you might set it up in the application group as a date/time... so it might be easier to find the document you want.

1) create a script that check every minute if a file is in the input directory.
2) check if this is a .xls
3) create a index with that template:

CODEPAGE:923
GROUP_OFFSET:0
GROUP_LENGTH:0
GROUP_FILENAME:<your_file_that_you_have_found_in_the_input_directory.xls>

Name it for example:   myfile-xls-20120618221902.ind
4) call arsload:
      arsload -I archive -u admin -p password -g myAG-XLS -a myAPP-XLS -nvf myfile-xls-20120618221902

it will archive automatically without any index, because they are automatically generated in the application with default values.

But again, you need something... without anything it won't work.

If it's not clear (it could be... It's quite late for me right now!!) tell me, and I will do a full explanation with all details.

Sincerely yours,
Alessandro
Title: Re: Ingesting an EXCEL spreadsheet into ERR
Post by: DDP021 on June 20, 2012, 04:36:37 AM
Alessandro,  thank you (and Russell) for all the information...I finally was able to get things to work...I didn't realize at first the trigger file was simply a blank file...Once I realized that I created it, along with the index file (.ind)...My index file simply contains just the Codepage, Group offset, Group length and Group file name information.  Currently our systems group have files ending with .pdf configured to automatically ingest with ARSLOAD...So here's what configured:

TEST.TEST.PDF  < Empty file
TEST.TEST.PDF.IND < within this file I have the complete path and data file name (TEST.XLS)
TEST.XLS < Actual data file

I copy the TEST.XLS to one of our ARSLOAD directories on our server first
I then copy both the TEST.TEST.PDF and TEST.TEST.PDF.IND to the same directory as the TEST.XLS file.
ARSLOAD then automatically ingests the file.  After ingestion is complete, the TEST.TEST.PDF and TEST.TEST.PDF.IND files are deleted.
We just need to work with our systems group to configure ARSLOAD to also delete the TEST.XLS file after ingestion

We loaded not only XLS but also TXT and CSV file types configuring the view information on the Application setup with the file extension it should be opened with in CMOD.

Again, thanks to everyone for all the valuable information you provided. 

Take care.

Dave
Title: Re: Ingesting an EXCEL spreadsheet into ERR
Post by: Alessandro Perucchi on June 20, 2012, 05:47:32 AM
what you could simply do... is to have

TEST.TEST.PDF = TEST.XML

and TEST.TEST.PDF.IND has the complete path of TEST.TEST.PDF

so at the end you will have that:

TEST.TEST.PDF   (which in fact contains the whole XLS but with a name without ending with .xls :-D)
TEST.TEST.PDF.IND (which is the same index but with the reference path to TEST.TEST.PDF)

and that's it... So at the end, both files are removed.

Sincerely yours,
Alessandro
Title: Re: Ingesting an EXCEL spreadsheet into ERR
Post by: DDP021 on June 20, 2012, 07:13:42 AM
Correct...We tested that and both of those files get removed...The issue is we are loading .XLS files, and the actual data file (ie test.test.xls) file ISN'T being removed from the ARSLOAD directory after it's ingested....Currently, with our PDF's, they send one file (ex test.test.pdf) and once it's ingested, it's removed....Guessing systems needs to define something in ARSLOAD to automatically remove .xls files after ingestion.
Title: Re: Ingesting an EXCEL spreadsheet into ERR
Post by: Alessandro Perucchi on June 21, 2012, 10:32:00 PM
Hello DDP021,

by reading your answer, I get the impression I was not clear in my suggestion... so if I'm repeating myself and you've understood the first time, please forgive me in advance!!

what I wanted to say is that you don't need 3 files

Quote
TEST.TEST.PDF  < Empty file
TEST.TEST.PDF.IND < within this file I have the complete path and data file name (TEST.XLS)
TEST.XLS < Actual data file

You could do like that:

Remove TEST.TEST.PDF   (since it's empty you won't have any issues)
rename TEST.XML into TEST.TEST.PDF
Change the reference in TEST.TEST.PDF.IND so it points with GROUP_FILENAME: to TEST.TEST.PDF instead of TEST.XML.

that way, you will have exactly 2 files:

TEST.TEST.PDF
TEST.TEST.PDF.IND

and nothing else.

Is it clearer now?
Concerning the delete of files after the load is successful, maybe that's a bug, you might open a PMR by IBM for that.


Now, I'm just wondering, why do you want to use the name TEST.TEST.PDF ??? I mean if it's PDF, then it's not XLS, or anything else... why is that? That is just a test? Or is there something else behind?
Can't you create a new Application in your ApplicationGroup in order to be able to handle better XLS and have a better naming convention, like TEST.TEST.XLS and TEST.TEST.XLS.IND without faking the system with some PDF extension somewhere?
Or have I missed something???

Sincerely yours,
Alessandro
Title: Re: Ingesting an EXCEL spreadsheet into ERR
Post by: DDP021 on June 25, 2012, 04:06:11 AM
Alessandro,  I believe we finally have everything in place we need...There a basically 3 file types we are loading (CSV, XLS and TXT)...All 3 need to be opened in OnDemand using Excel...So I've defined the Application as User Defined with the XLS extension....I'm then using 3 files for the ARSLOAD funtion:

Example:

TEST.TEST.ARD  (Blank "Trigger" file)
TEST.TEST.ARD.IND (Index File)
TEST.TEST.ARD.OUT (Actual data file)

These naming conventions allow the ARSLOAD process to automatically run and delete all 3 files after successful ingestion...

I do have an off the wall question...This hasn't come up yet, but if the Application would want "breaks" inserted on the data when viewing the spreadsheets in OnDemand, is that something that is possible?...We enable the "Page Break" function on the Application Group for Line data which utilizes Carriage Control (1 in column 1 at the top of every page)...Obviously an Excel spreadsheet or txt files don't have that...So right now we are just loading the entire spreadsheet without and breaks...Just curious if we could the application would request we break after so many lines on the spreadsheet, if that default could be defined/set somewhere?...