Author Topic: Ingesting an XML document into OnDemand  (Read 5951 times)

DDP021

  • Sr. Member
  • ****
  • Posts: 343
Ingesting an XML document into OnDemand
« on: March 15, 2012, 07:01:00 AM »
Hi, we have some users who want to load XML documents into OnDemand. We've successfully been able to load PDF, AFP and text documents. I've read in various posts that to import other formats into OnDemand you use the GENERIC indexer. When doing so we get the following message: "arsload: Generic Indexer requires the data to have already been indexed.  Index the data or verify your input file and resubmit the job". I've read that an index file must be created; I'm just not sure how to create it or where it is stored. We would like the users to send their XML documents directly to our OnDemand server, where they would be collected by ARSLOAD. This is how we currently handle our PDFs. Any help would be appreciated. Thanks!

demaya

  • Guest
Re: Ingesting an XML document into OnDemand
« Reply #1 on: March 16, 2012, 12:19:47 AM »
Hi,

you can also use the normal indexer. I have some slides here about an XML archiving project at IBM Germany. They added an "information" line at the top of the XML so the indexer can load the index values properly. And they did this for "some" millions of SEPA transactions.

Maybe the header line is an approach you can use?

Otherwise you can use index files. InfoCenter: http://publib.boulder.ibm.com/infocenter/cmod/v8r5m0/index.jsp?topic=%2Fcom.ibm.ondemand.indexingmp.doc%2Fars1d171307.htm

Because the added header line makes the XML invalid, they render it before the user views it.

Cheers

DDP021

  • Sr. Member
  • ****
  • Posts: 343
Re: Ingesting an XML document into OnDemand
« Reply #2 on: March 16, 2012, 05:47:21 AM »
Thanks for the info. From what I've read, this would require me to use the GENERIC INDEXER. When I attempt to run it via the wizard, the only options are LINE, AFP and PDF. When I choose LINE and use my Excel spreadsheet as the sample, the data isn't rendered readably. I reviewed the information on IBM's site about the generic indexer, and I'm just not sure how to define the parameter file or, once it is defined, where this file should be stored. Is it on the OnDemand server? I was hoping someone out there has already loaded an Excel document into OnDemand, but maybe not.

demaya

  • Guest
Re: Ingesting an XML document into OnDemand
« Reply #3 on: March 16, 2012, 05:56:29 AM »
Mhhh Excel or pure XML?

Frederick Tybalt

  • Full Member
  • ***
  • Posts: 124
Re: Ingesting an XML document into OnDemand
« Reply #4 on: March 16, 2012, 06:12:48 AM »
This is the common format of the index file to be created. Any type of document can be loaded using generic indexing. The index file should be named filename.ind, the file with the content should be named filename.out, and there should be a trigger file named filename.

Code:
COMMENT:
CODEPAGE:
COMMENT:
GROUP_FIELD_NAME:INDEX1
GROUP_FIELD_VALUE:Value1
GROUP_FIELD_NAME:INDEX2
GROUP_FIELD_VALUE:Value2
GROUP_FIELD_NAME:INDEX3
GROUP_FIELD_VALUE:Value3
GROUP_OFFSET:0
GROUP_LENGTH:100
GROUP_FILENAME:filename.out
GROUP_FIELD_NAME:INDEX1
GROUP_FIELD_VALUE:Value1
GROUP_FIELD_NAME:INDEX2
GROUP_FIELD_VALUE:Value2
GROUP_FIELD_NAME:INDEX3
GROUP_FIELD_VALUE:Value3
GROUP_OFFSET:100
GROUP_LENGTH:200
GROUP_FILENAME:filename.out

In the application, the indexer should be set to "Generic". Hope this helps.
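
The layout above could be generated rather than typed by hand. Here is a minimal Python sketch, assuming each document's bytes are appended to a single .out file and that offsets are counted in bytes from the start of that file. The function name `build_generic_index` and the codepage value are placeholders for illustration, not anything prescribed by OnDemand.

```python
from typing import Dict, List, Tuple


def build_generic_index(
    documents: List[Tuple[Dict[str, str], bytes]],
    out_filename: str,
    codepage: str,
) -> Tuple[str, bytes]:
    """Return (index text, concatenated document bytes) for one load."""
    lines = ["COMMENT: generated index", f"CODEPAGE:{codepage}"]
    payload = b""
    for fields, content in documents:
        # One NAME/VALUE pair per application group field.
        for name, value in fields.items():
            lines.append(f"GROUP_FIELD_NAME:{name}")
            lines.append(f"GROUP_FIELD_VALUE:{value}")
        # Assumption: the next document starts at the byte where the
        # previous one ended, so the offset is the payload size so far.
        lines.append(f"GROUP_OFFSET:{len(payload)}")
        lines.append(f"GROUP_LENGTH:{len(content)}")
        lines.append(f"GROUP_FILENAME:{out_filename}")
        payload += content
    return "\n".join(lines) + "\n", payload


# Two tiny stand-in documents; "819" is only a placeholder codepage.
index_text, payload = build_generic_index(
    [({"INDEX1": "Value1"}, b"<a/>"), ({"INDEX1": "Value2"}, b"<bb/>")],
    out_filename="filename.out",
    codepage="819",
)
```

Writing `index_text` to filename.ind and `payload` to filename.out would give the pair of files described above; the field names must match the ones defined in your application group.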
rIcK

DDP021

  • Sr. Member
  • ****
  • Posts: 343
Re: Ingesting an XML document into OnDemand
« Reply #5 on: March 16, 2012, 08:13:04 AM »
Thanks again for the responses. I believe these will be true XML documents. I have defined the application group and application to OnDemand. Forgive my ignorance, but where would the files you mention be stored? I mean the line, "The index file name should be filename.ind, the file with content should have filename.out and there should be trigger file name named as filename". From what I understand, filename.out would be the actual XML document? We are hoping the application can send these XML documents directly to our OnDemand server. They are currently doing that with their PDF documents: we have them name the files Application.ApplicationGroup.pdf, they send them to a specific arsload directory on the server, and the files get ingested automatically. I'm not sure how this same process would work with what you describe. This is all new to us, and we haven't found any examples to go from as far as what needs to be defined. Currently all of our line data files come from mainframe jobs. Do these XML files have to be sent to the mainframe first, prior to ingestion into OnDemand?


demaya

  • Guest
Re: Ingesting an XML document into OnDemand
« Reply #6 on: March 16, 2012, 09:15:06 AM »
Hi,

What Frederick is talking about is that you give OnDemand all the information it needs to archive the data in the IND file, instead of letting arsload find the information by itself.

OUT = XML file
IND = all the Values (see the code box)
ARD = trigger file, which signals that OUT and IND have been fully copied to the directory

You can send the full triple (OUT + IND + ARD) to a directory (copy and paste / Samba) and let arsload run against that directory. From what I understand, that is the same process you use for line data.
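
To illustrate the ordering, here is a small Python sketch that copies the OUT and IND files first and creates the trigger last, so that arsload should never see the trigger before the other two files are complete. The function name `deliver` and the `base_name` parameter are hypothetical, and the exact file names arsload expects on your system are an assumption you should check against your own setup.

```python
import shutil
from pathlib import Path


def deliver(out_file: Path, ind_file: Path, load_dir: Path, base_name: str) -> Path:
    """Copy the content and index files, then create the trigger file last."""
    load_dir.mkdir(parents=True, exist_ok=True)
    # Content and index go first, so they are complete on disk
    # before anything announces them.
    shutil.copyfile(out_file, load_dir / f"{base_name}.out")
    shutil.copyfile(ind_file, load_dir / f"{base_name}.ind")
    # The trigger is an empty file named after the base name
    # (pass a base name ending in ".ARD" if that is your convention).
    trigger = load_dir / base_name
    trigger.touch()
    return trigger
```

For example, `deliver(Path("doc.xml"), Path("doc.ind"), Path("/some/arsload/dir"), "filename.ARD")` would leave filename.ARD.out, filename.ARD.ind and an empty filename.ARD in the directory; the directory path here is made up.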

Cheers

Justin Derrick

  • IBM Content Manager OnDemand Consultant
  • Administrator
  • Hero Member
  • *****
  • Posts: 2230
  • CMOD Guru for hire...
Re: Ingesting an XML document into OnDemand
« Reply #7 on: March 16, 2012, 09:33:41 AM »
The short answer is that you must write a program to create the Generic Index file (possibly based on the contents of the XML file itself) for arsload to process.

You can write this program in any language, but a language optimized for text processing (like Perl, especially with XML modules) will be your best choice.

Once your program is complete, and produces the generic index files for arsload, you need to create an Application with a 'User Defined' type, and ensure that your users have the ability to view XML files on their PCs.
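
As a rough illustration of such a program (in Python rather than Perl), the sketch below pulls index values out of one XML document with the standard library and emits a single-document generic index. The element names, the field names, the function name `index_for_xml` and the codepage value are all made up for the example; your XML layout and application group fields will differ.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def index_for_xml(xml_bytes: bytes, field_paths: Dict[str, str],
                  out_filename: str, codepage: str) -> str:
    """Build generic index text for one XML document stored as one file."""
    root = ET.fromstring(xml_bytes)
    lines = [f"CODEPAGE:{codepage}"]
    for field_name, path in field_paths.items():
        # findtext returns None when no element matches the path.
        value = root.findtext(path)
        if value is None:
            raise ValueError(f"no element at {path!r} for field {field_name}")
        lines.append(f"GROUP_FIELD_NAME:{field_name}")
        lines.append(f"GROUP_FIELD_VALUE:{value.strip()}")
    # The whole file is one document: start at byte 0, length = file size.
    lines.append("GROUP_OFFSET:0")
    lines.append(f"GROUP_LENGTH:{len(xml_bytes)}")
    lines.append(f"GROUP_FILENAME:{out_filename}")
    return "\n".join(lines) + "\n"


# Hypothetical document and field mapping; "819" is a placeholder codepage.
sample = b"<payment><account>12345</account><date>2012-03-15</date></payment>"
text = index_for_xml(sample, {"INDEX1": "account", "INDEX2": "date"},
                     "filename.out", "819")
```

The result would be written next to the XML file as the index file that arsload reads.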

-JD.