Author Topic: PDF Indexing Issue  (Read 4139 times)

subbu

  • Guest
PDF Indexing Issue
« on: February 21, 2014, 12:38:16 AM »
Hi,

Has anyone encountered the issue below during PDF indexing? The input file size is 350 MB. We are using CMOD 7.1.2.11.

ARS4902 Number of input pages = 238320
ARS4918 Page extraction failed!
Exception raised: 1073807393, Bad master index.
ARS4922 ARSPDOCI completed code 1
arsload: 02/19/14 14:13:42 Indexing failed

Any suggestions are welcome.


cheers
Subbu

Frederick Tybalt

  • Full Member
  • ***
  • Posts: 124
Re: PDF Indexing Issue
« Reply #1 on: February 21, 2014, 12:48:40 AM »
This could be a problem with the large file. Try breaking the file into smaller pieces and loading them; otherwise, CMOD needs to be upgraded to 8.4.1.6, where this issue is fixed.
« Last Edit: February 21, 2014, 12:54:53 AM by Frederick Tybalt »
rIcK
======------------------======
www.rick.co.in | www.tekbytz.com

subbu

  • Guest
Re: PDF Indexing Issue
« Reply #2 on: February 21, 2014, 01:41:11 AM »
Thanks Rick.
I just want to see whether there are any options other than breaking up the file. We will be upgrading to CMOD 8.5.0.1 in about 4 weeks.

Is there a limit on the input file size for the PDF indexer in each CMOD version? If so, what is it for CMOD 7.1.2.11?


Cheers
Subbu

subbu

  • Guest
Re: PDF Indexing Issue
« Reply #3 on: February 21, 2014, 01:57:00 AM »
I also checked the IBM website, and here is what it says:


http://www-01.ibm.com/support/docview.wss?uid=swg1PM16874

Workaround is to break the input file into smaller pieces, then
load each individually.

Local fix
Break input file into smaller pieces, load each individually.

Problem summary
 Release 8.4.1.6

Problem conclusion
 Release 8.4.1.6
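
For anyone taking the split-file route, here is a rough sketch of how the page ranges for the pieces could be planned. It only does the arithmetic; it does not touch the PDF. The function name `plan_page_ranges` and the piece size of 50000 pages are made up for illustration, not IBM guidance, and the actual cut points would probably need adjusting so that no document is split across two pieces.

```python
def plan_page_ranges(total_pages, max_pages_per_piece):
    """Return 1-based, inclusive (first, last) page ranges covering total_pages."""
    if total_pages < 0 or max_pages_per_piece < 1:
        raise ValueError("total_pages must be >= 0 and max_pages_per_piece >= 1")
    ranges = []
    first = 1
    while first <= total_pages:
        # The last piece may be shorter than the others.
        last = min(first + max_pages_per_piece - 1, total_pages)
        ranges.append((first, last))
        first = last + 1
    return ranges


if __name__ == "__main__":
    # 238320 is the page count from the ARS4902 message above;
    # 50000 pages per piece is an arbitrary value chosen for this example.
    for first, last in plan_page_ranges(238320, 50000):
        print(f"pages {first}-{last}")
```

Each range can then be handed to whatever PDF tool is used to cut the file, and the resulting pieces loaded individually.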


Cheers
Subbu

jeffs42885

  • Guest
Re: PDF Indexing Issue
« Reply #4 on: February 21, 2014, 07:01:57 AM »
Check the ulimits on the account running arssockd/arsload, as well as root. I have seen this happen before with much smaller files (100 MB), and setting all the ulimits to unlimited fixed it. (Granted, we were not using the PDF indexer, but it's still worth looking into.) Of course, do this in your lower tiers first.

If that doesn't work, upgrade a development or sandbox system to 8.5.0.x (I think 8.5.0.8 is the latest version) and then try loading it there.

I suspect the ulimits, however.
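
A quick way to see which limits are still restricted is a small script like the one below, run as the account that runs arsload. It is only a sketch using Python's standard `resource` module. The selection of limits in `LIMITS` is an assumption about what might matter here, not an IBM-documented list, and the helper names are made up for illustration.

```python
import resource

# Limits that plausibly affect large loads. This selection is an assumption,
# not a list taken from IBM documentation.
LIMITS = {
    "file size (ulimit -f)": resource.RLIMIT_FSIZE,
    "data segment (ulimit -d)": resource.RLIMIT_DATA,
    "stack (ulimit -s)": resource.RLIMIT_STACK,
    "cpu time (ulimit -t)": resource.RLIMIT_CPU,
    "open files (ulimit -n)": resource.RLIMIT_NOFILE,
}


def describe(value):
    """Render a raw limit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def restricted_limits(limits=LIMITS):
    """Return {label: (soft, hard)} for every limit whose soft value is not unlimited."""
    found = {}
    for label, res in limits.items():
        soft, hard = resource.getrlimit(res)
        if soft != resource.RLIM_INFINITY:
            found[label] = (describe(soft), describe(hard))
    return found


if __name__ == "__main__":
    for label, (soft, hard) in restricted_limits().items():
        print(f"{label}: soft={soft} hard={hard}")
```

The values are in the raw units the system call uses (bytes, seconds, file counts), so they will not always match the numbers `ulimit -a` prints. The script reports the limits of the process it runs in, so if the load is started some other way than from a login shell, it is worth running the script the same way.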

Justin Derrick

  • IBM Content Manager OnDemand Consultant
  • Administrator
  • Hero Member
  • *****
  • Posts: 2230
  • CMOD Guru for hire...
Re: PDF Indexing Issue
« Reply #5 on: February 24, 2014, 10:35:00 AM »
And just a reminder...  If you're using cron to start the load process, you need to set root's ulimits to be unlimited as well.

-JD.
IBM CMOD Professional Services: http://TenaciousConsulting.com
Call:  +1-866-533-7742  or  eMail:  jd@justinderrick.com
IBM CMOD Wiki:  https://CMOD.wiki/
FREE IBM CMOD Education & Webinars:  https://CMOD.Training/

Interests: #AIX #Linux #Multiplatforms #DB2 #TSM #SP #Performance #Security #Audits #Customizing #Availability #HA #DR

LWagner

  • Guest
Re: PDF Indexing Issue
« Reply #6 on: April 03, 2014, 11:19:41 AM »
Was this resolved?

z/OS CMOD 8.4.0.3 had a documented limit of about 250 MB for the input file.

The bigger issue was that PDF compression is poor or non-existent for multiple instances of the same graphic. We saw PDF files expand by a factor of 19 (!), so we limited our PDF input files to under 10 MB.

In discussion with IBM, we learned that the compression was not resolved until CMOD 8.5.  In addition, IBM recommended we index from Windows servers, with final upload to the z/OS server.  That gave us the best performance at that time.

We now have a second CMOD system, v8.5 on AIX. All our PDFs go to that box, indexed from 5 Windows servers.

On that system, we had load failures with large report PDFs of about 60 MB. The bigger problem was that they took forever to load when they needed to be viewed. One user documented a load time of 6 hours for one very large PDF. I suggested she might have a network configuration issue.

Justin Derrick

  • IBM Content Manager OnDemand Consultant
  • Administrator
  • Hero Member
  • *****
  • Posts: 2230
  • CMOD Guru for hire...
Re: PDF Indexing Issue
« Reply #7 on: April 06, 2014, 03:28:35 AM »
I've been told very recently that PDF indexing & storage has been dramatically improved in all aspects in the latest versions of CMOD -- speed, compression, etc.  I'll try and get more information for you.

-JD.

subbu

  • Guest
Re: PDF Indexing Issue
« Reply #8 on: April 10, 2014, 02:27:54 AM »
Hi,

We have upgraded to CMOD 8.5.0.7, and the issue is resolved.


cheers
Subbu