Topic: Error Using PPD for PDF Indexing for the first time

JJeffrey

  • Guest
Error Using PPD for PDF Indexing for the first time
« on: October 24, 2017, 09:47:47 AM »
I'm trying to index PDFs using the Page Piece Dictionary (PPD) for the first time. The input is vendor-generated. I keep getting this error on the load attempt:

ARS4334I Load Version <9.5.0.9>  Operating System <Linux> <#1 SMP Thu Jul 6 19:56:57 EDT 2017.3.10.0-693.el7.x86_64>  OS Userid <ADMIN>  Install Location </opt/ibm/ondemand/V9.5/> Data(unlimited KB) Stack(8192
ARS4335I Server Version <9.5.0.9>  Operating System <Linux> <#1 SMP Thu Jul 6 19:56:57 EDT 2017.3.10.0-693.el7.x86_64>  Database <DB2> <10.05.0006>
ARS4339I Application Group >MLETTERS<
ARS4340I Application >DENMEM<
ARS4341I Storage Set >Cache Only - Library Server<
ARS4342I Storage Node >Cache Only - Library Server<
ARS4302I Indexing started, 472376 bytes to process
ARS4901I INDEXSTARTBY=1
ARS4901I RESTYPE=all
ARS4901I INDEXMODE=INTERNAL
****
ARS4902I Number of input pages = 4
ARS4918E Page extraction failed!
Exception raised: 536936466, Expected a name object.
ARS4922I ARSPDOCI 9.5.0.9 completed code 1
ARS4309E Indexing failed
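
For longer load logs, it can help to filter out everything except the failing messages. The following is a minimal Python sketch, not an IBM tool. It assumes that every message ID has the shape seen above (`ARS`, four digits, one trailing letter) and that the trailing letter marks severity (I = informational, W = warning, E = error). The function name `find_problems` is made up for illustration.

```python
import re

# Assumed message shape, based on the log above: "ARS" + 4 digits + a letter.
# The letter is taken to be the severity: I (info), W (warning), E (error).
MSG_RE = re.compile(r"^(ARS\d{4})([IWE])\s+(.*)$")

def find_problems(log_text):
    """Return (message_id, text) pairs for every warning or error line.

    A line without a message ID (such as "Exception raised: ...") is
    appended as detail to the warning or error directly before it.
    """
    problems = []
    last_was_problem = False
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        m = MSG_RE.match(line)
        if m:
            msg_id, severity, text = m.groups()
            last_was_problem = severity in "WE"
            if last_was_problem:
                problems.append((msg_id + severity, text))
        elif last_was_problem:
            msg_id, text = problems[-1]
            problems[-1] = (msg_id, text + " | " + line)
    return problems

sample = """\
ARS4302I Indexing started, 472376 bytes to process
ARS4902I Number of input pages = 4
ARS4918E Page extraction failed!
Exception raised: 536936466, Expected a name object.
ARS4922I ARSPDOCI 9.5.0.9 completed code 1
ARS4309E Indexing failed
"""

for msg_id, text in find_problems(sample):
    print(msg_id, "-", text)
```

Run against the excerpt above, this leaves only ARS4918E (with the "Expected a name object" exception attached) and ARS4309E, which is the part worth quoting when opening a support case.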

IBM Technical says that the PDF input is corrupt. If it is, why can I open it? Searching Google has been of no help. Any ideas from the user group would be welcome.

We're running CMOD 9.5.0.9 on a RHEL7 server. 

jsquizz

  • Global Moderator
  • Hero Member
Re: Error Using PPD for PDF Indexing for the first time
« Reply #1 on: October 24, 2017, 10:09:03 AM »
Take a quick look at this: https://community.spiceworks.com/topic/1464881-adobe-acrobat-expected-a-number-object

Maybe someone can chime in. I am not 100% sure how the process works under the covers, but it sounds like your source system may have messed up when adding the PPDs.

JJeffrey

  • Guest
Re: Error Using PPD for PDF Indexing for the first time
« Reply #2 on: November 01, 2017, 06:24:26 AM »
The PDF had to be analyzed by Datalogics via IBM. This is what the problem was:
Datalogics reports that there are problems with the Annotation arrays in the document.  Some of the entries in the Annots array contain Parent entries that point to empty objects.
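
To picture what that finding means, here is a toy Python sketch. It does not parse a real PDF and is not what Datalogics or ARSPDOCI actually runs. It models a PDF as a table of numbered objects and flags annotations whose Parent reference resolves to nothing. All object numbers and contents are invented for illustration. One plausible reading of the behavior is that a viewer never follows that reference or quietly treats it as null, while a stricter library expects a real object there and raises an error.

```python
# Toy model of a PDF object table: object number -> object contents.
# None stands for an object that is defined but empty (or missing).
# Every number and key below is made up for illustration.
objects = {
    10: {"Type": "Page", "Annots": [20, 21]},
    20: {"Type": "Annot", "Subtype": "Popup", "Parent": 30},
    21: {"Type": "Annot", "Subtype": "Popup", "Parent": 31},
    30: {"Type": "Annot", "Subtype": "Text"},
    31: None,  # the Parent of annotation 21 resolves to nothing
}

def dangling_parents(objects):
    """Return (page, annotation, parent) triples whose Parent is empty."""
    bad = []
    for page_num, obj in sorted(objects.items()):
        if not obj or obj.get("Type") != "Page":
            continue
        for annot_num in obj.get("Annots", []):
            annot = objects.get(annot_num) or {}
            parent = annot.get("Parent")
            # A Parent entry exists, but the object it points to is empty.
            if parent is not None and not objects.get(parent):
                bad.append((page_num, annot_num, parent))
    return bad

print(dangling_parents(objects))  # prints [(10, 21, 31)]
```

A check like this on the vendor side, using a real PDF library, would catch the problem before the file ever reaches the load.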

I'm not a PDF expert, but this issue doesn't prevent the PDF from opening in a viewer. The findings were sent back to the vendor, and yesterday I received a corrected document and successfully loaded the documents using PPD in both V9.5.0.9 and V9.0.0.7. (Don't ask: we're migrating to new servers and a new CMOD version, and converting our TSM 5.5 and migrating it off the old AIX servers.)