Author Topic: Statement Indexing Troubleshooting  (Read 2800 times)

tjspencer2

  • Jr. Member
  • **
  • Posts: 80
    • View Profile
Statement Indexing Troubleshooting
« on: April 09, 2015, 07:24:46 AM »
We are outsourcing the creation of our statements. The vendor delivers them as PDF files, and we load them with PDF Indexer.

My vendor provided me a file of PDF statements that they said had 3039 statements.

When I load the statements into CMOD via PDF Indexer, I only get 3038??

We are using "Page 1 of" as our trigger to uniquely identify the first page of a statement.

When I open their file in Adobe X and do an advanced search, I find 3039 instances of the phrase "Page 1 of".

But PDF Indexer only loads 3038 statements??

I've looked at this phrase across all 3039 statements but can only get 3038 to load.

Has anybody ever encountered this?  How could I troubleshoot?  I'm baffled!! :(

jeffs42885

  • Guest
Re: Statement Indexing Troubleshooting
« Reply #1 on: April 09, 2015, 07:36:24 AM »
I've seen this before and it was a headache to troubleshoot.

You mentioned that you've looked at this phrase across all 3039 statements, but just out of curiosity: tucked away in this document, could there be a statement that runs to two pages? In the case I saw, the to: address had extra lines.

Examples

This would load as one page per statement, working as expected:

Jeff S
1234 Main St
Anytown CA, 90210

Something like either of these would cause the statement to spill over onto the next page:

Jeff S
CMOD Person
1234 Main St
Anytown CA, 92010

Jeff S
1234 Main St
Suite 234
Anytown CA, 92010

tjspencer2

Re: Statement Indexing Troubleshooting
« Reply #2 on: April 09, 2015, 01:36:31 PM »
I think there's something to the idea of the first statement rolling over onto the first page of the duplicate statement, with the indexer not interpreting the first page of the second statement as a new statement.

Just so I'm straight on uniqueness, there's nothing ensuring uniqueness of statements, right?

The statement that isn't loading is completely identical to the one that precedes it.

Justin Derrick

  • IBM Content Manager OnDemand Consultant
  • Administrator
  • Hero Member
  • *****
  • Posts: 2230
  • CMOD Guru for hire...
    • View Profile
    • Tenacious Consulting
Re: Statement Indexing Troubleshooting
« Reply #3 on: April 09, 2015, 02:35:17 PM »
Depending on your version of CMOD, uniqueness may be enforced automatically.

tjspencer2

Re: Statement Indexing Troubleshooting
« Reply #4 on: April 13, 2015, 03:23:32 PM »
So uniqueness isn't our issue, as there are some accounts for which we create two statements in CMOD.

What's happening is that there are a couple of accounts for which we generate the same statement multiple times and load each copy into CMOD.

For some instances of these, the "Page 1 of" trigger isn't being interpreted as the beginning of a new statement but instead as the continuation of an existing statement.

In our 167,000 statements this happens 5 times, and the result is that each of these 5 statements is combined with the statement ahead of it :(
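
One way to locate the merged statements, as a rough sketch: compare how many pages each loaded statement spans against the total printed in its own "Page 1 of N" text. This assumes you already have the text of each page (extracted with whatever tool you have available) and the list of pages where the indexer started a new statement. The names `pages`, `starts`, and `find_merged_statements` are hypothetical and are not part of CMOD or PDF Indexer.

```python
import re

# Matches the printed page counter, e.g. "Page 1 of 3".
PAGE_RE = re.compile(r"Page\s+(\d+)\s+of\s+(\d+)")


def find_merged_statements(pages, starts):
    """Flag statements that span more pages than their first page says they should.

    pages  -- list of strings, the extracted text of each page (assumed input)
    starts -- sorted 0-based page indexes where the indexer began a statement
              (assumed input, e.g. taken from the indexer's output)

    Returns a list of (start_page, pages_in_statement, printed_total) tuples.
    """
    suspects = []
    bounds = list(starts) + [len(pages)]
    for start, end in zip(bounds, bounds[1:]):
        match = PAGE_RE.search(pages[start])
        if match is None:
            continue  # no page counter found on the first page; nothing to compare
        printed_total = int(match.group(2))
        actual = end - start
        if actual > printed_total:
            suspects.append((start, actual, printed_total))
    return suspects


if __name__ == "__main__":
    sample_pages = [
        "Page 1 of 2", "Page 2 of 2",
        "Page 1 of 1",
        "Page 1 of 1",  # duplicate statement whose trigger was not picked up
        "Page 1 of 1",
    ]
    print(find_merged_statements(sample_pages, [0, 2, 4]))
```

A statement that spans more pages than its first page claims has probably swallowed the statement behind it, so those are the pages worth inspecting by hand.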

Is there a way to analyze the PDF file to see control characters that may not be visible when viewing it?  Is it even possible for a control character to be in a PDF file and not be visible?
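
Text extracted from a PDF can contain characters that look like an ordinary space or show nothing at all, such as a no-break space (U+00A0) or a zero-width space (U+200B). Whether that is what is happening here is only a guess, but it is cheap to check. A minimal sketch, assuming you can copy or extract the text around the trigger into a Python string (`invisible_chars` is a hypothetical helper name):

```python
import unicodedata


def invisible_chars(text):
    """Return (offset, codepoint, name) for characters that are easy to miss by eye.

    Reports control characters (Cc), format characters such as zero-width
    spaces (Cf), and space separators other than a plain U+0020 (Zs).
    Ordinary spaces, tabs, and line breaks are skipped.
    """
    hits = []
    for offset, ch in enumerate(text):
        if ch in " \n\r\t":
            continue
        if unicodedata.category(ch) in ("Cc", "Cf", "Zs"):
            name = unicodedata.name(ch, "UNNAMED")
            hits.append((offset, f"U+{ord(ch):04X}", name))
    return hits


if __name__ == "__main__":
    # The first string uses a no-break space between "Page" and "1".
    for sample in ("Page\u00a01 of 3", "Page 1 of 3"):
        print(repr(sample), invisible_chars(sample))
```

Running this on the trigger text from a statement that loads correctly and from one that gets merged would show whether the two strings really are the same character for character.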

pankaj.puranik

  • Sr. Member
  • ****
  • Posts: 374
    • View Profile
Re: Statement Indexing Troubleshooting
« Reply #5 on: April 15, 2015, 01:39:55 AM »
Is it possible for you to share the indexing script/parameter information that you used?