OnDemand User Group

Support Forums => Report Indexing => Topic started by: tjspencer2 on April 09, 2015, 07:24:46 AM

Title: Statement Indexing Troubleshooting
Post by: tjspencer2 on April 09, 2015, 07:24:46 AM
We are outsourcing the creation of statements, and we're using PDF files and PDF Indexer.

My vendor provided me a file of PDF statements that they said had 3039 statements.

When I load the statements into CMOD via PDF Indexer, I only get 3038??

We are using "Page 1 of" as our trigger to uniquely identify the first page of a statement.

When I open their file in Adobe X and do an advanced search, I find 3039 instances of the phrase "Page 1 of".

But PDF Indexer only loads 3038 statements??

I've looked at this phrase across all 3039 statements but can only get 3038 to load.

Has anybody ever encountered this?  How could I troubleshoot?  I'm baffled!! :(
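
One way to narrow down where the missing statement went is sketched below. It assumes the text of each page has already been extracted into a list of strings (the extraction step is not shown, and the tool used for it is up to you). It groups pages by an exact match on the trigger phrase, then compares each group's page count with the total printed after "Page 1 of". The function names are hypothetical, and this is not an emulation of PDF Indexer's own matching; it only flags places where an exact string match would merge two statements.

```python
import re

TRIGGER = "Page 1 of"
# Captures the page total printed after the trigger, e.g. "Page 1 of 3" -> 3
PAGE_OF = re.compile(r"Page 1 of\s+(\d+)")

def split_statements(pages, trigger=TRIGGER):
    """Group page indexes into statements.

    A new group starts on every page whose text contains the exact trigger;
    all other pages are appended to the current group.
    """
    groups = []
    for i, text in enumerate(pages):
        if trigger in text or not groups:
            groups.append([i])
        else:
            groups[-1].append(i)
    return groups

def suspicious_groups(pages, groups):
    """Return (first_page_index, declared_pages, actual_pages) for every
    group whose length disagrees with the total printed on its first page."""
    out = []
    for g in groups:
        m = PAGE_OF.search(pages[g[0]])
        if m and int(m.group(1)) != len(g):
            out.append((g[0], int(m.group(1)), len(g)))
    return out

if __name__ == "__main__":
    # Hypothetical sample: the last page looks like "Page 1 of 1" on screen
    # but contains a no-break space, so the exact match misses it.
    sample = ["Page 1 of 2", "Page 2 of 2", "Page 1 of 1", "Page\u00a01 of 1"]
    groups = split_statements(sample)
    print(len(groups), suspicious_groups(sample, groups))
```

A group that declares fewer pages than it actually holds is a candidate for two statements that were merged, which gives a specific page to inspect in the PDF.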
Title: Re: Statement Indexing Troubleshooting
Post by: jeffs42885 on April 09, 2015, 07:36:24 AM
I've seen this before and it was a headache to troubleshoot.

You mentioned that you've looked at this phrase across all 3039 statements, but just out of curiosity: tucked away in this document, could there be a statement with two pages? In the case I saw, the To: address had extra lines.

Examples

This would load as one page/statement, working as expected:

Jeff S
1234 Main St
Anytown CA, 90210

Something like this would cause the statement to spill over onto the next page:

Jeff S
CMOD Person
1234 Main St
Anytown CA, 92010

Jeff S
1234 Main St
Suite 234
Anytown CA, 92010
Title: Re: Statement Indexing Troubleshooting
Post by: tjspencer2 on April 09, 2015, 01:36:31 PM
I think there's something to the first statement rolling over onto the first page of the duplicate statement and indexer not interpreting the first page of the 2nd statement as a new statement.

Just so I'm straight on uniqueness, there's nothing ensuring uniqueness of statements, right?

The statement that isn't loading is completely identical to the one that precedes it.
Title: Re: Statement Indexing Troubleshooting
Post by: Justin Derrick on April 09, 2015, 02:35:17 PM
Depending on your version of CMOD, uniqueness may be enforced automatically.
Title: Re: Statement Indexing Troubleshooting
Post by: tjspencer2 on April 13, 2015, 03:23:32 PM
So uniqueness isn't our issue, as there are some accounts for which we create two statements in CMOD.

What's happening is that there are a couple of accounts for which we generate the same statement multiple times and create each copy in CMOD.

For some instances of these, the "Page 1 of" trigger isn't being interpreted as the beginning of a new statement but instead as the continuation of an existing statement.

In our 167,000 statements this happens 5 times, and the result is that these 5 statements are combined with the statement ahead of them :(

Is there a way to analyze the PDF file to see control characters that may not be visible in the PDF file when viewing it?  Is it even possible for a control character to be in a PDF file and not be visible?
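
A minimal sketch of one way to look for such characters, assuming the page text around the trigger has already been extracted to a Python string (the extraction step is not shown). Extracted text can contain characters that display as nothing or as an ordinary space, such as a no-break space or a zero-width space. The helper name `reveal_hidden` is hypothetical; it only lists those characters with their code points so two copies of the trigger can be compared.

```python
import unicodedata

def reveal_hidden(text):
    """Return (offset, codepoint, name) for characters that look blank
    or invisible but are not a plain space or newline."""
    hits = []
    for pos, ch in enumerate(text):
        cat = unicodedata.category(ch)
        # Cc = control, Cf = format (e.g. zero-width space),
        # Zs = space separators other than the plain space excluded below
        if ch not in " \n" and cat in ("Cc", "Cf", "Zs"):
            name = unicodedata.name(ch, "<unnamed>")
            hits.append((pos, f"U+{ord(ch):04X}", name))
    return hits

if __name__ == "__main__":
    # Hypothetical samples: both look like "Page 1 of 3" when displayed.
    print(reveal_hidden("Page 1 of 3"))
    print(reveal_hidden("Page\u00a01 of 3"))
```

Running this on the trigger text from a statement that loads correctly and from one that gets merged would show whether the two differ in a way that is not visible on screen.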
Title: Re: Statement Indexing Troubleshooting
Post by: pankaj.puranik on April 15, 2015, 01:39:55 AM
Is it possible for you to share the indexing script/parameter information that you used?