OnDemand User Group

Support Forums => MP Server => Topic started by: fnb4321 on June 06, 2017, 09:54:50 AM

Title: Calling PDF Indexing Guru's
Post by: fnb4321 on June 06, 2017, 09:54:50 AM: I have a file (attached; it is not live data) for which I need to create an index for only the MEMOSTMT pages (just 2 pages in the file), leaving the rest of the data out of the index. I have tried a couple of things and I can get the one index, but it contains more than just the 2 MEMOSTMT pages. I am indexing on the small gray font at the bottom of Page 1. I have also attached some indexing parameters I have tried.

Does anyone have any suggestions on how I can filter out everything except for the 2 MEMOSTMT pages? Is it even possible?

I am using the PDF indexer with X/Y coordinates, and we are at version 9.5.0.2.

Any assistance or insight would be greatly appreciated.
Title: Re: Calling PDF Indexing Guru's
Post by: swat_is_back on June 07, 2017, 12:37:58 PM: Did you try the PPD stuff?
Title: Re: Calling PDF Indexing Guru's
Post by: fnb4321 on June 07, 2017, 03:28:44 PM: Unfortunately, it is a file coming from a vendor.
Title: Re: Calling PDF Indexing Guru's
Post by: Greg Ira on June 08, 2017, 05:28:28 AM: Does the vendor have the ability to reformat that footer? Possibly move MEMOSTMT to just after OnDemand?
Title: Re: Calling PDF Indexing Guru's
Post by: jsquizz on June 08, 2017, 07:22:59 AM: Can you index off of "Mail statement to", or is that an image instead of text?

I haven't had a requirement to index a PDF graphically in a few years, but that's how I would have done it.
Title: Re: Calling PDF Indexing Guru's
Post by: sisusteve on August 23, 2017, 11:09:52 AM: How about a single trigger on 'OnDemand', and then a doc_type index with the values 'CARDSTMT' and 'MEMOSTMT'? This should break the PDF into separate documents.
Title: Re: Calling PDF Indexing Guru's
Post by: fnb4321 on August 24, 2017, 05:33:49 AM: I reached out to IBM support, and they stated there was no way to do what I was trying to accomplish (remember, I didn't want it broken out into separate documents as some of you suggested, because the CARDSTMT customers had to be able to view the MEMOSTMT data).

So, what I ended up doing was processing the input file twice. The first time, I indexed on "CARDSTMT" only, which allowed all of the MEMOSTMT data that follows to be viewed by the CARDSTMT user.

For the 2nd pass, I created a different app group and indexed on the string "OnDemand", which broke the file into separate documents. I then added a SQL query restriction so that the CARDSTMT index could not be viewed by any users (for this 2nd app group).

So the end result was what I wanted to achieve, but unfortunately I am storing some redundant data that is just not viewable by users.
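
As a rough illustration of the two passes described above, here is a minimal Python sketch of the grouping logic only. It is not OnDemand indexer syntax. The page labels, the `split_on` helper, and the assumption that the string "OnDemand" appears on every page are all hypothetical, and the final filter merely stands in for the SQL query restriction on the 2nd app group.

```python
# Hypothetical page labels standing in for the footer text found on each page.
pages = ["CARDSTMT", "MEMOSTMT", "CARDSTMT", "CARDSTMT", "MEMOSTMT"]


def split_on(page_labels, starts_document):
    """Group 1-based page numbers into documents.

    A new document begins on every page for which starts_document(label)
    is true; other pages are appended to the current document.
    """
    docs = []
    for number, label in enumerate(page_labels, start=1):
        if starts_document(label) or not docs:
            docs.append((label, [number]))
        else:
            docs[-1][1].append(number)
    return docs


# Pass 1 (first app group): break only on CARDSTMT, so any MEMOSTMT pages
# that follow stay inside the CARDSTMT document and remain viewable there.
pass1 = split_on(pages, lambda label: label == "CARDSTMT")

# Pass 2 (second app group): assume "OnDemand" is on every page, so every
# page becomes its own document, indexed by its label.
pass2 = split_on(pages, lambda label: True)

# Stand-in for the query restriction: hide the CARDSTMT documents.
visible_pass2 = [doc for doc in pass2 if doc[0] != "CARDSTMT"]

print(pass1)
print(visible_pass2)
```

The CARDSTMT pages are still loaded in the second pass and only hidden afterwards, which is the redundant storage mentioned above.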
Title: Re: Calling PDF Indexing Guru's
Post by: Justin Derrick on August 24, 2017, 09:27:21 AM: Ugh. Yeah, this isn't optimal.

Can you write something up about this requirement and submit it to the Enhancement Forum? You aren't likely to be the only person with this (or a similar) requirement. Spend some time thinking about how it should work to be the most flexible, and write down your thoughts on that too, so the developers have a better idea of what you're trying to achieve.

Thanks.

-JD.