Topic: Library Server Hangs when LARGE PDF statements (> 200 MB) Retrieved

tjspencer2

  • Jr. Member
  • Posts: 80
I posted this thread on the ICN board, but really this is a CMOD MP issue.

We use our CMOD 9.5 MP platform primarily to store PDF statements for internal and external customers.  We've encountered an issue where large PDF statement files (> 200 MB) completely hang the platform when a user attempts to view them via Navigator or the OnDemand thick client.  These LARGE statements are few and far between, but they do exist, and they have completely hung my platform a couple of times recently.

I am using the standard Daeja ViewONE viewer rather than associating PDF files with Adobe.  I've tried associating the PDFs with Adobe to help performance, but that only reduces the work of loading the PDF into the viewer - there is still major degradation.

Unfortunately, I don't have much say in the files that get loaded to the platform - they are what they are.  Most of the millions of statements are just a couple of pages, but the exceptions run all the way up to 5,000+ pages and 500 MB, and those just hang my platform.

Perhaps I should query the platform for these large files, export them, and remove them from the platform?

Other thoughts?

Justin Derrick

  • IBM Content Manager OnDemand Consultant
  • Administrator
  • Hero Member
  • Posts: 2229
  • CMOD Guru for hire...
Re: Library Server Hangs when LARGE PDF statements (> 200 MB) Retrieved
« Reply #1 on: August 07, 2019, 05:32:17 PM »
I'd start by fetching one of these PDFs and finding someone with Adobe Acrobat Professional.  There's an option in Acrobat Pro to audit a PDF file and determine how much space is allocated to different data in the file itself (images, fonts, text, etc.).  Once you find the offenders (often images or fonts), it might be possible to reduce their size or remove them.  Alternatively, there's another option in Acrobat called 'optimize' -- it may substantially reduce the size of a PDF by re-compressing images, substituting fonts, or enabling compression.

Compression is a mixed bag, depending on which PDF indexer you're using.

-JD.
IBM CMOD Professional Services: http://TenaciousConsulting.com
Call:  +1-866-533-7742  or  eMail:  jd@justinderrick.com
IBM CMOD Wiki:  https://CMOD.wiki/
FREE IBM CMOD Education & Webinars:  https://CMOD.Training/

Interests: #AIX #Linux #Multiplatforms #DB2 #TSM #SP #Performance #Security #Audits #Customizing #Availability #HA #DR

Stephen McNulty

  • Jr. Member
  • Posts: 57
Re: Library Server Hangs when LARGE PDF statements (> 200 MB) Retrieved
« Reply #2 on: August 09, 2019, 05:11:09 AM »
Which component is hanging?  Is it the server or the client workstation?  Also, what exact levels of 9.5.* are you using on the server and the client?
#ISERIES #ODWEK #XML

tjspencer2

Re: Library Server Hangs when LARGE PDF statements (> 200 MB) Retrieved
« Reply #3 on: August 09, 2019, 09:39:41 AM »
Justin -

For sure the images are what is killing me.

I'm trying to find a way to stop these from being loaded in the first place so that I don't have to address them reactively.

Is there some guidance on ways to search and retrieve statement sizes from the DB2 tables that my large statements are stored in to identify potential offenders?

Stephen -
To your question about what's hanging: memory usage on the CMOD Library Server goes to 100%, which compromises connections to CMOD via ICN as well as via external web services.

tjspencer2

Re: Library Server Hangs when LARGE PDF statements (> 200 MB) Retrieved
« Reply #4 on: August 09, 2019, 01:27:42 PM »
Adding a little more information here: when I query my AG table in DB2, I see the following for an offending file:

STMT_DT RESOURCE    DOCSIZE     PAGECOUNT
------- ----------- ----------- -----------
  17928         854    14947245        3696

When I look at resource 854, it's 277 MB.
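
For what it's worth, the two numbers above can be put side by side with a little arithmetic. This is only a sketch: it assumes DOCSIZE is in bytes, that 277 MB means 277 x 10^6 bytes, and that the resource travels with the document on retrieval. None of that is confirmed in this thread, and the function name is made up for illustration.

```python
# Sketch only: compare the size recorded in the AG table with the resource
# that accompanies the document. Units are assumptions (see the note above).

MB = 1_000_000  # assumed: decimal megabytes


def retrieval_footprint(doc_bytes: int, resource_bytes: int) -> dict:
    """Return the combined size and the share contributed by the resource."""
    total = doc_bytes + resource_bytes
    return {
        "doc_mb": doc_bytes / MB,
        "resource_mb": resource_bytes / MB,
        "total_mb": total / MB,
        "resource_share": resource_bytes / total,
    }


# Values from the query output above: DOCSIZE 14947245, resource 854 at 277 MB.
footprint = retrieval_footprint(doc_bytes=14_947_245, resource_bytes=277 * MB)
print(f"document : {footprint['doc_mb']:.1f} MB")
print(f"resource : {footprint['resource_mb']:.1f} MB")
print(f"combined : {footprint['total_mb']:.1f} MB")
print(f"resource share of combined size: {footprint['resource_share']:.0%}")
```

Under those assumptions the resource accounts for well over 90% of the combined size, so DOCSIZE alone badly understates what a retrieval has to move.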

Justin Derrick

Re: Library Server Hangs when LARGE PDF statements (> 200 MB) Retrieved
« Reply #5 on: August 10, 2019, 09:59:40 AM »
I only have one thing to say in response to the resource size:  That's insane.  :)

Consider asking the people generating the statements to run the images through an optimizer (such as zopflipng), reduce the resolution, substitute fonts, etc.

-JD.

tjspencer2

Re: Library Server Hangs when LARGE PDF statements (> 200 MB) Retrieved
« Reply #6 on: August 12, 2019, 02:47:37 PM »
Justin -

In this example, I have 883 pages of text and 2,812 pages of images - 10 images per page.

Do you think the issue is the optimization of the images OR just the sheer volume of them?
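
One rough way to frame the optimization-versus-volume question is to estimate the average bytes per image. A sketch, assuming the 277 MB resource mentioned earlier in the thread consists almost entirely of these images, that MB means 10^6 bytes, and that every image page carries exactly 10 images; none of these is confirmed.

```python
# Back-of-the-envelope estimate; everything except the page and image counts
# quoted above is an assumption.

IMAGE_PAGES = 2_812               # from the post above
IMAGES_PER_PAGE = 10              # from the post above
RESOURCE_BYTES = 277 * 1_000_000  # assumed: 277 MB resource, decimal MB


def average_image_bytes(resource_bytes: int, pages: int, per_page: int) -> float:
    """Average size of one image if the resource held nothing but images."""
    image_count = pages * per_page
    return resource_bytes / image_count


image_count = IMAGE_PAGES * IMAGES_PER_PAGE
avg = average_image_bytes(RESOURCE_BYTES, IMAGE_PAGES, IMAGES_PER_PAGE)
print(f"{image_count} images, about {avg / 1000:.1f} KB each on average")
```

If the per-image figure comes out small, further compression has little room to help and the sheer count of images dominates; if it comes out large, optimization is worth pursuing first.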

Also, is there a good query to identify my large statements so I can determine the disposition of them?

With resource files this large, just pulling the statement size from my AG table is insufficient.

Thanks.

Justin Derrick

Re: Library Server Hangs when LARGE PDF statements (> 200 MB) Retrieved
« Reply #7 on: August 13, 2019, 05:53:04 AM »
10 images per page is a little heavy, but it also depends on what they are.  If it's an image of text / fine print, that's bad.  If it's a company logo or some icons, it shouldn't be too bad.

But if you have the same 10 images over and over again, 280+ times, then there's an optimization issue somewhere.  You're going to have to meet with the people who produce your documents and see what they can do for you.

-JD.

tjspencer2

Re: Library Server Hangs when LARGE PDF statements (> 200 MB) Retrieved
« Reply #8 on: August 14, 2019, 08:42:39 AM »
Our issue is that we have thousands of pages with 10 images (checks) on each page - 3000 such pages in just this example.

I'll talk to the folks that are preparing these to see if there's optimization they can do, but with examples like this, I think the sheer volume of image pages alone - even if optimized - would be too much to store in a resource file and probably not how CMOD was intended to be used, right?

Also, I thought the PDF Indexer only separated the header graphics (logos, etc.) from the statement and put those in resource files, BUT it appears PDF Indexer is also separating the check images and storing those in the resource files too - is that accurate?

Finally, my biggest need is a query that identifies all my reports/statements and takes into account the size of the resource files that accompany them.  With that, I can sort out my problem statements and start addressing them by size, perhaps even prioritizing by resource file size.  If the resource file is not too large, I can probably return those statements in good order, even if they have many text pages.
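
Once the two sets of numbers are exported, the kind of report described above could be sketched as follows. This is not a CMOD query: the record layout, field names, and sample rows (other than the one quoted earlier in the thread) are hypothetical. It assumes each document row carries the ID of the resource it uses, that resource sizes can be obtained separately, and that sizes are in bytes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """One exported document row (hypothetical layout)."""
    stmt_dt: int
    resource_id: int
    doc_size: int      # assumed to be bytes
    page_count: int


def rank_by_combined_size(statements, resource_sizes, threshold_bytes):
    """Return (statement, combined_bytes) pairs at or above the threshold,
    largest first. Resources missing from resource_sizes count as 0 bytes."""
    ranked = []
    for stmt in statements:
        combined = stmt.doc_size + resource_sizes.get(stmt.resource_id, 0)
        if combined >= threshold_bytes:
            ranked.append((stmt, combined))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# First row is the example from this thread; the other rows are made up.
statements = [
    Statement(stmt_dt=17928, resource_id=854, doc_size=14_947_245, page_count=3696),
    Statement(stmt_dt=17928, resource_id=12, doc_size=48_000, page_count=2),
    Statement(stmt_dt=17930, resource_id=901, doc_size=2_500_000, page_count=410),
]
# Resource sizes in bytes; 854 is from the thread, the rest are illustrative.
resource_sizes = {854: 277_000_000, 12: 35_000, 901: 210_000_000}

offenders = rank_by_combined_size(statements, resource_sizes, threshold_bytes=200_000_000)
for stmt, combined in offenders:
    print(stmt.stmt_dt, stmt.resource_id, stmt.page_count, f"{combined / 1_000_000:.1f} MB")
```

The 200 MB threshold simply mirrors the size in the thread title; ranking on the combined figure rather than DOCSIZE is what surfaces documents whose bulk sits in the resource.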

Thanks so much for all you do for the ODUG Community!!!!

Justin Derrick

Re: Library Server Hangs when LARGE PDF statements (> 200 MB) Retrieved
« Reply #9 on: August 15, 2019, 06:58:40 AM »
Ah.  Yeah.  Cheque images are a monster.  If anything, they should probably be stored individually as TIFF/GIF/PNGs in CMOD, but you'll probably have to twist the arm of the line of business to get it done.

I'd say you have a rare (not unique) requirement, and you should probably talk to the developers about how this could be addressed -- likely by excluding specific types of graphics, or graphics with a particular tag.

As for optimizing the images...  You could probably only reduce their size / quality so much before customers would complain about readability, and your print vendor would complain about the extra CPU you'd be chewing up to create those optimized versions.

And as for help with your query, send me a DM.

-JD.