Author Topic: PDF Page Piece info - help  (Read 3509 times)

Douglas

  • Jr. Member
  • **
  • Posts: 27
PDF Page Piece info - help
« on: August 08, 2018, 02:02:40 PM »
Hello

I am getting 1000 PDFs and an index file.  I can successfully store the 1000 PDFs using the Generic Indexer; it is quick and dirty.  (The 1000 PDFs, totaling 182 MB, store in 26 seconds and compress to 116 MB.)

The PDFs also have the Page Piece Info.  So when I run a test using the PDF indexer on a batch of 100 documents compressed into 1 PDF file (totaling 14 MB), it stores in 73 seconds and comes out at 18 MB.

For a batch of 250 documents compressed into 1 PDF file (input PDF is 35 MB), it stores in 512 seconds and comes out at 99 MB.

Why are my PDFs drastically increasing in size after they are stored when I am using Page Piece Info?
And why are they taking so long to store?

                         Before    After     Time (seconds)
1000 PDFs                182 MB    116 MB    26
1 PDF (100 documents)    14 MB     18 MB     73
1 PDF (250 documents)    35 MB     99 MB     512
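
To put those figures side by side, here is a small Python sketch that works out the stored-to-input size ratio and the load throughput for each run. The numbers are copied from the table above; the function and variable names are only illustrative.

```python
# Figures copied from the table above: (label, input MB, stored MB, seconds).
runs = [
    ("1000 PDFs, Generic Indexer", 182, 116, 26),
    ("1 PDF, 100 documents, PDF indexer", 14, 18, 73),
    ("1 PDF, 250 documents, PDF indexer", 35, 99, 512),
]


def summarize(label, input_mb, stored_mb, seconds):
    """Return the stored/input size ratio and the load throughput in MB/s."""
    return {
        "label": label,
        "size_ratio": stored_mb / input_mb,
        "mb_per_second": input_mb / seconds,
    }


summaries = [summarize(*run) for run in runs]
for s in summaries:
    print(f"{s['label']:<36} ratio {s['size_ratio']:.2f}  {s['mb_per_second']:.2f} MB/s")
```

A ratio below 1 means the stored data is smaller than the input; a ratio above 1 means it grew during the load.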

Nolan

  • Full Member
  • ***
  • Posts: 152
Re: PDF Page Piece info - help
« Reply #1 on: August 08, 2018, 05:28:26 PM »
If I remember what Bud said, you should not have compression enabled.  PDF is already compressed, so try turning it off and running your tests again.
J.

#zOS #AIX #Windows #Multiplatforms
#DB2 #TSM #ODF #zODF #ODWEK
#CapacityPlanning #AFP #ReportDistribution
#Finance #ICN

Justin Derrick

  • IBM Content Manager OnDemand Consultant
  • Administrator
  • Hero Member
  • *****
  • Posts: 2228
  • CMOD Guru for hire...
    • Tenacious Consulting
Re: PDF Page Piece info - help
« Reply #2 on: August 09, 2018, 03:52:55 AM »
I think the compression that Bud was talking about is the internal PDF compression, not the data & resource compression settings in the Application Definition.
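
One rough way to see whether a PDF uses internal stream compression is to look for `/Filter` entries such as `/FlateDecode` in the raw bytes. The Python sketch below is only a heuristic, not a PDF parser: it assumes the stream dictionaries are stored as plain text, so it will miss filters declared inside compressed object streams. The function name and the sample bytes are made up for illustration.

```python
import re
from collections import Counter


def count_stream_filters(pdf_bytes: bytes) -> Counter:
    """Count /Filter names found in raw PDF bytes (rough heuristic, not a parser)."""
    # Matches "/Filter /FlateDecode" and the first name in "/Filter [/FlateDecode ...]".
    pattern = re.compile(rb"/Filter\s*\[?\s*/([A-Za-z0-9]+)")
    return Counter(name.decode("ascii") for name in pattern.findall(pdf_bytes))


# Hand-written sample bytes standing in for a real file.
sample = (
    b"<< /Length 12 /Filter /FlateDecode >> stream ... endstream "
    b"<< /Length 34 /Filter [/DCTDecode] >> stream ... endstream"
)
filters = count_stream_filters(sample)
print(filters)
```

To try it on a real file, read the file in binary mode and pass the bytes in. No `/Filter` hits at all would suggest the streams are stored uncompressed, subject to the caveat above.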

I'd be opening a PMR with IBM if my files tripled in size during a load.  :)

-JD.
IBM CMOD Professional Services: http://TenaciousConsulting.com
Call:  +1-866-533-7742  or  eMail:  jd@justinderrick.com
IBM CMOD Wiki:  https://CMOD.wiki/
FREE IBM CMOD Education & Webinars:  https://CMOD.Training/

Interests: #AIX #Linux #Multiplatforms #DB2 #TSM #SP #Performance #Security #Audits #Customizing #Availability #HA #DR

Douglas

  • Jr. Member
  • **
  • Posts: 27
Re: PDF Page Piece info - help
« Reply #3 on: August 09, 2018, 05:26:04 AM »
The PDFs are not compressed; they are just merged together.

I merge them to:

1)   treat the load as 1 load instead of multiple loads
2)   hopefully maximize resource stripping and compression when CMOD stores the PDF
 

Nolan

  • Full Member
  • ***
  • Posts: 152
Re: PDF Page Piece info - help
« Reply #4 on: August 09, 2018, 06:06:51 AM »
I still think you need to try running the load with compression turned off in the Load tab of the Application ID.  We also had that issue of increasing storage size, but that was back on z/OS.
J.


Douglas

  • Jr. Member
  • **
  • Posts: 27
Re: PDF Page Piece info - help
« Reply #5 on: August 09, 2018, 07:33:44 AM »
I did run load tests with document compression set to None, Disable, and others.

I am running on z/OS, 9.5.0.1.

Comp type   Input        Output       Rows
OD77        104964644    98569326     250
OD77        19441637     18070032     100
None        602067       602675       10
Disable     602067       602676       10
Disable     19441645     19445485     100
LZW12       19441649     25353227     100
LZW16       19441634     24814181     100
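
For reference, the percentage change per compression type can be worked out from the table above with a few lines of Python. The values are copied from the table as posted; the helper name is only illustrative.

```python
# (compression type, input size, output size, rows), copied from the table above.
results = [
    ("OD77", 104964644, 98569326, 250),
    ("OD77", 19441637, 18070032, 100),
    ("None", 602067, 602675, 10),
    ("Disable", 602067, 602676, 10),
    ("Disable", 19441645, 19445485, 100),
    ("LZW12", 19441649, 25353227, 100),
    ("LZW16", 19441634, 24814181, 100),
]


def percent_change(input_size, output_size):
    """Return the change from input to output as a percentage of the input."""
    return (output_size - input_size) / input_size * 100.0


changes = [(comp, rows, percent_change(i, o)) for comp, i, o, rows in results]
for comp, rows, pct in changes:
    print(f"{comp:<8} {rows:>4} rows  {pct:+.2f}%")
```

A negative value means the stored output is smaller than the input; a positive value means it grew.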

Ed_Arnold

  • Hero Member
  • *****
  • Posts: 1199
Re: PDF Page Piece info - help
« Reply #6 on: August 09, 2018, 09:09:39 AM »


Comp type   Input          Output         Rows
OD77        104,964,644    98,569,326         250
OD77         19,441,637    18,070,032         100
None            602,067       602,675          10
Disable         602,067       602,676          10
Disable      19,441,645    19,445,485         100
LZW12        19,441,649    25,353,227         100
LZW16        19,441,634    24,814,181         100



Those are the types of numbers I'd expect to see if the input was already compressed.
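
The general effect is easy to reproduce outside CMOD. The Python sketch below uses `zlib` (DEFLATE) purely as a stand-in; it is not the OD77 or LZW implementation used by the product, and the sample data is invented. Repetitive text shrinks dramatically, while high-entropy bytes, which is what already-compressed data looks like, do not shrink at all.

```python
import os
import zlib

# Highly repetitive text stands in for uncompressed report data.
plain = b"ACCOUNT STATEMENT 0001 BALANCE 100.00\n" * 20000
# Random bytes stand in for data that is already compressed (high entropy).
already_compressed = os.urandom(len(plain))


def ratio(data: bytes) -> float:
    """Return compressed size divided by original size, using zlib (DEFLATE)."""
    return len(zlib.compress(data, 6)) / len(data)


plain_ratio = ratio(plain)
packed_ratio = ratio(already_compressed)
print(f"repetitive text:   {plain_ratio:.3f}")
print(f"high-entropy data: {packed_ratio:.3f}")
```

A second compression pass over such data can only add framing overhead, which is consistent with the None and Disable rows staying essentially flat.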

Ed
« Last Edit: August 09, 2018, 09:11:42 AM by Ed_Arnold »
#zOS #ODF

Nolan

  • Full Member
  • ***
  • Posts: 152
Re: PDF Page Piece info - help
« Reply #7 on: August 09, 2018, 09:30:34 AM »
Two comments.  You should patch up to 9.5.0.10, and I am certain that PDF indexing on z/OS is not supported at V9.5.  You should be running the PDF indexing on a Windows or Linux box and pushing it into the z/OS server.
J.


Douglas

  • Jr. Member
  • **
  • Posts: 27
Re: PDF Page Piece info - help
« Reply #8 on: August 09, 2018, 11:50:55 AM »
I will take a look at v9.5.0.10

I am loading from Windows to z/OS.

Nolan

  • Full Member
  • ***
  • Posts: 152
Re: PDF Page Piece info - help
« Reply #9 on: August 13, 2018, 12:18:01 PM »
Rereading your original post makes me wonder how you are 'merging' your PDFs into one file.  It sounds like the tool you are using is not really merging but just concatenating the PDFs.  I suspect that is what is throwing the indexer off.  I haven't used such tools so I can't be sure, but I believe the recommendation from Bud is that the source application should create 1 large PDF document with the PPD details to allow for maximum compression.
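
A quick way to test the concatenation theory is to count `%PDF-` header markers in the merged file. The Python sketch below is a heuristic under stated assumptions: a properly merged PDF normally has one header, while files appended back to back carry one per original file. An uncompressed embedded PDF attachment could also trigger it, so treat a hit as a hint rather than proof. The function names and sample bytes are made up for illustration.

```python
def count_pdf_headers(data: bytes) -> int:
    """Count '%PDF-' header markers in raw bytes."""
    return data.count(b"%PDF-")


def looks_concatenated(data: bytes) -> bool:
    """Heuristic: more than one header suggests files were appended back to back."""
    return count_pdf_headers(data) > 1


# Minimal stand-in for a single PDF file (not a complete, valid document).
one = b"%PDF-1.4\n1 0 obj\n<<>>\nendobj\ntrailer\n<<>>\n%%EOF\n"

single_result = looks_concatenated(one)
appended_result = looks_concatenated(one + one)
print(single_result, appended_result)
```

To check a real file, read it in binary mode and pass the bytes to `looks_concatenated`.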
J.
