OnDemand User Group

Support Forums => Report Indexing => Topic started by: Douglas on August 08, 2018, 02:02:40 PM

Title: PDF Page Piece info - help
Post by: Douglas on August 08, 2018, 02:02:40 PM
Hello

I am receiving 1000 PDFs and an index file.  I can successfully store the 1000 PDFs using the Generic Indexer; it is quick and dirty.  (1000 PDFs, totaling 182 MB, store in 26 seconds and compress to 116 MB.)

The PDFs also have the Page Piece Info.  So I ran a test using the PDF indexer on a batch of 100 documents compressed into 1 PDF file (totaling 14 MB): it stores in 73 seconds and "compresses" to 18 MB.

A batch of 250 documents compressed into 1 PDF file (input PDF is 35 MB) stores in 512 seconds and "compresses" to 99 MB.

Why are my PDFs drastically increasing their size after they are stored when I am using Page Piece Info?
And why are they taking so long to store?

                          Before    After     Time (sec)
1000 PDFs                 182 MB    116 MB    26
1 PDF (100 documents)     14 MB     18 MB     73
1 PDF (250 documents)     35 MB     99 MB     512
Title: Re: PDF Page Piece info - help
Post by: Nolan on August 08, 2018, 05:28:26 PM
If I remember what Bud said, you should not have compression enabled.  PDF is already compressed so try turning it off and run your tests again.   
Title: Re: PDF Page Piece info - help
Post by: Justin Derrick on August 09, 2018, 03:52:55 AM
I think the compression that Bud was talking about is the internal PDF compression, not the data & resource compression settings in the Application Definition.

I'd be opening a PMR with IBM if my files tripled in size during a load.  :)

-JD.
Title: Re: PDF Page Piece info - help
Post by: Douglas on August 09, 2018, 05:26:04 AM
The PDFs are not compressed, they are just merged together

I merge them to:

1)   treat the load as 1 load instead of multiple loads
2)   hopefully maximize resource stripping and compression when CMOD stores the PDF
 
Title: Re: PDF Page Piece info - help
Post by: Nolan on August 09, 2018, 06:06:51 AM
I still think you need to try running the load with Compression turned off in the Load tab of the Application id.   We also had that issue of increasing storage size but that was back on Z/OS. 
Title: Re: PDF Page Piece info - help
Post by: Douglas on August 09, 2018, 07:33:44 AM
I did run load tests with the document compression set to None and Disable, among others.

I am running on z/OS, version 9.5.0.1.

Comp type   Input        Output       Rows
OD77        104964644    98569326     250
OD77        19441637     18070032     100
None        602067       602675       10
Disable     602067       602676       10
Disable     19441645     19445485     100
LZW12       19441649     25353227     100
LZW16       19441634     24814181     100
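
To make the rows easier to compare, here is a minimal sketch that works out the stored size as a fraction of the input size for each test. The figures are copied from the table above; the names `RESULTS`, `stored_ratio`, and `summarize` are made up for illustration, and the sizes are assumed to be bytes.

```python
# Rows copied from the table above:
# (compression type, input size, output size, rows). Sizes assumed to be bytes.
RESULTS = [
    ("OD77", 104964644, 98569326, 250),
    ("OD77", 19441637, 18070032, 100),
    ("None", 602067, 602675, 10),
    ("Disable", 602067, 602676, 10),
    ("Disable", 19441645, 19445485, 100),
    ("LZW12", 19441649, 25353227, 100),
    ("LZW16", 19441634, 24814181, 100),
]


def stored_ratio(input_bytes, output_bytes):
    """Stored size as a fraction of input size (below 1.0 means it shrank)."""
    return output_bytes / input_bytes


def summarize(results):
    """Return (compression type, rows, ratio rounded to 3 places) per test."""
    return [
        (comp, rows, round(stored_ratio(size_in, size_out), 3))
        for comp, size_in, size_out, rows in results
    ]


if __name__ == "__main__":
    for comp, rows, ratio in summarize(RESULTS):
        print(f"{comp:<8} rows={rows:<4} output/input={ratio:.3f}")
```

A ratio below 1.0 means the load shrank the data, and a ratio above 1.0 means it grew. On these numbers only the OD77 runs come out smaller than the input, and both LZW runs come out noticeably larger.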
Title: Re: PDF Page Piece info - help
Post by: Ed_Arnold on August 09, 2018, 09:09:39 AM


Comp type   Input          Output         Rows
OD77        104,964,644    98,569,326         250
OD77         19,441,637    18,070,032         100
None            602,067       602,675          10
Disable         602,067       602,676          10
Disable      19,441,645    19,445,485         100
LZW12        19,441,649    25,353,227         100
LZW16        19,441,634    24,814,181         100



Those are the types of numbers I'd expect to see if the input was already compressed.

Ed
Title: Re: PDF Page Piece info - help
Post by: Nolan on August 09, 2018, 09:30:34 AM
Two comments.  You should patch up to 9.5.0.10 and I am certain that PDF indexing on Z/OS is not supported at V9.5.  You should be running the PDF indexing on a Windows or Linux box and pushing it in to the Z/OS server. 
Title: Re: PDF Page Piece info - help
Post by: Douglas on August 09, 2018, 11:50:55 AM
I will take a look at v9.5.0.10

I am loading from Windows -> to ZOS
Title: Re: PDF Page Piece info - help
Post by: Nolan on August 13, 2018, 12:18:01 PM
Rereading your original post makes me wonder how you are 'merging' your PDFs into one file.  It sounds like the tool you are using is not really merging but just concatenating the PDFs.  I suspect that is what is throwing the indexer off.  I haven't used such tools so I can't be sure, but I believe the recommendation from Bud is that the source application should create 1 large PDF document with the PPD details to allow for the maximum compression.
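
One rough way to test the concatenation theory is to count how many PDF file headers appear in the merged file. This is a heuristic sketch, not anything from CMOD or the merge tool: it assumes a properly merged PDF carries a single `%PDF-x.y` header at the start, while a file made by appending whole PDFs end to end would carry one per input. The function names are hypothetical, and a PDF that embeds other PDFs as uncompressed attachments could trigger a false positive.

```python
import re

# A PDF file normally begins with a header such as "%PDF-1.4".
HEADER = re.compile(rb"%PDF-\d\.\d")


def count_pdf_headers(data):
    """Count PDF header markers found anywhere in the raw bytes."""
    return len(HEADER.findall(data))


def looks_concatenated(data):
    """Heuristic: more than one header suggests files appended end to end."""
    return count_pdf_headers(data) > 1


# Usage (the path is a placeholder):
#   with open("merged.pdf", "rb") as f:
#       print(looks_concatenated(f.read()))
```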