Author Topic: PDF Indexing on Z/Os (Read 5719 times)

Senthil · « **on:** March 10, 2011, 05:01:21 AM »

I am new to PDF indexing and have a few questions about it.

1. Installing Adobe Type 1 fonts on z/OS: can anyone share more information if you have already done this? I can't find much in the guides. Adobe now uses OpenType fonts, but the IBM guides still refer to Type 1 fonts.

2. Does anyone know the DCB parameters (record length, FB/VB, PS/PDS) for the ADOBERES, ADOBEFNT and TEMPATTR DDs, and how the font mapping table is set up?

3. Input PDF data sets: I assume the input should be in binary format. Should the data set use fixed or variable record format?

I have installed Acrobat and generated the indexing parameters with the graphical indexer, but it fails afterwards.

I am getting the error below:

LSCX870 **** WARNING **** ERRNO = EARG
Generated in FOPEN called from line 128 of @@022871(ASC0806), off
Extended name: __fopen_a
Invalid "dsn " style file name: PS-Resources-1.0?
Interrupted while: Opening file "dsn:PS-Resources-1.0?"

LWagner · « **Reply #1 on:** March 10, 2011, 09:11:25 AM »

Congratulations, you are about to be saved considerable grief!

We just tried to do this for our customer billing files (40,000 to 70,000 accounts daily).

In trying to get this resolved, IBM admitted that their gentle admonition that PDF Indexing on z/OS "does not perform very well" should be SERIOUSLY downgraded to "performs very poorly". Mind you, this was for version 8.4.0 of OnDemand.

For version 8.4.1, which requires DB2 9.1, indexing is better, but it may still run far faster on Windows, where processing time is measured in minutes per file instead of hours. Plus, the PDF rasterizer is a serious CPU hog and does not swap.

The later versions of OnDemand are able to save the common resources of container PDFs, and call them to load with the particular elemental PDF of the loading document.

In earlier versions, PDF Indexing produces an EXPANDED output PDF in which the common resources are written out with EACH elemental PDF document. We saw input PDF files expand by factors of 8 to 16.

For other questions:
Data set name                  Dsorg Recfm Lrecl Blksz
------------------------------------------------------
SYS2.CMOD.ADOBE.FONTS          PO    VB    255   27998
SYS2.CMOD.ADOBEPDF.TEMPATTR    PS    VB    256   27998

I have two members in USERPARMS:
ADOBERES, with contents: SYS2.CMOD.USERPARM(ADOBEMAP)
ADOBEMAP, which contains lines like those below; the top 5 lines are a required header format.
PS-Resources-1.0
FontOutline
.
FontOutline
EX_CFF_OCR_A_Extended=SMPE.OD840.ADOBE.FONTS(OCRAEXT)
Times-Roman=SMPE.OD840.ADOBE.FONTS(TIMES)
EX_CFF_Swis721_BT=SMPE.OD840.ADOBE.FONTS(SWISS)
EX_CFF_Swis721_BTBold=SMPE.OD840.ADOBE.FONTS(SWISSBO)
EX_CFF_Swis721_BlkCn_BT=SMPE.OD840.ADOBE.FONTS(SWISSBLC)
EX_CFF_Swis721_Cn_BT=SMPE.OD840.ADOBE.FONTS(SWISSC)
ADOBERES with contents SYS2.CMOD.USERPARM(ADOBEMAP)

format:
SMPE.OD840.ADOBE.FONTS     PO VB 255 27998
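
The layout of the map above (a `PS-Resources-1.0` header, then `FontName=DATA.SET(MEMBER)` entries) can be checked off-host before a job is submitted. The following is only a minimal sketch: `parse_font_map` and `ENTRY` are hypothetical names, not part of OnDemand, and the only rules it assumes are the ones visible in the listing.

```python
import re

# Matches entries such as "Times-Roman=SMPE.OD840.ADOBE.FONTS(TIMES)".
# The pattern is an assumption based on the ADOBEMAP listing above.
ENTRY = re.compile(
    r"^(?P<font>[^=\s]+)=(?P<dsn>[A-Z0-9.$#@]+)\((?P<member>[A-Z0-9$#@]{1,8})\)$"
)

def parse_font_map(text):
    """Return {font_name: (data_set, member)} from a PS-Resources style map."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or lines[0] != "PS-Resources-1.0":
        raise ValueError("map must start with the PS-Resources-1.0 header line")
    fonts = {}
    for ln in lines[1:]:
        m = ENTRY.match(ln)
        if m:  # header lines such as "FontOutline" and "." are skipped
            fonts[m["font"]] = (m["dsn"], m["member"])
    return fonts

SAMPLE = """PS-Resources-1.0
FontOutline
.
FontOutline
EX_CFF_OCR_A_Extended=SMPE.OD840.ADOBE.FONTS(OCRAEXT)
Times-Roman=SMPE.OD840.ADOBE.FONTS(TIMES)
"""

if __name__ == "__main__":
    for font, (dsn, member) in parse_font_map(SAMPLE).items():
        print(font, "->", dsn, member)
```

A check like this only catches layout mistakes, such as a missing header line or a malformed entry; it says nothing about whether the font members actually exist.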

LWagner · « **Reply #2 on:** March 10, 2011, 09:47:10 AM »

And PDF format, load as binary, allocated as:
Dsorg Recfm Lrecl Blksz
--------------------------
PS FB 80 27920
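
The allocations quoted in this thread follow the usual z/OS rules: for FB the block size is a multiple of the record length (27920 = 349 x 80), and for VB the block size is at least the record length plus 4 bytes. A small sanity check along those lines might look like this; `check_dcb` is a hypothetical helper, and it covers only these basic rules.

```python
def check_dcb(recfm, lrecl, blksize):
    """Return a list of problems with a RECFM/LRECL/BLKSIZE combination."""
    problems = []
    if blksize > 32760:
        problems.append("BLKSIZE exceeds 32760")
    if recfm == "FB" and blksize % lrecl != 0:
        problems.append("FB: BLKSIZE must be a multiple of LRECL")
    if recfm == "VB" and blksize < lrecl + 4:
        problems.append("VB: BLKSIZE must be at least LRECL + 4")
    return problems

if __name__ == "__main__":
    # Values taken from the posts above.
    for recfm, lrecl, blksize in [("FB", 80, 27920), ("VB", 255, 27998), ("VB", 256, 27998)]:
        print(recfm, lrecl, blksize, check_dcb(recfm, lrecl, blksize) or "ok")
```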

Senthil · « **Reply #3 on:** March 11, 2011, 08:36:23 AM »

Thanks for the update. Does OnDemand v8 support PDF versions below 1.6, or does it support only PDF version 1.6?

Adobe fonts: we don't have Adobe Type 1 fonts on z/OS. Do we need to purchase a specific Type 1 font library from Adobe?

LWagner · « **Reply #4 on:** March 11, 2011, 08:47:13 AM »

No point in it.

We loaded a test Windows Object Server yesterday; no difference in what would load, just run time.

A load that abended after 109 minutes on z/OS failed in under two minutes on Windows.

A load that took about an hour or so completed in 11 minutes. Our calculations show that our maximum load of 70,000 bills would take 2330 minutes (almost 40 hours) at that speed, so loading in three hours will take about 8 load-balanced servers. The successful PDF file had only 339 bills in it. A 2000-bill PDF load abended.
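
For anyone repeating this kind of estimate with their own numbers, the sketch below shows a plain linear extrapolation from one sample load. It is an assumption that load time scales linearly with bill count, and `estimate_minutes` is a hypothetical helper. With the rounded figures quoted above (339 bills in 11 minutes), it lands in the same range as the 2330 minutes mentioned; the exact value depends on the unrounded run time.

```python
def estimate_minutes(sample_bills, sample_minutes, target_bills):
    """Scale a measured load time linearly to a larger number of bills."""
    return target_bills * sample_minutes / sample_bills

if __name__ == "__main__":
    minutes = estimate_minutes(339, 11, 70_000)
    print(f"about {minutes:.0f} minutes, or {minutes / 60:.1f} hours")
```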

We are re-testing today with fewer CPUs and memory, to see if we get the same performance.

LWagner · « **Reply #5 on:** March 29, 2011, 07:39:48 AM »

Senthil:

Try a PDF load, regardless of version. Acrobat is supposed to load all previous versions of PDFs. Are you loading simple PDFs, or stacked PDFs with multiple PDF files embedded?

But you need to use ARSPDUMP first to locate your index strings, then place those values in your INDEXER PARAMETERS. Or build a manual index file and reference it with the generic indexer.

Larry

LWagner · « **Reply #6 on:** May 26, 2011, 06:49:32 AM »

We initially accepted the premise from IBM that OnDemand could load any file, including PDFs, up to 4 GB in OnDemand 8.4.0.3.

Then it turned out that this version does not handle common resources in PDF separately, so it produces an output PDF with all common resources added back to each elemental PDF in the container PDF. This increased the size of our PDFs by a factor of 10 to 16, from about 32 MB on input to anywhere from 300 to over 500 MB on output. This was with a 1900-page internal limit to get consistent successful loads on our test system.

Once we went to production, we had a 75% "unsuccessful load" rate. The files seemed to load, but we were not quite sure. Then we came upon documentation that stated PDFs should be no larger than 50 MB, but that with a special object parameter setting, they could be processed up to 256 MB.

We cut the max page limit to 950 pages, and the load success rate jumped to 75% from 25%.

We then further cut the max page limit to 400, and we now get 100% of files loading on our production system without errors of any kind. This leads to file sizes that max out at about 9 MB. That's nine megabytes, in case you think I might have a typo.

Our daily bill runs of 40,000 to 70,000 accounts will now produce about 300 to 700 files daily. The processing is also faster with many smaller files, going from nearly 5 hours to about 90 minutes of total end-to-end processing for the same amount of data.
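
The page cap itself is simple bookkeeping. The sketch below shows how a page count breaks into ranges of at most 400 pages; `page_ranges` is a hypothetical helper. It does not describe how the files in this thread were actually split, and a real split would also have to respect bill boundaries so that no account is cut across two files.

```python
def page_ranges(total_pages, max_pages=400):
    """Yield (first_page, last_page) pairs, 1-based and inclusive."""
    for start in range(1, total_pages + 1, max_pages):
        yield start, min(start + max_pages - 1, total_pages)

if __name__ == "__main__":
    # A 1900-page file, the earlier limit mentioned above, becomes five pieces.
    for first, last in page_ranges(1900):
        print(first, last)
```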

geoffwilde · « **Reply #7 on:** May 27, 2011, 11:14:15 AM »

Another consideration: how many files are processing at once. We used to run 16 files at a time, but would experience sporadic load failures. We cut that to 4 at a time and have had no failures in almost a full week. The jury is still out, but it looks good so far.
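
If the loads are driven from a script, a cap like this is easy to enforce with a bounded worker pool. This is a sketch only: `load_all` and the `load_one` callable are hypothetical stand-ins for whatever actually submits a load, and nothing here is an OnDemand API.

```python
from concurrent.futures import ThreadPoolExecutor

def load_all(files, load_one, max_parallel=4):
    """Run load_one(file) for every file, never more than max_parallel at once."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # pool.map keeps results in the same order as the input files.
        return list(pool.map(load_one, files))

if __name__ == "__main__":
    # Stand-in for a real load step; it just reports the file name.
    print(load_all(["bills01.pdf", "bills02.pdf"], lambda name: f"loaded {name}"))
```

Lowering `max_parallel` trades throughput for less contention, which is the trade-off described above.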

LWagner · « **Reply #8 on:** May 27, 2011, 11:38:46 AM »

Thanks, Geoff.

With smaller files loading faster, we could decrease the server count to reduce any contention-based errors.
