Topic: PDF Indexing on Z/Os

Senthil

  • Guest
PDF Indexing on Z/Os
« on: March 10, 2011, 05:01:21 AM »
I am new to PDF indexing and have a few questions about it.

1. Installing Adobe Type 1 fonts on z/OS: can anyone share more information on this if you have already done it? I can't find much in the guides. Adobe now uses OpenType fonts, but the IBM guides still refer to Type 1 fonts.

2. Does anyone know the DCB parameters (record length, FB/VB, PS/PDS) for the ADOBERES, ADOBEFNT and TEMPATTR DDs, and the format of the font mapping table?

3. Input PDF data sets: I assume the input should be in binary format. Should the data set use a fixed or a variable record format?

I have installed Acrobat and generated the indexing parameters with the graphical indexer, but the indexing fails afterwards.

I am getting the error below:

 LSCX870 **** WARNING **** ERRNO = EARG                                         
         Generated in    FOPEN called from line    128 of @@022871(ASC0806), off
         Extended name: __fopen_a                                               
         Invalid "dsn " style file name: PS-Resources-1.0?                     
 Interrupted while: Opening file "dsn:PS-Resources-1.0?"                       

LWagner

  • Guest
Re: PDF Indexing on Z/Os
« Reply #1 on: March 10, 2011, 09:11:25 AM »
Congratulations, you are about to be saved considerable grief!

We just tried to do this for our customer billing files (40,000 to 70,000 accounts daily).

While we were trying to get this resolved, IBM admitted that their gentle admonition that PDF indexing on z/OS "does not perform very well" should be SERIOUSLY downgraded to "performs very poorly". Mind you, this was for version 8.4.0 of OnDemand.

For version 8.4.1, which requires DB2 9.1, indexing is better, but it may still run far faster on Windows, where processing time is measured in minutes per file instead of hours. Plus, the PDF rasterizer is a serious CPU hog and does not swap.

The later versions of OnDemand are able to save the common resources of container PDFs separately, and pull them in alongside the particular elemental PDF of the document being loaded.

In earlier versions, PDF indexing produces an EXPANDED output PDF in which the common resources are written out with EACH elemental PDF document. We saw input PDF files expand by factors of 8 to 16.

For your other questions:
Data set name                                   Dsorg Recfm  Lrecl  Blksz
SYS2.CMOD.ADOBE.FONTS                           PO    VB       255  27998
SYS2.CMOD.ADOBEPDF.TEMPATTR                     PS    VB       256  27998

I have two members in USERPARMS:
ADOBERES, with contents: SYS2.CMOD.USERPARM(ADOBEMAP)
ADOBEMAP, which contains lines as below; the top 5 lines are a required header format.
PS-Resources-1.0                                         
 FontOutline                                             
 .                                                       
 FontOutline                                             
 EX_CFF_OCR_A_Extended=SMPE.OD840.ADOBE.FONTS(OCRAEXT)   
 Times-Roman=SMPE.OD840.ADOBE.FONTS(TIMES)               
 EX_CFF_Swis721_BT=SMPE.OD840.ADOBE.FONTS(SWISS)         
 EX_CFF_Swis721_BTBold=SMPE.OD840.ADOBE.FONTS(SWISSBO)   
 EX_CFF_Swis721_BlkCn_BT=SMPE.OD840.ADOBE.FONTS(SWISSBLC)
 EX_CFF_Swis721_Cn_BT=SMPE.OD840.ADOBE.FONTS(SWISSC)     

Format of the font library (Dsorg, Recfm, Lrecl, Blksz):
 SMPE.OD840.ADOBE.FONTS                          PO    VB       255  27998
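
To sanity-check a mapping member like ADOBEMAP before using it, the listing format shown above can be parsed into a font-name-to-member table. This is only a minimal Python sketch: the function name `parse_font_map` and the shortened sample text are illustrative assumptions, not part of OnDemand, and the check covers just the first header line rather than the full PS resource file format.

```python
def parse_font_map(text):
    """Parse an ADOBEMAP-style listing into {font_name: data_set(member)}."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # The listing in this thread starts with a PS-Resources-1.0 header line.
    if not lines or lines[0] != "PS-Resources-1.0":
        raise ValueError("missing PS-Resources-1.0 header line")
    fonts = {}
    for ln in lines[1:]:
        # Font entries have the form name=DATA.SET.NAME(MEMBER);
        # header lines such as "FontOutline" and "." carry no "=".
        if "=" in ln:
            name, target = ln.split("=", 1)
            fonts[name.strip()] = target.strip()
    return fonts


# Shortened sample in the same layout as the member quoted in the post.
SAMPLE = """\
PS-Resources-1.0
 FontOutline
 .
 FontOutline
 Times-Roman=SMPE.OD840.ADOBE.FONTS(TIMES)
 EX_CFF_Swis721_BT=SMPE.OD840.ADOBE.FONTS(SWISS)
"""

fonts = parse_font_map(SAMPLE)
for name, target in fonts.items():
    print(name, "->", target)
```

A table like this makes it easy to spot a font that the PDF uses but the map does not list.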

LWagner

  • Guest
Re: PDF Indexing on Z/Os
« Reply #2 on: March 10, 2011, 09:47:10 AM »
As for the PDF input: load it as binary, into a data set allocated as:
Dsorg Recfm  Lrecl  Blksz
-------------------------
PS    FB        80  27920
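
The arithmetic behind an FB allocation like this can be sketched in a few lines of Python. For RECFM=FB the block size has to be a whole multiple of the record length (27920 = 349 x 80 here), and a binary file whose size is not a multiple of 80 bytes ends up with a padded last record. The function name `fb_layout` and the example file size are assumptions for illustration only.

```python
import math


def fb_layout(file_bytes, lrecl=80, blksize=27920):
    """Return (records, padding_bytes, blocks) for a binary file stored as FB."""
    if blksize % lrecl != 0:
        raise ValueError("for RECFM=FB, BLKSIZE must be a multiple of LRECL")
    records = math.ceil(file_bytes / lrecl)      # fixed-length records needed
    padding = records * lrecl - file_bytes       # fill bytes in the last record
    records_per_block = blksize // lrecl         # 349 for 27920 / 80
    blocks = math.ceil(records / records_per_block)
    return records, padding, blocks


# Hypothetical PDF of 1,000,001 bytes.
records, padding, blocks = fb_layout(1_000_001)
print(records, padding, blocks)
```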

Senthil

  • Guest
Re: PDF Indexing on Z/Os
« Reply #3 on: March 11, 2011, 08:36:23 AM »
Thanks for the update. Does OnDemand v8 support PDF versions below 1.6, or does it support only PDF version 1.6?

Adobe fonts: we don't have Adobe Type 1 fonts on z/OS. Do we need to purchase a specific Type 1 font library from Adobe?

LWagner

  • Guest
Re: PDF Indexing on Z/Os
« Reply #4 on: March 11, 2011, 08:47:13 AM »
No point in it.

We loaded a test Windows Object Server yesterday; there was no difference in what would load, only in run time.

A load that took 109 minutes to abend on z/OS failed in under two minutes on Windows.

A load that took about an hour or so completed in 11 minutes. Our calculations show that at that speed our maximum load of 70,000 bills will take 2330 minutes (almost 40 hours), so loading in three hours will take about 8 load-balanced servers. The successful PDF file only had 339 bills in it. A 2000-bill PDF load abended.

We are re-testing today with fewer CPUs and memory, to see if we get the same performance.

LWagner

  • Guest
Re: PDF Indexing on Z/Os
« Reply #5 on: March 29, 2011, 07:39:48 AM »
Senthil:

Try a PDF load, regardless of version. Acrobat is supposed to load all previous versions of PDFs. Are you loading simple PDFs, or stacked PDFs with multiple PDF files embedded?

But you need to use ARSPDUMP first to locate your index strings, then place those values in your INDEXER PARAMETERS.  Or build a manual index file and reference it with the generic indexer.

Larry

LWagner

  • Guest
Re: PDF Indexing on Z/Os
« Reply #6 on: May 26, 2011, 06:49:32 AM »
We initially accepted the premise from IBM that OnDemand could load any file, including PDFs, up to 4 GB in OnDemand 8.4.0.3.

Then it turned out that this version does not handle common resources in PDFs separately, so it produces an output PDF with all common resources added back to each elemental PDF in the container PDF. This increased the size of our PDFs by a factor of 10 to 16, from about 32 MB on input to anywhere from 300 to over 500 MB on output. This was with a 1900-page internal limit to get consistently successful loads on our test system.

Once we went to production, we had a 75% "unsuccessful load" rate. The files seemed to load, but we were not quite sure. Then we came upon documentation that stated PDFs should be no larger than 50 MB, but that with a special object parameter setting, they could be processed up to 256 MB.

We cut the max page limit to 950 pages, and the load success rate jumped to 75% from 25%.

We then further cut the max page limit to 400, and we now get 100% of files loading on our production system without errors of any kind. This leads to file sizes that max out at about 9 MB. That's nine megabytes, in case you think I might have a typo.


Our daily bill runs of 40,000 to 70,000 accounts will now be producing from about 300 to 700 files daily.   The processing is also faster with many smaller files, going from nearly 5 hours to about 90 minutes total end to end processing of the same amount of data.
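
The page-limit approach described above can be sketched as a simple batching step: walk through the bills in order and start a new output file whenever adding the next bill would exceed the limit. This is a minimal Python sketch, not our production job. The function name `batch_bills`, the sample page counts, and the rule that a single bill is never split across files are all assumptions for illustration.

```python
def batch_bills(page_counts, max_pages=400):
    """Group bill indexes into files of at most max_pages pages each."""
    batches = []
    current, current_pages = [], 0
    for idx, pages in enumerate(page_counts):
        if pages > max_pages:
            # Assumption: a bill is never split, so it must fit in one file.
            raise ValueError(f"bill {idx} has {pages} pages, over the limit")
        if current and current_pages + pages > max_pages:
            batches.append(current)          # close the current file
            current, current_pages = [], 0
        current.append(idx)
        current_pages += pages
    if current:
        batches.append(current)              # flush the last partial file
    return batches


# Hypothetical page counts for seven bills.
batches = batch_bills([3, 2, 5, 400, 1, 399, 1])
print(len(batches), "files:", batches)
```

Lowering `max_pages` trades fewer pages per file for more files, which is the trade-off behind the 300 to 700 files a day mentioned above.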

geoffwilde

  • Administrator
  • Sr. Member
  • Posts: 253
  • z/os erm icn
Re: PDF Indexing on Z/Os
« Reply #7 on: May 27, 2011, 11:14:15 AM »
Another consideration: how many files are processing at once. We used to run 16 files at one time, but would experience sporadic load failures. We cut that to 4 at one time and have had no failures in almost a full week. The jury is still out, but it looks good so far.
President, OnDemand Users Group
Lead Technician for Content Manager OnDemand @
US Bank
#zSeries
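
Capping the number of simultaneous loads, as described in the reply above, can be sketched with a bounded worker pool. This is a minimal Python sketch under stated assumptions: `load_file` is a hypothetical placeholder for whatever actually submits a load at your site, and the file names are made up. Only the limit of 4 comes from the post.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT_LOADS = 4  # the post reports cutting from 16 to 4


def load_file(path):
    """Hypothetical placeholder: real code would submit the load for `path`."""
    return f"loaded {path}"


def load_all(paths, max_workers=MAX_CONCURRENT_LOADS):
    """Run load_file over paths with at most max_workers loads in flight."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input paths.
        return list(pool.map(load_file, paths))


# Hypothetical file names.
results = load_all([f"bills{n:03d}.pdf" for n in range(10)])
print(results[0], "...", results[-1])
```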

LWagner

  • Guest
Re: PDF Indexing on Z/Os
« Reply #8 on: May 27, 2011, 11:38:46 AM »
Thanks, Geoff.

With smaller files loading faster, we could decrease the server count to reduce any contention-based errors.