Author Topic: Compression with Generic indexer?  (Read 3795 times)

pankaj.puranik

  • Sr. Member
  • ****
  • Posts: 374
    • View Profile
Compression with Generic indexer?
« on: December 10, 2012, 03:57:44 AM »
Is there an option to achieve the same compression with the Generic Indexer as you get with the ACIF/PDF indexer?
To clarify my question: say I have a PDF and an AFP file. I process the PDF with the PDF indexer (RESTYPE=FONT,IMAGE).
I process the AFP file with the ACIF indexer.
Now I process both files using the Generic Indexer.

1. Is it possible to achieve the same compression for AFP? (Does Generic Indexer split the file into .out and .res?)
2. Is it possible to achieve the same compression for PDF as with PDF indexer?

Thanks
Pankaj.

Alessandro Perucchi

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 1002
    • View Profile
Re: Compression with Generic indexer?
« Reply #1 on: December 10, 2012, 08:04:23 AM »
Hello Pankaj,

No.

I've used a trick to simulate a generic index with the ACIF and PDF indexers. It may or may not work, depending on your files...

The trick is:
1) Create a PDF or ACIF indexer definition, according to your file type
2) Set up a trigger (one must exist)
3) Define all the options to remove all resources
4) For the loader, create a text file with FIELDX=... and INDEXX=... entries that use constant values
5) Load the document with arsload, using the -j option with the file created in 4).

CMOD will then load the document with the ACIF/PDF indexer and find the trigger (at least if your trigger is set up correctly, of course!!! :-D).
It then reads your index file, separates the resources from the data, and finally, if everything is OK, archives the document.

Sincerely yours,
Alessandro
Alessandro Perucchi

#Install #Migrations #Conversion #Educate #Repair #Upgrade #Migrate #Enhance #Optimize #AIX #Linux #Multiplatforms #DB2 #Windows #Oracle #TSM #Tivoli #Performance #Audits #Customizing #Availability #HA #DR #JavaApi #ContentNavigator #ICN #WEBi #ODWEK #Services #PDF #AFP #XML

pankaj.puranik

Re: Compression with Generic indexer?
« Reply #2 on: January 24, 2013, 12:01:55 PM »
Hi Alessandro

This worked well for me.
Could you or anyone on this forum check the attached System Load data and confirm that resources are indeed being reused and that I am really saving space?

Thanks
Pankaj.

Justin Derrick

  • IBM Content Manager OnDemand Consultant
  • Administrator
  • Hero Member
  • *****
  • Posts: 2230
  • CMOD Guru for hire...
    • View Profile
    • Tenacious Consulting
Re: Compression with Generic indexer?
« Reply #3 on: January 24, 2013, 05:48:31 PM »
Heh.  I knew it would happen one day, I just didn't expect it to be today...

I've read the post twice, and don't understand the question OR the solution.  This sounds like it could be important to someone else in the future -- can I get a few questions answered?

Where did the Generic Index come from?
Why won't CMOD split out resources, etc?
What style of AFP is being stored?  (Sounds like fully composed.)
Do the resources you're trying to store already exist in CMOD?

Any other background or explanation information would be great.

Explain it like I'm five years old!  :D

-JD.
IBM CMOD Professional Services: http://TenaciousConsulting.com
Call:  +1-866-533-7742  or  eMail:  jd@justinderrick.com
IBM CMOD Wiki:  https://CMOD.wiki/
FREE IBM CMOD Education & Webinars:  https://CMOD.Training/

Interests: #AIX #Linux #Multiplatforms #DB2 #TSM #SP #Performance #Security #Audits #Customizing #Availability #HA #DR

Alessandro Perucchi

Re: Compression with Generic indexer?
« Reply #4 on: February 05, 2013, 08:10:31 AM »
Hello Justin,

Ahahahhahahahahhahah :-) ok my little boy, daddy will explain the world to you :-D

sorry, I couldn't resist!!! :-D

With PDF you cannot always get all the indexes you might want. You might, for example, write them in white text on a white background and then pick them up via the PDF Indexer.
But sometimes such tricks don't work, or the customer doesn't want to use them, or there are other reasons.
And the PDF index is too limited: you have 7-8 fields like Author, Document Name, Last Change, Date of Creation, ... and nothing more, and if you add extra fields, the PDF Indexer will simply ignore them.

And sometimes using the Generic index is the way to go, because it gives you the flexibility to have all the indexes in the world.

Unfortunately, with the Generic index you lose the benefit of the AFP/PDF Indexer splitting out the resources (or at least I don't know how to do it :-D).

So my answer was to use the best of both worlds: the power of the Generic index and, at the same time, the PDF/AFP Indexer's resource-splitting capability.

With the -j option of arsload, you can pass additional information to the AFP/PDF indexer on the fly, and that is what I was trying to explain in my answer.

So for this to work, you need the following:

1) Have a trigger so the document is recognized
2) That's it: no index, no field, nothing except the setting to remove all the resources in the ACIF/PDF Indexer
3) On the loader side, create a simple text file containing something like:
Code:
FIELD01='01/01/2012'
FIELD02='SECRET DOC'
...
INDEX01='myindex1',FIELD01
INDEX02='myindex2',FIELD02

4) Simply use arsload with the normal options for the ACIF/PDF Indexer, plus the additional -j option pointing to the file created in 3).

And then, magically, your document gets the power of the Generic index together with the power of the PDF/ACIF Indexer.
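
For readers who want to script steps 3) and 4), here is a minimal Python sketch. It only renders the FIELDnn/INDEXnn lines in the same shape as the sample above and assembles an arsload argument list. The function names are hypothetical. The "normal options" are passed in by the caller and not guessed here. Placing the input file as the last argument is an assumption to check against your arsload documentation.

```python
def render_index_parms(constants):
    """Render FIELDnn/INDEXnn lines from a mapping of index name -> constant value.

    Mirrors the layout of the sample parameter file in the post:
    all FIELD lines first, then the INDEX lines that reference them.
    """
    items = list(constants.items())
    lines = []
    for n, (_, value) in enumerate(items, start=1):
        if "'" in str(value):
            # Quoting rules for embedded quotes are not covered here (assumption).
            raise ValueError(f"value for field {n} contains a single quote")
        lines.append(f"FIELD{n:02d}='{value}'")
    for n, (name, _) in enumerate(items, start=1):
        lines.append(f"INDEX{n:02d}='{name}',FIELD{n:02d}")
    return "\n".join(lines) + "\n"


def build_arsload_command(base_options, parm_file, input_file):
    """Return an argv list: the caller's usual arsload options plus -j <parm_file>.

    base_options is whatever you already pass for the ACIF/PDF indexer load;
    the position of input_file at the end is an assumption.
    """
    return ["arsload", *base_options, "-j", str(parm_file), str(input_file)]


if __name__ == "__main__":
    parms = render_index_parms({"myindex1": "01/01/2012", "myindex2": "SECRET DOC"})
    print(parms, end="")
```

Writing the rendered text to a file and passing that file's path as parm_file gives the file described in step 3). The command list can then be handed to subprocess.run once the base options are filled in for your environment.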

This works well with one document per index... So if you have, let's say, 100000 pages to index, and each page needs to be indexed that way, it doesn't work, it cannot work, unless you use a user exit for that.
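
To make the one-document-per-index limitation concrete, the following Python sketch plans the loads for a batch of already-separated documents. It is illustrative only: plan_loads and the file names are hypothetical, and it assumes every document is its own input file with its own constant values.

```python
def plan_loads(documents):
    """Yield (document_path, parameter_text) pairs, one pair per document.

    documents is an iterable of (path, {index_name: constant_value}) tuples.
    Because the values are constants, every document needs its own parameter
    file and therefore its own load.
    """
    for path, constants in documents:
        lines = [f"FIELD{n:02d}='{v}'" for n, v in enumerate(constants.values(), 1)]
        lines += [f"INDEX{n:02d}='{k}',FIELD{n:02d}" for n, k in enumerate(constants, 1)]
        yield path, "\n".join(lines) + "\n"


if __name__ == "__main__":
    batch = [
        ("doc_a.pdf", {"myindex1": "01/01/2012"}),
        ("doc_b.pdf", {"myindex1": "02/01/2012"}),
    ]
    for path, parms in plan_loads(batch):
        print(path)
        print(parms, end="")
```

The number of pairs equals the number of documents, which is why this approach does not scale to a single large file where every page needs different index values.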

As for your questions: the resources being stored may or may not already be in CMOD. CMOD checks for itself whether it needs to store them or already has them... the normal behavior, in fact!

I hope I've answered your concern and questions!!!

Sincerely yours,
Alessandro

Alessandro Perucchi

Re: Compression with Generic indexer?
« Reply #5 on: February 05, 2013, 08:21:57 AM »
Hi Pankaj

Quote
This worked well for me.

Good to hear :-)

Quote
Could you or anyone on this forum check the attached System Load data and confirm that resources are indeed being reused and that I am really saving space.

Well, after looking at your Excel file, it seems that the resources are being checked: sometimes a resource is New and sometimes it is Existing, meaning that CMOD recognizes the resource and doesn't store it again.
Everything looks great :-)

Sincerely yours,
Alessandro