Author Topic: Slow arsdoc get for PDF reports  (Read 1241 times)

Peter Mate

  • Newbie
  • Posts: 4
Slow arsdoc get for PDF reports
« on: February 04, 2021, 09:47:38 AM »
Hi,

We are running CMOD 10.1 on Windows Server 2016 with DB2 and local cache-only storage, and I'm seeing arsdoc get run extremely slowly.
When I use arsdoc get to extract large PDF reports (each of which contains many PDF documents), the extraction takes a very long time. I'm using the -c -a -g -N parameters, plus -X with the load ID provided.
For example, a PDF report containing ~4200 documents took almost 9 minutes to extract, and that is the best case.
We would like to upgrade to CMOD 10.5 and set up Azure Blob Storage. AFAIK the only way to do that is to export everything and reimport it, so we have to use this method in the migration process.
But at this speed it would take years to complete (there are millions of reports). Does anyone know another way to achieve the same result faster?

Some observations on the above:
During "arsdoc get" there is high CPU usage on the CMOD server, and the SYSTEM process consumes 36-40% of the CPU (I don't know exactly what it is doing, but it only hits the CPU when I start "arsdoc get").
Also, using Process Explorer I can see that it calls arspdump.exe for each and every document (I don't know the reason for that either, since the indexes of the PDF documents aren't needed to export them).
And (as an extra) I loaded the exported PDF with the generic indexer and then extracted it again with arsdoc get (the same way as described above), and it was blazing fast: 7 seconds instead of 9 minutes. So I only see this behavior with PDFs loaded using the PDF indexer (which is the case for all existing PDFs in the system).
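
To put the two timings side by side, here is a rough back-of-the-envelope sketch in Python. Only the 4200 documents, the roughly 9 minutes, and the 7 seconds come from the tests above. The report count of one million, and the assumption that every report is about the size of this one and is extracted one at a time, are placeholders rather than measured values.

```python
SECONDS_PER_DAY = 86_400

def docs_per_second(doc_count, seconds):
    """Extraction throughput for a single report."""
    return doc_count / seconds

def projected_days(report_count, seconds_per_report):
    """Days needed to extract report_count reports sequentially."""
    return report_count * seconds_per_report / SECONDS_PER_DAY

# Measured: one report of ~4200 documents.
slow_seconds = 9 * 60   # loaded with the PDF indexer (almost 9 minutes)
fast_seconds = 7        # same content reloaded with the generic indexer

slow_rate = docs_per_second(4200, slow_seconds)
fast_rate = docs_per_second(4200, fast_seconds)

# Hypothetical: the post only says "millions of reports".
ASSUMED_REPORTS = 1_000_000

print(f"PDF indexer load:     {slow_rate:.1f} docs/s, "
      f"{projected_days(ASSUMED_REPORTS, slow_seconds):.0f} days")
print(f"Generic indexer load: {fast_rate:.1f} docs/s, "
      f"{projected_days(ASSUMED_REPORTS, fast_seconds):.0f} days")
```

Under those assumptions the PDF indexer case comes out at many years of sequential extraction, while the generic indexer case comes out at a few months, which is why the difference matters so much for the migration.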

Any suggestions are welcome...

Thanks in advance
« Last Edit: February 04, 2021, 11:37:16 AM by Peter Mate »

Justin Derrick

  • IBM Content Manager OnDemand Consultant
  • Administrator
  • Hero Member
  • Posts: 2228
Re: Slow arsdoc get for PDF reports
« Reply #1 on: February 05, 2021, 04:51:41 PM »
You'll want to apply the latest Content Manager OnDemand fixpack.  There was a fix for PDF retrievals in there.

Also, there are better ways to migrate CMOD data to the cloud.

-JD.
« Last Edit: February 06, 2021, 04:41:14 AM by Justin Derrick »
IBM CMOD Professional Services: http://TenaciousConsulting.com
Call:  +1-866-533-7742  or  eMail:  jd@justinderrick.com
IBM CMOD Wiki:  https://CMOD.wiki/
FREE IBM CMOD Education & Webinars:  https://CMOD.Training/

Peter Mate

Re: Slow arsdoc get for PDF reports
« Reply #2 on: February 07, 2021, 05:40:15 AM »
Thank you, Justin. I'm going to try extracting with the latest CMOD 10.5.0.1 as the client.
Could you please give some hints or directions about the better way? (The current storage set is cache-only, with a cache-only node.)