OnDemand User Group

Support Forums => MP Server => Topic started by: Peter Mate on February 04, 2021, 09:47:38 AM

Title: Slow arsdoc get for PDF reports
Post by: Peter Mate on February 04, 2021, 09:47:38 AM
Hi,

We are running CMOD 10.1 on Windows 2016 with DB2 and local cache-only storage, and I'm seeing extremely slow performance with arsdoc get.
When I use arsdoc get to extract large PDF reports (each containing many PDF documents), the extraction takes a very long time (I'm using the -c -a -g -N parameters and -X with the loadID provided).
For example, a PDF report containing ~4200 documents took almost 9 minutes to extract (and this is the best case).
We would like to upgrade to CMOD 10.5 and set up Azure BLOB storage. AFAIK there is no way to do that other than exporting everything and reimporting it, so we have to use this method in the migration process.
But at this speed it would take years to complete (there are millions of reports). Does anyone know another way to achieve the same result faster?
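
To put "years" into numbers, here is a rough back-of-the-envelope projection. It is only a sketch: the 9 minutes per report comes from the post, but the report count of 1,000,000 is an assumption (the post only says "millions"), every report is assumed to take as long as the best case, and the parallel_jobs parameter assumes linear scaling, which the CPU load described in this thread may not allow.

```python
# Rough projection of total export time at the observed arsdoc get speed.
MINUTES_PER_REPORT = 9      # from the post: best case for a ~4200-document report
REPORT_COUNT = 1_000_000    # assumption: the post only says "millions of reports"


def projected_days(report_count, minutes_per_report, parallel_jobs=1):
    """Return the projected wall-clock days, assuming perfectly linear scaling."""
    total_minutes = report_count * minutes_per_report / parallel_jobs
    return total_minutes / (60 * 24)


if __name__ == "__main__":
    for jobs in (1, 8):
        days = projected_days(REPORT_COUNT, MINUTES_PER_REPORT, jobs)
        print(f"{jobs} job(s): {days:.0f} days (~{days / 365:.1f} years)")
```

Even with several extractions running side by side, the total stays in the range of years under these assumptions, so the per-report time is what has to come down.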

Some experience with the above:
During "arsdoc get" there is high CPU usage on the CMOD server, and the SYSTEM process consumes 36-40% of the CPU (I don't know exactly what it is doing, but it only hits the CPU while "arsdoc get" is running).
Also, using Process Explorer I can see that it calls arspdump.exe for each and every document (I don't know the reason for that either, because the indexes of the PDF documents are not needed to export them).
And (as an extra) I loaded the exported PDF with the generic indexer, and then extracted it again with arsdoc get (the same way I described above), and it was blazing fast: 7 seconds instead of 9 minutes. So I only experience this behavior with PDFs loaded using the PDF indexer (and this is the case for all existing PDFs in the system).
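
The two timings above can be turned into a per-document cost. This is a small sketch using only the numbers from this post (~4200 documents, about 9 minutes versus 7 seconds); the function name is made up for illustration.

```python
# Per-document cost of the two extractions described in the post.
DOCS = 4200            # approximate number of documents in the report
SLOW_SECONDS = 9 * 60  # report loaded with the PDF indexer
FAST_SECONDS = 7       # same content reloaded with the generic indexer


def per_doc_ms(total_seconds, docs):
    """Average milliseconds spent per extracted document."""
    return total_seconds * 1000 / docs


if __name__ == "__main__":
    print(f"PDF indexer load:     {per_doc_ms(SLOW_SECONDS, DOCS):.1f} ms/doc")
    print(f"Generic indexer load: {per_doc_ms(FAST_SECONDS, DOCS):.2f} ms/doc")
    print(f"Slowdown factor:      {SLOW_SECONDS / FAST_SECONDS:.0f}x")
```

The slow case works out to well over 100 ms per document. That would be consistent with a fixed per-document overhead such as launching a separate process each time, but that is an inference from the arspdump.exe observation, not something confirmed here.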

Any suggestions are welcome...

Thanks in advance
Title: Re: Slow arsdoc get for PDF reports
Post by: Justin Derrick on February 05, 2021, 04:51:41 PM
You'll want to apply the latest Content Manager OnDemand fixpack (https://cmod.wiki/index.php?title=Main_Page#IBM_CMOD_Fixpacks_.26_Security_Bulletins).  There was a fix for PDF retrievals in there.

Also, there are better ways to migrate CMOD data to cloud (https://cmod.cloud/migration/).

-JD.
Title: Re: Slow arsdoc get for PDF reports
Post by: Peter Mate on February 07, 2021, 05:40:15 AM
Thank you Justin, I'm going to try extracting with the latest CMOD 10.5.0.1 as the client.
Could you please give some hints/directions about the better way to migrate? (The current storage set is cache only, with a cache-only node.)