Author Topic: How Can I Optimize ARSDOC GET

JeanineJ

  • Jr. Member
  • **
  • Posts: 18
How Can I Optimize ARSDOC GET
« on: January 23, 2024, 09:11:46 AM »
I've got to extract a little over 170K PDF documents (possibly more) from my CMOD instance to give to the external entity that bought one of our subsidiaries. I had intended to use ARSDOC GET with the load ID (-X) and the -c -g -N parameters to get the documents, but my tests last week on a large load (4182 PDF documents) took almost 4 hours to run. I don't know yet whether I'll need -c, because I haven't heard from the external entity on how they need the PDF documents (they don't use CM or CMOD).

TSM seems to be the sticking point: all of the documents have migrated out of cache and are now only in TSM, and IBM indicates that TSM is the problem. I'm open to suggestions for speeding this up without affecting my internal and external users. I'm on CMOD 10.5.0.5 with DB2 and TSM on a RHEL7 server. TIA
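For a sense of scale, the test run can be scaled up linearly. This is only a back-of-the-envelope sketch: it assumes the per-document rate from the 4182-document test holds for the whole job, and it treats "almost 4 hours" as a full 4 hours (14400 seconds). The function name `estimate_hours` is made up for illustration.

```shell
#!/usr/bin/bash
# estimate_hours TOTAL_DOCS SAMPLE_DOCS SAMPLE_SECONDS
# Scales a measured sample run linearly to the full job and prints whole
# hours (integer arithmetic, so the result is truncated).
estimate_hours() {
    local total=$1 sample_docs=$2 sample_secs=$3
    echo $(( total * sample_secs / sample_docs / 3600 ))
}

# 170K documents at the rate of the test run (4182 documents in ~4 hours)
estimate_hours 170000 4182 14400   # prints 162
```

That is roughly a week of wall-clock time for a single serial run, which is why splitting the work or tuning TSM retrieval matters here.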

jsquizz

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 576
Re: How Can I Optimize ARSDOC GET
« Reply #1 on: January 23, 2024, 11:21:01 AM »
Hm, strange.

I did a full system extraction using arsdoc get with -X, no issues. Perhaps your DB isn't tuned right, or there's a setting off on the RHEL server (based on your reply to my post).

Maybe give it "more juice"

But as far as arsdoc goes, just as a force of habit I usually do:

arsdoc get -u <user> -p <stash> -agcNv -G <ag> -X loadID -o <AgName>

or

Query the data tables and grab the DOC_NAME values.

Let's say there are 100 tables; maybe split the doc_names into three lists:

List1:
FAA1
FAA2
FAA3
FAA4

List2:
FAA5.. etc
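Splitting the DOC_NAME values into lists like the ones above could be sketched roughly as follows. This assumes one file with one DOC_NAME per line and no blank lines; the function name `split_into_lists`, the input file name, and the `list<N>.list` naming are all made up for illustration, nothing CMOD requires.

```shell
#!/usr/bin/bash
# split_into_lists INPUT COUNT
# Splits INPUT (one DOC_NAME per line, no blank lines) into COUNT contiguous
# lists written to the current directory as list1.list, list2.list, ...
split_into_lists() {
    local input=$1 count=$2 total per
    total=$(wc -l < "$input")
    [ "$total" -gt 0 ] || return 0           # nothing to split
    per=$(( (total + count - 1) / count ))   # lines per list, rounded up
    # Line NR goes to list number int((NR-1)/per) + 1.
    awk -v per="$per" '{
        out = "list" (int((NR - 1) / per) + 1) ".list"
        print > out
    }' "$input"
}
```

For example, `split_into_lists all_docnames.list 3` on a file of seven names would put the first three in list1.list, the next three in list2.list, and the last one in list3.list.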

I think when I did the extract in 2021 I did something like:

Code: [Select]
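#!/usr/bin/bash
# script1.bash
# <DefineVariables or config file..>: set USER, PASS, HOST and AGNAME here
# Read one DOC_NAME per line from list1.list and retrieve the matching documents.
while read -r DOC_NAME; do
   arsdoc get -u "${USER}" -p "${PASS}" -h "${HOST}" -agcNv -i "where doc_name like '%${DOC_NAME}%'" -o "${AGNAME}"
done < list1.list

# Then, from the shell (this line is not part of script1.bash), run it in the background:
nohup ./script1.bash > round1.out 2>&1 &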
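If you end up with several lists, one way to run them side by side is a small driver that starts one background worker per list and waits for all of them. This is only a sketch: `run_lists` and `fetch_one` are made-up names, `fetch_one` just wraps the same arsdoc get call as the script above, and USER, PASS, HOST and AGNAME are assumed to be set beforehand. The number of lists you pass is the number of concurrent retrievals hitting TSM, so keep it small if other users are on the system.

```shell
#!/usr/bin/bash
# fetch_one DOC_NAME
# Hypothetical wrapper around the arsdoc get call from script1.bash.
fetch_one() {
    arsdoc get -u "${USER}" -p "${PASS}" -h "${HOST}" -agcNv \
        -i "where doc_name like '%$1%'" -o "${AGNAME}"
}

# run_lists RUNNER LIST...
# Starts one background worker per list file. Each worker calls RUNNER once
# per line of its list and logs to <list name without .list>.out.
run_lists() {
    local runner=$1 list
    shift
    for list in "$@"; do
        (
            while IFS= read -r doc_name; do
                "$runner" "$doc_name"
            done < "$list"
        ) > "${list%.list}.out" 2>&1 &
    done
    wait   # block until every worker has finished
}
```

Usage would be something like `run_lists fetch_one list1.list list2.list list3.list`. Passing `echo` as the runner gives a dry run that only writes the DOC_NAME values to the .out files.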


There are a few ways to do this. I know you can use arsadmin as well, but I have never taken that leap.
#CMOD #DB2 #AFP2PDF #TSM #AIX #RHEL #AWS #AZURE #GCP #EVERYTHING

JeanineJ

  • Jr. Member
  • **
  • Posts: 18
Re: How Can I Optimize ARSDOC GET
« Reply #2 on: January 23, 2024, 11:37:17 AM »
I had a ticket open with IBM and they've pretty much said it's TSM causing the lag. I'd like to bypass looking in cache first, only because I know for a fact that all the documents are in TSM only. I'm waiting on my TSM guy to be available so I can get him to open another ticket for TSM to see if it can be tuned to work faster. The external entity may need the PDFs individually because they definitely don't use CMOD, so I don't see the value of generating a generic index file. But what do I know.

jsquizz

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 576
Re: How Can I Optimize ARSDOC GET
« Reply #3 on: January 24, 2024, 06:18:03 AM »
That is a very valid point. Unfortunately, I've seen that also. Perhaps check out ARSADMIN. I'm not very familiar with it, but you can essentially retrieve documents as a "bunch", for lack of a better term.

Justin Derrick

  • IBM Content Manager OnDemand Consultant
  • Administrator
  • Hero Member
  • *****
  • Posts: 2230
  • CMOD Guru for hire...
    • Tenacious Consulting
Re: How Can I Optimize ARSDOC GET
« Reply #4 on: January 24, 2024, 12:07:27 PM »
No matter what you do, if your documents are stored inside TSM, and on tape, it's going to take a while to get them back.

You could re-cache them (https://cmod.cloud/professionalservices/ibm-cmod-cache-optimization/) but the effort may not be worth it for only 170k documents.

-JD.

IBM CMOD Professional Services: http://TenaciousConsulting.com
Call:  +1-866-533-7742  or  eMail:  jd@justinderrick.com
IBM CMOD Wiki:  https://CMOD.wiki/
FREE IBM CMOD Education & Webinars:  https://CMOD.Training/

Interests: #AIX #Linux #Multiplatforms #DB2 #TSM #SP #Performance #Security #Audits #Customizing #Availability #HA #DR