Hi Folks,
I'll try to explain this as best I can.
I am supporting a system running CMOD 8.5, DB2 V9.7, TSM 5.something, and AIX 6.1. We are far out of support, but we are also moving off CMOD to P8 for whatever reason.
We are running a process (provided and supported by a third party) that extracts AFP documents from CMOD, and from what I can tell it does so one by one. The system is not that large in the grand scheme of things. The process then runs the documents through some of the vendor's utilities and sends them up to FileNet. It runs 24/7 except for when we do
As far as the third-party process goes, it runs on a Windows box with ODWEK 9.0.0.0. My understanding of how it works:
1) An application group is defined as the source to pull documents from.
2) The application queries for the field mappings of the folders associated with that application group and caches the results.
3) The application then makes a request to the ODWEK service for the document.
4) The ODWEK service looks at a connection pool and determines whether it needs to make a new login attempt or can reuse an already open one. Concurrent logins should be around the thread count, but there is also a timeout on the CMOD side, so we have to log out and log back in roughly every 15 minutes. So it is more like: log in, run as many queries as possible in 15 minutes, and log out.
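To make step 4 concrete, here is a minimal sketch of how I picture the pooling behaviour. This is only my mental model, not the vendor's code and not the ODWEK API: the class name `SessionPool`, the fake `_login` method, and the 900-second maximum age are all assumptions for illustration.

```python
import time
from collections import deque


class SessionPool:
    """Hypothetical pool: reuse a login until it reaches the server-side timeout."""

    def __init__(self, max_sessions, max_age_seconds, clock=time.monotonic):
        self.max_sessions = max_sessions   # roughly the worker thread count
        self.max_age = max_age_seconds     # assumed ~15 minutes (900 s)
        self.clock = clock
        self.idle = deque()                # (session_id, login_time) pairs
        self.in_use = 0
        self.logins = 0

    def _login(self):
        # Stand-in for a real login call; it only hands out a label.
        self.logins += 1
        return f"session-{self.logins}"

    def acquire(self):
        now = self.clock()
        # Prefer an idle session that is still young enough to reuse.
        while self.idle:
            session, born = self.idle.popleft()
            if now - born < self.max_age:
                self.in_use += 1
                return session, born
            # Otherwise the session is too old: drop it (a real pool would log out).
        if self.in_use >= self.max_sessions:
            raise RuntimeError("pool exhausted")
        self.in_use += 1
        return self._login(), now

    def release(self, session, born):
        self.in_use -= 1
        # Only keep the session around if it has not aged out yet.
        if self.clock() - born < self.max_age:
            self.idle.append((session, born))
```

If this model is right, the number of concurrent CMOD logins is capped by `max_sessions`, so lowering the thread count on the Windows side would directly lower the number of simultaneous sessions hitting the server.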
When we run this with multiple threads (75-100), we see lots of 88 messages, with files just failing to load. The files are also very small.
arsload: 07/28/21 01:41:48 Indexing completed
arsload: 07/28/21 01:41:48 -- Loading started, 83542 bytes to process
Resource FILEFAIL.20210728.010203SPIC.ARD.res matches the resource >10850-25-0<
An error occurred. Contact your system administrator and/or consult the System Log. File=arsadmp.c, Line=1252
Unable to store the object >10850<. Object size 1011467
We are also seeing a corresponding TSM error, with nothing in the TSM logs:
07/28/2021 00:41:45 ADMIN 5099595 Error No 20 SM Error: ANS0266I (RC2302) The dsmEndTxn vote is ABORT, so check the reason field., Return Code=2302, Reason=41, File=arssmsms.cpp, Line=2074 Srvr->production.server non-SSL<-
We are also seeing this, but I can't correlate actual 88 records to this timeframe:
07/27/2021 05:18:41 ODWEKSVCID 2774740 Error No 13 DB Error: [IBM][CLI Driver][DB2/AIX64] SQL0912N The maximum number of lock requests has been reached for the database. SQLSTATE=57011 -- SQLSTATE=57011, SQLCODE=-912, File=arsapp.c, Line=1587
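For the correlation, a rough sketch of what I have in mind is to bucket System Log entries by hour and message number, then compare the hours where 88s, SM errors, and DB errors show up. The line layout in the regex is assumed purely from the two System Log lines quoted in this post (date, time, userid, a numeric ID, severity, `No <n>`, text); the function names are my own, and an exported log may well look different.

```python
import re
from collections import Counter
from datetime import datetime

# Layout assumed from the two System Log lines quoted in this post.
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2})\s+"
    r"(?P<user>\S+)\s+(?P<pid>\d+)\s+"
    r"(?P<severity>\S+)\s+No\s+(?P<msg>\d+)\s+(?P<text>.*)$"
)


def parse_line(line):
    """Return a dict for a recognised log line, or None if it does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    rec["ts"] = datetime.strptime(rec["ts"], "%m/%d/%Y %H:%M:%S")
    rec["msg"] = int(rec["msg"])
    return rec


def count_by_hour(lines):
    """Count entries per (hour, message number); unrecognised lines are skipped."""
    counts = Counter()
    for line in lines:
        rec = parse_line(line)
        if rec is not None:
            counts[(rec["ts"].strftime("%Y-%m-%d %H"), rec["msg"])] += 1
    return counts
```

Feeding the exported log lines to `count_by_hour` and sorting the keys would show whether the message 13 and message 20 entries cluster in the same hours as the 88s.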
Digging deeper into db2diag, I am not sure whether this SELECT is coming from the migration process or is something internal to CMOD.
2021-07-27-05.18.41.274849-240 E1352789920A1620 LEVEL: Warning
PID : 6029378 TID : 3391 PROC : db2sysc 0
INSTANCE: archive NODE : 000 DB : ARCHIVE
APPHDL : 0-54 APPID: *LOCAL.archive.210725085534
AUTHID : ROOT
EDUID : 3391 EDUNAME: db2agent (ARCHIVE) 0
FUNCTION: DB2 UDB, data management, sqldEscalateLocks, probe:1
MESSAGE : ADM5501I DB2 is performing lock escalation. The affected application
is named "arssockd", and is associated with the workload name
"SYSDEFAULTUSERWORKLOAD" and application ID
"*LOCAL.archive.210725085534" at member "0". The total number of
locks currently held is "16", and the target number of locks to hold
is "8". The current statement being executed is "SELECT
root.ARSAPP.name, root.ARSAPP.description, root.ARSAPP.agid,
root.ARSAPP.aid, root.ARSAPP.doc_type, root.ARSAPP.doc_comp_type,
root.ARSAPP.res_comp_type, root.ARSAPP.idx_type,
These file failures and TSM errors only started showing up when this "migration" process started. The first issue we had was back in May, when the TSM database filled up and we had to increase it. Now, whenever the migration is running and pulling docs from CMOD while we are doing concurrent loading (which isn't a very large amount), we see the issues above.
I have a feeling we are overloading this tired, outdated system; it isn't tuned, tweaked, or patched. But we are reluctant to stop the migration process because of a very tight deadline.
Just wondering if anyone has suggestions or 'band-aids' for what we might be able to do.
I'm not sure whether we can tweak this or play with it. It seems low for a prod system, doesn't it?
ARS_NUM_DBSRVR=20
Thanks in advance.