Author Topic: Unable to store object Failure  (Read 3176 times)

DDP021

  • Sr. Member
  • ****
  • Posts: 343
    • View Profile
Unable to store object Failure
« on: August 14, 2018, 03:05:03 AM »
Over the past month or so we've been getting intermittent failures.  They all receive the EXACT same msg number 429 (full message below).  It appears to be related to TSM, but no one from our TSM area seems to know what the issue is.  What we've noticed is that the failures ALWAYS occur on the quarter hour: at the top of the hour, or 15, 30 or 45 mins past.  Someone from Centera noticed a drive was kicking out errors.  They took that drive offline and we were good the rest of that day (Friday) and all of Saturday.  We saw one failure Sunday but then received 6 yesterday (Monday 8/13).  Failures occur on both OS390 and ACIF reports.  It's not related to the size of the data, because in some cases the failures are on 1 page reports.  I've also included below a sample of the actual 88 message.  Appreciate any info anyone may have regarding this.

2018-08-13 20:31:10.884452   ADMIN         21593   Error   No      429   TSM Error: ANS0266I (RC2302) The dsmEndTxn vote is ABORT, so check the reason field., Return Code=2302, Reason=15, File=arssmsms.cpp, Line=2695  Srvr->xsa00e70.example.com 999.999.49.139 non-SSL<-   

88msg info

2018-08-13 15:15:18.181784: ARS1144I OnDemand Load Id = >12118-14-0-342766FAA-20180813000000-20180813000000-14419<
2018-08-13 15:15:19.743461: ARS1146I Loaded 1 rows into the database
2018-08-13 15:15:25.974051: ARS1107E An error occurred.  Contact your system administrator and/or consult the System Log.  File=SYS17006.T095915.RA000.ARNLOAD.SRCCMS.H01(ARNLOAD), Line=6314
2018-08-13 15:15:25.975433: ARS1157E Unable to store the object >342766FAA1<.  Object size 151
2018-08-13 15:15:25.975447: ARS4311E Loading failed
2018-08-13 15:15:25.975455: ARS4320I Unloading started
2018-08-13 15:15:26.963835: ARS4321I Unloading of data was successful


« Last Edit: August 14, 2018, 03:48:40 AM by Justin Derrick »

Justin Derrick

  • IBM Content Manager OnDemand Consultant
  • Administrator
  • Hero Member
  • *****
  • Posts: 2229
  • CMOD Guru for hire...
    • View Profile
    • Tenacious Consulting
Re: Unable to store object Failure
« Reply #1 on: August 14, 2018, 03:39:51 AM »
I've tweaked your post to anonymize server names and IP addresses.  :)

First and foremost, refer to the IBM CMOD troubleshooting guide on the wiki:  https://cmod.wiki/index.php?title=Troubleshooting_Content_Manager_OnDemand

Obviously, you know the issue is TSM related, so skip to that section for information about searching the TSM Activity Log. 

The reason code (15) in your error message is:  "DSM_RC_ABORT_RETRY Unexpected retry request. The server found an error while writing the data."
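
For anyone who wants to pull these fields out of the System Log in bulk, here is a minimal Python sketch. It assumes the 429 message text always follows the layout of the sample quoted in this thread; the function name `parse_tsm_error` and the tiny lookup table are illustrative only, and the table holds just the one code named above rather than the full list of TSM API return codes.

```python
import re

# Code names taken only from the message discussed in this thread;
# this is NOT a complete table of TSM API codes.
REASON_NAMES = {15: "DSM_RC_ABORT_RETRY"}

# Layout assumed from the 429 message quoted in this thread.
TSM_ERROR = re.compile(
    r"TSM Error: (?P<msg_id>ANS\d+[A-Z]) \(RC\d+\).*?"
    r"Return Code=(?P<rc>\d+), Reason=(?P<reason>\d+), "
    r"File=(?P<file>[^,]+), Line=(?P<line>\d+)",
    re.DOTALL,
)

def parse_tsm_error(text):
    """Return the fields of a 429 'TSM Error' message, or None if no match."""
    m = TSM_ERROR.search(text)
    if m is None:
        return None
    reason = int(m.group("reason"))
    return {
        "msg_id": m.group("msg_id"),
        "return_code": int(m.group("rc")),
        "reason": reason,
        "reason_name": REASON_NAMES.get(reason, "unknown (look it up)"),
        "source": f'{m.group("file")}:{m.group("line")}',
    }

sample = ("TSM Error: ANS0266I (RC2302) The dsmEndTxn vote is ABORT, so check "
          "the reason field., Return Code=2302, Reason=15, "
          "File=arssmsms.cpp, Line=2695")
print(parse_tsm_error(sample))
```

Keeping Return Code and Reason as separate fields matters here, because with a dsmEndTxn ABORT it is the Reason value that says why the transaction failed.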

This is probably rather urgent.  Problems writing data most certainly mean trouble reading data later on.  I suspect your hardware is having trouble writing new data to its storage media.  If it's a tape library, a drive may need to be cleaned, or a tape may have been mounted too many times and reached the end of its useful life.  If it's a storage array of some sort, it may be degraded (running without a full complement of drives).

I've added a section to the Content Manager OnDemand Troubleshooting page with a link to the TSM API return codes, should anyone stumble upon this in the future.
« Last Edit: August 14, 2018, 03:49:19 AM by Justin Derrick »
IBM CMOD Professional Services: http://TenaciousConsulting.com
Call:  +1-866-533-7742  or  eMail:  jd@justinderrick.com
IBM CMOD Wiki:  https://CMOD.wiki/
FREE IBM CMOD Education & Webinars:  https://CMOD.Training/

Interests: #AIX #Linux #Multiplatforms #DB2 #TSM #SP #Performance #Security #Audits #Customizing #Availability #HA #DR

DDP021

Re: Unable to store object Failure
« Reply #2 on: August 14, 2018, 04:03:33 AM »
Thank you SO much Justin!  I've relayed this information to the TSM area.  We thought that when he took the drive offline on Friday (8/10) and we didn't see any errors the rest of the day or all of 8/11, that was the issue.  Then we received just one error on Sunday (granted, we aren't loading much over the weekend) and then 6 yesterday.  Can't pinpoint it to a specific time during the day.  We aren't sure if maybe the same drive he took offline somehow came back online over the weekend?  Not heard anything back on that.  The oddity is the timing of these failures always being at the quarter hour, always right on or 1 min past.  See below for the times of last week's failures.  Not sure how many drives there are.

2018-08-10 10:15:39.902531
2018-08-10 06:31:25.108161
2018-08-10 06:30:47.230555
2018-08-10 06:30:29.320138
2018-08-09 20:46:20.548694
2018-08-09 20:45:26.062104
2018-08-09 14:16:45.521092
2018-08-09 14:16:19.483341
2018-08-09 14:16:19.343845
2018-08-09 14:16:18.700834
2018-08-09 14:16:14.996145
2018-08-09 14:15:32.850422
2018-08-09 14:15:29.650348
2018-08-09 02:46:15.329205
2018-08-08 22:01:45.105292
2018-08-08 20:31:24.107198
2018-08-08 20:31:09.716809
2018-08-08 12:31:32.113680
2018-08-08 12:31:08.510195
2018-08-08 12:30:28.154920
2018-08-08 12:30:24.329362
2018-08-08 08:01:37.664348
2018-08-08 08:01:19.221901
2018-08-08 08:00:33.668738
2018-08-08 08:00:24.386561
2018-08-07 20:01:38.983334
2018-08-07 20:01:11.826213
2018-08-07 20:00:43.376407
2018-08-07 20:00:36.602169
2018-08-07 20:00:32.605160
2018-08-07 20:00:26.486074
2018-08-07 20:00:24.293329
2018-08-07 01:30:57.413941
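
The quarter-hour pattern can be checked mechanically rather than by eye. This is a small Python sketch that assumes the timestamps use the same format as the list above; `seconds_past_quarter` is just an illustrative helper name, and only a handful of the times are copied in as sample input.

```python
from datetime import datetime

def seconds_past_quarter(stamp):
    """Seconds elapsed since the most recent :00, :15, :30 or :45 mark."""
    t = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")
    return (t.minute % 15) * 60 + t.second + t.microsecond / 1e6

# A few of the failure times listed above.
samples = [
    "2018-08-10 10:15:39.902531",
    "2018-08-10 06:31:25.108161",
    "2018-08-09 20:46:20.548694",
    "2018-08-09 14:16:45.521092",
    "2018-08-08 22:01:45.105292",
    "2018-08-07 01:30:57.413941",
]

for s in samples:
    print(s, f"{seconds_past_quarter(s):6.1f} s past the quarter hour")
```

If the failures were random, the offsets would be spread across the whole 900 seconds of each quarter hour. A cluster within the first couple of minutes points at something scheduled on a 15 minute cycle, which is worth raising with the TSM and storage teams.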

DDP021

Re: Unable to store object Failure
« Reply #3 on: August 14, 2018, 08:24:44 AM »
Justin, just a caveat to muddy the waters: we started seeing these errors immediately after attempting to migrate from Centera to ECS.  Initially we believed this was the issue.  We migrated back to Centera but have continued to receive these same errors, so we're not sure if it's just a coincidence or not.

DDP021

Re: Unable to store object Failure
« Reply #4 on: August 16, 2018, 05:40:52 AM »
Justin,

Not sure if this is related, but we're now seeing 429 messages that aren't the same as the ones we were getting when we received the 88 failure indicating "Unable to store the object".

The new 429 message indicates, TSM Error: ANS1314E (RC14)   File data currently unavailable on server, Return Code=14, Reason=0, File=arssmsms.cpp, Line=1893  Srvr->xsa00e70.example.com 999.999.49.139 non-SSL<-


We also see a message 24 that shows, Object >1102FAAC< in Application Group >KYG< not found in node >CENTERA3650<  Srvr->xsa00e70.example.com 999.999.49.139 non-SSL<-

Not sure where the 3 character Application Group name is coming from.  We have no Application Group ids called that.  The users that are getting the message 24 each have their own unique Application Group id.

These are all on user ids.  As of yet, we haven't had any complaints, but it appears to be related to users attempting to view data.  I know you indicated before that the failures we saw could mean trouble reading data in the future.

So far the last 88 failure we received was on 08/14, when we received 11 in a row.  No resolution from anyone involved other than setting up an STK log???
« Last Edit: August 16, 2018, 07:12:57 AM by Justin Derrick »

Justin Derrick

Re: Unable to store object Failure
« Reply #5 on: August 16, 2018, 07:16:39 AM »
You need to look for errors on your TSM server.  CMOD's just telling you that the document isn't available -- the TSM activity logs will give you a hint as to why.

The three letter code is the "AGID_NAME".  It's used in the cache filesystems, db2 table names, and TSM filespace names to keep different application group data separate.  :)  You can get more info in the IBM CMOD Database Tables page on the wiki:  https://cmod.wiki/index.php?title=Content_Manager_OnDemand_Database_Tables
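
If you need to collect those codes from a lot of message 24 entries, here is a rough Python sketch. It assumes every message 24 follows the same wording as the one quoted earlier in this thread; `parse_msg24` is just an illustrative name.

```python
import re

# Wording assumed from the single message 24 quoted in this thread.
MSG24 = re.compile(
    r"Object >(?P<object>[^<]+)< in Application Group >(?P<agid_name>[^<]+)< "
    r"not found in node >(?P<node>[^<]+)<"
)

def parse_msg24(text):
    """Return object name, AGID_NAME and storage node, or None if no match."""
    m = MSG24.search(text)
    return m.groupdict() if m else None

sample = ("Object >1102FAAC< in Application Group >KYG< "
          "not found in node >CENTERA3650<")
print(parse_msg24(sample))
```

The `agid_name` value it returns is the code to match against the AGID_NAME values described on the database tables page, which gets you back to the application group name your users know.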

-JD.

DDP021

Re: Unable to store object Failure
« Reply #6 on: August 16, 2018, 07:28:29 AM »
Justin,

We've relayed both errors (MSG 429 and MSG 24) to the TSM team.  We just started receiving these messages.  When we try to retrieve any of the reports we get the message, "The Server failed while retrieving a document".  Perhaps not by coincidence, it appears we had received 88 failures on other reports either before or after the documents that we cannot retrieve, even though those documents appear to have loaded successfully.  It's only a matter of time till users start complaining.  As indicated, this mess all started after they attempted to migrate to ECS from Centera.  Even though they backed off that switch and we're supposedly back to writing to Centera, these issues are still happening.  Again, thanks for the input Justin!!!