Show Posts

This section allows you to view all posts made by this member. Note that you can only see posts made in areas you currently have access to.


Messages - Johan Dahlgren

Pages: [1]
1
Tips and Tricks / Re: Document size on disk i AFP format
« on: February 02, 2022, 04:59:50 AM »
I see, that explains why it starts on SA2. And regardning System Load the oldest i can find is from 2017, does that mean that we chose to activate it in 2017 and anything before that was nevere logged? Ones this is activated can I trust that all loads are saved in the System Load facility or is it dependent on what load was run?

For example if we have 2 folders, folder1 and folder2 and each of these have their own application groups and applications connected. If System Load feature was activated will it store all loads for both folders/applications groups or do you have to activate it for each folder/application group?

2
Tips and Tricks / Re: Document size on disk i AFP format
« on: January 28, 2022, 07:26:46 AM »
System load table, is that the one called SA in the database? in our case it is named SA2, not sure why there is no SA1 but could be it has been deleted since it is set to expire after 3650 days.

I got a bit of help from our Database Admin with an SQL to get a decent number calculated, here is an example of one wouch SQL query:
Code: [Select]
select x.name, x.agid, x.agid_name  , AVG(BIGINT(x.decomp_size + x.DOC_LEN)) as AVG from (
select AG.NAME, AG.AGID, AG.AGID_NAME, RES.DECOMP_SIZE, DOC.DOC_LEN
from ARSRES as RES ,
ARSAG as AG ,(
select DOC_LEN,RESOURCE from QLA1    UNION ALL
select DOC_LEN,RESOURCE from QLA2    UNION ALL
select DOC_LEN,RESOURCE from QLA3    UNION ALL
select DOC_LEN,RESOURCE from QLA4    UNION ALL
select DOC_LEN,RESOURCE from QLA5
) as DOC
where AG.AGID=RES.AGID
and RES.RID=DOC.RESOURCE and AG.AGID_NAME='QLA'
) X  where decomp_size is not null
group by x.NAME, x.AGID, x.AGID_NAME

It works well enough for us we think. This combined with a select count(*) gives us the aproximate size of the application group.

I thing i can try to describe an overview of our conversion process. Unfortunately no details since the data is sensetive.

Basically we have CM OnDemand that has been running for many years, think it was early 2000. Anyway it was old when i started 11 years ago. :)
It has been recieving documents of various kinds over these years, slowly adding and changing with no regard for GDPR or that the data eventually had to be removed och preserved for future statistics.

Start of our project, we were tasked with making a new document storage as well as moving and converting the old one.

What we are doing to migrate the documents.

1. We segment a folder search, usually one year worth of documents/folder, depending on the size of the folder (number of documents).

2. Write the AFP files (so far we only worked with the AFP format but we will have to migrate some TIF aswell in the future) to disk along with the metadata about each document. We also give each document a unique name for when we put it in the new storage. We also include all the available resources, because they will be needen later in the conversion to PDF-A/1b.

3. We now have all the information we can get we run two processes
    A. Convert the file, this is done with AFP2PDF converter, it takes a bit of fiddeling with resources souch as font mapping and replacing missing peices of information with other resources. But we
        got it to work fairly well.
    B. Update the medatata about the document, this part is a bit sensetive but basically we update dates that are wrong or missing, customer personal id with internal customer ids instead and we
        apply rules for when document should be erased/preserved.

4. Now most of the work is done and we package the updated metadata about the documents with the new PDF files and send it to the new server that inserts it all in to its database and storage.

That is the basic process, if you have any questions about some part of it I'll try to answer if I can.

EDIT: Unfortunately it seems that our SA table does not contain loads for all our application groups.
Code: [Select]
db2 "select count(*) from SA2 where LOAD_AGID=5863"

1
-----------
          0

  1 record(s) selected.
And i have resources in ARSRES that have the AGID 5863 but alas no DECOMP_SIZE.

3
Tips and Tricks / Document size on disk i AFP format
« on: January 25, 2022, 06:24:14 AM »
Hello,

I am having problems with calculating the average size of the documents stored in OnDemand folders and application groups. I have been manually trying to match different sizes in the tables to match it to the actual size of the AFP file when written to disk outside of OnDemand.

We are working to convert our documents from AFP to PDF-A/2b and want an accurate way to estimate the time needed to do this. We are fetching the documents from OnDemand and writing them to disk in the AFP format, then we convert them using the AFP2PDF converter and finaly store them in a new storage.

I have been looking at the DOC_LEN in the tables named from the application group name (ARSAG, AGID_NAME) and combining that with the DECOMP_SIZE of the correlating resource from ARSRES table... for the most part it seems to give me the correct size of the file but when the resource DECOMP_SIZE is null the numbers are off. Examples below.

If i have a file with DOC_LEN of 2000 and that has a RESOURCE with a DECOMP_SIZE of 4000 the document size on disk will be 6000b

But if I have a file with DOC_LEN of 2000 and a RESOURCE with a DECOMP_SIZE of '-' the document size is not the expected 2000 but often much bigger even tho there is no DECOMP_SIZE.

Am i way off in what i try to do? Is there a better way to calculate the expected number of bytes a document will have once it is written to disk?

Kind regards
Johan Dahlgren

4
My SQL knowledge is not enough for this problem it seems.

The query works fine if I only find one result in the sub-select. but for some document types (other tables) i get over 700 unique DOC_NAMES for documents that are missing dates.

If i run the SQL query with multiple results from the subselect i get an error.

Code: [Select]
$ db2 "SELECT TIME_STAMP, MSG_TEXT FROM SL2 WHERE MSG_TEXT LIKE
(SELECT DISTINCT '%6414-' || PRI_NID || '-'|| SEC_NID || '-' ||
LEFT(DOC_NAME, LENGTH(TRIM(DOC_NAME))-1) || '%'
FROM MLA1  WHERE UTBDATUM < '1') AND MSG_NUM = 87"

TIME_STAMP  MSG_TEXT                                                                                                                                                                             
----------- --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL0811N  The result of a scalar fullselect, SELECT INTO statement, or VALUES
INTO statement is more than one row.  SQLSTATE=21000

This is just with one of my 30+ MLAX tables and one of my about 100 SL tables...

//Johan D

5
Ohhhh that explains it Darrel. Thank you so mouch all of you for taking the time to look at my problem.

Our SA2 table only contains load data for the applications group of my documents from 2017 at the oldest. The document that are missing their dates are from 2006, the ones I have manually searched for and opened to read atleast.

Thank you all again for helping me sort this problem out. I can now conclude that this is not a viable option for us to find dates for the documents with missing dates.

// Johan D

6
I've concluded that there are two scenarios for what I am doing.
a) I am doing somethings wrong since I cant find the load information about a document.
b) the load records are incomplete or have been deleted some time over the years

Is it possible that the AS2 table only stores the load information for a set time before removing the information? Some of the documents are very old (10+ years old)

//Johan D

7
There may be multiple System Load tables...  SA2, SA3, etc.  You may need to search more than just one. 

-JD.

Only seem to have one table named SA2 or are they hidden somehow

$ db2 list tables |grep SA
ARSAG                           ARCHIVE         T     2017-05-27-16.47.33.550405
ARSAG2FOL                       ARCHIVE         T     2017-05-27-16.47.33.653442
ARSAGFLD                        ARCHIVE         T     2017-05-27-16.47.33.733617
ARSAGFLDALIAS                   ARCHIVE         T     2017-05-27-16.47.33.815872
ARSAGINDEX                      ARCHIVE         T     2017-05-27-16.47.33.899852
ARSAGPERMS                      ARCHIVE         T     2017-05-27-16.47.33.986818
ARSANN                          ARCHIVE         T     2017-05-27-16.47.34.147018
ARSAPP                          ARCHIVE         T     2017-05-27-16.47.34.727632
ARSAPPUSR                       ARCHIVE         T     2017-05-27-16.47.35.053065
SA2                             ARCHIVE         T     2017-05-29-10.24.16.826301


//Johan D

8
Here's an SQL that works for me. I'm sure some adaptation will be needed for your circumstances. Because you know the AGID Name, UGA, you can look up the AGID from the ARSAG table. In my example, my AGID Name is LNJD and my AGID is 93322. My instance name is ONDENU. The DOC_NAME column in the AG table requires the last character to be removed so that doc name 1FAAA becomes 1FAA, which is the correct name for the load ID.

SELECT LOAD_TIME_DT, LOAD_ID FROM ONDENU.SA2 WHERE LOAD_ID LIKE
(SELECT DISTINCT '93322-' || PRI_NID || '-'|| SEC_NID || '-' ||
LEFT(DOC_NAME, LENGTH(TRIM(DOC_NAME))-1) || '%'               
FROM ONDENU.LNJD1  WHERE RPT_DATE = 0)                         

The output in my example is:
LOAD_TIME_DT                LOAD_ID                                           
2020-03-31-13.48.48.087472  93322-2-0-1FAA-19940110000000-19940110000000-93347


I hope this helps.

Great information, I've tried it in my OnDemand setup but it complains about my version of the SQL. Not sure if I missunderstan some things or if we have different setups?

This is the SQL i ran, changed to fit our names for things

SELECT LOAD_TIME_DT, LOAD_ID FROM ARSAG WHERE LOAD_ID LIKE (SELECT DISTINCT '6045-' || PRI_NID || '-'|| SEC_NID || '-' || LEFT(DOC_NAME, LENGTH(TRIM(DOC_NAME))-1) || '%' FROM UGA1  WHERE BESLUTSDATUM < 1)
SQL0206N  "LOAD_TIME_DT" is not valid in the context where it is used.
SQLSTATE=42703

i am not sure if your ONDENU.SA2 is ARSAG table or if I missunderstood here cause i have no "LOAD_TIME_DT" in either ARSLOAD or ARSAG. 6045 is my AGID.

Nevermind, just there is an actual table called SA2 that I have not seen before.

Will continue trying to figure this out, will return to report on progress

Sorry for spamming the changes here, finally figured it out but unfortunately there is no hits when I run the SQL. If i separate the SQLs they seem to find what I am looking for from the UGA1 table but it does not match anything in the SA2 table. Has it been removed or am i still doing something wrong?

my runs an their results:
$ db2 "select NAME, AGID, AGID_NAME from ARSAG where NAME like 'BB-%'"

NAME                                                         AGID        AGID_NAME
------------------------------------------------------------ ----------- ---------
BB-BESLUT                                                           6045 UGA

  1 record(s) selected.

$ db2 "select distinct '6045-' || PRI_NID||'-' || SEC_NID || '-' || LEFT(DOC_NAME,LENGTH(TRIM(DOC_NAME))-1) ||'%' FROM UGA1 where BESLUTSDATUM < 1"

1
-------------------------------
6045-19-0-4066FAA%

  1 record(s) selected.

$ db2 "select LOAD_ID from SA2 where LOAD_ID like '6045-19-0-4066FAA%'"

LOAD_ID
----------------------------------------------------------------------------------------------------

  0 record(s) selected.




//Johan D

9
System Load has that information and it retrieves it from ARSLOAD table. You can probably build a join between ARSLOAD table and your segment table.

Link to ARSLOAD table details: https://www.ibm.com/support/knowledgecenter/en/SSEPCD_10.5.0/com.ibm.ondemand.administeringmp.doc/dodsc018.htm

I hope this helps,
Mehmet

Hello Mehmet

I have been trying to get my head around how to join the tables as to give me the load date for the document but cant seem to find the correct correlation. Any help here would be appriciated.

I have tried to connect the tables using the DOC_NAME from my document storage table, in the example below UGA1. With the NAME in the ARSLOAD table but doing that i find many many results so I have to narrow it down a bit. I was then thinkig I could use the AGID from ARSAG to find the load corresponding to the document loaded in my UGA1 table but then I get no results.

Can i be that the ARSLOAD is incomplete?

what about using the system log? Can you view the document in the thick client?

Right click, take the partial load ID and search for it in the message field in the system log folder, along with msg num 88

Hello Jsquizz,

I can try it to see if I can find the load date for one document. Unfortunately I need to do this for thousands of documents missing the date so I really need a less manual way of doing it.

//Johan

10
Hello,

I have a problem with dates in our ondemand. We have documents that we need to apply rules to according to their age, unfortunately some of the documents are missing this information. The date value is set to "-1568" for about 30 000 of the documents. My question is if there is a way to find out when these documents were loaded into Ondemand and use that date instead?

I have the following example.

Our application group from the table ARSAG
NAME                       AGID  AGID_NAME
Application group 1   6045   UGA

DB2 describe of table UGA1
Column name     Data type schema    Column Data type name   Length  Scale   Nulls
VALUE_1         SYSIBM              CHARACTER               12      0       No
DATE            SYSIBM              SMALLINT                2       0       No
VALUE_2         SYSIBM              CHARACTER               4       0       No
DOC_NAME        SYSIBM              VARCHAR                 11      0       No
DOC_OFF         SYSIBM              INTEGER                 4       0       No
DOC_LEN         SYSIBM              INTEGER                 4       0       No
COMP_OFF        SYSIBM              INTEGER                 4       0       No
COMP_LEN        SYSIBM              INTEGER                 4       0       No
ANNOT           SYSIBM              CHARACTER               1       0       No
COMP_TYPE       SYSIBM              CHARACTER               1       0       No
RESOURCE        SYSIBM              INTEGER                 4       0       No
PRI_NID         SYSIBM              SMALLINT                2       0       No
SEC_NID         SYSIBM              SMALLINT                2       0       No

I was hoping there was a way to connect the DOC_NAME to the NAME in ARSLOAD and get the date from there but I cant seem to find the correct NAME and AGID to correlate with the DOC_NAME in my table UGA1.

So in short, is there  a way to find out when a specific document in my UGA1 table was loaded?

11
Tips and Tricks / Re: Best way to "select count" from ondemand?
« on: May 19, 2020, 06:59:21 AM »
Hello again, i found a way to do it. Thank you Justin for pointing me in the right direction.

First i had to find out what tables my documents were in by tracing them from ARSAG table to get the names of my application groups and then by looking at the AGID_NAME finding what names my tables had. From there it was just the question how to write an SQL query to find and count my documents.

this is what i ended up with. It did the job atleast.
Code: [Select]
db2 "select count(*) from (
    SELECT * from GKA1 UNION ALL
    SELECT * from GKA2 UNION ALL
    SELECT * from GKA3 UNION ALL
    SELECT * from GKA4 UNION ALL
    SELECT * from GKA5 UNION ALL
    SELECT * from GKA6 UNION ALL
    SELECT * from GKA7 UNION ALL
    SELECT * from GKA8 UNION ALL
    SELECT * from GKA9 UNION ALL
    SELECT * from GKA10) as l
where TYPE='one'"

Thanks again Justin for quick response and for pointing me in the right direction.

-JD

12
Tips and Tricks / Best way to "select count" from ondemand?
« on: April 23, 2020, 07:20:29 AM »
Hello,

I have a question. What is the best way to do a sort of "select count" from ondemand?

Backstory: we want to know how many of a certain type of document we have in a folder. We have a folder, lets call it LETTERS, with about 25.000.000 documents. we want to know how many type 1 and how many type 2 documents are in that folder.

What I have tried so far:
I have tried using the ARSDOC QUERY command to do this, se example below, but is there a better way to do this?

command i have used:
arsdoc query -v -h <SERVER> -u <USER> -p <PASSWORD> -f "<FOLDER NAME WITH SPACES>" -i "where id LIKE '%' and type= 'one'" | wc -1

Any hints or tips would be greatly appreciated.

Regards,

Johan Dahlgren

Pages: [1]