Hey again,
Of course you can look inside the 87's too.
In this environment, we did indeed only load one doc at a time. But loading bigger batches is simply a matter of "looping" the LoadAddDoc step, once per document, and then running LoadCommit just once at the end.
That also answers your third question: to the best of my knowledge, the only way to enter index data is through the LoadAddDoc step. That means you will probably have to parse the index data out of your intended input files in order to index the documents properly. I currently do not know of any way to index the documents locally on the OD Server using ACIF or the PDF indexer.
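To make the "loop LoadAddDoc, then commit once" idea concrete, here is a rough sketch in Python. Please note that nothing in it is the real ODWEK API: FakeLoadSession, load_add_doc, load_commit and the "name=value;name=value" index format are all hypothetical stand-ins I made up for illustration, so you would have to swap in the actual calls and your own input format.

```python
from dataclasses import dataclass, field


@dataclass
class FakeLoadSession:
    """Hypothetical stand-in for a load session; NOT the real ODWEK API."""
    pending: list = field(default_factory=list)
    commits: int = 0

    def load_add_doc(self, index_values, data):
        # Queue one document together with its index values.
        self.pending.append((index_values, data))

    def load_commit(self):
        # Finish the batch and report how many documents were queued.
        self.commits += 1
        loaded = len(self.pending)
        self.pending.clear()
        return loaded


def parse_index_line(line):
    """Parse an assumed 'name=value;name=value' index record into a dict."""
    items = (item for item in line.strip().split(";") if "=" in item)
    pairs = (item.split("=", 1) for item in items)
    return {name.strip(): value.strip() for name, value in pairs}


def load_batch(session, documents):
    """Add every document in a loop, then commit exactly once."""
    for index_line, data in documents:
        session.load_add_doc(parse_index_line(index_line), data)
    return session.load_commit()
```

The point of the design is only that the index values travel with each add call, and the commit happens a single time after the loop.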
As for your question about whether I expect a performance penalty: it may actually be the other way around. Sure, there is a penalty for transferring the file to the OD Server via ODWEK, but it is probably not much greater than transferring the file any other way. And when it comes to indexing, there is none, since you have already entered the index data programmatically during the LoadAddDoc step, on the ODWEK server. When I look in the System Load table, the column "Index elapsed time (seconds)" is always zero for these documents. So you may actually get a performance bonus instead of a penalty, but I suggest you test this properly in the actual environment you will be using. (For example, if you run ODWEK on the same server as your Load/Library server, you will probably not decrease the OD Server load very much, if at all.)
2018-03-14 12:12:20.325756: ARS4315I Processing file >/opt/ondemand/arscache/tmp/BANKDOC-NY.BANKDOC4PDF.ODADMIN.20180314.111220153101.139912273372928.ARD<
2018-03-14 12:12:20.325860: ARS4334I Load Version <9.5.0.9> Operating System <Linux> <#1 SMP Fri Sep 22 12:32:14 EDT 2017.2.6.32-696.13.2.el6.x86_64> OS Userid <ODADMIN> Install Location </home/odadmin/ibm/ondemand/V
2018-03-14 12:12:20.325890: ARS4335I Server Version <9.5.0.9> Operating System <Linux> <#1 SMP Fri Sep 22 12:32:14 EDT 2017.2.6.32-696.13.2.el6.x86_64> Database <DB2> <10.01.0006>
2018-03-14 12:12:20.345012: ARS4339I Application Group >BANKDOC-NY<
2018-03-14 12:12:20.345039: ARS4340I Application >BANKDOC4PDF<
2018-03-14 12:12:20.345048: ARS4341I Storage Set >Cache Only - Library Server<
2018-03-14 12:12:20.345056: ARS4342I Storage Node >Cache Only - Library Server<
2018-03-14 12:12:20.345098: ARS4312I Loading started, 3146122 bytes to process
2018-03-14 12:12:20.399537: ARS1144I OnDemand Load Id = >5083-1-0-1198FAA-20180202000000-20180202000000-5089<
2018-03-14 12:12:20.416971: ARS1146I Loaded 1 rows into the database
2018-03-14 12:12:20.424766: ARS1175I Document compression type used - OD77. Bytes Stored = >3268< Rows = >1<
2018-03-14 12:12:20.424830: ARS4310I Loading completed
2018-03-14 12:12:20.427613: ARS4317I Processing successful for file >/opt/ondemand/arscache/tmp/BANKDOC-NY.BANKDOC4PDF.ODADMIN.20180314.111220153101.139912273372928.ARD<
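If you want to compare load times between test runs, the timestamps in a log like the one above can be diffed with a few lines of Python. This is only a sketch: it assumes every line has the "timestamp: message-id text" shape shown here, and it takes ARS4312I and ARS4310I as the start and end markers purely because their texts read "Loading started" and "Loading completed". The function names are my own.

```python
from datetime import datetime


def parse_log_line(line):
    """Split 'YYYY-MM-DD HH:MM:SS.ffffff: ARSnnnnX text' into (timestamp, message id)."""
    stamp, rest = line.split(": ", 1)
    when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")
    return when, rest.split(" ", 1)[0]


def load_elapsed_seconds(lines):
    """Seconds between 'Loading started' (ARS4312I) and 'Loading completed' (ARS4310I)."""
    times = {}
    for line in lines:
        when, message_id = parse_log_line(line)
        times[message_id] = when
    return (times["ARS4310I"] - times["ARS4312I"]).total_seconds()


log_lines = [
    "2018-03-14 12:12:20.345098: ARS4312I Loading started, 3146122 bytes to process",
    "2018-03-14 12:12:20.424830: ARS4310I Loading completed",
]
print(load_elapsed_seconds(log_lines))
```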