We are running OnDemand 8.4 and using the PDF indexer. We just ran into a situation where the last character of a field is a dash. Is there something special about a dash being the last character in a field? It looks like the indexer appends the next text it finds to the end of that field, which makes the field too long for the loader, so the load fails. To get an idea of what is happening, we ran the PDF document through ARSPDUMP. Below is an example of the output:
Sponsor:
ul.h = 1.16 ul.v = 1.97 lr.h = 1.64 lr.v = 2.12
COLLEGE
ul.h = 1.72 ul.v = 1.97 lr.h = 2.17 lr.v = 2.12
CHARLESTON
ul.h = 2.34 ul.v = 1.97 lr.h = 3.00 lr.v = 2.12
Sponsor
ul.h = 0.86 ul.v = 2.09 lr.h = 1.31 lr.v = 2.25
Ref#:
ul.h = 1.32 ul.v = 2.09 lr.h = 1.64 lr.v = 2.25
520877ISU
ul.h = 1.72 ul.v = 2.10 lr.h = 2.25 lr.v = 2.25
This becomes the following generic index records:
GROUP_FIELD_NAME:SPNSR
GROUP_FIELD_VALUE:COLLEGE CHARLESTON
GROUP_FIELD_NAME:SPNSR_REF
GROUP_FIELD_VALUE:520877
Below is what happens when there is a dash at the end:
Sponsor:
ul.h = 1.16 ul.v = 1.97 lr.h = 1.64 lr.v = 2.12
BROOKHAVEN
ul.h = 1.72 ul.v = 1.97 lr.h = 2.40 lr.v = 2.12
ASSOCIATES-Sponsor
ul.h = 2.86 ul.v = 1.97 lr.h = 3.51 lr.v = 2.12
ul.h = 0.86 ul.v = 2.09 lr.h = 1.31 lr.v = 2.25
Ref#:
ul.h = 1.32 ul.v = 2.09 lr.h = 1.64 lr.v = 2.25
183423
ul.h = 1.72 ul.v = 2.10 lr.h = 2.08 lr.v = 2.25
This becomes the following generic index records:
GROUP_FIELD_NAME:SPNSR
GROUP_FIELD_VALUE:BROOKHAVEN SCIENCE ASSOCIATES-Sponsor
GROUP_FIELD_NAME:SPNSR_REF
GROUP_FIELD_VALUE:149282
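In the second dump, the box that belonged to "Sponsor" (ul.h = 0.86) is still printed, but it has no text line of its own, because the text was glued onto "ASSOCIATES-". For anyone who wants to scan a larger dump for the same pattern, here is a rough Python sketch. It assumes the ARSPDUMP output has been saved as text and alternates one text line with one coordinate line, exactly as in the excerpts above; the function name `find_orphan_boxes` is made up for illustration and is not part of OnDemand.

```python
import re

# Matches coordinate lines in the layout shown in the excerpts above
# (assumed format; adjust if your ARSPDUMP output differs).
BOX_RE = re.compile(
    r"^ul\.h = ([\d.]+) ul\.v = ([\d.]+) lr\.h = ([\d.]+) lr\.v = ([\d.]+)$"
)


def find_orphan_boxes(dump_lines):
    """Return (token, box) pairs where a box has no text line of its own.

    A coordinate line that directly follows another coordinate line
    suggests its text was merged into the preceding token.
    """
    suspects = []
    last_token = None      # most recent text line seen
    prev_was_box = False   # was the previous non-empty line a coordinate line?
    for raw in dump_lines:
        line = raw.strip()
        if not line:
            continue
        match = BOX_RE.match(line)
        if match:
            if prev_was_box:
                box = tuple(float(g) for g in match.groups())
                suspects.append((last_token, box))
            prev_was_box = True
        else:
            last_token = line
            prev_was_box = False
    return suspects


# The failing excerpt from this post, copied as-is.
sample = """\
Sponsor:
ul.h = 1.16 ul.v = 1.97 lr.h = 1.64 lr.v = 2.12
BROOKHAVEN
ul.h = 1.72 ul.v = 1.97 lr.h = 2.40 lr.v = 2.12
ASSOCIATES-Sponsor
ul.h = 2.86 ul.v = 1.97 lr.h = 3.51 lr.v = 2.12
ul.h = 0.86 ul.v = 2.09 lr.h = 1.31 lr.v = 2.25
Ref#:
ul.h = 1.32 ul.v = 2.09 lr.h = 1.64 lr.v = 2.25
183423
ul.h = 1.72 ul.v = 2.10 lr.h = 2.08 lr.v = 2.25
""".splitlines()

suspects = find_orphan_boxes(sample)
for token, box in suspects:
    print(f"possible merged token: {token!r}, orphan box: {box}")
```

This only flags where the merge shows up in the dump; it does not explain why the indexer does it or fix the load.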
Has anyone seen this before? How do we fix it?
Hello bwissink,
I'm not sure what to say, except to ask which version you have exactly. You say 8.4; are you on the latest fix pack of that version (8.4.0.3)? Have you tried upgrading to 8.4.1 (8.4.1.9)?
Anyway, as you probably know, CMOD 8.4 is no longer supported by IBM, since 30.09.2012 (Unix, Windows) and 30.04.2013 (z/OS).
So my best suggestion would be to consider upgrading from your version to V8.5.0.7 or V9.0.0.2.
Then test whether your problem is still there (I would suggest doing so first in a dev or test environment).
V8.5 and V9.0 have a completely rewritten PDF indexer, so the little glitches that may have occurred before might be solved now. It should also be faster.
Of course, if somebody has an idea that helps you, then great; otherwise you might consider my suggestion.
Sincerely yours,
Alessandro