OnDemand User Group

Support Forums => Report Indexing => Topic started by: DDP021 on September 27, 2020, 04:39:19 PM

Title: COLUMN DELIMITER ERROR
Post by: DDP021 on September 27, 2020, 04:39:19 PM
I'm trying to load a text file into CMOD.  I keep getting the same error:

 Loading started, 3100678 bytes to process
 Row 178:  The string "/BENEFICIARY   CURRENCYFAIR" has the column delimiter in it

I've opened the file in PSPAD under the hex editor and found only 2 instances of this exact line. I never found the preceding / character anywhere.  I've tried editing both lines by inserting an x'20' character (space) in front of BENEFICIARY, and it still gets the same error.  Am I missing something?
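
For what it's worth, a command-line way to double-check what PSPAD shows is to dump the raw bytes of the suspect row (a quick sketch, assuming a Unix-like shell; input.txt is a placeholder for the actual file name):

  sed -n '178p' input.txt | od -c

od -c prints every byte, so any invisible character stands out (a tab, for example, shows as \t) instead of looking like a space.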
Title: Re: COLUMN DELIMITER ERROR
Post by: DDP021 on September 28, 2020, 03:34:47 AM
I've attached a screenshot of the bad line from PSPAD.

Title: Re: COLUMN DELIMITER ERROR
Post by: Justin Derrick on September 28, 2020, 09:51:52 AM
The field delimiter is usually the tab character (Hex 09).  Check the file for tab characters inside the index.
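
For example (a quick sketch, assuming GNU cat and grep; index.txt is a placeholder name):

  cat -A index.txt | grep -nF '^I'

cat -A makes control characters visible, so tabs show up as ^I, and any line the grep flags contains a real tab rather than spaces.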

-JD.
Title: Re: COLUMN DELIMITER ERROR
Post by: DDP021 on September 28, 2020, 09:58:20 AM
Justin,

I just heard back this AM from our CMOD engineering person...He actually used a Unix command to find the issue (I'm not a Unix person!! haha)...Apparently PSPAD didn't show this (as you said it might not)...Hoping the explanation below makes sense...;-)

The 88 msg for EMXDEFLT is caused by the tab character.  Since it displays the same as a space, it's very elusive.
The way to find them is with the Unix command: grep "$(printf '\t')" filename
After running that, it shows me there is a total of 19 tab characters (\t) in the input file.

For now, I have replaced all tab characters with spaces (cat 1414319.1.EMXDEFLT.EMXDEFLT.2020925.10.ard.Failed.org | sed s/"$(printf '\t')"/' '/g > test.txt) and ingested that file into ERR successfully.
But going forward this has to be corrected by the team that generates the input file.
Please forward this email to that team, and let them know they can catch tab characters by searching for "$(printf '\t')".
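
For anyone hitting this later: tr can do the same count and cleanup in one pass (a sketch, assuming a POSIX shell; the file names are placeholders):

  tr -cd '\t' < input.txt | wc -c        # counts the tab characters in the file
  tr '\t' ' ' < input.txt > cleaned.txt  # replaces each tab with a space

As the engineering note says, though, the real fix belongs with the team generating the file.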
Title: Re: COLUMN DELIMITER ERROR
Post by: Justin Derrick on September 28, 2020, 11:35:30 AM
Yup, stuff like this needs to get sent upstream for resolution.  As archivists, we shouldn't be manipulating data on its way into CMOD.  It calls into question the authenticity of the archive.

-JD.
Title: Re: COLUMN DELIMITER ERROR
Post by: Ed_Arnold on September 28, 2020, 03:24:12 PM
Yep.  I've seen this before:

http://www.odusergroup.org/forums/index.php?topic=1987.msg7528#msg7528

Ed
Title: Re: COLUMN DELIMITER ERROR
Post by: DDP021 on September 28, 2020, 04:46:54 PM
Thanks, Ed!
Title: Re: COLUMN DELIMITER ERROR
Post by: DDP021 on September 28, 2020, 04:55:05 PM
Totally agree, Justin...But as I'm sure you're more than aware, it can be more than frustrating getting answers or fixes from the people generating the data.  Even after showing them exactly what is occurring, they have no answers.  This particular group has frustrated us in the past, most recently with field lengths.  After confirming the max lengths they would be sending, on more than one occasion they randomly increased some of them, which caused "max field length exceeded" ingestion errors.  The only solution we found was to create a new application group with the new field lengths.  The catch is that we need to rename the existing one so they can keep using the same naming convention when sending the files, all while we still retain access to the archived data.  Sorry to get off topic here...Just needed to vent!!  haha..As always, appreciate everyone's input and expertise!  Take care.  Dave
Title: Re: COLUMN DELIMITER ERROR
Post by: Justin Derrick on September 29, 2020, 05:40:05 AM
Hey Dave...

Yeah, I've seen my fair share of problematic vendors and lines of business sending bad data through systems.  Sometimes we catch it early, sometimes we have to back-pedal to fix things.  I always try to impress upon management that, as an archive, the more things you do to tweak bad data, the worse it looks to a judge in the midst of a lawsuit.  Having to admit that you alter data before or during load, or change records after they've been loaded, or selectively delete data...  that's just terrible for an organization's credibility. 

...but we do the best we can.  ;)

-JD.