OnDemand User Group

Support Forums => Report Indexing => Topic started by: mburnham on January 03, 2018, 06:14:10 AM

Title: Double byte character sets (DBCS)
Post by: mburnham on January 03, 2018, 06:14:10 AM: Hi,

I have a client who wants to index Japanese characters. We have a Unicode database:

$ db2 get db config |grep -i code
Database code page = 1208
Database code set = UTF-8
Database country/region code = 1

I can manually update a row with double-byte characters, and a query shows that DB2 stores them correctly:

$ db2 "select SEARCH_ATTRIBUTE_1, SEARCH_ATTRIBUTE_2 from tvb1 where doc_name = '9FAAA'"

SEARCH_ATTRIBUTE_1 SEARCH_ATTRIBUTE_2
-------------------------------------------------- --------------------------------------------------
あああ

1 record(s) selected.

However, OnDemand isn't inserting these characters correctly:

$ db2 "select SEARCH_ATTRIBUTE_1, SEARCH_ATTRIBUTE_2 from tvb1 where doc_name = '14FAAA'"

SEARCH_ATTRIBUTE_1 SEARCH_ATTRIBUTE_2
-------------------------------------------------- --------------------------------------------------
▒▒▒▒▒ ▒▒▒▒

1 record(s) selected.

I'm using an index file that looks like this:

CODEPAGE:954
COMMENT:----- Checklist pdf Report 2017/07/19 for CBV_CHECKLIST --------------
GROUP_FIELD_NAME:Production_Date
GROUP_FIELD_VALUE:2017/12/20
GROUP_FIELD_NAME:System_DateTime
GROUP_FIELD_VALUE:2017/12/20
GROUP_FIELD_NAME:Report_Description
GROUP_FIELD_VALUE:Checklist Report CODEPAGE 932
GROUP_FIELD_NAME:Report_ID
GROUP_FIELD_VALUE:CBV_CHECKLIST
GROUP_FIELD_NAME:Job_Name
GROUP_FIELD_VALUE:Checklist_Process
GROUP_FIELD_NAME:CHECKLIST_CREATOR
GROUP_FIELD_VALUE:bernie
GROUP_FIELD_NAME:search_attribute_1
GROUP_FIELD_VALUE:あ
GROUP_FIELD_NAME:search_attribute_2
GROUP_FIELD_VALUE:ああ
GROUP_FIELD_NAME:search_attribute_3
GROUP_FIELD_VALUE:冬の日
GROUP_FIELD_NAME:search_attribute_4
GROUP_FIELD_VALUE:
GROUP_FIELD_NAME:search_attribute_5
GROUP_FIELD_VALUE: example text 5
GROUP_OFFSET:0
GROUP_LENGTH:0
GROUP_FILENAME:/var/tmp/japanese/7years_d4ECIN1-EeamXABQVgGNrwA.ard.out

I've tried setting the codepage to 954 (as IBM suggested) and some other values, with no luck.
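One way to narrow this down is to check which encoding the index file is actually saved in, since the CODEPAGE line has to describe the bytes on disk. The sketch below is only an illustration. The CANDIDATES mapping from IBM code pages to Python codec names is an assumed approximation (IBM-932 and IBM-954 are not guaranteed to be byte-for-byte identical to Python's shift_jis and euc_jp), and matching_codepages is a hypothetical helper, not part of CMOD.

```python
# Approximate IBM code page -> Python codec mapping (assumed for illustration).
CANDIDATES = {"1208": "utf-8", "954": "euc_jp", "932": "shift_jis"}


def matching_codepages(raw: bytes, expected: str) -> list[str]:
    """Return the code pages whose encoding of `expected` equals `raw`."""
    matches = []
    for codepage, codec in CANDIDATES.items():
        try:
            if expected.encode(codec) == raw:
                matches.append(codepage)
        except UnicodeEncodeError:
            # `expected` cannot be represented in this code page at all.
            continue
    return matches


# Bytes as they would appear after "GROUP_FIELD_VALUE:" if the file were UTF-8.
value_bytes = "あ".encode("utf-8")
print(value_bytes.hex(), matching_codepages(value_bytes, "あ"))
```

In practice the bytes would be read from the index file in binary mode and compared against a value known to be in it.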

There are also issues running queries with the CMOD Windows client (9.5.0.x), but these may be caused by Windows language settings.

Has anyone gotten this to work? IBM tells me double-byte characters have been supported in CMOD for 20 years, but can't tell me how to do this. :P

Thanks much in advance,

Mark
Title: Re: Double byte character sets (DBCS)
Post by: Nolan on January 03, 2018, 06:23:44 AM: What is ARS_ORIGINAL_CODEPAGE set to in your ars.cfg file?

https://www.ibm.com/support/knowledgecenter/en/SSQHWE_9.5.0/com.ibm.ondemand.configuringzos.doc/dodzc224.htm
Title: Re: Double byte character sets (DBCS)
Post by: Steve Bechtolt on February 02, 2018, 04:49:09 AM: In your index file, set CODEPAGE:1208 (UTF-8, the same code page as your database).
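
A likely reason this works, assuming the index file was saved as UTF-8: the CODEPAGE line tells the loader how to interpret the bytes of each GROUP_FIELD_VALUE, so it has to match the file's real encoding. The Python sketch below only imitates that decoding step. decode_value is a hypothetical helper, and euc_jp is used as an approximate stand-in for code page 954; none of this is CMOD code.

```python
def decode_value(raw: bytes, codec: str) -> str:
    """Interpret raw index-file bytes under the given codec.

    Undecodable bytes become U+FFFD instead of raising an error.
    """
    return raw.decode(codec, errors="replace")


raw = "冬の日".encode("utf-8")          # what a UTF-8 index file would contain

as_utf8 = decode_value(raw, "utf-8")    # declared code page matches the bytes
as_eucjp = decode_value(raw, "euc_jp")  # declared code page does not match

print(as_utf8 == "冬の日")    # the text round-trips intact
print("\ufffd" in as_eucjp)   # the mismatch yields replacement characters
```

The mismatched case resembles the unreadable values seen in the second query, which is consistent with UTF-8 bytes being loaded under a non-UTF-8 CODEPAGE setting.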