Friday, March 30, 2012

Problem saving/retrieving unicode characters NVARCHAR with unicode collation (java jdbc)

I'm connecting to a SQL Server 2005 database using the latest (beta) sql server driver (Microsoft SQL Server 2005 JDBC Driver 1.1 CTP June 2006) from within Java (Rational Application Developer).

The table in SQL Server database has collation Latin1_General_CI_AS and one of the columns is a NVARCHAR with collation Indic_General_90_CI_AS. This should be a Unicode only collation. However when storing for instance the following String:

€_£_ùúü?ùúü?_???????£????_???_??????_прстуф_????
... it is saved with ? for all unicode characters as follows (when looking in the database):
€_£_ùúü?ùúü?_???????£????_???_?_?_?

The above is not correct, since all unicode characters should still be visible. When inserting the same string directly into the sql server database (without using Java) the result is ok.

Also when trying to retrieve the results again it complains about the following error within Java:

Codepage 0 is not supported by the Java environment.

Hopefully somebody has an answer for this problem. When I alter the collation of the NVARCHAR column to be Latin1_General_CI_AS as well, the data can be stored and retrieved however then of course the unicode specific characters are lost and results into ? So in that case the output is as described above (ie €_£_ùúü?ùúü?_???????£????_???_?_?_?)

We would like to be able to persist and retrieve unicode characters in a SQL Server database using the correct JDBC Driver. We achieved this result already with an Oracle UTF8 database. But we need to be compliant with a SQL Server database as well. Please help.

Thanks in advance for your help.So far nobody replied on my question. unfortunately.
Please help me to sort out this problem.

Bottom line we just want to be able to save and retrieve Unicode characters in SQL server within our Java application. We want to do this by setting the NVARCHAR columns in all tables to a Unicode collation.

Please help.
Thanks.|||

Hugo,

Can you tell me whether you have sendStringParametersAsUnicode set to "false" in your connection properties?

Thank you,

--David Olix

JDBC Development

|||

Hi David,

Thanks for your reply. I can ensure that I didn't set the sendStringParametersAsUnicode to "false". This parameter is default set to "true" so I didn't change it in the connection properties.

I found out that the unicode Collation I used "Indic_General_90_CI_AS" had a Codepage 0 in my SQL Server 2005 installation. In fact all unicode only collations do have a codepage 0 if I check them all using the following command:

SELECT COLLATIONPROPERTY('Indic_General_90_CI_AS', 'CodePage') as CodePage

Is this perhaps related to the problem I have in Java? Because in there the exception explains about the fact that the codepage is 0.

I really hope you can help me to sort this out.

Thanks in advance for your help David.

Hugo

|||Hi David,

I found the solution. When updating a SQL Server 2005 NVARCHAR column, you have to use the following syntax:

UPDATE <table> SET <column> = N'<value>'

So for example:
UPDATE customer SET id = '€_???????£????_???_??????_прстуф_????'

Has to be this for a SQL Server connection:

UPDATE customer SET id = N'€_???????£????_???_??????_прстуф_????'


The prefix N before the value is required to make sure to save the string in Unicode format. I did not do this so far. Now I'm able to store and retrieve data into a SQL Server database and my collatin is now Latin1_General_CI_AS since this supports Unicode as well.

Hugo|||

hi hugo

this is gr8 thread

i have facing a problem

i want to insert indic values in a table in SQL server

the update statement u mentioned is useful

UPDATE <table> SET <column> = N'<value>'

but when i use INSERT INTO <table> VALUES(N'<value>')

can u tell me y doesnt this work

tahnx in advance.

No comments:

Post a Comment