Wednesday, October 31, 2012

Gibberish when inserting a non-English string into a CouchBase JSON document

When I store non-English characters as a string value in CouchBase, I get gibberish:

"name": " ״§„ˆ״±Š״© …״×״±״¬… †״µˆ״µ …״¬״§†Š ״¥„‰ ",

Part of the ASP.NET code:


...
name = " الفورية مترجم نصوص مجاني إلى ",
...

client.StoreJson<Comparison>(StoreMode.Add, "2", box);


So what I am trying to solve now is how to store UTF-8 non-English characters in a JSON document in CouchBase.

Solution:

I'm not sure that this is the best way to do it, but what I did is:

name = Uri.EscapeDataString(" الفورية مترجم نصوص مجاني إلى "),

I escaped the string before assigning it to the class member, and therefore before serializing it. I can then get the data back in Arabic characters by calling Uri.UnescapeDataString(c.name).
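
Here is a minimal sketch of that escape/unescape round trip on its own, without CouchBase involved. The class name EscapeRoundTrip is only for illustration. Uri.EscapeDataString and Uri.UnescapeDataString are the standard System.Uri methods.

```csharp
using System;

class EscapeRoundTrip
{
    static void Main()
    {
        string original = " الفورية مترجم نصوص مجاني إلى ";

        // Percent-encode the string (as UTF-8 bytes) so only ASCII is stored.
        string escaped = Uri.EscapeDataString(original);

        // Reverse the encoding after reading the value back.
        string restored = Uri.UnescapeDataString(escaped);

        Console.WriteLine(escaped);              // ASCII-only, e.g. starts with %20
        Console.WriteLine(restored == original); // checks that nothing was lost
    }
}
```

Note that the escaped value is several times longer than the original text, and the document stored in CouchBase holds the escaped form, not the readable Arabic.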

I think that there is an option to put non-ASCII characters in the CouchBase document without escaping them, but I am waiting for an answer from the Stack Overflow and Couchbase forums. I hope to have a definitive answer soon.

There is also an option to store the data without escaping it: use JSON.NET instead of the built-in .NET JSON serialization. It works great.
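
A rough sketch of the JSON.NET route, assuming the Newtonsoft.Json package is referenced. The Comparison class below is a hypothetical stand-in with only the name member shown in this post, and the storing call is left as a comment because it depends on which client version you use.

```csharp
using System;
using Newtonsoft.Json;

// Hypothetical stand-in for the real Comparison class; only "name" is known from the post.
public class Comparison
{
    public string name { get; set; }
}

class JsonNetSketch
{
    static void Main()
    {
        var box = new Comparison { name = " الفورية مترجم نصوص مجاني إلى " };

        // With default settings JSON.NET leaves non-ASCII characters as they are.
        string json = JsonConvert.SerializeObject(box);
        Console.WriteLine(json);

        // Assumption: the resulting string is then stored as the document body,
        // for example with something like client.Store(StoreMode.Add, "2", json).

        // Reading it back: deserialize and compare with the original value.
        Comparison roundTripped = JsonConvert.DeserializeObject<Comparison>(json);
        Console.WriteLine(roundTripped.name == box.name);
    }
}
```

The console may not display the Arabic text correctly depending on its output encoding, so the comparison on the last line is the more reliable check.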


The question still remains whether this is the right way to do it, and whether it interferes with CouchBase functionality when I want to query the data.
