Solution
geekpower
This thread is over a year old, but I have found a working .NET solution for the bullet issue.

To summarize the problem:

Data is received in iso-8859-1 format. Some of its special characters do not display correctly when the page's charset is utf-8 (an encoding of Unicode, not a subset of it).

After converting from iso-8859-1 to utf-8, the characters are converted correctly but still do not display correctly.


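// Source encoding (iso-8859-1) and target encoding (Encoding.Unicode is UTF-16 in .NET)
Encoding iso8859 = Encoding.GetEncoding("iso-8859-1");
Encoding unicode = Encoding.Unicode;
// Get the iso-8859-1 bytes of the text, then convert them to UTF-16 bytes
byte[] srcTextBytes = iso8859.GetBytes(textToConvert);
byte[] destTextBytes = Encoding.Convert(iso8859, unicode, srcTextBytes);
// Decode the UTF-16 bytes back into a char array
char[] destChars = new char[unicode.GetCharCount(destTextBytes, 0, destTextBytes.Length)];
unicode.GetChars(destTextBytes, 0, destTextBytes.Length, destChars, 0);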

This code converts the text, bullet included, to Unicode (UTF-16 in .NET).

As pointed out by G. Dierckx:
Quote: ps: I also found something (which isn't totally related to the problem) but the "bullet" in lfs..

I retrieve from hostprogress
unicode 0095 -> http://www.fileformat.info/info/unic...0095/index.htm

While a "real" bullet should be: 2022 -> http://www.fileformat.info/info/unic...2022/index.htm

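\u0095 is decimal 149, which is where the 149 comes from. Strictly speaking, U+0095 is a C1 control character rather than a bullet (the real bullet is U+2022, as quoted above), but byte 0x95 is the bullet in Windows-1252, and browsers treat the numeric reference 149 the same way.

The html entity is &#149; (•)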
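To see why the converted text is "correct" but still does not show a bullet, it helps to look at the actual code points and bytes. The sketch below is only an illustration: the class name BulletCodePoints and both method names are made up for this post. It decodes a single iso-8859-1 byte and dumps the UTF-8 bytes of a string, using only Encoding.GetEncoding, Encoding.UTF8 and BitConverter.ToString.

```csharp
using System;
using System.Text;

// Hypothetical helper class, not part of the original post.
public static class BulletCodePoints
{
    // Decode one byte as iso-8859-1 and return the resulting code point.
    // iso-8859-1 maps every byte value straight to the same code point.
    public static int DecodeLatin1(byte b)
    {
        Encoding iso8859 = Encoding.GetEncoding("iso-8859-1");
        string s = iso8859.GetString(new byte[] { b });
        return (int)s[0];
    }

    // Return the UTF-8 bytes of a string as hyphen-separated hex.
    public static string Utf8Hex(string s)
    {
        return BitConverter.ToString(Encoding.UTF8.GetBytes(s));
    }
}
```

DecodeLatin1(0x95) returns 149 (U+0095), and Utf8Hex("\u0095") gives C2-95, while the real bullet, Utf8Hex("\u2022"), gives E2-80-A2. So the byte survives the conversion as U+0095, but that is not the code point a utf-8 page needs in order to draw a bullet.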

From another board I grabbed a snippet that manually converts special characters to HTML entities:

StringBuilder result = new StringBuilder(textToConvert.Length + (int)(textToConvert.Length * 0.1));

foreach (char c in destChars)
{
    int value = Convert.ToInt32(c);
    if (value > 127)
        result.AppendFormat("&#{0};", value);
    else
        result.Append(c);
}

return result.ToString();

Which gave me this function:
// Requires: using System; using System.Text;
public static string iso8859ToUnicode(string textToConvert)
{
    // Convert the iso-8859-1 bytes of the input to UTF-16 and decode them to chars
    Encoding iso8859 = Encoding.GetEncoding("iso-8859-1");
    Encoding unicode = Encoding.Unicode;
    byte[] srcTextBytes = iso8859.GetBytes(textToConvert);
    byte[] destTextBytes = Encoding.Convert(iso8859, unicode, srcTextBytes);
    char[] destChars = new char[unicode.GetCharCount(destTextBytes, 0, destTextBytes.Length)];
    unicode.GetChars(destTextBytes, 0, destTextBytes.Length, destChars, 0);

    // Reserve about 10% extra capacity for the entities
    StringBuilder result = new StringBuilder(textToConvert.Length + (int)(textToConvert.Length * 0.1));

    // Replace every non-ASCII character with a numeric HTML entity
    foreach (char c in destChars)
    {
        int value = Convert.ToInt32(c);
        if (value > 127)
            result.AppendFormat("&#{0};", value);
        else
            result.Append(c);
    }

    return result.ToString();
}

This successfully converted my • in iso-8859-1 to &#149; for displaying in utf-8.

I have not done extensive testing yet, so I do not know if this will work for all cases.
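For anyone who wants to try a few inputs, here is a condensed sketch of the same idea. It is not the exact function from this post: the class name EntityDemo and the method name ToNumericEntities are made up, and it decodes with GetString instead of Encoding.Convert plus GetChars, which as far as I can tell yields the same characters for iso-8859-1 input.

```csharp
using System;
using System.Text;

// Hypothetical names, chosen only for this illustration.
public static class EntityDemo
{
    // Round-trip the text through iso-8859-1, then replace every
    // character above 127 with a decimal numeric HTML entity.
    public static string ToNumericEntities(string textToConvert)
    {
        Encoding iso8859 = Encoding.GetEncoding("iso-8859-1");
        string decoded = iso8859.GetString(iso8859.GetBytes(textToConvert));

        StringBuilder result = new StringBuilder(decoded.Length + decoded.Length / 10);
        foreach (char c in decoded)
        {
            int value = Convert.ToInt32(c);
            if (value > 127)
                result.Append("&#").Append(value).Append(';');
            else
                result.Append(c);
        }
        return result.ToString();
    }
}
```

EntityDemo.ToNumericEntities("a\u0095b") returns a&#149;b, and plain ASCII text comes back unchanged. Characters that iso-8859-1 cannot represent are substituted by the encoder before the loop runs, so those are the cases worth testing separately.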