[Lazarus] Guessing the encoding of some text

José Mejuto joshyfun at gmail.com
Thu Nov 16 13:02:39 CET 2017


On 16/11/2017 at 11:25, Torsten Bonde Christiansen via Lazarus wrote:
> Hi List.
> 
> I am reading text from some .csv files, but the encoding of the files 
> is not always the same. In fact, it may vary widely, from many 
> European encodings to UTF-8 and Asian encodings.
> 

Hello,

Some years ago I wrote code that guesses the encoding and language of a 
text, but it must be trained first. The drawbacks are that it has to be 
trained on large texts and that the results are only statistical: they 
are reasonably good on fairly large inputs (around 1000 characters or 
more), so it is not suitable for single sentences.
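
For illustration only, here is a minimal Python sketch of one common way 
a trained statistical guesser can work. It is not the code mentioned 
above: the class EncodingGuesser, its methods and the byte-bigram model 
with add-one smoothing are all assumptions made for this example. Each 
candidate encoding gets a frequency table built from training text, and 
unknown bytes are assigned to the encoding whose table makes them most 
likely.

```python
import math
from collections import Counter


class EncodingGuesser:
    """Hypothetical byte-bigram model: one frequency table per encoding."""

    # Number of distinct byte pairs, used for add-one smoothing.
    PAIRS = 256 * 256

    def __init__(self):
        self.models = {}  # encoding name -> Counter of byte bigrams

    def train(self, encoding, text):
        """Encode known text with `encoding` and count adjacent byte pairs."""
        data = text.encode(encoding)
        counts = self.models.setdefault(encoding, Counter())
        counts.update(zip(data, data[1:]))

    def _score(self, counts, data):
        """Log-likelihood of `data` under one encoding's bigram table."""
        total = sum(counts.values()) + self.PAIRS
        return sum(
            math.log((counts[pair] + 1) / total)
            for pair in zip(data, data[1:])
        )

    def guess(self, data):
        """Return the trained encoding that scores `data` highest.

        Raises ValueError if no encoding has been trained yet.
        """
        return max(
            self.models,
            key=lambda enc: self._score(self.models[enc], data),
        )
```

To use it, call train() once per candidate encoding with a large sample 
of representative text, then pass the raw bytes of a file to guess(). 
With short inputs or little training data the scores of similar 
encodings end up very close together, which is why this kind of guess 
is only reliable on longer texts.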

If you are interested, I can dig through my old code and find it.

