KOI8-R is an 8-bit character encoding, designed to cover Russian, which uses the Cyrillic alphabet. It also happens to cover Bulgarian. A derivative encoding is KOI8-U, which adds Ukrainian characters.
KOI8 remains much more commonly used than ISO 8859-5, which never really caught on. Another common Cyrillic character encoding is Windows-1251. Both may eventually give way to Unicode.
In Russian, KOI8 stands for Kod Obmena Informatsiey, 8 bit (Код Обмена Информацией, 8 бит), which means "Code for Information Exchange, 8 bit".
The KOI8 character sets place the Russian Cyrillic letters in pseudo-Roman order rather than in the natural Cyrillic alphabetical order used by ISO 8859-5. Although this may seem unnatural, it has a useful property: if the 8th bit is stripped, the text can still be read (or at least deciphered) in case-reversed transliteration on an ordinary ASCII terminal. The case is reversed because lowercase Cyrillic letters fall onto uppercase Latin letters and vice versa. For instance, "Русский Текст" in KOI8-R becomes "rUSSKIJ tEKST" ("Russian Text") if the 8th bit is stripped.
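The bit-stripping behaviour described above can be reproduced with a short sketch. It assumes Python's built-in `koi8_r` codec; the function name `strip_high_bit` is made up for this illustration and is not part of any standard.

```python
def strip_high_bit(text: str) -> str:
    """Encode text as KOI8-R, clear the 8th bit of every byte, read the result as ASCII."""
    koi8_bytes = text.encode("koi8_r")
    # 0x7F keeps the low seven bits and drops the high (8th) bit.
    stripped = bytes(b & 0x7F for b in koi8_bytes)
    return stripped.decode("ascii")


print(strip_high_bit("Русский Текст"))  # rUSSKIJ tEKST
```

ASCII characters such as the space already have the 8th bit clear, so they pass through unchanged.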
In the table above, 20 is the regular SPACE character and 9A is the NO-BREAK SPACE (both values are hexadecimal).
Although RFC 1489 says that character 95 (hexadecimal) should be U+2219 (∙), it may also be mapped to U+2022 (•) to match the bullet character in Windows-1251.
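These code points can be inspected with a minimal sketch, assuming Python's built-in `koi8_r` and `cp1251` codecs. The helper `decode_koi8r_with_cp1251_bullet` is a hypothetical name introduced here to show one way of applying the alternative bullet mapping; it is not a standard function.

```python
# Decode single bytes to see which Unicode characters the codecs assign.
KOI8_SPACE = bytes([0x20]).decode("koi8_r")     # regular SPACE
KOI8_NBSP = bytes([0x9A]).decode("koi8_r")      # NO-BREAK SPACE
KOI8_BULLET = bytes([0x95]).decode("koi8_r")    # byte 95 in KOI8-R
CP1251_BULLET = bytes([0x95]).decode("cp1251")  # byte 95 in Windows-1251


def decode_koi8r_with_cp1251_bullet(data: bytes) -> str:
    """Hypothetical helper: decode KOI8-R, then map U+2219 to U+2022."""
    # Byte 0x95 is the only KOI8-R byte that decodes to U+2219,
    # so replacing the character afterwards affects only that byte.
    return data.decode("koi8_r").replace("\u2219", "\u2022")


for name, ch in [("20", KOI8_SPACE), ("9A", KOI8_NBSP), ("95", KOI8_BULLET)]:
    print(name, f"U+{ord(ch):04X}")
```

Python's `koi8_r` codec follows the RFC 1489 assignment for byte 95, so the substitution has to be applied explicitly when the Windows-1251 style bullet is wanted.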
External links
- RFC 1489 (http://www.faqs.org/rfcs/rfc1489.html)