KOI8-R

KOI8-R - Definition

KOI8-R is an 8-bit character encoding, designed to cover Russian, which uses the Cyrillic alphabet. It also happens to cover Bulgarian. A derivative encoding is KOI8-U, which adds Ukrainian characters.
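The coverage described above can be checked with a short Python sketch. It assumes the `koi8_r` and `koi8_u` codecs that ship with Python's standard library; the helper `encodable` and the sample words are illustrative choices, not part of any standard.

```python
# Sample words chosen for illustration only.
russian = "Привет"
bulgarian = "България"
ukrainian = "Україна"  # contains the letter "ї", which is outside KOI8-R


def encodable(text: str, codec: str) -> bool:
    """Return True if every character of text exists in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


for word in (russian, bulgarian, ukrainian):
    print(word, "KOI8-R:", encodable(word, "koi8_r"),
          "KOI8-U:", encodable(word, "koi8_u"))

# Being an 8-bit encoding, KOI8-R uses exactly one byte per character.
print(len(russian.encode("koi8_r")) == len(russian))
```

The Russian and Bulgarian words should encode in both codecs, while the Ukrainian word should encode only in KOI8-U.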

KOI8 remains much more commonly used than ISO 8859-5, which never really caught on. Another common Cyrillic character encoding is Windows-1251. Both may eventually give way to Unicode.
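To make the difference between these encodings concrete, the sketch below encodes one Cyrillic letter in each of them and shows one way to move KOI8-R data to Unicode. The codec names are those of Python's standard library; the function name `koi8r_to_utf8` is a hypothetical helper introduced for this example.

```python
letter = "А"  # CYRILLIC CAPITAL LETTER A, U+0410

# The same letter has a different byte value in each legacy encoding.
encoded = {codec: letter.encode(codec)
           for codec in ("koi8_r", "iso8859_5", "cp1251", "utf-8")}
for codec, raw in encoded.items():
    print(f"{codec:10} {raw.hex()}")


def koi8r_to_utf8(data: bytes) -> bytes:
    """Transcode KOI8-R bytes to UTF-8 by going through a Unicode string."""
    return data.decode("koi8_r").encode("utf-8")


print(koi8r_to_utf8(b"\xe1").decode("utf-8"))  # А
```

Because the byte values differ, text must be decoded with the encoding it was actually written in; decoding with the wrong one produces readable-looking but incorrect Cyrillic.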

In Russian, KOI8 stands for Kod Obmena Informatsiey, 8 bit (Код Обмена Информацией, 8 бит), which means "Code for Information Exchange, 8 bit".

The KOI8 character sets have the property that the Russian Cyrillic letters are in pseudo-Roman order rather than the natural Cyrillic alphabetical order as in ISO 8859-5. Although this may seem unnatural, it has the useful property that if the 8th bit is stripped, the text can still be read (or at least deciphered) in case-reversed transliteration on an ordinary ASCII terminal. For instance, "Русский Текст" in KOI8-R becomes rUSSKIJ tEKST ("Russian Text") if the 8th bit is stripped.
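The bit-stripping property can be reproduced in a few lines of Python. This is a minimal sketch assuming the standard `koi8_r` and `iso8859_5` codecs; `strip_high_bit` is a name made up for this example.

```python
def strip_high_bit(data: bytes) -> str:
    """Clear the 8th (most significant) bit of every byte and read the result as ASCII."""
    return bytes(b & 0x7F for b in data).decode("ascii")


text = "Русский Текст"
stripped = strip_high_bit(text.encode("koi8_r"))
print(stripped)  # rUSSKIJ tEKST

# Sorting by byte value shows the ordering difference:
# KOI8-R follows the Latin transliteration, ISO 8859-5 the Cyrillic alphabet.
letters = "абвгд"
by_koi8r = "".join(sorted(letters, key=lambda c: c.encode("koi8_r")))
by_iso = "".join(sorted(letters, key=lambda c: c.encode("iso8859_5")))
print(by_koi8r)  # абдгв
print(by_iso)    # абвгд
```

The case reversal appears because lowercase Cyrillic letters sit in rows Cx and Dx, which map onto the uppercase ASCII rows 4x and 5x once the top bit is cleared, and uppercase Cyrillic letters in rows Ex and Fx map onto the lowercase rows 6x and 7x.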

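KOI8-R
     x0   x1   x2   x3   x4   x5   x6   x7   x8   x9   xA   xB   xC   xD   xE   xF
0x   (unused)
1x   (unused)
2x   SP   !    "    #    $    %    &    '    (    )    *    +    ,    -    .    /
3x   0    1    2    3    4    5    6    7    8    9    :    ;    <    =    >    ?
4x   @    A    B    C    D    E    F    G    H    I    J    K    L    M    N    O
5x   P    Q    R    S    T    U    V    W    X    Y    Z    [    \    ]    ^    _
6x   `    a    b    c    d    e    f    g    h    i    j    k    l    m    n    o
7x   p    q    r    s    t    u    v    w    x    y    z    {    |    }    ~
8x   ─    │    ┌    ┐    └    ┘    ├    ┤    ┬    ┴    ┼    ▀    ▄    █    ▌    ▐
9x   ░    ▒    ▓    ⌠    ■    ∙    √    ≈    ≤    ≥    NBSP ⌡    °    ²    ·    ÷
Ax   ═    ║    ╒    ё    ╓    ╔    ╕    ╖    ╗    ╘    ╙    ╚    ╛    ╜    ╝    ╞
Bx   ╟    ╠    ╡    Ё    ╢    ╣    ╤    ╥    ╦    ╧    ╨    ╩    ╪    ╫    ╬    ©
Cx   ю    а    б    ц    д    е    ф    г    х    и    й    к    л    м    н    о
Dx   п    я    р    с    т    у    ж    в    ь    ы    з    ш    э    щ    ч    ъ
Ex   Ю    А    Б    Ц    Д    Е    Ф    Г    Х    И    Й    К    Л    М    Н    О
Fx   П    Я    Р    С    Т    У    Ж    В    Ь    Ы    З    Ш    Э    Щ    Ч    Ъ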

In the table above, 20 is the regular SPACE character, and 9A is the NO-BREAK SPACE.

Although RFC 1489 says that character 95 should be U+2219 (∙), it may also be U+2022 (•) to match the bullet character in Windows-1251.
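The code points mentioned above can be inspected directly. The sketch below relies on Python's built-in `koi8_r` and `cp1251` codecs; it shows how that particular implementation maps the bytes and says nothing about other software. The variable names are illustrative.

```python
import unicodedata

# The two space characters discussed above.
space = b"\x20".decode("koi8_r")
nbsp = b"\x9a".decode("koi8_r")
print(unicodedata.name(space))  # SPACE
print(unicodedata.name(nbsp))   # NO-BREAK SPACE

# Byte 95 decoded as KOI8-R (Python follows RFC 1489 here) and as Windows-1251.
rfc_char = b"\x95".decode("koi8_r")
win_char = b"\x95".decode("cp1251")
print(f"U+{ord(rfc_char):04X}", unicodedata.name(rfc_char))
print(f"U+{ord(win_char):04X}", unicodedata.name(win_char))
```

A converter that prefers the Windows-1251-compatible bullet would need its own mapping for byte 95, since the built-in codec does not offer that variant.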

External links

  • RFC 1489 (http://www.faqs.org/rfcs/rfc1489.html)

Copyright 2009 WordIQ.com
This article is licensed under the GNU Free Documentation License. It uses material from the corresponding Wikipedia article.