UTF-8: change upper limit from 0x1FFFFF to 0x10FFFF
author John Marino <draco@marino.st>
Sun, 9 Aug 2015 00:12:35 +0000 (02:12 +0200)
committer John Marino <draco@marino.st>
Sun, 9 Aug 2015 00:12:35 +0000 (02:12 +0200)
commit 6c455f217ffe428181a839ff6bedcbfc3589215c
tree eba0f2844b87a90bae24581f7095ad2990579066
parent bde8a1ee4a2e115e1e9670f9b48b626be2919f7e

In November 2003, RFC 3629 lowered the UTF-8 upper bound from 0x7FFFFFFF
(sequences of up to 6 bytes) to 0x10FFFF in order to match the constraints
of UTF-16 encoding. Last week, 5- and 6-byte sequences were marked illegal;
now we also mark a large portion of 4-byte sequences as illegal.
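
As a rough illustration of what the new limit means for 4-byte sequences,
here is a minimal sketch of a decoder that rejects anything above 0x10FFFF.
It is not the code in lib/libc/locale/utf8.c; the names utf8_decode and
UTF8_MAX_CODEPOINT are hypothetical, and the surrogate-range check that
RFC 3629 also requires is left out to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UTF8_MAX_CODEPOINT 0x10FFFF	/* RFC 3629 upper bound */

/*
 * Hypothetical helper (not the libc implementation): decode one UTF-8
 * sequence from s, where n bytes are available.  Returns the number of
 * bytes consumed, or 0 if the sequence is illegal or truncated.
 */
static size_t
utf8_decode(const unsigned char *s, size_t n, uint32_t *out)
{
	uint32_t cp, min;
	size_t len, i;

	assert(s != NULL && out != NULL);
	if (n == 0)
		return (0);

	if (s[0] < 0x80) {
		len = 1; cp = s[0]; min = 0;
	} else if ((s[0] & 0xE0) == 0xC0) {
		len = 2; cp = s[0] & 0x1F; min = 0x80;
	} else if ((s[0] & 0xF0) == 0xE0) {
		len = 3; cp = s[0] & 0x0F; min = 0x800;
	} else if ((s[0] & 0xF8) == 0xF0) {
		len = 4; cp = s[0] & 0x07; min = 0x10000;
	} else {
		/* Stray continuation byte, or a 5-/6-byte lead (0xF8..0xFF). */
		return (0);
	}
	if (n < len)
		return (0);

	for (i = 1; i < len; i++) {
		if ((s[i] & 0xC0) != 0x80)
			return (0);	/* not a continuation byte */
		cp = (cp << 6) | (s[i] & 0x3F);
	}

	if (cp < min)
		return (0);	/* overlong encoding */
	if (cp > UTF8_MAX_CODEPOINT)
		return (0);	/* 4-byte sequence above 0x10FFFF */

	*out = cp;
	return (len);
}
```

With this check, F4 8F BF BF (U+10FFFF) is the largest sequence accepted.
F4 90 80 80 (0x110000) and F7 BF BF BF (0x1FFFFF, the old limit) are both
rejected even though they are well-formed 4-byte bit patterns.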
lib/libc/locale/utf8.c