locales: Remove two more aliases (ja_JP.eucjp and en_US.ISO-8859-1) There was an earlier effort to define common locales (arising from Linux's case-and-hyphen insensitivity) and they were going to be removed when our locale handling followed Linux behavior, but it was decided that we would and should not follow Linux's example. Since then and through collaborative discussions with FreeBSD, it was decided these common-but-incorrect-for-BSD aliases should not continue. FreeBSD will never get them and I'm removing ours.
locales: Remove symlinks UTF8 => UTF-8 In retrospect, having an alias for UTF-8 does not bring any real benefit and it can cause confusion. Let's remove these *.UTF8 locale symlinks, which brings us closer to the convention of the other BSDs. The GCC testsuite is also removing utf8 and UTF8 locales for portability with BSD.
locale polishing: pl_PL, sk_SK, sl_SI, tr_TR, uk_UA, sv_FI changes

pl_PL: Change shortname alias from ISO8859-2 to UTF-8. The former cannot show the Polish currency symbol.
sk_SK: Same as pl_PL, but for the Euro symbol.
sl_SI: Same as pl_PL, but for the Euro symbol.
tr_TR: Switch shortname to UTF-8 to handle the currency symbol.
uk_UA: Same as tr_TR.
sv_FI: Switch shortname to ISO8859-15 for consistency in Europe.
locale polishing: lt_LT, et_EE, and en_IE changes

lt_LT: Remove ISO8859-4. If an ISO encoding is needed, -13 would be used instead.
et_EE: Estonia could use ISO8859-4 (not present), the newer ISO8859-13 (not present) or ISO8859-15 (present). To be uniform with other European countries that use -15, change the default from UTF-8 to ISO8859-15. Do not bring in -4 or -13.
en_IE: Ireland only had UTF-8. To be uniform with the UK and Western Europe, bring in ISO8859-15 and set that as the alias for the shortname. Do not bring in -1.
locales: Switch several shortnames away from ISO-8859-1 Having the short names of western European countries aliased to ISO-8859-1 is against POLA. Once the decision not to use UTF-8 is made, the short names should be linked to ISO-8859-15, the one with the Euro symbol and other common characters that ISO-8859-1 lacks. While here, change en_PH to UTF-8 since the Peso sign can't be represented, and do the same for Costa Rica and its Colón currency.
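The encoding argument above can be checked with a minimal Python sketch. It relies only on Python's built-in codecs, not on the system locale database; the helper name `can_encode` and the table names are hypothetical, chosen for illustration.

```python
# Sketch: which currency signs are representable in which codeset.
# Assumption: Python's built-in codec tables stand in for the locale
# charmaps; nothing here touches the installed locales.

def can_encode(symbol: str, codec: str) -> bool:
    """Return True if `symbol` is representable in `codec`."""
    try:
        symbol.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

# Currency signs named in the commit message (Unicode code points).
SYMBOLS = {
    "Euro sign": "\u20ac",
    "Peso sign": "\u20b1",
    "Colon sign": "\u20a1",
}
CODECS = ["iso8859-1", "iso8859-15", "utf-8"]

def coverage() -> dict:
    """Map each symbol name to {codec: representable?}."""
    return {
        name: {codec: can_encode(sym, codec) for codec in CODECS}
        for name, sym in SYMBOLS.items()
    }

if __name__ == "__main__":
    for name, row in coverage().items():
        print(name, row)
```

The Euro sign encodes in ISO-8859-15 but not in ISO-8859-1, while the Peso and Colón signs encode in neither, which is consistent with moving en_PH and es_CR to UTF-8.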
locales: Remove new ISO-8859-15@euro symlinks We're in the process of thinking hard about locale symlinks. These new ISO-8859-15@euro symlinks (to their ISO8859-15 counterparts) were a bad idea that I copied from Linuxland. Let's reverse this before it gets released in DragonFly. It doesn't even make sense. @euro is basically shorthand for ISO8859-15, so what value does it have to "modify" that encoding? Stop the Linuxanity now ...
Add 17 new locales and really remove Latin

Now that locale definitions are generated, it's easy to add more. I've added the following new locale definitions:
* en_HK ISO-8859-1 (Hong Kong/English)
* en_HK UTF-8
* en_PH ISO-8859-1 (Philippines/English)
* en_PH UTF-8
* en_SG ISO-8859-1 (Singapore/English)
* en_SG UTF-8
* es_AR ISO-8859-1 (Argentina/Spanish)
* es_AR UTF-8
* es_CR ISO-8859-1 (Costa Rica/Spanish)
* es_CR UTF-8
* es_MX ISO-8859-1 (Mexico/Spanish)
* es_MX UTF-8
* se_FI UTF-8 (Finland/Northern Sami)
* se_NO UTF-8 (Norway/Northern Sami)
* sv_FI ISO-8859-1 (Finland/Swedish)
* sv_FI ISO-8859-15
* sv_FI UTF-8

There were a few places where la_LN (Latin) was hidden, so I've really removed it now.
locales: Change defaults for territory-only locales From research, it appears that the default for territory-only locales (e.g. en_US, de_DE) is to use ISO-8859-X (where X is not 15). The @euro modifier is the alias for ISO-8859-15. Since there is no apparent standard, I am going to switch the defaults from aliasing to UTF-8 to ISO-8859 where available. For the non-latin character sets, I left these at UTF-8 rather than try to decide on a different default. Incidentally, this enables a number of gcc libstdc++ tests to pass as well.
Import generated locales (time, money, msg, numeric) All the definitions were generated by the "new" tool located at src/tools/tools/locale. It uses CLDR version 2.0.1 and Unicode release 8.0.0. The tool is smart enough to eliminate duplicates -- every one of these real files is unique. The makefiles are also generated and are much simpler.
Move some locales before upcoming overhauls Several locales are obsolete and have misleading names. This moves many of them at once. An entry in UPDATING tells users of affected locales what the new locale options are. This is necessary because the upcoming locale updates are going to be generated and we can avoid carrying ancient aliases over. Also note that Serbia's change may not be complete until all of the locales are updated.
Remove no_NO (only use nb_NO and nn_NO) There are only two variations of Norwegian, Bokmål (literally "book language") and Nynorsk (literally "new Norwegian"), both of which are equally official. These are established by law and governmental policy. The Norwegian Language Council recommends the terms "Norwegian Bokmål" and "Norwegian Nynorsk" in English. The no_NO variant is a holdover from when nb_NO and nn_NO were created. In glibc, the old no_NO was kept as an alias to nb_NO. This option is considered confusing and probably should have been removed years ago.
locales: Add 66 "@euro" symlinks, and UTF8 alias While not documented, it appears that modifiers work. The only widely used modifier is "@euro", which is an alias for the ISO 8859-15 codeset, which contains the Euro currency symbol along with 7 other new symbols as compared to ISO 8859-1. For all 33 localization sets that use la_LN.ISO8859-15 for LC_CTYPE, I'm creating an @euro and a .ISO-8859-15@euro symlink to them. These are still found on Linux, and the .ISO-8859* extension is the chosen Linux format (this is unfortunately not standardized, thus BSD and Linux differ). Thus any use of @euro will probably also refer to the Linux format, and that's why I chose it. For the same reason, .UTF8 versions (Linux) are being symlinked to DragonFly's .UTF-8 versions. Symlinks are cheap.
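The "Euro plus 7 other new symbols" count can be reproduced with a short Python sketch that compares the two codesets byte by byte. It assumes Python's built-in `iso8859-1` and `iso8859-15` codecs match the respective standards; the function name `latin9_changes` is hypothetical.

```python
# Sketch: list the byte positions where ISO 8859-15 differs from
# ISO 8859-1. Assumption: Python's codec tables mirror the standards.

def latin9_changes() -> list:
    """Return (byte, iso8859-1 char, iso8859-15 char) for each difference."""
    changes = []
    for byte in range(0xA0, 0x100):  # the two sets agree below 0xA0
        raw = bytes([byte])
        old = raw.decode("iso8859-1")
        new = raw.decode("iso8859-15")
        if old != new:
            changes.append((byte, old, new))
    return changes

if __name__ == "__main__":
    for byte, old, new in latin9_changes():
        print(f"0x{byte:02X}: {old!r} -> {new!r}")
```

The comparison yields eight replaced positions (0xA4, 0xA6, 0xA8, 0xB4, 0xB8, 0xBC, 0xBD, 0xBE), with the Euro sign taking over 0xA4 from the generic currency sign.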
Create short name locales

On other (non-BSD?) platforms, it's possible to use a locale such as "en_GB". Up until now, this did not work on DragonFly. It required a codeset as well, so if "en_GB.UTF-8" wasn't specified, setlocale would fail. Every locale except two has a version that uses the UTF-8 codeset, so a symlink was created to these to create these territory locales. The other two are:
hi_IN => hi_IN.ISCII-DEV
la_LN => la_LN.ISO8859-1

This should improve cross-platform compatibility, and it will lead to a change in gcc50 libstdc++ locale handling. While here, set the "share" directory to build in parallel rather than serially.
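The setlocale behavior described above can be probed from user space with a minimal Python sketch. It uses the standard `locale` module, which wraps the C library's `setlocale`; the helper name `locale_available` is hypothetical, and the result depends entirely on which locales are installed on the machine where it runs.

```python
# Sketch: check whether a locale name is accepted by setlocale().
# Assumption: LC_CTYPE acceptance is a good enough proxy for "the
# locale exists"; results vary with the installed locale set.
import locale

def locale_available(name: str) -> bool:
    """Return True if setlocale(LC_CTYPE, name) succeeds."""
    saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    try:
        locale.setlocale(locale.LC_CTYPE, name)
        return True
    except locale.Error:
        return False
    finally:
        # Restore the previous setting so the probe has no side effects.
        locale.setlocale(locale.LC_CTYPE, saved)

if __name__ == "__main__":
    for name in ("en_GB", "en_GB.UTF-8", "C"):
        print(name, locale_available(name))
```

On a system without short name locales, the probe would be expected to reject "en_GB" while accepting "en_GB.UTF-8" if that locale is installed.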