The ICU project is under the stewardship of The Unicode Consortium.
International Components for Unicode (ICU) is an open-source project of mature C/C++ and Java libraries for Unicode support, software internationalization, and software globalization. ICU is widely portable to many operating systems and environments. It gives applications the same results on all platforms and across C, C++, and Java software. The ICU project is a technical committee of the Unicode Consortium and is sponsored, supported, and used by IBM and many other companies.[2]
ICU provides the following services: Unicode text handling, full character properties, and character set conversions; Unicode regular expressions; full Unicode sets; character, word, and line boundaries; language-sensitive collation and searching; normalization, uppercase and lowercase conversion, and script transliterations; comprehensive locale data and resource bundle architecture via the Common Locale Data Repository (CLDR); multiple calendars and time zones; and rule-based formatting and parsing of dates, times, numbers, currencies, and messages. ICU historically provided a complex text layout service for Arabic, Hebrew, Indic, and Thai, but it was deprecated in version 54 and completely removed in version 58 in favor of HarfBuzz.[3]
Recent Releases
73.2 14 Jul 2023 04:16
minor feature:
We are pleased to announce the release of Unicode ICU 73.2. It updates to CLDR 43.1 locale data with various additions and corrections. These are maintenance releases for ICU 73 and CLDR 43, with limited sets of bug fixes and no API or structural changes.
There are significant changes for GB18030-2022 compliance support:
CLDR extends the support for short Chinese sort orders to cover some additional, required characters for Level 2. This is carried over into ICU collation.
ICU has a modified character conversion table, mapping some GB18030 characters to Unicode characters that were encoded after GB18030-2005.
There are also changes for compatibility:
There are optional variants of time formats with AM/PM (only for English) using ASCII spaces in CLDR that can also be used in ICU via custom data generation. This is intended to help certain implementers transition to the improved patterns, which have used a narrow no-break space between the time and AM/PM since CLDR 42.
For how to generate ICU data with this option, look for alt="ascii" in main/tools/cldr/cldr-to-icu/README.md.
The changes to the word segmentation behavior of the @ sign that were in CLDR 42 (ICU 72) have been reverted. These caused problems for certain parsers that did not expect @ to join to letters.
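The time format item above concerns the narrow no-break space (U+202F) that CLDR has placed between the time and AM/PM since CLDR 42. As a rough illustration of how a consumer of such strings might tolerate both forms, here is a minimal Python sketch using only the standard library, not ICU itself. The function name parse_english_time and the fixed "%I:%M %p" pattern are assumptions made for this example, and it assumes the default C locale for AM/PM names.

```python
from datetime import datetime

# NARROW NO-BREAK SPACE, used between the time and AM/PM since CLDR 42.
NNBSP = "\u202f"
# NO-BREAK SPACE, handled as well on the assumption that it may also appear.
NBSP = "\u00a0"


def parse_english_time(text: str) -> datetime:
    """Parse an English 12-hour time with either kind of space before AM/PM.

    Hypothetical helper for illustration only; it is not part of ICU or CLDR.
    """
    # Replace the non-ASCII spaces with an ASCII space before parsing,
    # so the same pattern handles old-style and new-style strings.
    normalized = text.replace(NNBSP, " ").replace(NBSP, " ")
    return datetime.strptime(normalized, "%I:%M %p")


ascii_form = parse_english_time("3:45 PM")
nnbsp_form = parse_english_time("3:45" + NNBSP + "PM")
```

Both calls should yield the same time of day. Normalizing on input is only one possible approach; generating ICU data with the ASCII variants, as described above, addresses the same problem on the output side.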
ICU 73.2 and CLDR 43.1 include several other bug fixes, including fixes to person name formatting and Cyrillic transforms.
For details, please see https://icu.unicode.org/download/73.
Note: The prebuilt WinARM64 binaries below should be considered alpha/experimental.
65.1 05 Oct 2019 01:25
minor feature: