Details
-
Story
-
Resolution: Unresolved
-
Major
-
None
-
None
Description
NLS Format Analysis: Wine locale.c vs. ReactOS Infrastructure
Background
`dll/win32/kernelbase/wine/locale.c` is the Wine-derived implementation of the kernelbase locale API. It targets the Windows Vista+ (NT 6.0+) NLS architecture, which uses a substantially different set of NLS section types, file formats, and kernel APIs compared to what ReactOS currently implements (Windows XP/2003, NT 5.x).
The tool `sdk/tools/txt2nls/main.cpp` (note: the file is `.cpp`, not `.c`) converts `.txt` codepage definitions to binary NLS files. It is only one part of the full NLS infrastructure.
NtGetNlsSectionPtr Section Type Mapping
The central dispatch mechanism is `NtGetNlsSectionPtr(type, id, unk, &ptr, &size)`.
The type numbers differ almost completely between the two eras: of the types listed below, only type 11 has the same meaning in both.
Windows Vista+ / Wine (correct mapping, confirmed by the Wine test)
| Type | File / purpose | Notes |
|---|---|---|
| 9 | sortdefault (sort keys, casemap, ctypes, sort GUIDs) | bogus ID → STATUS_INVALID_PARAMETER_1 |
| 10 | casemap table (l_intl.nls format) | bogus ID → STATUS_INVALID_PARAMETER_1 or STATUS_UNSUCCESSFUL |
| 11 | codepage files (`c_*.nls`), keyed by codepage number | bogus ID → STATUS_OBJECT_NAME_NOT_FOUND |
| 12 | Unicode normalization files, keyed by normalization form | bogus ID → STATUS_OBJECT_NAME_NOT_FOUND |
| 13 | unknown sort-related | bogus ID → STATUS_INVALID_PARAMETER_1 |
| 14 | unknown | success or STATUS_INVALID_PARAMETER_1 depending on Windows version |
| all others (0–8, 15+) | invalid | → STATUS_INVALID_PARAMETER_1 |
ReactOS Current (`subsystems/win/basesrv/nls.c`)
| Type | File | Status |
|---|---|---|
| 1 | unicode.nls | implemented |
| 2 | locale.nls | implemented |
| 3 | ctype.nls | implemented |
| 4 | sortkey.nls | implemented |
| 5 | sorttbls.nls | implemented |
| 6 | c_437.nls | implemented |
| 7 | c_1252.nls | implemented |
| 8 | l_except.nls | implemented |
| 9 | - | STATUS_NOT_IMPLEMENTED |
| 10 | - | STATUS_NOT_IMPLEMENTED |
| 11 | codepage files (dynamic) | implemented, matches Vista+ |
| 12 | geo.nls | implemented, CONFLICTS with Vista+ normalization |
Key observations:
- Types 1–8 are the old XP/2003 CSR-based mechanism; they do not exist in Vista+.
- Type 9 is stubbed out; `locale.c` calls it unconditionally on startup.
- Type 12 in ReactOS maps to `geo.nls`; in Vista+ it maps to normalization files. This is a direct conflict that will cause `IsNormalizedString`/`NormalizeString` to receive geo data instead of normalization tables.
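To make the overlap concrete, the two tables above can be encoded side by side and compared. This is a minimal sketch, not ReactOS or Wine code: the names `legacy_section_name`, `vista_section_name` and `count_type_conflicts` are hypothetical, and the strings only restate the table rows.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: what ReactOS basesrv serves today for a type
 * (restates the "ReactOS Current" table; NULL = not implemented). */
static const char *legacy_section_name(unsigned type)
{
    switch (type) {
    case 1:  return "unicode.nls";
    case 2:  return "locale.nls";
    case 3:  return "ctype.nls";
    case 4:  return "sortkey.nls";
    case 5:  return "sorttbls.nls";
    case 6:  return "c_437.nls";
    case 7:  return "c_1252.nls";
    case 8:  return "l_except.nls";
    case 11: return "codepage";
    case 12: return "geo.nls";
    default: return NULL;
    }
}

/* Hypothetical helper: what a Vista+ caller expects for a type
 * (restates the "Windows Vista+ / Wine" table; NULL = invalid type). */
static const char *vista_section_name(unsigned type)
{
    switch (type) {
    case 9:  return "sortdefault";
    case 10: return "casemap";
    case 11: return "codepage";
    case 12: return "normalization";
    case 13: return "unknown sort-related";
    case 14: return "unknown";
    default: return NULL;
    }
}

/* Counts types that both schemes define with different meanings and
 * stores the last such type in *last_conflict. */
static unsigned count_type_conflicts(unsigned *last_conflict)
{
    unsigned type, count = 0;

    assert(last_conflict != NULL);
    for (type = 1; type <= 14; type++) {
        const char *old_name = legacy_section_name(type);
        const char *new_name = vista_section_name(type);
        if (old_name && new_name && strcmp(old_name, new_name) != 0) {
            count++;
            *last_conflict = type;
        }
    }
    return count;
}
```

With the tables as given, the only type defined by both schemes with different meanings is 12; types 9 and 10 show up as gaps rather than conflicts.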
Function Stubs in ntdll
From `dll/ntdll/def/ntdll.spec`:
| Function | Status | Used by locale.c |
|---|---|---|
| NtGetNlsSectionPtr | -stub -version=0x600+ | Yes — types 9, 11, 12 |
| NtInitializeNlsFiles | -stub -version=0x600+ | Indirectly (Wine test uses it) |
| RtlNormalizeString | -stub -version=0x600+ | Yes — called by NormalizeString |
| RtlGetLocaleFileMappingAddress | not exported at all | Yes — called in load_locale_nls() |
RtlGetLocaleFileMappingAddress is the most critical missing export.
load_locale_nls() calls it during init_locale() and if it returns failure the entire locale subsystem fails to initialise.
NLS File Format Differences
1. locale.nls — Major format change
Old format (ReactOS, XP/2003):
- Binary table, no magic header, ~209 KB
- Per-LCID records in an undocumented 2003-era layout
- Read by old `kernel32` via NtGetNlsSectionPtr(2, ...)
New format (Vista+, required by Wine `locale.c`):
- Accessed via RtlGetLocaleFileMappingAddress()
- Starts with NLS_LOCALE_HEADER at offset 0 (magic `'NSDS'` at offset 0x0C)
- Followed by: NLS_LOCALE_DATA[] array, sorted LCID index, sorted LCNAME index, string pool, calendar array, and embedded geo data (new struct geo_id[] + struct geo_index[] with its own sub-header)
The NLS_LOCALE_HEADER and NLS_LOCALE_DATA types are defined only in `sdk/include/wine/winternl.h` (because only Wine code uses them today).
ReactOS NDK headers (`sdk/include/ndk/`) do not have these types. If `RtlGetLocaleFileMappingAddress` is to be implemented inside ntdll/RTL, these types will need to be added to a suitable ReactOS header.
Impact: load_locale_nls() reads locale data, geo IDs, geo index, and character maps entirely from the pointer returned by RtlGetLocaleFileMappingAddress. None of this works today on ReactOS.
Tooling gap: No tool exists in ReactOS to produce a `locale.nls` in the new format. Wine's `nls/locale.nls` (generated by its `tools/make_unicode` script) is a compatible source; alternatively, a new tool would need to consume the MS locale data.
Note also that `sdk/lib/rtl/locale.c` currently resolves LCID↔name via a hardcoded RtlpLocaleTable[] array. This approach is orthogonal to RtlGetLocaleFileMappingAddress; the two need to be reconciled.
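One possible direction for that reconciliation is to resolve LCIDs through a sorted index, as the new `locale.nls` layout does, instead of a compile-time table. The sketch below shows only the lookup idea. `lcid_index_entry`, its fields and `find_locale_index` are hypothetical and do not reproduce the real on-disk layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical index entry: NOT the real locale.nls layout. */
struct lcid_index_entry {
    uint32_t lcid;   /* key; entries are sorted ascending by this field */
    uint16_t index;  /* position in a (hypothetical) locale data array */
};

/* bsearch comparator: key is a uint32_t LCID, elem is an index entry. */
static int compare_lcid(const void *key, const void *elem)
{
    uint32_t lcid = *(const uint32_t *)key;
    const struct lcid_index_entry *entry = elem;

    if (lcid < entry->lcid) return -1;
    if (lcid > entry->lcid) return 1;
    return 0;
}

/* Returns the data-array index for lcid, or -1 if it is not listed. */
static int find_locale_index(const struct lcid_index_entry *table,
                             size_t count, uint32_t lcid)
{
    const struct lcid_index_entry *hit;

    assert(table != NULL || count == 0);
    if (count == 0) return -1;
    hit = bsearch(&lcid, table, count, sizeof(*table), compare_lcid);
    return hit ? (int)hit->index : -1;
}
```

A caller would pass a table built from the mapped file; a name index could follow the same pattern with a string comparator.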
2. `sortdefault.nls` — New file, missing in ReactOS
Old format (ReactOS, XP/2003):
- Two separate files served via old CSR types:
- `sortkey.nls` (type 4, ~262 KB): 4-byte sort weights per Unicode code point
- `sorttbls.nls` (type 5, ~21 KB): sort table metadata and filenames for locale-specific sort files (e.g. `big5.nls`, `prcp.nls`)
New format (Vista+, required by Wine `locale.c`):
- Single file `sortdefault.nls` served via NtGetNlsSectionPtr(9, 0, ...)
- Header layout (from load_sortdefault_nls()):
    struct {
        UINT sortkeys; // offset to sort key table (UINT per Unicode code point)
        UINT casemaps; // offset to casemap table (l_intl.nls format, USHORT pairs)
        UINT ctypes;   // offset to CT_CTYPE1/2/3 table
        UINT sortids;  // offset to sort ID block
    };
- After sort IDs: expansion count + struct sort_expansion[] (2×WCHAR per entry)
- After expansions: compression count + struct sort_compression[] + compression data
- After compression data: multiple-weights block + struct jamo_sort[]
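Any code that serves or consumes this file needs to sanity-check the four header offsets before dereferencing them. The following is a minimal sketch of such a check, under the assumptions that `UINT` is 32 bits, that offsets are relative to the start of the image, and that the file is in host byte order. `sortdefault_header` and `read_sortdefault_header` are illustrative names, not existing ReactOS or Wine symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the four-UINT header shown above (UINT taken as 32-bit). */
struct sortdefault_header {
    uint32_t sortkeys;
    uint32_t casemaps;
    uint32_t ctypes;
    uint32_t sortids;
};

/* Copies the header out of a mapped image and checks that every offset
 * points past the header and inside the image. Returns 1 if the header
 * looks plausible, 0 otherwise. The "past the header" rule is an
 * assumption of this sketch. */
static int read_sortdefault_header(const void *image, size_t size,
                                   struct sortdefault_header *out)
{
    uint32_t offsets[4];
    size_t i;

    assert(image != NULL && out != NULL);
    if (size < sizeof(*out)) return 0;
    memcpy(out, image, sizeof(*out));  /* avoids unaligned reads */

    offsets[0] = out->sortkeys;
    offsets[1] = out->casemaps;
    offsets[2] = out->ctypes;
    offsets[3] = out->sortids;
    for (i = 0; i < 4; i++) {
        if (offsets[i] < sizeof(*out) || offsets[i] >= size) return 0;
    }
    return 1;
}
```

The check deliberately stops at the header; validating the expansion, compression and jamo blocks would require their counts as well.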
Tooling gap: No tool exists in ReactOS to build `sortdefault.nls`. Wine generates it from its `tools/make_unicode` script using Unicode data.
3. Normalization NLS files — Missing entirely
Old format (ReactOS, XP/2003): Not present.
New format (Vista+, required by Wine `locale.c`):
- Four files keyed by normalization form (NormalizationC=1, D=2, KC=5, KD=6), served via `NtGetNlsSectionPtr(12, form, ...)`
- Parsed via `struct norm_table` header (defined in `locale.c`):
- File name (13 WCHARs), checksum, Unicode version, normalization form
- Offsets to: combining class table, property tables (level 1 + 2), decomposition hash + map + sequence tables, composition hash + sequence tables
- Used by RtlNormalizeString() (also stubbed)
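Because the form values are not contiguous (1, 2, 5, 6), a type 12 handler cannot simply range-check the ID. A small sketch of the validation, assuming the four values listed above are the only ones to accept; `norm_form_suffix` and `norm_form_is_composed` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>

/* Maps a normalization form ID to its short name, or NULL for an ID
 * that is not one of the four forms listed above. */
static const char *norm_form_suffix(unsigned form)
{
    switch (form) {
    case 1:  return "C";
    case 2:  return "D";
    case 5:  return "KC";
    case 6:  return "KD";
    default: return NULL;
    }
}

/* Returns 1 for the composed forms (C, KC) and 0 for the decomposed
 * forms (D, KD). The caller must pass a valid form ID. */
static int norm_form_is_composed(unsigned form)
{
    assert(norm_form_suffix(form) != NULL);
    return form == 1 || form == 5;
}
```

An ID for which `norm_form_suffix` returns NULL would be the "bogus ID" case from the type table.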
Conflict: ReactOS currently maps type 12 to `geo.nls`. The type 12 slot must be reassigned to normalization. Geo data in the new architecture is embedded in `locale.nls` itself (see section 1 above).
Tooling gap: No tool exists in ReactOS to build normalization NLS files. Wine's `nls/Normalize{C,D,KC,KD}.nls` files can serve as a source.
4. Casemap table (type 10)
Old format (ReactOS): `l_intl.nls` served via old CSR type, also referenced directly from ExpNlsSectionPointer in the kernel.
New format (Vista+): Served via NtGetNlsSectionPtr(10, 0, ...). The format is the same `l_intl.nls` USHORT-pair layout — `locale.c` explicitly notes /* casemap table, in l_intl.nls format */ for `sort.casemap`. The content of ReactOS's existing `l_intl.nls` (4870 bytes) should be compatible once type 10 is implemented.
5. Codepage NLS files (`c_*.nls`) — Minor header difference, mostly resolved
Both old and new code use the same 26-byte NLS_FILE_HEADER (13 WORDs).
The DDK header `sdk/include/ddk/ntnls.h` already defines the correct layout:
    typedef struct _NLS_FILE_HEADER {
        USHORT HeaderSize;            // = 13 (WORDs)
        USHORT CodePage;
        USHORT MaximumCharacterSize;  // 1 = SBCS, 2 = DBCS
        USHORT DefaultChar;
        USHORT UniDefaultChar;
        USHORT TransDefaultChar;      // Unicode → CP fallback: Unicode of DefaultChar
        USHORT TransUniDefaultChar;   // CP → Unicode fallback: CP of UniDefaultChar
        UCHAR LeadByte[12];
    } NLS_FILE_HEADER;
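As a quick way to check a generated `c_*.nls` file against this layout, the header can be read and validated as follows. This is a sketch under stated assumptions: fixed-width types stand in for `USHORT`/`UCHAR`, the file is in host (little-endian) byte order, and `nls_file_header` and `read_nls_file_header` are illustrative names rather than existing ReactOS functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same field order as the NLS_FILE_HEADER above, with fixed-width types. */
struct nls_file_header {
    uint16_t HeaderSize;
    uint16_t CodePage;
    uint16_t MaximumCharacterSize;
    uint16_t DefaultChar;
    uint16_t UniDefaultChar;
    uint16_t TransDefaultChar;
    uint16_t TransUniDefaultChar;
    uint8_t  LeadByte[12];
};

/* Compile-time check: 7 USHORTs + 12 bytes = 26 bytes = 13 WORDs. */
typedef char nls_header_size_check[sizeof(struct nls_file_header) == 26 ? 1 : -1];

/* Copies the header from the start of a file image and checks the two
 * fields whose valid values are stated above. Returns 1 on success. */
static int read_nls_file_header(const void *file, size_t size,
                                struct nls_file_header *out)
{
    assert(file != NULL && out != NULL);
    if (size < sizeof(*out)) return 0;
    memcpy(out, file, sizeof(*out));

    if (out->HeaderSize != 13) return 0;               /* counted in WORDs */
    if (out->MaximumCharacterSize != 1 &&
        out->MaximumCharacterSize != 2) return 0;      /* SBCS or DBCS */
    return 1;
}
```

The check covers only the header; it says nothing about whether the translation tables that follow are correct.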
Old tool (`sdk/tools/create_nls/create_nls.c`):
- Uses a different in-memory layout (`BYTE DefaultChar[2]` + `unknown1` / `unknown2`)
- Always writes `'?'` (0x003F) for TransDefaultChar and TransUniDefaultChar
- Reads data from the host OS via GetCPInfoExA — Windows-only
New tool (`sdk/tools/txt2nls/main.cpp`):
- Uses the correct layout matching `ntnls.h`
- Properly computes TransDefaultChar and TransUniDefaultChar
- Reads from portable `.txt` source files; runs cross-platform
The `txt2nls` tool already generates correct NLS files. All codepage `.nls` files in `media/nls/` are now built by `txt2nls` from the `.txt` sources in `media/nls/src/`. The `create_nls.c` tool is obsolete for this purpose.
Files not yet converted: `c_856.nls` and `c_878.nls` are listed in `media/nls/CMakeLists.txt` as static (manually generated) rather than built from `.txt` sources. They may still be in the old format, with `'?'` written for the translated default chars, and should have `.txt` sources added so they can be rebuilt by `txt2nls`.
Summary of Required Changes
Critical (locale.c cannot initialize without these)
1. Implement RtlGetLocaleFileMappingAddress in `sdk/lib/rtl/` or `dll/ntdll/`. Must map and return a pointer to a Vista+-format `locale.nls` image.
2. Implement NtGetNlsSectionPtr type 9 (sortdefault) in ntdll/ntoskrnl. Must serve `sortdefault.nls` in the new unified format.
3. Implement NtGetNlsSectionPtr type 10 (casemap) in ntdll/ntoskrnl. Can reuse `l_intl.nls` data (format is already compatible).
4. Fix NtGetNlsSectionPtr type 12 conflict in `subsystems/win/basesrv/nls.c`: change from `geo.nls` to the normalization NLS files. Geo data must instead be embedded in the new `locale.nls`.
5. Create new `locale.nls` in Vista+ format (`NLS_LOCALE_HEADER` + `NLS_LOCALE_DATA[]`). Wine's `nls/locale.nls` can serve as a base.
6. Create `sortdefault.nls` in the new unified format. Wine's `nls/sortdefault.nls` can serve as a base.
7. Add NLS_LOCALE_HEADER and NLS_LOCALE_DATA type definitions to a ReactOS NDK header (e.g. `sdk/include/ndk/rtltypes.h` or a new `sdk/include/ndk/nlstypes.h`), so that RtlGetLocaleFileMappingAddress can be implemented without depending on Wine's `winternl.h`.
Important (affects correctness)
8. Implement RtlNormalizeString in `sdk/lib/rtl/`. Requires normalization NLS files served via NtGetNlsSectionPtr(12, ...).
9. Create normalization NLS files (`Normalize{C,D,KC,KD}.nls`). Wine's `nls/Normalize*.nls` can serve as a base.
10. Implement `NtInitializeNlsFiles` in ntdll. Used by some callers to prime the locale file mapping before RtlGetLocaleFileMappingAddress is called.
11. Reconcile the hardcoded RtlpLocaleTable[] in `sdk/lib/rtl/locale.c` with the file-based data from `locale.nls`. Currently both represent locale↔LCID mappings independently. Long term, the RTL should derive this data from `locale.nls` rather than from a compile-time table.
Lower priority
12. Convert `c_856.nls` and `c_878.nls` from static files to `.txt`-sourced builds via `txt2nls`, ensuring TransDefaultChar / TransUniDefaultChar are correct.
13. Deprecate `sdk/tools/create_nls/create_nls.c`. It is superseded by `txt2nls`. The files it would produce have incorrect TransDefaultChar values.
14. Document the old CSR NLS types 1–8 in `basesrv/nls.c` as XP/2003-era compatibility only. They should be preserved for any remaining old code paths but are invisible to Vista+ callers.
File Reference
| File | Role |
|---|---|
| dll/win32/kernelbase/wine/locale.c | New locale API; requires Vista+ NLS |
| sdk/tools/txt2nls/main.cpp | Codepage NLS generator (correct, in use) |
| sdk/tools/create_nls/create_nls.c | Old codepage NLS generator (obsolete) |
| sdk/include/ddk/ntnls.h | CPTABLEINFO, NLSTABLEINFO, NLS_FILE_HEADER |
| sdk/include/ndk/rtltypes.h | NLS_FILE_HEADER (matches new format) |
| sdk/include/wine/winternl.h | NLS_LOCALE_HEADER, NLS_LOCALE_DATA (Wine only) |
| sdk/lib/rtl/locale.c | RTL locale name↔LCID resolution (hardcoded table) |
| dll/ntdll/def/ntdll.spec | Stubs: NtGetNlsSectionPtr, RtlNormalizeString, etc. |
| subsystems/win/basesrv/nls.c | NLS section type → file mapping (old scheme) |
| media/nls/locale.nls | Old-format locale data (must be replaced) |
| media/nls/sortkey.nls | Old sort key table (superseded by `sortdefault.nls`) |
| media/nls/sorttbls.nls | Old sort table metadata (superseded by `sortdefault.nls`) |
| media/nls/l_intl.nls | Case mapping (compatible with new type 10) |
| media/nls/geo.nls | Geo data (must move into new `locale.nls`) |
| media/nls/CMakeLists.txt | NLS file build rules |