CORE-20653

Convert NLS section format to NT6 for Wine compatibility


Details

    • Story
    • Resolution: Unresolved
    • Major
    • None
    • Wine
    • None

    Description

       

      NLS Format Analysis: Wine locale.c vs. ReactOS Infrastructure

      Background

      `dll/win32/kernelbase/wine/locale.c` is the Wine-derived implementation of the kernelbase locale API. It targets the Windows Vista+ (NT 6.0+) NLS architecture, which uses a substantially different set of NLS section types, file formats, and kernel APIs compared to what ReactOS currently implements (Windows XP/2003, NT 5.x).

      The tool `sdk/tools/txt2nls/main.cpp` (note: file is `.cpp`, not `.c`) converts `.txt` codepage definitions to binary NLS files. It is only one part of the
      full NLS infrastructure.

      NtGetNlsSectionPtr Section Type Mapping

      The central dispatch mechanism is NtGetNlsSectionPtr(type, id, unk, &ptr, &size).
      The type numbers differ completely between the two eras.

      Windows Vista+ / Wine (correct mapping, confirmed by the Wine tests)

      | Type | File / purpose | Notes |
      |---|---|---|
      | 9 | sortdefault (sort keys, casemap, ctypes, sort GUIDs) | bogus ID → STATUS_INVALID_PARAMETER_1 |
      | 10 | casemap table (l_intl.nls format) | bogus ID → STATUS_INVALID_PARAMETER_1 or STATUS_UNSUCCESSFUL |
      | 11 | codepage files (`c_*.nls`), keyed by codepage number | bogus ID → STATUS_OBJECT_NAME_NOT_FOUND |
      | 12 | Unicode normalization files, keyed by normalization form | bogus ID → STATUS_OBJECT_NAME_NOT_FOUND |
      | 13 | unknown, sort-related | bogus ID → STATUS_INVALID_PARAMETER_1 |
      | 14 | unknown | success or STATUS_INVALID_PARAMETER_1 depending on Windows version |
      | all others (0–8, 15+) | invalid | STATUS_INVALID_PARAMETER_1 |
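
      The failure codes in the table above can be captured in a small lookup, which may be handy when writing a conformance test for a future implementation. This is only a sketch: the `sketch_` names are hypothetical, the status values are the standard ntstatus.h constants, and the alternative outcomes for types 10 and 14 are deliberately ignored.

```c
#include <stdint.h>

typedef uint32_t sketch_ntstatus;

/* Standard NTSTATUS values (ntstatus.h). */
#define SKETCH_STATUS_OBJECT_NAME_NOT_FOUND 0xC0000034u
#define SKETCH_STATUS_INVALID_PARAMETER_1   0xC00000EFu

/* 1 if the type is one of the Vista+ section types (9..14), else 0. */
static int sketch_is_vista_type(unsigned type)
{
    return type >= 9 && type <= 14;
}

/*
 * Most common status for a bogus id, per the table.
 * Simplification: type 10 may also yield STATUS_UNSUCCESSFUL and
 * type 14 may succeed on some Windows versions; neither is modelled.
 */
static sketch_ntstatus sketch_bogus_id_status(unsigned type)
{
    switch (type) {
    case 11: /* codepage files */
    case 12: /* normalization files */
        return SKETCH_STATUS_OBJECT_NAME_NOT_FOUND;
    default: /* 9, 10, 13, 14 and every invalid type */
        return SKETCH_STATUS_INVALID_PARAMETER_1;
    }
}
```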

      ReactOS Current (`subsystems/win/basesrv/nls.c`)

      | Type | File | Status |
      |---|---|---|
      | 1 | unicode.nls | implemented |
      | 2 | locale.nls | implemented |
      | 3 | ctype.nls | implemented |
      | 4 | sortkey.nls | implemented |
      | 5 | sorttbls.nls | implemented |
      | 6 | c_437.nls | implemented |
      | 7 | c_1252.nls | implemented |
      | 8 | l_except.nls | implemented |
      | 9 | - | STATUS_NOT_IMPLEMENTED |
      | 10 | - | STATUS_NOT_IMPLEMENTED |
      | 11 | codepage files (dynamic) | implemented, matches Vista+ |
      | 12 | geo.nls | implemented, CONFLICTS with Vista+ normalization |

      Key observations:

      • Types 1–8 are the old XP/2003 CSR-based mechanism; they do not exist in Vista+.
      • Type 9 is stubbed out; `locale.c` calls it unconditionally on startup.
      • Type 12 in ReactOS maps to `geo.nls`; in Vista+ it maps to normalization files. This is a direct conflict that will cause `IsNormalizedString`/`NormalizeString` to receive geo data instead of normalization tables.
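
      To make the overlap between the two numbering schemes concrete, here is a minimal sketch that encodes both tables and reports which type numbers are assigned in both schemes with different meanings. All identifiers are hypothetical; types that are stubbed in ReactOS (9, 10) or unknown in Vista+ (13, 14) are left out of the tables on purpose.

```c
#include <stddef.h>
#include <string.h>

struct sketch_type_entry { unsigned type; const char *purpose; };

/* Types implemented today in basesrv/nls.c, per the table above. */
static const struct sketch_type_entry sketch_reactos_types[] = {
    { 1, "unicode.nls" }, { 2, "locale.nls" },   { 3, "ctype.nls" },
    { 4, "sortkey.nls" }, { 5, "sorttbls.nls" }, { 6, "c_437.nls" },
    { 7, "c_1252.nls" },  { 8, "l_except.nls" },
    { 11, "codepage" },   { 12, "geo.nls" },
};

/* Types with a known purpose in the Vista+ scheme. */
static const struct sketch_type_entry sketch_vista_types[] = {
    { 9, "sortdefault" }, { 10, "casemap" },
    { 11, "codepage" },   { 12, "normalization" },
};

static const char *sketch_lookup(const struct sketch_type_entry *table,
                                 size_t count, unsigned type)
{
    for (size_t i = 0; i < count; i++)
        if (table[i].type == type) return table[i].purpose;
    return NULL;
}

/* 1 if both schemes assign the type, but to different data. */
static int sketch_type_conflicts(unsigned type)
{
    const char *old_use = sketch_lookup(sketch_reactos_types,
        sizeof(sketch_reactos_types) / sizeof(sketch_reactos_types[0]), type);
    const char *new_use = sketch_lookup(sketch_vista_types,
        sizeof(sketch_vista_types) / sizeof(sketch_vista_types[0]), type);
    return old_use && new_use && strcmp(old_use, new_use) != 0;
}
```

      With these tables, only type 12 is flagged; type 11 is shared without conflict.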

      Function Stubs in ntdll

      From `dll/ntdll/def/ntdll.spec`:

      | Function | Status | Used by locale.c |
      |---|---|---|
      | NtGetNlsSectionPtr | -stub -version=0x600+ | Yes (types 9, 11, 12) |
      | NtInitializeNlsFiles | -stub -version=0x600+ | Indirectly (Wine test uses it) |
      | RtlNormalizeString | -stub -version=0x600+ | Yes (called by NormalizeString) |
      | RtlGetLocaleFileMappingAddress | not exported at all | Yes (called in load_locale_nls()) |

       

      RtlGetLocaleFileMappingAddress is the most critical missing export.
      load_locale_nls() calls it during init_locale(), and if it fails, the entire locale subsystem fails to initialise.
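
      One plausible shape for such an export is a getter that maps the file once and hands out the cached pointer afterwards. The sketch below is not the real signature or implementation: every name is hypothetical, the loader is a stand-in for whatever actually maps `locale.nls`, and no locking is shown.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t sketch_ntstatus;
#define SKETCH_STATUS_SUCCESS 0x00000000u

/* Hypothetical loader: maps the locale file, reports base address and size. */
typedef sketch_ntstatus (*sketch_loader)(void **base, uint64_t *size);

static void    *sketch_cached_base;
static uint64_t sketch_cached_size;

/* Map on first use, then return the cached view (not thread-safe). */
static sketch_ntstatus sketch_get_locale_mapping(sketch_loader load,
                                                 void **base, uint64_t *size)
{
    if (!sketch_cached_base) {
        void *addr = NULL;
        uint64_t len = 0;
        sketch_ntstatus status = load(&addr, &len);
        if (status != SKETCH_STATUS_SUCCESS)
            return status;          /* the caller's locale init fails here */
        sketch_cached_base = addr;
        sketch_cached_size = len;
    }
    *base = sketch_cached_base;
    *size = sketch_cached_size;
    return SKETCH_STATUS_SUCCESS;
}

/* In-memory fake loader, for illustration and testing only. */
static unsigned char sketch_fake_image[64];
static int sketch_fake_loader_calls;

static sketch_ntstatus sketch_fake_loader(void **base, uint64_t *size)
{
    sketch_fake_loader_calls++;
    *base = sketch_fake_image;
    *size = sizeof(sketch_fake_image);
    return SKETCH_STATUS_SUCCESS;
}
```

      A real implementation would need an atomic publish of the cached pointer, since several threads can reach locale initialisation at once.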

      NLS File Format Differences

      1. locale.nls — Major format change

      Old format (ReactOS, XP/2003):

      • Binary table, no magic header, ~209 KB
      • Per-LCID records in an undocumented 2003-era layout
      • Read by old `kernel32` via NtGetNlsSectionPtr(2, ...)

      New format (Vista+, required by Wine `locale.c`):

      • Accessed via RtlGetLocaleFileMappingAddress()
      • Starts with NLS_LOCALE_HEADER at offset 0 (magic `'NSDS'` at offset 0x0C)
      • Followed by: NLS_LOCALE_DATA[] array, sorted LCID index, sorted LCNAME index, string pool, calendar array, and embedded geo data (new struct geo_id[] + struct geo_index[] with its own sub-header)

      The NLS_LOCALE_HEADER and NLS_LOCALE_DATA types are defined only in `sdk/include/wine/winternl.h` (because only Wine code uses them today).
      ReactOS NDK headers (`sdk/include/ndk/`) do not have these types. If `RtlGetLocaleFileMappingAddress` is to be implemented inside ntdll/RTL,
      these types will need to be added to a suitable ReactOS header.

      Impact: load_locale_nls() reads locale data, geo IDs, geo index, and character maps entirely from the pointer returned by RtlGetLocaleFileMappingAddress. None of this works today on ReactOS.
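
      A cheap way to tell the two `locale.nls` formats apart is to look for the magic at offset 0x0C, since the old format has no magic header. The check below is a sketch under one assumption that should be verified against a real file: that the four bytes appear on disk in the order 'N', 'S', 'D', 'S'. The function name is hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_LOCALE_MAGIC_OFFSET 0x0C

/*
 * Returns 1 if the buffer carries the magic at offset 0x0C, else 0.
 * Assumption: on-disk byte order is N, S, D, S. If the magic is stored
 * as a 32-bit value with the opposite byte order, reverse the array.
 */
static int sketch_looks_like_new_locale_nls(const unsigned char *data, size_t size)
{
    static const unsigned char magic[4] = { 'N', 'S', 'D', 'S' };

    if (!data || size < SKETCH_LOCALE_MAGIC_OFFSET + sizeof(magic))
        return 0;
    return memcmp(data + SKETCH_LOCALE_MAGIC_OFFSET, magic, sizeof(magic)) == 0;
}
```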

      Tooling gap: No tool exists in ReactOS to produce a `locale.nls` in the new format. Wine's `nls/locale.nls` (generated by its `tools/make_unicode` script) is a compatible source; alternatively, a new tool would need to consume the MS locale data.

      Note also that `sdk/lib/rtl/locale.c` currently resolves LCID↔name via a hardcoded RtlpLocaleTable[] array. This table is independent of the data
      exposed by RtlGetLocaleFileMappingAddress, so the two sources can disagree and need to be reconciled.
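
      If the RTL eventually resolves LCIDs from the file's sorted LCID index instead of the compile-time table, the lookup reduces to a binary search. The sketch below assumes a hypothetical entry layout (an LCID plus an index into the locale data array); the actual on-disk layout has to be taken from the Wine definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical index entry; the real layout is defined by the file format. */
struct sketch_lcid_entry {
    uint32_t lcid;        /* sort key of the index */
    uint32_t data_index;  /* position in the locale data array */
};

/* Binary search over an index sorted by ascending lcid; NULL if absent. */
static const struct sketch_lcid_entry *
sketch_find_lcid(const struct sketch_lcid_entry *index, size_t count, uint32_t lcid)
{
    size_t lo = 0, hi = count;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (index[mid].lcid == lcid)
            return &index[mid];
        if (index[mid].lcid < lcid)
            lo = mid + 1;
        else
            hi = mid;
    }
    return NULL;
}
```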

      2. `sortdefault.nls` — New file, missing in ReactOS

      Old format (ReactOS, XP/2003):

      • Two separate files served via old CSR types:
          - `sortkey.nls` (type 4, ~262 KB): 4-byte sort weights per Unicode code point
          - `sorttbls.nls` (type 5, ~21 KB): sort table metadata and filenames for locale-specific sort files (e.g. `big5.nls`, `prcp.nls`)

      New format (Vista+, required by Wine `locale.c`):

      • Single file `sortdefault.nls` served via NtGetNlsSectionPtr(9, 0, ...)
      • Header layout (from load_sortdefault_nls()):

      struct {
          UINT sortkeys;   // offset to sort key table (UINT per Unicode code point)
          UINT casemaps;   // offset to casemap table (l_intl.nls format, USHORT pairs)
          UINT ctypes;     // offset to CT_CTYPE1/2/3 table
          UINT sortids;    // offset to sort ID block
      };

      • Sort ID block: version + guid_count + struct sortguid[] (each: 16-byte GUID + flags + compression/exception/casing offsets)

      • After sort IDs: expansion count + struct sort_expansion[] (2×WCHAR per entry)
      • After expansions: compression count + struct sort_compression[] + compression data
      • After compression data: multiple-weights block + struct jamo_sort[]
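
      Before a server hands out `sortdefault.nls`, a basic sanity check on the four header offsets is cheap insurance against a truncated or wrong file. The sketch below assumes the offsets are little-endian byte offsets from the start of the file; the `sketch_` names are hypothetical and nothing beyond the header is validated.

```c
#include <stddef.h>
#include <stdint.h>

struct sketch_sort_header {
    uint32_t sortkeys;  /* offset to sort key table */
    uint32_t casemaps;  /* offset to casemap table */
    uint32_t ctypes;    /* offset to ctype table */
    uint32_t sortids;   /* offset to sort ID block */
};

/* Read a little-endian 32-bit value without alignment requirements. */
static uint32_t sketch_read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 and fills *out if every offset lies inside the file, else 0. */
static int sketch_parse_sortdefault_header(const unsigned char *data, size_t size,
                                           struct sketch_sort_header *out)
{
    if (!data || !out || size < 16)
        return 0;

    out->sortkeys = sketch_read_le32(data + 0);
    out->casemaps = sketch_read_le32(data + 4);
    out->ctypes   = sketch_read_le32(data + 8);
    out->sortids  = sketch_read_le32(data + 12);

    if (out->sortkeys >= size || out->casemaps >= size ||
        out->ctypes >= size || out->sortids >= size)
        return 0;
    return 1;
}
```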

       

      Tooling gap: No tool exists in ReactOS to build `sortdefault.nls`.
      Wine generates it from its `tools/make_unicode` script using Unicode data.

      3. Normalization NLS files — Missing entirely

      Old format (ReactOS, XP/2003): Not present.

      New format (Vista+, required by Wine `locale.c`):

      • Four files keyed by normalization form (NormalizationC=1, D=2, KC=5, KD=6), served via `NtGetNlsSectionPtr(12, form, ...)`
      • Parsed via `struct norm_table` header (defined in `locale.c`):
        • File name (13 WCHARs), checksum, Unicode version, normalization form
        • Offsets to: combining class table, property tables (level 1 + 2), decomposition hash + map + sequence tables, composition hash + sequence tables
      • Used by RtlNormalizeString() (also stubbed)
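
      A type 12 handler first has to reject ids that are not one of the four forms. A minimal sketch of that validation, using the form values listed above; the function name and the short labels it returns are hypothetical.

```c
#include <stddef.h>

/* Map a normalization form id to a short label, or NULL if unsupported. */
static const char *sketch_norm_form_label(unsigned form)
{
    switch (form) {
    case 1:  return "C";   /* NormalizationC  */
    case 2:  return "D";   /* NormalizationD  */
    case 5:  return "KC";  /* NormalizationKC */
    case 6:  return "KD";  /* NormalizationKD */
    default: return NULL;  /* bogus id: caller should fail the request */
    }
}
```

      Per the Vista+ table earlier in this ticket, the failure status for an unsupported id would be STATUS_OBJECT_NAME_NOT_FOUND.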

      Conflict: ReactOS currently maps type 12 to `geo.nls`. The type 12 slot must be reassigned to normalization. Geo data in the new architecture is embedded in `locale.nls` itself (see section 1 above).

      Tooling gap: No tool exists in ReactOS to build normalization NLS files. Wine's `nls/Normalize{C,D,KC,KD}.nls` files can serve as a source.

      4. Casemap table (type 10)

      Old format (ReactOS): `l_intl.nls` served via old CSR type, also referenced directly from ExpNlsSectionPointer in the kernel.

      New format (Vista+): Served via NtGetNlsSectionPtr(10, 0, ...). The format is the same `l_intl.nls` USHORT-pair layout — `locale.c` explicitly notes /* casemap table, in l_intl.nls format */ for `sort.casemap`. The content of ReactOS's existing `l_intl.nls` (4870 bytes) should be compatible once type 10 is implemented.

      5. Codepage NLS files (`c_*.nls`) — Minor header difference, mostly resolved

      Both old and new code use the same 26-byte NLS_FILE_HEADER (13 WORDs).
      The DDK header `sdk/include/ddk/ntnls.h` already defines the correct layout:

      typedef struct _NLS_FILE_HEADER {
          USHORT HeaderSize;           // = 13 (WORDs)
          USHORT CodePage;
          USHORT MaximumCharacterSize; // 1 = SBCS, 2 = DBCS
          USHORT DefaultChar;
          USHORT UniDefaultChar;
          USHORT TransDefaultChar;     // Unicode → CP fallback: Unicode of DefaultChar
          USHORT TransUniDefaultChar;  // CP → Unicode fallback: CP of UniDefaultChar
          UCHAR  LeadByte[12];
      } NLS_FILE_HEADER; 
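
      The 26-byte claim is easy to pin down at compile time. The sketch below mirrors the layout with fixed-width types (the struct and function names are hypothetical) and adds a small plausibility check based on the two fields whose values are stated above.

```c
#include <stdint.h>

/* Mirror of the layout above using fixed-width types. */
struct sketch_nls_file_header {
    uint16_t HeaderSize;            /* 13, counted in WORDs */
    uint16_t CodePage;
    uint16_t MaximumCharacterSize;  /* 1 = SBCS, 2 = DBCS */
    uint16_t DefaultChar;
    uint16_t UniDefaultChar;
    uint16_t TransDefaultChar;
    uint16_t TransUniDefaultChar;
    uint8_t  LeadByte[12];
};

/* 7 WORDs + 12 bytes = 26 bytes = 13 WORDs (C11 static assertion). */
_Static_assert(sizeof(struct sketch_nls_file_header) == 26,
               "NLS file header must be 26 bytes");

/* 1 if the header fields with documented values look sane, else 0. */
static int sketch_header_is_plausible(const struct sketch_nls_file_header *h)
{
    return h->HeaderSize == 13 &&
           (h->MaximumCharacterSize == 1 || h->MaximumCharacterSize == 2);
}
```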

       

      Old tool (`sdk/tools/create_nls/create_nls.c`):

      • Uses a different in-memory layout (BYTE DefaultChar[2] + unknown1 / unknown2)
      • Always writes `'?'` (0x003F) for TransDefaultChar and TransUniDefaultChar
      • Reads data from the host OS via GetCPInfoExA — Windows-only

      New tool (`sdk/tools/txt2nls/main.cpp`):

      • Uses the correct layout matching `ntnls.h`
      • Properly computes TransDefaultChar and TransUniDefaultChar
      • Reads from portable `.txt` source files; runs cross-platform
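
      Going by the field comments in the header above, the two translated defaults for a single-byte codepage can be derived from the translation tables rather than hardcoded. This is a sketch of that reading, not a description of what `txt2nls` actually does; the struct and function names are hypothetical and DBCS is not handled.

```c
#include <stdint.h>

/* Hypothetical inputs: translation tables already parsed from a .txt source. */
struct sketch_sbcs_tables {
    const uint16_t *cp_to_uni;  /* 256 entries: codepage byte -> Unicode */
    const uint8_t  *uni_to_cp;  /* 65536 entries: Unicode (BMP) -> codepage byte */
};

/* TransDefaultChar: the Unicode value that DefaultChar maps to. */
static uint16_t sketch_trans_default_char(const struct sketch_sbcs_tables *t,
                                          uint16_t default_char)
{
    return t->cp_to_uni[default_char & 0xFF];
}

/* TransUniDefaultChar: the codepage byte that UniDefaultChar maps to. */
static uint16_t sketch_trans_uni_default_char(const struct sketch_sbcs_tables *t,
                                              uint16_t uni_default_char)
{
    return t->uni_to_cp[uni_default_char];
}
```

      For the common case where both defaults are '?', the result is 0x003F in both fields, which is why the hardcoded value in the old tool only matters for codepages with other defaults.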

      The `txt2nls` tool already generates correct NLS files. All codepage `.nls` files in `media/nls/` are now built by `txt2nls` from the `.txt` sources in
      `media/nls/src/`. The `create_nls.c` tool is obsolete for this purpose.

      Files not yet converted: `c_856.nls` and `c_878.nls` are listed in `media/nls/CMakeLists.txt` as static (manually generated) rather than built
      from `.txt` sources. They may still be in the old format with `'?'` for the translated default chars, and should have `.txt` sources added so they can be
      rebuilt by `txt2nls`.

      Summary of Required Changes

      Critical (locale.c cannot initialize without these)

      1. Implement RtlGetLocaleFileMappingAddress in `sdk/lib/rtl/` or `dll/ntdll/`. Must map and return a pointer to a Vista+-format `locale.nls` image.
      2. Implement NtGetNlsSectionPtr type 9 (sortdefault) in ntdll/ntoskrnl. Must serve `sortdefault.nls` in the new unified format.
      3. Implement NtGetNlsSectionPtr type 10 (casemap) in ntdll/ntoskrnl. Can reuse `l_intl.nls` data (format is already compatible).
      4. Fix NtGetNlsSectionPtr type 12 conflict in `subsystems/win/basesrv/nls.c`: change from `geo.nls` to the normalization NLS files. Geo data must instead be embedded in the new `locale.nls`.
      5. Create new `locale.nls` in Vista+ format (`NLS_LOCALE_HEADER` + `NLS_LOCALE_DATA[]`). Wine's `nls/locale.nls` can serve as a base.
      6. Create `sortdefault.nls` in the new unified format. Wine's `nls/sortdefault.nls` can serve as a base.
      7. Add NLS_LOCALE_HEADER and NLS_LOCALE_DATA type definitions to a ReactOS NDK header (e.g. `sdk/include/ndk/rtltypes.h` or a new `sdk/include/ndk/nlstypes.h`), so that RtlGetLocaleFileMappingAddress can be implemented without depending on Wine's `winternl.h`.

      Important (affects correctness)

      8. Implement RtlNormalizeString in `sdk/lib/rtl/`. Requires normalization NLS files served via NtGetNlsSectionPtr(12, ...).

      9. Create normalization NLS files (`Normalize{C,D,KC,KD}.nls`). Wine's `nls/Normalize*.nls` can serve as a base.

      10. Implement `NtInitializeNlsFiles` in ntdll. Used by some callers to prime the locale file mapping before RtlGetLocaleFileMappingAddress is called.

      11. Reconcile the hardcoded RtlpLocaleTable[] in `sdk/lib/rtl/locale.c` with the file-based data from `locale.nls`. Currently both represent locale name↔LCID
          mappings independently. In the long term, the RTL should derive this data from `locale.nls` rather than from a compile-time table.

      Lower priority

      12. Convert `c_856.nls` and `c_878.nls` from static files to `.txt`-sourced builds via `txt2nls`, ensuring TransDefaultChar / TransUniDefaultChar are correct.

      13. Deprecate `sdk/tools/create_nls/create_nls.c`. It is superseded by `txt2nls`. The files it would produce have incorrect TransDefaultChar values.

      14. Document the old CSR NLS types 1–8 in `basesrv/nls.c` as XP/2003-era compatibility only. They should be preserved for any remaining old code paths but are invisible to Vista+ callers.

      File Reference

      | File | Role |
      |---|---|
      | dll/win32/kernelbase/wine/locale.c | New locale API; requires Vista+ NLS |
      | sdk/tools/txt2nls/main.cpp | Codepage NLS generator (correct, in use) |
      | sdk/tools/create_nls/create_nls.c | Old codepage NLS generator (obsolete) |
      | sdk/include/ddk/ntnls.h | CPTABLEINFO, NLSTABLEINFO, NLS_FILE_HEADER |
      | sdk/include/ndk/rtltypes.h | NLS_FILE_HEADER (matches new format) |
      | sdk/include/wine/winternl.h | NLS_LOCALE_HEADER, NLS_LOCALE_DATA (Wine only) |
      | sdk/lib/rtl/locale.c | RTL locale name↔LCID resolution (hardcoded table) |
      | dll/ntdll/def/ntdll.spec | Stubs: NtGetNlsSectionPtr, RtlNormalizeString, etc. |
      | subsystems/win/basesrv/nls.c | NLS section type → file mapping (old scheme) |
      | media/nls/locale.nls | Old-format locale data (must be replaced) |
      | media/nls/sortkey.nls | Old sort key table (superseded by `sortdefault.nls`) |
      | media/nls/sorttbls.nls | Old sort table metadata (superseded by `sortdefault.nls`) |
      | media/nls/l_intl.nls | Case mapping (compatible with new type 10) |
      | media/nls/geo.nls | Geo data (must move into new `locale.nls`) |
      | media/nls/CMakeLists.txt | NLS file build rules |

       

          People

            Unassigned Unassigned
            ThePhysicist Timo Kreuzer