The RC SUPPORT LIBRARY documentation.

This library is designed to meet the need for extended functionality related to RC resources. To better understand the information in this document, you may first want to read About Resource Files: http://msdn.microsoft.com/en-us/library/windows/desktop/aa380599.aspx

Although it currently contains only number-dependent string handling, the library is not meant to be limited to that, should other similar needs arise. Number-dependence is a way of selecting a specific string according to a language-specific rule. The most illustrative use-case is plural-forms handling, where different languages have different plural-forms and different rules governing them.

Number-dependent resource syntax

The structure-set model

In order to achieve resource number-dependence, a model of two separate RC structures was designed. One of them stores the body of all the possible strings of the given context, and the other one addresses the complexity of choosing the right string given an external number (provided at run-time). We'll call this pair of structures "string-set" and "rule-set" respectively. The library uses the rule-set to obtain an index to one of the available strings defined in the string-set structure. Both structures can be defined differently from one language to another in order to address the specifics of each language's number-dependent grammar.

Although all the needed information can be defined in a single structure (and it was, in the early stages of development), the design changed along the way to its current form for several reasons. These are modularity (the less varied rule-related information can be shared across multiple string-storing parts, and the other way around, which also deals with the strings-per-rule ratio, where the amount of strings is always less than or equal to the amount of rules), simplicity (each of the resulting structures is less cluttered), and speed (the rule-sets became smaller and are likely to be processed faster thanks to caching).

Rule-sets format

A rule-set is a series of integral numbers embedded into an RCDATA structure:


    WHATEVERRULESET_ID RCDATA
    BEGIN
        amount_of_rules,
        1strule_lowbound, 1strule_divisor, 1strule_highlimit, 1strule_result,
        /* ...all the other rules */
        lastrule_lowbound, lastrule_divisor, lastrule_highlimit, lastrule_result
    END

That is, the amount of rules followed by sequences of three or four numbers forming the rule definitions.

How does it work? The library uses the rule-set and the aforementioned external number (which from now on we'll call the "deciding number") to generate an index into the string-set resource. Deducing the result involves a pass through the RCDATA block, interpreting its raw content. When processing this structure, the library expects enough data for the number of rules declared in "amount_of_rules" (which is a WORD, i.e. a "short int" value) before it stops. Each rule consists of three or four WORD integrals: the low bound, the divisor, the high limit, and the rule's result, in that order. The count varies with the rule's divisor: if the divisor is zero, the high limit is omitted; otherwise it is present. A rule's result is considered valid if all the rule's constraints are satisfied, and only the result of the last validated rule is returned in the end. Passing each of the rule's constraints involves operations with the deciding number as follows:

      Comparison - whether it is greater than or equal to the rule's low bound.
      Modulo operation with the rule's divisor (if the divisor is not zero).
      Comparison - whether the remainder is smaller than the rule's high limit (if the divisor is not zero).

The first comparison is made between the deciding number and the rule's low bound; the second comparison is made between the rule's high limit and the remainder of the deciding number divided by the rule's divisor.
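The validation steps above can be sketched in Python as follows. This is only an illustration of the described logic, not the library's actual code: the function name `evaluate_rule_set`, the flat list used as input, and the `None` return value for the case where no rule is validated are all assumptions made for this sketch.

```python
def evaluate_rule_set(words, n):
    """Return the result of the last validated rule, or None if none matched.

    `words` is the rule-set content as a flat list of WORD values:
    the amount of rules, then three or four values per rule.
    """
    words = [w & 0xFFFF for w in words]    # a WORD has 16 bits, so -1 reads as 65535
    count, pos = words[0], 1
    result = None
    for _ in range(count):
        low_bound, divisor = words[pos], words[pos + 1]
        pos += 2
        valid = n >= low_bound
        if divisor != 0:                   # cycling rule: a high limit follows
            high_limit = words[pos]
            pos += 1
            valid = valid and (n % divisor) < high_limit
        rule_result = words[pos]
        pos += 1
        if valid:
            result = rule_result           # a later valid rule overrides earlier ones
    return result


# Hypothetical rule-set: default result 7, overridden by 9 for multiples of 5.
print(evaluate_rule_set([2, 0, 0, 7, 0, 5, 1, 9], 10))   # 9
print(evaluate_rule_set([2, 0, 0, 7, 0, 5, 1, 9], 11))   # 7
```

The sketch reads every rule even after one has failed, which matches the statement that only the result of the last validated rule is returned.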

The case where all these operations are involved defines a "cycling rule". If the rule's divisor is zero (which defines a "non-cycling rule"), then the low bound remains the rule's only constraint for result validation. Here's an example of an English plural-forms rule-set with non-cycling rules (unlike the one defined in gndrrd_b.rc):


    ENGLISH_PLURAL_FORMS_RULESET_ID RCDATA
    BEGIN
        3, /* three rules in total */
        0, 0, 0,    /* this rule is always validated, as any integral number is
                     * greater than or equal to zero; rule's (and rule-set's
                     * default) result is "0" */
        1, 0, 1,    /* this rule gets validated only when the deciding number is
                     * greater than or equal to one, thus it overrides the
                     * previous rule for all but zero; rule's result is "1"
                     */
        2, 0, 0     /* this rule gets validated when the deciding number is greater
                     * than or equal to two, thus it overrides the previous
                     * rules for all but zero and one; rule's result is "0"
                     */
    END

In order to exemplify cycling rules, the following rule-set determines the odd or even nature of the deciding number (admittedly not a very useful rule-set outside its demonstrative purpose, considering that maths doesn't vary across languages):


    ODD_OR_EVEN_RULESET_ID RCDATA
    BEGIN
        2, /* two rules in total */
        0, 0, 1,    /* this rule is always validated, as any integral number is
                     * greater than or equal to zero; rule's (and rule-set's
                     * default) result is "1" */
        0, 2, 1, 0  /* this cycling rule gets validated when the deciding number is
                     * greater than or equal to zero and the remainder of its
                     * division by two is less than one, which gives us all the
                     * even numbers and leaves the odd numbers validated only by
                     * the previous rule; rule's result is "0"
                     */
    END

Note that a rule's values can be set to the maximum value (65535, which can also be written as -1), which may be useful in some cases. Here's the English plural-forms rule-set with a cycling rule (as defined in gndrrd_b.rc), commented:


    /* ENGLISH_PLURAL_FORMS_RULESET_ID */ GNDR_RULESET_ID RCDATA
    BEGIN
        2, /* two rules in total */
        0, 0, 1,    /* this rule is always validated, as any integral number is
                     * greater than or equal to zero; rule's (and rule-set's
                     * default) result is "1" */
        1, -1, 2, 0 /* this cycling rule gets validated when the deciding number is
                     * greater than or equal to one and the remainder of its
                     * division by the highest value available (i.e. the deciding
                     * number unchanged) is smaller than two, which gives us a very
                     * narrow validating case of the deciding number being one;
                     * rule's result is "0"
                     */
    END
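To see why a divisor of -1 leaves the deciding number unchanged, here is a small Python sketch. It assumes that -1 is stored as the 16-bit value 65535, as the note above states, and that the deciding number itself is smaller than 65535; the helper names `as_word` and `cycling_rule_matches` are hypothetical.

```python
def as_word(value):
    """Reduce an integer to its unsigned 16-bit (WORD) representation."""
    return value & 0xFFFF


def cycling_rule_matches(n, low_bound, divisor, high_limit):
    """Check one cycling rule: n >= low bound and n mod divisor < high limit."""
    return n >= low_bound and n % as_word(divisor) < high_limit


print(as_word(-1))                                                    # 65535
print([n for n in range(10) if cycling_rule_matches(n, 1, -1, 2)])    # [1]
```

With such a divisor the remainder equals the deciding number, so the low bound and the high limit together act as a plain range check, here selecting only the value one.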

String-sets format

A string-set is an RCDATA structure embedding a rule-set reference ID and a series of explicitly zero-delimited literal strings:


    WHATEVERSTRINGSET_ID RCDATA
    BEGIN
        ruleset_id,
        first_string,
        second_string,
        /* ...all the other strings */
        laststring
    END

The "ruleset_id" is a WORD integral; the strings are regular strings ending with the terminating null character ('\0'). Here's the string-set corresponding to the odd/even rule-set:


    ODD_OR_EVEN_STRINGSET_ID RCDATA
    BEGIN
        ODD_OR_EVEN_RULESET_ID, /* reference to the correct rule-set for strings */
        "even\0", /* index 0 */
        "odd\0" /* index 1 */
    END
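As a rough illustration of how such a block could be read, the Python sketch below builds the raw bytes that the odd/even string-set would plausibly compile to and picks a string by index. The byte layout (a little-endian WORD followed by single-byte, zero-terminated strings), the numeric ID 101, and the function name `parse_string_set` are assumptions made for this example, not details confirmed by the library.

```python
import struct


def parse_string_set(raw):
    """Split raw string-set bytes into (ruleset_id, list of strings)."""
    (ruleset_id,) = struct.unpack_from("<H", raw, 0)    # the leading WORD
    body = raw[2:]
    # Every string ends with an explicit '\0', so the final split piece is empty.
    strings = [s.decode("ascii") for s in body.split(b"\0")[:-1]]
    return ruleset_id, strings


ODD_OR_EVEN_RULESET_ID = 101                 # hypothetical numeric ID
raw = struct.pack("<H", ODD_OR_EVEN_RULESET_ID) + b"even\0" + b"odd\0"

ruleset_id, strings = parse_string_set(raw)
index = 1                                    # e.g. the rule-set result for an odd number
print(ruleset_id, strings, strings[index])   # 101 ['even', 'odd'] odd
```

The explicit '\0' after each string is what makes the split possible, since nothing else in the block marks where one string ends and the next begins.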

Note that although string-sets depend on (and act as clients of) their rule-set, when actually developing a new number-dependent pair it is easier to start by laying out the string-set with all the possible strings first. The following tutorial shows how to do that.

Tutorial

In order to illustrate better the mechanics of string selection, let's take a step-by-step approach by creating rule-sets and their corresponding string-sets for several examples.

In English, a good demonstration case can be made for ordinal numeral suffixes. The rule here is that different numbers get different suffixes: "st" for cardinal numeral "1" resulting in "1st", "nd" for "2", "rd" for "3", and "th" for "4" and higher. Thus, a string-set can be defined as follows:


    ORDINAL_STRINGSET_ID RCDATA
    BEGIN
        ORDINAL_RULESET_ID, /* reference to the correct rule-set */
        "st\0", /* index 0, for case one */
        "nd\0", /* index 1, for case two */
        "rd\0", /* index 2, for case three */
        "th\0" /* index 3, for case four */
    END

Note that the terminating null character ('\0') must be specified manually for each string inside this structure.

Now let's shape a working rule-set for the numbers one to ten. This initially limited domain and the simplistic approach of its rule-set are chosen intentionally for didactic reasons, and will be developed step by step to cover the whole domain of unsigned numbers. First we'll write the rules, then we'll specify their amount in the first row of data. To cover the rule for "1st" we can say greater than or equal to 1 (the low bound), with 0 as divisor (no cyclicity for now), resulting in index 0: "1, 0, 0,". Then for the second rule, for case two, we say greater than or equal to 2, again with no cyclicity, index 1: "2, 0, 1,". The same goes for cases three and four, resulting in "3, 0, 2," and "4, 0, 3". Thus we have four rules, and a simple rule-set structure as follows:


    ORDINAL_RULESET_ID RCDATA
    BEGIN
        4, /* the amount of rules */
        1, 0, 0, /* for 1 (case one) */
        2, 0, 1, /* for 2 (case two) */
        3, 0, 2, /* for 3 (case three) */
        4, 0, 3 /* for 4 and above (case four) */
    END

The evaluation procedure takes the given number N (the deciding number) and applies the specified amount of rules in order. In each rule, N is compared with the first number in the row: the rule is validated if N is greater than or equal to it, and skipped otherwise. The second number in each rule is the divisor (for cyclicity), which is not used for now, so we leave it 0. The last number is the resulting index, which is returned unless a later rule is also validated.
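The procedure for this first, non-cycling rule-set can be traced with a short Python sketch. The pair representation of the rules and the function name `pick_index` are simplifications assumed for illustration; the divisor column is left out because it is 0 in every rule here.

```python
ORDINAL_RULES = [(1, 0), (2, 1), (3, 2), (4, 3)]   # (low bound, result) pairs
SUFFIXES = ["st", "nd", "rd", "th"]


def pick_index(n, rules):
    """Return the result of the last rule whose low bound n reaches, else None."""
    result = None
    for low_bound, rule_result in rules:
        if n >= low_bound:
            result = rule_result
    return result


for n in range(1, 11):
    print(f"{n}{SUFFIXES[pick_index(n, ORDINAL_RULES)]}")   # 1st, 2nd, 3rd, 4th, ... 10th
print(pick_index(0, ORDINAL_RULES))                         # None
```

For 0 the sketch returns `None`, because 0 does not reach the low bound of any rule.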

Here, no rule in the rule-set can be validated for "0" (and therefore no matching string can be retrieved). This can be fixed by adding an exception rule for 0 (assuming "0th" is its correct ordinal form): "0, 0, 3,". Our rule-set now becomes:


    ORDINAL_RULESET_ID RCDATA
    BEGIN
        5, /* the amount of rules */
        0, 0, 3, /* for 0 (case four) */
        1, 0, 0, /* for 1 (case one) */
        2, 0, 1, /* for 2 (case two) */
        3, 0, 2, /* for 3 (case three) */
        4, 0, 3 /* for 4 and above (case four) */
    END

Now let's expand the numeral domain further, beyond the range of 0 to 10. We can observe that case one occurs again at other numbers that have "1" as their least significant digit (e.g. 21, 31, 41, ..., 91, 101, 121, ...). We'll call this "cyclic occurrence", and an NDR rule can specify an additional restriction for it. Remember the second number left as "0" in the rule-set rows above? If it is different from 0, the number N that gets compared with the first number in the rule is then divided by this second number. In that case the rule must have four numbers instead of three, where the third one is compared with the remainder of that division to validate or invalidate the rule. The last number in the rule is the rule's resulting index, as before. Let's rewrite the second rule (for case one) to cover all the numbers that have "1" as their least significant digit: "1, 10, 2, 0", which reads as "if N is greater than or equal to 1, and the remainder of its division by 10 is smaller than 2, return index 0". Rewriting the rest of the rules in a similar manner, we get:


    ORDINAL_RULESET_ID RCDATA
    BEGIN
        5, /* the amount of rules */
        0, 0, 3, /* for 0 (case four) */

        /* (intended to be) cyclic, for each ten */
        1, 10, 2, 0, /* (intended to be) for 1 (case one) */
        2, 10, 3, 1, /* (intended to be) for 2 (case two) */
        3, 10, 4, 2, /* (intended to be) for 3 (case three) */
        4, 10, 5, 3 /* (intended to be) for 4 and above (case four) */
    END

This has two problems. The first concerns the initial constraint (the deciding number having to be greater than or equal to 1, 2, 3, and 4), which is no longer necessary as validation can be done solely by comparing the division remainder. The second concerns the order of the constraints: the later rules are more permissive than the earlier ones and thus override them, yielding case four for every number above 3. Let's rectify this by removing the initial constraints and reversing the order of the rules:


    ORDINAL_RULESET_ID RCDATA
    BEGIN
        5, /* the amount of rules */
        0, 0, 3, /* acting as default - case four */

        /* cyclic, for each ten */
        0, 10, 4, 2, /* for 3, 2, 1, and 0 (case three) */
        0, 10, 3, 1, /* for 2, 1, and 0 (case two) */
        0, 10, 2, 0, /* for 1 and 0 (case one) */
        0, 10, 1, 3 /* for 0 (case four) */
    END

Here, the first rule of the rule-set sets case four as the default result, then each following rule imposes an additional constraint on the previous one: the second rule overrides the default (case four, the "th" ordinal suffix) for the values of N whose least significant digit is smaller than 4 (that is 0, 1, 2, and 3), then the next rules narrow this pool further and further to smaller than 3 (0, 1, and 2), smaller than 2 (0 and 1), and smaller than 1 (0 only) respectively.

This works for all the numbers having 1, 2, 3, or something else as their least significant digit, but in English there is one additional exception for the number domain of 11 to 13, which all take the case four suffix: "11th" instead of "11st", "12th" instead of "12nd", and "13th" instead of "13rd". We add one more exception rule for them and get the new rule-set:


    ORDINAL_RULESET_ID RCDATA
    BEGIN
        6, /* the amount of rules */
        0, 0, 3, /* acting as default - case four */

        /* cyclic, for each ten */
        0, 10, 4, 2, /* for 3, 2, 1, and 0 (case three) */
        0, 10, 3, 1, /* for 2, 1, and 0 (case two) */
        0, 10, 2, 0, /* for 1 and 0 (case one) */
        0, 10, 1, 3, /* for 0 (case four) */
        11, -1, 14, 3 /* exception for 11, 12, and 13 (case four) */
    END

The added line has its first number defining the low bound, then uses the cyclicity information to define a high limit ("-1" acts as the highest integral number), since the remainder of dividing a dividend by a larger divisor is the dividend itself. The result is a rule that addresses only the numerical domain 11 to 13.

Finally, it should be noted that in real life the last added exception should also apply to other domains, like 111 to 113, 211 to 213, and so on; in other words, it should cycle in each new hundred. After defining cyclic rules for it, we finally get the complete rule-set:
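The effect of the six-rule version can be checked with the Python sketch below. The tuple representation (low bound, divisor, high limit, result), the use of `None` for the omitted high limit, and the function name `ordinal` are assumptions made for illustration only.

```python
RULES = [
    (0, 0, None, 3),        # default: case four
    (0, 10, 4, 2),
    (0, 10, 3, 1),
    (0, 10, 2, 0),
    (0, 10, 1, 3),
    (11, 0xFFFF, 14, 3),    # -1 stored as a WORD is 65535
]
SUFFIXES = ["st", "nd", "rd", "th"]


def ordinal(n, rules=RULES):
    """Apply every rule in order and keep the result of the last valid one."""
    index = None
    for low, div, high, res in rules:
        if n >= low and (div == 0 or n % div < high):
            index = res
    return f"{n}{SUFFIXES[index]}"


print([ordinal(n) for n in (1, 2, 3, 4, 10, 11, 12, 13, 21, 22, 23)])
# ['1st', '2nd', '3rd', '4th', '10th', '11th', '12th', '13th', '21st', '22nd', '23rd']
print(ordinal(111))   # 111st
```

The exception rule corrects 11 to 13 only; a value such as 111 still falls through to the per-ten rules.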


    ORDINAL_RULESET_ID RCDATA
    BEGIN
        10, /* the amount of rules */
        0, 0, 3, /* acting as default - case four */
       
        /* cyclic, for each ten */
        0, 10, 4, 2, /* for 3, 2, 1, and 0 (case three) */
        0, 10, 3, 1, /* for 2, 1, and 0 (case two) */
        0, 10, 2, 0, /* for 1 and 0 (case one) */
        0, 10, 1, 3, /* for 0 (case four) */
       
        /* cyclic, for each hundred */
        0, 100, 14, 3, /* for 13, 12, 11, and lower (case four) */
        0, 100, 4, 2, /* for 3, 2, 1, and 0 (case three) */
        0, 100, 3, 1, /* for 2, 1, and 0 (case two) */
        0, 100, 2, 0, /* for 1 and 0 (case one) */
        0, 100, 1, 3 /* for 0 (case four) */
    END
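To check the complete rule-set, the following Python sketch evaluates it from a packed byte buffer, the way a raw RCDATA block might be read. The little-endian 16-bit layout and the names `evaluate` and `ordinal` are assumptions for illustration; the library's real interface is not shown in this document.

```python
import struct

ORDINAL_RULESET = [
    10,
    0, 0, 3,
    0, 10, 4, 2,  0, 10, 3, 1,  0, 10, 2, 0,  0, 10, 1, 3,
    0, 100, 14, 3,  0, 100, 4, 2,  0, 100, 3, 1,  0, 100, 2, 0,  0, 100, 1, 3,
]
SUFFIXES = ["st", "nd", "rd", "th"]

# Assumed compiled form: consecutive little-endian 16-bit values.
RAW = struct.pack(f"<{len(ORDINAL_RULESET)}H", *ORDINAL_RULESET)


def evaluate(raw, n):
    """Walk the raw rule-set and return the result of the last valid rule."""
    words = struct.unpack(f"<{len(raw) // 2}H", raw)
    pos, result = 1, None
    for _ in range(words[0]):
        low, div = words[pos], words[pos + 1]
        pos += 2
        valid = n >= low
        if div:                          # cycling rule: compare the remainder
            valid = valid and n % div < words[pos]
            pos += 1
        if valid:
            result = words[pos]
        pos += 1
    return result


def ordinal(n):
    return f"{n}{SUFFIXES[evaluate(RAW, n)]}"


print([ordinal(n) for n in (0, 1, 2, 3, 4, 11, 12, 13, 21, 101, 111, 112, 123)])
# ['0th', '1st', '2nd', '3rd', '4th', '11th', '12th', '13th', '21st',
#  '101st', '111th', '112th', '123rd']
```

Because the per-hundred rules come last, they override the per-ten rules whenever the last two digits are below 14, which is what restores "th" for 111 to 113 while leaving 101 to 103 with their regular suffixes.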

Note that you usually won't have to write rule-sets, as by far the most relevant use-case is working with resources that involve grammatical inflection depending on cardinal numerals, for which rule-sets already exist (see below).

Existing rule-sets

This library defines plural-forms rule-sets for a number of languages. These rule-sets are found in the "gndrrd" directory (which stands for Grammatical Number Dependent Resources Rule-sets Definitions). Below is an alphabetical list of the languages for which rule-set definitions already exist. The following sources were used for writing these rule-sets:
https://www.gnu.org/software/gettext/manual/html_node/Plural-forms.html
http://localization-guide.readthedocs.org/en/latest/l10n/pluralforms.html

Language	File name
Afrikaans	gndrrd_b.rc
Albanian	gndrrd_b.rc
Amharic	gndrrd_c.rc
Arabic	gndrrd_l.rc
Armenian	gndrrd_b.rc
Assamese	gndrrd_b.rc
Azeri	gndrrd_b.rc
Basque	gndrrd_b.rc
Belarusian	gndrrd_h.rc
Bengali	gndrrd_b.rc
Bosnian	gndrrd_h.rc
Breton	gndrrd_c.rc
Bulgarian	gndrrd_b.rc
Catalan	gndrrd_b.rc
Chinese	gndrrd_c.rc
Croatian	gndrrd_h.rc
Czech	gndrrd_i.rc
Danish	gndrrd_b.rc
Dutch	gndrrd_b.rc
Georgian	gndrrd_a.rc
German	gndrrd_b.rc
Greek	gndrrd_b.rc
Greenlandic	gndrrd_b.rc
English	gndrrd_b.rc
Estonian	gndrrd_b.rc
Faroese	gndrrd_b.rc
Filipino	gndrrd_c.rc
Finnish	gndrrd_b.rc
French	gndrrd_c.rc
Frisian	gndrrd_b.rc
Fula	gndrrd_a.rc
Irish	gndrrd_m.rc
Gaelic	gndrrd_o.rc
Galician	gndrrd_b.rc
Gujarati	gndrrd_b.rc
Hausa	gndrrd_b.rc
Hebrew	gndrrd_b.rc
Hindi	gndrrd_b.rc
Hungarian	gndrrd_b.rc
Icelandic	gndrrd_p.rc
Indonesian	gndrrd_a.rc
Italian	gndrrd_b.rc
Japanese	gndrrd_a.rc
Kannada	gndrrd_b.rc
Kazakh	gndrrd_a.rc
Khmer	gndrrd_a.rc
Kinyarwanda	gndrrd_b.rc
Korean	gndrrd_a.rc
Kurdish	gndrrd_b.rc
Kyrgyz	gndrrd_a.rc
Lao	gndrrd_a.rc
Latvian	gndrrd_d.rc
Lithuanian	gndrrd_g.rc
Luxembourgish	gndrrd_b.rc
Macedonian	gndrrd_q.rc
Malagasy	gndrrd_c.rc
Malay	gndrrd_a.rc
Malayalam	gndrrd_b.rc
Maltese	gndrrd_r.rc
Maori	gndrrd_c.rc
Mapudungun	gndrrd_c.rc
Marathi	gndrrd_b.rc
Meithei	gndrrd_b.rc
Mongolian	gndrrd_b.rc
Nepali	gndrrd_b.rc
Norwegian	gndrrd_b.rc
Occitan	gndrrd_c.rc
Oriya	gndrrd_b.rc
Pashto	gndrrd_b.rc
Persian	gndrrd_a.rc
Polish	gndrrd_j.rc
Portuguese (Brazilian)	gndrrd_c.rc
Portuguese (Portugal)	gndrrd_b.rc
Punjabi	gndrrd_b.rc
Romanian	gndrrd_f.rc
Romansh	gndrrd_b.rc
Russian	gndrrd_h.rc
Sakha	gndrrd_a.rc
Sami	gndrrd_b.rc
Serbian	gndrrd_h.rc
Sindhi	gndrrd_b.rc
Sinhalese	gndrrd_b.rc
Slovak	gndrrd_i.rc
Slovene	gndrrd_k.rc
Sotho	gndrrd_b.rc
Spanish	gndrrd_b.rc
Swedish	gndrrd_b.rc
Swahili	gndrrd_b.rc
Tajik	gndrrd_c.rc
Tamil	gndrrd_b.rc
Tatar	gndrrd_a.rc
Telugu	gndrrd_b.rc
Thai	gndrrd_a.rc
Tibetan	gndrrd_a.rc
Tigrinya	gndrrd_c.rc
Turkish	gndrrd_c.rc
Turkmen	gndrrd_b.rc
Ukrainian	gndrrd_h.rc
Urdu	gndrrd_b.rc
Uyghur	gndrrd_a.rc
Uzbek	gndrrd_c.rc
Vietnamese	gndrrd_a.rc
Walloon	gndrrd_c.rc
Welsh	gndrrd_n.rc
Wolof	gndrrd_a.rc
Yoruba	gndrrd_b.rc

Note that although not all existing languages are listed (because the information needed to define rule-sets for them was unavailable), it is likely that many of the missing languages share one of the existing plural-forms rules, just as a great many of the listed ones do.
