From: Robert Griesemer
Date: Thu, 12 May 2022 01:22:51 +0000 (-0700)
Subject: spec: use Unicode terminology consistently
X-Git-Tag: go1.19beta1~282
X-Git-Url: http://www.git.cypherpunks.su/?a=commitdiff_plain;h=1dfe994fe9e87e17b141a3f06c6a88632821020a;p=gostls13.git

spec: use Unicode terminology consistently

- refer to character "categories" rather than "classes" per the
  definitions in the Unicode standard
- use "uppercase", "lowercase" (one word) instead of "upper case"
  or "upper-case", matching the spelling in the Unicode standard
- clarify that the blank character "_" is considered a lowercase
  letter for Go's purposes (export of identifiers)

Fixes #44715.

Change-Id: I54ef177d26c6c56624662fcdd6d1da60b9bb8d02
Reviewed-on: https://go-review.googlesource.com/c/go/+/405758
Reviewed-by: Robert Griesemer
Reviewed-by: Ian Lance Taylor
Reviewed-by: Robert Findley
---

diff --git a/doc/go_spec.html b/doc/go_spec.html
index 4f647cac10..279dd279fa 100644
--- a/doc/go_spec.html
+++ b/doc/go_spec.html
@@ -1,6 +1,6 @@
@@ -53,7 +53,7 @@ operators, in increasing precedence:

-Lower-case production names are used to identify lexical tokens.
+Lowercase production names are used to identify lexical tokens.
 Non-terminals are in CamelCase. Lexical tokens are enclosed in
 double quotes "" or back quotes ``.

@@ -79,7 +79,7 @@ will use the unqualified term character to refer to a Unicode code point in the source text.

-Each code point is distinct; for instance, upper and lower case letters
+Each code point is distinct; for instance, uppercase and lowercase letters
 are different characters.

@@ -96,13 +96,13 @@ A byte order mark may be disallowed anywhere else in the source.

Characters

-The following terms are used to denote specific Unicode character classes:
+The following terms are used to denote specific Unicode character categories:

 newline        = /* the Unicode code point U+000A */ .
 unicode_char   = /* an arbitrary Unicode code point except newline */ .
-unicode_letter = /* a Unicode code point classified as "Letter" */ .
-unicode_digit  = /* a Unicode code point classified as "Number, decimal digit" */ .
+unicode_letter = /* a Unicode code point categorized as "Letter" */ .
+unicode_digit  = /* a Unicode code point categorized as "Number, decimal digit" */ .
 

@@ -115,7 +115,7 @@ as Unicode letters, and those in the Number category Nd as Unicode digits.

Letters and digits

-The underscore character _ (U+005F) is considered a letter.
+The underscore character _ (U+005F) is considered a lowercase letter.

 letter        = unicode_letter | "_" .
@@ -406,7 +406,7 @@ An imaginary literal represents the imaginary part of a
 complex constant.
 It consists of an integer or
 floating-point literal
-followed by the lower-case letter i.
+followed by the lowercase letter i.
 The value of an imaginary literal is the value of the respective
 integer or floating-point literal multiplied by the imaginary unit i.
 

@@ -2246,8 +2246,8 @@ An identifier may be exported to permit access to it from another package
 An identifier is exported if both:

-1. the first character of the identifier's name is a Unicode upper case
-   letter (Unicode class "Lu"); and
+1. the first character of the identifier's name is a Unicode uppercase
+   letter (Unicode character category Lu); and
 2. the identifier is declared in the package block
    or it is a field name or method name.
@@ -2761,8 +2761,8 @@ It is shorthand for a regular variable declarat
 with initializer expressions but no types:

-"var" IdentifierList = ExpressionList .
+"var" IdentifierList "=" ExpressionList .