From 37a097519facb21c29af821cbdfffbf9d48c045b Mon Sep 17 00:00:00 2001
From: Robert Griesemer
Date: Fri, 29 May 2015 17:36:26 -0700
Subject: [PATCH] spec: be precise about rune/string literals and comments

See #10248 for details.

Fixes #10248.

Change-Id: I373545b2dca5d1da1c7149eb0a8f6c6dd8071a4c
Reviewed-on: https://go-review.googlesource.com/10503
Reviewed-by: Russ Cox
Reviewed-by: Rob Pike
---
 doc/go_spec.html | 48 ++++++++++++++++++++++++------------------------
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/doc/go_spec.html b/doc/go_spec.html
index 9f29989d05..95406a1687 100644
--- a/doc/go_spec.html
+++ b/doc/go_spec.html
@@ -1,6 +1,6 @@
@@ -129,27 +129,27 @@ hex_digit = "0" … "9" | "A" … "F" | "a" … "f" .

Comments

-There are two forms of comments:
+Comments serve as program documentation. There are two forms:

  1. Line comments start with the character sequence //
-and stop at the end of the line. A line comment acts like a newline.
+and stop at the end of the line.
  2. General comments start with the character sequence /*
-and continue through the character sequence */. A general
-comment containing one or more newlines acts like a newline, otherwise it acts
-like a space.
+and stop with the first subsequent character sequence */.

-Comments do not nest.
+A comment cannot start inside a rune or
+string literal, or inside a comment.
+A general comment containing no newlines acts like a space.
+Any other comment acts like a newline.
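The revised comment rules can be illustrated with a small Go program. This is a minimal sketch, not part of the patch; the function names `commentInString` and `spaceComment` are hypothetical and chosen only for illustration.

```go
package main

import "fmt"

// commentInString returns text that looks like comment delimiters.
// Because a comment cannot start inside a string literal, these
// characters are ordinary string data (hypothetical helper).
func commentInString() string {
	return "// not a comment /* nor this */"
}

// spaceComment relies on a general comment with no newlines acting
// like a space: it separates the tokens "var" and "n" below
// (hypothetical helper).
func spaceComment() int {
	var/* acts like a space */n = 3
	return n
}

func main() {
	fmt.Println(commentInString())
	fmt.Println(spaceComment())
}
```

Without the comment, `varn` would be scanned as a single identifier, so the declaration compiles only because the comment separates the two tokens.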

-

Tokens

@@ -176,11 +176,8 @@ using the following two rules:

  1. -

    When the input is broken into tokens, a semicolon is automatically inserted
-into the token stream at the end of a non-blank line if the line's final
-token is
-
+into the token stream immediately after a line's final token if that token is
    • an identifier
@@ -357,9 +354,10 @@ imaginary_lit = (decimals | float_lit) "i" .

      A rune literal represents a rune constant, an integer value identifying a Unicode code point.
-A rune literal is expressed as one or more characters enclosed in single quotes.
-Within the quotes, any character may appear except single
-quote and newline. A single quoted character represents the Unicode value
+A rune literal is expressed as one or more characters enclosed in single quotes,
+as in 'x' or '\n'.
+Within the quotes, any character may appear except newline and unescaped single
+quote. A single quoted character represents the Unicode value
 of the character itself, while multi-character sequences beginning with a backslash encode values in various formats.
@@ -433,6 +431,7 @@ escaped_char = `\` ( "a" | "b" | "f" | "n" | "r" | "t" | "v" | `\` | "'" | `
 '\xff'
 '\u12e4'
 '\U00101234'
+'\'' // rune literal containing single quote character
 'aa' // illegal: too many characters
 '\xa' // illegal: too few hexadecimal digits
 '\0' // illegal: too few octal digits
@@ -449,8 +448,8 @@ obtained from concatenating a sequence of characters.
 There are two forms: raw string literals and interpreted string literals.
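The legal rune literals named above can be checked in a short Go program. This is a minimal sketch, not part of the patch; the function name `runeExamples` is hypothetical.

```go
package main

import "fmt"

// runeExamples returns a few legal rune literals taken from the
// text above (hypothetical helper).
func runeExamples() []rune {
	return []rune{
		'x',      // single character: its Unicode value, 120
		'\n',     // escaped newline: 10
		'\'',     // escaped single quote: 39
		'\xff',   // two-digit hexadecimal escape: 255
		'\u12e4', // four-digit Unicode escape: 0x12e4
	}
}

func main() {
	for _, r := range runeExamples() {
		fmt.Printf("%q = %d\n", r, r)
	}
}
```

The illegal forms listed in the spec, such as `'aa'` or `'\xa'`, are left out because they would prevent the program from compiling.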

-Raw string literals are character sequences between back quotes
-``. Within the quotes, any character is legal except
+Raw string literals are character sequences between back quotes, as in
+`foo`. Within the quotes, any character may appear except
 back quote. The value of a raw string literal is the string composed of the uninterpreted (implicitly UTF-8-encoded) characters between the quotes;
@@ -461,8 +460,9 @@ are discarded from the raw string value.

      Interpreted string literals are character sequences between double
-quotes "". The text between the quotes,
-which may not contain newlines, forms the
+quotes, as in "bar".
+Within the quotes, any character may appear except newline and unescaped double quote.
+The text between the quotes forms the
 value of the literal, with backslash escapes interpreted as they are in rune literals (except that \' is illegal and \" is legal), with the same restrictions.
@@ -484,17 +484,17 @@ interpreted_string_lit = `"` { unicode_value | byte_value } `"` .

      -`abc`  // same as "abc"
      +`abc`                // same as "abc"
       `\n
      -\n`    // same as "\\n\n\\n"
      +\n`                  // same as "\\n\n\\n"
       "\n"
      -""
      +"\""                 // same as `"`
       "Hello, world!\n"
       "日本語"
       "\u65e5本\U00008a9e"
       "\xff\u00FF"
      -"\uD800"       // illegal: surrogate half
      -"\U00110000"   // illegal: invalid Unicode code point
      +"\uD800"             // illegal: surrogate half
      +"\U00110000"         // illegal: invalid Unicode code point
       

--
2.50.0