• Adjacent string literals

    From James Kuyper@21:1/5 to All on Mon Jan 25 10:15:25 2021
    I learned a couple of decades ago that adjacent string literals get concatenated into a single longer literal, even if separated by
    arbitrarily large amounts of white-space.

    Yesterday I happened to notice that translation phase 6 says only that
    "Adjacent string literal tokens are concatenated.", without saying
    anything about white-space. White-space doesn't lose its significance
    until translation phase 7. Therefore, string literals that are separated
    by white-space do not qualify as adjacent. There's also no mention of
    white-space in the fuller discussion that occurs in 6.4.5p5.

    Am I missing something obvious here? I can imagine someone telling me
    that "adjacent" should be understood as "adjacent, ignoring white-space"
    - but that doesn't seem obvious to me. It also sounds vaguely familiar,
    like I've had this discussion with someone before, but I can't locate
    the discussion. Every example of adjacent string literals that appears
    in the standard has at least one white-space character separating them,
    so the intent is crystal-clear, but the wording doesn't clearly say so.

    If the phrase "White-space characters separating tokens are no longer
    significant." were moved from the beginning of the description of phase
    7 to the beginning of the description of phase 6, it would make the
    insignificance of white space separating string literals perfectly
    clear, and as far as I can see, would have no other effect.
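
    A minimal sketch of the behaviour in question, assuming a conforming
    hosted C compiler. The identifiers (no_space, one_space, multi_line,
    all_spellings_concatenate, check_spellings) are illustrative only and
    do not come from the standard. It shows what implementations do in
    practice, not what the wording of phase 6 strictly requires.

```c
#include <assert.h>
#include <string.h>

/* The same two literals with no white-space, one space, and several
   blank lines between the tokens. */
static const char no_space[]   = "a""b";
static const char one_space[]  = "a" "b";
static const char multi_line[] = "a"


                                 "b";

/* Returns 1 when every spelling yields the two-character string "ab". */
int all_spellings_concatenate(void)
{
    return strcmp(no_space, "ab") == 0
        && strcmp(one_space, "ab") == 0
        && strcmp(multi_line, "ab") == 0
        && sizeof one_space == 3;   /* 'a', 'b', and the terminating null */
}

/* Aborts via assert if any spelling failed to concatenate. */
void check_spellings(void)
{
    assert(all_spellings_concatenate());
}
```

    Calling check_spellings() from main should return normally on any
    compiler that treats white-space-separated literals as adjacent.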

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Jakob Bohm@21:1/5 to Ben Bacarisse on Tue Jan 26 13:48:26 2021
    On 2021-01-26 13:22, Ben Bacarisse wrote:
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    I learned a couple of decades ago that adjacent string literals get
    concatenated into a single longer literal, even if separated by
    arbitrarily large amounts of white-space.

    Yesterday I happened to notice that translation phase 6 says only that
    "Adjacent string literal tokens are concatenated.", without saying
    anything about white-space. White-space doesn't lose its significance
    until translation phase 7. Therefore, string literals that are separated
    by white-space do not qualify as adjacent. There's also no mention of
    white-space in the fuller discussion that occurs in 6.4.5p5.

    Am I missing something obvious here? I can imagine someone telling me
    that "adjacent" should be understood as "adjacent, ignoring white-space"
    - but that doesn't seem obvious to me.

    Surely it just means "next to", and in the sequence of tokens "a" "b"
    the two are next to each other. It happens that string literal tokens
    are such that they can be adjacent without having any white-space
    between them, but I suspect that's making you over-think the meaning.
    Would you say that 'long int x' has no tokens adjacent to any others?


    The interesting situation is cases like these:

    "a" /* Long comment explaining why b is the next byte */ "b"

    And

    #define LEAD_BYTE "a"
    #define TRAIL_BYTE "b"

    LEAD_BYTE TRAIL_BYTE

    Enjoy

    Jakob
    --
    Jakob Bohm, CIO, Partner, WiseMo A/S. https://www.wisemo.com
    Transformervej 29, 2860 Søborg, Denmark. Direct +45 31 13 16 10
    This public discussion message is non-binding and may contain errors.
    WiseMo - Remote Service Management for PCs, Phones and Embedded

  • From Richard Damon@21:1/5 to Ben Bacarisse on Tue Jan 26 07:52:42 2021
    On 1/26/21 7:22 AM, Ben Bacarisse wrote:
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    I learned a couple of decades ago that adjacent string literals get
    concatenated into a single longer literal, even if separated by
    arbitrarily large amounts of white-space.

    Yesterday I happened to notice that translation phase 6 says only that
    "Adjacent string literal tokens are concatenated.", without saying
    anything about white-space. White-space doesn't lose its significance
    until translation phase 7. Therefore, string literals that are separated
    by white-space do not qualify as adjacent. There's also no mention of
    white-space in the fuller discussion that occurs in 6.4.5p5.

    Am I missing something obvious here? I can imagine someone telling me
    that "adjacent" should be understood as "adjacent, ignoring white-space"
    - but that doesn't seem obvious to me.

    Surely it just means "next to", and in the sequence of tokens "a" "b"
    the two are next to each other. It happens that string literal tokens
    are such that they can be adjacent without having any white-space
    between them, but I suspect that's making you over-think the meaning.
    Would you say that 'long int x' has no tokens adjacent to any others?


    I'm not sure, but 6.4p3 says

    As described in 6.10, in certain circumstances during translation phase
    4, white space (or the absence thereof) serves as more than
    preprocessing token separation.

    which seems to imply that for most purposes (unless expressly stated)
    white-space between tokens is generally insignificant. There are cases
    where it matters, like the difference between

    #define macro(x) (x)
    and
    #define macro (x) (x)

    but these cases explicitly talk about the white-space affecting the
    meaning. This would seem to at least imply that it is to be ignored
    elsewhere, and thus the white-space between literals doesn't mean they
    aren't adjacent.
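
    A small sketch of the phase 4 case mentioned above, where white-space
    after the macro name changes the meaning. The macro names FN, OBJ, STR
    and XSTR and the function check_macro_spacing are hypothetical,
    chosen for illustration. Stringizing the expansions makes the
    difference visible.

```c
#include <assert.h>
#include <string.h>

/* Two-step stringification so the argument is macro-expanded first. */
#define STR(x)  #x
#define XSTR(x) STR(x)

#define FN(x) (x)     /* function-like: '(' immediately follows the name */
#define OBJ (x) (x)   /* object-like: the replacement list is (x) (x) */

/* Spelling of each expansion, captured as a string literal. */
static const char fn_expansion[]  = XSTR(FN(1));
static const char obj_expansion[] = XSTR(OBJ);

/* Aborts via assert if either expansion is not what is described above. */
void check_macro_spacing(void)
{
    assert(strcmp(fn_expansion, "(1)") == 0);
    assert(strcmp(obj_expansion, "(x) (x)") == 0);
}
```

    Only the space between the macro name and the opening parenthesis
    differs between the two definitions, yet one takes a parameter and
    the other does not.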

    It would seem that the removal of the possible significance could have
    been moved up earlier (but it has to be after phase 4, since that has an
    explicit use of white-space). As far as I can see, phases 5 and 6 don't
    need the white-space significance, but maybe the fact that phase 7 also
    converts preprocessing tokens into tokens says that we want to handle all
    the string literal stuff before doing that.

  • From Ben Bacarisse@21:1/5 to James Kuyper on Tue Jan 26 12:22:52 2021
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    I learned a couple of decades ago that adjacent string literals get
    concatenated into a single longer literal, even if separated by
    arbitrarily large amounts of white-space.

    Yesterday I happened to notice that translation phase 6 says only that
    "Adjacent string literal tokens are concatenated.", without saying
    anything about white-space. White-space doesn't lose its significance
    until translation phase 7. Therefore, string literals that are separated
    by white-space do not qualify as adjacent. There's also no mention of
    white-space in the fuller discussion that occurs in 6.4.5p5.

    Am I missing something obvious here? I can imagine someone telling me
    that "adjacent" should be understood as "adjacent, ignoring white-space"
    - but that doesn't seem obvious to me.

    Surely it just means "next to", and in the sequence of tokens "a" "b"
    the two are next to each other. It happens that string literal tokens
    are such that they can be adjacent without having any white-space
    between them, but I suspect that's making you over-think the meaning.
    Would you say that 'long int x' has no tokens adjacent to any others?

    --
    Ben.

  • From James Kuyper@21:1/5 to Ben Bacarisse on Tue Jan 26 09:29:31 2021
    On 1/26/21 7:22 AM, Ben Bacarisse wrote:
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    I learned a couple of decades ago that adjacent string literals get
    concatenated into a single longer literal, even if separated by
    arbitrarily large amounts of white-space.

    Yesterday I happened to notice that translation phase 6 says only that
    "Adjacent string literal tokens are concatenated.", without saying
    anything about white-space. White-space doesn't lose its significance
    until translation phase 7. Therefore, string literals that are separated
    by white-space do not qualify as adjacent. There's also no mention of
    white-space in the fuller discussion that occurs in 6.4.5p5.

    Am I missing something obvious here? I can imagine someone telling me
    that "adjacent" should be understood as "adjacent, ignoring white-space"
    - but that doesn't seem obvious to me.

    Surely it just means "next to", and in the sequence of tokens "a" "b"
    the two are next to each other. It happens that string literal tokens
    are such that they can be adjacent without having any white-space
    between them, but I suspect that's making you over-think the meaning.
    Would you say that 'long int x' has no tokens adjacent to any others?

    No, I would not - and that's precisely because "long int x" is not
    parsed as a declaration until translation phase 7, and the very first
    sentence of the description of that phase says "White-space characters
    separating tokens are no longer significant.". Phase 6 occurs before
    that sentence applies, which is precisely my point.

  • From Keith Thompson@21:1/5 to Jakob Bohm on Tue Jan 26 13:05:44 2021
    Jakob Bohm <jb-usenet@wisemo.com.invalid> writes:
    On 2021-01-26 13:22, Ben Bacarisse wrote:
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    I learned a couple of decades ago that adjacent string literals get
    concatenated into a single longer literal, even if separated by
    arbitrarily large amounts of white-space.

    Yesterday I happened to notice that translation phase 6 says only that
    "Adjacent string literal tokens are concatenated.", without saying
    anything about white-space. White-space doesn't lose its significance
    until translation phase 7. Therefore, string literals that are separated
    by white-space do not qualify as adjacent. There's also no mention of
    white-space in the fuller discussion that occurs in 6.4.5p5.

    Am I missing something obvious here? I can imagine someone telling me
    that "adjacent" should be understood as "adjacent, ignoring white-space"
    - but that doesn't seem obvious to me.

    Surely it just means "next to", and in the sequence of tokens "a" "b"
    the two are next to each other. It happens that string literal tokens
    are such that they can be adjacent without having any white-space
    between them, but I suspect that's making you over-think the meaning.
    Would you say that 'long int x' has no tokens adjacent to any others?

    The interesting situation is cases like these:

    "a" /* Long comment explaining why b is the next byte */ "b"

    And

    #define LEAD_BYTE "a"
    #define TRAIL_BYTE "b"

    LEAD_BYTE TRAIL_BYTE

    Sorry, but those cases aren't particularly interesting. Comments are
    replaced by spaces in translation phase 3, and macros are expanded in
    phase 4. Adjacent string literals are concatenated in phase 6.
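
    The phase ordering described above can be sketched as follows. This
    is a minimal illustration, assuming a conforming compiler. The names
    with_comment, with_macros and check_phase_order are made up for the
    example, while LEAD_BYTE and TRAIL_BYTE are taken from the quoted
    post.

```c
#include <assert.h>
#include <string.h>

#define LEAD_BYTE  "a"
#define TRAIL_BYTE "b"

/* Phase 3 replaces the comment with one space; phase 6 then concatenates. */
static const char with_comment[] = "a" /* why b is the next byte */ "b";

/* Phase 4 expands both macros; phase 6 then concatenates the results. */
static const char with_macros[] = LEAD_BYTE TRAIL_BYTE;

/* Aborts via assert if either form did not end up as "ab". */
void check_phase_order(void)
{
    assert(strcmp(with_comment, "ab") == 0);
    assert(strcmp(with_macros, "ab") == 0);
}
```

    By the time phase 6 runs, both forms are just two string literal
    tokens separated by white-space.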

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    Working, but not speaking, for Philips Healthcare
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Ben Bacarisse@21:1/5 to James Kuyper on Tue Jan 26 21:46:11 2021
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    On 1/26/21 7:22 AM, Ben Bacarisse wrote:
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    I learned a couple of decades ago that adjacent string literals get
    concatenated into a single longer literal, even if separated by
    arbitrarily large amounts of white-space.

    Yesterday I happened to notice that translation phase 6 says only that
    "Adjacent string literal tokens are concatenated.", without saying
    anything about white-space. White-space doesn't lose its significance
    until translation phase 7. Therefore, string literals that are separated
    by white-space do not qualify as adjacent. There's also no mention of
    white-space in the fuller discussion that occurs in 6.4.5p5.

    Am I missing something obvious here? I can imagine someone telling me
    that "adjacent" should be understood as "adjacent, ignoring white-space"
    - but that doesn't seem obvious to me.

    Surely it just means "next to", and in the sequence of tokens "a" "b"
    the two are next to each other. It happens that string literal tokens
    are such that they can be adjacent without having any white-space
    between them, but I suspect that's making you over-think the meaning.
    Would you say that 'long int x' has no tokens adjacent to any others?

    No, I would not - and that's precisely because "long int x" is not
    parsed as a declaration until translation phase 7, and the very first
    sentence of the description of that phase says "White-space characters
    separating tokens are no longer significant.". Phase 6 occurs before
    that sentence applies, which is precisely my point.

    I meant at the stage you were asking about: phase 6. The example was an
    attempt to find out if your reluctance to see "a" "b" as being adjacent
    was in part to do with the fact that they could have been written
    with no spaces.

    I think your answer makes it clear that, at phase 6, you think that
    there are no two tokens adjacent to one another. I find that a rather
    artificial reading.

    --
    Ben.

  • From Ben Bacarisse@21:1/5 to Jakob Bohm on Tue Jan 26 21:40:20 2021
    Jakob Bohm <jb-usenet@wisemo.com.invalid> writes:

    On 2021-01-26 13:22, Ben Bacarisse wrote:
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    I learned a couple of decades ago that adjacent string literals get
    concatenated into a single longer literal, even if separated by
    arbitrarily large amounts of white-space.

    Yesterday I happened to notice that translation phase 6 says only that
    "Adjacent string literal tokens are concatenated.", without saying
    anything about white-space. White-space doesn't lose its significance
    until translation phase 7. Therefore, string literals that are separated
    by white-space do not qualify as adjacent. There's also no mention of
    white-space in the fuller discussion that occurs in 6.4.5p5.

    Am I missing something obvious here? I can imagine someone telling me
    that "adjacent" should be understood as "adjacent, ignoring white-space"
    - but that doesn't seem obvious to me.

    Surely it just means "next to", and in the sequence of tokens "a" "b"
    the two are next to each other. It happens that string literal tokens
    are such that they can be adjacent without having any white-space
    between them, but I suspect that's making you over-think the meaning.
    Would you say that 'long int x' has no tokens adjacent to any others?


    The interesting situation is cases like these:

    "a" /* Long comment explaining why b is the next byte */ "b"

    By translation phase 6 (when adjacent string literals are concatenated)
    this has become

    "a" "b"

    And

    #define LEAD_BYTE "a"
    #define TRAIL_BYTE "b"

    LEAD_BYTE TRAIL_BYTE

    And this has become

    "a" "b"

    Am I missing some ambiguity?

    --
    Ben.

  • From James Kuyper@21:1/5 to Ben Bacarisse on Tue Jan 26 18:28:21 2021
    On 1/26/21 4:46 PM, Ben Bacarisse wrote:
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    On 1/26/21 7:22 AM, Ben Bacarisse wrote:
    ...
    No, I would not - and that's precisely because "long int x" is not
    parsed as a declaration until translation phase 7, and the very first
    sentence of the description of that phase says "White-space characters
    separating tokens are no longer significant.". Phase 6 occurs before
    that sentence applies, which is precisely my point.

    I meant at the stage you were asking about: phase 6. The example was an
    attempt to find out if your reluctance to see "a" "b" as being adjacent
    was in part to do with the fact that they could have been written
    with no spaces.

    Yes, it is. In "a""b", the two tokens are adjacent. In "a" "b", they are
    not, because both are adjacent to some white-space instead. I'm not
    suggesting that the committee intended to prohibit white space between
    the tokens, merely that the wording chosen doesn't clearly allow it.

    I think your answer makes it clear that, at phase 6, you think that
    there are no two tokens adjacent to one another. I find that a rather
    artificial reading.

    If they had used the term "consecutive", I could have seen that as a
    reasonable interpretation. "a" is one token, and "b" is the next token,
    even though they are separated by something, because that something
    isn't a token.

  • From Ben Bacarisse@21:1/5 to James Kuyper on Wed Jan 27 01:16:03 2021
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    On 1/26/21 4:46 PM, Ben Bacarisse wrote:
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    On 1/26/21 7:22 AM, Ben Bacarisse wrote:
    ...
    No, I would not - and that's precisely because "long int x" is not
    parsed as a declaration until translation phase 7, and the very first
    sentence of the description of that phase says "White-space characters
    separating tokens are no longer significant.". Phase 6 occurs before
    that sentence applies, which is precisely my point.

    I meant at the stage you were asking about: phase 6. The example was an
    attempt to find out if your reluctance to see "a" "b" as being adjacent
    was in part to do with the fact that they could have been written
    with no spaces.

    Yes, it is. In "a""b", the two tokens are adjacent. In "a" "b", they are
    not, because both are adjacent to some white-space instead.

    Adjacent does not mean with nothing in between (though it can, of
    course). What's more, things can be adjacent to each other, and also
    adjacent to something in between. I can say that there was a fire in
    the house adjacent to mine. The two houses are adjacent. But both are
    adjacent to the lane separating them.

    <cut>
    --
    Ben.

  • From James Kuyper@21:1/5 to Ben Bacarisse on Tue Jan 26 22:48:33 2021
    On 1/26/21 8:16 PM, Ben Bacarisse wrote:
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:
    ...
    Yes, it is. In "a""b", the two tokens are adjacent. In "a" "b", they are
    not, because both are adjacent to some white-space instead.

    Adjacent does not mean with nothing in between (though it can, of
    course). What's more, things can be adjacent to each other, and also
    adjacent to something in between. I can say that there was a fire in
    the house adjacent to mine. The two houses are adjacent. But both are
    adjacent to the lane separating them.

    It takes at least two dimensions for the issue you raise to come up. As
    far as the C standard is concerned, source code is a one-dimensional
    sequence of characters. It's possible to think of the text
    two-dimensionally, but the standard doesn't make use of that fact in any
    way that I'm aware of. I don't think anyone would suggest that two
    string literals that are vertically adjacent to each other:

    const char *first = "James";
    const char *second = "Kuyper";

    should be merged.
    Even if you acknowledge only that this is one possible way of
    interpreting "adjacent", that would mean the meaning is ambiguous.
    Moving the first sentence of translation phase 7 to be the first
    sentence of translation phase 6 would remove all ambiguity, and have, as
    far as I can see, no other consequence.

  • From Ben Bacarisse@21:1/5 to James Kuyper on Wed Jan 27 15:46:49 2021
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    On 1/26/21 8:16 PM, Ben Bacarisse wrote:
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:
    ...
    Yes, it is. In "a""b", the two tokens are adjacent. In "a" "b", they are
    not, because both are adjacent to some white-space instead.

    Adjacent does not mean with nothing in between (though it can, of
    course). What's more, things can be adjacent to each other, and also
    adjacent to something in between. I can say that there was a fire in
    the house adjacent to mine. The two houses are adjacent. But both are
    adjacent to the lane separating them.

    It takes at least two dimensions for the issue you raise to come up.

    I don't follow. 1 and 2 are adjacent integers on the real line
    (i.e. despite having other kinds of number between them). In addition,
    they are both integers adjacent to 1/2.

    As
    far as the C standard is concerned, source code is a one-dimensional
    sequence of characters. It's possible to think of the text
    two-dimensionally, but the standard doesn't make use of that fact in any
    way that I'm aware of. I don't think anyone would suggest that two
    string literals that are vertically adjacent to each other:

    const char *first = "James";
    const char *second = "Kuyper";

    should be merged.
    Even if you acknowledge only that this is one possible way of
    interpreting "adjacent", that would mean the meaning is ambiguous.

    Lots of words in the standard could, at a pinch, be taken to mean
    something other than what is obviously intended. But if you think
    someone might read about phase 6 and think that "a""b" will be
    concatenated but not "a" "b", then you should file a defect report.

    Moving the first sentence of translation phase 7 to be the first
    sentence of translation phase 6 would remove all ambiguity, and have, as
    far as I can see, no other consequence.

    I think the strongest case for the possibility of misunderstanding comes
    from this sentence being where it is. I don't see any problem with the
    word "adjacent", but I can imagine someone wondering why this sentence
    is where it is if not to do what you are suggesting.

    --
    Ben.

  • From James Kuyper@21:1/5 to Ben Bacarisse on Wed Jan 27 11:20:44 2021
    On 1/27/21 10:46 AM, Ben Bacarisse wrote:
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    On 1/26/21 8:16 PM, Ben Bacarisse wrote:
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:
    ...
    Yes, it is. In "a""b", the two tokens are adjacent. In "a" "b", they are
    not, because both are adjacent to some white-space instead.

    Adjacent does not mean with nothing in between (though it can, of
    course). What's more, things can be adjacent to each other, and also
    adjacent to something in between. I can say that there was a fire in
    the house adjacent to mine. The two houses are adjacent. But both are
    adjacent to the lane separating them.

    It takes at least two dimensions for the issue you raise to come up.

    I don't follow. 1 and 2 are adjacent integers on the real line
    (i.e. despite having other kinds of number between them). In addition,
    they are both integers adjacent to 1/2.

    I'm not familiar with any meaning that could reasonably be attached to
    "adjacent" which would make either of those statements true. In the
    future, I will try to remember that there's at least one person who does
    attach such a meaning to that word - but it would make it easier for me
    to understand how you could say such a thing if you would specify that
    definition.

    When using a meaning that allows 1 and 2 to be both adjacent to 1/2,
    while also being adjacent to each other, how do you interpret "adjacent
    string literal" so that it doesn't apply to

    ptrdiff_t d = "Ben"-"Bacarisse";

    It seems to me that, despite having no idea how you could possibly mean
    what you seem to have said, I can make a direct analogy, matching 1 with
    "Ben", 1/2 with '-', and 2 with "Bacarisse". So, how does that analogy
    break down? Or are you claiming that they should be concatenated?

    ...
    Moving the first sentence of translation phase 7 to be the first
    sentence of translation phase 6 would remove all ambiguity, and have, as
    far as I can see, no other consequence.

    I think the strongest case for the possibility of misunderstanding comes
    from this sentence being where it is. I don't see any problem with the
    word "adjacent", but I can imagine someone wondering why this sentence
    is where it is if not to do what you are suggesting.

    I think you just agreed with me, but you didn't quite say so directly.

  • From Ben Bacarisse@21:1/5 to James Kuyper on Thu Jan 28 03:05:24 2021
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    On 1/27/21 10:46 AM, Ben Bacarisse wrote:
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    On 1/26/21 8:16 PM, Ben Bacarisse wrote:
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:
    ...
    Yes, it is. In "a""b", the two tokens are adjacent. In "a" "b", they are
    not, because both are adjacent to some white-space instead.

    Adjacent does not mean with nothing in between (though it can, of
    course). What's more, things can be adjacent to each other, and also
    adjacent to something in between. I can say that there was a fire in
    the house adjacent to mine. The two houses are adjacent. But both are
    adjacent to the lane separating them.

    It takes at least two dimensions for the issue you raise to come up.

    I don't follow. 1 and 2 are adjacent integers on the real line
    (i.e. despite having other kinds of number between them). In addition,
    they are both integers adjacent to 1/2.

    I'm not familiar with any meaning that could reasonably be attached to
    "adjacent" which would make either of those statements true.

    That's an interesting view, but probably so off-topic that it would not
    be reasonable to investigate it here.

    In the future, I will try to remember that there's at least one person
    who does attach such a meaning to that word - but it would make it
    easier for me to understand how you could say such a thing if you
    would specify that definition.

    I am not a lexicographer, and not skilled at writing definitions. So I
    looked in the two dictionaries on the shelf here. The OED says:

    "Lying near to; adjoining; bordering. (Not necessarily touching.)"

    and Collins says

    "being near or close, esp. having a common boundary; adjoining;
    contiguous."

    These are pretty close to what I feel the word means.

    For comparison, what is your understanding of the word?

    When using a meaning that allows 1 and 2 to be both adjacent to 1/2,
    while also being adjacent to each other, how do you interpret "adjacent
    string literal" so that it doesn't apply to

    ptrdiff_t d = "Ben"-"Bacarisse";

    It seems to me that, despite having no idea how you could possibly mean
    what you seem to have said, I can make a direct analogy, matching 1 with
    "Ben", 1/2 with '-', and 2 with "Bacarisse". So, how does that analogy
    break down? Or are you claiming that they should be concatenated?

    It depends on what is considered significant and what is merely a
    separator or common boundary.

    On the number line, we can stress what we want to focus on. "Adjacent
    /integers/" relegates everything else to being a mere separating
    boundary.

    So, to push the point to the edge of reason, if I choose to read the key
    sentence as "Adjacent /string literal/ tokens are concatenated", I
    could, at a pinch, make the case that "Ben" and "Bacarisse" are, in your
    example, adjacent. The context would have to be such that considering
    another token as a mere boundary or separator would be reasonable. The
    C standard is not such a context.

    But if I read it as "Adjacent string literal /tokens/ are concatenated",
    then the intervening token stops them being adjacent. When tokenising a
    character stream, all the tokens matter, so I believe there is only one
    reasonable way to read that sentence.
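
    A minimal sketch of an intervening token preventing concatenation,
    using a comma rather than the '-' of the quoted example so that the
    behaviour is well defined. The names with_comma, without_comma,
    count_with_comma, count_without_comma and check_intervening_token
    are illustrative assumptions, not anything from the standard.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A comma token sits between the literals: two separate array elements. */
static const char *const with_comma[] = { "Ben", "Bacarisse" };

/* Only white-space sits between the literals: one concatenated element. */
static const char *const without_comma[] = { "Ben" "Bacarisse" };

/* Number of elements in each array. */
size_t count_with_comma(void)
{
    return sizeof with_comma / sizeof with_comma[0];
}

size_t count_without_comma(void)
{
    return sizeof without_comma / sizeof without_comma[0];
}

/* Aborts via assert if the element counts are not as described above. */
void check_intervening_token(void)
{
    assert(count_with_comma() == 2);
    assert(count_without_comma() == 1);
    assert(strcmp(without_comma[0], "BenBacarisse") == 0);
}
```

    The only difference between the two initializers is one token, which
    is enough to stop the literals being treated as adjacent.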

    ...
    Moving the first sentence of translation phase 7 to be the first
    sentence of translation phase 6 would remove all ambiguity, and have, as
    far as I can see, no other consequence.

    I think the strongest case for the possibility of misunderstanding comes
    from this sentence being where it is. I don't see any problem with the
    word "adjacent", but I can imagine someone wondering why this sentence
    is where it is if not to do what you are suggesting.

    I think you just agreed with me, but you didn't quite say so directly.

    Agreement is not binary. I don't find your argument based on what
    adjacent means to be compelling, but I agree that the presence of that
    sentence one phase too late muddies the waters a bit.

    I've tried to express the extent and the nature of my agreement (and
    disagreement) as directly as I can. I'm sorry if you think I have been
    oblique.

    TL;DR: The fact that adjacent means something in the cluster of ideas
    around "being near to" and "having a common boundary, but not
    necessarily touching" means that I don't think there is any problem with
    "a" "b" being described as adjacent string literal tokens.

    --
    Ben.

  • From Jakob Bohm@21:1/5 to Ben Bacarisse on Thu Jan 28 09:53:34 2021
    On 2021-01-26 22:40, Ben Bacarisse wrote:
    Jakob Bohm <jb-usenet@wisemo.com.invalid> writes:

    On 2021-01-26 13:22, Ben Bacarisse wrote:
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    I learned a couple of decades ago that adjacent string literals get
    concatenated into a single longer literal, even if separated by
    arbitrarily large amounts of white-space.

    Yesterday I happened to notice that translation phase 6 says only that
    "Adjacent string literal tokens are concatenated.", without saying
    anything about white-space. White-space doesn't lose its significance
    until translation phase 7. Therefore, string literals that are separated
    by white-space do not qualify as adjacent. There's also no mention of
    white-space in the fuller discussion that occurs in 6.4.5p5.

    Am I missing something obvious here? I can imagine someone telling me
    that "adjacent" should be understood as "adjacent, ignoring white-space"
    - but that doesn't seem obvious to me.

    Surely it just means "next to", and in the sequence of tokens "a" "b"
    the two are next to each other. It happens that string literal tokens
    are such that they can be adjacent without having any white-space
    between them, but I suspect that's making you over-think the meaning.
    Would you say that 'long int x' has no tokens adjacent to any others?


    The interesting situation is cases like these:

    "a" /* Long comment explaining why b is the next byte */ "b"

    By translation phase 6 (when adjacent string literals are concatenated)
    this has become

    "a" "b"

    And

    #define LEAD_BYTE "a"
    #define TRAIL_BYTE "b"

    LEAD_BYTE TRAIL_BYTE

    And this has become

    "a" "b"

    Am I missing some ambiguity?


    Sorry, but I couldn't easily find the definition of the translation
    phases, only scattered mentions of "phase 6" and "phase 7", so I had to
    guess which practically related language features were buried in that
    distinction.



    Enjoy

    Jakob
    --
    Jakob Bohm, CIO, Partner, WiseMo A/S. https://www.wisemo.com
    Transformervej 29, 2860 Søborg, Denmark. Direct +45 31 13 16 10
    This public discussion message is non-binding and may contain errors.
    WiseMo - Remote Service Management for PCs, Phones and Embedded

  • From James Kuyper@21:1/5 to Jakob Bohm on Thu Jan 28 05:45:46 2021
    On 1/28/21 3:53 AM, Jakob Bohm wrote:
    On 2021-01-26 22:40, Ben Bacarisse wrote:
    Jakob Bohm <jb-usenet@wisemo.com.invalid> writes:

    On 2021-01-26 13:22, Ben Bacarisse wrote:
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    I learned a couple of decades ago that adjacent string literals get
    concatenated into a single longer literal, even if separated by
    arbitrarily large amounts of white-space.

    Yesterday I happened to notice that translation phase 6 says only that
    "Adjacent string literal tokens are concatenated.", without saying
    anything about white-space. White-space doesn't lose its significance
    until translation phase 7. Therefore, string literals that are separated
    by white-space do not qualify as adjacent. There's also no mention of
    white-space in the fuller discussion that occurs in 6.4.5p5.

    Am I missing something obvious here? I can imagine someone telling me
    that "adjacent" should be understood as "adjacent, ignoring white-space"
    - but that doesn't seem obvious to me.

    Surely it just means "next to", and in the sequence of tokens "a" "b"
    the two are next to each other. It happens that string literal tokens
    are such that they can be adjacent without having any white-space
    between them, but I suspect that's making you over-think the meaning.
    Would you say that 'long int x' has no tokens adjacent to any others?


    The interesting situation is cases like these:

    "a" /* Long comment explaining why b is the next byte */ "b"

    By translation phase 6 (when adjacent string literals are concatenated)
    this has become

    "a" "b"

    And

    #define LEAD_BYTE "a"
    #define TRAIL_BYTE "b"

    LEAD_BYTE TRAIL_BYTE

    And this has become

    "a" "b"

    Am I missing some ambiguity?


    Sorry, but I couldn't easily find the definition of the translation
    phases, only scattered mentions of "phase 6" and "phase 7", so I had to
    guess which practically related language features were buried in that distinction.

    "5.1.1.2 Translation Phases
    The precedence among the syntax rules of translation is specified by the following
    phases. 6)
    1. Physical source file multibyte characters are mapped, in an
    implementation-defined manner, to the source character set (introducing
    new-line characters for end-of-line indicators) if necessary. Trigraph
    sequences are replaced by corresponding single-character internal
    representations.
    2. Each instance of a backslash character (\) immediately followed by a new-line character is deleted, splicing physical source lines to form
    logical source lines. Only the last backslash on any physical source
    line shall be eligible for being part of such a splice. A source file
    that is not empty shall end in a new-line character, which shall not be immediately preceded by a backslash character before any such splicing
    takes place.
    3. The source file is decomposed into preprocessing tokens 7) and
    sequences of white-space characters (including comments). A source file
    shall not end in a partial preprocessing token or in a partial comment.
    Each comment is replaced by one space character. New-line characters are retained. Whether each nonempty sequence of white-space characters other
    than new-line is retained or replaced by one space character is implementation-defined.
    4. Preprocessing directives are executed, macro invocations are
    expanded, and _Pragma unary operator expressions are executed. If a
    character sequence that matches the syntax of a universal character name
    is produced by token concatenation (6.10.3.3), the behavior is
    undefined. A #include preprocessing directive causes the named header or
    source file to be processed from phase 1 through phase 4, recursively.
    All preprocessing directives are then deleted.
    5. Each source character set member and escape sequence in character
    constants and string literals is converted to the corresponding member
    of the execution character set; if there is no corresponding member, it
    is converted to an implementation-defined member other than the null
    (wide) character. 8)
    6. Adjacent string literal tokens are concatenated.
    7. White-space characters separating tokens are no longer significant.
    Each preprocessing token is converted into a token. The resulting tokens
    are syntactically and semantically analyzed and translated as a
    translation unit.
    8. All external object and function references are resolved. Library
    components are linked to satisfy external references to functions and
    objects not defined in the current translation. All such translator
    output is collected into a program image which contains information
    needed for execution in its execution environment."

    The referenced footnotes are:
    "6) Implementations shall behave as if these separate phases occur, even
    though many are typically folded together in practice. Source files, translation units, and translated translation units need not necessarily
    be stored as files, nor need there be any one-to-one correspondence
    between these entities and any external representation. The description
    is conceptual only, and does not specify any particular implementation.
    7) As described in 6.4, the process of dividing a source file’s
    characters into preprocessing tokens is context-dependent. For example,
    see the handling of < within a #include preprocessing directive.
    8) An implementation need not convert all non-corresponding source
    characters to the same execution character."
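
    The ordering of phases 5 and 6 has an observable consequence, which the
    following sketch tries to show. It assumes a C11 (or later) compiler;
    the array names and the helper second_char() are illustrative only.

```c
#include <assert.h>

/* Phase 5 converts the escape sequence \x12 to a single character
   before phase 6 joins the literals, so the '3' cannot extend the
   hexadecimal escape. The array holds '\x12', '3' and '\0'. */
static const char two_chars[] = "\x12" "3";

/* A new-line between the literals is white-space separating two
   preprocessing tokens; the result is the single string "abcd". */
static const char multi_line[] = "ab"
                                 "cd";

static_assert(sizeof two_chars == 3, "escape is not extended by 3");
static_assert(sizeof multi_line == 5, "literals on two lines are joined");

/* Returns the second character of two_chars, which is '3'. */
int second_char(void)
{
    return two_chars[1];
}
```

    Writing "\x123" as a single literal would instead ask for one character
    with the value 0x123, which is why splitting a literal after a
    hexadecimal escape is a common idiom.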

  • From Tim Rentsch@21:1/5 to James Kuyper on Sat Jul 10 08:49:07 2021
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    I learned a couple of decades ago that adjacent string literals get concatenated into a single longer literal, even if separated by
    arbitrarily large amounts of white-space.

    Yesterday I happened to notice that translation phase 6 says only that "Adjacent string literal tokens are concatenated.", without saying
    anything about white-space. White-space doesn't lose its significance
    until translation phase 7. Therefore, string literals that are separated
    by white-space do not qualify as adjacent. There's also no mention of white-space in the fuller discussion that occurs in 6.4.5p5.

    Am I missing something obvious here? I can imagine someone telling me
    that "adjacent" should be understood as "adjacent, ignoring white-space"
    - but that doesn't seem obvious to me. It also sounds vaguely familiar,
    like I've had this discussion with someone before, but I can't locate
    the discussion. Every example of adjacent string literals that appears
    in the standard has at least one white-space character separating them,
    so the intent is crystal-clear, but the wording doesn't clearly say so.

    If the phrase "White-space characters separating tokens are no longer significant." were moved from the beginning of the description of phase
    7 to the beginning of the description phase 6, it would make the insignificance of white space separating string literals perfectly
    clear, and as far as I can see, would have no other effect

    The word "adjacent" doesn't always mean touching. There is another
    word for that, the word "adjoining". Booking a hotel reservation
    for adjacent rooms is not the same as a reservation for adjoining
    rooms.

  • From Keith Thompson@21:1/5 to Tim Rentsch on Sat Jul 10 14:58:59 2021
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:
    [...]
    If the phrase "White-space characters separating tokens are no longer
    significant." were moved from the beginning of the description of phase
    7 to the beginning of the description phase 6, it would make the
    insignificance of white space separating string literals perfectly
    clear, and as far as I can see, would have no other effect

    The word "adjacent" doesn't always mean touching. There is another
    word for that, the word "adjoining". Booking a hotel reservation
    for adjacent rooms is not the same as a reservation for adjoining
    rooms.

    That's not entirely clear. dictionary.com (not a definitive reference
    but a convenient one) shows "adjoining" as one of the definitions of "adjacent".

    If I understand you correctly, if rooms 110 and 112 share a common wall, perhaps with a door going between them, they're both adjacent and
    adjoining, but if instead they're on opposite sides of the elevator
    they're adjacent but not adjoining. Is that what you meant? I'm not
    sure I'd call them "adjacent" in that case.

    A footnote on "Adjacent string literals are concatenated" saying that
    two string literals are adjacent if they're adjoining or separated only
    by white-space characters would clear this up. Moving "White-space
    characters separating tokens are no longer significant." from the
    beginning of phase 7 to the beginning of phase 6 would also be a good
    solution.

    But given the clear examples, I wouldn't object to leaving it as it is.

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    Working, but not speaking, for Philips
    void Void(void) { Void(); } /* The recursive call of the void */

  • From James Kuyper@21:1/5 to Tim Rentsch on Sun Jul 11 11:41:49 2021
    On Saturday, July 10, 2021 at 11:49:09 AM UTC-4, Tim Rentsch wrote:
    James Kuyper <james...@alumni.caltech.edu> writes:

    I learned a couple of decades ago that adjacent string literals get concatenated into a single longer literal, even if separated by
    arbitrarily large amounts of white-space.

    Yesterday I happened to notice that translation phase 6 says only that "Adjacent string literal tokens are concatenated.", without saying
    anything about white-space. White-space doesn't lose its significance until translation phase 7. Therefore, string literals that are separated
    by white-space do not qualify as adjacent. There's also no mention of white-space in the fuller discussion that occurs in 6.4.5p5.

    Am I missing something obvious here? I can imagine someone telling me
    that "adjacent" should be understood as "adjacent, ignoring white-space"
    - but that doesn't seem obvious to me. It also sounds vaguely familiar, like I've had this discussion with someone before, but I can't locate
    the discussion. Every example of adjacent string literals that appears
    in the standard has at least one white-space character separating them,
    so the intent is crystal-clear, but the wording doesn't clearly say so.

    If the phrase "White-space characters separating tokens are no longer significant." were moved from the beginning of the description of phase
    7 to the beginning of the description phase 6, it would make the insignificance of white space separating string literals perfectly
    clear, and as far as I can see, would have no other effect
    The word "adjacent" doesn't always mean touching. There is another
    word for that, the word "adjoining". Booking a hotel reservation
    for adjacent rooms is not the same as a reservation for adjoining
    rooms.

    But, if it doesn't mean "touching", what does it mean? If a blank space
    doesn't prevent them from being adjacent, what does? How do you
    draw the line between things that do prevent two string literals from
    being adjacent, and things that don't? And - most importantly, where
    in the actual text of the standard does it clearly make that distinction?
    I contend that it doesn't clearly make that distinction anywhere, but
    that moving the sentence "White-space characters separating
    tokens are no longer significant." from the beginning of phase 7 to
    the beginning of phase 6 would remove all ambiguity, making the text
    match the way all real world implementations actually handle this
    issue, and would have no other effect. Do you disagree? If so, with
    which part of what I just said, and for what reason?

  • From Tim Rentsch@21:1/5 to Keith Thompson on Thu Jul 22 10:29:33 2021
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    [...]

    If the phrase "White-space characters separating tokens are no longer
    significant." were moved from the beginning of the description of phase
    7 to the beginning of the description phase 6, it would make the
    insignificance of white space separating string literals perfectly
    clear, and as far as I can see, would have no other effect

    The word "adjacent" doesn't always mean touching. There is another
    word for that, the word "adjoining". Booking a hotel reservation
    for adjacent rooms is not the same as a reservation for adjoining
    rooms.

    That's not entirely clear. dictionary.com (not a definitive reference
    but a convenient one) shows "adjoining" as one of the definitions of "adjacent".

    That's consistent with what I said: "adjoining" being only one
    of the definitions is consistent with saying "adjacent" doesn't
    _always_ mean touching. Words in English can be ambiguous in
    their meanings.

    If I understand you correctly, if rooms 110 and 112 share a common wall, perhaps with a door going between them, they're both adjacent and
    adjoining,

    In the case of hotels I think "adjoining" always means connected,
    either with or perhaps without a door, but yes.

    but if instead they're on opposite sides of the elevator
    they're adjacent but not adjoining. Is that what you meant? I'm not
    sure I'd call them "adjacent" in that case.

    A better example is a small utility closet rather than an
    elevator. "Adjacent" usually implies "closeness" even if
    it doesn't always mean touching, and two rooms with a bank
    of four elevators between them would for most people not
    be considered adjacent, I think. In the case of hotel
    rooms at least it's a matter of degree.

    Another example is two rooms having the same latitude and
    longitude, but on different (consecutive) floors. I think most
    people wouldn't call those rooms "adjacent". However, if there
    is a connecting stairway between them, a hotel might very well
    offer them as "adjoining rooms".

    A footnote on "Adjacent string literals are concatenated" saying that
    two string literals are adjacent if they're adjoining or separated only
    by white-space characters would clear this up. Moving "White-space characters separating tokens are no longer significant." from the
    beginning of phase 7 to the beginning of phase 6 would also be a good solution.

    But given the clear examples, I wouldn't object to leaving it as it is.

    Given that the wording lasted more than 30 years without anyone
    even noticing a problem, I think the case for leaving it alone
    is decidedly stronger than the case for making a change.

  • From Tim Rentsch@21:1/5 to James Kuyper on Thu Jul 22 15:26:15 2021
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    On Saturday, July 10, 2021 at 11:49:09 AM UTC-4, Tim Rentsch wrote:

    James Kuyper <james...@alumni.caltech.edu> writes:

    I learned a couple of decades ago that adjacent string literals get
    concatenated into a single longer literal, even if separated by
    arbitrarily large amounts of white-space.

    Yesterday I happened to notice that translation phase 6 says only that
    "Adjacent string literal tokens are concatenated.", without saying
    anything about white-space. White-space doesn't lose its significance
    until translation phase 7. Therefore, string literals that are separated
    by white-space do not qualify as adjacent. There's also no mention of
    white-space in the fuller discussion that occurs in 6.4.5p5.

    Am I missing something obvious here? I can imagine someone telling me
    that "adjacent" should be understood as "adjacent, ignoring white-space"
    - but that doesn't seem obvious to me. It also sounds vaguely familiar,
    like I've had this discussion with someone before, but I can't locate
    the discussion. Every example of adjacent string literals that appears
    in the standard has at least one white-space character separating them,
    so the intent is crystal-clear, but the wording doesn't clearly say so.

    If the phrase "White-space characters separating tokens are no longer
    significant." were moved from the beginning of the description of phase
    7 to the beginning of the description phase 6, it would make the
    insignificance of white space separating string literals perfectly
    clear, and as far as I can see, would have no other effect

    The word "adjacent" doesn't always mean touching. There is another
    word for that, the word "adjoining". Booking a hotel reservation
    for adjacent rooms is not the same as a reservation for adjoining
    rooms.

    But, if it doesn't mean "touching", what does it mean?

    In hotels, normally it means on the same floor and with no
    intervening rooms or other major building structures (but small
    things like utility closets don't count). In a country inn where
    there are standalone cottages rather than rooms, two cottages
    would normally be called adjacent if there were no other cottages
    in between, and the cottages in question were not inordinately far
    apart.

    In the C standard it means having no intervening tokens.

    If a blank space
    doesn't prevent them from being adjacent, what does?

    Another token (not a string literal token, presumably, but only
    because we might consider a sequence of string literal tokens
    to be "adjacent tokens").

    How do you
    draw the line between things that do prevent two string literals from
    being adjacent, and things that don't?

    In the text of the C standard, the word "adjacent" is an adjective
    modifying the noun "tokens", and hence tokens are what matters.
    The line is drawn by normal English usage.
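
    A small sketch of the "no intervening tokens" reading, using the
    familiar missing-comma case. It assumes a C11 (or later) compiler; the
    array names, the COUNT macro and missing_comma_count() are hypothetical
    names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>

/* A comma token sits between every pair of literals: three elements. */
static const char *const with_commas[] = { "red", "green", "blue" };

/* The comma after "green" is missing, so no token separates "green"
   from "blue" and they become the single literal "greenblue". */
static const char *const missing_comma[] = { "red", "green" "blue" };

#define COUNT(a) (sizeof (a) / sizeof (a)[0])

static_assert(COUNT(with_commas) == 3, "commas keep the literals apart");
static_assert(COUNT(missing_comma) == 2, "no token between: concatenated");

/* Returns the number of elements in missing_comma, which is 2. */
size_t missing_comma_count(void)
{
    return COUNT(missing_comma);
}
```

    White-space between "green" and "blue" makes no difference to the
    element count here; only the presence or absence of the comma token
    does.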

    And - most importantly, where in the actual text of the standard
    does it clearly make that distinction?

    That depends in part on one's notion of what it means "to clearly
    make" a distinction. Speaking for myself, the combination of
    "adjacent" modifying "tokens" and the examples given in 6.4.5 make
    the distinction quite clearly enough.

    I contend that it doesn't clearly make that distinction anywhere,

    If I may make a suggestion, how you read the C standard doesn't
    match the reading mode expected by its authors. The C standard
    wasn't written for a target audience of lawyers or mathematicians,
    but by practical software developers expecting it would be read by
    other practical software developers. The issue suggested here is
    way below their radar, and indeed way below the radar of most
    people who read the C standard. If no one else has noticed it in
    more than 30 years, what does that say about how clear or unclear
    the distinction is?

    but
    that moving the sentence "White-space characters separating
    tokens are no longer significant." from the beginning of phase 7 to
    the beginning of phase 6 would remove all ambiguity, making the text
    match the way all real world implementations actually handle this
    issue, and would have no other effect. Do you disagree?

    I don't either agree or disagree, because I think the extremely
    low probability of anyone being confused makes it not worth the
    effort of investigating the question.

    If so, with which part of what I just said, and for what reason?

    If there is something I disagree with, I think it's the idea that
    attempting to "clarify" the language here will necessarily result
    in a net benefit. Consider for example the C++ standard: its
    authors apparently strive for exact and precise (and presumably
    ambiguity free) phrasing, but the result is an unreadable mess.
    To me it seems obvious that the writing in the C standard is much
    closer to a good balance point between being formally exact and
    being understandable. From my point of view, if writing in the C
    standard (or other similar standards) isn't understandable, it's
    useless, no matter how precise or exact it is. In this particular
    case I would say the current wording is definitely on the right
    side of the line.

  • From James Kuyper@21:1/5 to Tim Rentsch on Thu Jul 22 17:29:18 2021
    On Thursday, July 22, 2021 at 6:26:22 PM UTC-4, Tim Rentsch wrote:
    James Kuyper <james...@alumni.caltech.edu> writes:
    On Saturday, July 10, 2021 at 11:49:09 AM UTC-4, Tim Rentsch wrote:
    ...
    The word "adjacent" doesn't always mean touching. There is another
    word for that, the word "adjoining". Booking a hotel reservation
    for adjacent rooms is not the same as a reservation for adjoining
    rooms.

    But, if it doesn't mean "touching", what does it mean?
    In hotels, normally it means on the same floor and with no
    intervening rooms or other major building structures (but small
    things like utility closets don't count). In a country inn where
    there are standalone cottages rather than rooms, two cottages
    would normally be called adjacent if there were no other cottages
    in between, and the cottages in question were not inordinately far
    apart.

    In the C standard it means having no intervening tokens.
    If a blank space
    doesn't prevent them from being adjacent, what does?
    Another token (not a string literal token, presumably, ...

    I think your wording got a little confused there. In "A""B""C", the "B"
    string literal token definitely does prevent the "A" and "C" string literal tokens from being considered adjacent. An implementation would
    certainly be non-conforming if it concatenated "A" directly to "C" without first concatenating one or the other with "B".
    The following wording may be intended to address that issue:

    ... but only
    because we might consider a sequence of string literal tokens
    to be "adjacent tokens").

    but it's not very clear that it does. The simpler approach is to say that
    the one thing that unambiguously DOES prevent two string literal tokens
    from being considered adjacent is another string literal token. The only
    real question is whether there's anything else that does so.
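
    For reference, a sketch of how the three-literal case behaves in
    practice, with and without white-space. It assumes a C11 (or later)
    compiler and shows only what implementations do, not what the wording
    requires; the names touching, spaced and order_is_preserved() are
    illustrative.

```c
#include <assert.h>
#include <string.h>

/* No white-space at all between the three literals. */
static const char touching[] = "A""B""C";

/* The same three literals separated by spaces and a new-line. */
static const char spaced[] = "A"   "B"
                             "C";

static_assert(sizeof touching == 4, "expected the string ABC");
static_assert(sizeof spaced == 4, "expected the string ABC");

/* Returns 1 if both forms yield "ABC" with "B" in the middle. */
int order_is_preserved(void)
{
    return strcmp(touching, "ABC") == 0 && strcmp(spaced, "ABC") == 0;
}
```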

    It would make much more sense for pre-processing tokens to serve as
    separators, rather than tokens, since tokens don't exist yet during
    translation phase 6 - they don't come into existence until they are
    created by conversion from pre-processing tokens during translation
    phase 7. String literals are members of both categories. header-names
    are removed during translation phase 4, but all of the other differences between pre-processing tokens and tokens remain valid during phase 6.

    However, since white-space characters separating tokens supposedly
    remain significant until translation phase 7, the same logic that
    favors pre-processing tokens over tokens also favors including
    white-space characters as separators. If they are still significant in
    phase 6, how are they significant, if not as separators of string
    literal tokens? I don't claim that this was the committee's intent
    (which is irrelevant to my mode of reading the standard), only that
    it's an unintentional side effect of putting the wording about
    white-space characters in the wrong translation phase, which should be
    corrected.
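
    One place where the phase ordering is visible is a string literal that
    only comes into existence during phase 4. The sketch below assumes a
    C99 (or later) compiler; STR, XSTR, VERSION and banner_matches() are
    hypothetical names, and 42 is an arbitrary value.

```c
#include <assert.h>
#include <string.h>

#define STR(x)  #x       /* phase 4: # produces a string literal */
#define XSTR(x) STR(x)   /* expand the argument first, then stringify */
#define VERSION 42       /* arbitrary value for illustration */

/* After phase 4 the token sequence is "version " "42" "\n";
   phase 6 then joins the three string literals into one. */
static const char banner[] = "version " XSTR(VERSION) "\n";

/* Returns 1 if the concatenated banner is "version 42\n". */
int banner_matches(void)
{
    return strcmp(banner, "version 42\n") == 0;
}
```

    The white-space around XSTR(VERSION) survives macro expansion, so this
    is another case where the joined literals are separated by white-space
    at the point phase 6 runs.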

    ...
    I contend that it doesn't clearly make that distinction anywhere,
    If I may make a suggestion, how you read the C standard doesn't
    match the reading mode expected by its authors. ...

    Your reading mode puts too much emphasis on guessing the intent of
    the authors, and not enough on trying to write the text clearly enough to
    avoid the need for such guesswork. You might be right that it is the
    intended reading mode, but if so, I consider it a seriously flawed one.

    ...
    ... If no one else has noticed it in
    more than 30 years, what does that say about how clear or unclear
    the distinction is?

    You can't be sure that no one else has noticed it, only that no one has mentioned the issue in any forum that you monitor, during the time that
    you have monitored it. Unless you're super-human, you could not have
    come close to monitoring all forums where such an issue might have
    been raised, for the entire 30 years that you refer to.

  • From Tim Rentsch@21:1/5 to James Kuyper on Mon Jan 17 05:29:58 2022
    James Kuyper <jameskuyper@alumni.caltech.edu> writes:

    On Thursday, July 22, 2021 at 6:26:22 PM UTC-4, Tim Rentsch wrote:

    James Kuyper <james...@alumni.caltech.edu> writes:

    On Saturday, July 10, 2021 at 11:49:09 AM UTC-4, Tim Rentsch wrote:

    ...

    The word "adjacent" doesn't always mean touching. There is
    another word for that, the word "adjoining". Booking a hotel
    reservation for adjacent rooms is not the same as a reservation
    for adjoining rooms.

    But, if it doesn't mean "touching", what does it mean?

    In hotels, normally it means on the same floor and with no
    intervening rooms or other major building structures (but small
    things like utility closets don't count). In a country inn where
    there are standalone cottages rather than rooms, two cottages
    would normally be called adjacent if there were no other cottages
    in between, and the cottages in question were not inordinately
    far apart.

    In the C standard it means having no intervening tokens.

    If a blank space
    doesn't prevent them from being adjacent, what does?

    Another token (not a string literal token, presumably, ...

    I think your wording got a little confused there. In "A""B""C",
    the "B" string literal token definitely does prevent the "A" and
    "C" string literal tokens from being considered adjacent. An
    implementation would certainly be non-conforming if it
    concatenated "A" directly to "C" without first concatenating one
    or the other with "B". The following wording may be intended to
    address that issue:

    ... but only
    because we might consider a sequence of string literal tokens
    to be "adjacent tokens").

    but it's not very clear that it does. The simpler approach is to
    say that the one thing that unambiguously DOES prevent two string
    literal tokens from being considered adjacent is another string
    literal token. The only real question is whether there's anything
    else that does so.

    [...]

    Apparently you have missed the point of what I was saying. That
    surprises me, because I didn't think it was difficult to
    understand.


    I contend that it doesn't clearly make that distinction anywhere,

    If I may make a suggestion, how you read the C standard doesn't
    match the reading mode expected by its authors. ...

    Your reading mode puts too much emphasis on guessing the intent of
    the authors,

    It's not surprising that you think so, because that view doesn't
    fit with your agenda. However, judging what meaning is intended
    isn't what I'm talking about when I say "reading mode".

    and not enough on trying to write the text clearly
    enough to avoid the need for such guesswork.

    That's a non-sequitur. The two views are not in opposition;
    they are about different kinds of discussion regarding the C
    standard. They are not mutually exclusive.

    You might be right that it is the intended reading mode, but if
    so, I consider it a seriously flawed one.

    If "it" refers to "judging what meaning is intended", then "it"
    is independent of "reading mode" as I am using the term. (Note
    also that the word I used is "expected", and not "intended", but
    that distinction is not the primary point of focus.)

    Let me give an example. The C standard is not a math textbook.
    Most people don't read the C standard as though it were a math
    textbook. Trying to read the C standard in much the same way as
    one reads a math text would be a different "reading mode" than
    how most people read it. Does this example help explain what I
    mean by "reading mode"?


    ... If no one else has noticed it in
    more than 30 years, what does that say about how clear or unclear
    the distinction is?

    You can't be sure that no one else has noticed it, [...]

    I never said I was. The question is not what I know but what you
    know. If, as far as /you/ know, no one else has noticed the
    point you brought up, then it would appear that no one else is
    bothered by it. Do you know of any previous instance of someone
    else bringing up this question? Or is it, to the best of your
    knowledge, the case that your posting here is the first such
    occurrence?
