This discussion requires familiarity with the standard's specifications
and terminology from sections 6.10, 6.10.1 and 6.10.2. Unless you've got those sections memorized, you might need to cross-reference them to understand what I'm saying.
To simplify the following discussion, I'm going to write it as if the
only conditional inclusion preprocessing directives were #if, #else, and #endif. Code using the other conditional inclusion directives can always
be rewritten to use only those three, with essentially the same
behavior, with a minor exception in the case of #elif, where the
subsequent occurrences of __LINE__ would have increased values. Those
other directives don't change the issue I'm discussing, they only
complicate the discussion.
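As a rough illustration of that rewriting (the macro names LEVEL and LEVEL_MISSING and the variable names are made up for this sketch, not taken from any real code), here is the same selection written once with #ifdef/#elif and once with only #if, #else, and #endif:

```c
#define LEVEL 2   /* hypothetical configuration macro */

/* Form using #ifdef and #elif. */
#ifdef LEVEL_MISSING
static const int value_elif = 0;
#elif LEVEL == 2
static const int value_elif = 2;
#else
static const int value_elif = 9;
#endif

/* Same selection using only #if, #else, and #endif.
   The #elif becomes a nested if-section inside the #else group. */
#if defined(LEVEL_MISSING)
static const int value_nested = 0;
#else
#if LEVEL == 2
static const int value_nested = 2;
#else
static const int value_nested = 9;
#endif
#endif
```

The nested form needs extra #if and #endif lines, which is why any __LINE__ used after it would see larger values than in the #elif form.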
I've long understood that, during translation phase 4, as soon as a
compiler reaches the new-line at the end of a #if directive, it knows
whether the #if group will be included. If not, and there's a
corresponding #else, it knows that the #else group will be included.
Either way, as soon as it starts reading a group that will be included,
it can immediately start preprocessing that group (and this is the
important part:) while searching for the #else or #endif directive that terminates the group.
I've also long understood that the #if, #else (if any) and #endif
directives that make up an if-section must all occur in the same file.
I'm not sure how I reached that conclusion - it's not anything that the standard says explicitly.
I just recently realized that, under certain circumstances, those two understandings are in conflict:
if.c:
    #if 1
    int i = 0;
    #include "else.h"
    int l = 3;
    #endif

else.h:
    int j = 1;
    #else
    int k = 2;
If preprocessing of the #if group could continue while searching for the terminating #else or #endif, then that would mean that the #include
directive in if.c would be replaced by the contents of else.h, and that
the #else from else.h would therefore be recognized as terminating the if-group from if.c, and starting a new else-group that continues until
the #endif in if.c. The declarations of `i` and `j` should be included,
and those of `k` and `l` should be skipped.
I didn't expect it to work, and my tests with gcc confirm that
expectation - but I'm having trouble identifying how the standard
specifies that this shouldn't work.
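To make the hypothetical reading concrete, here is a sketch of the single file that would result if the contents of else.h were spliced into if.c before the terminating directive was located. This is only an illustration of that interpretation; written as one file it is unremarkable, valid code, and it is the two-file version above that gcc rejects.

```c
/* Hypothetical single-file view of if.c with else.h spliced in place
   of the #include directive. */
#if 1
int i = 0;
int j = 1;   /* came from else.h */
#else
int k = 2;   /* came from else.h; in the skipped #else group */
int l = 3;   /* in the skipped #else group */
#endif
```

Under this view only `i` and `j` are declared, matching the outcome described above.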
The grammar for an if-group in 6.10p1 includes the following rule (where
"opt" is a subscript marking the group as optional):
    # if constant-expression new-line group(opt)
This could be interpreted as meaning that the entire if-group must be
parsed as such by the compiler before carrying out the behavior
associated with that if-group, which is to process the optional group if
the constant-expression has a non-zero value. This would imply that the
#else or #endif that terminates the group must be identified as such
before replacing any #include directives that might be found in that
group with the contents of the specified file. That in turn would imply
that a #else in the included file could not qualify as that terminating directive.
The thing is, it's not clear to me that the standard actually says so. C
was designed around the same time I started my computer programming
career, when keeping a program's memory footprint small was more
important than it is now. I've noted that, particularly with the
original version of the C standard, the language seems, for the most
part, deliberately designed to allow single-pass processing with
relatively low memory requirements, which is why I did not expect it to require scanning for the end of a group before processing any #include directives in that group.
Have I missed something that says this more explicitly than the grammar
rule cited above? I'm sure there are people who will tell me that the
grammar rule cited above is sufficient, because they think it makes this point perfectly clear - but is there anyone who agrees with me that it's
not clear?
On 12/25/21 1:18 AM, James Kuyper wrote:...
...if.c:
#if 1
int i = 0;
#include "else.h"
int l = 3;
#endif
else.h:
int j = 1;
#else
int k = 2;
While the grammar may not be clear as to what happens between the start
and end of the if-group, the description of what happens in the block
says (6.10.1p6)
Each directive’s condition is checked in order. If it evaluates to false (zero), the group that it controls is skipped: directives are processed
only through the name that determines the directive in order to keep
track of the level of nested conditionals; the rest of the directives’ preprocessing tokens are ignored, as are the other preprocessing tokens
in the group. Only the first group whose control condition evaluates to
true (nonzero) is processed. If none of the conditions evaluates to
true, and there is a #else directive, the group controlled by the #else
is processed; lacking a #else directive, all the groups until the #endif
are skipped.
Thus it is clear that a #include directive within a skipped group is
not processed, and thus any #else and the like within the included file
are not seen.
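A small sketch of that behavior, assuming a typical implementation such as gcc: because directives in a skipped group are processed only through the directive name, the header name below is never looked up, so it does not matter that no such file exists. The file name and the variable `chosen` are invented for the illustration.

```c
#if 0
/* Skipped group: only the name "include" is examined, so the
   (nonexistent) file is never opened. */
#include "file_that_does_not_exist.h"
#else
int chosen = 2;   /* the #else group is the one that is processed */
#endif
```

Note that this covers only skipped groups; the example in the original post puts the #include in a group that is processed.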
James Kuyper <james...@alumni.caltech.edu> writes:...
I've long understood that, during translation phase 4, as soon as a compiler reaches the new-line at the end of a #if directive, it knows whether the #if group will be included. If not, and there's a
corresponding #else, it knows that the #else group will be included.
Either way, as soon as it starts reading a group that will be included,
it can immediately start preprocessing that group (and this is the important part:) while searching for the #else or #endif directive that terminates the group.
I've also long understood that the #if, #else (if any) and #endif directives that make up an if-section must all occur in the same file.
I'm not sure how I reached that conclusion - it's not anything that the standard says explicitly. [...]
The first rule of grammar in 6.10 paragraph 1 says (with \sub()
to mean subscript)
preprocessing-file:
group \sub(opt)
Thus each preprocessing file must consist of an integral number
of group-part, and so cannot contain any unbalanced #if/#endif
directives, or any #else directive outside an #if/#endif section.
On Saturday, December 25, 2021 at 11:52:06 AM UTC-5, Tim Rentsch wrote:
James Kuyper <james...@alumni.caltech.edu> writes:
...
I've long understood that, during translation phase 4, as soon as
a compiler reaches the new-line at the end of a #if directive, it
knows whether the #if group will be included. If not, and there's
a corresponding #else, it knows that the #else group will be
included. Either way, as soon as it starts reading a group that
will be included, it can immediately start preprocessing that
group (and this is the important part:) while searching for the
#else or #endif directive that terminates the group.
I've also long understood that the #if, #else (if any) and #endif
directives that make up an if-section must all occur in the same
file. I'm not sure how I reached that conclusion - it's not
anything that the standard says explicitly. [...]
The first rule of grammar in 6.10 paragraph 1 says (with \sub()
to mean subscript)
preprocessing-file:
group \sub(opt)
Thus each preprocessing file must consist of an integral number
of group-part, and so cannot contain any unbalanced #if/#endif
directives, or any #else directive outside an #if/#endif section.
I believe that what you're saying, using the terms defined in the C preprocessing grammar, is that neither an if-group, an else-group,
nor an endif-line qualifies separately as a group-part, only a
complete if-section can do so.
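One way to picture that per-file reading is a simplified checker that treats each file's text on its own and reports whether its conditional directives are balanced. This is only a sketch: the function name `conditionals_balanced` and the two string constants are invented here, and the scan ignores comments, line splicing, and duplicate #else directives, all of which a real preprocessor has to handle.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: returns 1 if every #if/#ifdef/#ifndef in `text`
   is closed by an #endif in the same text and no #else/#elif/#endif
   appears outside such a section; returns 0 otherwise. */
static int conditionals_balanced(const char *text)
{
    int depth = 0;
    const char *p = text;

    while (*p != '\0') {
        /* Skip horizontal white space at the start of the line. */
        while (*p == ' ' || *p == '\t') p++;

        if (*p == '#') {
            char name[16];
            size_t n = 0;

            p++;
            while (*p == ' ' || *p == '\t') p++;

            /* Collect the directive name. */
            while (isalpha((unsigned char)*p) && n < sizeof name - 1)
                name[n++] = *p++;
            name[n] = '\0';

            if (strcmp(name, "if") == 0 || strcmp(name, "ifdef") == 0 ||
                strcmp(name, "ifndef") == 0) {
                depth++;
            } else if (strcmp(name, "else") == 0 ||
                       strcmp(name, "elif") == 0) {
                if (depth == 0) return 0;   /* no enclosing if-section */
            } else if (strcmp(name, "endif") == 0) {
                if (depth == 0) return 0;   /* nothing to close */
                depth--;
            }
        }

        /* Advance to the start of the next line. */
        while (*p != '\0' && *p != '\n') p++;
        if (*p == '\n') p++;
    }
    return depth == 0;
}

/* The two files from the original post, as text. */
static const char if_c[] =
    "#if 1\nint i = 0;\n#include \"else.h\"\nint l = 3;\n#endif\n";
static const char else_h[] =
    "int j = 1;\n#else\nint k = 2;\n";
```

With these inputs, `conditionals_balanced(if_c)` returns 1 and `conditionals_balanced(else_h)` returns 0, which is the distinction the per-file reading of the grammar draws.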
When the standard defines the meaning of a term, that definition
takes precedence over any other interpretation you might reach by
analyzing the meaning of the words making up that term.
"preprocessing-file" is simply a symbol in the grammar - its
definition is the grammar rule associated with that symbol.
I've always interpreted the specification given in 6.10.2 as meaning
that a given preprocessing file must match the grammar described in
6.10 up until the point that it recognizes a #include directive,
which 'causes the replacement of that directive by the entire
contents of the source file identified by the specified sequence
between the " delimiters.' It's only the file after that replacement
(and all other such replacements), which must fully parse in
accordance with the grammar in 6.10.
However, the term "preprocessing file" is also defined in 5.1.1.1p1.
That's a section of the standard that seldom comes up in discussion,
so I'd forgotten about that definition. I agree that it makes sense
that a "preprocessing file" is meant to match the syntax specified
for a "preprocessing-file". The standard often uses a grammar
symbol name, with '-' replaced by spaces, to refer to things
matching that grammar symbol. However, this is one of the few
places where the name, with that replacement, is formally defined
separately from the grammar, implying a connection between those two definitions.
This is not the clearest way to impose such a requirement. If
each preprocessing file is supposed to separately parse as a preprocessing-file, I think it would have been better to
explicitly mention that fact in the description of 6.10.2
"Source file inclusion." The "replacement" wording actually
used gave me the strong impression that there were no content
restrictions on the #included file itself, but only on the
result after replacing the directive with those contents.