Comparing pointers pointing at distinct objects is already invalid (for
some interpretation of "invalid"), so: no. Yes, that means the implementation of a function like memmove() cannot be fully portable C.
On 03.07.2024 at 16:31, Vincent Lefevre wrote:
ISO C17 (and C23 draft) 7.1.1 defines a string as follows: "A string
is a contiguous sequence of characters terminated by and including
the first null character."
But may a string span multiple, independent objects that happen
to be contiguous in memory?
For instance, is the following program valid and what does the ISO C
standard say about that?
#include <stdio.h>
#include <string.h>

typedef char *volatile vp;

int main (void)
{
  char a = '\0', b = '\0';
  vp p = &a, q = &b;

  printf ("%p\n", (void *) p);
  printf ("%p\n", (void *) q);
  if (p + 1 == q)
    {
      a = 'x';
      printf ("%zd\n", strlen (p));
    }
  if (q + 1 == p)
    {
      b = 'x';
      printf ("%zd\n", strlen (q));
    }
  return 0;
}
If such a program is valid, would there be issues by working with
pointers on such a string, say, dereferencing p[1] in the first "if"
(which is normally UB)?
On 7/3/24 10:31, Vincent Lefevre wrote:...
ISO C17 (and C23 draft) 7.1.1 defines a string as follows: "A string
is a contiguous sequence of characters terminated by and including
the first null character."
But may a string span multiple, independent objects that happen
to be contiguous in memory?
For instance, is the following program valid and what does the ISO C
standard say about that?
#include <stdio.h>
#include <string.h>
typedef char *volatile vp;
int main (void)
{
char a = '\0', b = '\0';
a and b are not guaranteed to be contiguous.
vp p = &a, q = &b;
printf ("%p\n", (void *) p);
printf ("%p\n", (void *) q);
if (p + 1 == q)
{
That comparison is legal, and has well-defined behavior. It will be true
only if they are in fact contiguous.
a = 'x';
printf ("%zd\n", strlen (p));
Because strlen() must take a pointer to 'a' (which is treated, for these
purposes, as an array of char of length 1), and increment it one past the
end of that array, and then dereference that pointer to check whether it
points at a null character, the behavior is undefined.
James Kuyper <jameskuyper@alumni.caltech.edu> writes:...
Because strlen() must take a pointer to 'a' (which is treated, for these
purposes, as an array of char of length 1), and increment it one past the
end of that array, and then dereference that pointer to check whether it
points at a null character, the behavior is undefined.
I think this is slightly misleading. It suggests that the UB comes from
something strlen /must/ do, but strlen must be thought of as a black
box. We can't base anything on an assumed implementation.
But our conclusion is correct because there is explicit wording covering
this case. The section on "String function conventions" (7.24.1)
states:
"If an array is accessed beyond the end of an object, the behavior is
undefined."
In article <87zfqy6v54.fsf@bsb.me.uk>,
Ben Bacarisse <ben@bsb.me.uk> wrote:
James Kuyper <jameskuyper@alumni.caltech.edu> writes:
On 7/3/24 10:31, Vincent Lefevre wrote:...
ISO C17 (and C23 draft) 7.1.1 defines a string as follows: "A string
is a contiguous sequence of characters terminated by and including
the first null character."
But may a string span multiple, independent objects that happen
to be contiguous in memory?
For instance, is the following program valid and what does the ISO C
standard say about that?
#include <stdio.h>
#include <string.h>
typedef char *volatile vp;
int main (void)
{
char a = '\0', b = '\0';
a and b are not guaranteed to be contiguous.
vp p = &a, q = &b;
printf ("%p\n", (void *) p);
printf ("%p\n", (void *) q);
if (p + 1 == q)
{
That comparison is legal, and has well-defined behavior. It will be true
only if they are in fact contiguous.
a = 'x';
printf ("%zd\n", strlen (p));
Because strlen() must take a pointer to 'a' (which is treated, for these
purposes, as an array of char of length 1), and increment it one past the
end of that array, and then dereference that pointer to check whether it
points at a null character, the behavior is undefined.
I think this is slightly misleading. It suggests that the UB comes from
something strlen /must/ do, but strlen must be thought of as a black
box. We can't base anything on an assumed implementation.
I agree (and note that strlen is not necessarily written in C).
But our conclusion is correct because there is explicit wording covering
this case. The section on "String function conventions" (7.24.1)
states:
"If an array is accessed beyond the end of an object, the behavior is
undefined."
Arguments of these functions are either arrays or strings, where a
string is not defined as being an array (or a part of an array). So
I don't see why this text, as written, would apply to strings.
BTW, the definition of an object is rather vague: "region of data
storage in the execution environment, the contents of which can
represent values". But it is not excluded that contiguous areas
can form an object.
Similarly, malloc() is specified as allocating space for an object,
but this does not mean that one initially has an object in the
allocated space, even though, under the above restriction, having
one would be needed to be able to use memset() on this storage area.
ISO C17 (and C23 draft) 7.1.1 defines a string as follows: "A string
is a contiguous sequence of characters terminated by and including
the first null character."
But may a string span multiple, independent objects that happen
to be contiguous in memory?
For instance, is the following program valid and what does the ISO C
standard say about that?
#include <stdio.h>
#include <string.h>

typedef char *volatile vp;

int main (void)
{
  char a = '\0', b = '\0';
  vp p = &a, q = &b;

  printf ("%p\n", (void *) p);
  printf ("%p\n", (void *) q);
  if (p + 1 == q)
    {
      a = 'x';
      printf ("%zd\n", strlen (p));
    }
  if (q + 1 == p)
    {
      b = 'x';
      printf ("%zd\n", strlen (q));
    }
  return 0;
}
If such a program is valid, would there be issues by working with
pointers on such a string, say, dereferencing p[1] in the first "if"
(which is normally UB)?
Comparing pointers pointing at distinct objects is already invalid
(for some interpretation of "invalid"), [...].
Yes, that means the implementation of a function like memmove()
cannot be fully portable C.
In article <87zfqy6v54.fsf@bsb.me.uk>,
Ben Bacarisse <ben@bsb.me.uk> wrote:
James Kuyper <jameskuyper@alumni.caltech.edu> writes:
On 7/3/24 10:31, Vincent Lefevre wrote:
ISO C17 (and C23 draft) 7.1.1 defines a string as follows: "A
string is a contiguous sequence of characters terminated by and
including the first null character."
But may a string span multiple, independent objects that happen
to be contiguous in memory?
...
For instance, is the following program valid and what does the
ISO C standard say about that?
#include <stdio.h>
#include <string.h>
typedef char *volatile vp;
int main (void)
{
char a = '\0', b = '\0';
a and b are not guaranteed to be contiguous.
vp p = &a, q = &b;
printf ("%p\n", (void *) p);
printf ("%p\n", (void *) q);
if (p + 1 == q)
{
That comparison is legal, and has well-defined behavior. It will
be true only if they are in fact contiguous.
a = 'x';
printf ("%zd\n", strlen (p));
Because strlen() must take a pointer to 'a' (which is treated, for
these purposes, as an array of char of length 1), and increment it
one past the end of that array, and then dereference that pointer
to check whether it points at a null character, the behavior is
undefined.
I think this is slightly misleading. It suggests that the UB comes
from something strlen /must/ do, but strlen must be thought of as a
black box. We can't base anything on an assumed implementation.
I agree (and note that strlen is not necessarily written in C).
But our conclusion is correct because there is explicit wording
covering this case. The section on "String function conventions"
(7.24.1) states:
"If an array is accessed beyond the end of an object, the
behavior is undefined."
Arguments of these functions are either arrays or strings, where
a string is not defined as being an array (or a part of an array).
So I don't see why this text, as written, would apply to strings.