This is a CPP question that arose last month. It's not about an actual
issue with the software; it's just out of curiosity and to be sure it works reliably (it seemingly does).
In a C99 program on Linux (Ubuntu) I intended to use usleep() and then
also strnlen().
When I added usleep() and its include file I got an error and was asked
to define the CPP tag '_BSD_SOURCE'. I did so, and because I wanted the
side effects of that tag kept as small as possible, I prepended it just
before the respective #include and put that at the end of my #include list:
...other #includes...
#define _BSD_SOURCE
#include <unistd.h>
But as became obvious, *that* way there had been side-effects, and I had
to put the tag at the beginning of all include files (which astonished me):
#define _GNU_SOURCE /* necessary for strnlen() in string.h */
[...]
In a C99 program on Linux (Ubuntu) I intended to use usleep() and then
also strnlen().
When I added usleep() and its include file I got an error and was asked
to define the CPP tag '_BSD_SOURCE'. [...]
Feature selection macros must be in effect before any system header
is included, and are usually put on the compiler command line (e.g.
via 'cc -D_GNU_SOURCE').
I think once you define _GNU_SOURCE, Glibc's feature selection will
give you everything.
usleep() isn't in C99. It comes from POSIX, where it was declared
obsolete in 2001.
I'll just recommend that you follow POSIX and use nanosleep()
instead.
By the way, this kind of question is more appropriate for
comp.unix.programmer.
Lowell wrote:
I'll just recommend that you follow POSIX and use nanosleep()
instead.
When I had read about the various 'sleep' options I decided to use one
which supports sub-second resolution and has the simplest interface.
That's why my choice was the simple 'usleep(usec);', even if obsoleted
by POSIX. The nanosleep() is not "very complex", sure, but I'd have to
litter my code with variables unnecessary in my context, and also the
advertised "advantages" of this function do not apply in my case.[*]
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
Lowell wrote:
I'll just recommend that you follow POSIX and use nanosleep()
instead.
When I had read about the various 'sleep' options I decided to use one
which supports sub-second resolution and with a most simple interface.
That's why my choice was the simple 'usleep(usec);' even if obsolete
by POSIX. The nanosleep() is not "very complex", sure, but I'd have to
litter my code with variables unnecessary in my context, and also the
advertised "advantages" of this function do not apply in my case.[*]
To be honest, I didn't actually understand where your problem came from
in the first place -- I just chose not to bring up more than one point
at a time. While usleep() is obsolete, it works fine, without any
feature test macro games, on (as far as I know) all POSIX-ish
systems. Certainly on recent Ubuntu, the following program compiles and
runs perfectly well without any warnings, even with the most extreme
warning levels enabled:
#include <stdio.h>
#include <unistd.h>
int main(void)
{
printf("starting\n");
usleep(2500000);
printf("finishing\n");
}
On 26.12.2023 16:59, Janis Papanagnou wrote:
[...]
In a C99 program on Linux (Ubuntu) I intended to use usleep() and then
also strnlen().
When I added usleep() and its include file I got an error and was asked
to define the CPP tag '_BSD_SOURCE'. [...]
Thanks for all the replies.
Kaz wrote:
Feature selection macros must be in effect before any system header
is included, and are usually put on the compiler command line (e.g.
via 'cc -D_GNU_SOURCE').
Right. This time I came from the functions' man pages and read that
such tags are necessary for them, so I didn't think about the original
purpose of these tags. I forgot that decades ago we used them for
platform-specific declarations. Thanks for refreshing my neurons.
Once you specify the dialect, things get strict.
$ gcc usleep.c
Nothing
$ gcc -std=c99 usleep.c
usleep.c: In function ‘main’:
usleep.c:7:4: warning: implicit declaration of function ‘usleep’; did
you mean ‘sleep’? [-Wimplicit-function-declaration]
usleep(2500000);
^~~~~~
sleep
Lowell Gilbert <lgusenet@be-well.ilk.org> writes:
[...]
To be honest, I didn't actually understand where your problem came from
in the first place -- I just chose not to bring up more than one point
at a time. While usleep() is obsolete, it works fine, without any
feature test macro games, on (as far as I know) all POSIX-ish
systems. Certainly on recent Ubuntu, the following program compiles and
runs perfectly well without even any warnings with even the most extreme
levels of warning enabled:
#include <stdio.h>
#include <unistd.h>
int main(void)
{
printf("starting\n");
usleep(2500000);
printf("finishing\n");
}
I think the warnings you enabled weren't extreme enough:
$ gcc -std=c11 -c c.c
c.c: In function ‘main’:
c.c:7:4: warning: implicit declaration of function ‘usleep’; did you mean ‘sleep’? [-Wimplicit-function-declaration]
7 | usleep(2500000);
| ^~~~~~
| sleep
$
On 2023-12-28, Lowell Gilbert <lgusenet@be-well.ilk.org> wrote:
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
Lowell wrote:
I'll just recommend that you follow POSIX and use nanosleep()
instead.
When I had read about the various 'sleep' options I decided to use one
which supports sub-second resolution and with a most simple interface.
That's why my choice was the simple 'usleep(usec);' even if obsolete
by POSIX. The nanosleep() is not "very complex", sure, but I'd have to
litter my code with variables unnecessary in my context, and also the
advertised "advantages" of this function do not apply in my case.[*]
To be honest, I didn't actually understand where your problem came from
in the first place -- I just chose not to bring up more than one point
at a time. While usleep() is obsolete, it works fine, without any
feature test macro games, on (as far as I know) all POSIX-ish
systems. Certainly on recent Ubuntu, the following program compiles and
runs perfectly well without even any warnings with even the most extreme
levels of warning enabled:
#include <stdio.h>
#include <unistd.h>
int main(void)
{
printf("starting\n");
usleep(2500000);
printf("finishing\n");
}
But if you don't specify any options, you're not even specifying the C
dialect. You will get whatever dialect your gcc installation defaults
to. That is always a GNU dialect, firstly, which you might not want.
Secondly, it's a moving target: it used to be gnu89, then gnu99, then
gnu11. Now GCC defaults to gnu17.
Once you specify the dialect, things get strict.
But as became obvious, *that* way there had been side-effects, and I had
to put the tag at the beginning of all include files (which astonished me)
Last time I looked into the system header files, three decades ago, I
got repelled by all the #ifdef's, cascaded and nested, a spaghetti code
of dependencies; I'm astonished it works.
On Tue, 26 Dec 2023 16:59:40 +0100, Janis Papanagnou wrote:
But as became obvious, *that* way there had been side-effects, and I had
to put the tag at the beginning of all include files (which astonished me)
It has always been thus <https://manpages.debian.org/7/feature_test_macros.en.html>:
NOTE: In order to be effective, a feature test macro must be defined
before including any header files.
Last time I looked into the system header files, three decades ago, I
got repelled by all the #ifdef's, cascaded and nested, a spaghetti code
of dependencies; I'm astonished it works.
The whole concept of include files and string-based macro processing is flawed. But that’s C for you ...
To be honest, I didn't actually understand where your problem came from
in the first place -- I just chose not to bring up more than one point
at a time. While usleep() is obsolete, it works fine, without any
feature test macro games, on (as far as I know) all POSIX-ish
systems. Certainly on recent Ubuntu, the following program compiles and
runs perfectly well without even any warnings with even the most extreme levels of warning enabled:
[snip code]
Yes, that's true. I was making an educated guess that the original poster wasn't actually asking for strict C99, despite referring to a "C99 program."
On 29/12/2023 02:35, Lawrence D'Oliveiro wrote:
On Tue, 26 Dec 2023 16:59:40 +0100, Janis Papanagnou wrote:
But as became obvious, *that* way there had been side-effects, and I had
to put the tag at the beginning of all include files (which astonished me)
It has always been thus <https://manpages.debian.org/7/feature_test_macros.en.html>:
NOTE: In order to be effective, a feature test macro must be defined
before including any header files.
Last time I looked into the system header files, three decades ago, I
got repelled by all the #ifdef's, cascaded and nested, a spaghetti code
of dependencies; I'm astonished it works.
The whole concept of include files and string-based macro processing is flawed. But that’s C for you ...
It's not just C's fault. It's the insistence on having just ONE
system header that has to work for as many platforms and versions as possible.
That is then just added to over the years, resulting in the
patched-together mess you see, which is utterly unreadable. You can't
simplify it or take things out because something could break. It is
fragile.
Why not have a dedicated header file that is specific to a
particular version of a C compiler for a given platform? Then it can be
streamlined for that purpose.
I'll just recommend that you follow POSIX and use nanosleep()
instead.
When I had read about the various 'sleep' options I decided to use one
which supports sub-second resolution and with a most simple interface.
That's why my choice was the simple 'usleep(usec);' even if obsolete
by POSIX. The nanosleep() is not "very complex", sure, but I'd have to
litter my code with variables unnecessary in my context, and also the
advertised "advantages" of this function do not apply in my case.[*]
On 29/12/2023 02:35, Lawrence D'Oliveiro wrote:
On Tue, 26 Dec 2023 16:59:40 +0100, Janis Papanagnou wrote:
But as became obvious, *that* way there had been side-effects, and I had
to put the tag at the beginning of all include files (which astonished me)
It has always been thus <https://manpages.debian.org/7/feature_test_macros.en.html>:
NOTE: In order to be effective, a feature test macro must be defined
before including any header files.
Last time I looked into the system header files, three decades ago, I
got repelled by all the #ifdef's, cascaded and nested, a spaghetti code
of dependencies; I'm astonished it works.
The whole concept of include files and string-based macro processing is flawed. But that’s C for you ...
It's not just C's fault. It's the insistence on having just ONE
system header that has to work for as many platforms and versions as possible.
That is then just added to over the years, resulting in the
patched-together mess you see, which is utterly unreadable. You can't
simplify it or take things out because something could break. It is
fragile.
Why not have a dedicated header file that is specific to a
particular version of a C compiler for a given platform? Then it can be
streamlined for that purpose.
If someone is maintaining compilers that need to work across a range of targets, then they can have a process that synthesises the header needed
for a specific configuration.
(I guess this is something that is harder on Linux because there, many standard headers are not part of a specific C compiler, but are a
resource shared by all C compilers, or tools that need to process C
headers.)
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
I'll just recommend that you follow POSIX and use nanosleep()
instead.
When I had read about the various 'sleep' options I decided to use one
which supports sub-second resolution and with a most simple interface.
That's why my choice was the simple 'usleep(usec);' even if obsolete
by POSIX. The nanosleep() is not "very complex", sure, but I'd have to
litter my code with variables unnecessary in my context, and also the
advertised "advantages" of this function do not apply in my case.[*]
You can always define your own usleep:
#include <time.h>      /* struct timespec, nanosleep() */
#include <unistd.h>    /* useconds_t */
inline int
usleep(useconds_t microseconds)
{
    struct timespec ts;
    ts.tv_sec = microseconds / (useconds_t)1000000;
    ts.tv_nsec = (long)(microseconds % (useconds_t)1000000) * 1000L;
    return nanosleep(&ts, NULL);
}
On 28.12.2023 20:11, Lowell Gilbert wrote:
To be honest, I didn't actually understand where your problem came from
in the first place -- I just chose not to bring up more than one point
at a time. While usleep() is obsolete, it works fine, without any
feature test macro games, on (as far as I know) all POSIX-ish
systems. Certainly on recent Ubuntu, the following program compiles and
runs perfectly well without even any warnings with even the most extreme
levels of warning enabled:
[snip code]
Here's the output of the compiler call with #define _GNU_SOURCE removed
$ cc -std=c99 -o warn warn.c
warn.c: In function ‘delay_time’:
warn.c:368:3: warning: implicit declaration of function ‘strnlen’ [-Wimplicit-function-declaration]
warn.c: In function ‘main’:
warn.c:579:5: warning: implicit declaration of function ‘usleep’ [-Wimplicit-function-declaration]
It compiles, but if I see warnings I nonetheless try to get rid of them.
On 29/12/2023 02:35, Lawrence D'Oliveiro wrote:
On Tue, 26 Dec 2023 16:59:40 +0100, Janis Papanagnou wrote:
But as became obvious, *that* way there had been side-effects, and I had
to put the tag at the beginning of all include files (which astonished me)
It has always been thus <https://manpages.debian.org/7/feature_test_macros.en.html>:
NOTE: In order to be effective, a feature test macro must be defined
before including any header files.
Last time I looked into the system header files, three decades ago, I
got repelled by all the #ifdef's, cascaded and nested, a spaghetti code
of dependencies; I'm astonished it works.
The whole concept of include files and string-based macro processing is flawed. But that’s C for you ...
It's not just C's fault. It's the insistence on having just ONE
system header that has to work for as many platforms and versions as possible.
That is then just added to over the years, resulting in the
patched-together mess you see, which is utterly unreadable. You can't
simplify it or take things out because something could break. It is fragile.
Why not have a dedicated header file that is specific to a
particular version of a C compiler for a given platform? Then it can be
streamlined for that purpose.
If someone is maintaining compilers that need to work across a range of targets, then they can have a process that synthesises the header needed
for a specific configuration.
(I guess this is something that is harder on Linux because there, many standard headers are not part of a specific C compiler, but are a
resource shared by all C compilers, or tools that need to process C
headers.)
You can always define your own usleep:
inline int
usleep(useconds_t microseconds)
{ etc... }
David Brown <david.brown@hesbynett.no> writes:
A useful tool that someone might like to write for this particular
situation would be a partial C preprocessor, letting you choose what
gets handled. You could choose to expand the code here for, say,
_GNU_SOURCE and _BSD_SOURCE - any use of these in #ifdef's and
conditional compilation would be expanded according to whether you
have defined the symbols or not, leaving an output that is easier to
understand while keeping most of the pre-processor stuff unchanged (so
not affecting #includes, and leaving #define'd macros and constants
untouched and therefore more readable).
The unifdef tool does some of this. (I haven't used it much.)
Kaz Kylheku <433-929-6894@kylheku.com> writes:
On 2023-12-29, Bart <bc@freeuk.cm> wrote: [...]
(I guess this is something that is harder on Linux because there, many
standard headers are not part of a specific C compiler, but are a
resource shared by all C compilers, or tools that need to process C
headers.)
Headers on GNU/Linux systems tend to assume GCC. Clang would not be
usable did it not have GCC compatibility.
On my Ubuntu 22.04 system, tcc manages to use the system headers, which
are mostly provided by glibc. In a quick glance at /usr/include/stdio.h,
I see some #ifdefs for symbols like __GNUC__ (which is predefined by gcc
and clang but not by tcc) and __USE_GNU (I haven't bothered to look into
how that's defined).
On 2023-12-29, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
David Brown <david.brown@hesbynett.no> writes:
A useful tool that someone might like to write for this particular
situation would be a partial C preprocessor, letting you choose what
gets handled. You could choose to expand the code here for, say,
_GNU_SOURCE and _BSD_SOURCE - any use of these in #ifdef's and
conditional compilation would be expanded according to whether you
have defined the symbols or not, leaving an output that is easier to
understand while keeping most of the pre-processor stuff unchanged (so
not affecting #includes, and leaving #define'd macros and constants
untouched and therefore more readable).
The unifdef tool does some of this. (I haven't used it much.)
GNU cpp has an option which is something like this: -fdirectives-only.
It causes it not to expand macros.
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
[...]
BTW, is 'inline' meanwhile C standard? (I know that from C++ but
haven't done much C for long now.)
C added inline in C99 (the 1999 edition of the ISO C standard, the same
one that removed implicit int).
I think C and C++ have subtly different semantics for inline.
It flattens include files, processes conditionals, and keeps #defines unchanged.
However, it turns gcc's sys/stat.h from 300 lines into 3000 lines.
On Fri, 29 Dec 2023 22:40:47 +0000, Bart wrote:
It flattens include files, processes conditionals, and keeps #defines
unchanged.
However, it turns gcc's sys/stat.h from 300 lines into 3000 lines.
Still, merging all the stuff into fewer files likely means it loads
faster.
Include files are a kludge, and all these attempts to improve them are a kludge on top of a kludge. This is why better-designed languages have a proper module system that solves the whole issue.
Why not have a dedicated header file that is specific to a
particular version of a C compiler for a given platform? Then it can be
streamlined for that purpose.
Many people's assumption that one needs "one true C compiler" (with
people debating GCC vs Clang, mostly ignoring any other possibilities)
and then using the same C library, etc., everywhere, may actually
be doing the world a disservice.
So, why don't the vendors of the library do that exercise?
On Sat, 30 Dec 2023 01:58:53 +0000, Bart wrote:
So, why don't the vendors of the library do that exercise?
Maybe because most of the “vendors” of proprietary libraries have gone extinct. What we have now is “developers” and “contributors” to open-source projects. And if you have a bright idea for how they can do things better, you are free to contribute it.
Note that (unlike ELF on Linux), it is not currently possible to
directly share global variables across DLL boundaries.
On Sat, 30 Dec 2023 19:14:55 -0600, BGB wrote:
Note that (unlike ELF on Linux), it is not currently possible to
directly share global variables across DLL boundaries.
Windows is broken in so many ways ... when you achieve something, it’s
like getting a bear to dance: it’s not that it dances badly, but that it dances at all.
Windows is broken in so many ways
Then again, it's not like there is much hope of getting stuff ported when, in general, nearly all of the Linux software seems to take a "you need [...]
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
[...]
BTW, is 'inline' meanwhile C standard? (I know that from C++ but
haven't done much C for long now.)
C added inline in C99 (the 1999 edition of the ISO C standard, the same
one that removed implicit int).
I think C and C++ have subtly different semantics for inline.
Mostly related to symbol visibility, IIRC.
On 29/12/2023 20:23, Kaz Kylheku wrote:
On 2023-12-29, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
David Brown <david.brown@hesbynett.no> writes:
A useful tool that someone might like to write for this particular
situation would be a partial C preprocessor, letting you choose what
gets handled. You could choose to expand the code here for, say,
_GNU_SOURCE and _BSD_SOURCE - any use of these in #ifdef's and
conditional compilation would be expanded according to whether you
have defined the symbols or not, leaving an output that is easier to
understand while keeping most of the pre-processor stuff unchanged (so
not affecting #includes, and leaving #define'd macros and constants
untouched and therefore more readable).
The unifdef tool does some of this. (I haven't used it much.)
GNU cpp has an option which is something like this: -fdirectives-only.
It causes it not to expand macros.
It flattens include files, processes conditionals, and keeps #defines unchanged.
However, it turns gcc's sys/stat.h from 300 lines into 3000 lines.
If I apply it to my stat.h (also my stddef.h which it includes), which
are 110 lines together, it produces 900 lines. Most of that consists of
lots of built-in #defines with __ prefixes (each complete with a line
saying it is built-in).
When I use my own conversion tool (designed to turn C headers that
define APIs into declarations in my language), the output is 65 lines.
The gcc option does not expand typedefs or macros. So if there is a declaration using a type which uses both, that is unchanged, which is
not helpful. (At least not if trying to create bindings for your FFI.)
gcc with just -E will expand macros but still keep typedefs.
So, for producing a streamlined standard header, it still leaves a lot
to be desired. And for trying to flatten layers of macros and typedefs,
to reveal the underlying types, it's not that great either.
Purpose-built tools are always better, but dealing with C is not trivial anyway, and dealing with gnu- and gcc-specific features is even harder.
On 29/12/2023 23:40, Bart wrote:
On 29/12/2023 20:23, Kaz Kylheku wrote:
On 2023-12-29, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
David Brown <david.brown@hesbynett.no> writes:
A useful tool that someone might like to write for this particular
situation would be a partial C preprocessor, letting you choose what
gets handled. You could choose to expand the code here for, say,
_GNU_SOURCE and _BSD_SOURCE - any use of these in #ifdef's and
conditional compilation would be expanded according to whether you
have defined the symbols or not, leaving an output that is easier to
understand while keeping most of the pre-processor stuff unchanged (so
not affecting #includes, and leaving #define'd macros and constants
untouched and therefore more readable).
The unifdef tool does some of this. (I haven't used it much.)
GNU cpp has an option which is something like this: -fdirectives-only.
It causes it not to expand macros.
It flattens include files, processes conditionals, and keeps #defines
unchanged.
However, it turns gcc's sys/stat.h from 300 lines into 3000 lines.
If I apply it to my stat.h (also my stddef.h which it includes), which
are 110 lines together, it produces 900 lines. Most of that consists
of lots of built-in #defines with __ prefixes (each complete with a
line saying it is built-in).
When I use my own conversion tool (designed to turn C headers that
define APIs into declarations in my language), the output is 65 lines.
The gcc option does not expand typedefs or macros. So if there is a
declaration using a type which uses both, that is unchanged, which is
not helpful. (At least not if trying to create bindings for your FFI.)
gcc with just -E will expand macros but still keep typedefs.
Note that typedefs are part of the core C language, not the
preprocessor, so there could not possibly be a cpp option to do anything
with typedefs (the phrase "expand typedefs" is entirely wrong).
I realise that you (and possibly others) might find it useful for a tool
to replace typedef identifiers with their definitions, but it could only
be done for some cases, and is not as simple as macro substitution.
On 12/30/2023 8:18 PM, Bart wrote:
On 31/12/2023 01:34, Lawrence D'Oliveiro wrote:
On Sat, 30 Dec 2023 19:14:55 -0600, BGB wrote:
Note that (unlike ELF on Linux), it is not currently possible to
directly share global variables across DLL boundaries.
Windows is broken in so many ways ... when you achieve something, it’s
like getting a bear to dance: it’s not that it dances badly, but that it
dances at all.
I think that that limitation was specific to BGB's handling of DLLs;
it was not made clear.
Yes.
In my case (for my custom target) I am using a modified version of
PE/COFF (with some tweaks, *), but it has the issue that there is not currently (any) mechanism for sharing variables across DLL boundaries
apart from getter/setter functions or similar.
Each DLL exports certain symbols such as the addresses of functions
and variables. So no reason you can't access a variable exported from
any DLL, unless perhaps multiple instances of the same DLL have to
share the same static data, but that sounds very unlikely, as little
would work.
Much past roughly Win9x or so, it has been possible to use "__declspec(dllimport)" on global variables in Windows (in an earlier
era, it was not possible to use the __declspec's, but instead necessary
to manage DLL import/exports by writing out lists in ".DEF" files).
It isn't entirely transparent, but yes, on actual Windows, it is very
much possible to share global variables across DLL boundaries.
Just, this feature is not (yet) supported by my compiler. Personally, I
don't see this as a huge loss (even if it did work; I personally see it
as "poor coding practice").
This sums up my experience of software originating on Linux. This is why
Windows had to acquire CYGWIN, then MSYS, then WSL. You can't build the
simplest program without involving half of Linux.
Yes, and it is really annoying sometimes.
For the most part, Linux software builds and works fairly well... if one
is using a relatively mainline and relatively up-to-date Linux distro...
But, if one is not trying to build in or for a typical Linux style / GNU based userland; it is straight up pain...
Like, typically either the "./configure" script is going to go down in a crap-storm of error messages (say, if the shell is not "bash", or some commands it tries to use are absent or don't accept the same
command-line arguments, etc); or libraries are going to be missing; or
the build just ends up dying due to compiler errors (say, which headers
exist are different, or their contents are different, ...).
Within the code itself, it often doesn't take much looking to find one of:
Pointer arithmetic on "void *";
Various GCC specific "__attribute__((whatever))" modifiers;
Blobs of GAS specific inline ASM;
...
Whereas in more cross-platform code, one will usually find stuff like:
#ifdef __GNUC__
... GCC specific stuff goes here ...
#endif
#ifdef _MSC_VER
... MSVC specific stuff goes here ...
#endif
...
My compiler uses sort of an intermediate C dialect, but is more
conservative by default in some areas, such as treating things like TBAA
as "opt-in" features, rather than "opt-out", ...
Though, I did designate various cases as "no consistent or sensible
behavior exists", so "whatever happens, happens". Separating out cases
that are "technically undefined, but has a conventionally accepted
behavior" (such as using pointer casts for type punning, etc), vs "no accepted behavior and any behavior that may result is effectively a dice roll..." (a lot of cases involving out-of-bounds memory access, etc).
Some amount of the extensions have more MSVC-like syntax (albeit the ASM syntax itself is more derived from GAS style ASM syntax than Intel style syntax). Though, in particular, it is derived from "GAS SuperH" (which
falls into a similar category as M68K and PDP-11 ASM syntax):
R4 //this is a register
@R4 //memory with address in R4
(R4) //same as @R4
(R4,32) //displacement
32(R4) //same as (R4,32)
and file handles. With DLL, a pointer malloc-ed in the host cannot be
freed within the DLL and vice versa.)
In C, the "inline" qualifier is pretty much a message from the
programmer saying "I think the resulting code would be more efficient
if this is inlined by the optimiser".
On 31/12/2023 01:36, Lawrence D'Oliveiro wrote:
On Sat, 30 Dec 2023 01:58:53 +0000, Bart wrote:
So, why don't the vendors of the library do that exercise?
Maybe because most of the “vendors” of proprietary libraries have gone
extinct. What we have now is “developers” and “contributors” to
open-source projects. And if you have a bright idea for how they can do things
better, you are free to contribute it.
I have plenty of ideas, but people are generally not interested.
On 12/30/2023 12:51 AM, Blue-Maned_Hawk wrote:
Bart wrote:
Why not have a dedicated header file that is the specific to a
particular version of a C compiler for a given platform? That it can be
streamlined for that purpose.
In fact, this is similar to exactly what Plan 9 did: for include files
that had arch-dependent content, it'd have a separate version of them for
each arch, with an arch's versions of headers like these all stored in
their own directory.
Yeah, we don't actually need big monolithic C libraries, nor big
monolithic C compilers, ...
Though, it seems like a different kind of strategy could be possible:
Split compilers into frontends and backends (which may exist as semi-independent projects).
Frontend deals with source languages, compiles down to a common IR (with
the IR taking the place of object files);
Backend compiles this to the actual machine code, and does any linking, before emitting the actual binary.
On 31/12/2023 15:25, David Brown wrote:
I realise that you (and possibly others) might find it useful for a tool
to replace typedef identifiers with their definitions, but it could only
be done for some cases, and is not as simple as macro substitution.
Take this program, which uses two nested typedefs and one macro:
typedef short T;
typedef T U;
#define V U
typedef struct R {
V a, b, c;
} S;
Passed through 'gcc -E', it manages to expand the V in the struct with
U. (-fdirectives-only doesn't even do that).
So what are the types of 'a, b, c'? Across 1000s of lines of code, they
may need tracking down. At least, for someone not using your super-duper
tools.
On 2023-12-31, Bart <bc@freeuk.cm> wrote:
Take this program, which uses two nested typedefs and one macro:
typedef short T;
typedef T U;
#define V U
typedef struct R {
V a, b, c;
} S;
Passed through 'gcc -E', it manages to expand the V in the struct with
U. (-fdirectives-only doesn't even do that).
So what are the types of 'a, b, c'? Across 1000s of lines of code, they
may need tracking down. At least, for someone not using your super-duper
tools.
Super duper tools like, oh, Exuberant Ctags from 2011, packaged in Ubuntu:
$ ctags --version
Exuberant Ctags 5.9~svn20110310, Copyright (C) 1996-2009 Darren Hiebert
Addresses: <dhiebert@users.sourceforge.net>, http://ctags.sourceforge.net
Optional compiled features: +wildcards, +regex
We run that and then super-duper editor Vim will use the tags
file to jump to the definition of V, and of U.
On 2023-12-31, Bart <bc@freeuk.cm> wrote:
and file handles. With DLL, a pointer malloc-ed in the host cannot be
freed within the DLL and vice versa.)
What??? All DLLs are in the same address space. malloc and free are
sister functions that typically live in the same DLL, and don't
care what calls them.
Maybe you're referring to one DLL's free not being able to handle
pointers produced by another DLL's malloc.
On 31/12/2023 17:26, Kaz Kylheku wrote:
On 2023-12-31, Bart <bc@freeuk.cm> wrote:
and file handles. With DLL, a pointer malloc-ed in the host cannot be
freed within the DLL and vice versa.)
What??? All DLLs are in the same address space. malloc and free are
sister functions that typically live in the same DLL, and don't
care what calls them.
Maybe you're referring to one DLL's free not being able to handle
pointers produced by another DLL's malloc.
I'm referring to the possibility that, if host and DLL both import say msvcrt.dll, each may have its own instance of msvcrt.dll, with its
own static data. That would also be the case with two DLLs.
But I can't reproduce the kind of error that would cause.
So I was either mistaken, or it's been fixed in the last decade, or
maybe my original test failed for other reasons, eg. because
statically linked libraries were used rather than shared ones, or mixed compilers (and so mixed libraries) were used.
I realise that you (and possibly others) might find it useful for a tool
to replace typedef identifiers with their definitions, but it could only
be done for some cases, and is not as simple as macro substitution.
I have plenty of ideas, but people are generally not interested.
On 31/12/2023 18:44, Kaz Kylheku wrote:
On 2023-12-31, Bart <bc@freeuk.cm> wrote:
Take this program, which uses two nested typedefs and one macro:
typedef short T;
typedef T U;
#define V U
typedef struct R {
V a, b, c;
} S;
Passed through 'gcc -E', it manages to expand the V in the struct with
U. (-fdirectives-only doesn't even do that).
So what are the types of 'a, b, c'? Across 1000s of lines of code, they
may need tracking down. At least, for someone not using your super-duper tools.
Super duper tools like, oh, Exuberant Ctags from 2011, packaged in
Ubuntu:
$ ctags --version
Exuberant Ctags 5.9~svn20110310, Copyright (C) 1996-2009 Darren Hiebert
Addresses: <dhiebert@users.sourceforge.net>,
http://ctags.sourceforge.net
Optional compiled features: +wildcards, +regex
We run that and then super-duper editor Vim will use the tags
file to jump to the definition of V, and of U.
Actually ctags appears to be also part of Windows.
By the way, this kind of question is more appropriate for comp.unix.programmer.
This sums up my experience of software originating on Linux. This is why Windows had to acquire Cygwin, then MSYS, then WSL. You can't build the simplest program without involving half of Linux.
This is a CPP question that arose last month. It's not about an
actual issue with the software, just out of curiosity and to be sure
it works reliably (it seemingly does).
In a C99 program on Linux (Ubuntu) I intended to use usleep() and
then also strnlen().
When I added usleep() and its include file I got an error and was
asked to define the CPP tag '_BSD_SOURCE'. I did so, and because I
wanted side effects of that tag kept as small as possible I
prepended it just before the respective #include and put it at the
end of my #include list
...other #includes...
#define _BSD_SOURCE
#include <unistd.h>
But as became obvious, *that* way there had been side-effects and I
had to put the tag at the beginning of all include files (which
astonished me)
#define _BSD_SOURCE
#include <unistd.h>
...other #includes here...
For the strnlen() function I needed another CPP tag, '_GNU_SOURCE'.
So now I have both CPP tag definitions before the includes
#define _GNU_SOURCE /* necessary for strnlen() in string.h */
#define _BSD_SOURCE /* necessary for usleep() in unistd.h */
...all #includes here...
On 12/31/2023 1:44 PM, Lawrence D'Oliveiro wrote:
If you want to see the right way to do macros, look at LISP, where they
are token-based, and much more robust as a result. I think they even
manage to apply scoping rules to macro definitions as well.
Fwiw, check this out:
[dead project with spam link omitted]
On 31/12/2023 23:46, Lawrence D'Oliveiro wrote:
If you meant “you can’t build the simplest
program *on Windows* without involving half of Linux” ... well, that’s just a reflection on the deficiencies of Windows. On Linux, you already
have the *whole* of Linux to start with.
And developers feel it necessary to USE everything that it provides!
I've never managed to build the GMP library on Windows for example (it
only comes as source code), because it requires that 30,000-line bash
script which in turn needs sed and m4 and all the rest.
Why? It's a numeric library. Why should it be dependent on OS?
Or maybe Linux developers NEED all that hand-holding and have no idea
how to build using a bare compiler.
Remember that end-users building
such projects are only doing a one-time build to get a working binary.
On Sun, 31 Dec 2023 02:18:25 +0000, Bart wrote:
This sums up my experience of software originating on Linux. This is why
Windows had to acquire CYGWIN then MSYS then WSL. You can't build the
simplest program without involving half of Linux.
On Linux, we have package managers that only pull in the needed
dependencies. Windows just seems actively hostile to that kind of infrastructure management. If you meant “you can’t build the simplest program *on Windows* without involving half of Linux” ... well, that’s just a reflection on the deficiencies of Windows. On Linux, you already
have the *whole* of Linux to start with.
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
[...]
BTW, is 'inline' meanwhile C standard? (I know that from C++ but
haven't done much C for long now.)
C added inline in C99 (the 1999 edition of the ISO C standard, the same
one that removed implicit int).
I think C and C++ have subtly different semantics for inline.
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
This is a CPP question that arose last month. It's not about an
actual issue with the software, just out of curiosity and to be sure
it works reliably (it seemingly does).
In a C99 program on Linux (Ubuntu) I intended to use usleep() and
then also strnlen().
When I added usleep() and its include file I got an error and was
asked to define the CPP tag '_BSD_SOURCE'. I did so, and because I
wanted side effects of that tag kept as small as possible I
prepended it just before the respective #include and put it at the
end of my #include list
...other #includes...
#define _BSD_SOURCE
#include <unistd.h>
But as became obvious, *that* way there had been side-effects and I
had to put the tag at the beginning of all include files (which
astonished me)
#define _BSD_SOURCE
#include <unistd.h>
...other #includes here...
For the strnlen() function I needed another CPP tag, '_GNU_SOURCE'.
So now I have both CPP tag definitions before the includes
I second the recommendations of Lowell Gilbert and others not to
define _BSD_SOURCE or _GNU_SOURCE (especially not _GNU_SOURCE)
but instead seek alternatives, which are readily available for
the two functionalities being sought in this case.
#define _GNU_SOURCE /* necessary for strnlen() in string.h */
#define _BSD_SOURCE /* necessary for usleep() in unistd.h */
...all #includes here...
For strnlen(), put an inline definition in a header file:
#ifndef HAVE_strnlen_dot_h_header
#define HAVE_strnlen_dot_h_header
#include <stddef.h>
static inline size_t
strnlen( const char *s, size_t n ){
extern void *memchr( const void *, int, size_t );
const char *p = memchr( s, 0, n );
return p ? (size_t){ p-s } : n;
}
#include <string.h>
#endif
Disclaimer: this code has been compiled but not tested.
strnlen() is specified by POSIX. It might make sense to
re-implement it if your code needs to work on a non-POSIX system
(that doesn't also provide it). Why would you want to do so
otherwise?
memchr() is declared in <string.h>. Why would you duplicate its
declaration rather than just using `#include <string.h>`?
For usleep(), define an alternate function usnooze(), to be used
in place of usleep(). In header file usnooze.h:
[snip]
If your code doesn't need to be portable to systems that don't
provide usleep(), you can just use usleep(). If it does, it's
probably better to modify the code so it uses nanosleep().
On 31/12/2023 19:37, Bart wrote:
Actually ctags appears to be also part of Windows.
That's not the case, it just happened to be bundled with Windows.
How does Vim know where in the file to look? The TAGS files I've
managed to produce don't have that info, and I can't see anything
in the help to add it.
How do you get it to look inside header files, or do those have to be submitted manually?
On 12/31/2023 3:46 PM, Lawrence D'Oliveiro wrote:
On Linux, we have package managers that only pull in the needed
dependencies. Windows just seems actively hostile to that kind of
infrastructure management. If you meant “you can’t build the simplest
program *on Windows* without involving half of Linux” ... well, that’s just a reflection on the deficiencies of Windows. On Linux, you already
have the *whole* of Linux to start with.
Check this out:
https://vcpkg.io/en/
Kaz Kylheku <433-929-6894@kylheku.com> writes:
On 2023-12-31, Bart <bc@freeuk.cm> wrote:[...]
How do you get it to look inside header files, or do those have to be
submitted manually?
ctags -R will process a tree recursively, including any C header files.
However, things that are only declared and not defined in header files,
like function declarations, are not indexed.
The exuberant-ctags version I have on Ubuntu and Cygwin doesn't seem to
have an option to process a tree recursively, or to deal with
directories at all. (I find that a little surprising; it would be
useful functionality.) The -R option is documented as follows:
-R, --no-regex
Don't do any more regexp matching on the following files. May
be freely intermixed with filenames and the --regex option.
[...]
[...]
It was just an example of some fairly hard core macro
magic.
On Sun, 31 Dec 2023 22:57:12 -0800, Chris M. Thomasson wrote:
It was just an example of some fairly hard core macro
magic.
String-based macros aren’t “magic”, they’re just sad.
On 12/31/23 08:40, David Brown wrote:
...
In C, the "inline" qualifier is pretty much a message from the
programmer saying "I think the resulting code would be more efficient
if this is inlined by the optimiser".
Actually, what the C standard says is "Making a function an
inline function suggests that calls to the function be as fast as
possible". The standard does not specify how this is to be achieved, it merely imposes some requirements that constrain how it could be
achieved. Inlining a function call is just one way to do that.
On Mon, 1 Jan 2024 01:33:38 +0000, Bart wrote:
On 31/12/2023 23:46, Lawrence D'Oliveiro wrote:
If you meant “you can’t build the simplest
program *on Windows* without involving half of Linux” ... well, that’s just a reflection on the deficiencies of Windows. On Linux, you already
have the *whole* of Linux to start with.
And developers feel it necessary to USE everything that it provides!
It’s called “code reuse”. A well-designed package-management system just
makes it so much easier to do.
I've never managed to build the GMP library on Windows for example (it
only comes as source code), because it requires that 30,000-line bash
script which in turn needs sed and m4 and all the rest.
Why? It's a numeric library. Why should it be dependent on OS?
Those are just standard file-manipulation tools that any decent OS should provide.
Or maybe Linux developers NEED all that hand-holding and have no idea
how to build using a bare compiler.
If only you could do that on Windows ... but no. Look at all the C runtime stuff needed just to build a simple “Hello World” program ... because Windows automatically assumes that every program must have a GUI.
Remember that end-users building
such projects are only doing a one-time build to get a working binary.
They find it easier to do “apt-get install”, or a GUI wrapper around same,
like Synaptic.
Bart <bc@freeuk.cm> writes:
On 31/12/2023 01:36, Lawrence D'Oliveiro wrote:
On Sat, 30 Dec 2023 01:58:53 +0000, Bart wrote:
So, why don't the vendors of the library do that exercise?
Maybe because most of the “vendors” of proprietary libraries have gone
extinct. What we have now is “developers” and “contributors” to open-source projects. And if you have a bright idea for how they can do things better, you are free to contribute it.
I have plenty of ideas, but people are generally not interested.
Perhaps they are not very good ideas, then....
Frankly, your obsession with header files is puzzling. 99.9%
of C/C++ programmers don't care.
On Sun, 31 Dec 2023 16:25:08 +0100, David Brown wrote:
I realise that you (and possibly others) might find it useful for a tool
to replace typedef identifiers with their definitions, but it could only
be done for some cases, and is not as simple as macro substitution.
String-based macros are nothing but trouble.
Typedefs are scoped, string
macros are not.
On 31/12/2023 15:25, David Brown wrote:
On 29/12/2023 23:40, Bart wrote:
On 29/12/2023 20:23, Kaz Kylheku wrote:
On 2023-12-29, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
David Brown <david.brown@hesbynett.no> writes:
A useful tool that someone might like to write for this particular
situation would be a partial C preprocessor, letting you choose what
gets handled. You could choose to expand the code here for, say,
_GNU_SOURCE and _BSD_SOURCE - any use of these in #ifdef's and
conditional compilation would be expanded according to whether you
have defined the symbols or not, leaving an output that is easier to
understand while keeping most of the pre-processor stuff unchanged (so
not affecting #includes, and leaving #define'd macros and constants
untouched and therefore more readable).
The unifdef tool does some of this. (I haven't used it much.)
GNU cpp has an option which is something like this: -fdirectives-only.
It causes it not to expand macros.
It flattens include files, processes conditionals, and keeps #defines
unchanged.
However, it turns gcc's sys/stat.h from 300 lines into 3000 lines.
If I apply it to my stat.h (also my stddef.h which it includes),
which are 110 lines together, it produces 900 lines. Most of that
consists of lots of built-in #defines with __ prefixes (each complete
with a line saying it is built-in).
When I use my own conversion tool (designed to turn C headers that
define APIs into declarations in my language), the output is 65 lines.
The gcc option does not expand typedefs or macros. So if there is a
declaration using a type which uses both, that is unchanged, which is
not helpful. (At least not if trying to create bindings for your FFI.)
gcc with just -E will expand macros but still keep typedefs.
Note that typedefs are part of the core C language, not the
preprocessor, so there could not possibly be a cpp option to do
anything with typedefs (the phrase "expand typedefs" is entirely wrong).
I realise that you (and possibly others) might find it useful for a
tool to replace typedef identifiers with their definitions, but it
could only be done for some cases, and is not as simple as macro
substitution.
Take this program, which uses two nested typedefs and one macro:
typedef short T;
typedef T U;
#define V U
typedef struct R {
V a, b, c;
} S;
Passed through 'gcc -E', it manages to expand the V in the struct with
U. (-fdirectives-only doesn't even do that).
So what are the types of 'a, b, c'? Across 1000s of lines of code, they
may need tracking down. At least, for someone not using your super-duper tools.
If I use my compiler with 'mcc -mheaders', I get an output file that
includes this:
record R = $caligned
i16 a
i16 b
i16 c
end
It gives all the information I might need. Including the fact that it
uses default C alignment rules.
Notice however the name of the record is R not S; here it needs a
struct-tag to avoid an anonymous name. The typedef name is harder to use
as it is replaced early on in compilation.
Here, I'm effectively expanding a typedef. The output could just as
equally have been C source code.
On Mon, 1 Jan 2024 01:33:38 +0000, Bart wrote:
I've never managed to build the GMP library on Windows for example (it
only comes as source code), because it requires that 30,000-line bash
script which in turn needs sed and m4 and all the rest.
Why? It's a numeric library. Why should it be dependent on OS?
Those are just standard file-manipulation tools that any decent OS
should provide.
On 31/12/2023 22:44, Lawrence D'Oliveiro wrote:
On Sun, 31 Dec 2023 16:25:08 +0100, David Brown wrote:
I realise that you (and possibly others) might find it useful for a tool
to replace typedef identifiers with their definitions, but it could only
be done for some cases, and is not as simple as macro substitution.
String-based macros are nothing but trouble.
What a strange thing to say.
Macros based on textual substitution have their advantages and their disadvantages. It is a reasonable generalisation to say you should
prefer alternatives when available, such as inline functions, const
objects, enum constants, typedefs, etc., rather than macro equivalents.
But there are plenty of situations where C's pre-processor macros are extremely useful in writing good, clear and maintainable code.
Typedefs are scoped, string
macros are not.
True. Sometimes that is an advantage, sometimes a disadvantage.
Like any powerful tool, macros can be abused or misused, leading to
poorer results - but that does not mean they are "nothing but trouble".
On 1/1/2024 5:56 AM, Bart wrote:
Quite a substantial program that can be built effortlessly on either
Windows or Linux. Using Tiny C, it apparently compiles it in 75ms.
Try the same exercise with any equivalently-sized program
originating on Linux. That is, end up with only a C file (or even
multiple C files) that I can build with a bare compiler.
That seems to be beyond most Linux developers.
For a lot of stuff, it is surprising what one can get done with a
one-liner shell script or batch file:
gcc -o prog_name sources... options...
Or:
cl /Feprog_name sources... options...
And, then 'make' is still plenty usable.
The more complex build systems often don't really help, they often make
the problem worse, and are needlessly overkill for most programs.
Yeah, and if one does a console program, the code is basically identical either way:
#include <stdio.h>
int main(int argc, char *argv[])
{
printf("Hello World\n");
return(0);
}
If building it from the command line with MSVC:
cl helloworld.c
Not a big issue...
GUI is a pain, but this is true regardless of OS.
Would be good though if there were a "good" portable way to do cross
platform GUI programs, but alas.
A common strategy is for a program to do all the UI drawing itself; it can
avoid the need for platform-specific window-management code by using SDL or similar.
One other drawback of GTK is that ...
We don't need OS repos, because program installers can be casually
downloaded off the internet.
I guess, technically there is now the "Microsoft Store" (which is sort
of like "Google Play" on Android), but haven't really used it for much.
On 01/01/2024 02:00, Lawrence D'Oliveiro wrote:
Those are just standard file-manipulation tools that any decent OS
should provide.
What's that got to do with being able to build programs easily?
The more complex build systems often don't really help, they often make
the problem worse, and are needlessly overkill for most programs.
Macros based on textual substitution have their advantages and their disadvantages.
On 12/31/2023 8:48 PM, Lawrence D'Oliveiro wrote:
On Sun, 31 Dec 2023 20:06:38 -0800, Chris M. Thomasson wrote:
On 12/31/2023 3:46 PM, Lawrence D'Oliveiro wrote:
On Linux, we have package managers that only pull in the needed
dependencies. Windows just seems actively hostile to that kind of
infrastructure management. If you meant “you can’t build the simplest
program *on Windows* without involving half of Linux” ... well,
that’s just a reflection on the deficiencies of Windows. On Linux,
you already have the *whole* of Linux to start with.
Check this out:
https://vcpkg.io/en/
How wonderful. From Microsoft, yet it is only for C/C++? Not even
supporting Microsoft’s flagship language, C#?
I don't think it supports C# at all, highly doubt it.
On Mon, 1 Jan 2024 15:44:42 +0100, David Brown wrote:
Macros based on textual substitution have their advantages and their
disadvantages.
They may have some advantage over not having macro-substitution facilities
at all, but they have no important advantage over token-based macro
systems.
I mentioned “homoiconicity” elsewhere: that is the right and robust way to
do it, letting you achieve useful and powerful things without running the real risk that it will all come down around your ears like a house of
cards.
On Mon, 1 Jan 2024 11:56:00 +0000, Bart wrote:
On 01/01/2024 02:00, Lawrence D'Oliveiro wrote:
Those are just standard file-manipulation tools that any decent OS
should provide.
What's that got to do with being able to build programs easily?
You have effectively answered that question yourself. Why were you trying
to build GNU GMP on Windows? Isn’t there something equally capable and yet more, shall we say, “native” to Windows, that you can build and use in a more “Windows-native” fashion?
Obviously the answer is no. And why? Because the Windows environment is simply not conducive to the development of such software.
Which is why the
software that exists is primarily oriented towards Linux-compatible
systems, and why you struggle to get it going on Windows.
On 01/01/2024 21:38, Lawrence D'Oliveiro wrote:
Obviously the answer is no. And why? Because the Windows environment is
simply not conducive to the development of such software.
Why not?
The software in question is a bunch of C files with a smattering of .s assembly files for a range of specific architectures.
On 01/01/2024 23:10, Lawrence D'Oliveiro wrote:
You have the answer right in front of your nose: look at the source of
the software itself and figure it out.
So you don't know. You choose to believe that it HAS to be done that
way.
On Mon, 1 Jan 2024 22:51:52 +0000, Bart wrote:
On 01/01/2024 21:38, Lawrence D'Oliveiro wrote:
Obviously the answer is no. And why? Because the Windows environment is
simply not conducive to the development of such software.
Why not?
You have the answer right in front of your nose: look at the source of the software itself and figure it out.
The software in question is a bunch of C files with a smattering of .s
assembly files for a range of specific architectures.
If you can come up with a simpler build system, why not publish your
version? And everybody else will adopt it because it will be so much
simpler than the present arrangement.
My point is that microsoft offers vcpkg. That's all.
On Mon, 1 Jan 2024 23:45:18 +0000, Bart wrote:
On 01/01/2024 23:10, Lawrence D'Oliveiro wrote:
You have the answer right in front of your nose: look at the source of
the software itself and figure it out.
So you don't know. You choose to believe that it HAS to be done that
way.
Out of curiosity, I *did* have a look at the build system for libgmp. The “configure” file is over 30,000 lines, but that is generated (via m4 macro
expansion) from a much smaller (and easier to read) “configure.ac” of about 4,000 lines. And the contents of the latter file are quite
interesting.
First of all, it includes code for building on a whole lot of proprietary Unix systems with non-GCC compilers, some of which maybe don’t quite conform to official C/C++ standards. You would think all those systems are extinct now, but clearly there are users who still care about them.
And that’s not counting the different ways GMP offers for you to build it. For example, do you want profiling? Omit procedure call frames? Static or shared library? Do you want to build the test programs? (Yes, the project includes its own test suite.) Do you want the test programs to use GNU readline to get their input? (Sounds like a handy option to me, at least
you have the choice.)
That is an amazingly long list of CPU architectures on which you can build it, all the way up to IBM mainframes. And it is in the nature of a library like this, doing CPU-intensive things, that it will need to take account
of CPU-specific quirks in order to get maximum efficiency. Hence a whole
load of special cases, particularly in code generation, for all these architectures.
Look at the ABI options: maybe you want a 32-bit build (where the CPU supports it) rather than a 64-bit one; on some architectures, things get a bit more complicated than just those two options.
And so on and so on. Feel free to tell us which parts of these you would
get rid of, without antagonizing users who might depend on them.
I would have no interest in building it, I wanted a ready-made DLL, for
that little-known niche platform known as 'Windows'.
However, there is no reliable repository of such libraries, so I can't
use it as dependency.
So, yes, I want the set of files specific to my platform; I don't care
about all the obscure ones it might support.
This does not sound unreasonable ...
And, if the Makefile isn't some mountain of auto-generated crap, then
people can fix it (without too much effort) if something is broken.
Of these, CMake is at least competent at being cross-platform (much
better than autotools in this regard).
On 2024-01-02, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
First of all, it includes code for building on a whole lot of
proprietary Unix systems with non-GCC compilers, some of which maybe
don’t quite conform to official C/C++ standards.
When people use Autoconf to configure their program for building, the generated script includes cruft for platforms on which they never tested
the program, and never will, and on which it won't work for reasons not related to those tests.
Look at the ABI options: maybe you want a 32-bit build (where the CPU
supports it) rather than a 64-bit one; on some architectures, things
get a bit more complicated than just those two options.
These kinds of options can be passed in by CFLAGS, LDFLAGS and LDLIBS. Support those three variables, and that's it.
On 31/12/2023 18:33, Scott Lurndal wrote:
Bart <bc@freeuk.cm> writes:
On 31/12/2023 01:36, Lawrence D'Oliveiro wrote:
On Sat, 30 Dec 2023 01:58:53 +0000, Bart wrote:
So, why don't the vendors of the library do that exercise?
Maybe because most of the “vendors” of proprietary libraries have gone
extinct. What we have now is “developers” and “contributors” to open-source
projects. And if you have a bright idea for how they can do things
better, you are free to contribute it.
I have plenty of ideas, but people are generally not interested.
Perhaps they are not very good ideas, then....
Frankly, your obsession with header files is puzzling. 99.9%
of C/C++ programmers don't care.
Well, they should care more.
Even considering only C, using libraries like SDL2, Windows, GTK looks
simple enough; you just write #include <header.h>, but behind that
header could be 100s of nested header files and 100s of 1000s of lines
of code.
On 01/01/2024 14:44, David Brown wrote:
On 31/12/2023 22:44, Lawrence D'Oliveiro wrote:
On Sun, 31 Dec 2023 16:25:08 +0100, David Brown wrote:
I realise that you (and possibly others) might find it useful for a
tool
to replace typedef identifiers with their definitions, but it could
only
be done for some cases, and is not as simple as macro substitution.
String-based macros are nothing but trouble.
What a strange thing to say.
Macros based on textual substitution have their advantages and their
disadvantages. It is a reasonable generalisation to say you should
prefer alternatives when available, such as inline functions, const
objects, enum constants, typedefs, etc., rather than macro
equivalents. But there are plenty of situations where C's
pre-processor macros are extremely useful in writing good, clear and
maintainable code.
The cases where macros are used sensibly would be better replaced by apt
language features (D does this, for example, as it has no preprocessor).
But I tend to come across the ones where macros are overused or abused.
Some people seem to delight in doing so.
When working on the Lua sources a few months ago, I found that they make
heavy use of macros defined in lower case which look just like function calls.
One complex expression that my compiler had trouble with, when expanded,
resulted in a line hundreds of characters long, and using 11 levels of
nested parentheses, the most I've ever come across.
I really doubt that the authors would have written code that convoluted
if macros had not been available.
Typedefs are scoped, string
macros are not.
True. Sometimes that is an advantage, sometimes a disadvantage.
When is it an advantage?
My systems language only acquired macros with parameters a year or two
ago. They can only be used to expand to well-formed terms of
expressions, and are used very sparingly.
They have normal scoping, so can be shadowed or have distinct versions
in different scopes and functions can define their own macros; they can
be imported or exported just like functions and variables.
The sorts of macros that C has seem stone age by comparison. And crude,
in being able to define random bits of syntax, or synthesise tokens from multiple parts.
Some people will obviously love that, but imagine trying to search for
some token with a text editor, but it won't find it because it only
exists after preprocessing.
Like any powerful tool, macros can be abused or misused, leading to
poorer results - but that does not mean they are "nothing but trouble".
I wouldn't use the word 'powerful'; 'dangerous' and 'crazy' might be
more apt.
Imagine a CPP that could also map one letter to another; more 'powerful',
or just more 'crazy'?
On Tue, 2 Jan 2024 06:23:03 -0000 (UTC), Kaz Kylheku wrote:
On 2024-01-02, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
First of all, it includes code for building on a whole lot of
proprietary Unix systems with non-GCC compilers, some of which maybe
don’t quite conform to official C/C++ standards.
When people use Autoconf to configure their program for building, the
generated script includes cruft for platforms on which they never tested
the program, and never will, and on which it won't work for reasons not
related to those tests.
This is a GNU project, so you can be sure they have done pretty good
testing on those platforms and those options. If they were no longer supported, they would have been dropped from the code. This isn’t like proprietary software, which tends to accumulate cruft that nobody dares
touch because they no longer understand what it does.
Look at the ABI options: maybe you want a 32-bit build (where the CPU
supports it) rather than a 64-bit one; on some architectures, things
get a bit more complicated than just those two options.
These kinds of options can be passed in by CFLAGS, LDFLAGS and LDLIBS.
Support those three variables, and that's it.
If you have a look, that’s what the script does. Only you just have to specify one option, and it generates the right value for *all* the
relevant variables, instead of you having to specify them individually.
One complex expression that my compiler had trouble with, when expanded,
resulted in a line hundreds of characters long, and using 11 levels of
nested parentheses, the most I've ever come across.
When working on the Lua sources a few months ago, that makes heavy use of
macros defined in lower case which look just like function calls.
That's fine, if they also act just like function calls.
On 01/01/2024 16:54, Bart wrote:
I use macros for things like "Assert" and "Panic", where the controlling expression gets "stringified" and the function name, file and line
numbers are included in the message that is printed and stored in logs.
You can't do that without a preprocessor, unless the language supports a level of reflection well beyond the very limited forms found in D and
C++ (and even the proposals for C++).
I use macros for giving neater names to things that can't be made as functions, constants, etc., such as gcc __attributes__ or C++
attributes, or lists of pragmas, especially in connection with
conditional compilation to adapt to different compilers or platforms.
I use macros if I need to replace something temporarily, such as for debugging or tracing part of a program.
I use the preprocessor for generating unique names for things that need unique names for the language syntax, but for which I will never refer
to by name.
The most complex preprocessor stuff I have is probably use of so-called "x-macros". I've used these to build simple command-line interfaces
(with commands, sub-commands, and parameters, along with help text) and hierarchical menu systems.
One complex expression that my compiler had trouble with, when
expanded, resulted in a line hundreds of characters long, and using
11 levels of nested parentheses, the most I've ever come across.
That's great - something that can help you find bugs or unnecessary
limits in your compiler.
If you want to have something that works outside of scopes, being
unscoped is an advantage. It does not happen often, but it happens.
Maybe you've got functions that need a lot of calculations, using lines
that follow a similar pattern. Putting those patterns in a macro saves repetition in the source code, reduces the risk of errors, and makes the whole thing clearer. But if the identifiers in the pattern have
different scopes when they are used (perhaps they are in different functions), you can take advantage of macros' independence of scopes to
avoid having to pass local data as parameters.
They have normal scoping, so can be shadowed or have distinct versions
in different scopes and functions can define their own macros; they
can be imported or exported just like functions and variables.
So why not just use functions?
Or implement more advanced core language
features, like templates?
The point of C preprocessor macros - the reason that they are useful in
ways that cannot be handled by core language features - is that they are purely textual. They exist outside of scoping, and language syntax.
They can have unmatched brackets, or construct identifiers on the fly,
or do all kinds of manipulation of code. That allows for very powerful
uses - and, of course, abuses.
It is certainly the case that some common uses of macros in C have been
made redundant by better language features in C++, D, and even in later
versions of C. Most common uses of #define'd constants are better
handled by "static const" or "enum". Most function-like macros are
better handled by static inline functions, or C++/D templates. Ugly
C90/C99 style static_assert macros are best done with real
_Static_assert from C11. Many "tricks" that previously needed macros to
get efficient code generation are made unnecessary by modern optimising compilers.
So if you compare decades-old C code with modern C++, you should see a dramatic reduction in macros and pre-processor usage.
You say "crude", others say "powerful" and "flexible". The others would
be right.
Some people will obviously love that, but imagine trying to search for
some token with a text editor, but it won't find it because it only
exists after preprocessing.
Imagine /not/ having macros, and having to type out those tokens again
and again, when a macro could be defined once. Then imagine wanting to change the tokens, and having to do so everywhere in the code instead of
once in the definition.
It is not often that you get a free lunch - power and flexibility in one
way will often limit things in other ways. There are always tradeoffs,
with all features, in all languages - I would have thought you might
have learned that by now.
On Sun, 31 Dec 2023 22:57:12 -0800, Chris M. Thomasson wrote:
It was just an example of some fairly hard core macro magic.
String-based macros aren’t “magic”, they’re just sad.
Bart wrote:
One complex expression that my compiler had trouble with, when expanded,
resulted in a line hundreds of characters long, and using 11 levels of
nested parentheses, the most I've ever come across.
Happen to remember which one that was, do you?
On 02/01/2024 10:42, David Brown wrote:
On 01/01/2024 16:54, Bart wrote:
I use macros for things like "Assert" and "Panic", where the
controlling expression gets "stringified" and the function name, file
and line numbers are included in the message that is printed and
stored in logs. You can't do that without a preprocessor, unless the
language supports a level of reflection well beyond the very limited
forms found in D and C++ (and even the proposals for C++).
I use macros for giving neater names to things that can't be made as
functions, constants, etc., such as gcc __attributes__ or C++
attributes, or lists of pragmas, especially in connection with
conditional compilation to adapt to different compilers or platforms.
I use macros if I need to replace something temporarily, such as for
debugging or tracing part of a program.
I use the preprocessor for generating unique names for things that
need unique names for the language syntax, but for which I will never
refer to by name.
The most complex preprocessor stuff I have is probably use of
so-called "x-macros". I've used these to build simple command-line
interfaces (with commands, sub-commands, and parameters, along with
help text) and hierarchical menu systems.
X-macros are ugly. They are unreadable.
At best, if someone has already
done the hard work of setting up a working set of macros, then you will
be able to see how to modify, add or delete entries.
For the purposes of creating parallel sets of enums and associated data,
I use special syntax which makes for a clearer and simpler feature.
Have you ever considered that if C didn't have macros, or they weren't powerful, then it could have evolved superior, built-in alternatives?
One complex expression that my compiler had trouble with, when
expanded, resulted in a line hundreds of characters long, and using
11 levels of nested parentheses, the most I've ever come across.
That's great - something that can help you find bugs or unnecessary
limits in your compiler.
It wasn't a compiler limit - it was mine! The resulting expression was completely unreadable.
In the end I resorted to looking at the AST as
that was clearer than C source code.
If you want to have something that works outside of scopes, being
unscoped is an advantage. It does not happen often, but it happens.
Maybe you've got functions that need a lot of calculations, using
lines that follow a similar pattern. Putting those patterns in a
macro saves repetition in the source code, reduces the risk of errors,
and makes the whole thing clearer. But if the identifiers in the
pattern have different scopes when they are used (perhaps they are in
different functions), you can take advantage of macros' independence
of scopes to avoid having to pass local data as parameters.
The scope I'm talking about is the name of the macro, not that of the
macro parameters.
They have normal scoping, so can be shadowed or have distinct
versions in different scopes and functions can define their own
macros; they can be imported or exported just like functions and
variables.
So why not just use functions?
There are a few uses where functions won't work or would be inefficient.
My macros are easier to implement than inlined functions.
Half my use-cases involve inline assembly.
Others involve creating
aliases for things like module names:
module mm_mcldecls as md
Now I can use md.F to disambiguate F instead of mm_mcldecls.F (when F is exported by more than one module). Here the macro mechanism is used internally.
The alternative here would be a special-purpose 'alias' feature but this seems to work too.
Or implement more advanced core language features, like templates?
The point of C preprocessor macros - the reason that they are useful
in ways that cannot be handled by core language features - is that
they are purely textual. They exist outside of scoping, and language
syntax. They can have unmatched brackets, or construct identifiers on
the fly, or do all kinds of manipulation of code. That allows for
very powerful uses - and, of course, abuses.
Even when not being abused, their very use can cause problems, for
example when they appear in APIs for libraries that could be used across
an FFI.
Macros are a C language artefact, and what they expand to is arbitrary C syntax that can be meaningless elsewhere.
(When I processed the GTK headers from 350,000 lines of C to 25,000
lines in my syntax, the last 4,000 lines were C macros that weren't identifiable as simple named literals (eg #define A 100) and that would
need dealing with manually.)
It is certainly the case that some common uses of macros in C have
been made redundant by better language features in C++, D, and even in
later versions of C. Most common uses of #define'd constants are
better handled by "static const" or "enum". Most function-like macros
are better handled by static inline functions, or C++/D templates.
Ugly C90/C99 style static_assert macros are best done with real
_Static_assert from C11. Many "tricks" that previously needed macros
to get efficient code generation are made unnecessary by modern
optimising compilers.
So if you compare decades-old C code with modern C++, you should see a
dramatic reduction in macros and pre-processor usage.
I haven't noticed. Things are bad enough now; are you saying they were
worse?
You say "crude", others say "powerful" and "flexible". The others
would be right.
It's crude for many reasons, here's one:
#define x y
That is intended to replace instances of the top level name 'x' with
'y'. But in C, it also replaces 'x' in 'p.x' with 'p.y'.
In fact, any macro name you create is at risk of clashing with struct
member names. Such names are usually considered safe in a private
namespace; not in C!
Some people will obviously love that, but imagine trying to search
for some token with a text editor, but it won't find it because it
only exists after preprocessing.
Imagine /not/ having macros, and having to type out those tokens again
and again, when a macro could be defined once. Then imagine wanting
to change the tokens, and having to do so everywhere in the code
instead of once in the definition.
I don't like macros, even the advanced ones in modern languages. I
consider them an anti-feature. And I did without them for decades, other
than, for a time, simple ones with no parameters.
The parameterised ones I have now are an experiment. But every time I
use one, I get the same feeling as when I write 'goto'.
It is not often that you get a free lunch - power and flexibility in
one way will often limit things in other ways. There are always
tradeoffs, with all features, in all languages - I would have thought
you might have learned that by now.
It's not a free lunch.
A full CPP is quite difficult to implement,
possibly harder than C itself.
The one I have is perhaps 90% there; it
will not do esoteric stuff.
On Sun, 31 Dec 2023 16:25:08 +0100, David Brown wrote:
I realise that you (and possibly others) might find it useful for a tool
to replace typedef identifiers with their definitions, but it could only
be done for some cases, and is not as simple as macro substitution.
String-based macros are nothing but trouble. Typedefs are scoped, string macros are not.
If you want to see the right way to do macros, look at LISP, where they
are token-based, and much more robust as a result. I think they even
manage to apply scoping rules to macro definitions as well.
By the way, I'm not sure what system you are referring to by
"string-based macros", but in case you might be thinking of C
preprocessing, be advised that it's token-based.
You're going to defend this to the death aren't you? Be funny if at some point some GMP II was produced whose main new benefit was a vastly
simplified build!
By doing searches, I found a bunch of libxxx.a files with today's date
in various locations.
So the outputs are archive files for gcc, I guess intended for static linking.
The INSTALL file talks about reading detailed instructions in gmp_info,
but this file is gobbledygook. You need to view the instructions using a program called 'info' - a Linux utility that doesn't exist on Windows.
So I need Linux even just to look at a bunch of instructions?
What, God forbid, if 'gcc -E' was not supported?
And why aren't these checks run every time you build /any/ program?
What a total waste of time.
GMP, when it's actually made available, is an incredibly fast library. A
lot of talent went into creating it. I can't say the
same for those responsible for the build system who seem to have made it
as complicated and convoluted as they possibly can.
On 02/01/2024 15:10, Blue-Maned_Hawk wrote:
Bart wrote:
One complex expression that my compiler had trouble with, when expanded,
resulted in a line hundreds of characters long, and using 11 levels of
nested parentheses, the most I've ever come across.
Happen to remember which one that was, do you?
No. You can have a look yourself if you like, just apply -E to the sources.
Here's one with quite a long expansion but not too deeply nested. This
is the line inside lvm.c at line 1463:
op_arith(L, l_addi, luai_numadd);
The expansion is:
{TValue*v1=(&((base+((((int)((((i)>>((((0+7)+8)+1)))&((~((~ (Instruction)0)<<(8)))<<(0)))))))))->val);TValue*v2=(&((base+((((int) ((((i)>>(((((0+7)+8)+1)+8)))&((~((~(Instruction)0)<<(8))) <<(0)))))))))->val);{StkId ra=(base+(((int)((((i)>>((0+7)))&((~((~ (Instruction)0)<<(8)))<<(0)))))));if(((((v1))->tt_)==(((3)|((0)<<4))))&& ((((v2))->tt_)==(((3)|((0)<<4))))){lua_Integer i1=(((v1)->value_).i); lua_Integer i2=(((v2)->value_).i);pc++;{TValue*io=((&(ra)->val)); ((io)->value_).i=(((lua_Integer)(((lua_Unsigned)(i1))+((lua_Unsigned)(i2)))));
((io)->tt_=(((3)|((0)<<4))));};}else{lua_Number n1;lua_Number n2;if((((((v1))->tt_)==(((3)|((1)<<4))))?((n1)=(((v1)->value_).n),1): (((((v1))->tt_)==(((3)|((0)<<4))))?((n1)=((lua_Number) (((((v1)->value_).i)))),1):0))&&(((((v2))->tt_)==(((3)|((1)<<4))))? ((n2)=(((v2)->value_).n),1):(((((v2))->tt_)==(((3)|((0)<<4))))?((n2)= ((lua_Number)(((((v2)->value_).i)))),1):0))){pc++;{TValue*io= ((&(ra)->val));((io)->value_).n=(((n1)+(n2)));((io)->tt_=(((3)| ((1)<<4))));};}};};};
This is the definition of that macro:
#define op_arith(L,iop,fop) { \
TValue *v1 = vRB(i); \
TValue *v2 = vRC(i); \
op_arith_aux(L, v1, v2, iop, fop); }
where 'op_arith_aux' is another macro. Actually, probably everything is.
This looks fun to try to understand, debug, or port.
Here you might well ask why inlined functions weren't better suited.
On Mon, 1 Jan 2024 11:56:00 +0000, Bart wrote:
On 01/01/2024 02:00, Lawrence D'Oliveiro wrote:
Those are just standard file-manipulation tools that any decent OS
should provide.
What's that got to do with being able to build programs easily?
You have effectively answered that question yourself. Why were you trying
to build GNU GMP on Windows? Isn’t there something equally capable and yet
more, shall we say, “native” to Windows, that you can build and use in a
more “Windows-native” fashion?
On Tue, 2 Jan 2024 20:23:24 +0100, Janis Papanagnou wrote:
That said; during my early K&R C era we had 'register' declarations, but
I rarely saw them, they seem to have quickly vanished from usage. Now
I've heard that 'inline' optimizations have been introduced in C. Isn't
that considered a task for the compiler?
Kids’ stuff. Want to see how a _real_ compiler does it? <https://gcc.gnu.org/onlinedocs/gcc-13.2.0/gcc/Attribute-Syntax.html>
That said; during my early K&R C era we had 'register' declarations, but
I rarely saw them, they seem to have quickly vanished from usage. Now
I've heard that 'inline' optimizations have been introduced in C. Isn't
that considered a task for the compiler?
On 02/01/2024 06:47, Lawrence D'Oliveiro wrote:
On Tue, 2 Jan 2024 06:23:03 -0000 (UTC), Kaz Kylheku wrote:
On 2024-01-02, Lawrence D'Oliveiro <ldo@nz.invalid> wrote:
First of all, it includes code for building on a whole lot of
proprietary Unix systems with non-GCC compilers, some of which maybe
don’t quite conform to official C/C++ standards.
When people use Autoconf to configure their program for building, the
generated script includes cruft for platforms on which they never tested >>> the program, and never will, and on which it won't work for reasons not
related to those tests.
This is a GNU project, so you can be sure they have done pretty good
testing on those platforms and those options. If they were no longer
supported, they would have been dropped from the code. This isn’t like
proprietary software, which tends to accumulate cruft that nobody dares
touch because they no longer understand what it does.
Look at the ABI options: maybe you want a 32-bit build (where the CPU
supports it) rather than a 64-bit one; on some architectures, things
get a bit more complicated than just those two options.
These kinds of options can be passed in by CFLAGS, LDFLAGS and LDLIBS.
Support those three variables, and that's it.
If you have a look, that’s what the script does. Only you just have to
specify one option, and it generates the right value for *all* the
relevant variables, instead of you having to specify them individually.
You're going to defend this to the death aren't you? Be funny if at some
point some GMP II was produced whose main new benefit was a vastly
simplified build!
I tried to build gmp today using WSL. The process took just under five
minutes. However, where and what is the output? For all the verbosity,
that simple fact is omitted.
Bart <bc@freeuk.cm> writes:
You're going to defend this to the death aren't you? Be funny if at some
point some GMP II was produced whose main new benefit was a vastly
simplified build!
You still haven't provided a better system that accomplishes
everything the existing system does.
Quit bitching about it and _do_ something about it.
It's an open
source system, you are free to contribute a new build system for
it (which clearly needs to support all the capabilities of the
existing system otherwise you'll never have your suggestions
accepted).
Don't, however, expect anyone to cater to you.
I tried to build gmp today using WSL. The process took just under five
minutes. However, where and what is the output? For all the verbosity,
that simple fact is omitted.
You presumably specified that on the ./configure with the --prefix
option. Or did you forget to RTFM first?
On 02/01/2024 17:12, Bart wrote:
X-macros are ugly. They are unreadable.
Speaking as someone who has used them in real code, rather than someone
with a pathological hatred of macros and who prefers knee-jerk reaction
to their uses instead of applying a few minutes' objective thought,
"x-macros" can make code significantly simpler, clearer, and much easier
to maintain correctly.
At best, if someone has already done the hard work of setting up a
working set of macros, then you will be able to see how to modify, add
or delete entries.
They are not hard.
For the purposes of creating parallel sets of enums and associated
data, I use special syntax which makes for a clearer and simpler feature.
And that is, obviously, utterly useless - because it is not C or even a commonly supported extension. Oh, and even if it /were/ part of C, it
would not help because that's not what I was doing.
Have you ever considered that if C didn't have macros, or they weren't
powerful, then it could have evolved superior, built-in alternatives?
The C preprocessor is part of C - it is already built in.
It wasn't a compiler limit - it was mine! The resulting expression was
completely unreadable.
I wonder if the code authors also found this expanded expression
unreadable.
I guess not - because they used macros so that they could
write it in a clear and understandable way, and never have to look at
the expansion! You don't seem to comprehend the point of macros at all.
In the end I resorted to looking at the AST as that was clearer than C
source code.
Why not just look at the code as written, if you want to understand it?
Or would doing something that sensible put a damper on your rants?
Half my use-cases involve inline assembly.
Why can't that be in inlined functions?
Again, you are not showing that
these limited macros have any benefits compared to what is found in C,
Macros are a C language artefact, and what they expand to is arbitrary
C syntax that can be meaningless elsewhere.
Yes - C code is C code. It is not Pascal, or Fortran, or Ada. Should
this be a surprise?
You already have a habit of cherry-picking example code that is as "Bart-unfriendly" as you can find.
I don't have your experience, but I find that /very/ hard to believe. I should imagine it's primarily a matter of working through the
"translation phases" and "preprocessing directives" described in the C standards, step by step.
On Tue, 2 Jan 2024 18:16:36 -0000 (UTC), Kaz Kylheku wrote:
By the way, I'm not sure what system you are referring to by
"string-based macros", but in case you might be thinking of C
preprocessing, be advised that it's token-based.
Now go learn what “homoiconicity” means.
You missed my point. Take a tiny feature like being able to easily get
the size of a fixed-length array. You commonly see a macro like this:
#define LENGTH(a) (sizeof(a)/sizeof((a)[0]))
On 02/01/2024 20:11, Scott Lurndal wrote:
Bart <bc@freeuk.cm> writes:
You're going to defend this to the death aren't you? Be funny if at some >>> point some GMP II was produced whose main new benefit was a vastly
simplified build!
You still haven't provided a better system that accomplishes
everything the existing system does.
I need some basic information about what's what in the 26MB of
content in 4000 files spread over 190 directories.
But it's clear that it's not interested in making things simple. (The
nearest they've come is with the 'mini-gmp' version. But since the
performance of that is on a par with my own library, I'll stick with
mine, which also has a higher spec.)
I have done this exercise with projects like LIBJPEG, LUA5.4 and
TCC0.9.27. Long ago I did it with SEED7.
Once I can extract the necessary info - IT ALWAYS COMES DOWN TO A LIST
OF FILES TO SUBMIT TO A COMPILER FGS - then I can whizz through it easily.
Quit bitching about it and _do_ something about it.
Funnily enough, I did. I wrote my own library, which I know is always
available, and has some unique features, eg. it uses decimal, and has a
single integer/float type.
I once linked to a C port of it.
It's not fast, but it's mine and it's in my language. (It accounts for
12KB of my interpreter.)
It's an open
source system, you are free to contribute a new build system for
it (which clearly needs to support all the capabilities of the
existing system otherwise you'll never have your suggestions
accepted).
Don't, however, expect anyone to cater to you.
There must be one person on the planet who understands how to produce a
Windows DLL of this thing. And funnily enough, a single 64-bit DLL is
all any Windows user on the planet needs (that and either gmp.h or
language-neutral docs).
I tried to build gmp today using WSL. The process took just under five
minutes. However, where and what is the output? For all the verbosity,
that simple fact is omitted.
You presumably specified that on the ./configure with the --prefix
option. Or did you forget to RTFM first?
I followed these instructions:
--------------
Here are some brief instructions on how to install GMP. First you need to
compile. Since you're impatient, try this
./configure
make
make check <= VERY IMPORTANT!!
I did the first two lines before looking for the binaries.
Bart <bc@freeuk.cm> writes:
On 02/01/2024 20:11, Scott Lurndal wrote:
Bart <bc@freeuk.cm> writes:
You're going to defend this to the death aren't you? Be funny if at some
point some GMP II was produced whose main new benefit was a vastly
simplified build!
You still haven't provided a better system that accomplishes
everthing the existing system does.
I need some basic information about what's what in the 26MB of
content in 4000 files spread over 190 directories.
But it's clear that it's not interested in making things simple. (The
nearest they've come is with the 'mini-gmp' version. But since the
performance of that is on a par with my own library, I'll stick with
mine, which also has a higher spec.)
I have done this exercise with projects like LIBJPEG, LUA5.4 and
TCC0.9.27. Long ago I did it with SEED7.
Once I can extract the necessary info - IT ALWAYS COMES DOWN TO A LIST
OF FILES TO SUBMIT TO A COMPILER FGS - then I can whizz through it easily.
Quit bitching about it and _do_ something about it.
Funnily enough, I did. I wrote my own library, which I know is always
available, and has some unique features, eg. it uses decimal, and has a
single integer/float type.
I once linked to a C port of it.
It's not fast, but it's mine and it's in my language. (It accounts for
12KB of my interpreter.)
It's an open
source system, you are free to contribute a new build system for
it (which clearly needs to support all the capabilities of the
existing system otherwise you'll never have your suggestions
accepted).
Don't, however, expect anyone to cater to you.
There must be one person on the planet who understands how to produce a
Windows DLL of this thing. And funnily enough, a single 64-bit DLL is
all any Windows user on the planet needs (that and either gmp.h or
language-neutral docs).
I tried to build gmp today using WSL. The process took just under five
minutes. However, where and what is the output? For all the verbosity,
that simple fact is omitted.
You presumably specified that on the ./configure with the --prefix
option. Or did you forget to RTFM first?
I followed these instructions:
--------------
Here are some brief instructions on how to install GMP. First you need to
compile. Since you're impatient, try this
./configure
make
make check <= VERY IMPORTANT!!
I did the first two lines before looking for the binaries.
Try RTFM:
./configure --help
always useful.
On 1/2/24 21:24, Bart wrote:
#define LENGTH(a) (sizeof(a)/sizeof((a)[0]))
And why is that bad? That's a real question.
You guys don't seem to get it.
On 1/2/24 21:24, Bart wrote:
You missed my point. Take a tiny feature like being able to easily get
the size of a fixed-length array. You commonly see macro like this:
#define LENGTH(a) (sizeof(a)/sizeof((a)[0]))
And why is that bad? That's a real question.
On 1/2/24 21:24, Bart wrote:
You missed my point. Take a tiny feature like being able to easily get
the size of a fixed-length array. You commonly see macro like this:
#define LENGTH(a) (sizeof(a)/sizeof((a)[0]))
And why is that bad? That's a real question.
On Wed, 3 Jan 2024 02:08:40 +0000, Bart wrote:
You guys don't seem to get it.
Another thing we do is, not to bother repeating work that others have
already done.
sudo apt-get install libgmp-dev
Job done.
On 03/01/2024 02:40, Lawrence D'Oliveiro wrote:
On Wed, 3 Jan 2024 02:08:40 +0000, Bart wrote:
You guys don't seem to get it.
Another thing we do is, not to bother repeating work that others have
already done.
sudo apt-get install libgmp-dev
Job done.
OK, I've done that. But it didn't take 5 minutes as it did to compile it.
So I guess it is installing ready-built binaries?
In that case, why are people having a go at me for suggesting they
should be available to Windows users? Why are Linux users so blessed?
On 02/01/2024 23:24, tTh wrote:
On 1/2/24 21:24, Bart wrote:
You missed my point. Take a tiny feature like being able to easily get
the size of a fixed-length array. You commonly see macro like this:
#define LENGTH(a) (sizeof(a)/sizeof(a[0]))
And why is that bad? That's a real question.
Where do I start?
* Why is it necessary for a million programmers to each come up with
their own solutions for something so basic? Eg. they will all use
different names.
Bart <bc@freeuk.cm> writes:
On 02/01/2024 23:24, tTh wrote:
On 1/2/24 21:24, Bart wrote:
You missed my point. Take a tiny feature like being able to easily get
the size of a fixed-length array. You commonly see macro like this:
#define LENGTH(a) (sizeof(a)/sizeof(a[0]))
And why is that bad? That's a real question.
Where do I start?
* Why is it necessary for a million programmers to each come up with
their own solutions for something so basic? Eg. they will all use
different names.
Why is it necessary to exaggerate?
static const size_t PENTIUM_MSR_PT_CNT =
sizeof(pentium_msr_passthrough) / sizeof(pentium_msr_passthrough[0]);
On 02/01/2024 17:34, David Brown wrote:
On 02/01/2024 17:12, Bart wrote:
X-macro are ugly. They are unreadable.
Speaking as someone who has used them in real code, rather than
someone with a pathological hatred of macros and who prefers knee-jerk
reaction to their uses instead of applying a few minutes objective
thought, "x-macros" can make code significantly simpler, clearer, and
much easier to maintain correctly.
I'm sorry, but I *DO* find them utterly impossible. This is a simple
example from Stackoverflow of someone wanting to define a set of enums
with an accompanying table, whose problem was in ensuring the two were
kept in sync.
The X-macro solution was this, adapted from the first answer here (https://stackoverflow.com/questions/6635851/real-world-use-of-x-macros); assume those functions are in scope:
-------
#define STATE_TABLE \
ENTRY(STATE0, func0) \
ENTRY(STATE1, func1) \
ENTRY(STATE2, func2) \
ENTRY(STATE3, func3)
enum
{
#define ENTRY(a,b) a,
STATE_TABLE
#undef ENTRY
NUM_STATES
};
void* jumptable[NUM_STATES] =
{
#define ENTRY(a,b) b,
STATE_TABLE
#undef ENTRY
};
-------
With my feature, it is merely this:
global enumdata []ref void jumptable =
(state1, func1), # comment 1
(state2, func2),
(state3, func3), # comment 3
(state4, func4),
end
Notice:
(1) I don't need any of those weird macros.
(2) I don't need those backslashes
(3) I can add comments to any entry in the table.
(4) The 'global' attribute means both enumeration names and the array
are exported to other modules. Doing the same in C is tricky.
But I'm wasting my time because you are never going to admit my feature
is superior to X-macros for this purpose.
(The author of this example goes on to say how it can also be used to
define a set of function prototypes. I can't do that with my feature.
But then I don't need function prototypes!)
At best, if someone has already done the hard work of setting up a
working set of macros, then you will be able to see how to modify,
add or delete entries.
They are not hard.
Maybe not for you, but it would take me ages to come up with the right incantations to make it work.
For the purposes of creating parallel sets of enums and associated
data, I use special syntax which makes for a clearer and simpler
feature.
And that is, obviously, utterly useless - because it is not C or even
a commonly supported extension. Oh, and even if it /were/ part of C,
it would not help because that's not what I was doing.
I've given an example of a use-case for C macros which have a special
purpose feature in other languages. So of course it's not in C.
Have you ever considered that if C didn't have macros, or they
weren't powerful, then it could have evolved superior, built-in
alternatives?
The C preprocessor is part of C - it is already built in.
You missed my point. Take a tiny feature like being able to easily get
the size of a fixed-length array. You commonly see macro like this:
#define LENGTH(a) (sizeof(a)/sizeof(a[0]))
What incentive is there to properly add a built-in way to do that, when
it can be done, badly, and in 1000 different ways by each user, in a
one-line macro?
Another example is GETBIT(a, n).
It wasn't a compiler limit - it was mine! The resulting expression
was completely unreadable.
I wonder if the code authors also found this expanded expression
unreadable.
Look at the example I posted in reply to Blue-Maned_Hawk.
In particular consider my last comment.
I guess not - because they used macros so that they could write it in
a clear and understandable way, and never have to look at the
expansion! You don't seem to comprehend the point of macros at all.
In the end I resorted to looking at the AST as that was clearer than
C source code.
Why not just look at the code as written, if you want to understand it?
In my case I was debugging my C compiler. Then you need to examine the
actual expanded code in detail. Excessive use of macros makes that much harder.
That example came from Lua 5.4, an interpreter whose performance is on a par
with my own product running from HLL-only code. That product doesn't use
deeply-nested macros like this, and it doesn't appear to suffer.
Or would doing something that sensible put a damper on your rants?
Look, I've written a C preprocessor (I suspect you haven't). If I wanted,
I could adapt it to work on my own languages.
But I /don't/ want to. Obviously I see something undesirable about them
(I don't like metaprogramming in general; people get too carried away
with it and want to show off their skills).
Why can't you accept that?
Half my use-cases involve inline assembly.
Why can't that be in inlined functions?
That doesn't work in inline assembly.
Again, you are not showing that these limited macros have any benefits
compared to what is found in C,
That's true. I can't write something like:
#define M a+b)*c
(M;
Macros are a C language artefact, and what they expand to is
arbitrary C syntax that can be meaningless elsewhere.
Yes - C code is C code. It is not Pascal, or Fortran, or Ada. Should
this be a surprise?
It's inconvenient. This is a macro from SDL2:
#define SDL_CompilerBarrier() \
{ SDL_SpinLock _tmp = 0; SDL_AtomicLock(&_tmp); SDL_AtomicUnlock(&_tmp); }
Rather than declaring a function which resides in a language-neutral
DLL, it declares it here, in the form of actual C code.
Even given a language with an FFI that is capable of calling external functions compiled as C, what are they supposed to do with this?
That example was in a conditional block; I don't know the context. But
here's a juicy one that is the result of an attempt to automatically translate the C SDL2 API to my syntax:
global macro SDL_FOURCC(A,B,C,D) = ((SDL_static_cast(Uint32,SDL_static_cast(Uint8,(A)))<<0)|(SDL_static_cast(Uint32,SDL_static_cast(Uint8,(B)))<<8)|(SDL_static_cast(Uint32,SDL_static_cast(Uint8,(C)))<<16)|(SDL_static_cast(Uint32,SDL_static_cast(Uint8,(D)))<<24))
This also contains arbitrary C code; it needs a much more sophisticated
tool than one that just does declarations. Basically a transpiler from
C to the language one is writing bindings for.
You already have a habit of cherry-picking example code that is as
"Bart-unfriendly" as you can find.
Try testing your own C compiler on just about any codebase. You don't
have to look far for extreme examples.
I don't have your experience, but I find that /very/ hard to believe.
I should imagine it's primarily a matter of working through the
"translation phases" and "preprocessing directives" described in the C
standards, step by step.
The C standard has very little to say about it. I once found a very, very
long article which went into considerably more detail, now lost.
But even a third of the way through, I decided that it was impossible.
In 2017 when I created my CPP, many compilers would give different
results with various edge cases of macros. Which were right and which
were wrong? You'd think there'd be a definitive reference for it.
Here's an example I've just found:
#include <stdio.h>
int main(void) {
#define FOO
#define BAR defined(FOO)
#if BAR
puts("BAR");
#else
puts("FOO");
#endif
}
Most compilers will display BAR; MSVC I think is the only one showing
FOO. (Some smaller compilers I tried on godbolt.org failed to compile it.)
Mine showed BAR (obviously!)
It's almost like they did it on purpose.
It does sound as though the GMP people crammed in as
many Linux dependencies as possible just to spite MS.
On 03/01/2024 18:14, Bart wrote:
On 03/01/2024 15:32, Scott Lurndal wrote:
Bart <bc@freeuk.cm> writes:
On 02/01/2024 23:24, tTh wrote:
On 1/2/24 21:24, Bart wrote:
You missed my point. Take a tiny feature like being able to easily get
the size of a fixed-length array. You commonly see macro like this:
#define LENGTH(a) (sizeof(a)/sizeof(a[0]))
And why is that bad? That's a real question.
Where do I start?
* Why is it necessary for a million programmers to each come up with
their own solutions for something so basic? Eg. they will all use
different names.
Why is it necessary to exaggerate?
How do you know whether I'm exaggerating or not? There have surely been
many millions of people who've programmed in C over the years.
And how many of them define or use such a macro? Some, certainly, but
not all.
On 02.01.2024 20:35, Lawrence D'Oliveiro wrote:
On Tue, 2 Jan 2024 20:23:24 +0100, Janis Papanagnou wrote:
That said; during my early K&R C era we had 'register' declarations, but
I rarely saw them, they seem to have quickly vanished from usage. Now
I've heard that 'inline' optimizations have been introduced in C. Isn't
that considered a task for the compiler?
Kids’ stuff. Want to see how a _real_ compiler does it?
<https://gcc.gnu.org/onlinedocs/gcc-13.2.0/gcc/Attribute-Syntax.html>
What shall I infer from that statement and link? - Mind to elaborate?
My original question was related to compiler (as opposed to programmer)
doing optimizations. I recall from decades ago that compilers will do optimizations e.g. on the attributed syntax tree level, while 'register'
or 'inline' seem very primitive constructs (on a comparable low level).
So I expressed my astonishment that 'inline' had been later introduced
in C, and I wonder why. (Note that the other poster also mentioned it
as a preferred way to replace [parameterized] macros, if I interpreted
him correctly.)
Janis
On 03/01/2024 15:32, Scott Lurndal wrote:
Bart <bc@freeuk.cm> writes:
On 02/01/2024 23:24, tTh wrote:
On 1/2/24 21:24, Bart wrote:
You missed my point. Take a tiny feature like being able to easily get
the size of a fixed-length array. You commonly see macro like this:
#define LENGTH(a) (sizeof(a)/sizeof(a[0]))
And why is that bad? That's a real question.
Where do I start?
* Why is it necessary for a million programmers to each come up with
their own solutions for something so basic? Eg. they will all use
diferent names.
Why is it necessary to exaggerate?
How do you know whether I'm exaggerating or not? There have surely been
many millions of people who've programmed in C over the years.
And a large proportion may have needed the length of an array whose dimensions are not set by a constant, but by the number of data elements provided.
(Even in the former case, if you have int A[N] and B[N], I think it is
better to have 'LENGTH(A)' within the subsequent code, rather than a
bare 'N' which could mean anything: is it the length or A, B, or does it
mean N by itself? What happens if you change it to A[M]?)
On 02/01/2024 21:24, Bart wrote:
I personally like to use a slightly different syntax, but that's just
detail and style choice.
#define DoStateList(DO) \
DO(state0, func0) /* Comment */ \
DO(state1, func1) \
DO(state2, func2) \
DO(state3, func3)
#define GetState(state, func) state,
#define GetFunc(state, func) func,
#define Counter(state, func) +1
enum { num_of_states = DoStateList(Counter) };
enum States {
DoStateList(GetState)
};
p_func_t jump_table[] = {
DoStateList(GetFunc)
};
Note that the states can have comments. You do need a backslash at the
end of each line, and that means comments must be /* */ style, not //
style.
And since we have the x-macro in place, we can use it for more things.
Why manually declare "extern void func0(void);", etc., when you can do
it in two lines for all the functions?
#define DeclareFunc(state, func) extern void func(void);
DoStateList(DeclareFunc)
Maybe you want a switch rather than a jump table - that could give more inlining opportunities, and is a common choice for the structure of
things like simple VM's and bytecode interpreters:
#define SwitchEntry(state, func) case state : func(); break;
void jump(enum States s) {
switch (s) {
DoStateList(SwitchEntry);
}
}
With my feature, it is merely this:
global enumdata []ref void jumptable =
(state1, func1), # comment 1
(state2, func2),
(state3, func3), # comment 3
(state4, func4),
end
That is a very niche and very limited use-case. Did you seriously add
a special feature to your language based on one example in one stack
overflow question?
X macros are just an imaginative way to use a
standard feature of C (and C++), giving vastly more flexibility than
your single-use language feature. I have difficulty seeing how it was
worth the effort putting that feature in your tools.
Let's try a more advanced example.
#define REMOVE_COMMAS_(_1, _2, _3, _4, _5, _6, _7, _8, ...) \
_1 _2 _3 _4 _5 _6 _7 _8
#define REMOVE_COMMAS(...) REMOVE_COMMAS_(__VA_ARGS__, ,,,,,,,,)
Now, some of that is messy - no doubts there. Some things could have
been a lot easier if C macros were more powerful,
with features such as
recursion or neater handling of variadic packs. Macro names scoped
within functions would also make it better. So there's plenty of room
for a new language to make things better than C.
But what do we get out of this? We have all our commands defined in one list, with the textual name of the command, the parameters, and a help text. You can't get things out of sync - if you add a command, or
change parameters, the help function, the declarations, and the
dispatcher all adapt automatically.
You might not think this is a good way to structure your source code.
There are many possibilities, including more manual work, or more
run-time work. You could use an external code generator. You could use
a language with much better reflection capabilities (like Python). But
this is something you can do, today, in plain C, and it shows that
x-macros can do vastly more than declare an enumeration and a table of pointers.
Now, if your language had powerful reflection and metaprogramming
features, along with compile-time execution, so that this kind of thing
could be done within the language without text-based macros, then I
would happily agree that it has better features. Perhaps the way to do
that would be to integrate your interpreted language in your compiler.
There is a marvellous tool that might help you here - it's called "the internet".
So the problem is your compiler, not other peoples' code? You think
That doesn't work in inline assembly.
So that's a limitation of your tools.
It has plenty to say about the C preprocessor - it fully defines it.
So MSVC has a bug. Report it if you like.
On 03/01/2024 18:14, Bart wrote:
How do you know whether I'm exaggerating or not? There have surely
been many millions of people who've programmed in C over the years.
And how many of them define or use such a macro? Some, certainly, but
not all. I can't say I have ever had one in my code as far as I
remember. Occasionally I find it convenient to calculate the size of an existing array, but I'll just write it manually as an expression (like
Scott seems to do).
Generally if I need the size of an array, I already
know it - I can use the same "no_of_samples" (or whatever) constant I
used when defining the array in the first place.
And a large proportion may have needed the length of an array whose
dimensions are not set by a constant, but by the number of data
elements provided.
I have no basis to guess whether this proportion is large or small. I
don't imagine you do either.
(Even in the former case, if you have int A[N] and B[N], I think it is
better to have 'LENGTH(A)' within the subsequent code, rather than a
bare 'N' which could mean anything: is it the length or A, B, or does
it mean N by itself? What happens if you change it to A[M]?)
The trick is not to use single letter identifiers when you want their
meaning to be clear.
I would not object to there being a standard C macro for finding the
size of an array. But I think it would be out of character for the
standard library.
It would make more sense if the language had more
support for arrays and allowing them as values, parameters, and in expressions - then a standard "size" feature would be expected.
On 03/01/2024 19:16, David Brown wrote:
On 03/01/2024 18:14, Bart wrote:
How do you know whether I'm exaggerating or not? There have surely
been many millions of people who've programmed in C over the years.
And how many of them define or use such a macro? Some, certainly, but
not all. I can't say I have ever had one in my code as far as I
remember. Occasionally I find it convenient to calculate the size of an
existing array, but I'll just write it manually as an expression (like
Scott seems to do).
So writing the same long identifier twice?
And hoping there's no typo in one?
Because sizeof(A)/sizeof(B[0]) would be legal code when both A and
B are arrays.
Bart <bc@freeuk.cm> writes:
On 03/01/2024 19:16, David Brown wrote:
On 03/01/2024 18:14, Bart wrote:
How do you know whether I'm exaggerating or not? There have surely
been many millions of people who've programmed in C over the years.
And how many of them define or use such a macro? Some, certainly, but
not all. I can't say I have ever had one in my code as far as I
remember. Occasionally I find it convenient to calculate the size of an
existing array, but I'll just write it manually as an expression (like
Scott seems to do).
So writing the same long identifier twice?
Yes. Good editors mean you don't need to type it
twice (ywllp).
And hoping there's no typo in one?
If there's a typo, the compiler will note it and I'll
fix it. But, see above.
Because sizeof(A)/sizeof(B[0]) would be legal code when both A and
B are arrays.
Good thing I don't use single letter identifiers.
On 2024-01-03, David Brown <david.brown@hesbynett.no> wrote:
On 03/01/2024 18:14, Bart wrote:
On 03/01/2024 15:32, Scott Lurndal wrote:
Bart <bc@freeuk.cm> writes:
On 02/01/2024 23:24, tTh wrote:
On 1/2/24 21:24, Bart wrote:
You missed my point. Take a tiny feature like being able to easily get
the size of a fixed-length array. You commonly see macro like this:
#define LENGTH(a) (sizeof(a)/sizeof(a[0]))
And why is that bad? That's a real question.
Where do I start?
* Why is it necessary for a million programmers to each come up with
their own solutions for something so basic? Eg. they will all use
different names.
Why is it necessary to exaggerate?
How do you know whether I'm exaggerating or not? There have surely been
many millions of people who've programmed in C over the years.
And how many of them define or use such a macro? Some, certainly, but
not all.
I've seen it a lot. If it didn't have issues, it would be an excellent inclusion in <stddef.h>, along with offsetof(type, member) and such.
Macros with issues should not be standardized though. For instance min
and max macros appear regularly in C programs, but feature multiple
argument evaluation.
For these kinds of things, it's better to wait until the language
develops a good solution. min and max want to be type-generic inline functions. I think that this is doable in C with _Generic. In the
April 2023 draft, I don't see any min functions other than fminf,
fmin and fminl, which are float, double and long double. No generic
min and max are mentioned for <tgmath.h>.
There are probably too many combinations to handle; you need
two levels of _Generic selection, each switching on a number of integer
and floating-point types. (There being reams and reams of templates
doesn't stop C++, though.)
For counting the elements in an array, we really want a sizeof-like
keyword, which takes a parenthesized type or expression. That expression
is constrained to be of array type.
(Might it be possible with _Generic to detect that we have an operand
which is an "array of anything"? I'm guessing not.)
On 03/01/2024 19:16, David Brown wrote:
On 03/01/2024 18:14, Bart wrote:
How do you know whether I'm exaggerating or not? There have surely
been many millions of people who've programmed in C over the years.
And how many of them define or use such a macro? Some, certainly, but
not all. I can't say I have ever had one in my code as far as I
remember. Occasionally I find it convenient to calculate the size of
an existing array, but I'll just write it manually as an expression
(like Scott seems to do).
So writing the same long identifier twice? And hoping there's no typo in
one? Because sizeof(A)/sizeof(B[0]) would be legal code when both A and
B are arrays.
(I already know your counter-argument: what stops someone writing
LENGTH(B) instead of LENGTH(A) anyway. Well, writing it twice gives two opportunities to get it wrong!)
Generally if I need the size of an array, I already know it - I can
use the same "no_of_samples" (or whatever) constant I used when
defining the array in the first place.
This is my A/B/N argument below. Maybe 'no_of_samples' is used as the dimension for more than one array.
And a large proportion may have needed the length of an array whose
dimensions are not set by a constant, but by the number of data
elements provided.
I have no basis to guess whether this proportion is large or small. I
don't imagine you do either.
Well, here is an extract from sqlite3.c:
On 04/01/2024 00:48, Bart wrote:
On 03/01/2024 19:16, David Brown wrote:
On 03/01/2024 18:14, Bart wrote:
How do you know whether I'm exaggerating or not? There have surely
been many millions of people who've programmed in C over the years.
And how many of them define or use such a macro? Some, certainly,
but not all. I can't say I have ever had one in my code as far as I
remember. Occasionally I find it convenient to calculate the size of
an existing array, but I'll just write it manually as an expression
(like Scott seems to do).
So writing the same long identifier twice? And hoping there's no typo
in one? Because sizeof(A)/sizeof(B[0]) would be legal code when both A
and B are arrays.
If you have difficulty typing out the same identifier twice, then you
have picked a poor name for the identifier! The whole point of
identifiers is to identify things - you give the thing a name, so that
you can refer to it later by that same name and know you are referring
to the same thing. If you find this difficult when using the same array identifier twice in one expression, then you have vastly bigger problems
with your coding than getting the size of an array.
Yes, maybe it is. But I know the size of the arrays I use. If I don't know, I look at the definition to see. I don't code blindly.
Well, here is an extract from sqlite3.c:
I didn't write sqlite3.c. I am not responsible for the choices of identifiers, or coding style, or macro usage, or anything else in it. I
am telling you why /I/ don't need or use any kind of "array_size" macro.
Other people may write their code differently.
David Brown <david.brown@hesbynett.no> writes:
On 02/01/2024 21:24, Bart wrote: [...]
The X-macro solution was this, adapted from the first answer here
(https://stackoverflow.com/questions/6635851/real-world-use-of-x-macros); assume those functions are in scope:
-------
#define STATE_TABLE \
ENTRY(STATE0, func0) \
ENTRY(STATE1, func1) \
ENTRY(STATE2, func2) \
ENTRY(STATE3, func3)
enum
{
#define ENTRY(a,b) a,
STATE_TABLE
#undef ENTRY
NUM_STATES
};
void* jumptable[NUM_STATES] =
{
#define ENTRY(a,b) b,
STATE_TABLE
#undef ENTRY
};
-------
(Why did you change the type from "p_func_t" to "void*" ? Was it just
to annoy myself and other C programmers with a pointless and
constraint-violating cast of a function pointer to "void*" ? Just add
a suitable typedef - "typedef void(*p_func_t)(void);" )
What constraint does it violate? And what cast are you referring to?
On 03/01/2024 22:42, Keith Thompson wrote:
David Brown <david.brown@hesbynett.no> writes:
On 02/01/2024 21:24, Bart wrote: [...]
The X-macro solution was this, adapted from the first answer here
(https://stackoverflow.com/questions/6635851/real-world-use-of-x-macros); assume those functions are in scope:
-------
#define STATE_TABLE \
ENTRY(STATE0, func0) \
ENTRY(STATE1, func1) \
ENTRY(STATE2, func2) \
ENTRY(STATE3, func3)
enum
{
#define ENTRY(a,b) a,
STATE_TABLE
#undef ENTRY
NUM_STATES
};
void* jumptable[NUM_STATES] =
{
#define ENTRY(a,b) b,
STATE_TABLE
#undef ENTRY
};
-------
(Why did you change the type from "p_func_t" to "void*" ? Was it just
to annoy myself and other C programmers with a pointless and
constraint-violating cast of a function pointer to "void*" ? Just add
a suitable typedef - "typedef void(*p_func_t)(void);" )
What constraint does it violate? And what cast are you referring to?
I believe the initialisation follows the requirements for simple
assignment, and function pointers are not compatible with void*. Bart
(for reasons understood only by him) uses void* pointers when he wants generic function pointers. The stack overflow link uses "p_func_t",
which is a function pointer typedef.
On 03/01/2024 16:41, David Brown wrote:
On 02/01/2024 21:24, Bart wrote:
I personally like to use a slightly different syntax, but that's just
detail and style choice.
#define DoStateList(DO) \
DO(state0, func0) /* Comment */ \
DO(state1, func1) \
DO(state2, func2) \
DO(state3, func3)
#define GetState(state, func) state,
#define GetFunc(state, func) func,
#define Counter(state, func) +1
enum { num_of_states = DoStateList(Counter) };
enum States {
DoStateList(GetState)
};
p_func_t jump_table[] = {
DoStateList(GetFunc)
};
I admit that your version is cleaner than other versions of X-macros
I've come across. I can almost even understand it.
But nearly every version I've seen /has/ been ugly, including one which
may have been in a version of Lua. (One problem still is recognising
uses of X-macros, as there is no prefix like 'xmacro' to look for. So
the current version may well have them; I can't tell.)
Note that the states can have comments. You do need a backslash at
the end of each line, and that means comments must be /* */ style, not
// style.
I tried various combinations but not /*...*/ before the \.
And since we have the x-macro in place, we can use it for more things.
Why manually declare "extern void func0(void);", etc., when you can do
it in two lines for all the functions?
#define DeclareFunc(state, func) extern void func(void);
DoStateList(DeclareFunc)
You can also ask why need the prototype.
(My first big C app, I used a script to process my code, which also
generated two sets of function prototypes: one for locals, one for
exported. This applied to all defined functions, not just the ones
relating to enums.)
Maybe you want a switch rather than a jump table - that could give
more inlining opportunities, and is a common choice for the structure
of things like simple VM's and bytecode interpreters:
#define SwitchEntry(state, func) case state : func(); break;
void jump(enum States s) {
switch (s) {
DoStateList(SwitchEntry);
}
}
That's an interesting use. But rather limited as it is, if it only
contains a function call; a table of functions is better. Here you'd
want to capture the generated output, and use that as a framework to
populate with manual code later on.
With my feature, it is merely this:
global enumdata []ref void jumptable =
(state1, func1), # comment 1
(state2, func2),
(state3, func3), # comment 3
(state4, func4),
end
That is an very niche and very limited use-case. Did you seriously
add a special feature to your language based on one example in one
stack overflow question?
It's not limited at all. I use it very, very extensively. Virtually all
of my enums are written in this form, as most will at least have
associated names, if not other related data.
I rarely use bare enums. By contrast, most C source code uses bare enum lists; there is very little use of X-macros.
I wonder if that would be different if they were a built-in,
easier-to-use feature?
X macros are just an imaginative way to use a standard feature of C
(and C++), giving vastly more flexibility than your single-use
language feature. I have difficulty seeing how it was worth the
effort putting that feature in your tools.
That 'feature' used to be done with an external script, with input coming
from text files. Putting it into the language added only one 130-line function, and was superior and much tidier.
Let's try a more advanced example.
#define REMOVE_COMMAS_(_1, _2, _3, _4, _5, _6, _7, _8, ...) \
_1 _2 _3 _4 _5 _6 _7 _8
#define REMOVE_COMMAS(...) REMOVE_COMMAS_(__VA_ARGS__, ,,,,,,,,)
<snip>
I'm sorry, but this is where people get crazy with macros.
It's not just x-macros anymore, but just macros.
(Here, you will appreciate having one simple dedicated feature that you
KNOW does one thing: declare parallel enums/arrays, or arrays/arrays, in table form.)
If I have time, I will figure out what your code is meant to do later
(it has some functions that need to be added before I can run it to see
what it does).
Then I will post a more readable version.
... OK, I had a look. I think it is a good example of using macros
ingeniously, but a poor example of code as it looks terrible. I think
far from drawing things together, it looks all over the place.
I tweaked your version into a runnable program, which took 66 lines of
code. Then I took the expanded, non-macro version and tidied it up; it
was 45 lines and far more readable.
I doubt there would be that much savings in vertical space if scaled up
to a lot more commands.
It's hard to offer alternatives, since the task is unclear (using two
levels of handler code).
But, in my stuff this sort of thing typically makes use of function reflection. So given a command "show", I can scan function names for one called "cmd_show", but this tends to be done in a setup step that
populates a table.
The associated help text is harder; two features that might have helped (function metadata strings and docstrings) I used to have, but have
since dropped.
However there are lots of alternatives that would still be clearer than your example (one more is given below).
Now, some of that is messy - no doubts there. Some things could have
been a lot easier if C macros were more powerful,
This is the danger - piling on even more features to a language already
ten times harder to code in than C.
If you're going to be adding features, how about fixing the main language?
with features such as recursion or neater handling of variadic packs.
Macro names scoped within functions would also make it better. So
there's plenty of room for a new language to make things better than C.
But what do we get out of this? We have all our commands defined in
one list, with the textual name of the command, the parameters, and a
help text. You can't get things out of sync - if you add a command,
or change parameters, the help function, the declarations, and the
dispatcher all adapt automatically.
If you adopted enums to represent commands, things can stay in sync too (using my dynamic language so no types):
enumdata cmdnames, cmdhelp, cmdhandlers =
(showcmd, "show", "show help", cmd_show),
...
Look up a command in 'cmdnames[]'; if found this can be used to index 'cmdhelp[]' and 'cmdhandlers[]'. Each handler function is conventional.
You might not think this is a good way to structure your source code.
There are many possibilities, including more manual work, or more
run-time work. You could use an external code generator. You could
use a language with much better reflection capabilities (like
Python). But this is something you can do, today, in plain C, and it
shows that x-macros can do vastly more than declare an enumeration and
a table of pointers.
It simply highlights my point that /because/ C can offer such untidy, half-way solutions, it is less likely that anyone is going to touch the
main language.
Now, if your language had powerful reflection and metaprogramming
features, along with compile-time execution, so that this kind of
thing could be done within the language without text-based macros,
then I would happily agree that it has better features. Perhaps the
way to do that would be to integrate your interpreted language in your
compiler.
I can tell you that my 'mcc' compiler had some trouble with those macros
(but it seemed OK if I preprocessed it first then compiled that). I
don't relish going back to my CPP 7 years on. So any solution that
doesn't stress it is preferable!
There is a marvellous tool that might help you here - it's called "the
internet".
That's great; I can finally learn Chinese too!
Meanwhile in these 49 numbers you will find next week's winning lottery numbers: 1 2 3 ... 49.
So the problem is your compiler, not other people's code? You think
With such a project you are forced to delve more deeply into other
people's source codes and headers than most. It can be revealing.
That doesn't work in inline assembly.
So that's a limitation of your tools.
Invoking a function from assembly involves a sequence of instructions.
Inlining the call means knowing how many of those
instructions are involved so that they can be replaced. Maybe the ones directly to do with the call are intermingled with others.
It is just not practical.
It has plenty to say about the C preprocessor - it fully defines it.
Many would disagree. Like those who have tried to implement them:
'The C99 standard is about 500 pages, but only 19 of them are dedicated
to describing how the C preprocessor should work. Most of the specs are
high level qualitative descriptions that are intended to leave lots of freedom to the compiler implementor. It is likely that this vagueness in
this specification was intentional so that it would not cause existing mainstream compilers (and code) to become non-conforming. Design freedom
is good for allowing people to create novel optimizations too, but too
much freedom can lead to competing interpretations.
Bjarne Stroustrup even points out that the standard isn't clear about
what should happen in function macro recursion[1]. With reference to a specific example he says "The question is whether the use of NIL in the
last line of this sequence qualifies for non-replacement under the cited text. If it does, the result will be NIL(42). If it does not, the
result will be simply 42.". In 2004, a decision was made to leave the standard in its ambiguous state: "The committee's decision was that no realistic programs "in the wild" would venture into this area, and
trying to reduce the uncertainties is not worth the risk of changing conformance status of implementations or programs."'
https://blog.robertelder.org/7-weird-old-things-about-the-c-preprocessor/
So MSVC has a bug. Report it if you like.
Or maybe all the others do!
On 04/01/2024 08:55, David Brown wrote:
On 04/01/2024 00:48, Bart wrote:
On 03/01/2024 19:16, David Brown wrote:
On 03/01/2024 18:14, Bart wrote:
How do you know whether I'm exaggerating or not? There have surely
been many millions of people who've programmed in C over the years.
And how many of them define or use such a macro? Some, certainly,
but not all. I can't say I have ever had one in my code as far as I
remember. Occasionally I find it convenient to calculate the size
of an existing array, but I'll just write it manually as an
expression (like Scott seems to do).
So writing the same long identifier twice? And hoping there's no typo
in one? Because sizeof(A)/sizeof(B[0]) would be legal code when both
A and B are arrays.
If you have difficulty typing out the same identifier twice, then you
have picked a poor name for the identifier! The whole point of
identifiers is to identify things - you give the thing a name, so that
you can refer to it later by that same name and know you are referring
to the same thing. If you find this difficult when using the same array
identifier twice in one expression, then you have vastly bigger
problems with your coding than getting the size of an array.
You can make typos. You can accidentally use the name of some other
similarly named but unrelated variable.
You really can't understand that the opportunities for such errors
increases with the amount of code you have to write? As well as the
amount of obscuring clutter?
It's not just writing two long identifiers instead of one; here those additionally have to match.
Yes, maybe it is. But I know the size of the arrays I use. If I
don't know, I look at the definition to see. I don't code blindly.
You have this (and please don't complain about these terse names; a real program will be much larger, much busier, and have much longer names):
enum {N = 10};
int A[N];
int B[N];
int C[N];
for (i = 0; i<N; ++N)
... Do something with A ...
for (i = 0; i<N; ++N)
... Do something with B ...
(I will leave the typos I've just noticed. Bloody stupid C for loops...)
Later I change A[N] to be A[M] as I said a few posts ago. Now you have
to go through the program looking for things that reference the length
of A.
But you only have N; you can't tell from N itself whether it is specifically used as the length of A or not. The job is harder. Now
consider this version:
for (i = 0; i<LENGTH(A); ++i)
... Do something with A ...
for (i = 0; i<LENGTH(B); ++i)
... Do something with B ...
Which version will be easier to update? (Hint: this one doesn't need any changes.)
Well, here is an extract from sqlite3.c:
I didn't write sqlite3.c. I am not responsible for the choices of
identifiers, or coding style, or macro usage, or anything else in it.
I am telling you why /I/ don't need or use any kind of "array_size"
macro. Other people may write their code differently.
So in your code, if I see:
sizeof(expr1)/sizeof(expr2)
what am I supposed to deduce from that? That it /might/ be an idiom for getting the number of elements in an array? Then it might involve looking
in more detail to see that expr2 is really just expr1[0].
On the other hand, if I see:
ArraySize(expr)
then you are expressing what your intentions are more clearly.
On 04/01/2024 01:57, Scott Lurndal wrote:
Bart <bc@freeuk.cm> writes:
On 03/01/2024 19:16, David Brown wrote:
On 03/01/2024 18:14, Bart wrote:
How do you know whether I'm exaggerating or not? There have surely
been many millions of people who've programmed in C over the years.
And how many of them define or use such a macro? Some, certainly, but
not all. I can't say I have ever had one in my code as far as I
remember. Occasionally I find it convenient to calculate the size of an
existing array, but I'll just write it manually as an expression (like
Scott seems to do).
So writing the same long identifier twice?
Yes. Good editors mean you don't need to type it
twice (ywllp).
And hoping there's no typo in one?
If there's a typo, the compiler will note it and I'll
fix it. But, see above.
Because sizeof(A)/sizeof(B[0]) would be legal code when both A and
B are arrays.
Good thing I don't use single letter identifiers.
What are you, 5 years old?
A and B are placeholders for arbitrary identifiers of any length.
How about if I change my remark to:
"Because >sizeof(pentium_msr_passthrough_with_knobs_on)/sizeof(k6_msr_passthrough_with_knobs_on)
would be legal code when both pentium_msr_passthrough_with_knobs_on and >k6_msr_passthrough_with_knobs_on are arrays."
On 03/01/2024 22:42, Keith Thompson wrote:
David Brown <david.brown@hesbynett.no> writes:
On 02/01/2024 21:24, Bart wrote:[...]
The X-macro solution was this, adapted from the first answer here
(https://stackoverflow.com/questions/6635851/real-world-use-of-x-macros); >>>> assume those functions are in scope:
-------
#define STATE_TABLE \
ENTRY(STATE0, func0) \
ENTRY(STATE1, func1) \
ENTRY(STATE2, func2) \
ENTRY(STATE3, func3) \
enum
{
#define ENTRY(a,b) a,
STATE_TABLE
#undef ENTRY
NUM_STATES
};
void* jumptable[NUM_STATES] =
{
#define ENTRY(a,b) b,
STATE_TABLE
#undef ENTRY
};
-------
(Why did you change the type from "p_func_t" to "void*" ? Was it just
to annoy myself and other C programmers with a pointless and
constraint-violating cast of a function pointer to "void*" ? Just add
a suitable typedef - "typedef void(*p_func_t)(void);" )
What constraint does it violate? And what cast are you referring to?
I believe the initialisation follows the requirements for simple
assignment, and function pointers are not compatible with void*.
On 04/01/2024 11:46, David Brown wrote:
On 03/01/2024 22:42, Keith Thompson wrote:
David Brown <david.brown@hesbynett.no> writes:
On 02/01/2024 21:24, Bart wrote:[...]
The X-macro solution was this, adapted from the first answer here
(https://stackoverflow.com/questions/6635851/real-world-use-of-x-macros); assume those functions are in scope:
-------
#define STATE_TABLE \
ENTRY(STATE0, func0) \
ENTRY(STATE1, func1) \
ENTRY(STATE2, func2) \
ENTRY(STATE3, func3) \
enum
{
#define ENTRY(a,b) a,
STATE_TABLE
#undef ENTRY
NUM_STATES
};
void* jumptable[NUM_STATES] =
{
#define ENTRY(a,b) b,
STATE_TABLE
#undef ENTRY
};
-------
(Why did you change the type from "p_func_t" to "void*" ? Was it just
to annoy myself and other C programmers with a pointless and
constraint-violating cast of a function pointer to "void*" ? Just add
a suitable typedef - "typedef void(*p_func_t)(void);" )
What constraint does it violate? And what cast are you referring to?
I believe the initialisation follows the requirements for simple
assignment, and function pointers are not compatible with void*. Bart
(for reasons understood only by him) uses void* pointers when he wants
generic function pointers. The stack overflow link uses "p_func_t",
which is a function pointer typedef.
The SO link didn't define p_func_t that I could see. ...
... I changed it to
void* to avoid the bother of doing so when testing, and to avoid
including that in my example, where it would be a distraction.
Bart <bc@freeuk.cm> writes:
On 04/01/2024 01:57, Scott Lurndal wrote:
Good thing I don't use single letter identifiers.
What are you, 5 years old?
A and B are placeholders for arbitrary identifiers of any length.
How about if I change my remark to:
"Because
sizeof(pentium_msr_passthrough_with_knobs_on)/sizeof(k6_msr_passthrough_with_knobs_on)
would be legal code when both pentium_msr_passthrough_with_knobs_on and
k6_msr_passthrough_with_knobs_on are arrays."
It's pretty obvious to even the meanest programmer that code is incorrect.
On 1/4/24 07:37, Bart wrote:
The SO link didn't define p_func_t that I could see. ...
It also didn't provide declarations for func0, func1, func2, func3, or
FuncX. From C's rules for initialization, you should have concluded that
it was a pointer to a function type that is compatible with the types of those functions.
... I changed it to
void* to avoid the bother of doing so when testing, and to avoid
including that in my example, where it would be a distraction.
You don't consider it distracting that void* makes it a constraint
violation, rendering the behavior of any program that uses such code undefined?
However, I think it is all too late for that. It certainly can't be
added to <stddef.h> or <stdlib.h> - what would you call it without conflicting with existing names in existing code?
Now look at the basketcase that is C's attempt. C could EASILY have had
an equivalent op that works like 'sizeof'; it chose not to.
David Brown <david.brown@hesbynett.no> writes:
On 03/01/2024 22:42, Keith Thompson wrote:
David Brown <david.brown@hesbynett.no> writes:
On 02/01/2024 21:24, Bart wrote:[...]
The X-macro solution was this, adapted from the first answer here
(https://stackoverflow.com/questions/6635851/real-world-use-of-x-macros); assume those functions are in scope:
-------
#define STATE_TABLE \
ENTRY(STATE0, func0) \
ENTRY(STATE1, func1) \
ENTRY(STATE2, func2) \
ENTRY(STATE3, func3) \
enum
{
#define ENTRY(a,b) a,
STATE_TABLE
#undef ENTRY
NUM_STATES
};
void* jumptable[NUM_STATES] =
{
#define ENTRY(a,b) b,
STATE_TABLE
#undef ENTRY
};
-------
(Why did you change the type from "p_func_t" to "void*" ? Was it just
to annoy myself and other C programmers with a pointless and
constraint-violating cast of a function pointer to "void*" ? Just add
a suitable typedef - "typedef void(*p_func_t)(void);" )
What constraint does it violate? And what cast are you referring to?
I believe the initialisation follows the requirements for simple
assignment, and function pointers are not compatible with void*. Bart
(for reasons understood only by him) uses void* pointers when he wants
generic function pointers. The stack overflow link uses "p_func_t",
which is a function pointer typedef.
Bart's code initializes array elements of type void* with values that
are presumably of pointer-to-function type. That is indeed a constraint violation (one that, unfortunately, gcc doesn't even warn about with
default options).
David, your use of the word "cast" confused me. What's happening in the
code is an implicit conversion, not a cast (or it would be if that
particular implicit conversion were defined in the language, or if
you're using a non-standard C dialect that defines it).
On 04.01.2024 19:35, Bart wrote:
Now look at the basketcase that is C's attempt. C could EASILY have had
an equivalent op that works like 'sizeof'; it chose not to.
You're asking for an, say, 'nof_elements(A)', where there's currently
only a 'sizeof(A)/sizeof(A[0])' existing? - Is that the whole resume
of this overly bulky subthread?
On 04/01/2024 16:08, Scott Lurndal wrote:
Bart <bc@freeuk.cm> writes:
On 04/01/2024 01:57, Scott Lurndal wrote:
Good thing I don't use single letter identifiers.
What are you, 5 years old?
A and B are placeholders for arbitrary identifiers of any length.
How about if I change my remark to:
"Because
sizeof(pentium_msr_passthrough_with_knobs_on)/sizeof(k6_msr_passthrough_with_knobs_on)
would be legal code when both pentium_msr_passthrough_with_knobs_on and
k6_msr_passthrough_with_knobs_on are arrays."
It's pretty obvious to even the meanest programmer that code is incorrect.
Is it? How do you know? The code is valid:
--------
int pentium_msr_passthrough_with_knobs_on;
int k6_msr_passthrough_with_knobs_on;
int main(void) {
sizeof(pentium_msr_passthrough_with_knobs_on)/sizeof(k6_msr_passthrough_with_knobs_on);
}
--------
This compiles fine.
Bart <bc@freeuk.cm> writes:
This compiles fine.
So what? Any programmer will see that they don't
match immediately. It is obviously wrong.
Give it up already. Propose a modification to the
C language to the standards committee. Bitching about
it is just noise.
On 04/01/2024 17:51, James Kuyper wrote:...
You don't consider it distracting that void* makes it a constraint
violation, rendering the behavior of any program that uses such code
undefined?
Only to people obsessed with Standards minutiae.
On 2024-01-04, Scott Lurndal <scott@slp53.sl.home> wrote:
Bart <bc@freeuk.cm> writes:
This compiles fine.
So what? Any programmer will see that they don't
match immediately. It is obviously wrong.
Give it up already. Propose a modification to the
C language to the standards committee. Bitching about
it is just noise.
Oh boy and isn't there a lot of it - yawn!!!!!
I admire the perseverance
On 04/01/2024 22:07, Jim Jackson wrote:
On 2024-01-04, Scott Lurndal <scott@slp53.sl.home> wrote:
Bart <bc@freeuk.cm> writes:
This compiles fine.
So what? Any programmer will see that they don't
match immediately. It is obviously wrong.
Give it up already. Propose a modification to the
C language to the standards committee. Bitching about
it is just noise.
Oh boy and isn't there a lot of it - yawn!!!!!
I admire the perseverance
For 40 years 99% of my coding has been in a language where you just write:
A.len
instead of:
sizeof(A)/sizeof(A[0])
For 40 years 99% of my coding has been in a language where you just
write:
A.len
On 2024-01-04, Bart <bc@freeuk.cm> wrote:
On 04/01/2024 22:07, Jim Jackson wrote:
I admire the perseverance
For 40 years 99% of my coding has been in a language where you just write:
A.len
instead of:
sizeof(A)/sizeof(A[0])
But I suspect that language can pass the array to a function, and
that function also do A.len, right?
Even when we have a macro or language extension that calculates sizeof(A)/sizeof(*A), it is only usable for arrays that were not passed
by pointer.
On Thu, 4 Jan 2024 22:48:17 +0000, Bart wrote:
For 40 years 99% of my coding has been in a language where you just
write:
A.len
Better still:
len(A)
which is just a generic wrapper around
A.__len__()
so it will work with your custom object types as well, all they have to do
is define a method with that name.
If a language has a built-in like A.len or len(A) or A'len and so on,
then it can choose to allow overloading for that operator. The syntax
chosen doesn't matter.
It doesn't need to take a heavy-handed approach like C++ or Python.
On 04/01/2024 19:55, Janis Papanagnou wrote:
On 04.01.2024 19:35, Bart wrote:
Now look at the basketcase that is C's attempt. C could EASILY have had
an equivalent op that works like 'sizeof'; it chose not to.
You're asking for an, say, 'nof_elements(A)', where there's currently
only a 'sizeof(A)/sizeof(A[0])' existing? - Is that the whole resume
of this overly bulky subthread?
Yes exactly.
If you're arguing that it is such a trivial matter that it is hardly
worth discussing, I can equally say that if it is that trivial, why
isn't it just built-in? How hard can it be?
The subthread was actually about the macro system, and this example was
one I gave where macros make it possible for every man and his dog to
create their own sloppy, incompatible workaround.
Every other app seems to provide their own version; I gave three where
so far we have ArraySize, SDL_arraysize, and ARRAY_SIZE.
You yourself have suggested 'nof_elements'. You see the problem?
I'm not actually too bothered about that (I know C is never going to
have anything that is that sensible), I'm arguing against attitudes like yours that it doesn't matter.
My original point may have been this: suppose C's macro system didn't
exist. Would that have given enough impetus to whoever was responsible
for the language to actually do it properly?
Since my conjecture was that C's macro scheme stopped it evolving normally.
On 04/01/2024 23:25, Lawrence D'Oliveiro wrote:
On Thu, 4 Jan 2024 22:48:17 +0000, Bart wrote:
For 40 years 99% of my coding has been in a language where you just
write:
A.len
Better still:
len(A)
which is just a generic wrapper around
A.__len__()
so it will work with your custom object types as well, all they have
to do
is define a method with that name.
If a language has a built-in like A.len or len(A) or A'len and so on,
then it can choose to allow overloading for that operator. The syntax
chosen doesn't matter.
It doesn't need to take a heavy-handed approach like C++ or Python.
On 04/01/2024 23:25, Lawrence D'Oliveiro wrote:
On Thu, 4 Jan 2024 22:48:17 +0000, Bart wrote:
For 40 years 99% of my coding has been in a language where you just
write:
A.len
Better still:
len(A)
which is just a generic wrapper around
A.__len__()
so it will work with your custom object types as well, all they have to do
is define a method with that name.
If a language has a built-in
On 05/01/2024 02:53, Bart wrote:
On 04/01/2024 23:25, Lawrence D'Oliveiro wrote:
On Thu, 4 Jan 2024 22:48:17 +0000, Bart wrote:
For 40 years 99% of my coding has been in a language where you just
write:
A.len
Better still:
len(A)
which is just a generic wrapper around
A.__len__()
so it will work with your custom object types as well, all they have
to do
is define a method with that name.
If a language has a built-in like A.len or len(A) or A'len and so on,
then it can choose to allow overloading for that operator. The syntax
chosen doesn't matter.
It doesn't need to take a heavy-handed approach like C++ or Python.
One option for a useful array length operator/function/macro is a simple
and limited feature that works for arrays of known size and gives a hard (compile-time) error when the size is not known. The one you have in
your own language covers most of that, except for the insanity of
evaluating to 0 when given a pointer/reference to an "unbounded" array.
The "ARRAY_SIZE" macro in Linux is almost perfect - the only
disadvantage is that it relies on compiler extensions:
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
#define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
#define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
#define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int:-!!(e); }))
Bart <bc@freeuk.cm> writes:
On 04/01/2024 23:25, Lawrence D'Oliveiro wrote:
On Thu, 4 Jan 2024 22:48:17 +0000, Bart wrote:
For 40 years 99% of my coding has been in a language where you just
write:
A.len
Better still:
len(A)
which is just a generic wrapper around
A.__len__()
so it will work with your custom object types as well, all they have to do >>> is define a method with that name.
If a language has a built-in
The C language doesn't have such a built-in. Any discussion that
assumes it does, or will, is pointless.
David Brown <david.brown@hesbynett.no> writes:
[...]
One option for a useful array length operator/function/macro is a
simple and limited feature that works for arrays of known size and
gives a hard (compile-time) error when the size is not known. The one
you have in your own language covers most of that, except for the
insanity of evaluating to 0 when given a pointer/reference to an
"unbounded" array. The "ARRAY_SIZE" macro in Linux is almost perfect -
the only disadvantage is that it relies on compiler extensions:
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) +
__must_be_array(arr))
#define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
#define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
#define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int:-!!(e); }))
When you wrote "in Linux", I wondered if you were being imprecise, but
in fact that code is in the Linux kernel.
That means the macros aren't directly available to normal C code, but
you can always copy their definitions (*if* you're using a compiler
that supports the __builtin_types_compatible_p extension).
When you wrote "in Linux", I wondered if you were being imprecise, but
in fact that code is in the Linux kernel.
That means the macros aren't directly available to normal C code, but
you can always copy their definitions (*if* you're using a compiler
that supports the __builtin_types_compatible_p extension).
Why so paranoid defensive? :-)
On 05/01/2024 14:05, David Brown wrote:
On 05/01/2024 02:53, Bart wrote:
On 04/01/2024 23:25, Lawrence D'Oliveiro wrote:
On Thu, 4 Jan 2024 22:48:17 +0000, Bart wrote:
For 40 years 99% of my coding has been in a language where you just
write:
A.len
Better still:
len(A)
which is just a generic wrapper around
A.__len__()
so it will work with your custom object types as well, all they have
to do
is define a method with that name.
If a language has a built-in like A.len or len(A) or A'len and so on,
then it can choose to allow overloading for that operator. The syntax
chosen doesn't matter.
It doesn't need to take a heavy-handed approach like C++ or Python.
One option for a useful array length operator/function/macro is a simple
and limited feature that works for arrays of known size and gives a hard
(compile-time) error when the size is not known. The one you have in
your own language covers most of that, except for the insanity of
evaluating to 0 when given a pointer/reference to an "unbounded" array.
The "ARRAY_SIZE" macro in Linux is almost perfect - the only
disadvantage is that it relies on compiler extensions:
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) +
__must_be_array(arr))
#define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
#define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
#define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int:-!!(e); }))
At least Linux recognises some of the problems of the
sizeof(A)/sizeof(A[0]) idiom.
Which is more than can said for Scott, who can't see any possible issues
at all. Apparently writing B for one of those As is impossible:
"Any programmer will see that they don't match immediately. It is
obviously wrong."
No matter how long or short or similar they are.
Bart <bc@freeuk.cm> writes:
On 05/01/2024 14:05, David Brown wrote:
One option for a useful array length operator/function/macro is a simple
and limited feature that works for arrays of known size and gives a hard
(compile-time) error when the size is not known. The one you have in
your own language covers most of that, except for the insanity of
evaluating to 0 when given a pointer/reference to an "unbounded" array.
The "ARRAY_SIZE" macro in Linux is almost perfect - the only
disadvantage is that it relies on compiler extensions:
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) +
__must_be_array(arr))
#define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
#define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
#define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int:-!!(e); }))
At least Linux recognises some of the problems of the
sizeof(A)/sizeof(A[0]) idiom.
Which is more than can said for Scott, who can't see any possible issues
at all. Apparently writing B for one of those As is impossible:
Even David has claimed to use the sizeof/sizeof[0] idiom.
"Any programmer will see that they don't match immediately. It is
obviously wrong."
No matter how long or short or similar they are.
There are thousands of ways programmers can introduce
bugs far more easily which are significantly harder
to find and fix; and yes, I expect programmers to
fully understand the code they're writing and test
their code to find bugs which they then will subsequently
fix.
Do lookup the word 'strawman' as it relates to discourse.
On 05/01/2024 18:44, Scott Lurndal wrote:
Do lookup the word 'strawman' as it relates to discourse.
And yet, despite it having no problems according to you, people still
feel the need to create macros like ARRAY_SIZE.
So I'm puzzled. Why do they do that? What imaginary problem (according
to you) do they solve?
On 04.01.2024 21:17, Bart wrote:
Specifically when discussing C there are IMO countless examples of what's missing, and what's wrongly designed, and whatnot. - Thus I am merely astonished that you fastened on that bit so vigorously, and also with
so much text and repetitions.
There's (IMO) so much wrong with arrays in C that the discussed special
case (which is also not generally applicable as pointed out repeatedly)
is of little interest (for me; YMMV).
Mind, I started with K&R C in the 1980's, at a time where much more sophisticated languages had already been present; the formally much
more appealing Algol(68), the concepts-outstanding Simula(67), even
Pascal
Above we identified a "resume" of the previous posts. Another one
shall finalize this post; no one is forced to use C.
In my static language, '.len' works on two types (arrays and slices; for
the latter it happens at runtime).
In C, sizeof() is usually compile-time, except when used on VLAs, and here, as usual for VLAs, it is a lot more complicated than you'd think:
--------
#include <stdio.h>
#include <stdlib.h>
int main(void) {
int n=rand()+1;
typedef char (**T)[n];
n=-777;
T x;
printf("%zu\n", sizeof(**x));
}
--------
This actually displays 42 (really!). 'x' is a pointer (to a pointer to array). There is no actual array allocated.
But it needs somewhere to store
the size of the type which is the target of those pointers.
I was hoping to do it without creating `x`, but I didn't know how.
On 05/01/2024 02:53, Bart wrote:
It doesn't need to take a heavy-handed approach like C++ or Python.
One option for a useful array length operator/function/macro is a simple
and limited feature that works for arrays of known size and gives a hard (compile-time) error when the size is not known. The one you have in
your own language covers most of that, except for the insanity of
evaluating to 0 when given a pointer/reference to an "unbounded" array.
The "ARRAY_SIZE" macro in Linux is almost perfect - the only
disadvantage is that it relies on compiler extensions:
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
#define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
#define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
#define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int:-!!(e); }))
The other option is a general mechanism for "length" as an operator/function/macro that can be applied to a wide range of types - including user-made types, assuming the language supports user-defined types. Whether that is done by function overloads, custom methods,
etc., is a design choice.
Bart <bc@freeuk.cm> writes:
On 05/01/2024 18:44, Scott Lurndal wrote:[...]
There are thousands of ways programmers can introduce
bugs far more easily which are significantly harder
to find and fix; and yes, I expect programmers to
fully understand the code they're writing and test
their code to find bugs which they then will subsequently
fix.
Do lookup the word 'strawman' as it relates to discourse.
And yet, despite it having no problems according to you, people still
feel the need to create macros like ARRAY_SIZE.
Yes. That's how they use the idiom.
So I'm puzzled. Why do they do that? What imaginary problem (according
to you) do they solve?
I don't know what you're puzzled about.
Bart <bc@freeuk.cm> writes:
In C, sizeof() is usually compile-time, except when used on VLAs, and here,
as usual for VLAs, it is a lot more complicated than you'd think:
--------
#include <stdio.h>
#include <stdlib.h>
int main(void) {
int n=rand()+1;
typedef char (**T)[n];
n=-777;
T x;
printf("%zu\n", sizeof(**x));
}
--------
This actually displays 42 (really!). 'x' is a pointer (to a pointer to
array). There is no actual array allocated.
More to the point, x is uninitialised and thus evaluating sizeof **x is undefined behaviour. gcc has some flags that could have helped you to
find that out.
But it needs somewhere to store
the size of the type which is the target of those pointers.
I was hoping to do it without creating `x`, but I didn't know how.
You don't need to create an array, nor a pointer to one, nor a pointer
to a pointer to one, in order to show that the compiler must record the
size of a variably modified array type at runtime:
int n = rand() + 1;
typedef char T[n];
printf("%zu\n", sizeof (T));
n = 42;
printf("%zu\n", sizeof (T));
n is something of a red herring here. If you write the expression in
the typedef itself you might have been less surprised:
typedef char T[rand() + 1]; // the size obviously needs to be stored
On 2024-01-05, Bart <bc@freeuk.cm> wrote:
In my static language, '.len' works on two types (arrays and slices; for
the latter it happens at runtime).
Yeah, but some widely-used languages also solve this.
Broadly speaking beyond the context of this array issue, those are your competition, not C.
It's like you're dwelling in a room full of retards in order to feel
smart.
for (i = 0; i<N; ++N)
... Do something with A ...
On 2024-01-05, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
When you wrote "in Linux", I wondered if you were being imprecise, but
in fact that code is in the Linux kernel.
That means the macros aren't directly available to normal C code, but
you can always copy their definitions (*if* you're using a compiler
that supports the __builtin_types_compatible_p extension).
You can always copy their definitions, if you're using a compiler
that doesn't arbitrarily define __GNUC__ without providing the
associated behaviors:
#ifdef __GNUC__
// define array_size in the Linux kernel way
#else
#define array_size(x) (sizeof (x) / sizeof *(x))
#endif
If you regularly build the code with a compiler that provides GNU
extensions (like as part of your CI), you're covered, even if you're
going to production with something else.
I use C++ this way in C projects; I have some macro features that
provide extra checks under C++. I get the benefit even if I just
compile the code as C++ only once before every release.
And yet, despite it having no problems according to you, people still
feel the need to create macros like ARRAY_SIZE.
So I'm puzzled. Why do they do that? What imaginary problem (according
to you) do they solve?
On 05/01/2024 14:05, David Brown wrote:
On 05/01/2024 02:53, Bart wrote:
It doesn't need to take a heavy-handed approach like C++ or Python.
One option for a useful array length operator/function/macro is a
simple and limited feature that works for arrays of known size and
gives a hard (compile-time) error when the size is not known. The one
you have in your own language covers most of that, except for the
insanity of evaluating to 0 when given a pointer/reference to an
"unbounded" array.
It gives zero because that is actually the compile-time type of the
array. But it is low priority because it is never used that way.
In C, sizeof() is usually compile-time, except when used on VLAs, and
here, as usual for VLAs, it is a lot more complicated than you'd think:
On 05/01/2024 23:19, Bart wrote:
On 05/01/2024 14:05, David Brown wrote:
On 05/01/2024 02:53, Bart wrote:
It doesn't need to take a heavy-handed approach like C++ or Python.
One option for a useful array length operator/function/macro is a
simple and limited feature that works for arrays of known size and
gives a hard (compile-time) error when the size is not known. The
one you have in your own language covers most of that, except for the
insanity of evaluating to 0 when given a pointer/reference to an
"unbounded" array.
It gives zero because that is actually the compile-time type of the
array. But it is low priority because it is never used that way.
That's the kind of potentially confusing short-cut you can make when
only you ever use the language. It is also fine for when people make
their own array_length macros for use in their own code. When you know
all the details of the operator/macro/function, and you are the only one using it, it's okay if the specification is weird and unhelpful for some cases. It's a different matter entirely if you are making something
that other people will use.
In C, sizeof() is usually compile-time, except when used on VLAs, and
here, as usual for VLAs, it is a lot more complicated than you'd think:
No it is not. I don't think "garbage in, garbage out" is a complicated concept. You can write pathological shit in any language and get meaningless results out - it has no bearing on anything in real code.
(It can be fun to play around with, but it does not mean there is a
problem with the language feature.)
On 05/01/2024 23:19, Bart wrote:
On 05/01/2024 14:05, David Brown wrote:
On 05/01/2024 02:53, Bart wrote:
It doesn't need to take a heavy-handed approach like C++ or Python.
One option for a useful array length operator/function/macro is a
simple and limited feature that works for arrays of known size and
gives a hard (compile-time) error when the size is not known. The
one you have in your own language covers most of that, except for the
insanity of evaluating to 0 when given a pointer/reference to an
"unbounded" array.
It gives zero because that is actually the compile-time type of the
array. But it is low priority because it is never used that way.
That's the kind of potentially confusing short-cut you can make when
only you ever use the language.
On 06/01/2024 09:09, David Brown wrote:
On 05/01/2024 23:19, Bart wrote:
At least it has '.len' and quite a few extra bits to do with
fixed-size arrays:

                      'M'                      C
Value arrays          Y                        N
Ptr to array params   Y                        N (also Y but non-idiomatic)
Pass-by-reference     Y                        N
Index a pointer       N                        Y
Deref an array        N                        Y
0-based arrays        Y                        Y
1-based               Y                        N
N-based               Y                        N
Number of bytes       X.bytes                  sizeof(X)
                      T.bytes                  sizeof(T)
Array Length          X.len                    sizeof(X)/sizeof(X[0])
                      T.len                    ?
Array Lower Bound     X.lwb                    0
                      T.lwb                    0
Array Upper Bound     X.upb                    sizeof(X)/sizeof(X[0])-1
                      T.upb                    ?
Table Indexing        A[i,j]                   A[i][j] or j[A[i]] or j[i[A]]
Loop over bounds      For i in A.bounds do     for (int i=0; i<sizeof(A)/sizeof(A[0]); ++i)
Loop over values      For x in A do            for (int i=0; i<sizeof(A)/sizeof(A[0]); ++i)
                      For i,x in A do # (both)     { T x = A[i]; ... }
Zero array            clear A                  memset(A, 0, sizeof(A));
On 06/01/2024 09:09, David Brown wrote:
On 05/01/2024 23:19, Bart wrote:
On 05/01/2024 14:05, David Brown wrote:
On 05/01/2024 02:53, Bart wrote:
It doesn't need to take a heavy-handed approach like C++ or Python.
One option for a useful array length operator/function/macro is a
simple and limited feature that works for arrays of known size and
gives a hard (compile-time) error when the size is not known. The
one you have in your own language covers most of that, except for
the insanity of evaluating to 0 when given a pointer/reference to an
"unbounded" array.
It gives zero because that is actually the compile-time type of the
array. But it is low priority because it is never used that way.
That's the kind of potentially confusing short-cut you can make when
only you ever use the language. It is also fine for when people make
their own array_length macros for use in their own code. When you
know all the details of the operator/macro/function, and you are the
only one using it, it's okay if the specification is weird and
unhelpful for some cases. It's a different matter entirely if you are
making something that other people will use.
In C, sizeof() is usually compile-time, except when used on VLAs, and
here, as usual for VLAs, it is a lot more complicated than you'd think:
No it is not. I don't think "garbage in, garbage out" is a
complicated concept. You can write pathological shit in any language
and get meaningless results out - it has no bearing on anything in
real code. (It can be fun to play around with, but it does not mean
there is a problem with the language feature.)
And yet, in my example, lccwin32 gave the wrong result, and in one case crashed. (I no longer have other smaller C compilers to try out.)
It suggests the problem is not trivial.
The concept is not that easy to get your head around either, the idea
that the variable aspects of a VLA are associated with its type, not its value or instance.
In my example, there were no instances of any actual arrays, not even if
I declared an instance of one of the pointers involved, yet space had to
be allocated for its size. And I kept the type simple.
I remember also trying out VLAs within a struct, and there I got a wider variance of results.
You can keep saying that VLAs are trivial to understand and implement; I won't believe you.
I have no idea why you are doing this. I have no idea why you thought
it was a good idea to make your own personal language and write a
million lines that no one can ever use again, or work with, change,
re-use, maintain or update. I have no idea why you hate C, why you
obsess about a language you hate and why you rave about your language
that no one else uses and no one else cares about in a newsgroup for C.
I have no idea why you continue to think that your language is God's
gift to programming and that everyone else, the world over, is crazy to imagine there are any reasons to do something differently from the way
you do things in your language.
Yes, "why?" is a good question.
On 06/01/2024 09:09, David Brown wrote:
On 05/01/2024 23:19, Bart wrote:
On 05/01/2024 14:05, David Brown wrote:
On 05/01/2024 02:53, Bart wrote:
It doesn't need to take a heavy-handed approach like C++ or Python.
One option for a useful array length operator/function/macro is a
simple and limited feature that works for arrays of known size and
gives a hard (compile-time) error when the size is not known. The
one you have in your own language covers most of that, except for
the insanity of evaluating to 0 when given a pointer/reference to an
"unbounded" array.
It gives zero because that is actually the compile-time type of the
array. But it is low priority because it is never used that way.
That's the kind of potentially confusing short-cut you can make when
only you ever use the language.
I spent 10 mins trying to fix this today. So that '.len' on bounds
specified as `[]` rather than `[0]`, and not later set by init data,
would be an error.
Then I got stuck on examples like '[]int a = ()' and realised it's
tricky. And then I thought, why the hell am I doing this? A language
I've already used for a million lines of code.
On 1/6/24 9:28 AM, David Brown wrote:
I have no idea why you are doing this. I have no idea why you thought
it was a good idea to make your own personal language and write a
million lines that no one can ever use again, or work with, change,
re-use, maintain or update. I have no idea why you hate C, why you
obsess about a language you hate and why you rave about your language
that no one else uses and no one else cares about in a newsgroup for
C. I have no idea why you continue to think that your language is
God's gift to programming and that everyone else, the world over, is
crazy to imagine there are any reasons to do something differently
from the way you do things in your language.
Yes, "why?" is a good question.
I think the big reason for his "Why?" is that he KNOWS that his language isn't sufficient, but will need to interface to things that others have
done.
His complaint is that while C has become the common tongue for
inter-language linkage, it really isn't designed for that.
On 06/01/2024 13:53, Bart wrote:
On 06/01/2024 09:09, David Brown wrote:
On 05/01/2024 23:19, Bart wrote:
On 05/01/2024 14:05, David Brown wrote:
On 05/01/2024 02:53, Bart wrote:
It doesn't need to take a heavy-handed approach like C++ or Python.
One option for a useful array length operator/function/macro is a
simple and limited feature that works for arrays of known size and
gives a hard (compile-time) error when the size is not known. The
one you have in your own language covers most of that, except for
the insanity of evaluating to 0 when given a pointer/reference to
an "unbounded" array.
It gives zero because that is actually the compile-time type of the
array. But it is low priority because it is never used that way.
That's the kind of potentially confusing short-cut you can make when
only you ever use the language.
I spent 10 mins trying to fix this today. So that '.len' on bounds
specified as `[]` rather than `[0]`, and not later set by init data,
would be an error.
Then I got stuck on examples like '[]int a = ()' and realised it's
tricky. And then I thought, why the hell am I doing this? A language
I've already used for a million lines of code.
I have no idea why you are doing this. I have no idea why you thought
it was a good idea to make your own personal language and write a
million lines that no one can ever use again, or work with, change,
re-use, maintain or update.
I have no idea why you hate C, why you
obsess about a language you hate and why you rave about your language
that no one else uses and no one else cares about in a newsgroup for C.
I have no idea why you continue to think that your language is God's
gift to programming and that everyone else, the world over, is crazy to imagine there are any reasons to do something differently from the way
you do things in your language.
Yes, "why?" is a good question.
His complaint is that while C has become the common tongue for
inter-language linkage, it really isn't designed for that.
The "shims" to perform this tend to need some part of it to be written
in C.
But it is true that many APIs (not ABIs) are defined in C terms or as C headers.
Bart <bc@freeuk.cm> writes:
[...]
The concept is not that easy to get your head around either, the idea
that the variable aspects of a VLA are associated with its type, not
its value or instance.
Oh? I don't find it difficult at all. It's the kind of thing you learn once, and then you know it. Now you know it, but you're still
complaining for some reason.
On Sat, 6 Jan 2024 09:56:05 -0500, Richard Damon wrote:
His complaint is that while C has become the common tongue for
inter-language linkage, it really isn't designed for that.
The "shims" to perform this tend to need some part of it to be written
in C.
There is a thing called libffi, which is commonly used by many high-level languages to interface to C code (or to code that assumes that it is being called from C code). This is the basis of the ctypes module in the
standard Python library, for example.
I have successfully used ctypes to produce “Pythonic” wrappers for several
useful facilities (e.g. Cairo graphics, D-Bus, inotify). By “Pythonic” I mean that they try to look as though the underlying facility was designed
to be used from Python.
(When you do need something like LIBFFI, then it is another of those hard-to-build C libraries.)
typedef int T[n]; // here the size is stored with the type
int A[n];       // here you'd expect it with each variable
int B[n];
T C[n];         // and here in both
C uses a "declaration follows usage" rule (though not with 100%
consistency).
On Sun, 7 Jan 2024 00:09:11 +0000, Bart wrote:
typedef int T[n]; // here the size is stored with the type
int A[n];       // here you'd expect it with each variable
int B[n];
T C[n];         // and here in both
The array size is in the wrong place.
Java at least puts it in a more natural place:
int[n] A;
... etc ...
though unfortunately it forgets to include typedefs.
Type specifications in C are all backwards, anyway. They should have
adopted the Pascal syntax for that.
Bart <bc@freeuk.cm> writes:
On 05/01/2024 22:50, Keith Thompson wrote:
Bart <bc@freeuk.cm> writes:
On 05/01/2024 18:44, Scott Lurndal wrote:
[...]
Yes. That's how they use the idiom.
There are thousands of ways programmers can introduce
bugs far more easily which are significantly harder
to find and fix; and yes, I expect programmers to
fully understand the code they're writing and test
their code to find bugs which they then will subsequently
fix.
Do lookup the word 'strawman' as it relates to discourse.
And yet, despite it having no problems according to you, people still
feel the need to create macros like ARRAY_SIZE.
So I'm puzzled. Why do they do that? What imaginary problem (according
to you) do they solve?
I don't know what you're puzzled about.
I'm puzzled because people like Scott are suggesting those macros are
a waste of time, and yet you find them in big, important software.
Oh, so *that's* what you're complaining about.
I just took a look at Scott's article to which you replied, and its
parent, and so on all the way to the first article in this thread.
Nowhere did Scott suggest that those macros are a waste of time.
He did write some things that suggest that he considers the
`sizeof arr / sizeof arr[0]` idiom clear enough to be written directly
as part of an expression rather than as a macro (and I agree), but I
didn't see him objecting to the idea of using a macro.
I can't speak for him, but I think he was objecting to your insistence
that using the idiom directly is unacceptable, and you interpreted that
as him insisting that wrapping the idiom in a macro is useless. He
didn't say that.
I'm making an assumption here, that you find code that uses the idiom
directly more objectionable than code that wraps it in a macro.
Personally, I'd be likely to define a macro if I'm going to be using it
multiple times. In an example I mentioned recently, where I modified
some existing code to use `sizeof array / sizeof array[0]`, I was only
doing it once, so just wrote the expression directly. Either method is
fine with me, and I suspect Scott would agree.
On 07/01/2024 01:12, Janis Papanagnou wrote:
On 07.01.2024 01:16, Lawrence D'Oliveiro wrote:
On Sun, 7 Jan 2024 00:09:11 +0000, Bart wrote:
typedef int T[n]; // here the size is stored with the type
int A[n];         // here you'd expect it with each variable
int B[n];
T C[n];           // and here in both
The array size is in the wrong place.
You mean that the poster has a misconception about the declaration
mapping to the actual formal semantics? (That might at least explain
why he's confused by the C way.)
There was nothing wrong with the C code. (The quote has garbled the
'int' belonging to B.)
LD'O is saying the C syntax puts it in the wrong place.
Java at least puts it in a more natural place:
int[n] A;
... etc ...
Or Algol(68) that I upthread mentioned for its formal sophistication
[n] int A;
though unfortunately it forgets to include typedefs.
where Algol has 'mode' declarations for types
mode intarr = [n] int A;
Interesting. My syntax uses:
[n]int A
type intarr = [n]int A
On Sun, 7 Jan 2024 00:21:00 +0000, Bart wrote:
(When you do need something like LIBFFI, then it is another of those
hard-to-build C libraries.)
sudo apt-get install libffi-dev
does it for me.
If you want to learn how to build things from source, maybe look at how a source-based distro like Gentoo does it?
They provide scripts to do
automatically what you struggle to manage with your human brain. Maybe too much exposure to Microsoft Windows?
On 07.01.2024 01:16, Lawrence D'Oliveiro wrote:
On Sun, 7 Jan 2024 00:09:11 +0000, Bart wrote:
typedef int T[n]; // here the size is stored with the type
int A[n];         // here you'd expect it with each variable
int B[n];
T C[n];           // and here in both
The array size is in the wrong place.
You mean that the poster has a misconception about the declaration
mapping to the actual formal semantics? (That might at least explain
why he's confused by the C way.)
Java at least puts it in a more natural place:
int[n] A;
... etc ...
Or Algol(68) that I upthread mentioned for its formal sophistication
[n] int A;
though unfortunately it forgets to include typedefs.
where Algol has 'mode' declarations for types
mode intarr = [n] int A;
And then what? LIBFFI is still hard to use.
Or Algol(68) that I upthread mentioned for its formal sophistication
[n] int A;
Bart <bc@freeuk.cm> writes:
On 06/01/2024 21:40, Keith Thompson wrote:
Bart <bc@freeuk.cm> writes:
[...]
The concept is not that easy to get your head around either, the idea
that the variable aspects of a VLA are associated with its type, not
its value or instance.
Oh? I don't find it difficult at all. It's the kind of thing you learn
once, and then you know it. Now you know it, but you're still
complaining for some reason.
Maybe you've never had to implement it. It is certainly not intuitive:
No, I haven't. I worked on compilers in the distant past (for Ada,
which doesn't distinguish on the language level between array with
constant and non-constant bounds), but I haven't implemented VLAs for C.
We were talking about the fact that, as you say, "variable aspects of a
VLA are associated with its type, not its value or instance". Are you
saying that makes implementing it unreasonably difficult?
int n=rand();
typedef int T[n]; // here the size is stored with the type
Yes, of course it is. And keep in mind that typedef doesn't create a
new type. A compiler will create, at compile time, some internal node
(or whatever data structure it uses) representing the anonymous type `int[n]`, and will associate an implicitly created automatic object with
it to hold its length or size. It will then create a node representing
the typedef T, referring to the anonymous array type.
int A[n]; // here you'd expect it with each variable
Why would you expect that?
And if you have std::vector you don't need VLAs or slices. If you use a language other than C, you can avoid C's limitations.
On Sat, 06 Jan 2024 16:40:03 -0800, Keith Thompson wrote:
C uses a "declaration follows usage" rule (though not with 100%
consistency).
And putting the function result before the argument types turns out to
cause trouble when carried over to C++, when you try to express
dependencies between them. So they had to add a Pascal-style alternative syntax, with the function result declared after the arguments.
Then I got stuck on examples like '[]int a = ()' and realised it's
tricky. And then I thought, why the hell am I doing this? A language
I've already used for a million lines of code.
On 2024-01-06, Bart <bc@freeuk.cm> wrote:
Then I got stuck on examples like '[]int a = ()' and realised it's
tricky. And then I thought, why the hell am I doing this? A language
I've already used for a million lines of code.
That exact reasoning can be used to reject new features from C,
like an operator for the number of elements in an array.
People have written billions of lines of C without it!
On Sun, 7 Jan 2024 01:26:29 +0000, Bart wrote:
And then what? LIBFFI is still hard to use.
Using Python’s ctypes module, which is basically built on top of libffi, I have not found to be that hard at all.
Looking at the sizes of those particular Python wrappers I mentioned:
Cairo graphics <https://gitlab.com/ldo/qahirah> -- 8500 lines
D-bus <https://gitlab.com/ldo/dbussy/> -- 11,000 lines
inotify <https://gitlab.com/ldo/inotipy> -- under 600 lines
Using Python’s ctypes module, which is basically built on top of libffi,
I have not found to be that hard at all.
On Sat, 06 Jan 2024 16:40:03 -0800, Keith Thompson wrote:
C uses a "declaration follows usage" rule (though not with 100%
consistency).
And putting the function result before the argument types turns out to
cause trouble when carried over to C++, when you try to express
dependencies between them. So they had to add a Pascal-style alternative syntax, with the function result declared after the arguments.
Even pointer dereferencing should have been done with a postfix, not a
prefix operator. Consider why you need “->”: it’s purely syntactic sugar
to make things like
(*a).b
less awkward as
a->b
Whereas in Pascal, for example, there is no need for any alternative
syntax to
a^.b
On 07/01/2024 01:58, Lawrence D'Oliveiro wrote:
On Sat, 06 Jan 2024 16:40:03 -0800, Keith Thompson wrote:
C uses a "declaration follows usage" rule (though not with 100%
consistency).
And putting the function result before the argument types turns out to
cause trouble when carried over to C++, when you try to express
dependencies between them. So they had to add a Pascal-style alternative
syntax, with the function result declared after the arguments.
Even pointer dereferencing should have been done with a postfix, not a
prefix operator. Consider why you need “->”: it’s purely syntactic sugar
to make things like
(*a).b
less awkward as
a->b
Whereas in Pascal, for example, there is no need for any alternative
syntax to
a^.b
There are two kinds of programming languages. There are ones that exist long enough and are popular enough for people to see that the
original design was not perfect and could have been done differently,
and languages that die away to irrelevance before long. No one thinks
the C way of doing things, or its syntax, is perfect - but a lot of
people think it is good enough that they can live with it.
A language has to either stick with the sub-optimal choices it made long
ago, as C has done, or it can try to make changes and suffers from
having to support new and old ideas, as C++ has done. Each technique
has its advantages and disadvantages.
Bart <bc@freeuk.cm> writes:
You might call it metadata, but being a type doesn't come to
mind. What would be the point of that; what could you do with that
type?
You could define objects of the type,
you could apply sizeof to it,
you
could define a type that points to it, or that's an array of it. You
know, all the stuff you can normally do with a type
Suppose you had a counted string variable S, currently set to "ABC" so
that its length (which you say is naturally part its type) is 3.
What exactly do you mean by a "counted string variable"? If you mean
you have a variable whose current value is "ABC", and you can update it
so its value becomes "ABCDEF", then of course the count has to be
associated with the object.
typedef int rvla[rand() % 10 + 1];
rvla A;
rvla B;
A and B are of the same type and therefore have the same length.
The
obvious way to implement that would be to store the length in an
anonymous object associated with the type. You're saying you'd expect
the size be associated with A and with B, not with the type.
Given:
int row_count = 1000;
int col_count = 2000;
int vla_2d[row_count][col_count];
you have 1000 rows of 2000 elements each. Would you expect to create
1000 implicit objects, one for each row, each holding the value 2000?
Is there much left for you to do?
On Sun, 7 Jan 2024 12:14:24 +0000, Bart wrote:
Is there much left for you to do?
Yes. Make it work as though it were written for Python programmers.
For example, the Cairo graphics API requires you to pass X- and Y- coordinates as separate arguments to every function. I wrap them up into a single “Vector” type, with its own arithmetic operators. For example, compare what you would have to do in C:
x0 = 0;
y0 = scope_radius;
x1 = x0 * cos(- trace_width_angle) - y0 * sin(- trace_width_angle);
y1 = x0 * sin(- trace_width_angle) + y0 * cos(- trace_width_angle);
cairo_line_to(ctx, x1, y1);
with what my Python wrapper allows:
ctx.line_to(Vector(0, - scope_radius).rotate(- trace_width_angle))
I had to write the code to do that.
Bart <bc@freeuk.cm> writes:
So, C has distinct concepts of 'zero-length' array and 'unbounded
array', so that sizeof/sizeof on the latter generates a diagnostic.
No, it doesn't. C doesn't have zero-length arrays. A fixed-length
array with a length of 0:
int arr[0];
is a constraint violation.
I don't know what you mean by "unbounded array".
I'm using parameter names so that, with more elaborate functions, I can
use keyword arguments. I can also define default values so that not all arguments need be supplied.
importdll cairo = # don't know the exact dll name
From your library:
def line_to(self, p) :
...
#end line_to
I see you have the same opinion of Python block syntax as I do.
Bart <bc@freeuk.cm> writes:
Here's a small non-executable example:
#include <stddef.h>
size_t foo_size;
size_t bar_size;
void func(void) {
int n = 42;
typedef int vla[n];
vla foo;
vla bar;
At the next level up, if row_count and col_count are really not known
until runtime (your example would be better off as enums), then yes,
there would be 1000 rows, and each row is a 2000-element array
containing its length.
If row_count and col_count were enums, they would be constants, and
vla_2d would not be a VLA. The whole point is that we're talking about
VLAs.
Yes, VLAs can result in stack overflows. So can ordinary fixed-length arrays.
If you define a VLA that you know can't be longer than N
elements, then that's no more dangerous that defining an ordinary array
with a length of exactly N.
This is clearly a bug in lccwin. Feel free to report it to jacob navia and/or post to comp.compilers.lcc. I don't think he'd be interested in hearing about it from me.
Meanwhile my dynamic language [...]
is irrelevant. If you want to discuss it, I suggest starting a thread
in comp.lang.misc. I might even participate. (Your point in bringing
up your own languages seems to be that other languages do some things
better than C does.
C *does not have* VLAs whose type is "determined at runtime but can
grow". Logically associating the length of a VLA type with the type
*works*.
Here's a small non-executable example:
#include <stddef.h>
size_t foo_size;
size_t bar_size;
void func(void) {
int n = 42;
typedef int vla[n];
vla foo;
vla bar;
foo_size = sizeof foo;
bar_size = sizeof bar;
}
When I compile it with "gcc -S", I get assembly code that appears to
store the value 42 just once:
movl $42, -52(%rbp)
and retrieves that value from the same place to copy it to foo_size and bar_size. (I'm not an expert in x86_64 assembly language, but I'm
fairly sure that's what's going on.) Please take a look at the
generated assembly code yourself, using any C compilers you like. Do
any of them store the sizes of foo and bar separately? Why do you think
it would be better to do so?
On 07/01/2024 14:48, David Brown wrote:
On 07/01/2024 01:58, Lawrence D'Oliveiro wrote:
On Sat, 06 Jan 2024 16:40:03 -0800, Keith Thompson wrote:
C uses a "declaration follows usage" rule (though not with 100%
consistency).
And putting the function result before the argument types turns out to
cause trouble when carried over to C++, when you try to express
dependencies between them. So they had to add a Pascal-style alternative
syntax, with the function result declared after the arguments.
Even pointer dereferencing should have been done with a postfix, not a
prefix operator. Consider why you need “->”: it’s purely syntactic sugar
to make things like
(*a).b
less awkward as
a->b
Whereas in Pascal, for example, there is no need for any alternative
syntax to
a^.b
There are two kinds of programming languages. There are ones that
that exist long enough and are popular enough for people to see that
the original design was not perfect and could have been done
differently, and languages that die away to irrelevance before long.
No one thinks the C way of doing things, or its syntax, is perfect -
but a lot of people think it is good enough that they can live with it.
A language has to either stick with the sub-optimal choices it made
long ago, as C has done, or it can try to make changes and suffers
from having to support new and old ideas, as C++ has done. Each
technique has its advantages and disadvantages.
I used to have that a^.b syntax (deref pointer then index).
But for a few years I've relaxed that so that the deref is done automatically:
a^.b becomes a.b
a^[b] becomes a[b]
a^(b) becomes a(b)
AFAICS, C could also relax the (*a).b or a->b syntax so that you just
do a.b. You could do that today, and nothing changes. (Of course it
would need a compiler update.)
The others don't affect C so much: pointers to arrays, that would
require (*a)[i], are rarely used. Everybody uses a[i] anyway with 'a'
being a pointer to the first element.
And it already allows, via some mysterious rules, for (*a)(b) to be
written as a(b).
Bart <bc@freeuk.cm> writes:
It does however still bother me that a mere typedef, not actually used
for anything, could take up runtime resources.
It's not the typedef that takes up runtime resources. It's the array
type to which the typedef refers.
Consider this:
printf("%zu\n", sizeof (int[rand() % 10 + 1]));
That's a perfectly valid usage.
Earlier in this thread, you seemed to have the misconception that the
length of a VLA object could change during the object's lifetime.
So don't do that.
An int parameter can have a value up to INT_MAX. If you don't want a
stack overflow, then *write your code* so that n can't be too big.
malloc() can take any value up to SIZE_MAX.
Don't write code that
can actually attempt to allocate that much memory.
No chance of what? Were you planning to implement C VLAs? You don't
seem to understand them well enough to use them, let alone to implement
them.
I seriously urge you to contact jacob navia, the author and maintainer
of lcc-win, and let him know about this bug.
On 07/01/2024 23:51, Keith Thompson wrote:
Bart <bc@freeuk.cm> writes:
Here's a small non-executable example:
#include <stddef.h>
size_t foo_size;
size_t bar_size;
void func(void) {
int n = 42;
typedef int vla[n];
vla foo;
vla bar;
This isn't how VLAs are typically used. I think there are three kinds of people when it comes to VLAs:
(1) Those who don't know about them and only inadvertently create
them because the dimension is a variable, not a compile-time
expression
(2) Those who know about VLAs, but who will supply the dimensions to
the variable declaration
(3) A minority who know that VLA dimensions can also be used in
typedefs.
In the case of (2), what happens is that some storage is reserved for a
size. A clever compiler can use one lot of storage if it realises
multiple variables use that same size.
But whatever it is, you saying that bit of storage is to do with the
type; I'm saying it's to do with the instance of that type. In the end
it probably doesn't matter.
It does however still bother me that a mere typedef, not actually used
for anything, could take up runtime resources.
At the next level up, if row_count and col_count are really not known
until runtime (your example would be better off as enums), then yes,
there would be 1000 rows, and each row is a 2000-element array
containing its length.
If row_count and col_count were enums, they would be constants, and
vla_2d would not be a VLA. The whole point is that we're talking about
VLAs.
This is actually an important point. Many examples of VLAs I've seen are exactly like your example, due to point (1) above.
Yes, VLAs can result in stack overflows. So can ordinary fixed-length
arrays.
VLAs can do so much more easily:
void F(int n) { int A[n];}
What are the likely values of n? Without VLAs you have to knowingly use
large fixed values of n, and/or rely on deep recursion, to get overflow.
If you define a VLA that you know can't be longer than N
elements, then that's no more dangerous that defining an ordinary array
with a length of exactly N.
The compiler likely doesn't know. Some compilers (like gcc on Windows)
emit a call like __chkstk() when allocating local storage that it
either knows will exceed 4KB, or that might do so (as with a VLA, even
if n turns out to be only 3).
This is clearly a bug in lccwin. Feel free to report it to jacob navia
and/or post to comp.compilers.lcc. I don't think he'd be interested in
hearing about it from me.
This is important too. If someone that experienced with implementing C
has some trouble, then I've got no chance.
(BTW why did VLAs become optional from C11?)
On 07/01/2024 16:34, Bart wrote:
I used to have that a^.b syntax (deref pointer then index).
But for a few years I've relaxed that so that the deref is done
automatically:
a^.b becomes a.b
a^[b] becomes a[b]
a^(b) becomes a(b)
AFAICS, C could also relax the (*a).b or a->b syntax so that you
just do a.b. You could do that today, and nothing changes. (Of course
it would need a compiler update).
You could, in the sense that (AFAICS) there would be no situation where
in code today "a.b" and "a->b" were both syntactically and semantically correct but meant different things. Then you could have a compiler
treat the syntax or constraint error "a.b" as intending to mean "a->b".
I don't think it would be a good idea - I think it just adds confusion because you easily lose track of what are structs and what are pointers
to structs.
I'd rather it be an error when you get these wrong in the
code.
My personal preference is either to say that everything
is always a reference (like Python), or everything is always a value
(like C) and do the dereferencing explicitly. Other people may think
such automatic dereferencing is a good idea, but I personally don't.
The others don't affect C so much: pointers to arrays, that would
require (*a)[i], are rarely used. Everybody uses a[i] anyway with 'a'
being a pointer to the first element.
And it already allows, via some mysterious rules, for (*a)(b) to be
written as a(b).
Think of it rather as C allows you to write function calls like
"foo(x)", and that considering function names as being function pointers
is a natural view that is easy to implement in compilers and keeps the C
to assembly conversion as lean as possible - it means "foo" is the
address of the function, rather than being the function itself. Being
able to write "foo(x)" as "(*foo)(x)" is just a byproduct of this - it
would need extra rules added to C to disallow it.
On 08/01/2024 02:32, Bart wrote:
On 07/01/2024 23:51, Keith Thompson wrote:
Bart <bc@freeuk.cm> writes:
Here's a small non-executable example:
#include <stddef.h>
size_t foo_size;
size_t bar_size;
void func(void) {
int n = 42;
typedef int vla[n];
vla foo;
vla bar;
This isn't how VLAs are typically used. I think there are three kinds
of people when it comes to VLAs:
(1) Those who don't know about them and only inadvertently create
them because the dimension is a variable, not a compile-time
expression
(2) Those who know about VLAs, but who will supply the dimensions to
the variable declaration
(3) A minority who know that VLA dimensions can also be used in
typedefs.
I would have thought that a majority of C programmers know that an array
size can be used in a typedef.
An even cleverer compiler usually doesn't need to store the size at all, anywhere. A typical real-world implementation of a VLA like "T xs[n]"
will mean preserving the current stack pointer, then subtracting
sizeof(T) * n.
(Optimising register allocation, stack slot
usage, variable lifetimes, etc., - /that/ is hard work. Adding
another constant variable to the function is not.)
(Since you are not trying to make a conforming compiler, you could quite reasonably allow such VLA's, treating them identically to normal arrays, while disallowing VLA's whose size is not known until run time.)
What are the likely values of n? Without VLAs you have to knowingly
use large fixed values of n, and/or rely on deep recursion, to get
overflow.
This is a myth that is regularly trotted out by people who, for unknown reasons, don't like VLAs. They pretend that somehow heap allocation is "safer" because malloc will return 0 if an allocation fails, while VLA's
have no such mechanism. In reality, if you were to write "malloc(n)"
with no idea what "n" might be, your program is as hopelessly unsound as
one that writes "int A[n];" with no idea what "n" might be.
Keith wrote "that /you/ know can't be longer than N". The programmer is responsible for writing correct code, not the compiler.
/You/ are experienced in implementing C. But both you and Jacob suffer
from the same problem here - you are trying to do everything yourself.
Bart <bc@freeuk.cm> writes:
This is interesting:
unsigned long long int n=SIZE_MAX;
int A[n];
printf("%zu\n",sizeof(A));
printf("%zu\n",SIZE_MAX);
This outputs (with gcc):
18446744073709551612
18446744073709551615
It goes wrong if you try to write to A beyond whatever the stack size is.
But if you don't want to contact him, I'll do it myself. I'll post to comp.compilers.lcc, which he followed as of a few years ago. I can
credit you for finding the bug or leave you out of it, whichever you
prefer.
It's not even clear if it is actually wrong in your example other than
using more memory than expected. Look at the output of the gcc example
above: what's with the missing 3 bytes?
What missing 3 bytes? The gcc output looked correct to me.
On 08/01/2024 18:25, Keith Thompson wrote:
Bart <bc@freeuk.cm> writes:
This is interesting:
unsigned long long int n=SIZE_MAX;
int A[n];
printf("%zu\n",sizeof(A));
printf("%zu\n",SIZE_MAX);
This outputs (with gcc):
18446744073709551612
18446744073709551615
It goes wrong if you try to write to A beyond whatever the stack size
is.
But if you don't want to contact him, I'll do it myself. I'll post to
comp.compilers.lcc, which he followed as of a few years ago. I can
credit you for finding the bug or leave you out of it, whichever you
prefer.
Best not to mention me; I'm already well-known to him as a bug-finder.
It's not even clear if it is actually wrong in your example other than
using more memory than expected. Look at the output of the gcc example
above: what's with the missing 3 bytes?
What missing 3 bytes? The gcc output looked correct to me.
Look carefully at the last digit. It should be '5' for SIZE_MAX, but
sizeof() reports a value with '2' at the end.
On 08/01/2024 12:50, David Brown wrote:
On 07/01/2024 16:34, Bart wrote:
I used to have that a^.b syntax (deref pointer then index).
But for a few years I've relaxed that so that the deref is done
automatically:
a^.b becomes a.b
a^[b] becomes a[b]
a^(b) becomes a(b)
AFAICS, C could also relax the (*a).b or a->b syntax so that you
just do a.b. You could do that today, and nothing changes. (Of course
it would need a compiler update).
You could, in the sense that (AFAICS) there would be no situation
where in code today "a.b" and "a->b" were both syntactically and
semantically correct but meant different things. Then you could have
a compiler treat the syntax or constraint error "a.b" as intending to
mean "a->b".
I don't think it would be a good idea - I think it just adds confusion
because you easily lose track of what are structs and what are
pointers to structs.
Yet this is exactly what happens with those other examples: you don't
know if the X in X[i] has type T[] or T*. (The use of (*X)[i] when X is
of type T(*)[] is rare.)
And you don't know if the F in F(x) is an actual function, or a pointer
to a function.
The "->" alternate is anyway a little strange:
(*P).m can be written as P->m
(**Q).m can only be reduced to (*Q)->m
So it only works on the last lot of indirection. There is also no
equivalent of just (*P); "->" needs to specify a member name as it
combines two operations.
I'd rather it be an error when you get these wrong in the code.
I had the same misgivings: there is a loss of transparency, but after I started using the auto-deref, the benefits outweighed that:
Code was remarkably free of clutter. (And in my case, I had sections of
code that could often be ported as-is to/from my other language that
didn't need those derefs.)
My personal preference is either to say that everything is always a
reference (like Python), or everything is always a value (like C) and
do the dereferencing explicitly. Other people may think such
automatic dereferencing is a good idea, but I personally don't.
This can occur with reference parameters too: I believe you get the same thing in C++.
The others don't affect C so much: pointers to arrays, that would
require (*a)[i], are rarely used. Everybody uses a[i] anyway with 'a'
being a pointer to the first element.
And it already allows, via some mysterious rules, for (*a)(b) to be
written as a(b).
Think of it rather as C allows you to write function calls like
"foo(x)", and that considering function names as being function
pointers is a natural view that is easy to implement in compilers and
keeps the C to assembly conversion as lean as possible - it means
"foo" is the address of the function, rather than being the function
itself. Being able to write "foo(x)" as "(*foo)(x)" is just a
byproduct of this - it would need extra rules added to C to disallow it.
It's worse than that.
Given an actual function:
void F(void){}
all of these calls are valid:
(&F)();
F();
(*F)();
(**F)();
(***F)();
(****F)(); // etc
across the half-dozen compilers I tried. Except for my 'mcc', which only allows up to (*F)(), and not (**F)() or beyond. I just thought it was silly.
David Brown <david.brown@hesbynett.no> writes:
On 08/01/2024 02:32, Bart wrote:[...]
(BTW why did VLAs become optional from C11?)
I don't know for sure. But I do know that not all C implementations
use a stack, and for some targets a "true VLA" (as distinct from a VLA
where the size is known at compile time) would be extremely
inefficient to implement. It is possible that this has something to
do with it.
I'm skeptical that that's the reason. Almost all C implementations do
use a "stack", in the sense of a contiguously allocated region of memory
in which automatic objects are allocated, with addresses uniformly
increasing or decreasing for new allocations. (All C implementations
use a "stack" in the sense of a last-in/first-out data structure.) I'm
not aware that any implementers of non-stack implementations objected to VLAs. For that matter, I don't know why VLAs would be extremely
inefficient in such an implementation. They need to have the ability to allocate new stack frames anyway, and determining the allocation size at
run time shouldn't be a huge burden.
I've seen suggestions that the intent was to make it possible to create conforming C implementations for small embedded processors. That might
make sense, but it could have been addressed by making VLAs optional
only for freestanding implementations.
I think Microsoft has chosen not to implement VLAs in their C compiler.
On 08/01/2024 15:00, David Brown wrote:
On 08/01/2024 02:32, Bart wrote:
On 07/01/2024 23:51, Keith Thompson wrote:
Bart <bc@freeuk.cm> writes:
Here's a small non-executable example:
#include <stddef.h>
size_t foo_size;
size_t bar_size;
void func(void) {
int n = 42;
typedef int vla[n];
vla foo;
vla bar;
This isn't how VLAs are typically used. I think there are three kinds
of people when it comes to VLAs:
(1) Those who don't know about them and only inadvertently create
them because the dimension is a variable, not a compile-time
expression
(2) Those who know about VLAs, but who will supply the dimensions to
the variable declaration
(3) A minority who know that VLA dimensions can also be used in
typedefs.
I would have thought that a majority of C programmers know that an
array size can be used in a typedef.
Even that, having a fixed size array as a type, would be unusual, and typically used for small arrays with special uses, for example:
typedef float vector[4];
typedef float matrix[4][4];
typedef quad byte[4]; // an actual one of mine
I can't even think of a use for such an array to have a size known at runtime, unless somebody does this:
const int size = 4;
typedef float vector[size];
typedef float matrix[size][size];
(How many even know that you can typedef an actual function, not just a function pointer:
#include <stdio.h>
typedef int Op(int a, int b);
Op add {return a+b;}
Op sub {return a-b;}
Op mul {return a*b;}
Op div {return a/b;}
int main(void) {
printf("%d\n", add(2, mul(3,4)));
}
Here, most compilers accept that typedef.
However only two allow you to
use that typedef in the way shown, to provide a common template for a function signature: Tiny C, and mine. Those produce a program that
displays 14.)
An even cleverer compiler usually doesn't need to store the size at
all, anywhere. A typical real-world implementation of a VLA like "T
xs[n]" will mean preserving the current stack pointer, then
subtracting sizeof(T) * n.
That seems simple enough with one VLA in a function and well-structured
flow. In practice there could be a dozen active VLAs; there could be
loops so that some VLAs are destroyed and recreated at a different
size; there could be gotos in and out of blocks [jumping into a VLA's
scope is an error; jumping out isn't, but a compiler needs to detect
all this], early returns, and so on.
(Optimising register allocation, stack slot usage, variable lifetimes,
etc., - /that/ is hard work. Adding another constant variable to
the function is not.)
That's also optional; you can make that as complex or simple as you
like. With a VLA the options are fewer.
(Since you are not trying to make a conforming compiler, you could
quite reasonably allow such VLA's, treating them identically to normal
arrays, while disallowing VLA's whose size is not known until run
time.)
Any VLAs I might implement would use heap allocation (but that
introduces other matters of implicit calls to support functions that I
would prefer to keep out of a C implementation).
What are the likely values of n? Without VLAs you have to knowingly
use large fixed values of n, and/or rely on deep recursion, to get
overflow.
This is a myth that is regularly trotted out by people who, for
unknown reasons, don't like VLAs. They pretend that somehow heap
allocation is "safer" because malloc will return 0 if an allocation
fails, while VLA's have no such mechanism. In reality, if you were to
write "malloc(n)" with no idea what "n" might be, your program is as
hopelessly unsound as one that writes "int A[n];" with no idea what
"n" might be.
Except that the memory that malloc can draw on can be 1000 times larger
than stack memory.
In an earlier example, creating a VLA with int A[SIZE_MAX] didn't fail
(it would only do so if trying to write beyond the stack size), but malloc(SIZE_MAX) did return NULL.
Keith wrote "that /you/ know can't be longer than N". The programmer
is responsible for writing correct code, not the compiler.
Say you write a library function that looks like this:
int F(int n) {
int A[n];
....
or like this:
bool G(char* s) {
int A[strlen(s)];
....
You don't know who or what will call your function; what checks would
you insert?
What should your docs say about the range of n? You'd want the range to
be as high as possible, or to work with as long strings as possible.
At the same time, you want them to be efficient when n or strlen(s) is
small, which may be most of the time; you don't want the extra overheads
of VLAs, and on gcc-Windows, there can be extra overheads.
(My solution to those examples is extra code to either use a small, fixed-length array, or use the heap, depending on N.)
/You/ are experienced in implementing C. But both you and Jacob
suffer from the same problem here - you are trying to do everything
yourself.
Above you say that VLAs are simple to implement. Here you suggest it
might be too much for an individual.
Although two persons who don't know how it works won't help! You
probably mean looking for existing solutions and implementations from somebody who eventually figured it out.
I prefer language features that don't present difficulties. The only open-ended aspect of compilation I will acknowledge, is back-end optimisation. And I said that there, you can go as far as you like.
On 08/01/2024 16:53, Bart wrote:
This can occur with reference parameters too: I believe you get the
same thing in C++.
Not quite - that's an easy and common misunderstanding. (It is even
more understandable for you, since your language uses "ref" to mean what
is called a "pointer" in C and C++.) References in C++ are not "auto-dereferenced pointers" - they are alternative names for objects.
It is better to think of references as being ways to identify objects,
and pointers as being indirect references. C++ references are /not/ pointers.
It's worse than that.
No, it's not worse - it's just a side-effect of the convenience of notation. Disallowing some of the forms you show below would require
extra rules, adding complication to the standards,
while allowing them
is harmless (since people would not use them in code).
On 08/01/2024 16:53, Bart wrote:
On 08/01/2024 12:50, David Brown wrote:
On 07/01/2024 16:34, Bart wrote:
I used to have that a^.b syntax (deref pointer then index).
But for a few years I've relaxed that so that the deref is done
automatically:
a^.b becomes a.b
a^[b] becomes a[b]
a^(b) becomes a(b)
AFAICS, C could also relax the (*a).b or a->b syntax so that you
just do a.b. You could do that today, and nothing changes. (Of course
it would need a compiler update).
You could, in the sense that (AFAICS) there would be no situation
where in code today "a.b" and "a->b" were both syntactically and
semantically correct but meant different things. Then you could have
a compiler treat the syntax or constraint error "a.b" as intending to
mean "a->b".
I don't think it would be a good idea - I think it just adds confusion
because you easily lose track of what are structs and what are
pointers to structs.
Yet this is exactly what happens with those other examples: you don't
know if the X in X[i] has type T[] or T*. (The use of (*X)[i] when X is
of type T(*)[] is rare.)
And you don't know if the F in F(x) is an actual function, or a pointer
to a function.
The "->" alternate is anyway a little strange:
(*P).m can be written as P->m
(**Q).m can only be reduced to (*Q)->m
So it only works on the last lot of indirection. There is also no
equivalent of just (*P); "->" needs to specify a member name as it
combines two operations.
Yes, the short-cut only works for the (by far) most common case.
I'd rather it be an error when you get these wrong in the code.
I had the same misgivings: there is a loss of transparency, but after I
started using the auto-deref, the benefits outweighed that:
Code was remarkably free of clutter. (And in my case, I had sections of
code that could often be ported as-is to/from my other language that
didn't need those derefs.)
I didn't like automatic dereferencing (but I could live with it -
overall I found Delphi a very productive tool). But that's a
preference, and it is no surprise that other people have different preferences.
My personal preference is either to say that everything is always a
reference (like Python), or everything is always a value (like C) and
do the dereferencing explicitly. Other people may think such
automatic dereferencing is a good idea, but I personally don't.
This can occur with reference parameters too: I believe you get the same
thing in C++.
Not quite - that's an easy and common misunderstanding. (It is even
more understandable for you, since your language uses "ref" to mean what
is called a "pointer" in C and C++.) References in C++ are not "auto-dereferenced pointers" - they are alternative names for objects.
There is no way to distinguish a C++ reference from an
"auto-dereferenced pointer", though. [...]
Indeed. I do believe that absent standardization, using a macro
adds a level of indirection that may adversely affect the code
readability (whether it is ARRAY_SIZE, ARRAY_LENGTH, LENGTH, NUM_ELEMENTS,
or whatever, I would need to refer to the macro definition to
determine the intent of the programmer).
On 01.01.2024 01:07, Tim Rentsch wrote:
[...]
Thanks for your post and suggestions.
Some have already been addressed (and some also answered) in
this thread, [...]
On 08/01/2024 19:50, David Brown wrote:
On 08/01/2024 16:53, Bart wrote:
This can occur with reference parameters too: I believe you get the
same thing in C++.
Not quite - that's an easy and common misunderstanding. (It is even
more understandable for you, since your language uses "ref" to mean
what is called a "pointer" in C and C++.) References in C++ are not
"auto-dereferenced pointers" - they are alternative names for objects.
It is better to think of references as being ways to identify objects,
and pointers as being indirect references. C++ references are /not/
pointers.
They look like auto-dereferenced pointers to me:
It's worse than that.
No, it's not worse - it's just a side-effect of the convenience of
notation. Disallowing some of the forms you show below would require
extra rules, adding complication to the standards,
My MCC compiler seems to manage it (I'm not sure how), and my main
compiler does even better: you can't even do the equivalent of (*F)(), because F is not a pointer to anything and can't be dereferenced.
(Yes, F by itself is still equal to &F, but not F().)
Whatever extra rules or logic are involved, they are insignificant
compared to those for VLAs, or for mixed sign arithmetic, for the
minimum groupings of {} around init data, or the algorithm for searching
for include files, ...
On 09/01/2024 02:05, Bart wrote:
On 08/01/2024 19:50, David Brown wrote:
On 08/01/2024 16:53, Bart wrote:
This can occur with reference parameters too: I believe you get the
same thing in C++.
Not quite - that's an easy and common misunderstanding. (It is even
more understandable for you, since your language uses "ref" to mean
what is called a "pointer" in C and C++.) References in C++ are not
"auto-dereferenced pointers" - they are alternative names for
objects. It is better to think of references as being ways to
identify objects, and pointers as being indirect references. C++
references are /not/ pointers.
They look like auto-dereferenced pointers to me:
I know they look like that at first glance, but they are not auto-dereferenced pointers. It can sometimes be useful to understand
that when you move references around (such as for function parameters),
they are implemented as though they were a special kind of pointer -
that tells you how efficient they are. But conceptually, for their use
and understanding, I don't think it is helpful.
It's worse than that.
No, it's not worse - it's just a side-effect of the convenience of
notation. Disallowing some of the forms you show below would require
extra rules, adding complication to the standards,
My MCC compiler seems to manage it (I'm not sure how), and my main
compiler does even better: you can't even do the equivalent of (*F)(),
because F is not a pointer to anything and can't be dereferenced.
How is that in any way "better"? You have gone out of your way to make
your compiler non-conforming for the sake of stopping something no one
would ever write? It all sounds a bit pointless (but harmless) to me.
(Yes, F by itself is still equal to &F, but not F().)
Whatever extra rules or logic are involved, they are insignificant
compared to those for VLAs, or for mixed sign arithmetic, for the
minimum groupings of {} around init data, or the algorithm for
searching for includes files, ...
Whataboutism is a strong sign that the person arguing has lost track of
their point.
On 09/01/2024 07:30, David Brown wrote:
On 09/01/2024 02:05, Bart wrote:
On 08/01/2024 19:50, David Brown wrote:
On 08/01/2024 16:53, Bart wrote:
This can occur with reference parameters too: I believe you get the
same thing in C++.
Not quite - that's an easy and common misunderstanding. (It is even
more understandable for you, since your language uses "ref" to mean
what is called a "pointer" in C and C++.) References in C++ are not
"auto-dereferenced pointers" - they are alternative names for
objects. It is better to think of references as being ways to
identify objects, and pointers as being indirect references. C++
references are /not/ pointers.
They look like auto-dereferenced pointers to me:
I know they look like that at first glance, but they are not
auto-dereferenced pointers. It can sometimes be useful to understand
that when you move references around (such as for function
parameters), they are implemented as though they were a special kind
of pointer - that tells you how efficient they are. But conceptually,
for their use and understanding, I don't think it is helpful.
It's worse than that.
No, it's not worse - it's just a side-effect of the convenience of
notation. Disallowing some of the forms you show below would
require extra rules, adding complication to the standards,
My MCC compiler seems to manage it (I'm not sure how), and my main
compiler does even better: you can't even do the equivalent of
(*F)(), because F is not a pointer to anything and can't be
dereferenced.
How is that in any way "better"? You have gone out of your way to
make your compiler non-conforming for the sake of stopping something
no one would ever write? It all sounds a bit pointless (but harmless)
to me.
(Yes, F by itself is still equal to &F, but not F().)
Whatever extra rules or logic are involved, they are insignificant
compared to those for VLAs, or for mixed sign arithmetic, for the
minimum groupings of {} around init data, or the algorithm for
searching for include files, ...
Whataboutism is a strong sign that the person arguing has lost track
of their point.
Not at all. You are defending some crazy anomaly, that you find nowhere
else, for some insubstantial reason.
It works like that because the language was poorly designed in the first place and nobody thought it worthwhile fixing it.
Now C does something odd: if T happens to be a function type, it
immediately turns that result back into *T.
On 09/01/2024 12:11, Bart wrote:
Not at all. You are defending some crazy anomaly, that you find
nowhere else, for some insubstantial reason.
I'm saying it doesn't matter. I have never heard it mentioned,
anywhere, except by you. It is completely irrelevant.
It works like that because the language was poorly designed in the
first place and nobody thought it worthwhile fixing it.
It works like that because the language was /well/ designed, for its
purpose at the time, with the requirements and limitations of the time.
Remember, as you always seem to forget, C was not designed with the sole purpose of making /your/ life as easy as possible, or suiting /your/ particular preferences. The "(***foo)" anomaly is a side-effect of the
way functions work in C. Leaving it in costs nothing, removing it would
be an effort and an inconvenience.
Do you /really/ think that, if this were important, no one would have suggested disallowing it? Do you /really/ think that, if this were
leading to poor code, misunderstandings or mistakes, that compiler
writers would not be able to add warnings about it? Do you /really/
think you are so vastly superior to everyone else in the C world that
you alone see the problem?
Sometimes you are like that old joke of the guy driving the wrong way
down the motorway, complaining that everyone else is wrong.
On 09/01/2024 14:56, David Brown wrote:
On 09/01/2024 12:11, Bart wrote:
Not at all. You are defending some crazy anomaly, that you find
nowhere else, for some insubstantial reason.
I'm saying it doesn't matter. I have never heard it mentioned,
anywhere, except by you. It is completely irrelevant.
So you see this in code:
(*F)(x);
and don't assume that F must be a pointer to function; why not? I
thought you wanted that transparency?
On 2024-01-09, Bart <bc@freeuk.cm> wrote:
On 09/01/2024 14:56, David Brown wrote:
On 09/01/2024 12:11, Bart wrote:
Not at all. You are defending some crazy anomaly, that you find
nowhere else, for some insubstantial reason.
I'm saying it doesn't matter. I have never heard it mentioned,
anywhere, except by you. It is completely irrelevant.
So you see this in code:
(*F)(x);
and don't assume that F must be a pointer to function; why not? I
thought you wanted that transparency?
The use of (*pf)(args, ...) is unnecessary and in my experience, rare.
It's a style used to emphasize that this is an indirect call.
I probably first saw that ages ago in some XWindow sources or examples.
I don't think I've ever seen that used on a function, as in
(*strcmp)(str, "foo")
which would be a kind of misuse, like a deliberately misleading comment.
(Was that explicit dereference ever required in some versions of C or C compilers?)
(*pf)(args, ...) style for indirections does have something
to recommend it. If it is consistently and appropriately used, it
clearly indicates the function indirections.
On 09/01/2024 18:46, Bart wrote:
On 09/01/2024 14:56, David Brown wrote:
On 09/01/2024 12:11, Bart wrote:
Not at all. You are defending some crazy anomaly, that you find
nowhere else, for some insubstantial reason.
I'm saying it doesn't matter. I have never heard it mentioned,
anywhere, except by you. It is completely irrelevant.
So you see this in code:
(*F)(x);
and don't assume that F must be a pointer to function; why not? I
thought you wanted that transparency?
Yes, I assume F is a pointer to a function - because I assume, unless
proven otherwise, that the author of the code is not a complete moron or
an evil maniac doing his or her best to confuse people.
(I haven't bothered responding to the rest of your post, because I can't
see a realistic and accurate response without sounding nasty, and I
don't want to do that. Suffice to say that C is inordinately successful
as a language, and your language is not - perhaps there are good reasons
for that.)
to an "unbounded" array... the insanity of evaluating to 0 when given a pointer/reference
BC:
DB:..it is low priority because it is never used that way.
That's the kind of potentially confusing short-cut you can make when
only you ever use the language.
Bart <bc@freeuk.cm> writes:
[...]
I did then try to fix that, but stopped because of interactions with
actual zero-length arrays (I would need to be fully immersed to make
the changes).
Then the next day I found that C doesn't even /have/ zero-length
arrays: you can't have an empty array; how about that? I thought C
was zero-based!
Have you considered learning the language before trying to implement it?
The use of (*pf)(args, ...) is unnecessary and in my experience, rare.
It's a style used to emphasize that this is an indirect call.
I probably first saw that ages ago in some XWindow sources or examples.
I don't think I've ever seen that used on a function, as in
(*strcmp)(str, "foo")
which would be a kind of misuse, like a deliberately misleading comment.
(Was that explicit dereference ever required in some versions of C or C compilers?)
This is the funny thing about C (well one of many funny things):
* It has types which are pointers to Arrays, Structs and Functions
* It has formal syntax to dereference all of those
* But, that syntax is never used!
This is what I mean:
Type Formal Syntax Idiomatic Use
A Pointer to Array (*A)[i] A[i]
P Pointer to Struct (*P).m P->m
F Pointer to Function (*F)(x) F(x)
You only ever see syntax in the last column, hardly ever that formal style.
Which is probably just as well, as it's pretty ugly, and harder to type.
With arrays, people don't even bother with actual array pointers.
Structs have that funny -> operator, and Functions have those special rules.
At least, with P->m, you know that P must be a pointer to a struct, but
you can't tell with those others.
Kaz Kylheku <433-929-6894@kylheku.com> writes:
On 2024-01-09, Bart <bc@freeuk.cm> wrote:
Now C does something odd: if T happens to be a function type, it
immediately turns that result back into *T.
That's one way to have distinct concepts of function type and pointer to
that type, while preserving most the assembly language semantics of
working with functions as addresses.
On my x86_64 system with gcc, these lines of C:
func();
funcptr();
result in these call instructions:
call func
call *%rdx
This is the funny thing about C (well one of many funny things):
* It has types which are pointers to Arrays, Structs and Functions
* It has formal syntax to dereference all of those
* But, that syntax is never used!
This is what I mean:
Type Formal Syntax Idiomatic Use
A Pointer to Array (*A)[i] A[i]
On 09/01/2024 21:15, Keith Thompson wrote:
Bart <bc@freeuk.cm> writes:
[...]
I did then try to fix that, but stopped because of interactions with
actual zero-length arrays (I would need to be fully immersed to make
the changes).
Then the next day I found that C doesn't even /have/ zero-length
arrays: you can't have an empty array; how about that? I thought C
was zero-based!
Have you considered learning the language before trying to implement it?
Which version of C do you suggest; the one where:
#include <stdio.h>
int main(void) {
int A[0];
int B[]={};
printf("%zu\n", sizeof(A));
printf("%zu\n", sizeof(B));
}
compiles fine with tcc, gcc and clang, and displays 0 for the sizes?
Or the one where those produce warnings? Or the one where it actually fails?
On 09/01/2024 18:56, David Brown wrote:
On 09/01/2024 18:46, Bart wrote:
On 09/01/2024 14:56, David Brown wrote:
On 09/01/2024 12:11, Bart wrote:
Not at all. You are defending some crazy anomaly, that you find
nowhere else, for some insubstantial reason.
I'm saying it doesn't matter. I have never heard it mentioned,
anywhere, except by you. It is completely irrelevant.
So you see this in code:
(*F)(x);
and don't assume that F must be a pointer to function; why not? I
thought you wanted that transparency?
Yes, I assume F is a pointer to a function - because I assume, unless
proven otherwise, that the author of the code is not a complete moron or
an evil maniac doing his or her best to confuse people.
Maybe F used to be a pointer, but is now a normal function, and the code
was not updated. Or maybe a normal function named F is now shadowing the
more global function pointer F.
Don't tell me: gcc has some option to warn of using octal literals. So
your 'successful' language relies on extensive extra tools (all the
stuff in that support truck) to keep it useable.
Bart <bc@freeuk.cm> writes:
Which version of C do you suggest; the one where:
#include <stdio.h>
int main(void) {
int A[0];
int B[]={};
printf("%zu\n", sizeof(A));
printf("%zu\n", sizeof(B));
}
compiles fine with tcc, gcc and clang, and displays 0 for the sizes?
Or the one where those produce warnings? Or the one where it actually fails?
All of them. You've been told over, and over, and over.
$ cc --pedantic-errors -o /tmp/a /tmp/t.c
/tmp/t.c: In function 'main':
/tmp/t.c:4:7: error: ISO C forbids zero-size array 'A' [-Wpedantic]
int A[0];
^
/tmp/t.c:5:11: error: ISO C forbids empty initializer braces [-Wpedantic]
int B[]={};
^
/tmp/t.c:5:7: error: zero or negative size array 'B'
int B[]={};
^
$
On 09/01/2024 21:15, Keith Thompson wrote:
Bart <bc@freeuk.cm> writes:
[...]
I did then try to fix that, but stopped because of interactions with
actual zero-length arrays (I would need to be fully immersed to make
the changes).
Then the next day I found that C doesn't even /have/ zero-length
arrays: you can't have an empty array; how about that? I thought C
was zero-based!
Have you considered learning the language before trying to implement
it?
Which version of C do you suggest; the one where:
#include <stdio.h>
int main(void) {
int A[0];
int B[]={};
printf("%zu\n", sizeof(A));
printf("%zu\n", sizeof(B));
}
compiles fine with tcc, gcc and clang, and displays 0 for the sizes?
Or the one where those produce warnings? Or the one where it actually fails?
So, if C doesn't have zero-length arrays, why don't I get a fatal error?
The fact that you didn't get the required diagnostic message is because
none of those programs conforms to any version of the C standard in its
default mode. Since you've made it clear that putting them into
conforming mode is too complicated for you to understand,
On 09/01/2024 22:37, James Kuyper wrote:
The fact that you didn't get the required diagnostic message is because
none of those programs conforms to any version of the C standard in its
default mode. Since you've made it clear that putting them into
conforming mode is too complicated for you to understand,
So why doesn't a /C/ compiler put itself into conforming mode?
It seems THAT is too hard for it to understand!
Seriously, why don't they do that?
Then lots of people who don't bother with those fiddly options don't
get the wrong impression about what C actually allows.
Bart <bc@freeuk.cm> writes:
On 09/01/2024 22:37, James Kuyper wrote:
The fact that you didn't get the required diagnostic message is because
none of those programs conforms to any version of the C standard in it's >>> default mode. Since you've made it clear that putting them into
conforming mode is too complicated for you to understand,
So why doesn't a /C/ compiler put itself into conforming mode?
It seems THAT is too hard for it to understand!
Seriously, why don't they do that?
Seriously, why are you asking us?
Then lots of people who don't bother with those fiddly options don't
get the wrong impression about what C actually allows.
That's a question about C compilers, not about the C language. I don't
think anyone here works on any of the major C compilers, so you're not
likely to get a definitive answer here.
You have your own C or C-like compiler, don't you? Does it attempt to
be fully conforming by default? Don't you have your own reasons for
that decision?
I *think* the general attitude of the gcc maintainers has been that
gcc's useful extensions are more important than ISO C conformance, and
that there's usually not much reason to use a C compiler other than gcc.
That's based on my vague memory of something I read some years ago.
clang tries to be closely compatible with gcc. tcc is small enough that
it doesn't claim to be conforming.
However, gcc does make it easy (yes, it's easy)
On 09/01/2024 22:37, James Kuyper wrote:
The fact that you didn't get the required diagnostic message is because
none of those programs conforms to any version of the C standard in its
default mode. Since you've made it clear that putting them into
conforming mode is too complicated for you to understand,
So why doesn't a /C/ compiler put itself into conforming mode?
Then lots of people who don't bother with those fiddly options don't get
the wrong impression about what C actually allows.
On 10/01/2024 00:05, Keith Thompson wrote:
Bart <bc@freeuk.cm> writes:
An easy compiler is one where you just do:
gcc prog
and not, at the very least:
gcc prog.c -o prog.exe -std=c11 -pedantic-errors
An easy compiler is one where you just do:
gcc prog
and not, at the very least:
gcc prog.c -o prog.exe -std=c11 -pedantic-errors
Meanwhile I routinely test C programs on half a dozen compilers. I can't
be bothered with all this crap.
On 10/01/2024 00:05, Keith Thompson wrote:
Bart <bc@freeuk.cm> writes:
The first version did a better job than gcc of outlawing what I
considered out-dated or dangerous features WITHOUT NEEDING TO BE TOLD.
However because some legacy programs still used them (often
inadvertently because people simply didn't know about them; /their/
compiler said nothing), it was necessary to opt-in to build them.
On the update I did a few months ago, I decided I didn't care any more.
The option was removed. C is a lost cause.
I *think* the general attitude of the gcc maintainers has been that
gcc's useful extensions are more important than ISO C conformance, and
that there's usually not much reason to use a C compiler other than gcc.
That's based on my vague memory of something I read some years ago.
clang tries to be closely compatible with gcc. tcc is small enough that
it doesn't claim to be conforming.
However, gcc does make it easy (yes, it's easy)
An easy compiler is one where you just do:
gcc prog
and not, at the very least:
gcc prog.c -o prog.exe -std=c11 -pedantic-errors
Meanwhile I routinely test C programs on half a dozen compilers. I can't
be bothered with all this crap.
Here's a source file; it's C code; now go and compile it for me, and
tell me if there is anything badly wrong. And I mean properly wrong not
some ******* unused label which gets lumped in with a missing function prototype so that it thinks all the arguments are ints.
Bart <bc@freeuk.cm> writes:
You clearly do care a great deal. You care enough to spend a whole lot
of time and effort posting in a newsgroup that discusses the language
you say you don't care about, and refusing to learn things that we tell
you over and over again.
But I know how to make it do so, and I don't forget the relevant options
5 minutes after someone tells me about them.
Meanwhile I routinely test C programs on half a dozen compilers. I
can't be bothered with all this crap.
You use half a dozen compilers, you can't be bothered to use them
properly, and you complain every time someone tries to tell you how to
do so.
Feel free to waste your time, but please please stop wasting ours.
No, it doesn't. C doesn't have zero-length arrays.
Bart <bc@freeuk.cm> writes:
On 10/01/2024 00:05, Keith Thompson wrote:
Bart <bc@freeuk.cm> writes:
An easy compiler is one where you just do:
gcc prog
and not, at the very least:
gcc prog.c -o prog.exe -std=c11 -pedantic-errors
$ functions c
function c
{
gcc -o "$1" -std=c11 -pedantic-errors "$1".c
}
$ cat a.c
#include <stdio.h>
int
main(int argc, const char **argv, const char **envp)
{
printf("Hello World\n");
return 0;
}
$ c a
$ ./a
Hello World
Can't get any more concise than that.
On 1/10/24 01:40, Bart wrote:
An easy compiler is one where you just do:
gcc prog
and not, at the very least:
gcc prog.c -o prog.exe -std=c11 -pedantic-errors
Meanwhile I routinely test C programs on half a dozen compilers. I can't
be bothered with all this crap.
So do as all of us do : put this crap in a Makefile or
a shell script, and forget about it.
Please don't get Bart started on makefiles!
On 09/01/2024 18:56, David Brown wrote:
On 09/01/2024 18:46, Bart wrote:
On 09/01/2024 14:56, David Brown wrote:
On 09/01/2024 12:11, Bart wrote:
Not at all. You are defending some crazy anomaly, that you find
nowhere else, for some insubstantial reason.
I'm saying it doesn't matter. I have never heard it mentioned,
anywhere, except by you. It is completely irrelevant.
So you see this in code:
(*F)(x);
and don't assume that F must be a pointer to function; why not? I
thought you wanted that transparency?
Yes, I assume F is a pointer to a function - because I assume, unless
proven otherwise, that the author of the code is not a complete moron
or an evil maniac doing his or her best to confuse people.
Maybe F used to be a pointer, but is now a normal function, and the code
was not updated. Or maybe a normal function named F is now shadowing the
more global function pointer F.
On Wed, 10 Jan 2024 05:28:43 -0000 (UTC), Kaz Kylheku wrote:
Please don't get Bart started on makefiles!
Does he prefer Ninja?
On 09/01/2024 21:55, Scott Lurndal wrote:
Bart <bc@freeuk.cm> writes:
Which version of C do you suggest; the one where:
#include <stdio.h>
int main(void) {
int A[0];
int B[]={};
printf("%zu\n", sizeof(A));
printf("%zu\n", sizeof(B));
}
compiles fine with tcc, gcc and clang, and displays 0 for the sizes?
Or the one where those produce warnings? Or the one where it actually
fails?
All of them. You've been told over, and over, and over.
Been told what? That I should produce a compiler that, at anyone's whim,
can either pass, fail or warn about the same piece of code?
$ cc --pedantic-errors -o /tmp/a /tmp/t.c
/tmp/t.c: In function 'main':
/tmp/t.c:4:7: error: ISO C forbids zero-size array 'A' [-Wpedantic]
int A[0];
^
/tmp/t.c:5:11: error: ISO C forbids empty initializer braces [-Wpedantic]
int B[]={};
^
/tmp/t.c:5:7: error: zero or negative size array 'B'
int B[]={};
^
$
So, DOES C HAVE ZERO-LENGTH ARRAYS OR NOT?
It's a really simple question!
Because this program easily passes:
int A[0];
This one doesn't, and without needing to use -pedantic or
-pedantic-errors:
int A[-1];
If the answer is No, why is gcc so reluctant to complain about it
compared with the -1 size?
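That asymmetry is easy to demonstrate (assuming a stock gcc: a zero-length array is accepted by default as a GNU extension and only diagnosed with -Wpedantic, while a negative size is a hard constraint violation):

```shell
# int A[0]; - accepted by default (gcc extension)
printf 'int main(void){int A[0];(void)A;return 0;}\n' > zlen.c
gcc -c zlen.c -o zlen.o && echo "A[0] accepted"

# int A[-1]; - rejected even without -pedantic
printf 'int main(void){int A[-1];(void)A;return 0;}\n' > neg.c
gcc -c neg.c -o neg.o 2>/dev/null || echo "A[-1] rejected"
```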
On 1/9/2024 9:28 PM, Kaz Kylheku wrote:
On 2024-01-10, tTh <tth@none.invalid> wrote:
On 1/10/24 01:40, Bart wrote:
An easy compiler is one where you just do:
gcc prog
and not, at the very least:
gcc prog.c -o prog.exe -std=c11 -pedantic-errors
Meanwhile I routinely test C programs on half a dozen compilers. I
can't
be bothered with all this crap.
So do as all of us do : put this crap in a Makefile or
a shell script, and forget about it.
Please don't get Bart started on makefiles!
Don't get me started about freaking out for some minutes when I failed
to use a god damn tab! My makefile would not work. God damn it! Ahhhh,
that was around 20 years ago.
On 10/01/2024 02:00, Scott Lurndal wrote:
Bart <bc@freeuk.cm> writes:
On 10/01/2024 00:05, Keith Thompson wrote:
Bart <bc@freeuk.cm> writes:
An easy compiler is one where you just do:
gcc prog
and not, at the very least:
gcc prog.c -o prog.exe -std=c11 -pedantic-errors
$ functions c
function c
{
gcc -o "$1" -std=c11 -pedantic-errors "$1".c
}
$ cat a.c
#include <stdio.h>
int
main(int argc, const char **argv, const char **envp)
{
printf("Hello World\n");
return 0;
}
$ c a
$ ./a
Hello World
Can't get any more concise than that.
Great. Now tell the gcc people that's what it should do ANYWAY.
On 2024-01-09, Bart <bc@freeuk.cm> wrote:
Don't tell me: gcc has some option to warn of using octal literals. So
your 'successful' language relies on extensive extra tools (all the
stuff in that support truck) to keep it useable.
Unfortunately, it is a cynical observation that languages that don't
require tools tend not to attract people.
On 09/01/2024 23:22, Bart wrote:
1. C is a language defined by the standards. Without further
qualification, "C" refers to the latest published ISO standard -
currently C17. But it's also fine to refer to specific standards, such
as C99. The standards define the language syntax, constraints,
required diagnostics (implementations are free to choose warnings or
hard errors), and standard library specifications. There are also
many pre-standard C versions, for which "K&R C" is /almost/ a standard.
2. Almost all C compilers implement some extensions by default. These
extensions are not C, but are compiler-specific language variants.
Some people find them useful, other people prefer to stick to standard
C. Some extensions are so widely implemented that they may be
considered a pseudo-standard, others are very compiler or target
specific.
3. Almost all C compilers have default warnings and errors that do not
match the standards requirements for C. Often they do this in both
directions - failing to issue diagnostics on things that the standards
require, and also issuing errors (halting compilation) for things that
the standard allows. Almost all C compilers allow you to tune
warnings and errors.
4. Some C compilers aim to provide conforming modes, often for several
standard versions, even though they are non-conforming (see 2 and 3
above) by default. Others don't try to conform to any particular C
standard, and can only very loosely be called a C compiler.
5. A "C implementation" needs a compiler, a standard library, headers,
a linker, perhaps an assembler, and a way to run the program on the
target. These might be provided together, or combined from different
places.
For example, Microsoft provides everything with their MSVC
tools. For gcc-based toolchains, GCC writes the compiler but does not
provide binaries. The assembler and linker often come from the
binutils project (which again does not provide binaries), but other
assemblers and linkers may be used. Various libraries may be used,
depending on the target, including glibc, newlib, musl, newlib-nano,
redlib, avrlib, and many others. Users can get the parts individually,
or more often they get packaged toolchains from Debian, Redhat, TDM,
mingw-64, MS WSL, NXP, TI, Microchip, or many others according to
their needs.
You /know/ how to make gcc and clang compile standard C. Yet you
insist on faking ignorance. You are either the most thick-witted
programmer around, or you are a dishonest troll who goes out of their
way to spread FUD. And we know you are not thick-witted.
Zero-length arrays are a gcc extension.
On 10/01/2024 08:37, David Brown wrote:
On 09/01/2024 23:22, Bart wrote:
1. C is a language defined by the standards. Without further
qualification, "C" refers to the latest published ISO standard -
currently C17. But it's also fine to refer to specific standards,
such as C99. The standards define the language syntax, constraints,
required diagnostics (implementations are free to choose warnings or
hard errors), and standard library specifications. There are also
many pre-standard C versions, for which "K&R C" is /almost/ a standard.
2. Almost all C compilers implement some extensions by default. These
extensions are not C, but are compiler-specific language variants.
Some people find them useful, other people prefer to stick to standard
C. Some extensions are so widely implemented that they may be
considered a pseudo-standard, others are very compiler or target
specific.
3. Almost all C compilers have default warnings and errors that do not
match the standards requirements for C. Often they do this in both
directions - failing to issue diagnostics on things that the standards
require, and also issuing errors (halting compilation) for things that
the standard allows. Almost all C compilers allow you to tune
warnings and errors.
4. Some C compilers aim to provide conforming modes, often for several
standard versions, even though they are non-conforming (see 2 and 3
above) by default. Others don't try to conform to any particular C
standard, and can only very loosely be called a C compiler.
5. A "C implementation" needs a compiler, a standard library, headers,
a linker, perhaps an assembler, and a way to run the program on the
target. These might be provided together, or combined from different
places.
My original implementation for Windows used these two files only:
For example, Microsoft provides everything with their MSVC tools.
For gcc-based toolchains, GCC writes the compiler but does not provide
binaries. The assembler and linker often come from the binutils
project (which again does not provide binaries), but other assemblers
and linkers may be used. Various libraries may be used, depending on
the target, including glibc, newlib, musl, newlib-nano, redlib,
avrlib, and many others. Users can get the parts individually, or
more often they get packaged toolchains from Debian, Redhat, TDM,
mingw-64, MS WSL, NXP, TI, Microchip, or many others according to
their needs.
You /know/ how to make gcc and clang compile standard C. Yet you
insist on faking ignorance. You are either the most thick-witted
programmer around, or you are a dishonest troll who goes out of their
way to spread FUD. And we know you are not thick-witted.
Suppose you were given a C program to build. There are no instructions
(or there are instructions, but they are encoded inside some script in
a proprietary language that you don't understand and don't have a tool
for).
How do you know what to tell gcc to compile it?
Suppose you took a program that you know perfectly well how to compile
with gcc, but for some reason you don't have it (maybe all instances of
gcc vanished overnight after some power blackout).
Zero-length arrays are a gcc extension.
I wonder how many thousands of lines of code it took to implement such
an extension?
On 10/01/2024 13:12, bart wrote:
For example, Microsoft provides everything with their MSVC tools.
For gcc-based toolchains, GCC writes the compiler but does not
provide binaries. The assembler and linker often come from the
binutils project (which again does not provide binaries), but other
assemblers and linkers may be used. Various libraries may be used,
depending on the target, including glibc, newlib, musl, newlib-nano,
redlib, avrlib, and many others. Users can get the parts
individually, or more often they get packaged toolchains from Debian,
Redhat, TDM, mingw-64, MS WSL, NXP, TI, Microchip, or many others
according to their needs.
No one cares.
Without the slightest doubt, I can say that if it were possible to build
it using /your/ C compiler, and does not use extensions unsupported by
gcc, then I could get it to build with gcc without much trouble.
I think the same would apply to pretty much anyone who has some
experience using gcc from the command line.
Suppose you took a program that you know perfectly well how to compile
with gcc, but for some reason you don't have it (maybe all instances
of gcc vanished overnight after some power blackout).
Now you sound as silly as that C90 fanatic who turns up here occasionally.
I wonder how many thousands of lines of code it took to implement such
an extension?
I have no idea. It was a good idea at the time, because C90 did not
have flexible array members. It is now pretty much unnecessary, though
I am sure some people use "[0]" instead of "[]" in their flexible array struct members. And some might take advantage of gcc's slightly greater flexibility here than the C standards define.
On 10/01/2024 03:14, Bart wrote:
On 10/01/2024 02:00, Scott Lurndal wrote:
Bart <bc@freeuk.cm> writes:
On 10/01/2024 00:05, Keith Thompson wrote:
Bart <bc@freeuk.cm> writes:
An easy compiler is one where you just do:
gcc prog
and not, at the very least:
gcc prog.c -o prog.exe -std=c11 -pedantic-errors
$ functions c
function c
{
gcc -o "$1" -std=c11 -pedantic-errors "$1".c
}
$ cat a.c
#include <stdio.h>
int
main(int argc, const char **argv, const char **envp)
{
printf("Hello World\n");
return 0;
}
$ c a
$ ./a
Hello World
Can't get any more concise than that.
Great. Now tell the gcc people that's what it should do ANYWAY.
But it is /not/ what gcc should do.
You seem to be mixing up "what Bart wants" with "what countless other
people want". Write your own tools to revolve around your own selfish
needs if that's your preference, but don't expect everyone else to
change their worlds to suit /you/.
Scott wrote that as an example that might suit you.
I am confident that
the compiler options he mostly uses are different from that - and that
he uses a variety of different options and different times, and that
they are different from the options /I/ use or anyone else uses.
My original implementation for Windows used these two files only:
bcc.exe (about 1MB)
On 10/01/2024 13:17, David Brown wrote:
On 10/01/2024 13:12, bart wrote:
For example, Microsoft provides everything with their MSVC tools.
For gcc-based toolchains, GCC writes the compiler but does not
provide binaries. The assembler and linker often come from the
binutils project (which again does not provide binaries), but other
assemblers and linkers may be used. Various libraries may be used,
depending on the target, including glibc, newlib, musl, newlib-nano,
redlib, avrlib, and many others. Users can get the parts
individually, or more often they get packaged toolchains from
Debian, Redhat, TDM, mingw-64, MS WSL, NXP, TI, Microchip, or many
others according to their needs.
No one cares.
But you cared enough to list all those complicated ways that some have
implemented C. It sounded almost loving.
You know, the most remarkable and impressive product for me is the Tiny
C compiler. It's Small. It's Fast. It's Simple.
Anybody can implement their own ideas of how it should work. There are reasons why those people put considerable efforts into those smaller, self-contained and more user-friendly products.
For me a compiler should be as utilitarian and simple to use as a light switch.
Without the slightest doubt, I can say that if it were possible to
build it using /your/ C compiler, and does not use extensions
unsupported by gcc, then I could get it to build with gcc without much
trouble.
You seem to be acknowledging the benefits of passing a code base through
a lesser compiler.
What is the minimum number of actual (ie. not ZIP) files needed to
bundle gcc in the same way? Give me a number. (I guess you don't know; I thought you knew your tools inside out!)
On 10/01/2024 00:49, Keith Thompson wrote:...
I spent some minutes trying to find where this zero-length array
business originated. It was from you:
No, it doesn't. C doesn't have zero-length arrays.
You say "C", but don't say which version or which common
dialect.
... Apparently the most commonly used dialect is GNU C,
I'm still none the wiser. Can I actually use zero-length arrays? Yes,
No, Maybe, Only with this or that compiler or that bunch of options.
David Brown <david.brown@hesbynett.no> writes:
On 10/01/2024 03:14, Bart wrote:
On 10/01/2024 02:00, Scott Lurndal wrote:
Bart <bc@freeuk.cm> writes:
On 10/01/2024 00:05, Keith Thompson wrote:
Bart <bc@freeuk.cm> writes:
An easy compiler is one where you just do:
gcc prog
and not, at the very least:
gcc prog.c -o prog.exe -std=c11 -pedantic-errors
$ functions c
function c
{
gcc -o "$1" -std=c11 -pedantic-errors "$1".c
}
$ cat a.c
#include <stdio.h>
int
main(int argc, const char **argv, const char **envp)
{
printf("Hello World\n");
return 0;
}
$ c a
$ ./a
Hello World
Can't get any more concise than that.
Great. Now tell the gcc people that's what it should do ANYWAY.
But it is /not/ what gcc should do.
You seem to be mixing up "what Bart wants" with "what countless other
people want". Write your own tools to revolve around your own selfish
needs if that's your preference, but don't expect everyone else to
change their worlds to suit /you/.
Scott wrote that as an example that might suit you.
I am confident that
the compiler options he mostly uses are different from that - and that
he uses a variety of different options and different times, and that
they are different from the options /I/ use or anyone else uses.
Indeed. I use a Makefile and the Makefile.defs include alone has 373
lines - something sure to piss Bart off. Of course, the project
has:
SLOC Directory SLOC-by-Language (Sorted)
7316068 include ansic=7274603,cpp=41465
899374 tests python=763294,ansic=82789,asm=34873,cpp=18013,sh=405
885492 io cpp=603113,ansic=281285,python=466,sh=324,asm=304
133342 processor cpp=131855,python=1487
133153 3rd_party cpp=133033,sh=78,python=42
26803 tools python=17834,cpp=4300,sh=1903,ansic=1459,perl=1199,
ruby=108
16754 gen ansic=16754
10955 platform cpp=10955
8392 common cpp=8392
5290 bin cpp=5118,python=172
2204 cpc cpp=2204
1883 top_dir cpp=1883
1430 noc cpp=1430
560 shim cpp=560
SLOC doesn't include whitespace or comments.
That's about 10 million lines of code across several hundred source
and include files.
Yet, the entire application can be built with
$ make
bart <bc@freeuk.com> writes:
My original implementation for Windows used these two files only:
bcc.exe (about 1MB)
Why do you think anyone here cares?
On 10/01/2024 15:31, bart wrote:
It is almost entirely useless, except for the very, very few people
who /need/ a C compiler that is very small. In the days of 3.5" floppy
disks for rescue disks, it was useful. In the days of 16 GB USB flash
devices costing pennies, it is irrelevant.
It is still remarkable, and still impressive - I have no disagreements
there. It will always be an impressive achievement - that will not
change with time. It might also be fun to play with. But it is still
basically useless. It is no longer a particularly useful or needed
tool. Lots of C compilers were once relevant, and now are not, except
perhaps for a few very niche uses.
Anybody can implement their own ideas of how it should work. There are
reasons why those people put considerable efforts into those smaller,
self-contained and more user-friendly products.
"User-friendly" /is/ something worth caring about. So is ease of
installation. Small size, self-contained binaries - those are not of
concern to users.
It is absolutely a good thing for many people that a
tool is easy to understand and use - but no one cares what goes on
behind the scenes to make it work, as long as it works fast enough to
be practical on sensible computers. (A half-arsed compiler that
implements some unknown and undocumented subset of C, with random
changes at the whim of the developer, cannot ever be "user-friendly"
for anyone other than said developer.)
On 10/01/2024 14:51, Scott Lurndal wrote:
bart <bc@freeuk.com> writes:
My original implementation for Windows used these two files only:
bcc.exe (about 1MB)
Why do you think anyone here cares?
Why do you think anyone cares about your makefiles?
bart <bc@freeuk.com> writes:
On 10/01/2024 14:49, Scott Lurndal wrote:
[...]
Yet, the entire application can be built with
$ make
I bet you can't. There's something missing. Unless the implicit file
that make uses happens to be in that '$' directory. Usually you have
to navigate to the project first.
That '$' is a shell prompt, not a directory.
Yes, you have to "cd" (like "chdir" on Windows) to the project directory before typing "make". You also have to make sure the computer is
powered on, login, launch a shell, and maybe one or two other things
before you get to that point.
On 10/01/2024 14:51, Scott Lurndal wrote:
bart <bc@freeuk.com> writes:
My original implementation for Windows used these two files only:
bcc.exe (about 1MB)
Why do you think anyone here cares?
Why do you think anyone cares about your makefiles?
It's a counter-example to somebody who might think implementations
really need to be as elaborate as DB was suggesting.
On 09/01/2024 23:12, Kaz Kylheku wrote:
On 2024-01-09, Bart <bc@freeuk.cm> wrote:
Don't tell me: gcc has some option to warn of using octal literals. So
your 'successful' language relies on extensive extra tools (all the
stuff in that support truck) to keep it useable.
Unfortunately, it is a cynical observation that languages that don't
require tools tend not to attract people.
A less cynical observation would suggest that a successful language is
one that is used a lot, and therefore attracts additional tools that
make its use better.
On 10/01/2024 18:39, Keith Thompson wrote:
bart <bc@freeuk.com> writes:
On 10/01/2024 14:49, Scott Lurndal wrote:[...]
Yet, the entire application can be built with
$ make
I bet you can't. There's something missing. Unless the implicit file
that make uses happens to be in that '$' directory. Usually you have
to navigate to the project first.
That '$' is a shell prompt, not a directory.
Yes, you have to "cd" (like "chdir" on Windows) to the project directory
before typing "make". You also have to make sure the computer is
powered on, login, launch a shell, and maybe one or two other things
before you get to that point.
You're missing the point. SL is making a big deal about the fact that
you can type 'make' without providing any apparent input. But that input
is provided when you use 'cd' to get the relevant folder.
Or do you think that the 'make' command provided will build that
specific project irrespective of the CWD?
On 2024-01-10, David Brown <david.brown@hesbynett.no> wrote:
On 09/01/2024 23:12, Kaz Kylheku wrote:
On 2024-01-09, Bart <bc@freeuk.cm> wrote:
Don't tell me: gcc has some option to warn of using octal literals. So your 'successful' language relies on extensive extra tools (all the
stuff in that support truck) to keep it useable.
Unfortunately, it is a cynical observation that languages that don't
require tools tend not to attract people.
A less cynical observation would suggest that a successful language is
one that is used a lot, and therefore attracts additional tools that
make its use better.
Also, there is no language for which the implementations have the final
word on any program, such that the users cannot imagine needing new
kinds of diagnostics.
What is a diagnostic? It's something which evaluates a
truth-valued proposition about a program, and reports if it
is true.
A program has a large number of properties about which true propositions
can be stated. Any of those truths could potentially be a useful
diagnostic to someone.
It is not realistic to expect that a language specification can state
all the propositions that may be diagnosed, such that no others may be diagnosed.
Furthermore, users may want to be informed about certain programs which
are correct by the language, but which have some undesirable property, specific to their requirements, such as a local coding standard or
even application-specific requirements.
E.g. a C program which puts sensitive info into an improperly procured temporary file might be correct by ISO C, but we can imagine a
sophisticated diagnostic identifying the situation.
On 10/01/2024 15:51, David Brown wrote:
On 10/01/2024 15:31, bart wrote:
It is almost entirely useless, except for the very, very few people
who /need/ a C compiler that is very small. In the days of 3.5"
floppy disks for rescue disks, it was useful. In the days of 16 GB
USB flash devices costing pennies, it is irrelevant.
16 GB can hold a great deal of useful DATA: images, audio and video for example. It also tends to be consumed sequentially.
On 10/01/2024 19:57, bart wrote:
On 10/01/2024 15:51, David Brown wrote:
On 10/01/2024 15:31, bart wrote:
It is almost entirely useless, except for the very, very few people
who /need/ a C compiler that is very small. In the days of 3.5"
floppy disks for rescue disks, it was useful. In the days of 16 GB
USB flash devices costing pennies, it is irrelevant.
16 GB can hold a great deal of useful DATA: images, audio and video
for example. It also tends to be consumed sequentially.
What the *beep* are you talking about? Does your compiler consist of 1
MB of program and 16 GB of video about how much better it is than
everything else?
The prime use of tcc, when it had a use, was for rescue disks and other situations when you needed to boot from small media and have a running system of some sort without installing on a hard disk. That was
originally a floppy, so tiny tools were vital. Then boot CDs became
popular - size was no longer an issue unless you wanted a mini CD. By
the time DVD's were common for the job, size of tools was irrelevant.
I've just checked my main IT supplier - the /cheapest/ USB stick they
have is 32 GB.
So, when claiming that you only need to type one thing to start the
process, it is disingenuous to leave that part out.
Before typing 'make', you'd better be sure you're in the right place!
On 2024-01-10, bart <bc@freeuk.com> wrote:
So, when claiming that you only need to type one thing to start the
process, it is disingenuous to leave that part out.
That's just knowing where the project is. That's something external to
the project; it's not a build secret hidden in the project itself, but
likely something the user themselves chose.
It's the same for any project, in any language using any build
procedure; they all have a location, and the first step is usually
changing to that location. Some users will cd to the project root even
before looking at any instructions.
(Perhaps the instructions will tell them something else, like
create a build directory somewhere, change to /that/ directory and
from there reference some build script in the unpacked tree.)
Before typing 'make', you'd better be sure you're in the right place!
The instructions for that project can't even tell you what that is.
It's "wherever you unpacked/cloned the project".
The above tries to compile hello.c with gcc. The first attempt doesn't
work. The second does, as no error is displayed. But what exactly has it compiled to? As there is no file called 'hello.exe'.
On 1/10/24 19:57, bart wrote:
The above tries to compile hello.c with gcc. The first attempt doesn't
work. The second does, as no error is displayed. But what exactly has
it compiled to? As there is no file called 'hello.exe'.
o ____ _____ _____ __ __
o | _ \ |_ _| | ___| | \/ |
o | |_) | | | | |_ | |\/| |
o | _ < | | | _| | | | |
o |_| \_\ |_| |_| |_| |_|
o
This is what I mean by user-friendly:
c:\c>gcc hello
C:\tdm\bin\ld.exe: cannot find hello: No such file or directory
collect2.exe: error: ld returned 1 exit status
c:\c>gcc hello.c
c:\c>hello
'hello' is not recognized as an internal or external command,
operable program or batch file.
c:\c>gcc hello.c -hello
gcc: error: unrecognized command-line option '-h'
c:\c>gcc hello.c -ohello
Please don't talk to me about user-friendly, you don't seem to have a clue.
On 10/01/2024 21:43, Kaz Kylheku wrote:
On 2024-01-10, bart <bc@freeuk.com> wrote:
So, when claiming that you only need to type one thing to start the
process, it is disingenuous to leave that part out.
That's just knowing where the project is. That's something external to
the project; it's not a build secret hidden in the project itself, but
likely something the user themselves chose.
It's the same for any project, in any language using any build
procedure; they all have a location, and the first step is usually
changing to that location. Some users will cd to the project root even
before looking at any instructions.
(Perhaps the instructions will tell them something else, like
create a build directory somewhere, change to /that/ directory and
from there reference some build script in the unpacked tree.)
Before typing 'make', you'd better be sure you're in the right place!
The instructions for that project can't even tell you what that is.
It's "wherever you unpacked/cloned the project".
I once claimed that with the build schemes I prefer, you usually have to
type only two things:
mm prog # my language
mcc @prog # C; also works across compilers
bcc -auto prog # my older compiler for projects written
# to certain rules
But 'make' was claimed to be superior because you only had to type one
thing:
make
Never mind that 'makefile' might contain 100s or even 1000s of lines.
It caused confusion, and made things go badly wrong, things that would
have been picked up if 'make' needed the name of an input, as you surely wouldn't have used the same name for both.
(This seems common in Unix: let's call every input file 'makefile', and
every output file 'a.out'!)
On 10/01/2024 19:55, David Brown wrote:
On 10/01/2024 19:57, bart wrote:
On 10/01/2024 15:51, David Brown wrote:
On 10/01/2024 15:31, bart wrote:
It is almost entirely useless, except for the very, very few people
who /need/ a C compiler that is very small. In the days of 3.5"
floppy disks for rescue disks, it was useful. In the days of 16 GB
USB flash devices costing pennies, it is irrelevant.
16 GB can hold a great deal of useful DATA: images, audio and video
for example. It also tends to be consumed sequentially.
What the *beep* are you talking about? Does your compiler consist of
1 MB of program and 16 GB of video about how much better it is than
everything else?
The prime use of tcc, when it had a use, was for rescue disks and
other situations when you needed to boot from small media and have a
running system of some sort without installing on a hard disk. That
was originally a floppy, so tiny tools were vital. Then boot CDs
became popular - size was no longer an issue unless you wanted a mini
CD. By the time DVD's were common for the job, size of tools was
irrelevant.
I've just checked my main IT supplier - the /cheapest/ USB stick they
have is 32 GB.
I'm starting to wonder how dense you can be. But I know that's not the
case; you're just unreceptive to certain concepts or unwilling to
consider them.
You are claiming that there is no point in limiting the size of code,
because after all you can buy 16GB or 32GB memory sticks.
So I'm not sure what your point is. Does the prevalence of very cheap
storage make it OK to have code that is 10, 100 or 1000 times bigger
than it need be?
Apparently so, according to you. Because of course there are no
consequences of that.
BTW these are some programs which are part of my gcc:
nasm.exe 1.4MB
ld.exe 1.5MB
as.exe 1.6MB
The first one, perhaps first two, will just fit onto one floppy, which I think is 1.44MiB. The last is just over.
So three standalone programs that are still, in 2024, at floppy disk
scale. Why is that, given that you can buy 32,000MB storage devices for 'pennies'?
Note that I did not write these programs. I don't have a monopoly on
writing smallish, self-contained applications.
Will you also dismiss those apps for being too small to be of
consequence? Or will you admit that sometimes the scale of a task isn't
great enough to warrant a huge executable?
On 1/10/2024 2:10 AM, David Brown wrote:
On 10/01/2024 08:40, Chris M. Thomasson wrote:
On 1/9/2024 9:28 PM, Kaz Kylheku wrote:
On 2024-01-10, tTh <tth@none.invalid> wrote:
On 1/10/24 01:40, Bart wrote:
An easy compiler is one where you just do:
gcc prog
and not, at the very least:
gcc prog.c -o prog.exe -std=c11 -pedantic-errors
Meanwhile I routinely test C programs on half a dozen compilers. I can't
be bothered with all this crap.
So do as all of us do : put this crap in a Makefile or
a shell script, and forget about it.
Please don't get Bart started on makefiles!
Don't get me started about freaking out for some minutes when I
failed to use a god damn tab! My makefile would not work. God damn
it! Ahhhh, that was around 20 years ago.
The author of "make" described the distinction between tabs and spaces
to be his worst mistake ever - but use of the tool took off too
quickly for him to change it.
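The mistake in question is easy to reproduce. In a makefile, every command line of a rule must begin with a literal tab character; leading spaces get you the famously unhelpful "missing separator" error. A minimal sketch:

```makefile
# Rule syntax: target, prerequisites, then command lines.
# Each command line below MUST start with a tab, not spaces --
# otherwise GNU make stops with "*** missing separator".
hello: hello.c
	cc hello.c -o hello

clean:
	rm -f hello
```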
Well, shit happens! lol. Jesting here. ;^)
The difference between Bart and everyone else is that when other
people make a tab mistake, they learn from it - and even 20 years on,
you still remember.
Oh my I sure do remember it, David! It managed to ingrain itself firmly
into my mind. I was getting a bit pissed saying why the f**k wont this
god damn makefile work!!!! GRRRRRR! I can remember it clearly. Fwiw, I
just got dropped off at my house from a party, and decided to create a program that generated some makefiles for my AppCore project at the
time, don't ask why. I might have been a bit boooozzyy. Yikes!
Bart seems to forget things seconds after someone tells him how to use
any program that he has not written himself. (It takes him a little
longer to forget how his own programs and languages work.)
No shit? Bard helped me with some of my C code before. He is a nice guy.
On 1/10/2024 7:10 PM, Chris M. Thomasson wrote:
On 1/10/2024 2:10 AM, David Brown wrote:[...]
Bart seems to forget things seconds after someone tells him how to
use any program that he has not written himself. (It takes him a
little longer to forget how his own programs and languages work.)
No shit? Bard helped me with some of my C code before. He is a nice guy.
Oops! I meant Bart not Bard. Sorry everybody! ;^o Damn typos (tabs),
lol! ;^o
On 2024-01-10, bart <bc@freeuk.com> wrote:
I once claimed that with the build schemes I prefer, you usually have to
type only two things:
mm prog # my language
You mean: "cd /path/to/proj/project; mm prog", right?
But 'make' was claimed to be superior because you only had to type one
thing:
make
Never mind that 'makefile' might contain 100s or even 1000s of lines.
There doesn't have to be a makefile. In a directory where there is
nothing but prog.c, "make prog" will run "cc prog.c -o prog".
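Kaz's point rests on make's built-in implicit rules. GNU make ships a default recipe for producing an executable from a lone .c file; a simplified approximation of that built-in (the real one goes through LINK.c and also honours CPPFLAGS, LDLIBS, etc.) looks like:

```makefile
# Approximation of GNU make's built-in rule for C programs:
# "make prog" with only prog.c present expands to roughly
#   $(CC) $(CPPFLAGS) $(CFLAGS) $(LDFLAGS) prog.c -o prog
%: %.c
	$(CC) $(CPPFLAGS) $(CFLAGS) $(LDFLAGS) $^ -o $@
```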
It caused confusion, and made things go badly wrong, things that would
have been picked up if 'make' needed the name of an input, as you surely
wouldn't have used the same name for both.
But in "mm prog1", "prog1" isn't the name of an input; it's an output.
This seems common in Unix: let's call every input file 'makefile', and
every output file 'a.out'!)
The fixed name of the makefile is a great idea. If I could change
anything, I would call it ".makefile".
On 10/01/2024 19:24, bart wrote:
On 10/01/2024 18:39, Keith Thompson wrote:
bart <bc@freeuk.com> writes:
On 10/01/2024 14:49, Scott Lurndal wrote:[...]
Yet, the entire application can be built with
$ make
I bet you can't. There's something missing. Unless the implicit file
that make uses happens to be in that '$' directory. Usually you have
to navigate to the project first.
That '$' is a shell prompt, not a directory.
Yes, you have to "cd" (like "chdir" on Windows) to the project directory before typing "make". You also have to make sure the computer is
powered on, login, launch a shell, and maybe one or two other things
before you get to that point.
You're missing the point. SL is making a big deal about the fact that
you can type 'make' without providing any apparent input. But that
input is provided when you use 'cd' to get the relevant folder.
Or do you think that the 'make' command provided will build that
specific project irrespective of the CWD?
Let me put it another way; how does:
$ make
know it is to build that project, and not any other? The default input
is presumably "./makefile", with the key bit being that ".".
So, when claiming that you only need to type one thing to start the
process, it is disingenuous to leave that part out.
After all, if you wanted to build project A, and, separately, project B,
you can't do both of them like this:
$ make
$ make
We've covered this before. I quite like sensible defaults, but where C compilers tend to be sticklers for dotting all the Is, 'make' goes a
little too far the other way.
Before typing 'make', you'd better be sure you're in the right place!
On 11/01/2024 02:46, Kaz Kylheku wrote:
On 2024-01-10, bart <bc@freeuk.com> wrote:
I once claimed that with the build schemes I prefer, you usually have to type only two things:
mm prog # my language
You mean: "cd /path/to/proj/project; mm prog", right?
Well, it could be done as:
mm <proglocation>
Which in my typical projects, might mean typing:
mm \cx\cc
This creates \cx\cc.exe
On 10/01/2024 21:20, bart wrote:
Before typing 'make', you'd better be sure you're in the right place!
The same is true of getting undressed, and indeed most things in life.
On 11/01/2024 12:24, David Brown wrote:
On 10/01/2024 21:20, bart wrote:
Before typing 'make', you'd better be sure you're in the right place!
The same is true of getting undressed, and indeed most things in life.
You might remember a discussion last autumn about building Lua 5.4.
There was a lot of confusion since the sources you got from googling
'github lua', and also I think from the releases from that github site,
and sources obtained via 'lua.org', were different.
The latter had an extra directory level compared with github, and two makefiles rather than one:
c:\xxx\lua-5.4.6>dir makefile*/s
Directory of c:\xxx\lua-5.4.6
02/05/2023 20:06 3,150 Makefile
Directory of c:\xxx\lua-5.4.6\src
03/02/2023 10:43 7,685 Makefile
Github had only the latter level.
Since both have the same name and both can be invoked with:
make
it meant no error was reported (like 'no such makefile'); it just went wrong.
Here, having to specify the name of a file, /and/ ensuring those two
input files weren't identically named, could have saved a lot of trouble by detecting the discrepancy sooner.
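For what it's worth, GNU make can be told explicitly which makefile and which directory to use, which would have surfaced the Lua mix-up immediately; both flags are standard GNU make options:

```
$ make -f src/Makefile    # use a named makefile instead of ./Makefile
$ make -C src             # change to src/ first, then look for a makefile there
```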
On 11/01/2024 02:46, Kaz Kylheku wrote:
The fixed name of the makefile is a great idea. If I could change
anything, I would call it ".makefile".
I think it's a terrible idea. It's like me deciding that when I type:
mcc
by itself, it will automatically compile xyzzy.c, every time. What it actually does is a bit saner: it shows some help text:
On 2024-01-11, bart <bc@freeuk.com> wrote:
On 11/01/2024 02:46, Kaz Kylheku wrote:
The fixed name of the makefile is a great idea. If I could change
anything, I would call it ".makefile".
I think it's a terrible idea. It's like me deciding that when I type:
mcc
by itself, it will automatically compile xyzzy.c, every time. What it
actually does is a bit saner, it shows some help text:
No, that's like mcc, if not given options, finding a .mccproj
file and building what is described there.
I.e. "build the mcc project that's associated with this directory,
via that directory's .mccproject property".
You need a fixed name to attach a property to the directory.
On 10/01/2024 23:30, tTh wrote:
On 1/10/24 19:57, bart wrote:
The above tries to compile hello.c with gcc. The first attempt
doesn't work. The second does, as no error is displayed. But what
exactly has it compiled to? As there is no file called 'hello.exe'.
o ____ _____ _____ __ __
o | _ \ |_ _| | ___| | \/ |
o | |_) | | | | |_ | |\/| |
o | _ < | | | _| | | | |
o |_| \_\ |_| |_| |_| |_|
o
I see. It doesn't matter how complex, unfriendly, inconvenient or
error-prone using a piece of software is, it's all fine so long as you
tell the user:
R T F M?
That makes up for designing it properly?
Instead of getting rid of unnecessary hoops you have to jump through,
you'd just write a thicker instruction manual and sell more training
courses. (And force people to use 'make' - another dozen hoops.)
Plus of course you can command more pay for mastering that contrived complexity.
What /I/ do is strive to get rid of those hoops.
On 11/01/2024 01:14, bart wrote:
On 10/01/2024 23:30, tTh wrote:
On 1/10/24 19:57, bart wrote:
The above tries to compile hello.c with gcc. The first attempt
doesn't work. The second does, as no error is displayed. But what
exactly has it compiled to? As there is no file called 'hello.exe'.
o ____ _____ _____ __ __
o | _ \ |_ _| | ___| | \/ |
o | |_) | | | | |_ | |\/| |
o | _ < | | | _| | | | |
o |_| \_\ |_| |_| |_| |_|
o
I see. It doesn't matter how complex, unfriendly, inconvenient or
error-prone using a piece of software is, it's all fine so long as you
tell the user:
R T F M?
That makes up for designing it properly?
Instead of getting rid of unnecessary hoops you have to jump through,
you'd just write a thicker instruction manual and sell more training
courses. (And force people to use 'make' - another dozen hoops.)
Plus of course you can command more pay for mastering that contrived
complexity.
What /I/ do is strive to get rid of those hoops.
Here's some hoops, for you ...
$ ls
hello.c
No Makefile, you'll notice.
$ make hello
cc hello.c -o hello
$ ./hello
Hello World!
On 11/01/2024 16:13, Kaz Kylheku wrote:
On 2024-01-11, bart <bc@freeuk.com> wrote:
On 11/01/2024 02:46, Kaz Kylheku wrote:
The fixed name of the makefile is a great idea. If I could change
anything, I would call it ".makefile".
I think it's a terrible idea. It's like me deciding that when I type:
mcc
by itself, it will automatically compile xyzzy.c, every time. What it
actually does is a bit saner, it shows some help text:
No, that's like mcc, if not given options, finding a .mccproj
file and building what is described there.
I.e. "build the mcc project that's associated with this directory,
via that directory's .mccproject property".
You need a fixed name to attach a property to the directory.
That would be more like a configuration file. It can control the
behaviour of an interactive op, or run some sort of prelude script, or
load some data, prior to starting a normal session.
It wouldn't see such a file, act on it, then (perhaps silently) stop.
If I wanted that behaviour, I would script it, using the same commands
that I've been familiar with for decades from my OS's command prompt.
The name of the script would usually be specific to the project, for
example, 'makeccia' will run 'makeccia.bat'. The '-a' suffix refers to
the floppy A: drive, so it is a very old example.
That's fine, but don't make it a hidden file. A hidden file
can be used to record tool preferences, but the file describing
the project itself shouldn't be hidden.
On 2024-01-11, Scott Lurndal <scott@slp53.sl.home> wrote:
That's fine, but don't make it a hidden file. A hidden file
can be used to record tool preferences, but the file describing
the project itself shouldn't be hidden.
That ship mostly sailed with names like .svn/ and .git/.
On 2024-01-11, bart <bc@freeuk.com> wrote:
I worked with .BAT files in the MS-DOS era. I left that stuff behind
just as the DOS era was coming to an end, and I went off to university.
Over the years, I had only rare, minor interactions with batch files on Windows.
it's a very poor scripting language,
that Microsoft replaced with the
PowerShell. Each time I had to interact with it over the past 30
years, I was reminded of how bad it is. It's like a freshman student's weekend project.
The main weakness is that DOS and Windows programs receive the
entire command line as a single string, which they must delimit
themselves into arguments.
No two programs agree on how that should be done, beyond the
trivial case when the command line has nothing but clumps of
alphanumeric characters separated by spaces.
Everything in DOS was cribbed from Unix, badly: piping | operator
that doesn't actually run processes concurrently, < > redirection
operators, device names mapped into the filesystem (but in the most
stupid way imaginable, not like /dev). The parent directory being ..,
but not actually due to there being a parent link, only faked out ...
They couldn't make up their minds between / and \ so they stupidly
supported *both* as separators, and put in a flag into the command interpreter to choose which one it would print.
Kaz Kylheku <433-929-6894@kylheku.com> writes:
On 2024-01-11, Scott Lurndal <scott@slp53.sl.home> wrote:
That's fine, but don't make it a hidden file. A hidden file
can be used to record tool preferences, but the file describing
the project itself shouldn't be hidden.
That ship mostly sailed with names like .svn/ and .git/.
That's metadata that will never be edited directly, but rather
is managed by the appropriate tool.
On 11/01/2024 22:02, Kaz Kylheku wrote:
On 2024-01-11, bart <bc@freeuk.com> wrote:
I worked with .BAT files in the MS-DOS era. I left that stuff behind
just as the DOS era was coming to an end, and I went off to university.
Over the years, I had only rare, minor interactions with batch files on
Windows.
it's a very poor scripting language,
I used the word 'script' loosely.
I make a distinction between command-languages, which generally have
only a linear sequence of commands, no looping or conditional code; and proper scripting languages, which are full languages.
BAT files I consider command languages where you just write commands A,
B, C ... one after the other.
For building programs, I've never needed anything more sophisticated.
This is the batch file for building my C compiler from source and
replacing the current production compiler:
mm -opt cc.m
copy cc.exe \m\mcc.exe
copy headers\windows.h \m
This involves working with 62 source and support files. The assembler
that mcc depends on is managed separately. The BAT file to update that
is this, which involves 15 modules:
mm -opt aa.m
copy aa.exe \m\aa.exe
You can see that there are very, very few demands in these 'scripts'. If
I need real scripting, I have a perfectly good one of my own.
Why, what hairy stuff are you doing that requires a language as complex
as 'make' and a shell environment as capable as 'bash'? It sounds more
like, you just use the capability because it's there. And then complain
when Windows doesn't have that.
that Microsoft replaced with the
PowerShell. Each time I had to interact with it over the past 30
years, I was reminded of how bad it is. It's like a freshman student's
weekend project
That's exactly how I view gcc's UI. Most Linux-derived utilities are as
bad: you invoke them, and they just apparently hang. Then you realise
they're waiting for input. Would it kill somebody to get them to display
a prompt?
On 12/01/2024 00:20, bart wrote:
Why, what hairy stuff are you doing that requires a language as
complex as 'make' and a shell environment as capable as 'bash'? It
sounds more like, you just use the capability because it's there. And
then complain when Windows doesn't have that.
People use good tools when good tools are available - they don't go out
of their way to use something inferior. Why is that a surprise? If you want to cut an orange, do you go out to the garage and find a rusty old
knife that is nonetheless useable for cutting a bit of fruit - or do you
use the shiny new titanium kitchen knife that is right in front of you?
I used "make" on DOS and Windows for about 15 years before I started
using Linux as a development system (rather than just for servers and
for fun). Every Windows system I have ever owned personally or used at
work has had "make" - because "make" has been the standard for quality development tools since early DOS days. The version I used first on
Windows came with Turbo Pascal - Microsoft's tools came with their own
slight variation of make called "nmake".
If your build uses only a couple of commands, make doesn't add much
compared to a script - but it doesn't cost much either. And you have
the convenience that a lot of editors have shortcut keys for doing a
"build", so something like ctrl-B will run "make" in the current
directory without needing to set up any specific tool shortcuts. That's convenient.
And unlike a language/compiler specific build system like you have
within your "mm" or "mcc" tools (if I am not mistaken), "make" will work
for anything you can run by commands. My makefiles might not just build executables from C files,
but also build documentation (doxygen, LaTeX,
pandoc, graphviz, etc.), run tests, run "clean", build release zip
files, download to a target board via a debugger, and all sorts of other
bits and pieces according to the needs of the project. One makefile
beats a dozen scripts.
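The shape being described looks roughly like this (a sketch; the doxygen and zip commands are illustrative placeholders for whatever a given project actually runs):

```makefile
# One makefile driving several unrelated jobs via phony targets.
.PHONY: all docs test clean release

all: prog

prog: prog.c
	$(CC) $(CFLAGS) prog.c -o prog

docs:
	doxygen Doxyfile        # placeholder documentation step

test: prog
	./prog --selftest       # placeholder test step

clean:
	rm -f prog

release: all docs
	zip -r release.zip prog docs/
```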
That's exactly how I view gcc's UI. Most Linux-derived utilities are
as bad: you invoke them, and they just apparently hang. Then you
realise they're waiting for input. Would it kill somebody to get them
to display a prompt?
Are you sure you are not exaggerating just a /tiny/ bit?
On 12/01/2024 00:20, bart wrote:
That's exactly how I view gcc's UI. Most Linux-derived utilities are as
bad: you invoke them, and they just apparently hang. Then you realise
they're waiting for input. Would it kill somebody to get them to display
a prompt?
Are you sure you are not exaggerating just a /tiny/ bit?
On 12/01/2024 13:40, David Brown wrote:
On 12/01/2024 00:20, bart wrote:
Why, what hairy stuff are you doing that requires a language as
complex as 'make' and a shell environment as capable as 'bash'? It
sounds more like, you just use the capability because it's there. And
then complain when Windows doesn't have that.
People use good tools when good tools are available - they don't go
out of their way to use something inferior. Why is that a surprise?
If you want to cut an orange, do you go out to the garage and find a
rusty old knife that is nonetheless useable for cutting a bit of fruit
- or do you use the shiny new titanium kitchen knife that is right in
front of you?
I used "make" on DOS and Windows for about 15 years before I started
using Linux as a development system (rather than just for servers and
for fun). Every Windows system I have ever owned personally or used
at work has had "make" - because "make" has been the standard for
quality development tools since early DOS days. The version I used
first on Windows came with Turbo Pascal - Microsoft's tools came with
their own slight variation of make called "nmake".
If your build uses only a couple of commands, make doesn't add much
compared to a script - but it doesn't cost much either. And you have
the convenience that a lot of editors have shortcut keys for doing a
"build", so something like ctrl-B will run "make" in the current
directory without needing to set up any specific tool shortcuts.
That's convenient.
And unlike a language/compiler specific build system like you have
within your "mm" or "mcc" tools (if I am not mistaken), "make" will
work for anything you can run by commands. My makefiles might not
just build executables from C files,
If you isolate that part of it, then it's what I either build-in to the language + compiler (for M), or list in a simple text file (for C).
In either case I will have a project file using a similar list for my
basic IDE that I use for everyday development. But this part is not
needed when somebody else needs to build my project.
For a C project consisting of three files (one of Chris's), my IDE looks
like this:
https://github.com/sal55/langs/blob/master/ff.png
It probably looked about the same in 1984. The project file for that is
this:
run cipher c.c output -e
module cipher.c
module sha2.c
module hmac.c
file sha2.h
file hmac.h
The 'run' lines show what happens when I type 'R' in the IDE.
but also build documentation (doxygen, LaTeX, pandoc, graphviz, etc.),
run tests, run "clean", build release zip files, download to a target
board via a debugger, and all sorts of other bits and pieces according
to the needs of the project. One makefile beats a dozen scripts.
It looks like 'make' is competing with 'bash' then!
At least with bash, what you type equates to what you might type
interactively. I doubt there is an interactive, REPL-style make (or
maybe there is; I wouldn't be surprised).
That's exactly how I view gcc's UI. Most Linux-derived utilities are
as bad: you invoke them, and they just apparently hang. Then you
realise they're waiting for input. Would it kill somebody to get them
to display a prompt?
Are you sure you are not exaggerating just a /tiny/ bit?
I mainly remember the times when they do hang.
But take the related trio gcc, as and ld.
gcc and ld behave as expected with no input (with gcc it's a 'fatal
error'; with ld it's a more restrained 'no input files').
But with 'as', it just sits there. I wonder what it's waiting for; for
me to type in ASM code live from the terminal? (If 'as' is designed for piped-in input, tdm/gcc doesn't appear to use that feature as I remember
it generating discrete, temporary .s files.)
My equivalent says this:
c:\c>aa
AA2.0 Assembler/Linker 10-Jan-2024
Usage:
aa filename[.asm] # Assemble filename.asm to
filename.exe
aa -help # Show other options
If you're curious about what 'as' expects and speculatively try 'as
--help', it displays 167 dense lines.
On 12/01/2024 13:40, David Brown wrote:
On 12/01/2024 00:20, bart wrote:
But with 'as', it just sits there. I wonder what it's waiting for; for
me to type in ASM code live from the terminal?
David Brown <david.brown@hesbynett.no> writes:
On 12/01/2024 00:20, bart wrote:
That's exactly how I view gcc's UI. Most Linux-derived utilities are as
bad: you invoke them, and they just apparently hang. Then you realise
they're waiting for input. Would it kill somebody to get them to display
a prompt?
Are you sure you are not exaggerating just a /tiny/ bit?
Clearly, Bart is trolling.
And clearly, he doesn't understand the basic unix philosophy
of combining commands in pipeline, where "displaying a prompt"
would be silly.
On 12/01/2024 17:12, bart wrote:
I don't really understand what you are trying to say here. Are you
suggesting that your 40-year-old DOS IDE is equivalent to modern IDEs?
Are you trying to say you can use your own tools for your own
language, and rely on a simple script for C compilation, and can't
handle anything else in a build process?
download to a
target board via a debugger,
and all sorts of other bits and pieces
according to the needs of the project. One makefile beats a dozen
scripts.
It looks like 'make' is competing with 'bash' then!
I have no idea why you think that - except perhaps because you still
have no concept of what "make" is and what it does, and think it is just
a script with a complicated syntax.
If you're curious about what 'as' expects and speculatively try 'as
--help', it displays 167 dense lines.
Are you trying to convince people that your assembler is better than gas because yours has fewer features? Bizarre.
On 12/01/2024 17:12, bart wrote:
On 12/01/2024 13:40, David Brown wrote:
On 12/01/2024 00:20, bart wrote:
It looks like 'make' is competing with 'bash' then!
I have no idea why you think that - except perhaps because you still
have no concept of what "make" is and what it does, and think it is just
a script with a complicated syntax.
On 12/01/2024 16:01, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
On 12/01/2024 00:20, bart wrote:
That's exactly how I view gcc's UI. Most Linux-derived utilities are as
bad: you invoke them, and they just apparently hang. Then you realise
they're waiting for input. Would it kill somebody to get them to display
a prompt?
Are you sure you are not exaggerating just a /tiny/ bit?
Clearly, Bart is trolling.
And clearly, he doesn't understand the basic unix philosophy
of combining commands in pipeline, where "displaying a prompt"
would be silly.
Yeah. Because it would be impossible to have two versions of a program
(say 'as' that I used in my recent post), one for piping, one for
interactive.
bart <bc@freeuk.com> writes:
On 12/01/2024 13:40, David Brown wrote:
On 12/01/2024 00:20, bart wrote:
But with 'as', it just sits there. I wonder what it's waiting for; for
me to type in ASM code live from the terminal?
It does that so you can pipe the assembler source code in to the
assembler.
$ cat file.s | as
$ cat file.c | cpp | c0 | c1 | c2 | as > file.o
or, if you want, you can type in the assembler source directly.
Or you can save it in a file and supply the file argument to the command.
None of which your stuff supports, which makes it useless to me.
bart <bc@freeuk.com> writes:
On 12/01/2024 13:40, David Brown wrote:
On 12/01/2024 00:20, bart wrote:
But with 'as', it just sits there. I wonder what it's waiting for; for
me to type in ASM code live from the terminal?
It does that so you can pipe the assembler source code in to the
assembler.
$ cat file.s | as
$ cat file.c | cpp | c0 | c1 | c2 | as > file.o
or, if you want, you can type in the assembler source directly.
Or you can save it in a file and supply the file argument to the command.
None of which your stuff supports, which makes it useless to me.
On 12/01/2024 16:34, David Brown wrote:
It looks like 'make' is competing with 'bash' then!
I have no idea why you think that - except perhaps because you still
have no concept of what "make" is and what it does, and think it is
just a script with a complicated syntax.
So, what the hell is it then? What makes it so special compared with any other scripting language?
All I can see is that it can create dependency graphs between files -
which have to be determined from info that you provide in the file, it's
not that clever - and can use that to avoid recompilation etc of a file unless its dependencies have changed.
That is something I've never needed done automatically in my own work (I
do it manually as I will know my projects intimately when I'm working
with them).
For production builds, it doesn't matter if everything is compiled.
[...]
Clearly your idea of 'better' is to be vastly more complicated.
[...]
David Brown <david.brown@hesbynett.no> writes:
On 12/01/2024 17:12, bart wrote:
On 12/01/2024 13:40, David Brown wrote:
On 12/01/2024 00:20, bart wrote:
It looks like 'make' is competing with 'bash' then!
I have no idea why you think that - except perhaps because you still
have no concept of what "make" is and what it does, and think it is just
a script with a complicated syntax.
I can't tell if he's just trolling, or if he really believes what
he writes.
On 12/01/2024 16:50, Scott Lurndal wrote:
$ cat file.c | cpp | c0 | c1 | c2 | as > file.o
Using ">" on binary content?
That seems off.
On 12/01/2024 16:50, Scott Lurndal wrote:
bart <bc@freeuk.com> writes:
On 12/01/2024 13:40, David Brown wrote:
On 12/01/2024 00:20, bart wrote:
But with 'as', it just sits there. I wonder what it's waiting for; for
me to type in ASM code live from the terminal?
It does that so you can pipe the assembler source code in to the
assembler.
$ cat file.s | as
$ cat file.c | cpp | c0 | c1 | c2 | as > file.o
Using ">" on binary content? That seems off.
If you do this:
as hello.s
What might someone expect the output to be?
On 12.01.2024 18:59, bart wrote:
On 12/01/2024 16:50, Scott Lurndal wrote:
$ cat file.c | cpp | c0 | c1 | c2 | as > file.o
Using ">" on binary content?
Of course.
That seems off.
Why?
On 12/01/2024 18:10, Janis Papanagnou wrote:
On 12.01.2024 18:59, bart wrote:
On 12/01/2024 16:50, Scott Lurndal wrote:
$ cat file.c | cpp | c0 | c1 | c2 | as > file.o
Using ">" on binary content?
Of course.
That seems off.
Why?
Because when you see ">" on a command line, it means redirecting output
that would normally be shown as text on a console or terminal.
'> name' tells the shell to open 'name' on stdout before executing 'as'.
bart <bc@freeuk.com> writes:
On 12/01/2024 16:50, Scott Lurndal wrote:
bart <bc@freeuk.com> writes:
On 12/01/2024 13:40, David Brown wrote:
On 12/01/2024 00:20, bart wrote:
But with 'as', it just sits there. I wonder what it's waiting for; for >>>> me to type in ASM code live from the terminal?
It does that so you can pipe the assembler source code in to the
assembler.
$ cat file.s | as
$ cat file.c | cpp | c0 | c1 | c2 | as > file.o
Using ">" on binary content? That seems off.
Unix files are untyped sequences of bytes.
(e.g. the 'b' flag to the fopen stdio library routine is ignored on unix).
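A quick sketch of the point about untyped byte streams: the shell's `>` simply connects file descriptors and never inspects the bytes, so binary content redirects exactly like text (the file name `blob.bin` is invented for the example).

```shell
# '>' duplicates bytes verbatim; the shell never looks at them,
# so binary output redirects exactly like text output.
head -c 16 /dev/urandom > /tmp/blob.bin   # 16 random (binary) bytes
wc -c < /tmp/blob.bin                     # prints 16: every byte arrived intact
```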
That Windows fucks around with line endings and uses CRLF instead of the
more efficient LF is screwed up.
If you do this:
as hello.s
What might someone expect the output to be?
What the FM documents. RTFM.
bart <bc@freeuk.com> writes:
On 12/01/2024 18:10, Janis Papanagnou wrote:
On 12.01.2024 18:59, bart wrote:
On 12/01/2024 16:50, Scott Lurndal wrote:
$ cat file.c | cpp | c0 | c1 | c2 | as > file.o
Using ">" on binary content?
Of course.
That seems off.
Why?
Because when you see ">" on a command line, it means redirecting output
that would normally be shown as text on a console or terminal.
No, it doesn't mean that at all. It never has meant that.
'> name' tells the shell to open 'name' on stdout before executing 'as'.
David Brown <david.brown@hesbynett.no> writes:
On 12/01/2024 00:20, bart wrote:
That's exactly how I view gcc's UI. Most Linux-derived utilities are as
bad: you invoke them, and they just apparently hang. Then you realise
they're waiting for input. Would it kill somebody to get them to display
a prompt?
Are you sure you are not exaggerating just a /tiny/ bit?
Clearly, Bart is trolling.
And clearly, he doesn't understand the basic unix philosophy
of combining commands in pipeline, where "displaying a prompt"
would be silly.
On 12.01.2024 18:09, bart wrote:
On 12/01/2024 16:34, David Brown wrote:
It looks like 'make' is competing with 'bash' then!
Why don't you just read about those two tools and learn, instead
of repeatedly spouting such stupid statements of ignorance.
It's a _simple_ tool - not complex, as you've previously posted -
where you can define dependencies of entities, and define commands
that create the targets if entities that are required by the target
had changed. Its basic syntax and also its logic is simple,
And this is a crucial feature; for professional non-trivial projects.
That is something I've never needed done automatically in my own work (I
do it manually as I will know my projects intimately when I'm working
with them).
Yes, we know. You've repeatedly shown that you are actually doing
small one-man-shows in projects that I can only call toy-projects.
Professional projects have a different situation in many respects.
(I don't go into detail here, since you're anyway only interested
in your local comfort zone.)
For production builds, it doesn't matter if everything is compiled.
For a production build we extract a committed release into its own
file system branch and build all, first for the tests, then to pack
the source (optionally), and then the complete runtime environment.
But we're doing professional software products. And the production
build comes _at the end_.
Before that we need efficient mechanisms to create consistent systems
and configurations, and to avoid unnecessary compiles. And the 'make'
process does exactly that.
Clearly your imputations are based on ignorance.
On 12/01/2024 18:02, Janis Papanagnou wrote:
On 12.01.2024 18:09, bart wrote:
On 12/01/2024 16:34, David Brown wrote:
It looks like 'make' is competing with 'bash' then!
Why don't you just read about those two tools and learn, instead
of repeatedly spouting such stupid statements of ignorance.
Because I've repeatedly said I don't need them. Why can't you accept that?
but also build documentation (doxygen, LaTeX,
pandoc, graphviz, etc.), run tests, run "clean", build release zip
files, download to a target board via a debugger, and all sorts of other
bits and pieces according to the needs of the project. One makefile
beats a dozen scripts.
It looks like 'make' is competing with 'bash' then!
That's exactly how I view gcc's UI. Most Linux-derived utilities are
as bad: you invoke them, and they just apparently hang. Then you
realise they're waiting for input. Would it kill somebody to get them
to display a prompt?
Are you sure you are not exaggerating just a /tiny/ bit?
I mainly remember the times when they do hang.
But with 'as', it just sits there. I wonder what it's waiting for; for
me to type in ASM code live from the
On 12/01/2024 19:18, Scott Lurndal wrote:
bart <bc@freeuk.com> writes:
On 12/01/2024 18:10, Janis Papanagnou wrote:
On 12.01.2024 18:59, bart wrote:
On 12/01/2024 16:50, Scott Lurndal wrote:
$ cat file.c | cpp | c0 | c1 | c2 | as > file.o
Using ">" on binary content?
Of course.
That seems off.
Why?
Because when you see ">" on a command line, it means redirecting output
that would normally be shown as text on a console or terminal.
No, it doesn't mean that at all. It never has meant that.
'> name' tells the shell to open 'name' on stdout before executing 'as'.
And without '> name', where does stuff sent to stdout end up?
On 12/01/2024 18:02, Janis Papanagnou wrote:
And this is a crucial feature; for professional non-trivial projects.
Come on then, tell me how big your projects are. Are they bigger than
Scott Lurndal's 10Mloc example? (Which seems to be mostly Python source
code.)
bart <bc@freeuk.com> writes:
On 12/01/2024 16:50, Scott Lurndal wrote:
bart <bc@freeuk.com> writes:
On 12/01/2024 13:40, David Brown wrote:It does that so you can pipe the assembler source code in to the
On 12/01/2024 00:20, bart wrote:
But with 'as' it just sits there. I wonder what it's waiting for; for
me to type in ASM code live from the terminal?
assembler.
$ cat file.s | as
$ cat file.c | cpp | c0 | c1 | c2 | as > file.o
What might someone expect the output to be? Probably not 'a.out', more
likely hello.o. Why /isn't/ it just hello.o?
Partly historical inertia, and partly because "as" can't always know
what the output file should be, for example if its input isn't a file.
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
bart <bc@freeuk.com> writes:
On 12/01/2024 16:50, Scott Lurndal wrote:
bart <bc@freeuk.com> writes:
On 12/01/2024 13:40, David Brown wrote:It does that so you can pipe the assembler source code in to the
On 12/01/2024 00:20, bart wrote:
But with 'as' it just sits there. I wonder what it's waiting for; for >>>>> me to type in ASM code live from the terminal?
assembler.
$ cat file.s | as
$ cat file.c | cpp | c0 | c1 | c2 | as > file.o
Yes, I used this as an example pipeline. I don't recall
if the original as(1) wrote to stdout or always just to
a.out.
What might someone expect the output to be? Probably not 'a.out', more
likely hello.o. Why /isn't/ it just hello.o?
Partly historical inertia, and partly because "as" can't always know
what the output file should be, for example if its input isn't a file.
Yup. a.out is the historical inertia.
$ cat file.c | cpp | c0 | c1 | c2 | as > file.o
Using ">" on binary content? That seems off.
I believe that the compiler suite /always/ wrote an a.out file. But,
it wouldn't have been the assembler (as), but the linker (ld) that
created it.
On 2024-01-12, bart <bc@freeuk.com> wrote:
On 12/01/2024 18:02, Janis Papanagnou wrote:
On 12.01.2024 18:09, bart wrote:
On 12/01/2024 16:34, David Brown wrote:
It looks like 'make' is competing with 'bash' then!
Why don't you just read about those two tools and learn, instead
of repeatedly spouting such stupid statements of ignorance.
Because I've repeatedly said I don't need them. Why can't you accept that?
You do need make, if you're on a Unix-like system and you want to make "hello" out of "hello.c" with a command that consists of only two words.
bart <bc@freeuk.com> writes:
On 12/01/2024 18:02, Janis Papanagnou wrote:
And this is a crucial feature; for professional non-trivial projects.
Come on then, tell me how big your projects are. Are they bigger than
Scott Lurndal's 10Mloc example? (Which seems to be mostly Python source
code.)
The example shown had 8 million lines of C and C++ code. Less
than 10% was python.
Granted, a significant fraction of that is generated from yaml descriptions of memory mapped registers, yet it is still compiled by the C and C++ compilers.
bart <bc@freeuk.com> writes:
It happens that once you have a working Makefile, it works equally
well either to rebuild a project after a single change, or to build
an entire project from scratch. Someone could probably create
a simpler version of "make" that doesn't look at dependencies,
and that always rebuilds everything. Such a tool would be worse
than "make" for building projects during development, and not
significantly better than "make" for building projects from scratch.
And that's ignoring the "-j" option, which allows "make" to execute
multiple steps in parallel. That works only because "make" knows
about dependencies, and it can result in a full build from scratch
finishing much more quickly. A simple script that just compiles
each file isn't likely to do that. You can typically specify a
maximum number of parallel jobs equal to the number of CPUs on your
build system, e.g., `make -j $(nproc)`.
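A small sketch of both points above: make skips targets that are already up to date, and `-j` lets independent recipes run concurrently. The directory and target names are made up, and GNU make plus `nproc` are assumed.

```shell
# Two independent targets: with -j they may be built concurrently,
# and a second invocation rebuilds nothing because both files exist.
set -e
mkdir -p /tmp/jdemo && cd /tmp/jdemo
{
  printf 'all: a.txt b.txt\n'
  printf 'a.txt:\n\techo A > a.txt\n'
  printf 'b.txt:\n\techo B > b.txt\n'
} > Makefile
make -j "$(nproc)"   # parallel build of the independent targets
make -j "$(nproc)"   # everything is up to date; no recipes run
```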
That sounds barmy.
To you, I'm sure it does. It isn't.
The astronauts who had to apply the conversion kit to the HST had no
choice. Clearly they would rather have got the mirror in the first
place.
But your attitude seems to be that such a sticking-plaster fix is superior.
I'd like to bring something to your attention. You often make
statements about the attitudes of the people you're having these
discussions with. Those statements are almost always wrong. I'd say
more if I thought you'd be interested.
If you run "command" at a shell prompt, its stdout goes to the terminal
in which the shell is running. [...]
On 12/01/2024 18:10, Janis Papanagnou wrote:
On 12.01.2024 18:59, bart wrote:
On 12/01/2024 16:50, Scott Lurndal wrote:
$ cat file.c | cpp | c0 | c1 | c2 | as > file.o
Using ">" on binary content?
Of course.
That seems off.
Why?
Because when you see ">" on a command line, it means redirecting output
that would normally be shown as text on a console or terminal.
But you rarely see pure binary being displayed like that on a text display.
[...]
However I'm obviously just a bot, so what do I know.
On 12/01/2024 19:15, Scott Lurndal wrote:
[...]
[...]
You guys all deserve medals for being so tolerant.
On 12/01/2024 19:18, Scott Lurndal wrote:
name' tells the shell to open 'name' on stdout before executing 'as'.
And without '> name', where does stuff sent to stdout end up?
On 12.01.2024 19:53, bart wrote:
On 12/01/2024 18:10, Janis Papanagnou wrote:
On 12.01.2024 18:59, bart wrote:
On 12/01/2024 16:50, Scott Lurndal wrote:
$ cat file.c | cpp | c0 | c1 | c2 | as > file.o
Using ">" on binary content?
Of course.
That seems off.
Why?
Because when you see ">" on a command line, it means redirecting output
that would normally be shown as text on a console or terminal.
I propose that you try to give up what you think is "normally" and
base your knowledge and opinions on facts. Honestly, it will make communication generally easier and not make you look like a moron.
I see. So forget just having intuitive behaviour. Or even behaviour that
is compatible with related tools, so that:
gcc -c file1.c produces file1.o
gcc -c file1.c file2.c produces file1.o file2.o
but:
as file1.s produces a.out
as file1.s file2.s produces a.out
You guys all deserve medals for being so tolerant.
On 13/01/2024 00:47, Keith Thompson wrote:
bart <bc@freeuk.com> writes:
It happens that once you have a working Makefile, it works equally
well either to rebuild a project after a single change, or to build
an entire project from scratch. Someone could probably create
a simpler version of "make" that doesn't look at dependencies,
and that always rebuilds everything. Such a tool would be worse
than "make" for building projects during development, and not
significantly better than "make" for building projects from scratch.
And that's ignoring the "-j" option, which allows "make" to execute
multiple steps in parallel. That works only because "make" knows
about dependencies, and it can result in a full build from scratch
finishing much more quickly. A simple script that just compiles
each file isn't likely to do that. You can typically specify a
maximum number of parallel jobs equal to the number of CPUs on your
build system, e.g., `make -j $(nproc)`.
That's a reasonable thing to do. But how does make do it? Can't a
compiler apply the same approach if N files have been submitted?
After all C allows independent compilation of modules. (Something my
language doesn't have; there the granularity is an EXE file, not a module.)
My post was in response to "needing 'make' if you wanted a two-word
command to build 'hello'".
I didn't seriously think that was the reason make was invented.
[...] If a program produces binary data on standard output,
then it will pretty much always have to be redirected. [...]
On 12/01/2024 18:02, Janis Papanagnou wrote:
On 12.01.2024 18:09, bart wrote:
On 12/01/2024 16:34, David Brown wrote:
It looks like 'make' is competing with 'bash' then!
Why don't you just read about those two tools and learn, instead
of repeatedly spouting such stupid statements of ignorance.
Because I've repeatedly said I don't need them. Why can't you accept that?
How about YOU learn how to build software without those tools?
It's a _simple_ tool - not complex, as you've previously posted -
where you can define dependencies of entities, and define commands
that create the targets if entities that are required by the target
had changed. Its basic syntax and also its logic is simple,
And this is a crucial feature; for professional non-trivial projects.
Come on then, tell me how big your projects are. Are they bigger than
Scott Lurndal's 10Mloc example? (Which seems to be mostly Python source code.)
That is something I've never needed done automatically in my own work (I >>> do it manually as I will know my projects intimately when I'm working
with them).
Yes, we know. You've repeatedly shown that you are actually doing
small one-man-shows in projects that I can only call toy-projects.
This is incredibly patronising.
What is wrong with one-man projects?
What is wrong with writing non-professional software? Is that the same
as non-commercial?
Where is the line between a toy project and a non-toy project? Is it
related to how many lines or how many modules an application might have,
or the size of the final binaries?
Is it to do with the number of end-users?
[...]
Professional projects have a different situation in many respects.
(I don't go into detail here, since you're anyway only interested
in your local comfort zone.)
No, don't. I assume you've got some hugely complicated app with a
million moving parts.
It's so big that nobody knows what's what. Your
compilers are so slow that you HAVE to use dependencies to avoid
spending 90% of the day twiddling your thumbs.
That's a million miles from the stuff I do, yet you still insist /I/
should be using all the same complicated tools you do.
[...]
Let me tell about my own tools:
[...]
So, now tell me where the hell 'makefiles' would fit into that scenario.
Just accept that some of this stuff is out of /your/ comfort zone.
[...]
If I needed a tool like 'make', I would have created one.
[...]
Clearly your imputations are based on ignorance.
Yeah. I could say the same thing. But usually I try and stay polite and
argue only against ideas and not people.
Have a good day.
On 2024-01-13, bart <bc@freeuk.com> wrote:
On 13/01/2024 00:47, Keith Thompson wrote:
bart <bc@freeuk.com> writes:
It happens that once you have a working Makefile, it works equally
well either to rebuild a project after a single change, or to build
an entire project from scratch. Someone could probably create
a simpler version of "make" that doesn't look at dependencies,
and that always rebuilds everything. Such a tool would be worse
than "make" for building projects during development, and not
significantly better than "make" for building projects from scratch.
And that's ignoring the "-j" option, which allows "make" to execute
multiple steps in parallel. That works only because "make" knows
about dependencies, and it can result in a full build from scratch
finishing much more quickly. A simple script that just compiles
each file isn't likely to do that. You can typically specify a
maximum number of parallel jobs equal to the number of CPUs on your
build system, e.g., `make -j $(nproc)`.
That's a reasonable thing to do. But how does make do it? Can't a
compiler apply the same approach if N files have been submitted?
Yes. And in fact, languages with good module support like Modula-2
don't need external make utilities.
After all C allows independent compilation of modules. (Something my
language doesn't have; there the granularity is an EXE file, not a module.)
C has no specific syntax for expressing modules. It has translation
units, with preprocessor header files used for interfacing.
A C compiler could, instead of emitting makefile fragments, keep
the dependency information in some repository which it itself
understands, in order to recompile what is necessary.
The only problem is that such a compiler would be reimplementing most of
make, probably badly, and every other similar compiler would have to do
the same in order to have the same benefit.
On 12.01.2024 22:01, bart wrote:
Come on then, tell me how big your projects are. Are they bigger than
Scott Lurndal's 10Mloc example? (Which seems to be mostly Python source
code.)
Again playing childish? ("Mine is bigger than yours", sort of?)
If you're interested what I actually do and have done, I can tell
you. (Not that it would address or solve any inherent issue *you*
obviously have.)
The past decade (or so) "my" personal projects were only private
hobbies, i.e. small toy-projects from a couple lines to a couple
thousand lines. But when I speak about "professional software
engineering" I am rather speaking about the professional projects.
Some outline; I was engaged in projects of various sizes. I don't
recall the (not very significant) LOC numbers; these were anyway
only in one case relevant, in a refactoring project of a large
software component (used by at that time 1000+ software companies
for their products, and at that time by nearly 20 million people
in our country). The projects that I led myself or was member of
ranged from a handful on-site persons to many hundreds persons
spread across several sites and even different companies. And
the development durations from very short ranges up to years. The
areas for which the various software projects were developed were:
for the big telecommunication companies (e.g. BT, Dt. Telekom),
for the financial sector, and for the state government. We used local
tools for our site(s), and also collaborative tools. The source
code or libraries were partly imported by collaborating companies,
locally they were spread across various project component file
systems. There were tens of thousands of files (I don't recall the
exact number) and millions of lines of code (ditto). Everyone in
the project was able to work on any of the sub-projects or parts,
no specific knowledge (say, about compiler or library versions)
was necessary by the individual member. Make was a standard tool
almost everywhere. Other tools as well; configuration management,
version control, test environments, project management tools, etc.
These were all professional software projects, as opposed to my
(or your) toy projects.
What is wrong with one-man projects?
There's nothing wrong with them. (I said above that privately I
also do such "projects".) At some point of project complexity you
are advised to handle it more professionally, though. And that is
usually supported by sophisticated project tools and environments.
It's worth understanding, though, that 'make' is not a complex or
unnecessary tool. If you understand its (simple) concept you can
(but don't need to) also use it for your small projects.
You only
gain something, not lose anything; once you've overcome the barrier
of acceptance for a probably unknown or unfamiliar tool it's really
nice. (For example I maintain a dvds.csv file and generate a HTML
page for it that I then upload; why not put the generation process
commands and the simple dependencies in a Makefile and just call
'make' and/or 'make install'? - I have tons of little toy-projects
and instead of having everything in mind I have it either in a
Makefile or in a small shell script that occasionally gets into a
Makefile, so that I only need to do a 'make' in whatever context
I actually am.)
It should have meanwhile become obvious that no one forces you to
use Makefiles.
And there's also nothing to be said against one's
toy-projects.
In the industry where I've done my professional projects we had
no slow computers. But we had also no toy-projects. Yes, some of
the (full!) compile runs lasted many hours
bart <bc@freeuk.com> writes:
On 13/01/2024 00:47, Keith Thompson wrote:
bart <bc@freeuk.com> writes:
It happens that once you have a working Makefile, it works equally
well either to rebuild a project after a single change, or to build
an entire project from scratch. Someone could probably create
a simpler version of "make" that doesn't look at dependencies,
and that always rebuilds everything. Such a tool would be worse
than "make" for building projects during development, and not
significantly better than "make" for building projects from scratch.
And that's ignoring the "-j" option, which allows "make" to execute
multiple steps in parallel. That works only because "make" knows
about dependencies, and it can result in a full build from scratch
finishing much more quickly. A simple script that just compiles
each file isn't likely to do that. You can typically specify a
maximum number of parallel jobs equal to the number of CPUs on your
build system, e.g., `make -j $(nproc)`.
That's a reasonable thing to do. But how does make do it? Can't a
compiler apply the same approach if N files have been submitted?
I've never looked into it.
I suppose it could be done, but IMHO compilers are complex enough
without adding logic to perform parallel compilations, especially since
"make" already solves that problem.
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
bart <bc@freeuk.com> writes:
On 13/01/2024 00:47, Keith Thompson wrote:
bart <bc@freeuk.com> writes:
It happens that once you have a working Makefile, it works equally
well either to rebuild a project after a single change, or to build
an entire project from scratch. Someone could probably create
a simpler version of "make" that doesn't look at dependencies,
and that always rebuilds everything. Such a tool would be worse
than "make" for building projects during development, and not
significantly better than "make" for building projects from scratch.
And that's ignoring the "-j" option, which allows "make" to execute
multiple steps in parallel. That works only because "make" knows
about dependencies, and it can result in a full build from scratch
finishing much more quickly. A simple script that just compiles
each file isn't likely to do that. You can typically specify a
maximum number of parallel jobs equal to the number of CPUs on your
build system, e.g., `make -j $(nproc)`.
That's a reasonable thing to do. But how does make do it? Can't a
compiler apply the same approach if N files have been submitted?
I've never looked into it.
It would be ironic that, if I was to write an application in C, it would
be 100% C with no other language involved. Not even any compiler options
(not with my compiler anyway).
bart <bc@freeuk.com> writes:
On 13/01/2024 04:17, Kaz Kylheku wrote:[...]
On 2024-01-13, bart <bc@freeuk.com> wrote:
That's a reasonable thing to do. But how does make do it? Can't a
compiler apply the same approach if N files have been submitted?
Yes. And in fact, languages with good module support like Modula-2
don't need external make utilities.
Finally somebody admitting that some languages may not need make as much.
Bart, seriously, what the hell are you talking about?
It's true that some languages don't need "make" as much as C does.
Nobody here has said otherwise, likely because other languages are
largely off-topic here in comp.lang.c.
By all means, don't use "make". Nobody wants you to use it.
It should have meanwhile become obvious that no one forces you to
use Makefiles.
- No one said you should be using it.
Why don't you just read about those two tools and learn
It's very interesting that despite your very small and restricted
experience you'd decide to follow the "not invented here" principle
instead of using long established, refined, and well accepted tools
that are already available (and even for free), and reliably work.
It's very interesting that despite your very small and restricted
experience
you'd decide to follow the "not invented here" principle
instead of using long established, refined, and well accepted tools
that are already available (and even for free), and reliably work.
bart <bc@freeuk.com> writes:
On 13/01/2024 21:42, Keith Thompson wrote:
bart <bc@freeuk.com> writes:
On 13/01/2024 04:17, Kaz Kylheku wrote:[...]
On 2024-01-13, bart <bc@freeuk.com> wrote:
That's a reasonable thing to do. But how does make do it? Can't a
compiler apply the same approach if N files have been submitted?
Yes. And in fact, languages with good module support like Modula-2
don't need external make utilities.
It's true that some languages don't need "make" as much as C does.
Finally somebody admitting that some languages may not need make as much.
Bart, seriously, what the hell are you talking about?
Nobody here has said otherwise, likely because other languages are
largely off-topic here in comp.lang.c.
Except 'make'? I get the impression that most programs written in C
have a large component written in 'make' too. A component you can't
always ignore since essential build info is encoded in it.
Most? I don't know. Many? Sure.
You wrote, "Finally somebody admitting that some languages may not need
make as much.". Has anyone here claimed otherwise? If not, why do you
find Kaz's statement so remarkable?
On 2024-01-12, bart <bc@freeuk.com> wrote:
I mainly remember the times when they do hang.
DOS/Windows stuff hangs also:
C:\Users\kazk>findstr foo
... "hang" ...
Even Microsoft clued in to the idea that a text filter shouldn't
spew extraneous diagnostics by default.
But with 'as', it just sits there. I wonder what it's waiting for; for
me to type in ASM code live from the terminal? (If 'as' is designed for
piped-in input, tdm/gcc doesn't appear to use that feature as I remember
it generating discrete, temporary .s files.)
gcc -pipe works in pipe mode.
The "as" command is intended for compiler use; not only is it not
an interactive assembler, it doesn't even have particularly good
diagnostics for batch use. You have to know what you're doing.
On 1/13/24 23:39, bart wrote:
It would be ironic that, if I was to write an application in C, it
would be 100% C with no other language involved. Not even any compiler
options (not with my compiler anyway).
Can your application be compiled on Linux, OpenBSD, Solaris,
HP-UX, FreeBSD, AIX, MacOSX and a few other commonly used
operating systems?
On 13/01/2024 23:26, Keith Thompson wrote:
bart <bc@freeuk.com> writes:
On 13/01/2024 21:42, Keith Thompson wrote:
bart <bc@freeuk.com> writes:
On 13/01/2024 04:17, Kaz Kylheku wrote:[...]
On 2024-01-13, bart <bc@freeuk.com> wrote:
That's a reasonable thing to do. But how does make do it? Can't a
compiler apply the same approach if N files have been submitted?
Yes. And in fact, languages with good module support like Modula-2
don't need external make utilities.
It's true that some languages don't need "make" as much as C does.
Finally somebody admitting that some languages may not need make as much.
Bart, seriously, what the hell are you talking about?
Nobody here has said otherwise, likely because other languages are
largely off-topic here in comp.lang.c.
Except 'make'? I get the impression that most programs written in C
have a large component written in 'make' too. A component you can't
always ignore since essential build info is encoded in it.
Most? I don't know. Many? Sure.
You wrote, "Finally somebody admitting that some languages may not need
make as much.". Has anyone here claimed otherwise? If not, why do you
find Kaz's statement so remarkable?
People have suggested using make for everything, from hello.c up to JP's
and SL's massive applications.
They have suggested using it for any language and even stuff which is
program code.
While 'make' conflates several kinds of processes: [...]
On 12/01/2024 19:15, Scott Lurndal wrote:[...]
What the FM documents. RTFM.
I see. So forget just having intuitive behaviour. [...]
Bart <bc@freeuk.cm> writes:[...]
I've told you (multiple times, for *years*) how to invoke gcc in
ISO C conforming mode *if that's what you want*.
bart <bc@freeuk.com> writes:[...]
You repeatedly react strongly to things nobody said. You invent
strawman arguments.
bart <bc@freeuk.com> writes:
On 12/01/2024 19:15, Scott Lurndal wrote:[...]
What the FM documents. RTFM.
I see. So forget just having intuitive behaviour. [...]
The problem is not what the behavior is. The problem is
with your intuition about what the behavior should be.
On 14/01/2024 17:54, Tim Rentsch wrote:
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
bart <bc@freeuk.com> writes:[...]
You repeatedly react strongly to things nobody said. You invent
strawman arguments.
That's what bart does. He continually misrepresents other
people's statements, to make them look stupid, so he can feel
superior. Only insecure people feel a need to perpetually brag
and to constantly run down everyone else's point of view.
Yes, sorry. I forgot that was your job.
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
bart <bc@freeuk.com> writes:[...]
You repeatedly react strongly to things nobody said. You invent
strawman arguments.
That's what bart does. He continually misrepresents other
people's statements, to make them look stupid, so he can feel
superior. Only insecure people feel a need to perpetually brag
and to constantly run down everyone else's point of view.
On 1/14/24 13:17, bart wrote:
On 14/01/2024 17:54, Tim Rentsch wrote:
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
bart <bc@freeuk.com> writes:[...]
You repeatedly react strongly to things nobody said. You invent
strawman arguments.
That's what bart does. He continually misrepresents other
people's statements, to make them look stupid, so he can feel
superior. Only insecure people feel a need to perpetually brag
and to constantly run down everyone else's point of view.
Yes, sorry. I forgot that was your job.
I'm new here, so please forgive my ignorance. What's yours?
I suppose what I'm asking is: What exactly is your goal here, Bart?
On 1/14/24 13:17, bart wrote:
On 14/01/2024 17:54, Tim Rentsch wrote:
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
bart <bc@freeuk.com> writes:[...]
You repeatedly react strongly to things nobody said. You invent
strawman arguments.
That's what bart does. He continually misrepresents other
people's statements, to make them look stupid, so he can feel
superior. Only insecure people feel a need to perpetually brag
and to constantly run down everyone else's point of view.
Yes, sorry. I forgot that was your job.
I'm new here, so please forgive my ignorance. What's yours?
I suppose what I'm asking is: What exactly is your goal here, Bart?
On 12/01/2024 21:31, Kaz Kylheku wrote:
On 2024-01-12, bart <bc@freeuk.com> wrote:
I mainly remember the times when they do hang.
DOS/Windows stuff hangs also:
C:\Users\kazk>findstr foo
... "hang" ...
Even Microsoft clued in to the idea that a text filter shouldn't
spew extraneous diagnostics by default.
If you type 'findstr' with no arguments, it reports an error. Maybe
'sort' was a better example to make your point.
But with 'as', it just sits there. I wonder what it's waiting for; for
me to type in ASM code live from the terminal? (If 'as' is designed for
piped-in input, tdm/gcc doesn't appear to use that feature as I remember
it generating discrete, temporary .s files.)
gcc -pipe works in pipe mode.
The "as" command is intended for compiler use; not only is it not
an interactive assembler, it doesn't even have particularly good
diagnostics for batch use. You have to know what you're doing.
You're sort of making excuses for it.
My 'aa' assembler was also
designed mainly for machine-generated code, so it has very few frills.
The syntax however is decent enough that I can use it for my inline
assembler too.
How about accepting some constructive criticism for a change, and
ADMITTING that the behaviour is rubbish, but it has to be accepted
because the way it works is hard-coded into too many tools to change it.
bart <bc@freeuk.com> writes:
On 14/01/2024 17:22, Tim Rentsch wrote:
bart <bc@freeuk.com> writes:
On 12/01/2024 19:15, Scott Lurndal wrote:[...]
What the FM documents. RTFM.
I see. So forget just having intuitive behaviour. [...]
The problem is not what the behavior is. The problem is
with your intuition about what the behavior should be.
I would love to know what behaviour of an assembler is intuitive to /you/.
Or anybody.
I would be surprised if that involved the brain-dead behaviour of
either naming every output 'a.out', so overwriting the file created 5
seconds previously, or spewing reams of binary code to a text terminal
sensitive to escape codes.
I already mentioned that GNU as doesn't write machine code to the
terminal. (I discussed problems that could occur *if it did*.) Did you
miss that? I know you're seeing at least *some* of my posts.
Yes, it writes its output to "a.out" by default, for historical reasons.
It also has an option to specify the name of the output file -- an
option that is almost always used in practice. Invoking the "as"
command directly is relatively rare.
On 2024-01-14, bart <bc@freeuk.com> wrote:
On 12/01/2024 21:31, Kaz Kylheku wrote:
On 2024-01-12, bart <bc@freeuk.com> wrote:
I mainly remember the times when they do hang.
DOS/Windows stuff hangs also:
C:\Users\kazk>findstr foo
... "hang" ...
Even Microsoft clued in to the idea that a text filter shouldn't
spew extraneous diagnostics by default.
If you type 'findstr' with no arguments, it reports an error. Maybe
'sort' was a better example to make your point.
So does grep with no arguments? I don't see where that is going.
In Unixes and the GNU Project, there has not been a focus on assembly language as a primary development language, with a great developer experience.
That's pretty much a fact.
The amount of material written in .s or .S files is very small.
My 'aa' assembler was also
designed mainly for machine-generated code, so it has very few frills.
The syntax however is decent enough that I can use it for my inline
assembler too.
GCC has great inline assembly.
You can reference C expressions, which
are evaluated to registers that the register allocator chooses, which
you can reference in your inline code in a symbolic way.
On 15/01/2024 00:34, Kaz Kylheku wrote:
In Unixes and the GNU Project, there has not been a focus on assembly
language as a primary development language, with a great developer
experience.
That's pretty much a fact.
That is extraordinary. Wasn't C first implemented in assembly? It's
always been a mainstay of computing as far as I can remember. Except no
one now writes whole apps in assembly. (I've done quite a few in the past.)
My 'aa' assembler was also
designed mainly for machine-generated code, so it has very few frills.
The syntax however is decent enough that I can use it for my inline
assembler too.
GCC has great inline assembly.
You can reference C expressions, which
are evaluated to registers that the register allocator chooses, which
you can reference in your inline code in a symbolic way.
GCC inline assembly looks absolutely diabolic. I take it you've never
seen it done properly?
Actually I spent 5-10 minutes looking for examples, to try and figure
out if asm instructions could in fact directly refer to symbols in the HLL.
But most examples were one or two lines of weird syntax, followed by
some interfacing code. So I don't know.
If /I/ had to write extensive programs in gcc inline assembly, then put
a gun to my head now!
Take this example in C:
int a;
void F(void) {
int b=2, c=3;
static int d=4;
a = b + c * d;
}
I will now show it in my language but with that assignment replaced by
inline assembly:
int a
proc F=
int b:=2, c:=3
static int d=4
assem
mov rax, [c] # (note my ints are 64 bits)
imul2 rax, [d]
add rax, [b]
mov [a], rax
end
end
My question is: what would the C version look like with that line in gcc inline assembly? (In both cases, 'a' should end up with the value 14.)
You need to tell me, because I will otherwise not have a clue. From what
I've seen of gcc inline asm:
* Code has to be written within string literals, in dreadful AT&T
syntax. And apparently even with embedded \n line breaks. (Good
grief - I think early 80s BASICs had more sophisticated facilities!)
* You mostly use offsets to get at local variables
* You apparently aren't allowed to use just any registers as you need
to negotiate with gcc so as not to interfere with /its/ use of
registers. So most examples I saw seemed to deal with this.
I consider that when writing assembly, YOU are in charge not the
compiler.
As you can see from mine:
* It is written just as it would be in an actual ASM file
* You can refer to variables directly (the compiler will add what is
needed to access locals or statics)
If a function uses inline ASM, variables are kept in memory not
registers. (I might allow that at some point.) Most such functions
however contain only ASM.
On 2024-01-15, bart <bc@freeuk.com> wrote:
On 15/01/2024 00:34, Kaz Kylheku wrote:
In Unixes and the GNU Project, there has not been a focus on assembly
language as a primary development language, with a great developer
experience.
That's pretty much a fact.
That is extraordinary. Wasn't C first implemented in assembly? It's
No; C would have been implemented in NB (new B). It was B that was implemented in assembly. That's just bootstrapping, though.
Thompson and Ritchie didn't have a nice assembler; IIRC, they started
out by assembling code using macros in the TECO editor.
Assembly language has never been emphasized in Unix, to my best
knowledge. It's there.
always been a mainstay of computing as far as I can remember. Except no
one now writes whole apps in assembly. (I've done quite a few in the past.)
I did a bunch of assembly language programming, which was with
"nice" assemblers. At university, I made a linked list library with
numerous functions on a Sun 3 (68K) using Sun's "as". That was my
first encounter with Unix's idea of assembly. I got it done, but it
was pretty horrible, with next to no diagnostics when there was
something wrong. It was obvious that the tool assumes correct input,
coming from a compiler.
My 'aa' assembler was also
designed mainly for machine-generated code, so it has very few frills.
The syntax however is decent enough that I can use it for my inline
assembler too.
GCC has great inline assembly.
You can reference C expressions, which
are evaluated to registers that the register allocator chooses, which
you can reference in your inline code in a symbolic way.
GCC inline assembly looks absolutely diabolic. I take it you've never
seen it done properly?
Actually I spent 5-10 minutes looking for examples, to try and figure
out if asm instructions could in fact directly refer to symbols in the HLL.
But most examples were one or two lines of weird syntax, followed by
some interfacing code. So I don't know.
If /I/ had to write extensive programs in gcc inline assembly, then put
a gun to my head now!
Take this example in C:
int a;
void F(void) {
int b=2, c=3;
static int d=4;
a = b + c * d;
}
I will now show it in my language but with that assignment replaced by
inline assembly:
int a
proc F=
int b:=2, c:=3
static int d=4
assem
mov rax, [c] # (note my ints are 64 bits)
imul2 rax, [d]
add rax, [b]
mov [a], rax
end
Problem is that the compiler's register allocator now has to be informed that the assembly language part is using rax and work around it.
end
My question is: what would the C version look like with that line in gcc
inline assembly? (In both cases, 'a' should end up with the value 14.)
Let's make it more interesting: what if b and c come from arguments,
and the static int d actually has state that changes between
invocations, so it can't be optimized away. Let's return the
result, a:
int F(int b, int c)
{
static int d=4;
int a;
d++;
asm("imul %3, %2\n\t"
"add %2, %1\n\t"
"mov %1, %0\n\t"
: "=r" (a)
: "r" (b), "r" (c), "r" (d));
return a;
}
On 12/01/2024 18:02, Janis Papanagnou wrote:
On 12.01.2024 18:09, bart wrote:
On 12/01/2024 16:34, David Brown wrote:
It looks like 'make' is competing with 'bash' then!
Why don't you just read about those two tools and learn, instead
of repeatedly spouting such stupid statements of ignorance.
Because I've repeatedly said I don't need them. Why can't you accept that?
On 13/01/2024 21:42, Keith Thompson wrote:
bart <bc@freeuk.com> writes:
On 13/01/2024 04:17, Kaz Kylheku wrote:[...]
On 2024-01-13, bart <bc@freeuk.com> wrote:
It's true that some languages don't need "make" as much as C does.
Nobody here has said otherwise, likely because other languages are
largely off-topic here in comp.lang.c.
Except 'make'? I get the impression that most programs written in C have
a large component written in 'make' too. A component you can't always
ignore since essential build info is encoded in it.
bart <bc@freeuk.com> writes:
If I recall the ongoing thread, there were two "indefendable" statements:
a) make is useless and cryptic
b) gcc's outputing of binaries to a.out by default is useless and
cryptic
Since *a* has been explained already. I'll just give my two cents on
*b*.
When I'm learning to program, I tend to have a lot of source files in the
same repository. I don't want the binaries, I just want to play with
the source and sometimes compile them and see if they compile correctly
and the behavior is correct. Outputting the binary to a.out by default instead of "hello.o" is sort of useful here. For two reasons:
1. I don't have the overhead of remembering what I called that source
file in that particular moment when I wrote it. I know I have to call
./a.out
and that's it.
2. It doesn't crowd my directory with lots of useless binaries.
On 12/01/2024 16:34, David Brown wrote:
On 12/01/2024 17:12, bart wrote:
I don't really understand what you are trying to say here. Are you
suggesting that your 40 year old DOS IDE is equivalent to modern IDE's ?
No. Only that the way I normally work hasn't changed a great deal.
Are you trying to say you can use your own tools for your own
language, and rely on a simple script for C compilation, and can't
handle anything else in a build process?
I'm separating the fundamental build-process for a program from what is needed for interactive development and testing.
Makefiles you see supplied with open source projects don't generally
make that distinction.
But I can see you're struggling with the concept of simplicity.
download to a target board via a debugger,
(Hey, I used to do that! Not a makefile in sight either; how is that possible?
I used to do that with no special tools, no external software and no
external languages. I had to write assemblers for any new devices I had
to use.)
and all sorts of other bits and pieces according to the needs of the
project. One makefile beats a dozen scripts.
It looks like 'make' is competing with 'bash' then!
I have no idea why you think that - except perhaps because you still
have no concept of what "make" is and what it does, and think it is
just a script with a complicated syntax.
So, what the hell is it then? What makes it so special compared with any other scripting language?
All I can see is that it can create dependency graphs between files -
which have to be determined from info that you provide in the file, it's
not that clever - and can use that to avoid recompilation etc of a file unless its dependencies have changed.
That is something I've never needed done automatically in my own work (I
do it manually as I will know my projects intimately when I'm working
with them).
If you're curious about what 'as' expects and speculatively try 'as
--help', it displays 167 dense lines.
Are you trying to convince people that your assembler is better than
gas because yours has fewer features? Bizarre.
It's simpler and gives a simple summary of how it works.
Clearly your idea of 'better' is to be vastly more complicated.
I guess an assembler which will only work for the processor you happen
to be working with is no good at all. It has to also support dozens that
are not relevant to the task in hand.
BTW my assembler can directly produce EXE and DLL files; in that regard
it IS better than 'as' and probably most others which like to off-load
that critical bit.
bart <bc@freeuk.com> writes:
[snip]
I'm new here, so please forgive my ignorance. What's yours?
I suppose what I'm asking is: What exactly is your goal here, Bart?
What is anyone's goal here?
This newsgroup is just a bunch of old-timers who mainly discuss the
finer points of the C standard, although the group has been more or
less dead for the past couple of years.
Hello there, everyone, this is actually my first post on Usenet at
all. I started configuring and reading it like two days ago. It is not
really a test since I'd like to reply to some content here.
This group
seems rather lively, I guess it is partly thanks to you, Bart.
Since there are rarely any people who post about any practical
problems, I think they mainly use stackoverflow and reddit for
that. The regulars mainly just argue amongst themselves.
I'm kind of young and getting into programming, in C mainly, because I
like backend stuff in general. I dislike going to stackoverflow or
reddit. I reckon these are useful platforms but not open to free
discussion like it is the case here in Usenet. So this newsgroup is not
dead and I find it still useful.
I'm somewhat of an outsider and I like to point out things that I
believe are wrong in things like the C language and the assorted
collection of Unix-specific tools that apparently go with it.
I'm an outsider because I don't routinely use C, or any of the tools,
and because I don't have background in Unix-based development. I have
a different perspective.
This may get on the nerve of some people. But it is interesting. Like
when you made a point about the redirection (>) of binary output to
stdout, the people that corrected your assumption of this operator
taught me a lot about how the redirection of stdin works.
So, what happens here is that I mention something that might be some
bizarre quirk of C, or some weird, unfriendly way some tool works, or
anything that goes against common-sense or intuition, or that I find
has caused me grief.
And then the regulars, instead of agreeing, go on the defensive. Some
go on the attack, saying I'm the one at fault, I should RTFM, or do
this or that, or that I'm ignorant, etc etc. (You've seen the thread.)
So since I have nothing better to do**, I like to defend myself. And
sometimes it is fascinating seeing people defend the indefendable.
If I recall the ongoing thread, there were two "indefendable" statements:
a) make is useless and cryptic
b) gcc's outputing of binaries to a.out by default is useless and
cryptic
Since *a* has been explained already. I'll just give my two cents on
*b*.
When I'm learning to program, I tend to have a lot of source files in the
same repository. I don't want the binaries, I just want to play with
the source and sometimes compile them and see if they compile correctly
and the behavior is correct. Outputting the binary to a.out by default instead of "hello.o" is sort of useful here. For two reasons:
1. I don't have the overhead of remembering what I called that source
file in that particular moment when I wrote it. I know I have to call
./a.out and that's it.
2. It doesn't crowd my directory with lots of useless binaries.
All people need to do is be honest.
I agree.
Does that answer your question?
(** That's not quite true, this is taking me away from my current
project. But when people openly insult me, I can't let it go. They
need to stop replying.)
You are then on a long quest. Good luck.
On 13/01/2024 23:26, Keith Thompson wrote:
bart <bc@freeuk.com> writes:
On 13/01/2024 21:42, Keith Thompson wrote:
bart <bc@freeuk.com> writes:
On 13/01/2024 04:17, Kaz Kylheku wrote:[...]
On 2024-01-13, bart <bc@freeuk.com> wrote:
That's a reasonable thing to do. But how does make do it? Can't a
compiler apply the same approach if N files have been submitted?
Yes. And in fact, languages with good module support like Modula-2
don't need external make utilities.
Bart, seriously, what the hell are you talking about?
Finally somebody admitting that some languages may not need make as
much.
It's true that some languages don't need "make" as much as C does.
Nobody here has said otherwise, likely because other languages are
largely off-topic here in comp.lang.c.
Except 'make'? I get the impression that most programs written in C
have a large component written in 'make' too. A component you can't
always ignore since essential build info is encoded in it.
Most? I don't know. Many? Sure.
You wrote, "Finally somebody admitting that some languages may not need
make as much.". Has anyone here claimed otherwise? If not, why do you
find Kaz's statement so remarkable?
People have suggested using make for everything, from hello.c up to JP's
and SL's massive applications.
On 15/01/2024 08:51, Gabriel Rolland wrote:
bart <bc@freeuk.com> writes:
If I recall the ongoing thread, there were two "indefendable" statements:
OK, whatever the actual spelling of 'indefendable' might be...
a) make is useless and cryptic
My 'C and Make' thread showed a clear example of it being used
gratuitously. I contend that that happens a lot.
b) gcc's outputing of binaries to a.out by default is useless and
cryptic
Since *a* has been explained already. I'll just give my two cents on
*b*.
When I'm learning to program, I tend to have a lot of source files in the
same repository. I don't want the binaries, I just want to play with
the source and sometimes compile them and see if they compile correctly
and the behavior is correct. Outputting the binary to a.out by default
instead of "hello.o" is sort of useful here. For two reasons:
1. I don't have the overhead of remembering what I called that source
file in that particular moment when I wrote it. I know I have to call
./a.out
Hang on: are you generating 'a.out' the object file, or 'a.out' the executable file? (Because ./a.out will execute the file.)
Here is where Unix/Linux's treatment of file extensions does my head in. 'a.out' is used there for both kinds of file. To find out what it
actually is, you have to look inside the file, which defeats the purpose
of having a file extension at all.
and that's it.
2. It doesn't crowd my directory with lots of useless binaries.
The problems of always having the same a.exe/a.out output (here it is
the executable file - see, I have to keep disambiguating!) are multiple:
* If you're working with several small one-file programs c, d, and e say,
you want them compiled as c.exe, d.exe and e.exe. Having them all be
a.exe is not going to work; which of c, d, e does it correspond to?
Suppose you want to run c, d, e one after the other?
* You might be testing (as I do), multiple compilers on the same c.c.
The first produces c.exe; you test it. Compile with the second to make a
new c.exe; you test that. Compile with gcc to make a new ... a.exe. Now
you have to remember it's a different executable.
(The number of times I've forgotten that and run c.exe instead, and
thought gcc's code wasn't quite as fast as I'd expected...).
On 1/14/2024 9:47 PM, Keith Thompson wrote:
"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> writes:
On 1/14/2024 2:58 PM, Keith Thompson wrote:
[...]
I already mentioned that GNU as doesn't write machine code to the
terminal. (I discussed problems that could occur *if it did*.) Did you
miss that? I know you're seeing at least *some* of my posts.
Yes, it writes its output to "a.out" by default, for historical
reasons.
It also has an option to specify the name of the output file -- an
option that is almost always used in practice. Invoking the "as"
command directly is relatively rare.
Rare until you have to use it.
And still rare after you have to use it. Are you using the word "rare"
in some non-standard sense?
Fwiw, keep in mind that this was well
before C++11. Take the sensitive sync algorithms (compiler
reorderings, etc.) out of the realm of C/C++ and code them up using
assembly language. The declarations can be in C with CDECL ABI.
How is any of that relevant? Are you saying that invoking the "as"
command directly *isn't* relatively rare?
I am saying that I had to create my own sync primitives in pure assembly language back then. So, I would use as to assemble them. No problem.
Nothing strange, just that I had to do it. Whether or not that is rare,
well, that's another story? The commands were in a makefile anyway. Is
that rare?
Even if you needed to invoke "as" directly (and not, for example, via
"gcc"), it's still trivial to tell it the names of the input and output
files. I just typed "as hello.s -o hello.o" on my system, and it worked
just fine. And "gcc -c hello.s" did essentially the same thing.
Well, I used as directly. Just like with MASM. That is rare? ;^)
On 2024-01-15, bart <bc@freeuk.com> wrote:
GCC has great inline assembly.
My question is: what would the C version look like with that line in gcc
inline assembly? (In both cases, 'a' should end up with the value 14.)
Let's make it more interesting: what if b and c come from arguments,
and the static int d actually has state that changes between
invocations, so it can't be optimized away. Let's return the
result, a:
int F(int b, int c)
{
static int d=4;
int a;
d++;
asm("imul %3, %2\n\t"
"add %2, %1\n\t"
"mov %1, %0\n\t"
: "=r" (a)
: "r" (b), "r" (c), "r" (d));
return a;
}
It's pretty arcane in that the material is both in string literals,
and not. The assembly language template is textual; the compiler knows nothing about its interior.
I specified one output operand, and three input operands, requesting
that they be in registers. I don't specify the register identities.
They are referenced by number: %0, %1, %2, %3 in the order they
appear. (A way to use named references exists.)
25: 0f af d1 imul %ecx,%edx
28: 01 d0 add %edx,%eax
2a: 89 c0 mov %eax,%eax
f: 0f af f0 imul %eax,%esi
12: 01 f7 add %esi,%edi
14: 89 f8 mov %edi,%eax
The static variable is accessed relative to the instruction pointer.
The offset is all zeros: that will be patched when this is linked.
Note that between the unoptimized and optimized code, the register
identities changed entirely.
GCC's inline assembly feature is largely agnostic of the assembler
back end. It interfaces with register allocation and such, but is
otherwise generic. This allows it to have exactly the same grammar,
no matter the architecture target.
The syntax isn't particularly nice, but it has power.
You need to tell me, because I will otherwise not have a clue. From what
I've seen of gcc inline asm:
* Code has to be written within string literals, in dreadful AT&T
syntax. And apparently even with embedded \n line breaks. (Good
grief - I think early 80s BASICs had more sophisticated facilities!)
The code template, after the registers like %0 and %1 are substituted
into it, is just shoved into the assembly language output verbatim.
The compiler doesn't analyze the interior.
AT&T syntax is used if that's what the assembler requires.
Not all GCC targets have assemblers in whose language the destination operand is on the right; it's that way for the x86 family though.
* You mostly use offsets to get at local variables
Nope; it's pretty much transparent.
I consider that when writing assembly, YOU are in charge not the
compiler.
If you're writing *inline* assembly in compiled code, if you let
the compiler be in charge of some things, it's a lot better.
On 15/01/2024 07:07, Kaz Kylheku wrote:
On 2024-01-15, bart <bc@freeuk.com> wrote:
asm("imul %3, %2\n\t"
"add %2, %1\n\t"
"mov %1, %0\n\t"
: "=r" (a)
: "r" (b), "r" (c), "r" (d));
return a;
}
But you've also made it clear that this isn't really assembly at all. Is
it even for x64? I can't tell! The use of 'mov' rather than 'ldr' or
'str' suggests it is x86 or x64 rather than ARM. But are those 32-bit
mov's or 64-bit?
So it's some sort of hideous hybrid that gcc has come up with, that is neither assembly nor C.
It would be far, far simpler to write assembly in its own file (and keep
well away from AT&T syntax),
bart <bc@freeuk.com> writes:
This may get on the nerve of some people. But it is interesting. Like
when you made a point about the redirection (>) of binary output to
stdout, the people that corrected your assumption of this operator
taught me a lot about how the redirection of stdin works.
On 2024-01-15, bart <bc@freeuk.com> wrote:
On 15/01/2024 00:34, Kaz Kylheku wrote:
I will now show it in my language but with that assignment replaced by
inline assembly:
int a
proc F=
int b:=2, c:=3
static int d=4
assem
mov rax, [c] # (note my ints are 64 bits)
imul2 rax, [d]
add rax, [b]
mov [a], rax
end
Problem is that the compiler's register allocator now has to be informed that the assembly language part is using rax and work around it.
end
My question is: what would the C version look like with that line in gcc
inline assembly? (In both cases, 'a' should end up with the value 14.)
Let's make it more interesting: what if b and c come from arguments,
and the static int d actually has state that changes between
invocations, so it can't be optimized away.
Let's return the
result, a:
int F(int b, int c)
{
static int d=4;
int a;
d++;
asm("imul %3, %2\n\t"
"add %2, %1\n\t"
"mov %1, %0\n\t"
: "=r" (a)
: "r" (b), "r" (c), "r" (d));
return a;
}
It's pretty arcane in that the material is both in string literals,
and not. The assembly language template is textual; the compiler knows nothing about its interior.
I specified one output operand, and three input operands, requesting
that they be in registers. I don't specify the register identities.
They are referenced by number: %0, %1, %2, %3 in the order they
appear. (A way to use named references exists.)
GNU inline assembly is ugly, but it's very well designed semantically;
it hits the target.
bart <bc@freeuk.com> writes:
But you've also made it clear that this isn't really assembly at all. Is
it even for x64? I can't tell! The use of 'mov' rather than 'ldr' or
'str' suggests it is x86 or x64 rather than ARM. But are those 32-bit
mov's or 64-bit?
1) it's not an assembler, it's a way to "patch" the assembler code
generated by the compiler.
So it's some sort of hideous hybrid that gcc has come up with, that is
neither assembly nor C.
Your baseless opinion noted. Feel free to submit a patch to
the GCC team to implement your preferred syntax for inline assembler.
Do ensure that it works for all 100 of the GCC target architectures.
It would be far, far simpler to write assembly in its own file (and keep
well away from AT&T syntax),
No, it would not be.
And the AT&T syntax is _far_ superior to the intel syntax
with all the ugly syntactic sugar.
val = tp->mem_read(psource, len);
rax = get_reg_value(regs, REG_RAX, QUAD);
__asm__ __volatile__ (
"testl $1, %1\n\t" // 1 byte operand?
"jne 1f\n\t" // Yes, go to it
"testl $8, %1\n\t" // Was it eight bytes?
"jne 2f\n\t" // Yup. Done.
"testl $4, %1\n\t" // Was it 4 bytes?
"je 3f\n\t" // no, Try 2 bytes
"movsx %%ebx, %%rbx\n\t" // Sign extend 32-bits to 64
"cdqe\n\t" // Sign extend EAX to RAX
"jmp 2f\n" // Done here
"3:\tmovsx %%bx, %%rbx\n\t" // Sign extend 16-bits to 64
"movsx %%ax, %%rax\n\t" // Sign extend AX TO RAX
"jmp 2f\n" // Done here
"1:\tmovsx %%bl, %%rbx\n" // Sign extend BL to RBX
"movsx %%al, %%rax\n" // Sign extend AL to RAX
"2:\tsub %%rbx, %%rax\n" // Subtract from comparison value
"pushfq\n\t" // Save the flags from the sub
"popq %0\n\t"
:"=r"(flags_word)
:"r"(len), "b"(val), "a"(rax));
On 15/01/2024 00:34, Kaz Kylheku wrote:
On 2024-01-14, bart <bc@freeuk.com> wrote:
On 12/01/2024 21:31, Kaz Kylheku wrote:
On 2024-01-12, bart <bc@freeuk.com> wrote:
GCC has great inline assembly.
You can reference C expressions, which
are evaluated to registers that the register allocator chooses, which
you can reference in your inline code in a symbolic way.
GCC inline assembly looks absolutely diabolic.
I take it you've never
seen it done properly?
Actually I spent 5-10 minutes looking for examples, to try and figure
out if asm instructions could in fact directly refer to symbols in the HLL.
But most examples were one or two lines of weird syntax, followed by
some interfacing code. So I don't know.
If /I/ had to write extensive programs in gcc inline assembly, then put
a gun to my head now!
Take this example in C:
int a;
void F(void) {
int b=2, c=3;
static int d=4;
a = b + c * d;
}
I will now show it in my language but with that assignment replaced by
inline assembly:
int a
proc F=
int b:=2, c:=3
static int d=4
assem
mov rax, [c] # (note my ints are 64 bits)
imul2 rax, [d]
add rax, [b]
mov [a], rax
end
end
My question is: what would the C version look like with that line in gcc inline assembly? (In both cases, 'a' should end up with the value 14.)
You need to tell me, because I will otherwise not have a clue.
From what
I've seen of gcc inline asm:
* Code has to be written within string literals,
in dreadful AT&T
syntax.
And apparently even with embedded \n line breaks. (Good
grief - I think early 80s BASICs had more sophisticated facilities!)
* You mostly use offsets to get at local variables
* You apparently aren't allowed to use just any registers as you need
to negotiate with gcc so as not to interfere with /its/ use of
registers. So most examples I saw seemed to deal with this.
I consider that when writing assembly, YOU are in charge not the
compiler. As you can see from mine:
* It is written just as it would be in an actual ASM file
* You can refer to variables directly (the compiler will add what is
needed to access locals or statics)
If a function uses inline ASM, variables are kept in memory not
registers.
(I might allow that at some point.) Most such functions
however contain only ASM.
That still lets ASM use the facilities of the HLL such as functions, declarations, named constants, scopes etc.
I suppose you're going to suggest that gcc's facilities are superior...
On 2024-01-15, Kaz Kylheku <433-929-6894@kylheku.com> wrote:
On 2024-01-15, bart <bc@freeuk.com> wrote:
On 15/01/2024 00:34, Kaz Kylheku wrote:
In Unixes and the GNU Project, there has not been a focus on assembly
language as a primary development language, with a great developer
experience.
That's pretty much a fact.
That is extraordinary. Wasn't C first implemented in assembly? It's
No; C would have been implemented in NB (new B). It was B that was
implemented in assembly. That's just bootstrapping, though.
Thompson and Ritchie didn't have a nice assembler; IIRC, they started
out by assembling code using macros in the TECO editor.
Assembly language has never been emphasized in Unix, to my best
knowledge. It's there.
always been a mainstay of computing as far as I can remember. Except no
one now writes whole apps in assembly. (I've done quite a few in the past.)
I did a bunch of assembly language programming, which was with
"nice" assemblers. At university, I made a linked list library with
numerous functions on a Sun 3 (68K) using Sun's "as". That was my
first encounter with Unix's idea of assembly. I got it done, but it
was pretty horrible, with next to no diagnostics when there was
something wrong. It was obvious that the tool assumes correct input,
coming from a compiler.
My 'aa' assembler was also
designed mainly for machine-generated code, so it has very few frills.
The syntax however is decent enough that I can use it for my inline
assembler too.
GCC has great inline assembly.
You can reference C expressions, which
are evaluated to registers that the register allocator chooses, which
you can reference in your inline code in a symbolic way.
GCC inline assembly looks absolutely diabolic. I take it you've never
seen it done properly?
Actually I spent 5-10 minutes looking for examples, to try and figure
out if asm instructions could in fact directly refer to symbols in the HLL.
But most examples were one or two lines of weird syntax, following by
some interfacing code. So I don't know.
If /I/ had to write extensive programs in gcc inline assembly, then put
a gun to my head now!
Take this example in C:
int a;
void F(void) {
int b=2, c=3;
static int d=4;
a = b + c * d;
}
I will now show it in my language but with that assignment replaced by
inline assembly:
int a
proc F=
int b:=2, c:=3
static int d=4
assem
mov rax, [c] # (note my ints are 64 bits)
imul2 rax, [d]
add rax, [b]
mov [a], rax
end
Problem is that the compiler's register allocator now has to be informed that
the assembly language part is using rax and work around it.
end
My question is: what would the C version look like with that line in gcc inline assembly? (In both cases, 'a' should end up with the value 14.)
Let's make it more interesting: what if b and c come from arguments,
and the static int d actually has state that changes between
invocations, so it can't be optimized away. Let's return the
result, a:
int F(int b, int c)
{
static int d=4;
int a;
d++;
asm("imul %3, %2\n\t"
"add %2, %1\n\t"
"mov %1, %0\n\t"
: "=r" (a)
: "r" (b), "r" (c), "r" (d));
return a;
}
We can also turn this multiply and add into a stand-alone primitive
that we can put behind a macro:
#define mul_add(x, y, z) \
({ int _res; \
asm("imul %3, %2\n\t" \
"add %2, %1\n\t" \
"mov %1, %0\n\t" \
: "=r" (_res) \
: "r" (x), "r" (y), "r" (z)); \
_res; })
Which we then freely use like this:
int F(int b, int c)
{
static int d=4;
int a;
d++;
a = mul_add(b, c, d);
return a;
}
Complex example:
int G(int a, int b, int c, int d, int e, int f, int g, int h, int i)
{
return mul_add(mul_add(a, b, c),
mul_add(d, mul_add(e, f, g), h),
i);
}
gcc -O2 code:
0000000000000020 <G>:
20: 0f af f2 imul %edx,%esi
23: 01 f7 add %esi,%edi
25: 89 ff mov %edi,%edi
27: 8b 44 24 08 mov 0x8(%rsp),%eax
2b: 8b 54 24 10 mov 0x10(%rsp),%edx
2f: 44 0f af c8 imul %eax,%r9d
33: 45 01 c8 add %r9d,%r8d
36: 44 89 c0 mov %r8d,%eax
39: 0f af c2 imul %edx,%eax
3c: 01 c1 add %eax,%ecx
3e: 89 c9 mov %ecx,%ecx
40: 8b 44 24 18 mov 0x18(%rsp),%eax
44: 0f af c8 imul %eax,%ecx
47: 01 cf add %ecx,%edi
49: 89 f8 mov %edi,%eax
4b: c3 retq
GCC inline assembly is good if you have certain instructions that the compiler
doesn't use, and you'd like to use them as first class primitives (meaning that
they are not disadvantaged compared to primitives the compiler knows about).
We would likely obtain better code if we unbundled the multiplication
and addition by writing separate mul and add primitives.
Because the code above has to follow the rigid template where imul
is immediately followed by the related add, after which there is
a mandatory mov.
We really want:
#define mul_add(x, y, z) add(x, mul(y, z))
where we separately write the add and mul as inline assembly fragments.
A mul_add primitive might make sense if the processor had such a thing
in one instruction.
With GCC inline assembly, you want to only put the essentials into it:
only do what is necessary and only bundle together what must be
bundled.
On 13/01/2024 21:42, Keith Thompson wrote:
bart <bc@freeuk.com> writes:
On 13/01/2024 04:17, Kaz Kylheku wrote:[...]
On 2024-01-13, bart <bc@freeuk.com> wrote:
It's true that some languages don't need "make" as much as C does.
Nobody here has said otherwise, likely because other languages are
largely off-topic here in comp.lang.c.
Except 'make'? I get the impression that most programs written in C have
a large component written in 'make' too. ...
... A component you can't always
ignore since essential build info is encoded in it.
On 15/01/2024 08:40, Kaz Kylheku wrote:
A mul_add primitive might make sense if the processor had such a thing
in one instruction.
With GCC inline assembly, you want to only put the essentials into it:
only do what is necessary and only bundle together what must be
bundled.
Yes. For inline assembly, simple is not really important, but small is beautiful! Say exactly what you mean, no more and no less, and let the compiler do what it does best.
And move to C++20. Then you can write (for the mythical madd
instruction) :
constexpr int mul_add(int x, int y, int z)
__attribute__((const))
{
if (std::is_constant_evaluated()) {
return x + y * z;
} else {
asm("madd %[x], %[y], %[z]"
: [x] "+r" (x) : [y] "g" (y), [z] "g" (z));
return x;
}
}
Now the compiler can evaluate at compile time if the parameters are
known, or using the super-efficient madd instruction at runtime if
necessary.
On 15/01/2024 14:35, Scott Lurndal wrote:
bart <bc@freeuk.com> writes:
But you've also made it clear that this isn't really assembly at all. Is >>> it even for x64? I can't tell! The use of 'mov' rather than 'ldr' or
'str' suggests it is x86 or x64 rather than ARM. But are those 32-bit
mov's or 64-bit?
1) it's not an assembler, it's a way to "patch" the assembler code
generated by the compiler.
So it's some sort of hideous hybrid that gcc has come up with, that is
neither assembly nor C.
Your baseless opinion noted. Feel free to submit a patch to
the GCC team to implement your preferred syntax for inline assembler.
Do ensure that it works for all 100 of the GCC target architectures.
Which all share the same instruction set?
You're having a laugh, surely?
AT&T is bad enough even without the
travesty of it displayed here:
"jmp 2f\n"
"3:\tmovsx %%bx, %%rbx\n\t"
What's with the strings, newline and tab escapes? What's that 'f' for?
On 15/01/2024 03:14, bart wrote:
The only compiler I used with an inline
assembly of comparable power and integration to gcc's, but a different syntax, was Diab Data.
bart <bc@freeuk.com> writes:
Here is where Unix/Linux's treatment of file extensions does my head
in. 'a.out' is used there for both kinds of file. To find out what it
actually is, you have to look inside the file, which defeats the
purpose of having a file extension at all.
The executable. Unix's treatment of file extensions is nonexistent,
This is a supposition on your part. If gcc/cc worked as you said,
e.g. turning hello.c into hello.exe by default, then I would
maybe have considered aliasing gcc thus:
alias cc='gcc -o current.exe'
bart <bc@freeuk.com> writes:
You're having a laugh, surely?
No. I'm serious. [DWORD] is useless cruft.
AT&T is bad enough even without the
travesty of it displayed here:
"jmp 2f\n"
"3:\tmovsx %%bx, %%rbx\n\t"
If you don't understand the standard C escapes, you really
should go back to read the standard carefully.
What's with the strings, newline and tab escapes? What's that 'f' for?
RTFM for the architecture dependent assembler that the compiler
driver will end up calling to build the output object file.
On 15/01/2024 16:40, Gabriel Rolland wrote:
bart <bc@freeuk.com> writes:
This has been answered better than I could by David Brown.
I'm not reading DB's comments today.
The executable. Unix's treatment of file extensions is nonexistent,
true. So I don't bother with file extensions much. But I have yet to go
on comp.os.windows to explain how confusing file extensions are.
This is not a Linux newsgroup. It's a language group.
But I expect you do use file extensions.
Otherwise any file in one of
your folders could be absolutely anything: text, source in any language,
document, binary data, object file, executable ...
I imagine if 'abc' were a C source file, you'd have to put the object
file 'abc' in a separate folder, and the executable 'abc' in a third
one? Otherwise they will clash.
Programs that deal with them will of course check that they are what
they say they are.
But because of this stupid quirk in gcc that could have been fixed in
three lines of code, millions of people have to add extra instructions:
gcc myprog.c -o myprog
You don't get this in other languages.
bart <bc@freeuk.com> writes:
This has been answered better than I could by David Brown.
The executable. Unix's treatment of file extensions is nonexistent,
true. So I don't bother with file extensions much. But I have yet to go
on comp.os.windows to explain how confusing file extensions are.
This is a supposition on your part. If gcc/cc worked as you said,
e.g. turning hello.c into hello.exe by default, then I would
maybe have considered aliasing gcc thus:
alias cc='gcc -o current.exe'
So I wouldn't have that extra hassle.
But that would belong to my
'.profile' opinions and everybody can have his own '.profile' opinion
about how gcc should work. I respect that.
bart <bc@freeuk.com> writes:
On 15/01/2024 16:40, Gabriel Rolland wrote:
bart <bc@freeuk.com> writes:
This has been answered better than I could by David Brown.
I'm not reading DB's comments today.
Your loss.
The executable. Unix's treatment of file extensions is nonexistent,
true. So I don't bother with file extensions much. But I have yet to go
on comp.os.windows to explain how confusing file extensions are.
This is not a Linux newsgroup. It's a language group.
But I expect you do use file extensions.
Sure, for convenience, not necessity. Some very useful tools,
such as make, have default rules for common file name suffixes,
but there's nothing that prevents one from supplying their own
rules.
Otherwise any file in one of
your folders could be absolutely anything: text, source in any language,
document, binary data, object file, executable ...
It's a directory, not a folder.
And yes, any file in a directory can be pretty much anything the
user wants it to be. c_code_main is a perfectly legal name for
a C source file
while b235 is a perfectly legal name for the
executable file generated by a C compiler from the source file
c_code_main.
I imagine if 'abc' were a C source file, you'd have to put the object
file 'abc' in a separate folder, and the executable 'abc' in a third
one? Otherwise they will clash.
$ cat /tmp/z
#include <stdio.h>
int
main(int argc, const char **argv, const char **envp, const char **auxv)
{
printf("Hello World\n");
return 0;
}
$ cat /tmp/z | cc -x c -o a -
$ ./a
Hello World
$
Programs that deal with them will of course check that they are what
they say they are.
And how, exactly, will they do that for a text file?
On 15/01/2024 19:12, Scott Lurndal wrote:
Programs that deal with them will of course check that they are what
they say they are.
And how, exactly, will they do that for a text file?
How does Linux deal with that same problem?
On 15/01/2024 17:35, Scott Lurndal wrote:
bart <bc@freeuk.com> writes:
You're having a laugh, surely?
No. I'm serious. [DWORD] is useless cruft.
I don't use [DWORD], whatever that means.
Meanwhile %% in front of every register name, f after a label, and ""
and \n and \t on every line is useful cruft!
AT&T is bad enough even without the
travesty of it displayed here:
"jmp 2f\n"
"3:\tmovsx %%bx, %%rbx\n\t"
If you don't understand the standard C escapes, you really
should go back to read the standard carefully.
I understand C escape codes. I am asking WHAT THE FUCK ARE THEY DOING IN EVERY LINE OF AN ASSEMBLY PROGRAM?
What's with the strings, newline and tab escapes? What's that 'f' for?
David Brown <david.brown@hesbynett.no> writes:
On 15/01/2024 03:14, bart wrote:
<snip>
The only compiler I used with an inline
assembly of comparable power and integration to gcc's, but a different
syntax, was Diab Data.
Now there's a name I haven't heard in decades. What ever happened
to them? We worked with them back in the early 90's using their
88100 compiler. It was impressive, particularly compared to the
PCC port that Motorola provided. Greenhills was ok, but diab
produced better code. gcc was still pretty primitive in those days.
A good Norwegian company.
On 15/01/2024 16:40, Gabriel Rolland wrote:
bart <bc@freeuk.com> writes:
This has been answered better than I could by David Brown.
I'm not reading DB's comments today.
The executable. Unix's treatment of file extensions is nonexistent,
true. So I don't bother with file extensions much. But I have yet to go
on comp.os.windows to explain how confusing file extensions are.
This is not a Linux newsgroup. It's a language group.
But I expect you do use file extensions. Otherwise any file in one of
your folders could be absolutely anything: text, source in any language,
document, binary data, object file, executable ...
I imagine if 'abc' were a C source file, you'd have to put the object
file 'abc' in a separate folder, and the executable 'abc' in a third
one? Otherwise they will clash.
So extensions serve useful purposes for both machine, and human users.
Suppose you write an application as myprog.c. You distribute it for
people to build. The instructions could have been as simple as:
gcc myprog.c
and it will create myprog which can be invoked as ./myprog, or as
myprog.exe which on Windows is invoked as just myprog.
But because of this stupid quirk in gcc that could have been fixed in
three lines of code, millions of people have to add extra instructions:
gcc myprog.c -o myprog
You don't get this in other languages.
It is just Wrong.
Hardcoding the name of an output file is something a beginner might do,
or in the early stages of an application (I do that myself).
But that program version is never going to see the light of day. So how
the hell did that idiotic behaviour of gcc (and the worse one of as)
ever escape into the wild? QoI was lacking.
Meanwhile everybody is defending it, and there are even people like you saying that is an advantage!
On 15/01/2024 18:30, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
On 15/01/2024 03:14, bart wrote:
<snip>
The only compiler I used with an inline
assembly of comparable power and integration to gcc's, but a different
syntax, was Diab Data.
Now there's a name I haven't heard in decades. What ever happened
to them? We worked with them back in the early 90's using their
88100 compiler. It was impressive, particularly compared to the
PCC port that Motorola provided. Greenhills was ok, but diab
produced better code. gcc was still pretty primitive in those days.
A good Norwegian company.
A good /Swedish/ company, not Norwegian!
On 1/15/2024 8:29 AM, David Brown wrote:
On 15/01/2024 17:04, David Brown wrote:[...]
On 15/01/2024 08:40, Kaz Kylheku wrote:
A mul_add primitive might make sense if the processor had such a thing in one instruction.
With GCC inline assembly, you want to only put the essentials into it: only do what is necessary and only bundle together what must be
bundled.
Yes. For inline assembly, simple is not really important, but small
is beautiful! Say exactly what you mean, no more and no less, and let
the compiler do what it does best.
Wrt inline GCC assembly, remember __volatile__ and memory? ;^)
c:\c>copy hello.c c_code_main
1 file(s) copied.
On 2024-01-15, bart <bc@freeuk.com> wrote:
c:\c>copy hello.c c_code_main
1 file(s) copied.
There goes that damned AT&T instruction syntax for file copying, same
like in Unix.
Doesn't it bother you that it isn't: copy destination source?
On 15/01/2024 17:35, Scott Lurndal wrote:
bart <bc@freeuk.com> writes:
You're having a laugh, surely?
No. I'm serious. [DWORD] is useless cruft.
I don't use [DWORD], whatever that means.
Meanwhile %% in front of every register name, f after a label, and ""
and \n and \t on every line is useful cruft!
AT&T is bad enough even without the
travesty of it displayed here:
"jmp 2f\n"
"3:\tmovsx %%bx, %%rbx\n\t"
If you don't understand the standard C escapes, you really
should go back to read the standard carefully.
I understand C escape codes. I am asking WHAT THE FUCK ARE THEY DOING IN EVERY LINE OF AN ASSEMBLY PROGRAM?
Just admit that my approach to inline assembler is better and give it up.
On 2024-01-15, bart <bc@freeuk.com> wrote:
The instruction template is just a string literal to the
compiler. It specifies text to be inserted into the assembly
output.
Some assembly languages require the whitespace; you need
instructions to be on separate lines.
GCC does not look inside this template other than to replace
% codes like %0 (the first register).
In my example, I put the newlines and tabs together on the right
"imul %3, %2\n\t"
"add %1, %2\n\t"
"mov %1, %0\n\t"
Thanks to these newlines and tabs, the textual output (generated .s
file if we use gcc -S) has this in it:
#APP
# 24 "inline.c" 1
imul %edx, %esi
add %esi, %edi
mov %edi, %edi
# 0 "" 2
#NO_APP
movl 8(%rsp), %eax
movl 16(%rsp), %edx
You can understand why GCC went with this textual templating approach, since the number of back-end assembly languages is astonishing. In some cases, I think, GCC supports more than one assembler syntax for the same architecture. Historically it has worked with assemblers that didn't come from GNU.
It would not be practical for GNU C to have a bazillion assembly language syntaxes in its grammar.
Just admit that my approach to inline assembler is better and give it up.
Your approach to inline assembler has nicer looking syntax, but
semantically, it doesn't seem to be on same level as the GCC approach.
GCC's inline assembly does nice things that might not be possible in your implementation. (Or pretty much anyone else's).
Some of it has been sufficiently exemplified elsewhere in the thread.
GCC can choose registers for your inline assembly block,
and will automatically
move between those registers and the operands they are connected to (if necessary). The inline code seamlessly integrates with the code generated by the compiler. You can write primitives that generate code as well as compiler
built-ins.
On 15/01/2024 17:04, David Brown wrote:
On 15/01/2024 08:40, Kaz Kylheku wrote:
A mul_add primitive might make sense if the processor had such a thing
in one instruction.
With GCC inline assembly, you want to only put the essentials into it:
only do what is necessary and only bundle together what must be
bundled.
Yes. For inline assembly, simple is not really important, but small
is beautiful! Say exactly what you mean, no more and no less, and let
the compiler do what it does best.
And move to C++20. Then you can write (for the mythical madd
instruction) :
constexpr int mul_add(int x, int y, int z)
__attribute__((const))
{
if (std::is_constant_evaluated()) {
return x + y * z;
} else {
asm("madd %[x], %[y], %[z]"
: [x] "+r" (x) : [y] "g" (y), [z] "g" (z));
return x;
}
}
Now the compiler can evaluate at compile time if the parameters are
known, or using the super-efficient madd instruction at runtime if
necessary.
Apologies for replying to my own post, but of course this can be done in
C with gcc extensions (and when we have inline assembly, we are already relying on gcc extensions):
static inline int mul_add(int x, int y, int z)
__attribute__((const))
{
if (__builtin_constant_p(x + y * z)) {
return x + y * z;
} else {
asm("madd %[x], %[y], %[z]"
: [x] "+r" (x) : [y] "g" (y), [z] "g" (z));
return x;
}
}
On 15/01/2024 03:14, bart wrote:
You pass the relevant data into and out of the inline assembly. If you think you need access to other symbols in the assembly, you are (almost certainly) doing things wrong. You are trying to do the compiler's job behind its back, and that is not a good idea.
If /I/ had to write extensive programs in gcc inline assembly, then
put a gun to my head now!
If you are trying to write extensive programs in assembly, you are
already getting it wrong.
Inline assembly is for things that cannot be
expressed in high level languages, or the very rare occasions where you
know a way to do something in assembly that is very much more efficient
than the compiler can generate, and the code is speed critical, and
there are no built-ins for the task, and no target intrinsics provided
by the processor manufacturer.
Take this example in C:
int a;
void F(void) {
int b=2, c=3;
static int d=4;
a = b + c * d;
}
I will now show it in my language but with that assignment replaced by
inline assembly:
int a
proc F=
int b:=2, c:=3
static int d=4
assem
mov rax, [c] # (note my ints are 64 bits)
imul2 rax, [d]
add rax, [b]
mov [a], rax
end
end
My question is: what would the C version look like with that line in
gcc inline assembly? (In both cases, 'a' should end up with the value
14.)
void F(void) {
int b = 2;
int c = 3;
static int d = 4;
asm ("imul2 %[c], %[d]\n\t"
"add %[c], %[b]"
: [c] "+g" (c) : [b] "g" (b), [d] "g" (d));
a = c;
}
The generated result (from <https://godbolt.org>) is:
F():
mov eax, 3
imul2 eax, 4
add eax, 2
mov DWORD PTR a[rip], eax
ret
On 1/15/2024 8:29 AM, David Brown wrote:
On 15/01/2024 17:04, David Brown wrote:[...]
On 15/01/2024 08:40, Kaz Kylheku wrote:
A mul_add primitive might make sense if the processor had such a thing
in one instruction.
With GCC inline assembly, you want to only put the essentials into it:
only do what is necessary and only bundle together what must be
bundled.
Yes. For inline assembly, simple is not really important, but small
is beautiful! Say exactly what you mean, no more and no less, and
let the compiler do what it does best.
Wrt inline GCC assembly, remember __volatile__ and memory? ;^)
On 15/01/2024 23:28, Kaz Kylheku wrote:
On 2024-01-15, bart <bc@freeuk.com> wrote:
c:\c>copy hello.c c_code_main
1 file(s) copied.
There goes that damned AT&T instruction syntax for file copying, same
as in Unix.
Doesn't it bother you that it isn't: copy destination source?
I haven't said anything about in which direction the data goes.
This is something that tends to depend on device, so Motorola went left
to right, and Zilog/Intel went right to left.
So it did seem odd for this x86 assembler to do the opposite of Intel.
However, doesn't it bother /you/ that AT&T also does the opposite of not
only how assignment works in C, but in most languages? That would be
more pertinent than somebody's choices of command-line syntax.
On 1/15/2024 3:24 PM, Kaz Kylheku wrote:
On 2024-01-15, Chris M. Thomasson <chris.m.thomasson.1@gmail.com> wrote:
On 1/15/2024 8:29 AM, David Brown wrote:
On 15/01/2024 17:04, David Brown wrote:[...]
On 15/01/2024 08:40, Kaz Kylheku wrote:
A mul_add primitive might make sense if the processor had such a
thing
in one instruction.
With GCC inline assembly, you want to only put the essentials into it:
only do what is necessary and only bundle together what must be
bundled.
Yes. For inline assembly, simple is not really important, but small
is beautiful! Say exactly what you mean, no more and no less, and let
the compiler do what it does best.
Wrt inline GCC assembly, remember __volatile__ and memory? ;^)
The volatile is only important if you're writing sync primitives, so the
compiler won't move loads or stores to one side or the other of the
inline assembly.
If you're just doing some calculation, it's
counterproductive to tell the compiler not to move the code.
"memory" is only needed if your inline code clobbers memory. E.g.
one of the operands is a pointer, and the code writes through it.
That location isn't one of the output operands.
I want my sync primitive code to be _un_abused by any "clever"
optimizations. I want it is as it is.
No rogue compiler thinking that
link-time optimizations are all fun and joy, let's dance around the sync instructions... Step on them, ruin them, make them incorrect. Even GCC
once had a nasty optimization bug wrt Pthread's
pthread_mutex_trylock() that simply ruined its correctness wrt the standard.
On 15/01/2024 16:29, David Brown wrote:
On 15/01/2024 17:04, David Brown wrote:
On 15/01/2024 08:40, Kaz Kylheku wrote:
A mul_add primitive might make sense if the processor had such a thing
in one instruction.
With GCC inline assembly, you want to only put the essentials into it:
only do what is necessary and only bundle together what must be
bundled.
bundled.
Yes. For inline assembly, simple is not really important, but small
is beautiful! Say exactly what you mean, no more and no less, and
let the compiler do what it does best.
And move to C++20. Then you can write (for the mythical madd
instruction) :
constexpr int mul_add(int x, int y, int z)
__attribute__((const))
{
if (std::is_constant_evaluated()) {
return x + y * z;
} else {
asm("madd %[x], %[y], %[z]"
: [x] "+r" (x) : [y] "g" (y), [z] "g" (z));
return x;
}
}
Now the compiler can evaluate it at compile time if the parameters are
known, or use the super-efficient madd instruction at runtime if
necessary.
Apologies for replying to my own post, but of course this can be done
in C with gcc extensions (and when we have inline assembly, we are
already relying on gcc extensions):
static inline int mul_add(int x, int y, int z)
__attribute__((const))
{
if (__builtin_constant_p(x + y * z)) {
return x + y * z;
} else {
asm("madd %[x], %[y], %[z]"
: [x] "+r" (x) : [y] "g" (y), [z] "g" (z));
return x;
}
}
Which processor is this for again?
On 16/01/2024 01:21, Kaz Kylheku wrote:
On 2024-01-15, bart <bc@freeuk.com> wrote:
[Inline assembly]
The instruction template is just a string literal to the
compiler. It specifies text to be inserted into the assembly
output.
Some assembly languages require the whitespace; you need
instructions to be on separate lines.
GCC does not look inside this template other than to replace
% codes like %0 (the first register).
In my example, I put the newlines and tabs together on the right
"imul %3, %2\n\t"
"add %1, %2\n\t"
"mov %1, %0\n\t"
Thanks to these newlines and tabs, the textual output (generated .s
file if we use gcc -S) has this in it:
#APP
# 24 "inline.c" 1
imul %edx, %esi
add %esi, %edi
mov %edi, %edi
# 0 "" 2
#NO_APP
movl 8(%rsp), %eax
movl 16(%rsp), %edx
This is still peculiar: why prioritise the appearance of the
intermediate code which I assume you're rarely going to look at?
It's the version with strings and escape codes that you're going to be writing and maintaining, and that people will see in the C sources!
Just admit that my approach to inline assembler is better and give it
up.
Your approach to inline assembler has nicer looking syntax, but
semantically, it doesn't seem to be on same level as the GCC approach.
GCC's inline assembly does nice things that might not be possible
in your implementation. (Or pretty much anyone else's.)
Some of it has been sufficiently exemplified elsewhere in the thread.
GCC can choose registers for your inline assembly block,
The point of ASM is that /you/ call the shots.
It's not just more natural syntax, but better integration within the
host language. For example, I can 'goto' to a label within an ASSEM
block, and 'jmp' out of the ASSEM block to a HLL label, or to a label
within a separate ASSEM block further on.
and will automatically
move between those registers and the operands they are connected to (if
necessary). The inline code seamlessly integrates with the code
generated by
the compiler. You can write primitives that generate code as well as
compiler
built-ins.
OK. But it's not an 'assembler' as is generally understood. Mine is; it
looks exactly like normal assembly, and it is written inline to the HLL.
Although these days I keep such hybrid functions within their own modules.
On 16/01/2024 12:54, bart wrote:
Which processor is this for again?
It's for a cpu with a "madd" instruction that implements "x = x + y * z"
in a single instruction - as Kaz pointed out, doing this in inline
assembly would make sense if the cpu had such a dedicated
instruction. [...]
David Brown <david.brown@hesbynett.no> writes:
On 15/01/2024 18:30, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
On 15/01/2024 03:14, bart wrote:
<snip>
The only compiler I used with an inline
assembly of comparable power and integration to gcc's, but a different
syntax, was Diab Data.
Now there's a name I haven't heard in decades. What ever happened
to them? We worked with them back in the early 90's using their
88100 compiler. It was impressive, particularly compared to the
PCC port that Motorola provided. Greenhills was ok, but diab
produced better code. gcc was still pretty primitive in those days.
A good Norwegian company.
A good /Swedish/ company, not Norwegian!
Hm. I could have sworn the folks we dealt with were
in Norway - perhaps a branch office?
Let's look at an actual example from my own code, in an older project. I wanted an endian swap function on an ARM microcontroller, and for
reasons that escape me for now, I did not want to use gcc's __builtin_bswap32, or an intrinsic from a header, or just plain C code
(which modern gcc could optimise to a single "rev" instruction). The
code was probably originally written for quite an old version of the compiler. So I wrote the function:
static inline uint32_t swapEndian32(uint32_t x) {
uint32_t y;
asm ("rev %[y], %[x]" : [y] "=r" (y) : [x] "r" (x) : );
return y;
}
This is, IMHO, quite clear once you know that gcc assembly consists of
the assembly template, the outputs, then the inputs.
And it generates
the code optimally - when used in an expression, there will be no extra moves, or data put on the stack, or wasted registers. The compiler can
move the code back and forth while optimising, eliminate calls when the result is used, and generally do its job just as well with this function
as any other inline function or built in operator.
You need to tell me, because I will otherwise not have a clue.
It's clear that you haven't a clue. So how can you justify ranting and raving against something you don't understand?
From what I've seen of gcc inline asm:
* Code has to be written within string literals,
Yes, obviously. Assembly is not C, so writing assembly mixed in your C requires it to be in a format that is acceptable in C syntax (or at
least close enough to C syntax to be a non-invasive extension). String literals are also quite amenable to generation by macros, for those that
want to write something complicated.
in dreadful AT&T
syntax.
"Dreadful" is, again, /your/ opinion - not shared by everyone. (I personally don't care either way.)
It only applies to x86, not any
other targets, and is easily changed by the "-masm=intel" flag
And apparently even with embedded \n line breaks. (Good
grief - I think early 80s BASICs had more sophisticated facilities!)
That is an inevitability for string literals. And it doesn't matter
much in practice, since most inline assembly (IME) consists of a single statement - gcc handles any moves that might be needed.
Remember, the compiler passes the assembly on to the assembler - this is /not/ a C compiler with a built-in assembler. And that's a good thing.
Have you any idea how many assembly instructions there are for all the targets supported by gcc? And you'd need to update gcc every time there
was a new instruction, rather than just updating the assembler (which is
a lot simpler).
Of course it would be /possible/ to extend gcc with a built-in
assembler. But what would that give you? Lots of duplicate work to support C, C++, Fortran, Ada, and other languages?
The assembler
already handles assembly - why make an HLL do it? It's a lot better to
put the effort into reducing the number of times you actually need to
write inline assembly, by improving the optimiser and builtin functions.
* You mostly use offsets to get at local variables
You never do that. You are imagining things. Or you are looking at
some very odd inline assembly examples.
* You apparently aren't allowed to use just any registers as you need
to negotiate with gcc so as not to interfere with /its/ use of
registers. So most examples I saw seemed to deal with this.
Or, as sane people would say, you don't need to mess around trying to
figure out what different registers are used for different purposes, or
where your input data is, or where your output data should go - gcc will handle it all for you.
I consider that when writing assembly, YOU are in charge not the
compiler. As you can see from mine:
* It is written just as it would be in an actual ASM file
Yes - and that's why it is so limited, and requires so much more
assembly. I prefer to let the compiler do what the compiler is good at.
* You can refer to variables directly (the compiler will add what is
needed to access locals or statics)
I can refer to all the variables I want to - and coordinate with the
compiler so that it knows what I am doing. Cooperation works far better than some arrogant pompous fool claiming they know better, and ruining
the optimiser's work. Mind you, you wrote your compiler, so I suppose
you /do/ know better than your compiler.
If a function uses inline ASM, variables are kept in memory not
registers.
What a terrible pessimisation.
(I might allow that at some point.) Most such functions however
contain only ASM.
What a terrible limitation.
That still lets ASM use the facilities of the HLL such as functions,
declarations, named constants, scopes etc.
I suppose you're going to suggest that gcc's facilities are superior...
There really isn't the slightest doubt there.
I'll happily agree that your inline assembly is simpler. But in every
other respect, it's not close to gcc's.
But perhaps you don't care about efficient code generation (and to be
fair, that is certainly not always important), and perhaps since your compiler doesn't do much optimising then there is little to be lost by failing to work along with the optimiser.
And perhaps you have to write
big sections of assembly because you can't write them in C and get fast results.
On 15/01/2024 19:41, bart wrote:
Meanwhile everybody is defending it, and there are even people like
you saying that is an advantage!
Maybe to some people it /is/ an advantage. Have you ever considered that?
On 15/01/2024 20:16, David Brown wrote:
On 15/01/2024 19:41, bart wrote:
[gcc writing.out/a.ext output executables by default.]
Meanwhile everybody is defending it, and there are even people like
you saying that is an advantage!
Maybe to some people it /is/ an advantage. Have you ever considered
that?
Sure. The same way that the fallthrough behaviour of C's switch
statement is considered an advantage by some. Meanwhile, 99% of the time
On 15/01/2024 15:23, David Brown wrote:
On 15/01/2024 03:14, bart wrote:
You pass the relevant data into and out of the inline assembly. If
you think you need access to other symbols in the assembly, you are
(almost certainly) doing things wrong. You are trying to do the
compiler's job behind its back, and that is not a good idea.
Not with gcc. You don't want to mess with that.
But /I/ use inline assembler when /I/ am in charge of the code.
If /I/ had to write extensive programs in gcc inline assembly, then
put a gun to my head now!
If you are trying to write extensive programs in assembly, you are
already getting it wrong.
I want to write HLL functions that may have a number of lines in
assembly, from one line up to a few dozen.
Inline assembly is for things that cannot be expressed in high level
languages, or the very rare occasions where you know a way to do
something in assembly that is very much more efficient than the
compiler can generate, and the code is speed critical, and there are
no built-ins for the task, and no target intrinsics provided by the
processor manufacturer.
Take this example in C:
int a;
void F(void) {
int b=2, c=3;
static int d=4;
a = b + c * d;
}
I will now show it in my language but with that assignment replaced
by inline assembly:
int a
proc F=
int b:=2, c:=3
static int d=4
assem
mov rax, [c] # (note my ints are 64 bits)
imul2 rax, [d]
add rax, [b]
mov [a], rax
end
end
My question is: what would the C version look like with that line in
gcc inline assembly? (In both cases, 'a' should end up with the value
14.)
void F(void) {
int b = 2;
int c = 3;
static int d = 4;
asm ("imul2 %[c], %[d]\n\t"
"add %[c], %[b]"
: [c] "+g" (c) : [b] "g" (b), [d] "g" (d));
a = c;
}
Sorry, but you've turned it into gobbledygook. My example was for x64,
which is a 1.5-address machine; here you've turned it into a 2-address
machine. Could I make it 3-address? What are the rules?
It is a different language.
The generated result (from <https://godbolt.org>) is :
F():
mov eax, 3
imul2 eax, 4
add eax, 2
mov DWORD PTR a[rip], eax
ret
The initialisations I used were so I could test that it gave the correct results. Without them, godbolt gives me this for the body of the function:
movl d.0(%rip), %edx
movl -8(%rbp), %eax
imul2 %eax, %edx
add %eax, -4(%rbp)
movl %eax, -8(%rbp)
movl -8(%rbp), %eax
movl %eax, a(%rip)
My version (which evaluates a=b+c*d; somehow your version modifies c)
gives me this (D0 == rax):
mov D0, [Dframe+test.f.c]
imul2 D0, [test.f.d]
add D0, [Dframe+test.f.b]
mov [test.a], D0
Unsurprisingly, this is exactly the ASM I typed (plus the necessary name qualifiers). That is the entire point.
If I tweak your C version, make a,b,c,d all external statics, and apply
-O3, godbolt gives me this:
imul2 c(%rip), d(%rip)
add c(%rip), b(%rip)
movl c(%rip), %eax
movl %eax, a(%rip)
ret
This is slightly worrying: imul2 is not a valid instruction (it's
specific to my assembler), while add can't take two memory operands.
So it looks like it can only do so much checking. (Using gcc locally
gave valid assembly.)
So it all seems hit and miss.
I'll reply to the second half of your post later.
Let me just say that, in my interpreter, the extensive use of inline
assembly in one module makes some programs run twice as fast as a
gcc-O3-compiled C rendering.
It also lets me write trivial solutions to the LIBFFI problem.
It works.
On 16.01.2024 14:42, David Brown wrote:
On 16/01/2024 12:54, bart wrote:
Which processor is this for again?
It's for a cpu with a "madd" instruction that implements "x = x + y * z"
in a single instruction - as Kaz pointed out, doing this in inline
assembly would make sense if it the cpu had such a dedicated
instruction. [...]
I recall such a feature from a 35+ years old assembler project I did.
It was on the TI TMS320C25 DSP, and the instruction was called 'MAC' (Multiply and ACcumulate). - Not sure it clarifies anything but just
as an amendment if someone is interested in searching for keywords
on such a function.
On 15/01/2024 21:41, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
On 15/01/2024 18:30, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
On 15/01/2024 03:14, bart wrote:
<snip>
The only compiler I used with an inline
assembly of comparable power and integration to gcc's, but a different
syntax, was Diab Data.
Now there's a name I haven't heard in decades. What ever happened
to them? We worked with them back in the early 90's using their
88100 compiler. It was impressive, particularly compared to the
PCC port that Motorola provided. Greenhills was ok, but diab
produced better code. gcc was still pretty primitive in those days.
A good Norwegian company.
A good /Swedish/ company, not Norwegian!
Hm. I could have sworn the folks we dealt with were
in Norway - perhaps a branch office?
It would be a little surprising, but certainly possible. Sweden has
been quite significant in the compiler world - IAR is a big name in
embedded toolchains, and they are Swedish.
You are sure you are not just one of these ignorant parochial 'Mericans
who think Norway is the capital of Sweden? :-)
On 15/01/2024 15:23, David Brown wrote:
Let's look at an actual example from my own code, in an older project.
I wanted an endian swap function on an ARM microcontroller, and for
reasons that escape me for now, I did not want to use gcc's
__builtin_bswap32, or an intrinsic from a header, or just plain C code
(which modern gcc could optimise to a single "rev" instruction). The
code was probably originally written for quite an old version of the
compiler. So I wrote the function:
static inline uint32_t swapEndian32(uint32_t x) {
uint32_t y;
asm ("rev %[y], %[x]" : [y] "=r" (y) : [x] "r" (x) : );
return y;
}
This is, IMHO, quite clear once you know that gcc assembly consists of
the assembly template, the outputs, then the inputs.
Well, you've explained it. But I'm none the wiser. Let's break it down better:
rev %[y], %[x] # rev appears to be an ARM instruction:
# rev Rdest, Rsource
[y] "=r" (y) # Outputs?
[x] "r" (x) # Inputs?
You're telling gcc that somehow, the value of x needs to get into a
register (since rev doesn't work on memory, or immediates). And that the
new value of y needs to come from a register.
The compiler will decide which registers to use, and insert them into
that instruction.
And it will ensure that x is loaded into its register,
if it is not already in one; and that y is loaded from its register, if
it is not already in the preferred one.
Since this is a return value, it will likely use R0 anyway.
But I'm inferring this from the way I know that 'rev' must work, and
from your comments. You seem to be expending a lot of effort however
into explaining it to gcc.
My function would be this on x64 (although I don't support bswap):
fun swapends64(u64 x)u64 = assem mov rax, [x]; bswap rax; end
This could have been shorter if 'bswap' had a separate dest register.
Here, knowing that x is always going to be rcx, I could have copied
straight from there, but that would be bad form.
I think a useful enhancement to my scheme would be allow 'x' for example
to exist in static memory, stackframe, or in a register. The register allocator for locals can be made to work with the assembly: it will only choose registers that have not been used for anything else.
So, there are plenty of opportunities to make my scheme even better.
And it generates the code optimally - when used in an expression,
there will be no extra moves, or data put on the stack, or wasted
registers. The compiler can move the code back and forth while
optimising, eliminate calls when the result is used, and generally do
its job just as well with this function as any other inline function
or built in operator.
You need to tell me, because I will otherwise not have a clue.
It's clear that you haven't a clue. So how can you justify ranting
and raving against something you don't understand?
I've been familiar with x86 assembly for 40 years, so I should expect to
understand it! But the answer is simple: what gcc provides has little to
do with x86, and 90% of it seems made up.
From what I've seen of gcc inline asm:
* Code has to be written within string literals,
Yes, obviously. Assembly is not C, so writing assembly mixed in your
C requires it to be in a format that is acceptable in C syntax (or at
least close enough to C syntax to be a non-invasive extension).
String literals are also quite amenable to generation by macros, for
those that want to write something complicated.
So, how did I manage to get Intel-style assembly into my language? I
didn't need to use strings.
gcc should try harder!
in dreadful AT&T
syntax.
"Dreadful" is, again, /your/ opinion - not shared by everyone. (I
personally don't care either way.)
This is the first hit for "at&t versus intel syntax": https://imada.sdu.dk/u/kslarsen/dm546/Material/IntelnATT.htm
Its opinion is:
"The AT&T form for instructions involving complex operations is very
obscure compared to Intel syntax."
So, it's not just my opinion.
It only applies to x86, not any other targets, and is easily changed
by the "-masm=intel" flag
That's usually how I view gcc assembly output. But it still manages to
make it look terrible. Godbolt is much better as it filters out stuff
that is not relevant.
And apparently even with embedded \n line breaks. (Good
grief - I think early 80s BASICs had more sophisticated facilities!)
That is an inevitability for string literals. And it doesn't matter
much in practice, since most inline assembly (IME) consists of a
single statement - gcc handles any moves that might be needed.
I'm sorry, but that is not writing 'assembly'.
Remember, the compiler passes the assembly on to the assembler - this
is /not/ a C compiler with a built-in assembler. And that's a good
thing. Have you any idea how many assembly instructions there are for
all the targets supported by gcc? And you'd need to update gcc every
time there was a new instruction, rather than just updating the
assembler (which is a lot simpler).
I wonder how many times people here have updated just 'as'?
In any case,
there are a number of ways around it, but as you have pointed out, you
don't make serious use of assembly so it doesn't matter.
Of course it would be /possible/ to extend gcc with a built-in
assembler. But what would that give you? Lots of duplicate work to
support C, C++, Fortran, Ada, and other languages?
On top of the duplicate work you already need to support C, C++, Fortran
and Ada?
Well, you can forget the last two.
But a lower level language like C,
which is already known as a 'portable assembler', you'd think would have better facilities.
I have a better idea: how about you take an existing, proper assembler,
and build a C compiler around it?
The assembler already handles assembly - why make an HLL do it?
It's a lot better to put the effort into reducing the number of times
you actually need to write inline assembly, by improving the optimiser
and builtin functions.
* You mostly use offsets to get at local variables
You never do that. You are imagining things. Or you are looking at
some very odd inline assembly examples.
* You apparently aren't allowed to use just any registers as you need
to negotiate with gcc so as not to interfere with /its/ use of
registers. So most examples I saw seemed to deal with this.
Or, as sane people would say, you don't need to mess around trying to
figure out what different registers are used for different purposes,
or where your input data is, or where your output data should go - gcc
will handle it all for you.
As I've said repeatedly, this is not assembly. You have to ask exactly why
you need to use assembly. If it is in rare, special situations, then it
is not a big deal to think about how it will work with registers.
I consider that when writing assembly, YOU are in charge not the
compiler. As you can see from mine:
* It is written just as it would be in an actual ASM file
Yes - and that's why it is so limited, and requires so much more
assembly. I prefer to let the compiler do what the compiler is good at.
I do that when I write HLL code. But when I need ASM, it should be as
simple as possible:
a := asm rdtsc # low 32 bit of time stamp counter
println a
* You can refer to variables directly (the compiler will add what is
needed to access locals or statics)
I can refer to all the variables I want to - and coordinate with the
compiler so that it knows what I am doing. Cooperation works far
better than some arrogant pompous fool claiming they know better, and
ruining the optimiser's work. Mind you, you wrote your compiler, so I
suppose you /do/ know better than your compiler.
If a function uses inline ASM, variables are kept in memory not
registers.
What a terrible pessimisation.
The need for assembly usually trumps whatever minor optimisation my
compiler might do.
(I might allow that at some point.) Most such functions however
contain only ASM.
What a terrible limitation.
I didn't mention a limitation. My remarks mean my functions can comprise
0% to 100% inline assembly, but quite often it will be 100%, by choice.
For example, routines to do 128-bit arithmetic.
That still lets ASM use the facilities of the HLL such as functions,
declarations, named constants, scopes etc.
I suppose you're going to suggest that gcc's facilities are superior...
There really isn't the slightest doubt there.
I'll happily agree that your inline assembly is simpler. But in every
other respect, it's not close to gcc's.
But perhaps you don't care about efficient code generation (and to be
fair, that is certainly not always important), and perhaps since your
compiler doesn't do much optimising then there is little to be lost by
failing to work along with the optimiser.
And perhaps you have to write big sections of assembly because you
can't write them in C and get fast results.
My last post mentioned an app where my inline assembly, even combined
with my non-optimised code for the rest, resulted in much faster
runtimes than achieved by transpiling to C.
David Brown <david.brown@hesbynett.no> writes:
On 15/01/2024 21:41, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
On 15/01/2024 18:30, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
On 15/01/2024 03:14, bart wrote:
<snip>
The only compiler I used with an inline
assembly of comparable power and integration to gcc's, but a different
syntax, was Diab Data.
Now there's a name I haven't heard in decades. What ever happened
to them? We worked with them back in the early 90's using their
88100 compiler. It was impressive, particularly compared to the
PCC port that Motorola provided. Greenhills was ok, but diab
produced better code. gcc was still pretty primitive in those days.
A good Norwegian company.
A good /Swedish/ company, not Norwegian!
Hm. I could have sworn the folks we dealt with were
in Norway - perhaps a branch office?
It would be a little surprising, but certainly possible. Sweden has
been quite significant in the compiler world - IAR is a big name in
embedded toolchains, and they are Swedish.
You are sure you are not just one of these ignorant parochial 'Mericans
who think Norway is the capital of Sweden? :-)
No, I'm 7/8th Norwegian, with a bit of Swiss. While I haven't visited (yet), I do have relatives there. Think Luren dal.
Granted it's been three decades since we were using diab compilers (1993ish)...
On 16/01/2024 17:02, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
On 15/01/2024 21:41, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
On 15/01/2024 18:30, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
On 15/01/2024 03:14, bart wrote:
<snip>
The only compiler I used with an inline
assembly of comparable power and integration to gcc's, but a different
syntax, was Diab Data.
Now there's a name I haven't heard in decades. What ever happened
to them? We worked with them back in the early 90's using their
88100 compiler. It was impressive, particularly compared to the
PCC port that Motorola provided. Greenhills was ok, but diab
produced better code. gcc was still pretty primitive in those days.
A good Norwegian company.
A good /Swedish/ company, not Norwegian!
Hm. I could have sworn the folks we dealt with were
in Norway - perhaps a branch office?
It would be a little surprising, but certainly possible. Sweden has
been quite significant in the compiler world - IAR is a big name in
embedded toolchains, and they are Swedish.
You are sure you are not just one of these ignorant parochial Merican's
who think Norway is the capital of Sweden? :-)]
I hope you noticed the smiley :-)
No, I'm 7/8th norwegian, with a bit of swiss. While I haven't visited (yet),
I do have relatives there. Think Luren dal.
I would say you are of Norwegian descent, or have Norwegian family roots
- it's not the same as being Norwegian.
You need to at least visit the Country!
Alternatively, you need to eat a /lot/ of brunost to improve
your credentials.
I looked up "Lurendal" on Google maps. It's in Sweden :-)
Maybe your
parents told you they were Norwegian, because they know that Norwegians
are superior to Swedes in every way...
(There are a few place names in Norway with "Luren" in them, and of
course spellings change over time between family names and place names.)
On 16/01/2024 01:21, Kaz Kylheku wrote:
On 2024-01-15, bart <bc@freeuk.com> wrote:
[Inline assembly]
The instruction template is just a string literal to the
compiler. It specifies text to be inserted into the assembly
output.
Some assembly languages require the whitespace; you need
instructions to be on separate lines.
GCC does not look inside this template other than to replace
% codes like %0 (the first register).
In my example, I put the newlines and tabs together on the right
"imul %3, %2\n\t"
"add %1, %2\n\t"
"mov %1, %0\n\t"
Thanks to these newlines and tabs, the textual output (generated .s
file if we use gcc -S) has this in it:
#APP
# 24 "inline.c" 1
imul %edx, %esi
add %esi, %edi
mov %edi, %edi
# 0 "" 2
#NO_APP
movl 8(%rsp), %eax
movl 16(%rsp), %edx
This is still peculiar: why prioritise the appearance of the
intermediate code which I assume you're rarely going to look at?
It's the version with strings and escape codes that you're going to be writing and maintaining, and that people will see in the C sources!
This is akin to a language allowing embedded C but you have to write it
like this:
clang{"\tprintf(\"A=%%d\\n\", a);\n"};
so that the generated version looks like:
printf("A=%d\n", a);
Except you can't refer to 'a' directly, you have to write it as %0 and
then have some extra mechanism to somehow map that to 'a'.
You can understand why GCC went with this textual templating approach, since
the number of back-end assembly languages is astonishing. In some cases, I
think, GCC supports more than one assembler syntax for the same architecture.
Historically it has worked with assemblers that didn't come from GNU.
gcc is a project of 10s of 1000s of files. If a particular configuration
targets one architecture out of 100, it can also support the ASM syntax
for that architecture.
You don't need to have the ASM syntax embedded within the C grammar. Not
so specifically anyway; you allow a bunch of keyword and register tokens within asm{...}.
Yes it's a bit harder, but if I can do it within a 0.3MB product, gcc
can do it within 24MB.
It would not be practical for GNU C to have a bazillion assembly language
syntaxes in its grammar.
Just admit that my approach to inline assembler is better and give it up.
Your approach to inline assembler has nicer looking syntax, but
semantically, it doesn't seem to be on the same level as the GCC approach.
GCC's inline assembly does nice things that might not be possible in your
implementation. (Or pretty much anyone else's).
Some of it has been sufficiently exemplified elsewhere in the thread.
GCC can choose registers for your inline assembly block,
The point of ASM is that /you/ call the shots.
It's not just more natural syntax, but better integration within the
host language. For example, I can 'goto' to a label within an ASSEM
block, and 'jmp' out of the ASSEM block to a HLL label, or to a label
within a separate ASSEM block further on.
and will automatically
move between those registers and the operands they are connected to (if
necessary). The inline code seamlessly integrates with the code generated
by the compiler. You can write primitives that generate code as well as
compiler built-ins.
OK. But it's not an 'assembler' as is generally understood.
Mine is; it
looks exactly like normal assembly, and it is written inline to the HLL.
On 16.01.2024 14:42, David Brown wrote:
On 16/01/2024 12:54, bart wrote:
Which processor is this for again?
It's for a cpu with a "madd" instruction that implements "x = x + y * z"
in a single instruction - as Kaz pointed out, doing this in inline
assembly would make sense if the cpu had such a dedicated
instruction. [...]
I recall such a feature from a 35+ years old assembler project I did.
It was on the TI TMS320C25 DSP, and the instruction was called 'MAC' (Multiply and ACcumulate). - Not sure it clarifies anything but just
as an amendment if someone is interested in searching for keywords
on such a function.
On 12/01/2024 18:09, bart wrote:
download to a target board via a debugger,
(Hey, I used to do that! Not a makefile in sight either; how is that
possible?
Grow up.
I used to do that with no special tools, no external software and no
external languages. I had to write assemblers for any new devices I
had to use.)
Yes, I've heard it before. If you wanted a keyboard, you had to carve
it out of a rock with your teeth.
When I learned assembly, I assembled code to hex by hand. On paper. I don't consider that particularly relevant to my work today.
On 15/01/2024 11:45, David Brown wrote:
On 12/01/2024 18:09, bart wrote:
(BTW that keyboard was a joy to use: it used a simple 8-bit port, with
the top bit strobing when a key was ready. Compare with what's involved
with using a USB keyboard today, if you didn't have a 10GB OS to take
care of it.)
bart <bc@freeuk.com> writes:
On 15/01/2024 11:45, David Brown wrote:
On 12/01/2024 18:09, bart wrote:
(BTW that keyboard was a joy to use: it used a simple 8-bit port, with
the top bit strobing when a key was ready. Compare with what's involved
with using a USB keyboard today, if you didn't have a 10GB OS to take
care of it.)
If you ignore, of course, the fact that a 64KB BIOS can easily handle
the entire USB stack sufficient to support both USB mass storage
devices, networking devices (PXE) and USB Human Interface Devices (keyboards, mice).
No 10GB OS involvement.
David Brown <david.brown@hesbynett.no> writes:
On 16/01/2024 17:02, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
On 15/01/2024 21:41, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
On 15/01/2024 18:30, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
On 15/01/2024 03:14, bart wrote:
<snip>
The only compiler I used with an inline
assembly of comparable power and integration to gcc's, but a different
syntax, was Diab Data.
Now there's a name I haven't heard in decades. What ever happened
to them? We worked with them back in the early 90's using their
88100 compiler. It was impressive, particularly compared to the
PCC port that Motorola provided. Greenhills was ok, but diab
produced better code. gcc was still pretty primitive in those days.
A good Norwegian company.
A good /Swedish/ company, not Norwegian!
Hm. I could have sworn the folks we dealt with were
in Norway - perhaps a branch office?
It would be a little surprising, but certainly possible. Sweden has
been quite significant in the compiler world - IAR is a big name in
embedded toolchains, and they are Swedish.
You are sure you are not just one of these ignorant parochial Merican's
who think Norway is the capital of Sweden? :-)]
I hope you noticed the smiley :-)
No, I'm 7/8th norwegian, with a bit of swiss. While I haven't visited (yet),
I do have relatives there. Think Luren dal.
I would say you are of Norwegian descent, or have Norwegian family roots
- it's not the same as being Norwegian.
Point.
You need to at least visit the Country!
My folks have been there a couple of times, and looked
up distant relatives from both sides (the other side
was from the Bergen area, IIRC).
Alternatively, you need to eat a /lot/ of brunost to improve
your credentials.
Does lutefisk and lefse count?
On 16/01/2024 19:45, Scott Lurndal wrote:
David Brown <david.brown@hesbynett.no> writes:
Alternatively, you need to eat a /lot/ of brunost to improve
your credentials.
Does lutefisk and lefse count?
Everyone likes lefser, but if you can claim to like lutefisk with a
straight face, you must be Norwegian!
(For those that don't know, "lutefisk" is made by drying cod completely,
then soaking it in draincleaner, then washing it, then boiling it. It
doesn't beat the Swedish canned fermented fish or Icelandic sharks
buried for months in the sand, but it's definitely not something to be
eaten lightly.)
On 16/01/2024 16:15, bart wrote:
void swap_lots(const uint32_t * restrict in, uint32_t * restrict out, uint32_t n) {
while (n--) {
*out++ = swapEndian32(*in++);
*out++ = swapEndian32(*in++);
*out++ = swapEndian32(*in++);
*out++ = swapEndian32(*in++);
}
}
My function would be this on x64 (although I don't support bswap):
fun swapends64(u64 x)u64 = assem mov rax, [x]; bswap rax; end
And how efficient would your "swap_lots" function be?
On 2024-01-16, bart <bc@freeuk.com> wrote:
gcc is a project of 10s of 1000s of files. If a particular configuration
targets one architecture out of 100, it can also support the ASM syntax
for that architecture.
Not reasonably so. The parser of every front end would have to have
a special case for that assembly language.
The proposal does not pass a cost/benefit analysis, due to the
high cost and low benefit.
You don't need to have the ASM syntax embedded within the C grammar. Not
so specifically anyway; you allow a bunch of keyword and register tokens
within asm{...}.
keywords and tokens are grammar.
How things look does become important when you are writing tens of
thousands of lines of code.
On 1/16/2024 5:38 AM, David Brown wrote:
On 16/01/2024 03:18, Chris M. Thomasson wrote:[...]
On 1/15/2024 3:24 PM, Kaz Kylheku wrote:
On 2024-01-15, Chris M. Thomasson <chris.m.thomasson.1@gmail.com> wrote:
On 1/15/2024 8:29 AM, David Brown wrote:
On 15/01/2024 17:04, David Brown wrote:
On 15/01/2024 08:40, Kaz Kylheku wrote:
You've mentioned this many times - do you have a reference that gives
the source of this function (at the time when there was an issue), and a
description or report of what you think gcc did wrong? I am curious as
to whether it was a bug in the code or a bug in gcc (gcc is certainly
not bug-free).
I think I found it David!
https://groups.google.com/g/comp.programming.threads/c/Y_Y2DZOWErM/m/nuyEoKq0onUJ
On 1/16/2024 5:08 PM, Chris M. Thomasson wrote:
On 1/16/2024 5:38 AM, David Brown wrote:
On 16/01/2024 03:18, Chris M. Thomasson wrote:[...]
On 1/15/2024 3:24 PM, Kaz Kylheku wrote:
On 2024-01-15, Chris M. Thomasson <chris.m.thomasson.1@gmail.com>
wrote:
On 1/15/2024 8:29 AM, David Brown wrote:
On 15/01/2024 17:04, David Brown wrote:
On 15/01/2024 08:40, Kaz Kylheku wrote:
You've mentioned this many times - do you have a reference that gives
the source of this function (at the time when there was an issue), and
a description or report of what you think gcc did wrong? I am curious
as to whether it was a bug in the code or a bug in gcc (gcc is
certainly not bug-free).
I think I found it David!
https://groups.google.com/g/comp.programming.threads/c/Y_Y2DZOWErM/m/nuyEoKq0onUJ
Here is a nice quote from Dave Butenhof in that thread:
____________________________
Dave Butenhof, Nov 15, 2007, 3:12:14 PM:
Chris Thomasson wrote:
"Zeljko Vrba" <zvrba....@ieee-sb1.cc.fer.hr> wrote in message news:slrnfjnt6l...@ieee-sb1.cc.fer.hr...
On 2007-11-14, Chris Thomasson <cri...@comcast.net> wrote:
How is the compiler supposed to know where a CS begins and ends? Should
it have knowledge of every imaginable official and unofficial API?
If GCC performs the optimization that David Schwartz pointed out, you're
basically screwed. AFAICT, GCC is totally busted if it allows stores to
escape a critical-section. This is a race-condition waiting to happen.
I was under the impression that POSIX puts some restrictions on
compilers. Humm... I can't really remember where I heard that right now,
but I sure think I did. Humm...
The point is that POSIX puts restrictions on the behavior of a
conforming system. That includes library, kernel, and compiler. If the
RESULT doesn't behave like POSIX, then it's not POSIX.
A compiler that's part of a conforming POSIX system environment can't generate code that breaks synchronization. How it and the rest of the
system accomplish that is unspecified.
Often, it means simply not performing risky optimizations. But if they
are enabled, then the system needs to be able to detect and avoid
performing them in "dangerous" areas of code. (A complicated problem,
but nothing's impossible.)
bart <bc@freeuk.com> writes:
On 12/01/2024 13:40, David Brown wrote:
On 12/01/2024 00:20, bart wrote:
But with 'as', it just sits there. I wonder what it's waiting for; for
me to type in ASM code live from the terminal?
It does that so you can pipe the assembler source code in to the
assembler.
$ cat file.s | as
$ cat file.c | cpp | c0 | c1 | c2 | as > file.o
or, if you want, you can type in the assembler source directly.
Or you can save it in a file and supply the file argument to the command.
None of which your stuff supports, which makes it useless to me.
On 16/01/2024 20:09, Scott Lurndal wrote:
bart <bc@freeuk.com> writes:
On 15/01/2024 11:45, David Brown wrote:
On 12/01/2024 18:09, bart wrote:
(BTW that keyboard was a joy to use: it used a simple 8-bit port, with
the top bit strobing when a key was ready. Compare with what's involved
with using a USB keyboard today, if you didn't have a 10GB OS to take
care of it.)
If you ignore, of course, the fact that a 64KB BIOS can easily handle
the entire USB stack sufficient to support both USB mass storage
devices, networking devices (PXE) and USB Human Interface Devices
(keyboards, mice).
No 10GB OS involvement.
It seemed to take quite a few years before the early Linuxes I played
around with 25+ years ago managed to support USB, among other things.
So it's only easy if you know how. Reading the current key on the Z80 (already in ASCII) was one IN instruction.
On 1/16/2024 6:06 AM, David Brown wrote:
Well, you don't often write inline assembly - it's rare to write it.
It's typically the kind of thing you write once for your particular
instruction, then stick it away in a header somewhere. You might use
it often, but you don't need to read or edit the code often.[...]
As soon as you use inline assembler in a file, you sort of "need" to?
On 16/01/2024 17:46, David Brown wrote:
On 16/01/2024 16:15, bart wrote:
void swap_lots(const uint32_t * restrict in, uint32_t * restrict out,
uint32_t n) {
while (n--) {
*out++ = swapEndian32(*in++);
*out++ = swapEndian32(*in++);
*out++ = swapEndian32(*in++);
*out++ = swapEndian32(*in++);
}
}
What's 'n' here? Are 4n bytes being transformed, or 16n?
My function would be this on x64 (although I don't support bswap):
fun swapends64(u64 x)u64 = assem mov rax, [x]; bswap rax; end
And how efficient would your "swap_lots" function be?
How do you measure efficiency? This task seems memory-bound anyway.
Using exactly that function (I now support 'bswap'), I can process a 100M-element u64 array (0.8GB) in .35 seconds, or 2.3GB/second.
On 16/01/2024 23:42, bart wrote:
On 16/01/2024 17:46, David Brown wrote:
On 16/01/2024 16:15, bart wrote:
void swap_lots(const uint32_t * restrict in, uint32_t * restrict out,
uint32_t n) {
while (n--) {
*out++ = swapEndian32(*in++);
*out++ = swapEndian32(*in++);
*out++ = swapEndian32(*in++);
*out++ = swapEndian32(*in++);
}
}
What's 'n' here? Are 4n bytes being transformed, or 16n?
In this case, 4n 32-bit words. It's just example code to demonstrate a point, not to be a particularly useful function in reality.
My function would be this on x64 (although I don't support bswap):
fun swapends64(u64 x)u64 = assem mov rax, [x]; bswap rax; end
And how efficient would your "swap_lots" function be?
How do you measure efficiency? This task seems memory-bound anyway.
That will depend on the sizes, cache, etc., as well as the target. The target I was using here is a Cortex M7 - with data in tightly-coupled
memory that runs at core speed, it would not be memory bound.
Efficiency is measured in clock cycles. (It can also be measured in
code size, but that's usually not as important. If it were, we would
not be doing manual loop unrolling here.)
Using exactly that function (I now support 'bswap'), I can process a
100M-element u64 array (0.8GB) in .35 seconds, or 2.3GB/second.
On what target? Do you have an ARM Cortex-M microcontroller for testing this?
Using a built-in bswap function kind of defeats the point of using
inline assembly.
gcc has __builtin_bswap32 too, or it can be written
manually in C and optimised by the compiler to "rev" instructions. The point is a demonstration of how gcc's inline assembly can work with the optimiser for surrounding code, not how fast your PC can swap endianness!
On 16/01/2024 18:39, Kaz Kylheku wrote:
On 2024-01-16, bart <bc@freeuk.com> wrote:
gcc is a project of 10s of 1000s of files. If a particular configuration
targets one architecture out of 100, it can also support the ASM syntax
for that architecture.
Not reasonably so. The parser of every front end would have to have
a special case for that assembly language.
The proposal does not pass a cost/benefit analysis, due to the
high cost and low benefit.
You don't need to have the ASM syntax embedded within the C grammar. Not
so specifically anyway; you allow a bunch of keyword and register tokens
within asm{...}.
keywords and tokens are grammar.
So how does the grammar of HTML work: does it also include the entire
grammar of JavaScript?
JS appears inside <script> ... </script> tags.
On 17/01/2024 13:25, David Brown wrote:
On 16/01/2024 23:42, bart wrote:
On 16/01/2024 17:46, David Brown wrote:
On 16/01/2024 16:15, bart wrote:
void swap_lots(const uint32_t * restrict in, uint32_t * restrict
out, uint32_t n) {
while (n--) {
*out++ = swapEndian32(*in++);
*out++ = swapEndian32(*in++);
*out++ = swapEndian32(*in++);
*out++ = swapEndian32(*in++);
}
}
What's 'n' here? Are 4n bytes being transformed, or 16n?
In this case, 4n 32-bit words. It's just example code to demonstrate
a point, not to be a particularly useful function in reality.
My function would be this on x64 (although I don't support bswap):
fun swapends64(u64 x)u64 = assem mov rax, [x]; bswap rax; end
And how efficient would your "swap_lots" function be?
How do you measure efficiency? This task seems memory-bound anyway.
That will depend on the sizes, cache, etc., as well as the target.
The target I was using here is a Cortex M7 - with data in
tightly-coupled memory that runs at core speed, it would not be memory
bound.
Efficiency is measured in clock cycles. (It can also be measured in
code size, but that's usually not as important. If it were, we would
not be doing manual loop unrolling here.)
Using exactly that function (I now support 'bswap'), I can process a
100M-element u64 array (0.8GB) in .35 seconds, or 2.3GB/second.
On what target? Do you have an ARM Cortex-M microcontroller for
testing this?
I have an RPi4 somewhere; I only know it has a 64-bit ARM chip. ARM
processor and architecture models are a mystery to me. My tests were done
on some AMD Ryzen device, probably bottom of the range.
(x64 processors are a bit of a mystery too! I just have little interest beyond whether it's x64 or ARM64; I don't come across anything else.)
Using a built-in bswap function kind of defeats the point of using
inline assembly.
I didn't support 'bswap' at all. I first had to add it to my assembler,
then roll that out to the backend of my compiler.
This was necessary both to be able to use it in inline assembler, and
for the compiler backend to be able to deal with that instruction,
whatever had generated it.
So I wanted to know whether building it into the language would be
worth doing, performance-wise (probably not). Although there are other advantages of having it as an operator, since it could be used in-place
as 'bswap:=A[i]', and here it would be able to choose 32- and 64-bit variations.
But it also highlighted issues with my inline assembler within macros
used to emulate inline functions, such as this:
macro byteswap(x) = assem mov rax, [x] ...
'x' can only be a simple variable at the invocation, not an arbitrary expression like 'A[i]'. This suggested an enhancement:
assem (expr) ...
which first evaluates the expression to a known register.
(I could just
do 'expr; assem ...' which /probably/ does the same, but it's risky.)
So doing the exercise helped in determining what might be a suitable compromise. Although it would need explicit forms for 32- vs 64-bit operations.
('assem(expr) ...' is also partway to doing what gcc asm{} does in
creating an interface, but retaining the same syntax.)
gcc has __builtin_bswap32 too, or it can be written manually in C
and optimised by the compiler to "rev" instructions. The point is a
demonstration of how gcc's inline assembly can work with the optimiser
for surrounding code, not how fast your PC can swap endianness!
It showed that using an actual function call, and not unrolling loops,
wasn't that slow, not on my PC.
On 12/01/2024 16:50, Scott Lurndal wrote:
bart <bc@freeuk.com> writes:
On 12/01/2024 13:40, David Brown wrote:
On 12/01/2024 00:20, bart wrote:
But with 'as', it just sits there. I wonder what it's waiting for; for
me to type in ASM code live from the terminal?
It does that so you can pipe the assembler source code in to the
assembler.
$ cat file.s | as
$ cat file.c | cpp | c0 | c1 | c2 | as > file.o
or, if you want, you can type in the assembler source directly.
Or you can save it in a file and supply the file argument to the command.
None of which your stuff supports, which makes it useless to me.
I had a spare 15 minutes so I got my scripting language to do this:
csource:=(
"#include <stdio.h>",
"int main(void) {",
" puts(""Fahrenheit 451"");",
"}")
csource -> mcc -> aa -> run
'mcc' turns a source string (or a list of strings as here) into a string containing assembly code.
'aa' turns an assembly string into a string containing binary PE data.
'run' runs that PE data. Output is:
Fahrenheit 451
Those 3 functions are 40-50 lines. Here's another invocation:
On 2024-01-17, bart <bc@freeuk.com> wrote:
csource -> mcc -> aa -> run
'mcc' turns a source string (or a list of strings as here) into a string
containing assembly code.
'aa' turns an assembly string into a string containing binary PE data.
'run' runs that PE data. Output is:
Fahrenheit 451
Those 3 functions are 40-50 lines. Here's another invocation:
Scott's script is just that one line; just the external commands are
called.
It doesn't require function wrappers around external commands.
We have a ton of useful scripting languages; yet the shell is not
going away for process control tasks.
The only reason to use the shell for larger coding tasks that go
beyond process control is that it's the only language you can
rely on being installed.
A good many scripting languages require a shell in order to build,
such as for running a ./configure script.
On 17/01/2024 18:47, Kaz Kylheku wrote:
On 2024-01-17, bart <bc@freeuk.com> wrote:
A good many scripting languages require a shell in order to build,
such as for running a ./configure script.
Funnily enough, mine don't.
(As for Bash, is it a language running in a permanent REPL loop, or is
it more of an application?)
bart <bc@freeuk.com> writes:
On 17/01/2024 18:47, Kaz Kylheku wrote:
On 2024-01-17, bart <bc@freeuk.com> wrote:
A good many scripting languages require a shell in order to build,
such as for running a ./configure script.
Funnily enough, mine don't.
Perhaps you find it humorous. A posix shell comes with pretty
much every system used for software development (even windows
has WSL for serious software development).
Your scripting language isn't available on any of them.
(As for Bash, is it a language running in a permanent REPL loop, or is
it more of an application?)
It's an executable, just like everything else on unix/linux.
On 17/01/2024 22:18, Scott Lurndal wrote:
bart <bc@freeuk.com> writes:
On 17/01/2024 18:47, Kaz Kylheku wrote:
On 2024-01-17, bart <bc@freeuk.com> wrote:
A good many scripting languages require a shell in order to build,
such as for running a ./configure script.
Funnily enough, mine don't.
Perhaps you find it humorous. A posix shell comes with pretty
much every system used for software development (even windows
has WSL for serious software development).
I suspect it's more that it's been browbeaten into having such a system because so many developers were deliberately, inadvertently or
spitefully building in so many dependencies from the Unix world into
their software and their procedures.
On 2024-01-17, bart <bc@freeuk.com> wrote:
On 17/01/2024 22:18, Scott Lurndal wrote:
bart <bc@freeuk.com> writes:
On 17/01/2024 18:47, Kaz Kylheku wrote:
On 2024-01-17, bart <bc@freeuk.com> wrote:
A good many scripting languages require a shell in order to build,
such as for running a ./configure script.
Funnily enough, mine don't.
Perhaps you find it humorous. A posix shell comes with pretty
much every system used for software development (even windows
has WSL for serious software development).
I suspect it's more that it's been browbeaten into having such a system
because so many developers were deliberately, inadvertently or
spitefully building in so many dependencies from the Unix world into
their software and their procedures.
You can get shell scripts running on Windows.
Macs support them
natively.
The availability of the shell language is better than any other;
the places where it doesn't run out of the box are not many.
Also, no operating system that is still in wide use has a better
command language.
We could imagine, say, writing the build scripting of a project in the
batch file language of CMD.EXE, and making that work on all the
platforms: Mac, GNU Linuxes, Solaris, OpenBSD, Android/Termux what
have you.
For all the inconvenience of having to install that interpreter first,
we would then have to contend with a comically poor quality language.
It's worse than some CS freshman's semester project.
Shell has multiple implementations. At one point I tested the ./configure script of my TXR project with Bash, Zsh, Dash, Ash, public domain Ksh,
as well as some POSIX shells of some proprietary Unixes like Solaris's
POSIX shell.
Though the shell language is quirky, it is well specified and understood
to the point that fairly complex code can be written that works across multiple implementations.
As far as Windows goes, nobody actually builds my program for Windows
other than me. I provide a GUI installer. That's what you do for
Windows.
How your stuff builds on Windows is almost immaterial.
The only people building your FOSS stuff from code are (1) maintainers
of FOSS distros, who always use environments with shells and make, etc;
and (2) a few enthusiast users who build for themselves, also in such environments. Anyone who has something to do on Windows gets your
installer.
On 18/01/2024 00:25, Kaz Kylheku wrote:
On 2024-01-17, bart <bc@freeuk.com> wrote:
On 17/01/2024 22:18, Scott Lurndal wrote:
bart <bc@freeuk.com> writes:
On 17/01/2024 18:47, Kaz Kylheku wrote:
On 2024-01-17, bart <bc@freeuk.com> wrote:
A good many scripting languages require a shell in order to build,
such as for running a ./configure script.
Funnily enough, mine don't.
Perhaps you find it humorous. A posix shell comes with pretty
much every system used for software development (even windows
has WSL for serious software development).
I suspect it's more that it's been browbeaten into having such a system
because so many developers were deliberately, inadvertently or
spitefully building in so many dependencies from the Unix world into
their software and their procedures.
You can get shell scripts running on Windows.
You mean via layers like WSL or MSYS2 or CYGWIN? I don't really consider
that running on Windows.
Macs support them
natively.
The availability of the shell language is better than any other;
the places where it doesn't run out of the box are not many.
Also, no operating system that is still in wide use has a better
command language.
Yes, I have an idea what you mean by better. Something that gives low priority to user-friendliness and that is bristling with advanced
features that mean loads of gotchas.
We could imagine, say, writing the build scripting of a project in the
batch file language of CMD.EXE, and making that work on all the
platforms: Mac, GNU Linuxes, Solaris, OpenBSD, Android/Termux what
have you.
For all the inconvenience of having to install that interpreter first,
we would then have to contend with a comically poor quality language.
It's worse than some CS freshman's semester project.
What language do you have in mind? (Bear in mind that this is exactly
what I've long thought about C.)
Shell has multiple implementations. At one point I tested the ./configure
script of my TXR project with Bash, Zsh, Dash, Ash, public domain Ksh,
as well as some POSIX shells of some proprietary Unixes like Solaris's
POSIX shell.
I wonder why any other scripting languages exist? Why does TXR? What's missing from Bash (and why are there so many variations)?
As far as Windows goes, nobody actually builds my program for Windows
other than me. I provide a GUI installer. That's what you do for
Windows.
How do you get past AV software?
On 2024-01-18, bart <bc@freeuk.com> wrote:
Microsoft's PowerShell language is a lot better (if I believe what
others say). It's not widely installed though. You'd be better off
relying on Python from that perspective.
How do you get past AV software?
For some AV to identify your installer as a threat, it has to be a false positive. I ran into that only once before.
On 1/17/2024 4:43 AM, David Brown wrote:
On 17/01/2024 02:04, Chris M. Thomasson wrote:
On 1/16/2024 6:06 AM, David Brown wrote:
Well, you don't often write inline assembly - it's rare to write it.
It's typically the kind of thing you write once for your particular
instruction, then stick it away in a header somewhere. You might
use it often, but you don't need to read or edit the code often.[...]
As soon as you use inline assembler in a file, you sort of "need" to?
Nonsense.
If I use inline asm in a file, I at least need to add in comments that
this is arch specific code. A macro for the arch also might be in order.
So, if the user compiles it on a different arch, well, the inline asm is elided. Why is that wrong? You never did that before?
On 2024-01-18, bart <bc@freeuk.com> wrote:
You mean via layers like WSL or MSYS2 or CYGWIN? I don't really consider
that running on Windows.
What is it? To the left of Windows? Behind Windows? Under Windows?
It doesn't matter which one of these prepositions it is, because in this situation being discussed all those programs are doing is providing an environment for building something.
On 05/01/2024 19:42, Kaz Kylheku wrote:
On 2024-01-05, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
When you wrote "in Linux", I wondered if you were being imprecise, but
in fact that code is in the Linux kernel.
That means the macros aren't directly available to normal C code, but
you can always copy their definitions (*if* you're using a compiler
that supports the __builtin_types_compatible_p extension).
You can always copy their definitions, if you're using a compiler
that doesn't arbitrarily define __GNUC__ without providing the
associated behaviors:
#ifdef __GNUC__
// define array_size in the Linux kernel way
#else
#define array_size(x) (sizeof (x)/sizeof *(x))
#endif
If you regularly build the code with a compiler that provides GNU
extensions (like as part of your CI), you're covered, even if you're
going to production with something else.
I use C++ this way in C projects; I have some macro features that
provide extra checks under C++. I get the benefit even if I just
compile the code as C++ only once before every release.
That is a good tactic if your code needs to be used with compilers that
don't have such features for static checking (perhaps you are releasing
your source code, and can't influence the tools or settings users use).
In a way, you are using gcc (or C++) as a linter.
I've done this myself when the actual target compiler had little in the
way of static checking - I ran gcc in parallel, for a different target
(the generated output was ignored), but with lots of warning flags that
the real compiler did not support.
On 18/01/2024 04:30, Kaz Kylheku wrote:
On 2024-01-18, bart <bc@freeuk.com> wrote:
Microsoft's PowerShell language is a lot better (if I believe what
others say). It's not widely installed though. You'd be better off
relying on Python from that perspective.
All I know about PowerShell is that instead of typing 'prog' to run the 'prog.exe' you've just created, you have to do '.\prog' just like you
have to do './prog' in Linux.
Plus it has a gazillion options, accessed from diverse places, so you
can spend half your time just trying to get a consistent background
colour. MS really know how to go to town on such things.
With TXR Win64 installer, Windows Defender stopped me from running the program (until I clicked 'more info' then there was an option to 'run anyway').
This is why these days, if somebody wants to run one of my binary EXEs, I make it available as a C source file instead.
Since if they have experience of compiling programs from source, they
will already know how to run such programs without AV interference.
On 18/01/2024 04:30, Kaz Kylheku wrote:
On 2024-01-18, bart <bc@freeuk.com> wrote:
You mean via layers like WSL or MSYS2 or CYGWIN? I don't really consider that running on Windows.
What is it? To the left of Windows? Behind Windows? Under Windows?
It doesn't matter which one of these prepositions it is, because in this
situation being discussed all those programs are doing is providing an
environment for building something.
Let's consider an OS like Android, to avoid a made-up one. (Put aside
the likelihood that deep inside Android might be a Linux core. Think of
the glossy-looking consumer OS that everyone knows.)
Now, somebody brings out some piece of software that only builds under Android.
To build it under Linux, requires installing a vast virtual Android OS
to do that.
Would you consider that being able to build 'under Linux'?
(Let's not go into whether the resulting binary can run outside of
Android.)
For that matter (this might have been a better example), would you
consider a project that builds using a massive Visual Studio
installation to compile 'under Linux' if you could somehow run VS
within Linux?
Could you, in good faith, provide sources to that project and say it
builds 'under Linux, no problem'. Never mind the considerable
dependencies they would need and the significant likelihood of failure.
Remember also that on Windows you want to end up with EXEs and DLLs that
run under actual Windows. Not ELF and SO files.
On 2024-01-18, bart <bc@freeuk.com> wrote:
On 18/01/2024 04:30, Kaz Kylheku wrote:
On 2024-01-18, bart <bc@freeuk.com> wrote:
Microsoft's PowerShell language is a lot better (if I believe what
others say). It's not widely installed though. You'd be better off
relying on Python from that perspective.
All I know about PowerShell is that instead of typing 'prog' to run the
'prog.exe' you've just created, you have to do '.\prog' just like you
have to do './prog' in Linux.
I don't know anything about PowerShell. In the case of the POSIX-like
shells, this is just a consequence of the . directory not being listed
in the PATH variable, which is an end-user configuration matter.
If you care about the convenience of just typing "prog", more than
about the security aspect, you can just add . to your PATH.
Plus it has a gazillion options, accessed from diverse places, so you
can spend half your time just trying to get a consistent background
colour. MS really know how to go to town on such things.
Looks like you have nowhere to run. You don't like Unix; and Microsoft's
next-generation shell is too complex.
This is why these days, if somebody wants to run a binary EXE, I make it
available as a C source file.
But they need a C compiler EXE to compile it.
This is a complete non-starter for applications which are complex
and don't target programmers.
Even programmers won't use a program on Windows that they have to figure
out how to compile for Windows.
Since if they have experience of compiling programs from source, they
will already know how to run such programs without AV interference.
OK, now I understand why you need C compiling to be dead easy.
You'd like to ship .c files to Windows users.
On 18/01/2024 04:30, Kaz Kylheku wrote:
On 2024-01-18, bart <bc@freeuk.com> wrote:
You mean via layers like WSL or MSYS2 or CYGWIN? I don't really consider that running on Windows.
What is it? To the left of Windows? Behind Windows? Under Windows?
It doesn't matter which one of these prepositions it is, because in this
situation being discussed all those programs are doing is providing an
environment for building something.
Let's consider an OS like Android, to avoid a made-up one. (Put aside
the likelihood that deep inside Android might be a Linux core. Think of
the glossy-looking consumer OS that everyone knows.)
Now, somebody brings out some piece of software that only builds under Android.
To build it under Linux, requires installing a vast virtual Android OS
to do that.
Would you consider that being able to build 'under Linux'?
On 18/01/2024 16:16, bart wrote:
Could you, in good faith, provide sources to that project and say it
builds 'under Linux, no problem'. Never mind the considerable
dependencies they would need and the significant likelihood of failure.
Dependencies are part of life with computers. A Windows installation without additional software gives you Minesweeper and Wordpad. How can
you possibly justify thinking that installing msys2 and mingw-64 for a compilation means you are "not building under Windows" while installing
MSVC means you /are/ "building under Windows"? The MSVC installation is
10 times the size of mingw-64.
On 06/01/2024 07:39, David Brown wrote:
On 05/01/2024 19:42, Kaz Kylheku wrote:
On 2024-01-05, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
When you wrote "in Linux", I wondered if you were being imprecise, but in fact that code is in the Linux kernel.
That means the macros aren't directly available to normal C code, but
you can always copy their definitions (*if* you're using a compiler
that supports the __builtin_types_compatible_p extension).
You can always copy their definitions, if you're using a compiler
that doesn't arbitrarily define __GNUC__ without providing the
associated behaviors:
#ifdef __GNUC__
// define array_size in the Linux kernel way
#else
#define array_size(x) (sizeof (x)/sizeof *(x))
#endif
If you regularly build the code with a compiler that provides GNU
extensions (like as part of your CI), you're covered, even if you're
going to production with something else.
I use C++ this way in C projects; I have some macro features that
provide extra checks under C++. I get the benefit even if I just
compile the code as C++ only once before every release.
That is a good tactic if your code needs to be used with compilers
that don't have such features for static checking (perhaps you are
releasing your source code, and can't influence the tools or settings
users use). In a way, you are using gcc (or C++) as a linter.
I've done this myself when the actual target compiler had little in
the way of static checking - I ran gcc in parallel, for a different
target (the generated output was ignored), but with lots of warning
flags that the real compiler did not support.
This is pretty much how in the past I have suggested people use Tiny C:
use it for very fast compilation turnaround. Use gcc from time to time
for better error checking, and for production builds for faster code.
Another is for situations where obtaining the small-footprint Tiny C
compiler is simpler, or where it can be bundled with your source code.
Yet another is when you have a compiler for another language that
targets C code. Since the program has already been verified, the C code should be correct. Then extensive static checking is not needed; you
just want a fast backend that doesn't take up 95% of the overall
compilation time.
Again, use gcc for a production version for some extra speed, but it is
not essential to have it.
This is a compiler you've dismissed as a toy.
AFAIK, if I wanted to supply a program to any Linux user, I can't just
supply a binary, since every Linux is a bit different and they run on
more different kinds of processor.
This is where MS Windows reigned supreme.
On 19/01/2024 10:07, David Brown wrote:
On 18/01/2024 21:21, bart wrote:
AFAIK, if I wanted to supply a program to any Linux user, I can't
just supply a binary, since every Linux is a bit different and they
run on more different kinds of processor.
As so often, "AFAIK" for you is not very far.
For almost all Linux users, for almost all their programs, they
download and install binaries - they don't compile from source.
Compiling from source as a common method of distributing programs fell
out of fashion some 20 or 30 years ago.
That's good news. So I can build an ELF binary on my WSL today, I can
email it to you and you can run it on the nearest Linux machine?
I assume it will only work if that machine happens to be x64 (as that's
what mine is)?
Or will it not work even then?
What are the rules? Do I have to make a version for every variant of
Linux and make it available in the 'apt-get' repository?
For developers or companies who don't want to distribute source, or
who want to provide the users the convenience of binaries but it's not
appropriate to include the programs in distros, it is normal to
provide only one or two binaries - an x86-64 binary, and sometimes
also an AArch64 binary if that is a likely target.
It can be more complicated if the software has to integrate tightly
with different aspects of the system, but that's usually not an issue
for most software.
This is where MS Windows reigned supreme.
You mean it is so much more limited?
In being able to run EXEs created by anyone on any machine, and being
able to run them on any Windows computer, even years later.
(I think you can still run 32-bit EXEs compiled in the 1990s today.
16-bit EXEs can still run today on a 32-bit Windows.)
On 18/01/2024 21:21, bart wrote:
AFAIK, if I wanted to supply a program to any Linux user, I can't just
supply a binary, since every Linux is a bit different and they run on
more different kinds of processor.
As so often, "AFAIK" for you is not very far.
For almost all Linux users, for almost all their programs, they download
and install binaries - they don't compile from source. Compiling from source as a common method of distributing programs fell out of fashion
some 20 or 30 years ago.
For developers or companies who don't want to distribute source, or who
want to provide the users the convenience of binaries but it's not appropriate to include the programs in distros, it is normal to provide
only one or two binaries - an x86-64 binary, and sometimes also an
AArch64 binary if that is a likely target.
It can be more complicated if the software has to integrate tightly with different aspects of the system, but that's usually not an issue for
most software.
This is where MS Windows reigned supreme.
You mean it is so much more limited?
On 19/01/2024 12:17, bart wrote:
It
doesn't matter what anyone says, you will always twist things to fit
your preconceived ideas that everything Linux-related is terrible in
every way,
and that somehow you think it all proves that your own tools
are brilliant and the rest of the world is wrong.
On 19/01/2024 14:18, bart wrote:
It is also incomprehensible how someone can rant endlessly about "simplification" and "ease of use" when their claimed solution is to
make non-standard limited duplications of the functionality already
found on other machines.
On 19/01/2024 11:41, David Brown wrote:
On 19/01/2024 12:17, bart wrote:
It doesn't matter what anyone says, you will always twist things to
fit your preconceived ideas that everything Linux-related is terrible
in every way,
That's funny. I find that Linux people always have terrible things to
say about software development using Windows, usually totally unjustified.
The main reason is that on Linux they are reliant on a huge mountain of dependencies that don't exist on Windows, even when creating supposedly cross-platform products.
On 19/01/2024 10:07, David Brown wrote:
On 18/01/2024 21:21, bart wrote:
AFAIK, if I wanted to supply a program to any Linux user, I can't just
supply a binary, since every Linux is a bit different and they run on
more different kinds of processor.
As so often, "AFAIK" for you is not very far.
For almost all Linux users, for almost all their programs, they download
and install binaries - they don't compile from source. Compiling from
source as a common method of distributing programs fell out of fashion
some 20 or 30 years ago.
That's good news. So I can build an ELF binary on my WSL today, I can
email it to you and you can run it on the nearest Linux machine?
On 19/01/2024 14:18, bart wrote:
On 19/01/2024 11:41, David Brown wrote:
On 19/01/2024 12:17, bart wrote:
It doesn't matter what anyone says, you will always twist things to
fit your preconceived ideas that everything Linux-related is terrible
in every way,
That's funny. I find that Linux people always have terrible things to
say about software development using Windows, usually totally
unjustified.
And feel free to ignore these if you have good reason to believe they
are not fact based. Don't feel free to exaggerate what people say (I
say Linux is better for most development and programming work, other
than when targeting Windows itself - but you /can/ use Windows for many tasks, and I do use Windows as well as Linux). [...]
[...]
[...]
I can fully understand how someone who is only familiar with Windows
(and DOS before it) doesn't know about these utilities or where to get
them, and finds them strange at first. [...]
On 19/01/2024 14:42, David Brown wrote:
On 19/01/2024 14:18, bart wrote:
It is also incomprehensible how someone can rant endlessly about
"simplification" and "ease of use" when their claimed solution is to
make non-standard limited duplications of the functionality already
found on other machines.
The programs I write ARE easy to build, using ONLY a compiler.
You don't want to accept that, you'd rather it involved those mountains
of stuff.
On 19.01.2024 15:42, David Brown wrote:
On 19/01/2024 14:18, bart wrote:
On 19/01/2024 11:41, David Brown wrote:
On 19/01/2024 12:17, bart wrote:
It doesn't matter what anyone says, you will always twist things to
fit your preconceived ideas that everything Linux-related is terrible
in every way,
That's funny. I find that Linux people always have terrible things to
say about software development using Windows, usually totally
unjustified.
And feel free to ignore these if you have good reason to believe they
are not fact based. Don't feel free to exaggerate what people say (I
say Linux is better for most development and programming work, other
than when targeting Windows itself - but you /can/ use Windows for many
tasks, and I do use Windows as well as Linux). [...]
There was a time when I dual-booted (pre-installed) WinDOS and
Linux. At some point I became aware that it made no sense; Windows
was - besides its inherent issues - also completely unnecessary
(as soon became apparent); further installs were Linux-only.
It's a phenomenon that the marketing division of MS did so great
a job of mentally binding such a huge community. (There were other
reasons as well (driver support of companies, pre-installed OS
on hardware, gaming focus, or self-enforcing dissemination
feedback processes), but discussion would lead too far here.)
I've often observed uninformed WinDozers just spreading nonsense.
But never with such extreme stubbornness as in this thread.
[...]
[...]
I can fully understand how someone who is only familiar with Windows
(and DOS before it) doesn't know about these utilities or where to get
them, and finds them strange at first. [...]
This is part of the phenomenon.
And as we see, nothing helps if folks grew up in that bubble and
never left it, never even dared to have a look outside and try
to grasp what's going on, yet continue uninformed rants about
something they never experienced nor understood to even a minimal
extent.
We have to accept, though, that making specialized niche software
for a specific platform also does not foster open-mindedness when
one is confronted with the huge professional IT world.
But I have hope; after issues with Windows
I installed Linux
even for my (very old) father, and he has no problems with it
(using browser, email, and a text processor). And my children
installed Linux by themselves, anyway, and they're programming
their academic research tasks on that platform. - No lengthy
stupid discussions as in this thread.
On 19/01/2024 16:03, bart wrote:
On 19/01/2024 14:42, David Brown wrote:
On 19/01/2024 14:18, bart wrote:
It is also incomprehensible how someone can rant endlessly about
"simplification" and "ease of use" when their claimed solution is to
make non-standard limited duplications of the functionality already
found on other machines.
The programs I write ARE easy to build, using ONLY a compiler.
You don't want to accept that, you'd rather it involved those
mountains of stuff.
/Please/ stop making stupid, exaggerated and incorrect claims about what other people want or think.
I don't give a *beep* how easy or hard your programs are to compile -
they are utterly irrelevant to me. I don't have reason to believe they
are very relevant to anyone else either. But even if they were useful
to me, or at least of interest to me, it does not matter to me if they
need a whole range of standard utilities to build - because every system
I have, or ever have had, has the standard utilities.
Why is it so difficult for you to understand the concept of not caring?
It is absurd to suggest I /want/ your programs to need "make" or other utilities to compile - I *do* *not* *care* if it needs them or not.
There is no advantage or disadvantage to me if it uses make, sed, awk,
bash, date. There is no advantage or disadvantage to me if it does not
need them either. It is irrelevant.
It /would/ be an inconvenience to me if it required some obscure and
rarely used compiler like your own tools. TCC is also pretty obscure,
but Debian has it in its repositories. Needing an invasive tool that screws up your system, like MSVC, would also be a pain, as would
anything that required very specific versions of tools. But that would
only matter if your programs were of interest.
And of course I accept that /you/ think it is important that you don't
use tools that everyone else finds convenient and useful. I don't understand your reasons for that
- it seems to be based on a determined
battle to ensure that you fail to get anything you deem to be "Linux
related" to work, combined with a fanaticism about "simple" that
completely misses the point. But I don't think anyone here doubts that /you/ think this is all desperately important, even though pretty much
no one else cares.
And /please/ stop making an arse of yourself here by calling a dozen
standard utilities "mountains of stuff".
On 19/01/2024 17:12, David Brown wrote:
On 19/01/2024 16:03, bart wrote:
On 19/01/2024 14:42, David Brown wrote:
On 19/01/2024 14:18, bart wrote:
It is also incomprehensible how someone can rant endlessly about
"simplification" and "ease of use" when their claimed solution is to
make non-standard limited duplications of the functionality already
found on other machines.
The programs I write ARE easy to build, using ONLY a compiler.
You don't want to accept that, you'd rather it involved those
mountains of stuff.
/Please/ stop making stupid, exaggerated and incorrect claims about what
other people want or think.
I don't give a *beep* how easy or hard your programs are to compile -
they are utterly irrelevant to me. I don't have reason to believe they
are very relevant to anyone else either. But even if they were useful
to me, or at least of interest to me, it does not matter to me if they
need a whole range of standard utilities to build - because every system
I have, or ever have had, has the standard utilities.
Why is it so difficult for you to understand the concept of not caring?
It is absurd to suggest I /want/ your programs to need "make" or other
utilities to compile - I *do* *not* *care* if it needs them or not.
There is no advantage or disadvantage to me if it uses make, sed, awk,
bash, date. There is no advantage or disadvantage to me if it does not
need them either. It is irrelevant.
It /would/ be an inconvenience to me if it required some obscure and
rarely used compiler like your own tools. TCC is also pretty obscure,
but Debian has it in its repositories. Needing an invasive tool that
screws up your system, like MSVC, would also be a pain, as would
anything that required very specific versions of tools. But that would
only matter if your programs were of interest.
And of course I accept that /you/ think it is important that you don't
use tools that everyone else finds convenient and useful. I don't
understand your reasons for that
Because they don't work. Things like makefiles, even designed to work on Windows, had a 50% failure rate.
On 19/01/2024 16:32, Janis Papanagnou wrote:
And as we see, nothing helps if folks grew up in that bubble and
never left it, never even dared to have a look outside and try
to grasp what's going on, yet continue uninformed rants about
something they never experienced nor understood to even a minimal
extent.
It's not me who's in the bubble. How about that giant bubble called
'Linux'?
On 19/01/2024 17:12, David Brown wrote:
And of course I accept that /you/ think it is important that you don't
use tools that everyone else finds convenient and useful. I don't
understand your reasons for that
Because they don't work. Things like makefiles, even designed to work on Windows, had a 50% failure rate.
Apparently it is inconceivable to you to have an application written in
C say, that can be compiled either on Linux or Windows using only a C compiler.
But I will leave this alone now.
[...] Just including <unistd.h> has undefined behavior as far as
ISO C is concerned, [...]
On 2024-01-17, bart <bc@freeuk.com> wrote:
The only reason to use the shell for larger coding tasks that go
beyonod process control is that is that it's the only language you can
rely on being installed.
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
[...] Just including <unistd.h> has undefined behavior as far as
ISO C is concerned, [...]
Not true. The behavior of #include <unistd.h> is defined in
section 6.10.2 p2. One of two things is true: either the header
named is part of the implementation, or it isn't. If the named
header is part of the implementation, then it constitutes a
language extension, and so it must be documented (and defined).
On 2024-01-20, Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:...
named is part of the implementation, or it isn't. If the named
header is part of the implementation, then it constitutes a
language extension, and so it must be documented (and defined).
The problem with this reasoning is
1. Implementations in fact have internal headers that are not
documented, and which are not supposed to be included directly. You
will not get documentation for every single header that
is accessible via #include, and it is not reasonable for ISO C to
require it. I don't see where it does.
On 2024-01-20 21:50, James Kuyper wrote:
The standard doesn't provide a definition for "extension", though it
[...]:
"An implementation shall be accompanied by a document that defines all
implementation-defined and locale-specific characteristics and all
extensions." 4p9
That sentence says something subtly different from something like
"an implementation shall document everything that looks stable enough
to be an extension".
It is saying that there is required to be a document, and that in
that document there may be a list of one or more extensions.
The standard is defining the *fact* that this list is exhaustive:
it comprises all the extensions that are offered in the contract
between implementor and programmer.
Thus, for instance, if programmers empirically discover a behavior
which is not defined by ISO C, and which is not in that list,
it is not an extension. The standard states that that list has
all the extensions, and that discovered feature is not on it.
The sentence does not require implementors to document,
as an extension, every header file that is reachable by an #include directive.
However, it seems more reasonable to me that whether or not something is
an extension is not supposed to be up to the implementor to decide.
Instead, anything that is an extension must be listed. This would be an easier interpretation to defend if there were any explicit definition of "extension" in the standard.
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:
This is a CPP question that arose last month. It's not about an
actual issue with the software, just out of curiosity and to be sure
it works reliably (it seemingly does).
In a C99 program on Linux (Ubuntu) I intended to use usleep() and
then also strnlen().
When I added usleep() and its include file I got an error and was
asked to define the CPP tag '_BSD_SOURCE'. I did so, and because I
wanted side effects of that tag kept as small as possible I
prepended it just before the respective #include and put it at the
end of my #include list
...other #includes...
#define _BSD_SOURCE
#include <unistd.h>
But as became obvious, done *that* way there had been side-effects
and I had to put the tag at the beginning of all include files (which
astonished me)
#define _BSD_SOURCE
#include <unistd.h>
...other #includes here...
For the strnlen() function I needed another CPP tag, '_GNU_SOURCE'.
So now I have both CPP tag definitions before the includes
I second the recommendations of Lowell Gilbert and others not to
define _BSD_SOURCE or _GNU_SOURCE (especially not _GNU_SOURCE)
but instead seek alternatives, which are readily available for
the two functionalities being sought in this case.
#define _GNU_SOURCE /* necessary for strnlen() in string.h */
#define _BSD_SOURCE /* necessary for usleep() in unistd.h */
...all #includes here...
For strnlen(), put an inline definition in a header file:
#ifndef HAVE_strnlen_dot_h_header
#define HAVE_strnlen_dot_h_header
#include <stddef.h>
static inline size_t
strnlen( const char *s, size_t n ){
extern void *memchr( const void *, int, size_t );
const char *p = memchr( s, 0, n );
return p ? (size_t){ p-s } : n;
}
#include <string.h>
#endif
Disclaimer: this code has been compiled but not tested.
strnlen() is specified by POSIX. It might make sense to
re-implement it if your code needs to work on a non-POSIX system
(that doesn't also provide it). Why would you want to do so
otherwise?
I'm trying to provide a helpful answer to the person I was
responding to, not espouse a philosophical viewpoint. Why do you
feel the need to start a style debate?
I don't. I simply asked you a question. You've refused to answer
it, and I won't waste my time asking again.
memchr() is declared in <string.h>. Why would you duplicate its
declaration rather than just using `#include <string.h>`?
I had a specific reason for writing the code the way I did.
It wasn't important to explain that so I didn't.
Unsurprisingly, you refuse to do so even when asked directly.
For usleep(), define an alternate function usnooze(), to be used
in place of usleep(). In header file usnooze.h:
[snip]
If your code doesn't need to be portable to systems that don't
provide usleep(), you can just use usleep(). If it does, it's
probably better to modify the code so it uses nanosleep().
Not everyone agrees with that opinion. Again, I'm just trying to
provide an answer helpful to OP, not advance an agenda. Like I
said in the part of my posting that you left out, I don't want to
get involved in a style war. If OP wants to modify his code to
use nanosleep(), I'm fine with that. If he wants to keep using
usleep() or switch to using usnooze(), I'm fine with that too. I
think it's more important in this case to provide options than to
try to change someone's point of view.
As usual, you vaguely denigrate my opinion without sharing your own.
On 2024-01-21, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
However, it seems more reasonable to me that whether or not something is
an extension is not supposed to be up to the implementor to decide.
Instead, anything that is an extension must be listed. This would be an
easier interpretation to defend if there were any explicit definition of
"extension" in the standard.
The implication is that anything that works by accident must be listed
as an extension. If function arguments happen to be evaluated left to
right, with all side effects complete between them, this has to be
listed as an extension. Then in the future when the vendor finds that inconvenient for further compiler work, they have to take away the
extension.
On 1/21/24 12:55, Kaz Kylheku wrote:
On 2024-01-21, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
However, it seems more reasonable to me that whether or not
something is an extension is not supposed to be up to the
implementor to decide. Instead, anything that is an extension
must be listed. This would be an easier interpretation to defend
if there were any explicit definition of "extension" in the
standard.
The implication is that anything that works by accident must be
listed as an extension. If function arguments happen to be
evaluated left to right, with all side effects complete between
them, this has to be listed as an extension. Then in the future
when the vendor finds that inconvenient for further compiler work,
they have to take away the extension.
No, that's unspecified behavior. If 4p9 were intended to be
understood as mandating documentation of all unspecified behavior,
it would make the category of implementation-defined behavior
redundant.
Since extensions are also explicitly prohibited from changing the
behavior of strictly conforming code, I think that in order for
something to qualify as an extension, it has to define behavior that
the standard leaves undefined.
In a certain sense, every
implementation implicitly defines that behavior - whatever it is
that actually happens when the behavior is undefined is that
definition. However, I think an extension should have to be
explicitly activated by some deliberate user choice, such as a
compiler option or by defining the behavior of some kinds of
syntax not allowed by the C standard. That's just IMHO, of course
- in the absence of an explicit definition in the standard, it's
hard to be sure what the committee intended.
On 2024-01-21, James Kuyper <jameskuyper@alumni.caltech.edu> wrote:
However, it seems more reasonable to me that whether or not something is
an extension is not supposed to be up to the implementor to decide.
Instead, anything that is an extension must be listed. This would be an
easier interpretation to defend if there were any explicit definition of
"extension" in the standard.
The implication is that anything that works by accident must be
listed as an extension. If function arguments happen to be
evaluated left to right, with all side effects complete between
them, this has to be listed as an extension. Then in the future
when the vendor finds that inconvenient for further compiler work,
they have to take away the extension.
I don't see how it can not be the implementor's privilege to
assert what is reliable for use and documented, versus what is
not.
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Kaz Kylheku <433-929-6894@kylheku.com> writes:
It's important to understand what constitutes an extension. And
there are different kinds of extensions.
If an implementation chooses, say, to evaluate function arguments
always left-to-right, that is perfectly okay (and need not be an
extension). But if there is to be a /guarantee/ that function
arguments will always be evaluated left-to-right, that means
providing a guarantee that the C standard does not; thus we have an
extension (to the requirements in the C standard), and the extension
must be documented. This case is one kind of extension.
If an implementation chooses to guarantee left-to-right evaluation, I
don't see anything in the standard that requires that guarantee to be documented. (Of course it can and should be.)
On 2024-01-20, Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
[...] Just including <unistd.h> has undefined behavior as far as
ISO C is concerned, [...]
Not true. The behavior of #include <unistd.h> is defined in
section 6.10.2 p2. One of two things is true: either the header
named is part of the implementation, or it isn't. If the named
header is part of the implementation, then it constitutes a
language extension, and so it must be documented (and defined).
The problem with this reasoning is
1. Implementations in fact have internal headers that are not
documented, and which are not supposed to be included directly.
You will not get documentation for every single header that
is accessible via #include, and it is not reasonable for ISO C to
require it. I don't see where it does.
2. A documented extension continues to be undefined behavior.
The behavior is "defined", but not "ISO C defined". So even
if all the implementation's internal headers were documented
as extensions, their use would still be UB.
Undefined behavior is behavior for which "this document imposes no
requirements", right?
It is not behavior for which "this document, together with the
documentation accompanying a given implementation, imposes
no requirements". Just "this document" (and no other).
If "this document" imposes no requirements, it's "undefined
behavior", no matter who or what imposes additional requirements!
Intuitively, a header which is part of an implementation can do
anything. For instance #include <pascal.h> can cause the rest of
the translation unit to be analyzed as Pascal syntax, and not C.
An implementation can provide a header <reboot.h>, including
which causes a reboot at compile time, link time, or
execution time.
If such headers happen to exist, what in the standard is violated?
I used to experience a lot of push-back against the above views,
but I'm seeing that people are coming around.
On 2024-01-18, bart <bc@freeuk.com> wrote:
All I know about PowerShell is that instead of typing 'prog' to run the 'prog.exe' you've just created, you have to do '.\prog' just like you
have to do './prog' in Linux.
I don't know anything about PowerShell. In the case of the POSIX-like
shells, this is just a consequence of the . directory not being listed
in the PATH variable, which is an end-user configuration matter.
If you care about the convenience of just typing "prog", more than
about the security aspect, you can just add . to your PATH.
On Thu, 18 Jan 2024 19:40:44 -0000 (UTC), Kaz Kylheku
<433-929-6894@kylheku.com> wrote:
On 2024-01-18, bart <bc@freeuk.com> wrote:
All I know about PowerShell is that instead of typing 'prog' to run the
'prog.exe' you've just created, you have to do '.\prog' just like you
have to do './prog' in Linux.
That's not true for 'prog', but is true for any name of a PS cmdlet or
alias. In that case you must use a pathname (which can be EITHER
.\name OR ./name) OR name.exe.
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
Kaz Kylheku <433-929-6894@kylheku.com> writes:
It's important to understand what constitutes an extension. And
there are different kinds of extensions.
If an implementation chooses, say, to evaluate function arguments
always left-to-right, that is perfectly okay (and need not be an
extension). But if there is to be a /guarantee/ that function
arguments will always be evaluated left-to-right, that means
providing a guarantee that the C standard does not; thus we have an
extension (to the requirements in the C standard), and the extension
must be documented. This case is one kind of extension.
If an implementation chooses to guarantee left-to-right evaluation, I
don't see anything in the standard that requires that guarantee to be documented. (Of course it can and should be.)
It's not entirely clear to me exactly what must or must not be
considered to be an "extension" as the C standard uses the term. The standard doesn't show the word "extension" in italics or provide a
definition in section 3.
Unless your point is that it's not a guarantee unless it's documented?
But I don't see that that documentation is required *by the standard*.
[...]
If an implementation chooses to define, for example, __uint128_t to
be the name of an integer type, that represents a third kind of
extension. (Annex J.5 also mentions such definitions as an example
on its list of "Common extensions".) Such cases must be documented
as extensions because, one, the expected default is that there be no
definition (which normally would cause a diagnostic because of some
constraint violation), and two, only the implementation is in a
position to know if the symbol in question was defined by the
implementation or by something else. Clearly any definition the
implementation provides signals a change from the expected default,
which by any reasonable definition of the word constitutes an
extension, and hence must be documented.
Again, that can and should be documented, but since any use of the
identifier __uint128_t has undefined behavior,
I don't see any
requirement in the standard that it must be documented.
(Aside: gcc has an extension adding __int128 as a keyword, supporting
types "__int128" and "unsigned __int128", but only on some target
systems. These do not qualify as extended integer types. I understand
that "__uint128_t" was a hypothetical example.)