That won't tell me the sizes or offsets. For that I need a different
program that applies sizeof() and offsetof() to each member, and
sizeof(S3).
However, that still won't tell me the actual /types/ of the fields. As
well as the width, if I need to access the individual fields, I will
need to know whether it's a signed int, unsigned int, or float.
That will need a third program!
But I guess all this won't cut any ice, since no matter how much of a
dog's dinner any C header file is, you are never going to admit that
there might be anything wrong with it, because there are always going
to be a few dozen more hoops to jump through to extract the info that
you need.
You summed up the steps very well. What I don't get is why you don't
have a program to carry out those steps! If a program runs ten steps or twelve what does it matter? It's a program. It's supposed to do as many
steps as necessary.
If you were carrying out the steps manually I could understand it. But
that would be a nightmare. Please tell me you are not carrying out any
of the steps manually!!! All steps should be doable by a program which
will run on each intended target in about 0.01 seconds, shouldn't they?
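A minimal sketch of such a probe program, for illustration (the member
names a, b and c are hypothetical, since S3's definition isn't shown;
substitute the real ones):
--------------------------------------------------------------
/* probe.c - compile and run on each target to dump the layout */
#include <stdio.h>
#include <stddef.h>

typedef struct { int a; unsigned b; float c; } S3;  /* stand-in definition */

/* C11 _Generic answers the "third program" question about field types */
#define TYPENAME(x) _Generic((x), int: "signed int", \
    unsigned int: "unsigned int", float: "float", default: "other")

#define REPORT(m) printf(#m ": offset %zu, size %zu, type %s\n", \
    offsetof(S3, m), sizeof ((S3){0}.m), TYPENAME((S3){0}.m))

int main(void) {
    printf("sizeof(S3) = %zu\n", sizeof(S3));
    REPORT(a); REPORT(b); REPORT(c);
    return 0;
}
--------------------------------------------------------------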
On 2021-08-29 15:58, Bart wrote:
And I guess they've already been applied to popular libraries to
provide language-neutral versions of those APIs, complete with all the
named enums, and converted the macros needed to be able to use the
library.
Except I can't find anything like that.
Really?
1. It is even integrated in GCC. See the
-fdump-ada-spec
switch.
2. c2ada:
http://c2ada.sourceforge.net/c2ada.html
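(A typical invocation is something along the lines of
"gcc -c -fdump-ada-spec header.h", which should leave an Ada spec such
as header_h.ads in the current directory.)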
And I guess they've already been applied to popular libraries to provide language-neutral versions of those APIs, complete with all the named
enums, and converted the macros needed to be able to use the library.
Except I can't find anything like that.
And if I apply that gcc option to my raylib example, I get output like
this:
-- unsupported macro: LIGHTGRAY CLITERAL( 200, 200, 200, 255 )
-- unsupported macro: GRAY CLITERAL( 130, 130, 130, 255 )
-- unsupported macro: DARKGRAY CLITERAL( 80, 80, 80, 255 )
-- unsupported macro: YELLOW CLITERAL( 253, 249, 0, 255 )
On 29/08/2021 14:28, James Harris wrote:
That won't tell me the sizes or offsets. For that I need a different
program that applies sizeof() and offsetof() to each member, and
sizeof(S3).
However, that still won't tell me the actual /types/ of the fields.
As well as the width, if I need to access the individual fields, I
will need to know whether it's a signed int, unsigned int, or float.
That will need a third program!
But I guess all this won't cut any ice, since no matter how much of a
dog's dinner any C header file is, you are never going to admit that
there might be anything wrong with it, because there are always going
to be a few dozen more hoops to jump through to extract the info that
you need.
You summed up the steps very well. What I don't get is why you don't
have a program to carry out those steps! If a program runs ten steps
or twelve what does it matter? It's a program. It's supposed to do as
many steps as necessary.
If you were carrying out the steps manually I could understand it. But
that would be a nightmare. Please tell me you are not carrying out any
of the steps manually!!! All steps should be doable by a program which
will run on each intended target in about 0.01 seconds, shouldn't they?
You're right. It's such a simple task that there must be dozens of
existing programs that will do that job already: convert a C header file
into a more universal, 'flattened', format of API more suited for cross-language use (and even simpler to machine-read than the original C!).
And I guess they've already been applied to popular libraries to provide language-neutral versions of those APIs, complete with all the named
enums, and converted the macros needed to be able to use the library.
Except I can't find anything like that.
On 24/08/2021 13:55, David Brown wrote:
On 24/08/2021 09:27, James Harris wrote:
On 23/08/2021 10:55, David Brown wrote:
On 23/08/2021 11:04, James Harris wrote:
On 22/08/2021 22:46, David Brown wrote:
...
I was talking about what can be done (by programming) rather than about
what's supported by all extant OSes. For sure, an OS can impose a limit
on file sizes but I was arguing that it doesn't have to and, frankly,
that it shouldn't.
Unless you think that an OS should use arbitrary precision integers for
handling file sizes and offsets, then you are wrong.
There are /always/ limits. There is /always/ a balance between having
limits that are so high that they won't be a bottleneck, and having
types that can be handled quickly and efficiently without a waste of
run-time or data space.
The limit of a file's size would naturally be defined by the filesystem
on which it was stored or on which it was being written. Such a value
would be known by, and a property of, the FS driver.
With big modern processors, 64-bit sizes here are efficient, and files
are not going to hit that level in the near future. (There are file
systems in use that approach 2 ^ 64 bytes in size, but not individual
files.)
Don't forget that files can have holes. So one does not need to store
(or even have capacity for) 2^64 bytes in order for a file's max offset
to be 2^64 - 1.
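For illustration, here is a minimal POSIX sketch (assuming a Unix-like
system) of a file whose offset range far exceeds its stored data:
--------------------------------------------------------------
/* sparse.c - create a file with a 1 GiB hole in it */
#include <fcntl.h>
#include <unistd.h>

int main(void) {
    int fd = open("holey", O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd < 0) return 1;
    lseek(fd, (off_t)1 << 30, SEEK_SET);  /* jump 1 GiB past the start */
    write(fd, "x", 1);   /* size is now 2^30 + 1; storage is ~1 block */
    close(fd);
    return 0;
}
--------------------------------------------------------------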
But more importantly, there's no need to prevent a small system from processing a big file.
On small systems, you use 32-bit for efficiency, and the same applies to
older big systems. (You have to get very old and very small to find
types smaller than 32-bit used for file sizes.)
I have in mind even smaller systems!
The OS /always/ imposes a limit, even if that limit is high.
No OS should do that. There's no need.
As I said, the max file size is naturally a property of the formatted
filesystem. That size would be set in the FS driver and made known to
the outside world. An OS could use the driver's published size just as
well and in the same way as I was suggesting that an application could
use the driver's published size.
The OS stands between the application and the filesystem. Any file
operation involves the application, the OS, and the filesystem. The
biggest file that can be handled is the minimum of the limits of all
three parts.
I have to disagree. All three parts can use the size determined by the filesystem.
An OS written 'properly' should, IMO, be able to run indefinitely and
should permit the addition of new filesystems which weren't even devised
when the OS was started. Again, the max file size would need to come
from the FS driver - which would be loadable and unloadable.
You have a strange idea about what is "properly written" here. For the
vast majority of OS's that are in use today, having run-time pluggable
file systems would be an insane idea. OS's are not just limited to *nix
and Windows.
Why would having loadable filesystem drivers be "an insane idea"?
I am not suggesting anything strange; AISI this is basic engineering,
nothing more.
/Appropriate/ levels of abstraction and flexibility is basic
engineering. /Appropriate/ limits and appropriate sizes for data is
basic engineering. Inappropriate generalisations and extrapolations
is not.
Again, I have to disagree. The question is: What defines how large a
file's offset can be?
The answer is just as simple: Each filesystem has its own range of max
sizes.
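A minimal sketch of that idea in C, with hypothetical names (the field
widths here are illustrative, not part of the proposal):
--------------------------------------------------------------
#include <stdint.h>

/* Each loadable filesystem driver publishes its own limits; the OS
   and applications read them from here rather than hard-coding one. */
struct fs_driver {
    const char *name;           /* e.g. "fat12", "fat32", "zfs" */
    uint64_t    max_file_size;  /* largest file this FS permits */
    /* ... open/read/write/seek entry points ... */
};

static const struct fs_driver fat32_driver = {
    .name          = "fat32",
    .max_file_size = 0xFFFFFFFFu,   /* FAT32's limit: 4 GiB - 1 */
};
--------------------------------------------------------------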
On 2021-08-29 16:38, Bart wrote:
And if I apply that gcc option to my raylib example, I get output like
this:
-- unsupported macro: LIGHTGRAY CLITERAL( 200, 200, 200, 255 )
-- unsupported macro: GRAY CLITERAL( 130, 130, 130, 255 )
-- unsupported macro: DARKGRAY CLITERAL( 80, 80, 80, 255 )
-- unsupported macro: YELLOW CLITERAL( 253, 249, 0, 255 )
Macros are what you usually must handle manually. No API should use them anyway. You cannot put a macro in a shared library.
On 29/08/2021 15:54, Dmitry A. Kazakov wrote:
On 2021-08-29 16:38, Bart wrote:
And if I apply that gcc option to my raylib example, I get output
like this:
-- unsupported macro: LIGHTGRAY CLITERAL( 200, 200, 200, 255 )
-- unsupported macro: GRAY CLITERAL( 130, 130, 130, 255 )
-- unsupported macro: DARKGRAY CLITERAL( 80, 80, 80, 255 )
-- unsupported macro: YELLOW CLITERAL( 253, 249, 0, 255 )
Macros are what you usually must handle manually. No API should use
them anyway. You cannot put a macro in a shared library.
A lot of 'functions' exported by APIs (and also documented as such) are really macros, either performing the task themselves or mapping to other functions.
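One way such macros get dealt with, sketched here with a
raylib-flavoured Color type (the wrapper name is hypothetical), is to
wrap each one in a real function so that it has a linkable symbol:
--------------------------------------------------------------
typedef struct { unsigned char r, g, b, a; } Color;  /* raylib-style */

#define CLITERAL(r, g, b, a) ((Color){ r, g, b, a })
#define LIGHTGRAY CLITERAL(200, 200, 200, 255)

/* A real function CAN live in a shared library and be called from
   another language, unlike the macro it wraps: */
Color get_lightgray(void) { return LIGHTGRAY; }
--------------------------------------------------------------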
Another aspect of this translation is that when you run the tool, it
will do a specific 'rendering' of the header, which may take into
account the compiler used, whether it uses -m32 or -m64, compiler
options where they affect the results, and things like -D macros.
On 29/08/2021 13:47, James Harris wrote:
On 24/08/2021 13:55, David Brown wrote:
On 24/08/2021 09:27, James Harris wrote:
On 23/08/2021 10:55, David Brown wrote:
On 23/08/2021 11:04, James Harris wrote:
On 22/08/2021 22:46, David Brown wrote:
The limit of a file's size would naturally be defined by the filesystem
on which it was stored or on which it was being written. Such a value
would be known by, and a property of, the FS driver.
"Proof by repetitive assertion" is not convincing.
With big modern processors, 64-bit sizes here are efficient, and files
are not going to hit that level in the near future. (There are file
systems in use that approach 2 ^ 64 bytes in size, but not individual
files.)
Don't forget that files can have holes. So one does not need to store
(or even have capacity for) 2^64 bytes in order for a file's max offset
to be 2^64 - 1.
That is true - and an argument against claiming that the OS will not
impose limits. Any normal (PC, server, etc.) OS today will have 64-bit
file sizes.
That might not be enough for specialised use in the future.
But more importantly, there's no need to prevent a small system from
processing a big file.
Of course there is - it's called "efficiency". You don't make every
real task on a small system slower in order to support file sizes that
will never be used with the system.
On small systems, you use 32-bit for efficiency, and the same applies to
older big systems. (You have to get very old and very small to find
types smaller than 32-bit used for file sizes.)
I have in mind even smaller systems!
What systems use file sizes that are smaller than "types smaller than
32-bit" ?
The OS /always/ imposes a limit, even if that limit is high.
No OS should do that. There's no need.
Again - efficiency. When all files that will ever be used with a given system will be far smaller than N, what is the point in making
everything work vastly slower to support arbitrary sized integers?
There are good reasons why OS's are written in languages like C, Ada,
Rust or even assembly, rather than Python.
As I said, the max file size is naturally a property of the formatted
filesystem. That size would be set in the FS driver and made known to
the outside world. An OS could use the driver's published size just as
well and in the same way as I was suggesting that an application could
use the driver's published size.
The OS stands between the application and the filesystem. Any file
operation involves the application, the OS, and the filesystem. The
biggest file that can be handled is the minimum of the limits of all
three parts.
I have to disagree. All three parts can use the size determined by the
filesystem.
And how is that supposed to work, exactly?
When the application wants to know the size of a file, it is going to
call an OS function such as "get_file_size_for_name(filename)". The OS
is going to take that filename, combine it with path information, and
figure out what file system it is on. Maybe it finds an inode number
for it. And then it calls the interface function implemented by the
plugin for the filesystem, "get_file_size_for_inode(filesystem_handle, inode)".
I guess in C, "filesystem_handle" will be passed as a void* pointer so
that the plugin can see exactly which filesystem it is using.
What do you suggest for the types for the return value of these functions?
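For concreteness, the call chain being described might be declared like
this (a sketch only; uint64_t stands in for whatever answer one gives to
the question above):
--------------------------------------------------------------
#include <stdint.h>

/* Plugin interface: the handle is opaque to the OS core. */
struct fs_plugin {
    uint64_t (*get_file_size_for_inode)(void *filesystem_handle,
                                        uint64_t inode);
};

/* OS-facing call: resolves the name, then forwards to the plugin. */
uint64_t get_file_size_for_name(const char *filename);
--------------------------------------------------------------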
An OS written 'properly' should, IMO, be able to run indefinitely and
should permit the addition of new filesystems which weren't even devised
when the OS was started. Again, the max file size would need to come
from the FS driver - which would be loadable and unloadable.
You have a strange idea about what is "properly written" here. For the
vast majority of OS's that are in use today, having run-time pluggable
file systems would be an insane idea. OS's are not just limited to *nix
and Windows.
Why would having loadable filesystem drivers be "an insane idea"?
Most OS's don't have filesystems at all. And if they /do/ have one, it
will usually be dedicated. Remember, the vast majority of OS's in use
today are almost certainly unknown to you - they are not PC systems, or
even mobile phone systems, but embedded device systems. Supporting
pluggable filesystems in your smart lightbulb, or car engine controller,
or bluetooth-connected electric toothbrush /is/ insane.
I am not suggesting anything strange; AISI this is basic engineering,
nothing more.
/Appropriate/ levels of abstraction and flexibility is basic
engineering. /Appropriate/ limits and appropriate sizes for data is
basic engineering. Inappropriate generalisations and extrapolations
is not.
Again, I have to disagree. The question is: What defines how large a
file's offset can be?
The answer is just as simple: Each filesystem has its own range of max
sizes.
Your concept of "basic engineering" is severely lacking here.
On 29/08/2021 18:21, James Harris wrote:
On 29/08/2021 15:50, David Brown wrote:
"Proof by repetitive assertion" is not convincing.
There's nothing to prove. It is simply factual (and well known) that
different filesystems have different maximum file sizes. FAT12 has
different limits from FAT32, for example. Ergo, the maximum permitted
file size /is/ a natural property of the formatted filesystem. I guess
that's repetitive again but I cannot imagine what you think would need
to be added to that to establish the point.
In fact, someone could release a new filesystem tomorrow which had
higher limits than those supported by a certain OS today. Under what
you have proposed the OS would need to be altered and recompiled to
match. Therefore the max file size is not naturally a property of the OS.
What do you mean by a "file system"? Is it something that /has/ to be
dealt with via an OS, so the OS's limitations matter, or could it be
accessed via an API independently of OS?
With big modern processors, 64-bit sizes here are efficient, and files
are not going to hit that level in the near future. (There are file
systems in use that approach 2 ^ 64 bytes in size, but not individual
files.)
Don't forget that files can have holes.
(If that was ever the case, then I /have/ forgotten!)
On 29/08/2021 15:50, David Brown wrote:
"Proof by repetitive assertion" is not convincing.
There's nothing to prove. It is simply factual (and well known) that different filesystems have different maximum file sizes. FAT12 has
different limits from FAT32, for example. Ergo, the maximum permitted
file size /is/ a natural property of the formatted filesystem. I guess
that's repetitive again but I cannot imagine what you think would need
to be added to that to establish the point.
In fact, someone could release a new filesystem tomorrow which had
higher limits than those supported by a certain OS today. Under what you
have proposed the OS would need to be altered and recompiled to match. Therefore the max file size is not naturally a property of the OS.
With big modern processors, 64-bit sizes here are efficient, and files
are not going to hit that level in the near future. (There are file
systems in use that approach 2 ^ 64 bytes in size, but not individual
files.)
Don't forget that files can have holes.
That might not be enough for specialised use in the future.
Indeed.
But more importantly, there's no need to prevent a small system from
processing a big file.
Of course there is - it's called "efficiency". You don't make every
real task on a small system slower in order to support file sizes that
will never be used with the system.
Oh? How long do you think it would take to, say, add an offset to a
multiword integer and how long do you think it would take for a device
driver and io system to respond to a request to read or write the sector
at that offset?
On 29/08/2021 15:19, Dmitry A. Kazakov wrote:
On 2021-08-29 15:58, Bart wrote:
And I guess they've already been applied to popular libraries to
provide language-neutral versions of those APIs, complete with all
the named enums, and converted the macros needed to be able to use
the library.
Except I can't find anything like that.
Really?
1. It is even integrated in GCC. See the
-fdump-ada-spec
switch.
2. c2ada:
http://c2ada.sourceforge.net/c2ada.html
No, this is not what James had in mind. Which was writing little C
scripts that applied -E to preprocess code, and looking for specific
struct definitions, I think using grep or something.
On 2021-08-29 11:51, James Harris wrote:
On 29/08/2021 09:38, Dmitry A. Kazakov wrote:
On 2021-08-29 10:16, James Harris wrote:
On 24/08/2021 09:34, Dmitry A. Kazakov wrote:
Your array is allocated in the user space. Do you understand that?
There could be file offsets in both user and kernel space. Why? I
cannot see what you are driving at.
Passing anything from the user-space is extremely expensive.
Yes, the FS's offset would be a scalar.
Then why pass it by reference?
I'd pass the offset by reference because its size would not be known
at compile time.
See, it is neither scalar nor statically bound = dynamic.
It would be /implemented/ as an array but it would be /semantically/
an object. You could say that a 4-byte integer was implemented as an
array of four bytes, if you wanted, and it might need to be
implemented that way on a small CPU but it would still be semantically
an integer.
Again, scalar is not array. Neither implements the other; they are already implementations.
On 29/08/2021 14:58, Bart wrote:
On 29/08/2021 14:28, James Harris wrote:
That won't tell me the sizes or offsets. For that I need a different
program that applies sizeof() and offsetof() to each member, and
sizeof(S3).
However, that still won't tell me the actual /types/ of the fields.
As well as the width, if I need to access the individual fields, I
will need to know whether it's a signed int, unsigned int, or float.
If you were carrying out the steps manually I could understand it.
But that would be a nightmare. Please tell me you are not carrying
out any of the steps manually!!! All steps should be doable by a
program which will run on each intended target in about 0.01 seconds,
shouldn't they?
BTW, I have tried to create my own version of such a tool, which
actually is not as simple as you are trying to make out. Or maybe I'm
just too thick to be able to do it.
It's built as an extension to a C compiler, but it is MY C compiler, so doesn't use any external tools. That means it's limited to the
capabilities of that compiler, which only supports a C subset.
Here's what happens when I apply it to this C header:
https://github.com/sal55/langs/blob/master/raylib.h
On 29/08/2021 11:31, Dmitry A. Kazakov wrote:
Again, scalar is not array. Neither implements the other; they are
already implementations.
For what we have been discussing an array of three integers is no
different from a record of three integers;
nor is either different from
an object which holds three integers. They are all semantically ONE
object which holds THREE integers.
You can call records and objects
arrays rather than scalars if you want to but they would be used and
passed around as scalars.
On 29/08/2021 15:27, Bart wrote:
On 29/08/2021 14:58, Bart wrote:
On 29/08/2021 14:28, James Harris wrote:
That won't tell me the sizes or offsets. For that I need a
different program that applies sizeof() and offsetof() to each
member, and sizeof(S3).
However, that still won't tell me the actual /types/ of the fields.
As well as the width, if I need to access the individual fields, I
will need to know whether it's a signed int, unsigned int, or float.
...
If you were carrying out the steps manually I could understand it.
But that would be a nightmare. Please tell me you are not carrying
out any of the steps manually!!! All steps should be doable by a
program which will run on each intended target in about 0.01
seconds, shouldn't they?
...
BTW, I have tried to create my own version of such a tool, which
actually is not as simple as you are trying to make out. Or maybe I'm
just too thick to be able to do it.
It's built as an extension to a C compiler, but it is MY C compiler,
so doesn't use any external tools. That means it's limited to the
capabilities of that compiler, which only supports a C subset.
Here's what happens when I apply it to this C header:
https://github.com/sal55/langs/blob/master/raylib.h
You're threatening to bifurcate the discussion again! :-(
On the topic we were discussing if you have a C parser why not use it
(along with a C preprocessor) to extract the info you need - such as type definitions and structure offsets - from each target environment's C
header files? That's what I thought you wanted to do before.
On 29/08/2021 20:02, James Harris wrote:
On the topic we were discussing if you have a C parser why not use it
(along with a C preprocessor) to extract the info you need - such as
type definitions and structure offsets - from each target
environment's C header files? That's what I thought you wanted to do
before.
Did you read the rest of my post? A few lines down I linked to a file
which was generated by my compiler. But it cannot do a completely
automatic translation.
As for system headers, it's not practical to use those of other
compilers (as they will be full of implementation-specific features);
I have to construct my own. Including the famous 'struct stat', by delving
deep into that rabbit-hole.
On 29/08/2021 20:17, Bart wrote:
On 29/08/2021 20:02, James Harris wrote:
...
On the topic we were discussing if you have a C parser why not use it
(along with a C preprocessor) to extract the info you need - such as
type definitions and structure offsets - from each target
environment's C header files? That's what I thought you wanted to do
before.
Did you read the rest of my post? A few lines down I linked to a file
which was generated by my compiler. But it cannot do a completely
automatic translation.
Yes, I did. I looked at the two files you linked and even the definition
of the macros which did not convert.
But translating your own sources is a new topic and was NOT what we were talking about.
...
As for system headers, it's not practical to use those of other
compilers (as they will be full of implementation-specific features);
That was the point: other environments do NOT have configuration files
but they often DO have C headers. To determine the configuration for
those environments you need to get the info from the C headers. And
that's best done by
Cpreprocessor < header | bart_parse env.conf
where bart_parse is your program which parses the output from the C preprocessor and updates env.conf with the required info.
On 29/08/2021 15:50, David Brown wrote:
On 29/08/2021 13:47, James Harris wrote:
On 24/08/2021 13:55, David Brown wrote:
On 24/08/2021 09:27, James Harris wrote:
On 23/08/2021 10:55, David Brown wrote:
On 23/08/2021 11:04, James Harris wrote:
On 22/08/2021 22:46, David Brown wrote:
...
The limit of a file's size would naturally be defined by the filesystem
on which it was stored or on which it was being written. Such a value
would be known by, and a property of, the FS driver.
"Proof by repetitive assertion" is not convincing.
There's nothing to prove. It is simply factual (and well known) that different filesystems have different maximum file sizes. FAT12 has
different limits from FAT32, for example. Ergo, the maximum permitted
file size /is/ a natural property of the formatted filesystem. I guess
that's repetitive again but I cannot imagine what you think would need
to be added to that to establish the point.
In fact, someone could release a new filesystem tomorrow which had
higher limits than those supported by a certain OS today. Under what you
have proposed the OS would need to be altered and recompiled to match.
Therefore the max file size is not naturally a property of the OS.
With big modern processors, 64-bit sizes here are efficient, and files
are not going to hit that level in the near future. (There are file
systems in use that approach 2 ^ 64 bytes in size, but not individual
files.)
Don't forget that files can have holes. So one does not need to store
(or even have capacity for) 2^64 bytes in order for a file's max offset
to be 2^64 - 1.
That is true - and an argument against claiming that the OS will not
impose limits. Any normal (PC, server, etc.) OS today will have 64-bit
file sizes.
Some will.
That might not be enough for specialised use in the future.
Indeed.
But more importantly, there's no need to prevent a small system from
processing a big file.
Of course there is - it's called "efficiency". You don't make every
real task on a small system slower in order to support file sizes that
will never be used with the system.
Oh? How long do you think it would take to, say, add an offset to a
multiword integer and how long do you think it would take for a device
driver and io system to respond to a request to read or write the sector
at that offset?
Beware of premature optimisation.
On small systems, you use 32-bit for efficiency, and the same
applies to
older big systems. (You have to get very old and very small to find
types smaller than 32-bit used for file sizes.)
I have in mind even smaller systems!
What systems use file sizes that are smaller than "types smaller than
32-bit" ?
I thought you were the microcontroller man!
The OS /always/ imposes a limit, even if that limit is high.
No OS should do that. There's no need.
Again - efficiency. When all files that will ever be used with a given
system will be far smaller than N, what is the point in making
everything work vastly slower to support arbitrary sized integers?
There are good reasons why OS's are written in languages like C, Ada,
Rust or even assembly, rather than Python.
I don't think you understand the proposal but see below.
As I said, the max file size is naturally a property of the formatted
filesystem. That size would be set in the FS driver and made known to
the outside world. An OS could use the driver's published size just as
well and in the same way as I was suggesting that an application could
use the driver's published size.
The OS stands between the application and the filesystem. Any file
operation involves the application, the OS, and the filesystem. The
biggest file that can be handled is the minimum of the limits of all
three parts.
I have to disagree. All three parts can use the size determined by the
filesystem.
And how is that supposed to work, exactly?
I'll do my best to explain.
When the application wants to know the size of a file, it is going to
call an OS function such as "get_file_size_for_name(filename)". The OS
is going to take that filename, combine it with path information, and
figure out what file system it is on. Maybe it finds an inode number
for it. And then it call's the interface function implemented by the
plugin for the filesystem, "get_file_size_for_inode(filesystem_handle,
inode)".
OK. The filesystem driver (which would do most of the manipulation of offsets) could work with file offsets as integers of a size which was
known when the driver was compiled. E.g. the driver might support 48-bit offsets. If compiled for a 64-bit CPU it could manipulate them as ints.
If compiled for a 16-bit CPU it could manipulate them as three
successive ints.
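A minimal sketch of the 16-bit case (names hypothetical): a 48-bit
offset held as three 16-bit words and advanced with carry propagation:
--------------------------------------------------------------
#include <stdint.h>

typedef struct { uint16_t w[3]; } off48;   /* least significant first */

/* off += delta: three adds with carry propagated between the words */
static void off48_add(off48 *off, uint32_t delta) {
    uint32_t s = (uint32_t)off->w[0] + (delta & 0xFFFFu);
    off->w[0] = (uint16_t)s;
    s = (uint32_t)off->w[1] + (delta >> 16) + (s >> 16);
    off->w[1] = (uint16_t)s;
    off->w[2] = (uint16_t)(off->w[2] + (s >> 16));
}
--------------------------------------------------------------
On a 16-bit machine that is a few add-with-carry instructions,
negligible next to the cost of the device I/O it precedes.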
I guess in C, "filesystem_handle" will be passed as a void* pointer so
that the plugin can see exactly which filesystem it is using.
What do you suggest for the types for the return value of these
functions?
Something Dmitry said makes this, I think, easier to explain. In an OO language you could think of the returns from your functions as objects.
The objects would be correctly sized for the filesystems to which they related. They could have different classes but they would all respond polymorphically to the same methods. The class of each would know the
maximum permitted offset.
An object holding a FAT12 offset would be at least 12 bits. An object
holding a FAT32 offset would be at least 32 bits. An object holding a
ZFS offset would be ... big enough for the ZFS volume to which it related.
Incidentally, going back to your concerns about performance, it could
well take longer to dispatch to a seek method than it would to execute
it!!! What I have suggested is fast, not slow.
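A sketch of that shape in C, with hypothetical names (in an OO language
this would be a class with virtual methods):
--------------------------------------------------------------
#include <stdint.h>

struct offset_ops;   /* forward declaration of the method table */

/* An 'offset object': the methods and the limit travel with it. */
struct offset {
    const struct offset_ops *ops;  /* per-filesystem method table */
    unsigned char bytes[];         /* allocated to the size the FS needs */
};

struct offset_ops {
    uint64_t (*max_offset)(void);
    void     (*add)(struct offset *self, uint64_t delta);
};
--------------------------------------------------------------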
An OS written 'properly' should, IMO, be able to run indefinitely and
should permit the addition of new filesystems which weren't even devised
when the OS was started. Again, the max file size would need to come
from the FS driver - which would be loadable and unloadable.
You have a strange idea about what is "properly written" here. For the
vast majority of OS's that are in use today, having run-time pluggable
file systems would be an insane idea. OS's are not just limited to *nix
and Windows.
Why would having loadable filesystem drivers be "an insane idea"?
Most OS's don't have filesystems at all. And if they /do/ have one, it
will usually be dedicated. Remember, the vast majority of OS's in use
today are almost certainly unknown to you - they are not PC systems, or
even mobile phone systems, but embedded device systems. Supporting
pluggable filesystems in your smart lightbulb, or car engine controller,
or bluetooth-connected electric toothbrush /is/ insane.
That does not answer the question. Toothbrush OSes do not need frame
buffers - but that does not make frame buffers insane.
I am not suggesting anything strange; AISI this is basic engineering,
nothing more.
/Appropriate/ levels of abstraction and flexibility is basic
engineering. /Appropriate/ limits and appropriate sizes for data is
basic engineering. Inappropriate generalisations and extrapolations
is not.
Again, I have to disagree. The question is: What defines how large a
file's offset can be?
The answer is just as simple: Each filesystem has its own range of max
sizes.
Your concept of "basic engineering" is severely lacking here.
Oh? What, specifically, is lacking?
For this purpose (creating C system header files to go with your own C compiler), you need to end up with an actual set of header files.
Which existing compilers do you look at for information? Ones like gcc
have incredibly elaborate headers, full of compiler-specific built-ins
and attributes and predefined macros.
If I take stdarg.h as an example, my version of that has this; it
defines one type and 5 macros:
--------------------------------------------------------------
/* Header stdarg.h */
#ifndef $STDARG
#define $STDARG
typedef char * va_list;
#define va_start(ap,v) ap=((va_list)&v+8)
#define va_arg(ap,t) *(t*)((ap+=8)-8)
#define va_copy(dest,src) (dest=src)
#define va_end(ap) ( ap = (va_list)0 )
#endif
--------------------------------------------------------------
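(The hard-coded 8 presumably reflects the Win64 calling convention,
where every variadic argument occupies an 8-byte slot; the header only
has to be right for the one ABI the compiler targets.)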
The one used by tdm/gcc uses those 3 headers shown below (about 300
lines); I haven't included the 600-line _mingw.h which has yet more
includes.
How do you get from this, to the above? Certainly not by any automatic process!
On 2021-08-29 21:14, James Harris wrote:
On 29/08/2021 11:31, Dmitry A. Kazakov wrote:
Again, scalar is not array. Neither implements the other; they are
already implementations.
For what we have been discussing an array of three integers is no
different from a record of three integers;
Of course they are different. You seem to confuse a type with its machine representation. Many types may have similar representations; that does
not make them the same.
nor is either different from an object which holds three integers.
They are all semantically ONE object which holds THREE integers.
They are not, because you stated that it is a composite type.
You can call records and objects arrays rather than scalars if you
want to but they would be used and passed around as scalars.
I do not know what this means.
You can pass a value either by value or by reference. By-value passing
could be done over a machine register.
Scalar is a type property, opposite to composite type.
C shouldn't come into it at all. A 'C type system' isn't really a type system, but just a thin layer over the underlying hardware.
Actually it works hard to obscure those underlying types without adding anything useful.
But regarding C headers, once upon a time there existed a now completely forgotten language design principle that the program should be all the documentation you needed.
On 26/08/2021 09:33, Dmitry A. Kazakov wrote:
On 2021-08-26 00:07, Bart wrote:
The headers are, in effect, an
automatic configuration system to match the compiler's needs.
However, this stuff should be that simple. Why dozens of different
types just to get basic file info? (There were 4 involved with dev_t,
but 10 different types for the members of struct stat, times 4, is 40!)
Because there are dozens of different entities involved. Welcome back to
reality.
Bart doesn't like that reality. He wants a reality where everything is designed to suit his convenience, and is fine-tuned for the systems he
uses and nothing else.
On 29/08/2021 23:10, Bart wrote:
For this purpose (creating C system header files to go with your own C
compiler), you need to end up with an actual set of header files.
Which existing compilers do you look at for information? Ones like gcc
have incredibly elaborate headers, full of compiler-specific built-ins
and attributes and predefined macros.
If I take stdarg.h as an example, my version of that has this; it
defines one type and 5 macros:
You do know you are cherry-picking perhaps the worst case here, as the
<stdarg.h> is very tightly connected to the compiler? It is more
"language support" than a typical C header that provides some
declarations for external functions, some types, some constants, and
perhaps some macros.
I don't know what you mean by "C system headers", since you invariably
and knowingly mix up C standard library headers and headers for
OS-provided libraries.
#ifndef $STDARG
#define $STDARG
While it is perfectly allowable to use $ like this, and a conforming C program cannot use a $ in other identifiers, the use of $ as a "letter"
in identifiers is a common extension supported by a lot of C compilers.
The standard way of getting a "local" identifier in system headers is
to start them with two underscores, as such names are always reserved
for such purposes.
The one used by tdm/gcc uses those 3 headers shown below (about 300
lines); I haven't included the 600-line _mingw_h which has yet more
includes.
You do realise that in the world of real C compilers, standards and compatibility are important?
On 26/08/2021 11:43, Bart wrote:
...
C shouldn't come into it at all. A 'C type system' isn't really a type
system, but just a thin layer over the underlying hardware.
Actually it works hard to obscure those underlying types without
adding anything useful.
You shouldn't have a go at C's type system until you have tried to adapt yours for different word sizes. Your choice of making everything 64-bit
gives you enormous luxuries that were not available in decades past!
I read somewhere (but can't find the reference or quotation) that the strength of a programming language comes not from the features it has,
but the restrictions it has.
On 25/08/2021 09:29, David Brown wrote:
...
I read somewhere (but can't find the reference or quotation) that the
strength of a programming language comes not from the features it has,
but the restrictions it has.
Haven't heard that one but it's similar to "A language design is
finished not when there's no more to add but when there's no more to
take away."
Thank Antoine:
https://www.brainyquote.com/quotes/antoine_de_saintexupery_103610
Due to the potential for standard libraries his comment arguably applies
more to a programming language than it does to almost anything else.
On 30/08/2021 08:27, David Brown wrote:
On 29/08/2021 23:10, Bart wrote:
For this purpose (creating C system header files to go with your own C
compiler), you need to end up with an actual set of header files.
Which existing compilers do you look at for information? Ones like gcc
have incredibly elaborate headers, full of compiler-specific built-ins
and attributes and predefined macros.
If I take stdarg.h as an example, my version of that has this; it
defines one type and 5 macros:
You do know you are cherry-picking perhaps the worst case here, as the
<stdarg.h> is very tightly connected to the compiler?
OK, let's take a much more portable feature, since it can be provided by
an external library, the definition of printf inside stdio.h. Here's
mine (which naughtily excludes 'extern'):
int printf(const char*, ...);
This is the one that comes with the gcc/mingw/tdm headers (you might
just be able to discern 'printf' somewhere in there!):
__mingw_ovr __attribute__((__format__ (gnu_printf, 1, 2))) __MINGW_ATTRIB_NONNULL(1)
int printf (const char *__format, ...)
{
int __retval;
__builtin_va_list __local_argv;
__builtin_va_start( __local_argv, __format );
__retval = __mingw_vfprintf( stdout, __format, __local_argv );
__builtin_va_end( __local_argv );
return __retval;
}
Here, I decided it was better to look at an alternate source of info.
It is more
"language support" than a typical C header that provides some
declarations for external functions, some types, some constants, and
perhaps some macros.
I don't know what you mean by "C system headers", since you invariably
and knowingly mix up C standard library headers and headers for
OS-provided libraries.
I'm making a distinction between general headers that could be processed
by any compiler (in theory), and those designed for a specific implementation.
I don't know where POSIX ones fall (since on those OSes that support it,
I think they are provided by the OS), or ones that start with <sys/...>,
but a few of those are commonly used by applications so I provide them, sometimes skeleton ones (eg. unistd.h) in order to be able to build a specific app.
#ifndef $STDARG
#define $STDARG
While it is perfectly allowable to use $ like this, and a conforming C
program cannot use a $ in other identifiers, the use of $ as a "letter"
in identifiers is a common extension supported by a lot of C compilers.
This is in code that should only be processed by my compiler. If it
clashes with user-code, that's too bad (but the compiler has worse
problems compiling arbitrary code).
Note that Tiny C doesn't allow $ at all, so such code wouldn't work with
that either.
The standard way of getting a "local" identifier in system headers is
to start them with two underscores, as such names are always reserved
for such purposes.
I don't like underscores because their visibility is already poor, and
whether multiple underscores blend into each other or not depends on the font.
The one used by tdm/gcc uses those 3 headers shown below (about 300
lines); I haven't included the 600-line _mingw_h which has yet more
includes.
You do realise that in the world of real C compilers, standards and
compatibility are important?
Yeah... but perhaps you also realise that system headers are, or should
be, specific to an implementation?
Trying to make a single header work for myriad targets sounds an
admirable idea; in practice it makes for nightmare code.
I'd rather there were half-a-dozen compact, cleanly-written versions of a header (of which you'd only see one anyway), than one that tries to do
all six variants, which is likely to be more than six times the combined
size.
On 30/08/2021 11:44, James Harris wrote:
On 26/08/2021 11:43, Bart wrote:
...
C shouldn't come into it at all. A 'C type system' isn't really a
type system, but just a thin layer over the underlying hardware.
You shouldn't have a go at C's type system until you have tried to
adapt yours for different word sizes. Your choice of making everything
64-bit gives you enormous luxuries that were not available in decades
past!
64-bit is simply using the native word size of the processor.
You can argue that C was created when there were more diverse word-sizes
and fewer byte-addressed machines, but it should have been obvious a few years after its creation, which way things were going. It had plenty of opportunity to get things in order.
On 29/08/2021 21:32, Dmitry A. Kazakov wrote:
Scalar is a type property, opposite to composite type.
I think James is confusing the way parameters are passed, with the
property of the type. In particular, he seems to be using "scalar" to mean "passed by value as a single object", and "composite" to mean "passed by reference, using a pointer to the start of the object". This is, of
course, not correct terminology.
In a simple low-level non-OO language, a scalar type is a fundamental
type of the language that is not built up of other parts. For C, it is defined as an arithmetic type or a pointer type. C++ adds enumeration
types, pointer to member types, and nullptr_t.
Aliases and sub-types or sub-ranges of scalar types will also be scalar.
Aggregate types are any type that is not a scalar type. (These may also
be referred to as "composite types" in some languages, but in C that
term refers to a type that can be used to refer to two compatible types.)
Once you get to higher level languages, it gets more complicated. In
Python, is an arbitrary precision integer a scalar or a composite? You
can't break it down within the language, yet the implementation involves
tree data structures, memory management, etc.
Even within something like C++ (and presumably Ada, but you know that
better than I) you can create a type such as "Int256" that behaves identically to other integer types that are clearly scalar, while it is
in fact defined as an aggregate. It is an aggregate, but is used like a scalar.
So perhaps the scalar/aggregate distinction is not particularly useful.
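A C flavour of the same idea (C++ would add operator overloading on
top): an aggregate that is passed and returned by value, used much as a
scalar would be:
--------------------------------------------------------------
#include <stdint.h>

typedef struct { uint64_t w[4]; } Int256;   /* four 64-bit words */

/* By-value in, by-value out, just like a built-in integer type: */
static Int256 int256_add(Int256 a, Int256 b) {
    Int256 r;
    unsigned carry = 0;
    for (int i = 0; i < 4; i++) {
        uint64_t s = a.w[i] + b.w[i];    /* may wrap... */
        unsigned c = s < a.w[i];         /* ...so detect the carry */
        r.w[i] = s + carry;
        carry = c | (r.w[i] < s);        /* at most one can be set */
    }
    return r;
}
--------------------------------------------------------------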
On 2021-08-29 21:14, James Harris wrote:
On 29/08/2021 11:31, Dmitry A. Kazakov wrote:
Again, scalar is not array. Neither implements the other; they are
already implementations.
For what we have been discussing an array of three integers is no
different from a record of three integers;
Of course they are different. You seem to confuse a type with its machine representation. Many types may have similar representations; that does
not make them the same.
nor is either different from an object which holds three integers.
They are all semantically ONE object which holds THREE integers.
They are not, because you stated that it is a composite type.
You can call records and objects arrays rather than scalars if you
want to but they would be used and passed around as scalars.
I do not know what this means.
You can pass a value either by value or by reference. By-value passing
could be done over a machine register.
Scalar is a type property, opposite to composite type.
On 30/08/2021 12:45, Bart wrote:
You can argue that C was created when there were more diverse
word-sizes and fewer byte-addressed machines, but it should have been
obvious a few years after its creation, which way things were going.
It had plenty of opportunity to get things in order.
My point was that even today C can be used to write programs for
machines with word sizes you would hate whereas you've been able to
avoid the complexity of all those issues by simply avoiding supporting
such machines. So don't be too hard on C's type system. Most of it is
there for a reason.
On 30/08/2021 12:45, Bart wrote:
On 30/08/2021 11:44, James Harris wrote:
On 26/08/2021 11:43, Bart wrote:
...
C shouldn't come into it at all. A 'C type system' isn't really a
type system, but just a thin layer over the underlying hardware.
...
You shouldn't have a go at C's type system until you have tried to
adapt yours for different word sizes. Your choice of making
everything 64-bit gives you enormous luxuries that were not available
in decades past!
64-bit is simply using the native word size of the processor.
I'm sure I've seen you espousing it for other things, too. ;-)
On 30/08/2021 09:03, David Brown wrote:
On 29/08/2021 21:32, Dmitry A. Kazakov wrote:
...
Scalar is a type property, opposite to composite type.
I think James is confusing the way parameters are passed, with the
property of the type. In particular, he seems to be using "scalar" to mean
"passed by value as a single object", and "composite" to mean "passed by
reference, using a pointer to the start of the object". This is, of
course, not correct terminology.
No, not at all. First of all, it may have been Dmitry who brought in the
term "composite". Second, the situation doesn't change according to
whether an object is passed by value or by reference.
In a simple low-level non-OO language, a scalar type is a fundamental
type of the language that is not built up of other parts. For C, it is
defined as an arithmetic type or a pointer type. C++ adds enumeration
types, pointer to member types, and nullptr_t.
Are you saying the C standards have an official definition of 'scalar'
(which is along the lines you mention)?
Aliases and sub-types or sub-ranges of scalar types will also be scalar.
Aggregate types are any type that is not a scalar type. (These may also
be referred to as "composite types" in some languages, but in C that
term refers to a type that can be used to refer to two compatible types.)
Once you get to higher level languages, it gets more complicated. In
Python, is an arbitrary precision integer a scalar or a composite? You
can't break it down within the language, yet the implementation involves
tree data structures, memory management, etc.
Even within something like C++ (and presumably Ada, but you know that
better than I) you can create a type such as "Int256" that behaves
identically to other integer types that are clearly scalar, while it is
in fact defined as an aggregate. It is an aggregate, but is used like a
scalar.
So perhaps the scalar/aggregate distinction is not particularly useful.
I see scalars as objects which can be passed to a routine, returned from
a routine and operated on as whole units. At that level they are
scalars. If they can be subdivided by a routine which chooses to do so
then they won't be scalars /in such code/.
At the end of the day, if we are talking about digital computing all
objects other than a single bit could be treated as aggregates.
So scalar is arguably a semantic concept, reflecting how an object is processed.
On 30/08/2021 21:09, James Harris wrote:
On 30/08/2021 12:45, Bart wrote:
You can argue that C was created when there were more diverse
word-sizes and fewer byte-addressed machines, but it should have been
obvious a few years after its creation, which way things were going.
It had plenty of opportunity to get things in order.
My point was that even today C can be used to write programs for
machines with word sizes you would hate whereas you've been able to
avoid the complexity of all those issues by simply avoiding supporting
such machines. So don't be too hard on C's type system. Most of it is
there for a reason.
OK. It all depends on how much you want to emulate C in that regard, and
so inherit a lot of the same mess in its type system.
I suggested a few years ago that C should be split into two languages,
one that continues to target all odd-ball systems past, present and
future, and one that targets the same desktop-class machines that most
other languages seem to have settled on, including mine.
Namely, ones like D, Java, Julia, Dart, Rust, Odin, C#, Nim, Go ...
But it seems that you are keen to cover everything in one language, from
the tiniest microcontrollers, through desktop PCs and current
supercomputers, and up to massively large machines that need 128 bits to specify file sizes.
I'd say that's being a little ambitious. (Are you still writing OSes too?)
If you want to limit types to one single size of integer (except perhaps
for FFI or large arrays), then 64-bit is a good choice unless you need
to be as efficient as possible on small embedded systems. All modern
PC's or bigger embedded systems (phones, etc.) handle 64-bit natively,
so smaller sizes are not faster for general use. Outside of mathematics
and applications like cryptography, it is rare for an integer to be
bigger than 2 ^ 31. But it is extremely rare to need values bigger than
2 ^ 63. So for practical purposes, you can usually treat a 64-bit
integer as "unlimited".
On 2021-08-31 09:16, David Brown wrote:
If you want to limit types to one single size of integer (except perhaps
for FFI or large arrays), then 64-bit is a good choice unless you need
to be as efficient as possible on small embedded systems. All modern
PC's or bigger embedded systems (phones, etc.) handle 64-bit natively,
so smaller sizes are not faster for general use. Outside of mathematics
and applications like cryptography, it is rare for an integer to be
bigger than 2 ^ 31. But it is extremely rare to need values bigger than
2 ^ 63. So for practical purposes, you can usually treat a 64-bit
integer as "unlimited".
128-bit appears in networking protocols, right, because of
cryptography.
There are other formats for which 128-bit is required,
e.g. IEEE decimal numbers have a mantissa longer than 64 bits.
I think that the language should allow any range. If the target does not support the particular range, the compiler should deploy a library implementation, giving a warning.
On 31/08/2021 11:21, Dmitry A. Kazakov wrote:
On 2021-08-31 09:16, David Brown wrote:
If you want to limit types to one single size of integer (except perhaps
for FFI or large arrays), then 64-bit is a good choice unless you need
to be as efficient as possible on small embedded systems. All modern
PC's or bigger embedded systems (phones, etc.) handle 64-bit natively,
so smaller sizes are not faster for general use. Outside of mathematics
and applications like cryptography, it is rare for an integer to be
bigger than 2 ^ 31. But it is extremely rare to need values bigger than
2 ^ 63. So for practical purposes, you can usually treat a 64-bit
integer as "unlimited".
128-bit appears in networking protocols, right, because of
cryptography.
No, not really.
As I said, there are bigger integers used in cryptography. But these
are often /much/ bigger - at least 256 bits, and perhaps up to 4096
bits. You don't use a normal plain integer type there.
There are other formats for which 128-bit is required,
e.g. IEEE decimal numbers have a mantissa longer than 64 bits.
Again, these are not integers.
I think that the language should allow any range. If the target does not
support the particular range, the compiler should deploy a library
implementation, giving a warning.
Don't misunderstand me here - there are use-cases for all kinds of sizes
and types. A language that intends to be efficient is going to have to
be able to handle arrays of smaller types. And there have to be ways to
handle bigger types for their occasional usage. The point is that there
is little need for multiple sizes for "normal" everyday usage - a signed 64-bit type really will work for practically everything.
On 30/08/2021 12:45, Bart wrote:
On 30/08/2021 11:44, James Harris wrote:
On 26/08/2021 11:43, Bart wrote:
...
C shouldn't come into it at all. A 'C type system' isn't really a
type system, but just a thin layer over the underlying hardware.
...
You shouldn't have a go at C's type system until you have tried to
adapt yours for different word sizes. Your choice of making
everything 64-bit gives you enormous luxuries that were not available
in decades past!
64-bit is simply using the native word size of the processor.
I'm sure I've seen you espousing it for other things, too. ;-)
...
You can argue that C was created when there were more diverse
word-sizes and fewer byte-addressed machines, but it should have been
obvious a few years after its creation, which way things were going.
It had plenty of opportunity to get things in order.
My point was that even today C can be used to write programs for
machines with word sizes you would hate whereas you've been able to
avoid the complexity of all those issues by simply avoiding supporting
such machines. So don't be too hard on C's type system. Most of it is
there for a reason.
On 30/08/2021 21:09, James Harris wrote:
On 30/08/2021 12:45, Bart wrote:
On 30/08/2021 11:44, James Harris wrote:
On 26/08/2021 11:43, Bart wrote:
...
C shouldn't come into it at all. A 'C type system' isn't really a
type system, but just a thin layer over the underlying hardware.
...
You shouldn't have a go at C's type system until you have tried to
adapt yours for different word sizes. Your choice of making
everything 64-bit gives you enormous luxuries that were not available
in decades past!
64-bit is simply using the native word size of the processor.
I'm sure I've seen you espousing it for other things, too. ;-)
...
You can argue that C was created when there were more diverse
word-sizes and fewer byte-addressed machines, but it should have been
obvious a few years after its creation, which way things were going.
It had plenty of opportunity to get things in order.
My point was that even today C can be used to write programs for
machines with word sizes you would hate whereas you've been able to
avoid the complexity of all those issues by simply avoiding supporting
such machines. So don't be too hard on C's type system. Most of it is
there for a reason.
Have a look at this benchmark:
https://github.com/sal55/langs/blob/master/bench.c
(Notice the comment that the code was modified for 64 bits.)
If I try and run this on 64-bit Windows, it crashes. gcc will kindly
point out the reason: it uses both 'int' and 'long', and here assumes
that long is wide enough (presumably wider than int) to store a pointer value.
I can fix it by changing 'long' to 'long long'. (Although for the
purpose here, probably intptr_t would be more apt.)
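For the record, the C99 spelling of a pointer round-trip that actually
is guaranteed looks like this (a sketch, not the benchmark's code):
--------------------------------------------------------------
#include <stdint.h>
#include <stdio.h>

int main(void) {
    int x = 42;
    void *ptr = &x;
    intptr_t p = (intptr_t)ptr;   /* wide enough for a pointer, by design */
    void *back = (void *)p;       /* round-trips to the original pointer */
    printf("%d\n", *(int *)back);
    return 0;
}
--------------------------------------------------------------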
You might argue that it is Windows' or MS' fault for not using a more sensible width of 'long', but the fact is that C ALLOWS all these
disparate versions of types such as 'long'.
It might be the same width as 'int'; it might be double the width; it
might be anything so long as it's not smaller than 'int'.
So there is a clear advantage these days in an 'int' type that grows
with the word size and is the same size as an address. Although this benchmark might have then gone wrong in assuming that 'int' was 32 bits.
Yes, it was probably poorly written too. But the language doesn't help.
On 31/08/2021 13:24, Bart wrote:
Have a look at this benchmark:
https://github.com/sal55/langs/blob/master/bench.c
(Notice the comment that the code was modified for 64 bits.)
If I try and run this on 64-bit Windows, it crashes. gcc will kindly
point out the reason: it uses both 'int' and 'long', and here assumes
that long is wide enough (presumably wider than int) to store a pointer
value.
If you cast a pointer to a "long", you deserve all the problems you get.
Why on earth would you think that is a good idea?
I can fix it by changing 'long' to 'long long'. (Although for the
purpose here, probably intptr_t would be more apt.)
Fix it by using pointers as pointers, and don't faff around with casting
them into random integer types! If you desperately need to hold
different types of pointers in the same variable (though it is often an indication of poor program structure), use "void *".
But you've got code there that casts pointers into longs, and then
compares them to an int...
You might argue that it is Windows' or MS' fault for not using a more
sensible width of 'long', but the fact is that C ALLOWS all these
disparate versions of types such as 'long'.
No, the fault here is the programmer's alone.
There /are/ cases where it makes sense to turn pointers into integers, because you want to do weird things such as compact storage in large
arrays, or low-level stuff like checking alignments. The type
"uintptr_t" has existed for that purpose for 20 years. The C
preprocessor to check type sizes and make your own local equivalent to "uintptr_t" has existed for 50 years. There has /never/ been an excuse
to cast a pointer to an int, or a long, or other guessed type and think
it is a good idea.
You are not alone in writing bad code like this, but that does not
excuse you.
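For illustration, a minimal sketch of the legitimate use described above - checking a pointer's alignment via uintptr_t (the helper name is invented):

#include <stdint.h>
#include <stdbool.h>

/* Assumes 'alignment' is a power of two. */
static bool is_aligned(const void *p, uintptr_t alignment) {
    return ((uintptr_t)p & (alignment - 1)) == 0;
}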
It might be the same width as 'int'; it might be double the width; it
might be anything so long as it's not smaller than 'int'.
So there is a clear advantage these days in an 'int' type that grows
with the word size and is the same size as an address. Although this
benchmark might have then gone wrong in assuming that 'int' was 32 bits.
No, there is no sense in having the size of "int" related to the size of
a pointer, because the two have totally different purposes and should
not be mixed.
"long" means "at least 32-bit". Nothing more, nothing less - and if you
use it to mean more, you are using it incorrectly.
Yes, it was probably poorly written too. But the language doesn't help.
The language lets you write bad code. What language does not?
On 31/08/2021 14:31, David Brown wrote:
On 31/08/2021 13:24, Bart wrote:
Have a look at this benchmark:
https://github.com/sal55/langs/blob/master/bench.c
(Notice the comment that the code was modified for 64 bits.)
If I try and run this on 64-bit Windows, it crashes. gcc will kindly
point out the reason: it uses both 'int' and 'long', and here assumes
that long is wide enough (presumably wider than int) to store a pointer
value.
If you cast a pointer to a "long", you deserve all the problems you get.
Why on earth would you think that is a good idea?
This is not my code. I made the minimum changes needed so that the
benchmark gave meaningful results, since I don't understand how it works.
I can fix it by changing 'long' to 'long long'. (Although for the
purpose here, probably intptr_t would be more apt.)
Fix it by using pointers as pointers, and don't faff around with casting
them into random integer types! If you desperately need to hold
different types of pointers in the same variable (though it is often an
indication of poor program structure), use "void *".
But you've got code there that casts pointers into longs, and then
compares them to an int...
I suspect that made more sense in the original BCPL (see below), which I think only had a single type, a machine word.
That was a very simple model appropriate to the word-addressed machines
at the time, but could now be useful once again with 64-bit machines.
Look also at Knuth's MMIX assembler for a machine where everything is 64
bits too; registers can hold 64-bit ints or 64-bit floats or 64-bit addresses.
You might argue that it is Windows' or MS' fault for not using a more
sensible width of 'long', but the fact is that C ALLOWS all these
disparate versions of types such as 'long'.
No, the fault here is the programmer's alone.
This program used both 'int' and 'long' types; why? What assumptions
were there about certain properties of long that weren't present for int?
You can't pin this one on MS, because Linux32 has them the same size
too. (Although if the intention was for long to hold an address, that
would work here.)
There /are/ cases where it makes sense to turn pointers into integers,
because you want to do weird things such as compact storage in large
arrays, or low-level stuff like checking alignments. The type
"uintptr_t" has existed for that purpose for 20 years. The C
preprocessor to check type sizes and make your own local equivalent to
"uintptr_t" has existed for 50 years. There has /never/ been an excuse
to cast a pointer to an int, or a long, or other guessed type and think
it is a good idea.
You are not alone in writing bad code like this, but that does not
excuse you.
(If you think the C is poor, look at the original BCPL code, somewhere in
the links here:
https://www.cl.cam.ac.uk/~mr10/Bench.html)
It might be the same width as 'int'; it might be double the width; it
might be anything so long as it's not smaller than 'int'.
So there is a clear advantage these days in an 'int' type that grows
with the word size and is the same size as an address. Although this
benchmark might have then gone wrong in assuming that 'int' was 32 bits.
No, there is no sense in having the size of "int" related to the size of
a pointer, because the two have totally different purposes and should
not be mixed.
It makes life simpler in many, many situations where both have to exist
in the same location.
This hasn't always been possible (eg. 8086 with
16-bit ints and 32-bit pointers and floats); but it is now.
For example, I use bytecode which is a 64-bit array of values, where any element can variously be:
* A bytecode index (8 bits) ...
* ... usually fixed up as a pointer to a function handler (64 bits)
* An immediate 64-bit int operand
* An immediate 64-bit float operand
* A label address (64 bits)
* A pointer to a symbol table entry
* The address of a static variable
* The offset of a local variable
* A pointer to a string object
You get the idea.
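(For illustration only - this is not Bart's actual code - one plausible C rendering of such a cell is a union of 64-bit members:

#include <stdint.h>

typedef union {
    uint64_t  opcode;          /* bytecode index, later fixed up...  */
    void    (*handler)(void);  /* ...to a function handler pointer   */
    int64_t   ivalue;          /* immediate 64-bit int operand       */
    double    fvalue;          /* immediate 64-bit float operand     */
    void     *ptr;             /* symbol table entry, label, string  */
    int64_t   offset;          /* offset of a local variable         */
} Cell;

Every element of a Cell array is exactly 8 bytes, so an interpreter can step through the bytecode with a single pointer.)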
"long" means "at least 32-bit". Nothing more, nothing less - and if you
use it to mean more, you are using it incorrectly.
Yes, it was probably poorly written too. But the language doesn't help.
The language lets you write bad code. What language does not?
C makes it easier in this case by not being strict with its type system.
This example was from the 1990s, but you can still write such code now.
On 30/08/2021 23:25, Bart wrote:
Note that even C compilers for smaller devices such as Z80 can be
rather specialist products.
Any idea why C for Z80 should be a specialist product?
On 30/08/2021 21:09, James Harris wrote:
On 30/08/2021 12:45, Bart wrote:
So don't be too hard on C's type system. Most of it is
there for a reason.
OK. It all depends on how much you want to emulate C in that regard, and
so inherit a lot of the same mess in its type system.
But it seems that you are keen to cover everything in one language, from
the tiniest microcontrollers, through desktop PCs and current
supercomputers, and up to massively large machines that need 128 bits to specify file sizes.
I'd say that's being a little ambitious.
(Are you still writing OSes too?)
you've been able to
avoid the complexity of all those issues by simply avoiding supporting such machines.
It wasn't my job to support every possible machine.
Note that even C compilers for smaller devices such as Z80 can be rather specialist products.
On 31/08/2021 16:38, Bart wrote:
It makes life simpler in many, many situations where both have to exist
in the same location.
What situations? When do you need to interpret a pointer as an integer,
or an integer as a pointer? (You can share memory space between them
with a union - that's a different thing.)
For example, I use bytecode which is a 64-bit array of values, where any
element can variously be:
* A bytecode index (8 bits) ...
* ... usually fixed up as a pointer to a function handler (64 bits)
* An immediate 64-bit int operand
* An immediate 64-bit float operand
* A label address (64 bits)
* A pointer to a symbol table entry
* The address of a static variable
* The offset of a local variable
* A pointer to a string object
You get the idea.
Yes - it's a union.
There is no requirement or even benefit in having pointers and integers
the same size.
C makes it easier in this case by not being strict with its type system.
This example was from the 1990s, but you can still write such code now.
C is a lot stricter with types than many people seem to think.
On 31/08/2021 16:32, David Brown wrote:
On 31/08/2021 16:38, Bart wrote:
It makes life simpler in many, many situations where both have to exist
in the same location.
What situations? When do you need to interpret a pointer as an integer,
or an integer as a pointer? (You can share memory space between them
with a union - that's a different thing.)
A union is a clunkier way of doing the same thing. But you can't as easily access the int/float/pointer representation.
For example, I use bytecode which is a 64-bit array of values, where any
element can variously be:
* A bytecode index (8 bits) ...
* ... usually fixed up as a pointer to a function handler (64 bits)
* An immediate 64-bit int operand
* An immediate 64-bit float operand
* A label address (64 bits)
* A pointer to a symbol table entry
* The address of a static variable
* The offset of a local variable
* A pointer to a string object
You get the idea.
Yes - it's a union.
There is no requirement or even benefit in having pointers and integers
the same size.
There is a benefit in this case. On a system with 32-bit pointers, if I
still wanted 64-bit immediates, it would be too inefficient to have
64-bit bytecode as vast majority of entries wouldn't need it.
I'd have to use 32-bit bytecode with immediates spanning two operands,
or indexed via indices into a table.
The point is that when you want a /number/ - an integer - then 64-bit is
more than sufficient for most purposes while also not being too big to
handle efficiently on most modern cpus. You could almost say that about 32-bit, but you can definitely say it now that we have 64-bit. This is why no one
has bothered making a 128-bit processor - there simply isn't any use for
such large numbers as plain integers.
I think that the language should allow any range. If the target does not
support the particular range, the compiler should deploy a library
implementation, giving a warning.
Don't misunderstand me here - there are use-cases for all kinds of sizes
and types. A language that intends to be efficient is going to have to
be able to handle arrays of smaller types. And there have to be ways to handle bigger types for their occasional usage. The point is that there
is little need for multiple sizes for "normal" everyday usage - a signed 64-bit type really will work for practically everything.
There is also a quotation from Bjarne Stroustrup along the lines of
every complicated language having a simple language at its core
struggling to get out.
On 30/08/2021 22:50, James Harris wrote:
Are you saying the C standards have an official definition of 'scalar'
(which is along the lines you mention)?
Yes. 6.2.5p21 in C17. There are official drafts of the C standards available freely (the actual published ISO standards cost money, but the final drafts are free and just as good). Google for C17 draft, document N2176, for example.
On 31/08/2021 11:57, David Brown wrote:
...
The point is that when you want a /number/ - an integer - then 64-bit is
more than sufficient for most purposes while also not being too big to
handle efficiently on most modern cpus. You could almost say that about
32-bit, but you can definitely say it now that we have 64-bit. This is why no one
has bothered making a 128-bit processor - there simply isn't any use for
such large numbers as plain integers.
In terms of the main registers 64 bits is fine, especially when they are
used for memory addresses as well as integers. However, that's not the
only option. ICL made a machine with a 128-bit accumulator as far back
as the 1970s!
Today, ZFS works with 128-bit numbers. So does MD5, and probably others.
SHA-2 apparently works with sizes from 224 bits upwards.
https://en.wikipedia.org/wiki/SHA-2
I think that the language should allow any range. If the target does not
support the particular range, the compiler should deploy a library
implementation, giving a warning.
Don't misunderstand me here - there are use-cases for all kinds of sizes
and types. A language that intends to be efficient is going to have to
be able to handle arrays of smaller types. And there have to be ways to
handle bigger types for their occasional usage. The point is that there
is little need for multiple sizes for "normal" everyday usage - a signed
64-bit type really will work for practically everything.
These seem OK:
int 64
int 128
uint 224
uint 256
etc
If a language includes code to manipulate 64-bit integers on 16-bit
hardware I cannot see a problem with it manipulating 256-bit integers on 64-bit hardware.
On 31/08/2021 08:32, David Brown wrote:
On 30/08/2021 22:50, James Harris wrote:
...
Are you saying the C standards have an official definition of 'scalar'
(which is along the lines you mention)?
Yes. 6.2.5p21 in C17. There are official drafts of the C standards
available freely (the actual published ISO standards cost money, but the
final drafts are free and just as good). Google for C17 draft, document
N2176, for example.
That's really interesting. I didn't know that C was still under
development.
Have you seen C emulation code for even 128/128 division, based on 64-bit ops?
It's scary! It might be a bit shorter in ASM, but it can still be a lot
of work.
On 2021-08-31 20:37, Bart wrote:
Have you seen C emulation code for even 128/128 division, based on 64-bit ops?
It's scary! It might be a bit shorter in ASM, but it can still be a
lot of work.
It is easier with 32-bit units because of the carry handling (since you have
no access to the carry flag in a high-level language) and because
multiplication is simpler. But in general there is nothing scary; all the
algorithms are well known.
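For illustration, a minimal C sketch of both points (invented names): a 128-bit value as two 64-bit halves, with the carry recovered by comparison, and a 32x32 multiply whose full product fits in uint64_t:

#include <stdint.h>

typedef struct { uint64_t lo, hi; } u128;

/* C gives no access to the carry flag, so detect the carry out
   of the low word by comparing the sum with an operand. */
static u128 add128(u128 a, u128 b) {
    u128 r;
    r.lo = a.lo + b.lo;
    r.hi = a.hi + b.hi + (r.lo < a.lo);
    return r;
}

/* With 32-bit limbs the partial products need no tricks:
   a 32x32 product always fits in a plain uint64_t. */
static uint64_t mul32x32(uint32_t a, uint32_t b) {
    return (uint64_t)a * b;
}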
On 31/08/2021 19:30, James Harris wrote:
On 31/08/2021 08:32, David Brown wrote:
On 30/08/2021 22:50, James Harris wrote:
...
Are you saying the C standards have an official definition of 'scalar'
(which is along the lines you mention)?
Yes. 6.2.5p21 in C17. There are official drafts of the C standards
available freely (the actual published ISO standards cost money, but the
final drafts are free and just as good). Google for C17 draft, document
N2176, for example.
That's really interesting. I didn't know that C was still under
development.
What they're planning to add actually probably isn't very interesting
(it never is).
For example being able to do this in a function definition (in C2x):
void fn(int a, int, int c) {
}
Can you spot what it is?
Yes, you can leave out a parameter name without it being an error. (So
that if you do it inadvertently, you might silently end up using some
global of that name instead!)
On 31/08/2021 20:41, Bart wrote:
On 31/08/2021 19:30, James Harris wrote:
On 31/08/2021 08:32, David Brown wrote:
On 30/08/2021 22:50, James Harris wrote:
...
Are you saying the C standards have an official definition of 'scalar'
(which is along the lines you mention)?
Yes. 6.2.5p21 in C17. There are official drafts of the C standards
available freely (the actual published ISO standards cost money, but the
final drafts are free and just as good). Google for C17 draft, document
N2176, for example.
That's really interesting. I didn't know that C was still under
development.
What they're planning to add actually probably isn't very interesting
(it never is).
That's intentional. C has been mostly stable since C99 - and stability
and backwards compatibility are two of its most important features as a language. Changes are only made if they are really important, tried and tested (in mainstream compilers and/or other languages, primarily C++),
give something that can't easily be achieved in other ways, and are not
going to risk conflicts with existing code.
For example being able to do this in a function definition (in C2x):
void fn(int a, int, int c) {
}
Can you spot what it is?
Yes, you can leave out a parameter name without it being an error. (So
that if you do it inadvertently, you might silently end up using some
global of that name instead!)
Trust you to find potential problems in every feature of C.
On 31/08/2021 19:03, James Harris wrote:
If a language includes code to manipulate 64-bit integers on 16-bit
hardware I cannot see a problem with it manipulating 256-bit integers
on 64-bit hardware.
Have you seen C emulation code for even 128/128 division, based on 64-bit ops?
It's scary! It might be a bit shorter in ASM, but it can still be a lot
of work.
Below I've listed all the ops in my IL that need to deal with 64-bit
integers (floats excluded), and which, if fully supported, need to also
deal with i128/u128 types.
At the moment, in my regular compiler, 128-bit support has many holes. Notably for divide; it only has i128/i64, which is needed to be able to
print 128-bit values.
BTW your u224 type is likely to need 64-bit alignment, so will probably occupy 256 bits anyway. So I can't see the point of that if you've also already got a 256-bit type.
Remember support will be hard enough without dealing with odd sizes.
On 31/08/2021 19:37, Bart wrote:
On 31/08/2021 19:03, James Harris wrote:
...
If a language includes code to manipulate 64-bit integers on 16-bit
hardware I cannot see a problem with it manipulating 256-bit integers
on 64-bit hardware.
Have you seen C emulation code for even 128/128 division, based on 64-bit ops?
It's scary! It might be a bit shorter in ASM, but it can still be a
lot of work.
Below I've listed all the ops in my IL that need to deal with 64-bit
integers (floats excluded), and which, if fully supported, need to
also deal with i128/u128 types.
I've got some C for wide unsigned operations. I think it's my
own code from a number of years ago (2015). Division is a bit hairy -
but none of the code is scary!
It all looks quite logical. It covers
shifts, gets, sets, add, sub, mul, div, rem
The bitwise operations and comparisons you mention would be trivial
because the data structure is just an array
ui16 value[4];
though I wrap it in a struct so the compiler will pass it by value. I
think the code was written to give me 64-bit numbers in a 16-bit
environment. That's why the array length is 4.
And note that it's 4, not 2. Theoretically, the code would work for ANY
size of integer as long as it fits into a whole number of words. Add
some overflow handling and signed integers (which would not be so easy)
and I may have the arbitrary-length arithmetic you keep telling me is so difficult to do!
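For illustration, a minimal C sketch of that scheme (invented names; little-endian limb order assumed):

#include <stdint.h>

#define WORDS 4   /* 64-bit numbers from 16-bit limbs, as above */

typedef struct { uint16_t value[WORDS]; } Wide;  /* passed by value */

static Wide wide_add(Wide a, Wide b) {
    Wide r;
    uint32_t carry = 0;
    for (int i = 0; i < WORDS; i++) {
        uint32_t s = (uint32_t)a.value[i] + b.value[i] + carry;
        r.value[i] = (uint16_t)s;   /* keep the low 16 bits */
        carry = s >> 16;            /* propagate the rest */
    }
    return r;   /* discarding the final carry gives wraparound */
}

The same loop works unchanged for any WORDS, which is the point about handling any size that fits a whole number of words.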
At the moment, in my regular compiler, 128-bit support has many holes.
Notably for divide; it only has i128/i64, which is needed to be able
to print 128-bit values.
BTW your u224 type is likely to need 64-bit alignment, so will
probably occupy 256 bits anyway. So I can't see the point of that if
you've also already got a 256-bit type.
That's OK. I mentioned u224 because SHA-2 uses such a size. IOW there's
a real-world requirement for such a value. Another reason to provide
integers of arbitrary size - especially if it costs almost nothing.
Remember support will be hard enough without dealing with odd sizes.
Why would support be hard?
In terms of storage I expect I'd round up the size at least to a whole
number of words and treat incursion into upper bits in the same way as overflow.
I feel I'm just scratching the surface here.
On 2021-09-01 21:22, Bart wrote:
[...]
I feel I'm just scratching the surface here.
Yep, when you ignore mathematics, it bites you in the butt...
If you paid respect to it, you would notice that modular numbers never
add) K until it is in.
For integers there is no problem either. As with modular numbers you do
all computations using a wider machine range. Only when you have to
assign the result to a variable or pass it to a subprogram you check if
it is in the range. That is all.
On 01/09/2021 20:54, Dmitry A. Kazakov wrote:
On 2021-09-01 21:22, Bart wrote:
[...]
I feel I'm just scratching the surface here.
Yep, when you ignore mathematics, it bites you in the butt...
If you paid respect to it, you would notice that modular numbers never
overflow. If the result is outside the range 0..K-1, you subtract (or
add) K until it is in.
For integers there is no problem either. As with modular numbers you
do all computations using a wider machine range. Only when you have to
assign the result to a variable or pass it to a subprogram you check
if it is in the range. That is all.
It's not that simple. Say you have variables of 8 bits each, but your language says that all calculations will be done at 64 bits.
So what happens when you calculate A << B >> C; do you do everything at
64 bits and then only truncate to 8 bits when storing in some 8-bit destination?
Or do you have to do that truncation after every binary operation?
In
which case, you have to bring it back up to a valid 64-bit value for the
next operation (eg. sign extend a possibly 13-bit result).
What about multiplying a 13-bit value by a 17-bit one; what should the intermediate result be truncated to?
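For illustration, the difference is easy to show in C (a minimal sketch, not code from either language):

#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint8_t a = 0xFF;

    /* Truncate to 8 bits after every operation: */
    uint8_t narrow = (uint8_t)((uint8_t)(a << 1) >> 1);  /* 0x7F */

    /* Work at full width, truncate only on storing: */
    uint8_t wide = (uint8_t)((a << 1) >> 1);             /* 0xFF */

    printf("%02X %02X\n", (unsigned)narrow, (unsigned)wide);  /* 7F FF */
    return 0;
}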
On 2021-09-01 22:50, Bart wrote:
On 01/09/2021 20:54, Dmitry A. Kazakov wrote:
On 2021-09-01 21:22, Bart wrote:
[...]
I feel I'm just scratching the surface here.
Yep, when you ignore mathematics, it bites you in the butt...
If you paid respect to it, you would notice that modular numbers never
overflow. If the result is outside the range 0..K-1, you subtract (or
add) K until it is in.
For integers there is no problem either. As with modular numbers you
do all computations using a wider machine range. Only when you have
to assign the result to a variable or pass it to a subprogram you
check if it is in the range. That is all.
It's not that simple. Say you have variables of 8 bits each, but your
language says that all calculations will be done at 64 bits.
No, it says that the result of an expression is OK if it is mathematically correct.
So what happens when you calculate A << B >> C; do you do everything
at 64 bits and then only truncate to 8 bits when storing in some 8-bit
destination?
Shifts, if you meant shifts, are not arithmetic operations defined on either
modular numbers or integers. E.g. how do you shift a mod 7 number? You can
only define a numeric equivalent of bit shift for 2**N modular numbers.
Or do you have to do that truncation after every binary operation?
Yes, you implement bit shift as defined; where is the problem?
In which case, you have to bring it back up to a valid 64-bit value
for the next operation (eg. sign extend a possibly 13-bit result).
What for?
What about multiplying a 13-bit value by a 17-bit one; what should the
intermediate result be truncated to?
You cannot, it is a type error.
Or take as an example any operation resulting in higher order bits that should be discarded, affecting the results of later ones.
In which case, you have to bring it back up to a valid 64-bit value
for the next operation (eg. sign extend a possibly 13-bit result).
What for?
You're really not interested in how real machines work, are you?
If the real machine only has 64 bits operations, and you are working
with 17-bit values, then you need something meaningful in those top 47
bits.
If you are working with ranges that aren't an exact number of bits, eg.
the range 0 to 9999, approx 14-15 bits, then you still need a full 64-bit value to be presented to the processor.
This is all a lot of mucking about that is needed compared with using
ranges of values that a processor can directly deal with, such as 0 to
65535. Or compared with using an i64 range only.
What about multiplying a 13-bit value by a 17-bit one; what should
the intermediate result be truncated to?
You cannot, it is a type error.
Why?
It's not that simple. Say you have variables of 8 bits each, but your language says that all calculations will be done at 64 bits.
So what happens when you calculate A << B >> C; do you do everything at
64 bits and then only truncate to 8 bits when storing in some 8-bit destination?
Or do you have to do that truncation after every binary operation? In
which case, you have to bring it back up to a valid 64-bit value for the
next operation (eg. sign extend a possibly 13-bit result).
Those decisions on how intermediate values are handled can give
different results. And can make it slower.
What about multiplying a 13-bit value by a 17-bit one; what should the intermediate result be truncated to?
On 01/09/2021 20:22, Bart wrote:
Maybe that's 'dynamic' precision. I see types like
int 17
int 81
as having a width which is arbitrary but fixed.
It's not necessarily hard, just less efficient,
Well, compare the HLL programmer having to create routines to handle
large or unusually sized integers with having the compiler do it.
Surely the bottom line is that the compiler is best placed to produce
the most efficient code, especially as it will have knowledge of both
what's required and the target architecture.
Again, with the sign already extended that should not be a problem.
* If you want wraparound overflow, this then needs to be specially
programmed.
I'm not sure what that means. If the number's limit is at a power of 2
then what I think you mean by wraparound should happen automatically.
At least to start with I think I'd try to limit the number of categories
but in a different way to you, e.g.
* Integers of a size which the machine is built to deal with
* Integers which are a whole number (2 or more) of such units
* Integers which have odd bits
The ones the machine can deal with - e.g. 8, 16, 32, 64 - would be
handled by normal routines.
The ones which needed 2 or more such units could be handled by long arithmetic.
The others would additionally need tweaking for the extra bits.
On 2021-09-02 13:14, Bart wrote:
Or take as an example any operation resulting in higher order bits
that should be discarded, affecting the results of later ones.
Again, the result of an expression is either mathematically correct or you
get an exception. What is unclear here? Take the formula. Compute it on
paper using the rules of mathematics. That is your answer.
In which case, you have to bring it back up to a valid 64-bit value
for the next operation (eg. sign extend a possibly 13-bit result).
What for?
You're really not interested in how real machines work, are you?
So, you are interested in how Gigabyte PSUs burn? They burn brightly with a
lot of noise. Quite entertaining. Next?
If the real machine only has 64 bits operations, and you are working
with 17-bit values, then you need something meaningful in those top 47
bits.
I do not know what a 17-bit value is, but it is no problem
to implement integer and modular arithmetic on any existing machine.
If you are working with ranges that aren't an exact number of bits,
eg. the range 0 to 9999, approx 14-15 bits, then you still need a full
64-bit value to be presented to the processor.
Whatever machine number is capable of representing the whole range.
Why?
Because they are two different types, obviously.
All problems you describe are imaginary. Once the semantics is defined
by attributing correct types, no ambiguity exists.
On 01/09/2021 18:13, James Harris wrote:
On 31/08/2021 19:37, Bart wrote:
On 31/08/2021 19:03, James Harris wrote:
ui16 value[4];
And note that it's 4, not 2. Theoretically, the code would work for
ANY size of integer as long as it fits into a whole number of words.
Add some overflow handling and signed integers (which would not be so
easy) and I may have the arbitrary-length arithmetic you keep telling
me is so difficult to do!
Arbitrary precision is different from fixed precision with dedicated
routines or inline code for each width of number.
It's not necessarily hard, just less efficient,
Remember support will be hard enough without dealing with odd sizes.
Why would support be hard?
In terms of storage I expect I'd round up the size at least to a whole
number of words and treat incursion into upper bits in the same way as
overflow.
So how would it work even with a simple example of 53 bits? You round up
the storage to 64 bits, so it might as well be i64. Arithmetic would
likely also be done using the machine's 64-bit operations.
But you now also need:
* To truncate the results to 53 bits, or check that the top 11 bits don't
indicate overflow (as the processor flags won't tell you)
* When loading signed 53 bits, presumably as two's complement, your sign
will be in bit 52, and bits 53 to 63 need to be properly sign-extended
if that pattern is not already stored in memory.
* You need to take extra care with right shifts: you want copies of bit 52
shifted to bit 51 etc, but the machine's SAR instruction will assume the
sign bit is bit 63.
* If you want wraparound overflow, this then needs to be specially
programmed.
I feel I'm just scratching the surface here. I have enough trouble just
with the following 3 categories:
* Short (8/16/32 bits)
* Normal (64 bits)
* Wide (128 bits)
With signed/unsigned for each type as another factor.
Most arithmetic will be done on Normal/Wide (64 or 128 bits); they are
only done on Short types for in-place arithmetic: A[i]+:=x where A is a byte array for example.
Imagine if I also had to deal with 9-15, 17-31, 33-62, and 65-127 bits!
On 02/09/2021 13:21, James Harris wrote:
On 01/09/2021 20:22, Bart wrote:
Surely the bottom line is that the compiler is best placed to produce
the most efficient code, especially as it will have knowledge of both
what's required and the target architecture.
If I'm looking at struct layout and one of the elements has type ui65,
then I would find that worrying. What's going on in that layout? If the programmer is specifying a type so exactly, then perhaps they are also concerned about the precise layout!
Same thing with an array of ui63. Would a million-element array occupy 8,000,000 bytes or 7,875,000?
I think if such types were specified with a value range, say 0 ..
2**65-1 or 0..2**63-1 [I should have used smaller examples!], or ones
not using powers of 2, that would suggest a programmer who doesn't care
about bit-representation or exact layouts so much, compared with someone
who writes ui65 or ui63.
They might be put out if they end up with ui128/ui72, or ui64.
I suppose what I'm saying is that when you specify an exact bit-width,
you should get that. Which means you then have to implement that array
with 63-bit elements.
Again, with the sign already extended that should not be a problem.
You need to be sure that the previous operation has left the sign (in the msb
of the machine word) and the intermediate bits matching the sign bit of your
narrower field.
* If you want wraparound overflow, this then needs to be specially
programmed.
I'm not sure what that means. If the number's limit is at a power of 2
then what I think you mean by wraparound should happen
automatically.
Only if you truncate. If using a u8 type within a u64 register, then 255
+ 1 will yield 256, not 0.
For many subsequent ops, that doesn't matter. But if the next op shifts
right by 1 bit, you'll end up with 128 (hex 80) rather than 0.
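For illustration, the same effect in a minimal C sketch:

#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint64_t a = 255;   /* a u8 value held in a wide register */

    /* No truncation: 255 + 1 = 256, and 256 >> 1 = 128. */
    printf("%llu\n", (unsigned long long)((a + 1) >> 1));

    /* Truncating to 8 bits after the add: 0 >> 1 = 0. */
    printf("%llu\n", (unsigned long long)(((a + 1) & 0xFF) >> 1));
    return 0;
}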
At least to start with I think I'd try to limit the number of
categories but in a different way to you, e.g.
* Integers of a size which the machine is built to deal with
* Integers which are a whole number (2 or more) of such units
* Integers which have odd bits
The ones the machine can deal with - e.g. 8, 16, 32, 64 - would be
handled by normal routines.
The ones which needed 2 or more such units could be handled by long
arithmetic.
The others would additionally need tweaking for the extra bits.
I think that's a reasonable approach - put all the tricky ones to one side!
And with the middle category, once you've implemented that one, then you
will probably want to streamline the ops that are only twice natural
word size.
Because, if doing a simple op like & (bitwise and) for example, you
don't really want to run a loop over two elements either inline or in a called routine.
(But maybe your implementation already has specialised branches for
that, which can be recognised at compile-time.)
On 02/09/2021 12:44, Dmitry A. Kazakov wrote:
I do not know what a 17-bit value is,
Let's say it's a value in the range representable by a u17 type, that
is 0 to 131071.
If you're working with a conventional processor, then the first problem
you have is that storage is defined in whole multiples of 8-bit bytes.
The second is that load and store ops are in terms of 8, 16, 32, 64 or sometimes 128 bit units. An auxiliary one is that such ops may have
alignment needs.
The third is that the registers available for integer arithmetic are of
8, 16, 32 and 64 bits.
A fourth might be that arithmetic operations may not work directly on
all of those (but on x64 they do).
Notice how that 17-bit type really doesn't fit naturally into that model.
So I don't have such sizes as 'first-class' types. Bitfields are
generally dealt with extracting them from a more normal type, or
inserting them.
Whatever machine number is capable of representing the whole range.
So, 16 bits for that range, or 64-bits if your language semantics say
that all intermediate calculations must be that size.
(Most C implementations use 32-bit calculations.)
If you don't have such rules, then something like this becomes ambiguous:
byte a := 255
print a + 1
Does this display 0, 256, or report a runtime error?
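For comparison, C itself pins this down with integer promotion; a minimal sketch:

#include <stdio.h>

int main(void) {
    unsigned char a = 255;
    printf("%d\n", a + 1);    /* 256: 'a' is promoted to int first */
    unsigned char b = a + 1;
    printf("%d\n", b);        /* 0: the wrap happens on storing    */
    return 0;
}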
All problems you describe are imaginary. Once the semantics is defined
by attributing correct types, no ambiguity exists.
I think you're talking nonsense.
Most people can add a 3-digit decimal number like 721 to a 4-digit one
like 9485 to yield the 5-digit 10179, without being concerned about them
being different types!
They /could/ represent different quantities, but here they are pure
numbers.
Numbers representing different quantities could well have different
types, which a language could help out with (eg. to stop you adding £721
and 9485 grams), but in this case, they are simply width-restricted.
721 could be 3 digits because it's not expected to store more than 999,
so it saves storage.
But the language has to say so. James' proposals are not clear on that matter.
On 2021-09-02 14:50, Bart wrote:
On 02/09/2021 12:44, Dmitry A. Kazakov wrote:
I do not know what a 17-bit value is,
Let's say it's a value in the range representable by a u17 type, that
is 0 to 131071.
If you're working with a conventional processor, then the first
problem you have is that storage is defined in whole multiples of
8-bit bytes.
Why is that a problem? Define u17.
The second is that load and store ops are in terms of 8, 16, 32, 64 or
sometimes 128 bit units. An auxiliary one is that such ops may have
alignment needs.
Why is that a problem again?
The third is that the registers available for integer arithmetic are
of 8, 16, 32 and 64 bits.
Ditto.
A fourth might be that arithmetic operations may not work directly on
all of those (but on x64 they do).
And?
Notice how that 17-bit type really doesn't fit naturally into that model.
What model? Again, what is the problem with implementing a mod 2**17
type, if u17 means that?
So I don't have such sizes as 'first-class' types. Bitfields are
generally dealt with extracting them from a more normal type, or
inserting them.
Bit fields have nothing to do with any integer types.
If you don't have such rules, then something like this becomes ambiguous:
byte a := 255
print a + 1
Does this display 0, 256, or report a runtime error?
I don't know what byte is here. You must annotate the exact semantics.
Provided print takes byte as the argument, then
1. If byte is an integer range -256..255, the result is an exception.
2. If byte is mod 256, the result is 0
3. If byte is a memory unit octet, the result is type error.
All problems you describe are imaginary. Once the semantics is
defined by attributing correct types, no ambiguity exists.
I think you're talking nonsense.
Most people can add a 3-digit decimal number like 721 to a 4-digit one
like 9485 to yield the 5-digit 10179, without being concerned about
them being different types!
I do not know what "3-digit" means when you attach it to "decimal
number."
Pure numbers exist only in books on mathematics.
Numbers representing different quantities could well have different
types, which a language could help out with (eg. to stop you adding
£721 and 9485 grams), but in this case, they are simply width-restricted.
I have no idea what "width-restricted" means. Define types. Annotate all values with types,
On 02/09/2021 15:52, Dmitry A. Kazakov wrote:
On 2021-09-02 14:50, Bart wrote:
A fourth might be that arithmetic operations may not work directly on
all of those (but on x64 they do).
And?
OK. So according to you these types are no problem at all. You can use a
u17 type just as easily as a 16-bit or 32-bit type.
Perhaps you'd like to show some actual assembly code, then, for this fragment:
u17 a,b,c
a := b + c
I'd be particularly interested in how a,b,c are laid out in memory.
On 02/09/2021 15:52, Dmitry A. Kazakov wrote:
On 2021-09-02 14:50, Bart wrote:
On 02/09/2021 12:44, Dmitry A. Kazakov wrote:
I do not know what a 17-bit value is,
Let's say it's a value in the range representable by a u17 type,
that is 0 to 131071.
If you're working with a conventional processor, then the first
problem you have is that storage is defined in whole multiples of
8-bit bytes.
Why is that a problem? Define u17.
The second is that load and store ops are in terms of 8, 16, 32, 64
or sometimes 128 bit units. An auxiliary one is that such ops may
have alignment needs.
Why is that a problem again?
The third is that the registers available for integer arithmetic are
of 8, 16, 32 and 64 bits.
Ditto.
A fourth might be that arithmetic operations may not work directly on
all of those (but on x64 they do).
And?
OK. So according to you these types are no problem at all. You can use a
u17 type just as easily as a 16-bit or 32-bit type.
Perhaps you'd like to show some actual assembly code, then, for this fragment:
u17 a,b,c
a := b + c
I'd be particularly interested in how a,b,c are laid out in memory.
Bit fields have nothing to do with any integer types.
It sounds like they have nothing to do with anything, according to you.
If you don't have such rules, then something like this becomes
ambiguous:
byte a := 255
print a + 1
Does this display 0, 256, or report a runtime error?
I don't know what byte is here. You must annotate the exact semantics.
Provided print takes byte as the argument, then
1. If byte is an integer range -256..255, the result is an exception.
2. If byte is mod 256, the result is 0
3. If byte is a memory unit octet, the result is type error.
In my language, the printed value is 256.
I do not know what "3-digit" means when you attach it to "decimal
number."
It's a number with 3 digits.
I have no idea what "width-restricted" means. Define types. Annotate
all values with types.
No, width here does not define a new type as you like to think of types.
On 02/09/2021 17:01, Bart wrote:
On 02/09/2021 15:52, Dmitry A. Kazakov wrote:
On 2021-09-02 14:50, Bart wrote:
...
A fourth might be that arithmetic operations may not work directly
on all of those (but on x64 they do).
And?
OK. So according to you these types are no problem at all. You can use
a u17 type just as easily as a 16-bit or 32-bit type.
Perhaps you'd like to show some actual assembly code, then, for this fragment:
u17 a,b,c
a := b + c
I'd be particularly interested in how a,b,c are laid out in memory.
I'd be interested to see Dmitry's assembly code - but I suspect he'll
not answer that part.
What is supposed to happen on overflow
and are there any particular
optimisation goals?
On 02/09/2021 17:01, Bart wrote:
On 02/09/2021 15:52, Dmitry A. Kazakov wrote:
On 2021-09-02 14:50, Bart wrote:
...
A fourth might be that arithmetic operations may not work directly
on all of those (but on x64 they do).
And?
OK. So according to you these types are no problem at all. You can use
a u17 type just as easily as a 16-bit or 32-bit type.
Perhaps you'd like to show some actual assembly code, then, for this fragment:
u17 a,b,c
a := b + c
I'd be particularly interested in how a,b,c are laid out in memory.
I'd be interested to see Dmitry's assembly code - but I suspect he'll
not answer that part.
What is supposed to happen on overflow and are there any particular optimisation goals?
On 02/09/2021 18:47, James Harris wrote:
On 02/09/2021 17:01, Bart wrote:
On 02/09/2021 15:52, Dmitry A. Kazakov wrote:
On 2021-09-02 14:50, Bart wrote:
...
A fourth might be that arithmetic operations may not work directly
on all of those (but on x64 they do).
And?
OK. So according to you these types are no problem at all. You can use
a u17 type just as easily as a 16-bit or 32-bit type.
Perhaps you'd like to show some actual assembly code, then, for this fragment:
u17 a,b,c
a := b + c
I'd be particularly interested in how a,b,c are laid out in memory.
I'd be interested to see Dmitry's assembly code - but I suspect he'll
not answer that part.
Presumably it is roughly the same as you'd get in C with :
uint32_t a, b, c;
a = (b + c) % (1u << 17);
Types in a high level language are not constructs in assembly.
They
don't have to correspond to matching hardware or assembly-level
features. Just as a "bool" or an enumerated type in C is going to be
stored in a register or memory in exactly the same way as the
processor might store a number, so a "u17" type (assuming that means a 0
.. 2 ^ 17 - 1 modulo type) will be stored in the same way as some
integer. That's likely to be the same storage as a uint32_t. (Though
one processor I use has 20-bit registers, which would be more efficient
than using two of its 16-bit registers.)
It is the language semantics that defined the types and their
operations. Then it is up to the compiler - not the high level
programmer - to figure out how to turn that into efficient assembly.
What is supposed to happen on overflow and are there any particular
optimisation goals?
There are no overflows on a modular type.
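For illustration, a compilable sketch of such a lowering (invented names; assumes 32-bit int; since the divisor is a power of two, the modulo folds to a single AND):

#include <stdint.h>

typedef uint32_t u17;   /* invariant: value < (1u << 17) */

static u17 u17_add(u17 b, u17 c) {
    return (b + c) & ((1u << 17) - 1);   /* wrap mod 2**17 */
}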
On 02/09/2021 19:56, David Brown wrote:
On 02/09/2021 18:47, James Harris wrote:
On 02/09/2021 17:01, Bart wrote:
Perhaps you'd like to show some actual assembly code, then, for this
fragment:
u17 a,b,c
a := b + c
I'd be particularly interested in how a,b,c are laid out in memory.
I'd be interested to see Dmitry's assembly code - but I suspect he'll
not answer that part.
Presumably it is roughly the same as you'd get in C with :
uint32_t a, b, c;
a = (b + c) % (1u << 17);
Types in a high level language are not constructs in assembly.
With the 17-bit example there were decisions to be made regarding how
and where standalone 17-bit variables are stored in memory; whether each starts on a byte boundary or in the middle, and how exactly they will be loaded into registers.
They
don't have to correspond to matching hardware or assembly-level
features. Just as a "bool" or an enumerated type in C is going to be
stored in a register or memory in exactly the same way as the
processor might store a number, so a "u17" type (assuming that means a 0
.. 2 ^ 17 - 1 modulo type) will be stored in the same way as some
integer. That's likely to be the same storage as a uint32_t. (Though
one processor I use has 20-bit registers, which would be more efficient
than using two of its 16-bit registers.)
What about a uint21 type then? There will always be odd types that don't match! Unless a processor has a completely flexible ALU with any
bitwidth supported.
With the 17-bit example there were decisions to be made regarding how
and where standalone 17-bit variables are stored in memory;
whether each
starts on a byte boundary or in the middle, and how exactly they will be loaded into registers.
My point had been that the resulting code would be very different
compared with something like a 16-bit type which is an easy match for
the hardware.
This will affect the design of a language that may include such odd types.
They
don't have to correspond to matching hardware or assembly-level
features. Just as a "bool" or an enumerated type in C is going to be
stored in a register or memory in exactly the same way as the
processor might store a number, so a "u17" type (assuming that means a 0
.. 2 ^ 17 - 1 modulo type) will be stored in the same way as some
integer. That's likely to be the same storage as a uint32_t. (Though
one processor I use has 20-bit registers, which would be more efficient
than using two of its 16-bit registers.)
What about a uint21 type then? There will always be odd types that don't match! Unless a processor has a completely flexible ALU with any
bitwidth supported.
What is supposed to happen on overflow and are there any particular
optimisation goals?
There are no overflows on a modular type.
/He/ introduced modular types. I call them merely unsigned types which
/can/ overflow, and usually do so by wrapping.
In any case, I understood modular types could have any range at all, for example with values from 50 to 100 inclusive. Then, ensuring that
overflowing the limit of 100 wraps back to 50 can be fiddly without
hardware assistance.
On 2021-09-02 18:01, Bart wrote:
On 02/09/2021 15:52, Dmitry A. Kazakov wrote:
And?
OK. So according to you these types are no problem at all. You can use
a u17 type just as easily as a 16-bit or 32-bit type.
Right.
If you don't have such rules, then something like this becomes
ambiguous:
byte a := 255
print a + 1
Does this display 0, 256, or report a runtime error?
I don't know what byte is here. You must annotate the exact semantics.
Provided print takes byte as the argument, then
1. If byte is an integer range -256..255, the result is an exception.
2. If byte is mod 256, the result is 0
3. If byte is a memory unit octet, the result is type error.
In my language, the printed value is 256.
I have no doubt that your language is broken.
I do not know what "3-digit" means when you attach it to "decimal
number."
It's a number with 3 digits,
Numbers have no digits.
On 2021-09-02 22:26, Bart wrote:
With the 17-bit example there were decisions to be made regarding how
and where standalone 17-bit variables are stored in memory;
You are confusing storing values in memory with implementation of
operations on them. You can store the value packed and use 1024-bit arithmetic to implement it. These are semantically independent choices.
whether each starts on a byte boundary or in the middle, and how
exactly they will be loaded into registers.
The best way possible for the target machine unless the programmer tells otherwise.
The following is a legal Ada program:
with Ada.Text_IO;
procedure Test is
type u17_1 is mod 2**17; -- Use defaults
type u17_2 is mod 2**17 with Size => 17; -- 17-bit, if matters
type Packed is record
H : Boolean;
I : u17_2; -- Here it matters, because the record is packed
J : Boolean;
end record with Pack => True;
X : u17_1 := 123;
Y : u17_2 := 123;
Z : Packed := (False, 123, True);
begin
Ada.Text_IO.Put_Line ("X'Size=" & Integer'Image (X'Size));
Ada.Text_IO.Put_Line ("Y'Size=" & Integer'Image (Y'Size));
Ada.Text_IO.Put_Line ("Z.I'Size=" & Integer'Image (Z.I'Size));
end Test;
It will print
32
32
17
The compiler uses the best possible machine representation of 32-bit
except in the case where I instructed it to use exactly 17 bits.
My point had been that the resulting code would be very different
compared with something like a 16-bit type which is an easy match for
the hardware.
Why should anybody care?
This will affect the design of a language that may include such odd
types.
Why should it affect anything?
They
don't have to correspond to matching hardware or assembly-level
features. Just as a "bool" or an enumerated type in C is going to be
stored in a register or memory in exactly the same way as the
processor might store a number, so a "u17" type (assuming that means a 0
.. 2 ^ 17 - 1 modulo type) will be stored in the same way as some
integer. That's likely to be the same storage as a uint32_t. (Though
one processor I use has 20-bit registers, which would be more efficient
than using two of its 16-bit registers.)
What about a uint21 type then? There will always be odd types that
don't match! Unless a processor has a completely flexible ALU with any
bitwidth suported.
Did you read what David wrote? It starts with: "They don't have to
correspond to matching hardware or assembly-level features."
What is supposed to happen on overflow and are there any particular
optimisation goals?
There are no overflows on a modular type.
/He/ introduced modular types. I call them merely unsigned types which
/can/ overflow, and usually do so by wrapping.
https://mathworld.wolfram.com/ModularArithmetic.html
https://en.wikipedia.org/wiki/Modular_arithmetic
In any case, I understood modular types could have any range at all,
for example with values from 50 to 100 inclusive. Then, ensuring that
overflowing the limit of 100 wraps back to 50 can be fiddly without
hardware assistance.
Modulo 101 wraps to zero.
OK. But my type was specifically 50 to 100. But let's change it to 10050
to 10100, and I want 10100+1 to wrap to 10050. Now you additionally have
to consider whether such a type can be represented within 8 bits (or
possibly 6), and whether it will need 16 bits.
My decision? Just use u16! And let user-code take care of any wrapping,
like it has to for most things.
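For illustration, that user-code wrap is a one-liner in C (a minimal sketch with invented names; assumes v has only just passed the upper bound, so v >= lo):

#include <stdint.h>

/* Reduce v into 10050..10100: offset from the low bound,
   wrap modulo the span, add the low bound back. */
static uint16_t wrap_range(uint32_t v) {
    const uint32_t lo = 10050, hi = 10100;
    return (uint16_t)(lo + (v - lo) % (hi - lo + 1));
}

/* wrap_range(10101) == 10050 */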
On 02/09/2021 18:02, Dmitry A. Kazakov wrote:
On 2021-09-02 18:01, Bart wrote:
On 02/09/2021 15:52, Dmitry A. Kazakov wrote:
And?
OK. So according to you these types are no problem at all. You can
use a u17 type just as easily as a 16-bit or 32-bit type.
Right.
OK.... But since you haven't shown any real code for a real processor
based on some detailed proposal of the semantics, I don't believe you.
If you don't have such rules, then something like this becomes
ambiguous:
byte a := 255
print a + 1
Does this display 0, 256, or report a runtime error?
I don't know what byte is here. You must annotate the exact semantics.
Provided print takes byte as the argument, then
1. If byte is an integer range -256..255, the result is an exception.
2. If byte is mod 256, the result is 0
3. If byte is a memory unit octet, the result is type error.
In my language, the printed value is 256.
I have no doubt that your language is broken.
Well I make the rules for it so how can it be broken?
I do not know what "3-digit" means when you attach it to "decimal
number."
It's a number with 3 digits,
Numbers have no digits.
Mine do.
But ... why am I forced to explain this stuff, what are you, 7?
On 2021-09-03 00:08, Bart wrote:
On 02/09/2021 18:02, Dmitry A. Kazakov wrote:
On 2021-09-02 18:01, Bart wrote:
On 02/09/2021 15:52, Dmitry A. Kazakov wrote:
And?
OK. So according to you these types are no problem at all. You can
use a u17 type just as easily as a 16-bit or 32-bit type.
Right.
OK.... But since you haven't shown any real code for a real processor
based on some detailed proposal of the semantics, I don't believe you.
with Ada.Text_IO; use Ada.Text_IO;
procedure Test is
type u17_2 is mod 2**17 with Size => 17;
type Packed is record
H : Boolean;
I : u17_2;
J : Boolean;
end record with Pack => True;
Z : Packed := (False, 123, True);
begin
Put_Line ("Z.I'Size=" & Integer'Image (Z.I'Size));
Put_Line ("Increment=" & u17_2'Image (Z.I + 1));
end Test;
Prints:
Z.I'Size= 17
Increment= 124
I leave disassembling to you as an exercise.
Well I make the rules for it so how can it be broken?
Broken rules, broken language.
The debate is not about whether it's possible at all. It is:
So having 255 + 1 = 256 is broken?!
In your Ada code, if I make the field 'u2' (mod 2**2 and size 2), and
store 2 into Z.I, then print Z.I+Z.I, it tells me that 2 + 2 = 0!
On 2021-09-03 13:34, Bart wrote:
The debate is not about whether it's possible at all. It is:
Then why did you keep asking how to do it?
So having 255 + 1 = 256 is broken?!
Yes, when the value 256 does not belong to the type.
In your Ada code, if I make the field 'u2' (mod 2**2 and size 2), and
store 2 into Z.I, then print Z.I+Z.I, it tells me that 2 + 2 = 0!
Which is exactly right.
2 + 2 = 0 (mod 4)
On 03/09/2021 08:23, Dmitry A. Kazakov wrote:
On 2021-09-03 00:08, Bart wrote:
Ada unoptimised code for Z.I:=Z.I+1 using godbolt.org (optimising
eliminates all the code) (looks like 32-bit):
Well I make the rules for it so how can it be broken?
Broke rules, broken language.
So having 255 + 1 = 256 is broken?!
In your Ada code, if I make the field 'u2' (mod 2**2 and size 2), and
store 2 into Z.I, then print Z.I+Z.I, it tells me that 2 + 2 = 0!
If I try the same in my language:
record Packed =
int32 cont : (x:2, y:2)
end
Packed z
z.x:=2
z.y:=2
fprintln "# + # is #", z.x, z.y, z.x+z.y
It says:
2 + 2 is 4
I think I'll stick with mine, thank you very much!
On 03/09/2021 14:05, Dmitry A. Kazakov wrote:
On 2021-09-03 13:34, Bart wrote:
The debate is not about whether it's possible at all. It is:
Then why did you keep asking how to do it?
Because you keep saying that these odd types are just as efficient and
as easy to implement as machine-friendly types.
So having 255 + 1 = 256 is broken?!
Yes, when the value 256 does not belong to the type.
In your Ada code, if I make the field 'u2' (mod 2**2 and size 2), and
store 2 into Z.I, then print Z.I+Z.I, it tells me that 2 + 2 = 0!
Which is exactly right.
2 + 2 = 0 (mod 4)
OK, I will get rid of that mod type and use this:
type u2 is range 0..3;
Now 2 + 2 = 4 even though I'm adding two u2 types and even though I'm
using u2'Image on that 4.
Exactly the same as my language, which you said was broken. Here, run this:
with Ada.Text_IO; use Ada.Text_IO;
procedure Test is
type byte is range 0..255;
Z: byte := 255;
begin
Put_Line (byte'Image (Z+1));
end Test;
This displays 256,
which you said was wrong for my language:
byte Z:=255
println Z+1
On 03/09/2021 13:34, Bart wrote:
On 03/09/2021 08:23, Dmitry A. Kazakov wrote:
On 2021-09-03 00:08, Bart wrote:
Ada unoptimised code for Z.I:=Z.I+1 using godbolt.org (optimising
eliminates all the code) (looks like 32-bit):
It would be helpful here if Dmitry wrote some Ada code that has
side-effects, but does not include such big functions as printing, nor
should it be a complete program.
For example, if I wanted to show how a C compiler would handle
incrementing a 32-bit int, I could write :
#include <stdint.h>
volatile int32_t vx, vy;
void foo1(void) {
int32_t x = vx;
x++;
vy = x;
}
int32_t foo2(int32_t a) {
int32_t x = a;
x++;
return x;
}
Whether "foo1" or "foo2" styles are used is not that important. What
/is/ important is that you can run the compiler with optimisation
enabled, since comparing unoptimised code for quality, size or speed is utterly meaningless.
So having 255 + 1 = 256 is broken?!
Yes, if the type in question is "unsigned modulo 2 ** 8". Then the
correct result for 255 + 1 is 0.
I think I'll stick with mine, thank you very much!
Your code is wrong - /if/ you have types that have the same specific semantics, such as modulo wrapping.
In Ada, it is quite a lot simpler:
1. If a, b and c don't have compatible types, it's a compiler error.
I'm guessing that your language is basically like C, except that I
expect you to have wrapping semantics for signed types, and your "int"
is 64-bit on 64-bit systems. Is that right?
While I am not a big fan of Ada in general - it's far too wordy for my
tastes - I think its arithmetic semantics are clearer, more consistent,
and more flexible.
On 03/09/2021 16:19, David Brown wrote:
On 03/09/2021 13:34, Bart wrote:
On 03/09/2021 08:23, Dmitry A. Kazakov wrote:
On 2021-09-03 00:08, Bart wrote:
Ada unoptimised code for Z.I:=Z.I+1 using godbolt.org (optimising
eliminates all the code) (looks like 32-bit):
It would be helpful here if Dmitry wrote some Ada code that has
side-effects, but does not include such big functions as printing, nor
should it be a complete program.
For example, if I wanted to show how a C compiler would handle
incrementing a 32-bit int, I could write :
#include <stdint.h>
volatile int32_t vx, vy;
void foo1(void) {
int32_t x = vx;
x++;
vy = x;
}
int32_t foo2(int32_t a) {
int32_t x = a;
x++;
return x;
}
Whether "foo1" or "foo2" styles are used is not that important. What
/is/ important is that you can run the compiler with optimisation
enabled, since comparing unoptimised code for quality, size or speed is
utterly meaningless.
Extracting and inserting bitfields will always have extra bit-twiddling.
And there will be extra rules as to how they will be laid out.
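In C terms, the bit-twiddling looks something like this (field position
and width invented for illustration):

#include <stdint.h>

/* Extract a 2-bit field stored at bit offset 3 of a word. */
static inline uint32_t get_field(uint32_t word) {
    return (word >> 3) & 0x3u;
}

/* Insert a 2-bit value at the same position, preserving the other bits. */
static inline uint32_t set_field(uint32_t word, uint32_t val) {
    return (word & ~(0x3u << 3)) | ((val & 0x3u) << 3);
}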
So having 255 + 1 = 256 is broken?!
Yes, if the type in question is "unsigned modulo 2 ** 8". Then the
correct result for 255 + 1 is 0.
The type of what? "+" is defined in my language between two matching
types from this set (pointers not shown):
i64 u64 r32 r64 i128 u128
Operands of other types will be converted to suitable types as needed,
but the result will be one of these as well.
(In-place "+", ie. augmented assignment, has different rules.)
I think I'll stick with mine, thank you very much!
Your code is wrong - /if/ you have types that have the same specific
semantics, such as modulo wrapping.
Yes it has modulo wrapping for unsigned types. But that happens at this point:
18446744073709551615 + 1
Those 8-bit or 2-bit values are promoted to u64, and both 255 and 3 are
a long way below that threshold.
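C behaves analogously with its integer promotions, just with int rather
than u64 as the common type; a minimal sketch:

#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint8_t a = 255;
    /* a is promoted to int before the addition, so nothing wraps: */
    printf("%d\n", a + 1);              /* prints 256 */
    /* wrapping only happens if the result is forced back into 8 bits: */
    printf("%d\n", (uint8_t)(a + 1));   /* prints 0 */
    return 0;
}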
In Ada, it is quite a lot simpler:
1. If a, b and c don't have compatible types, it's a compiler error.
I use this approach for more elaborate types, such as user-defined types.
Other languages make it more complicated with classes and inheritance
and overloading of multi-parameter functions. Then it makes simple
numeric promotions seem like child's play!
When I'm working with just numbers, then I don't need restrictions:
x := 1e1'000'000L
y := 3
println x
println x + y.[0]
(Dynamic code.) Here I'm adding x, a 1-million-digit number, to the
value of the least significant bit of y. The output is:
1.e1000000
1000 .... 0001
No fuss. Ada on the other hand would probably have kittens.
I'm guessing that your language is basically like C, except that I
expect you to have wrapping semantics for signed types, and your "int"
is 64-bit on 64-bit systems. Is that right?
Yes. In the past I didn't have promotions. So if A and B were byte (u8)
types, then arithmetic was done at 8 bits, and 255 + 1 would give 0.
This was useful on smaller devices where 8-bit add was more efficient
than 16 bits or more.
It's not so useful now, in an 'open' expression when the result is not
put back into a byte type, but used more generally. And it gives
surprising results with:
A + 1
when A is 255 for example.
(Also, I got the impression that not all processors support arithmetic
on narrower types.)
While I am not a big fan of Ada in general - it's far too wordy for my
tastes - I think its arithmetic semantics are clearer, more consistent,
and more flexible.
Ada is just too heavy-going and much too strict for my taste. Its type
system would mean stopping to do battle with the language every five
minutes, and creating ever more elaborate workarounds when you've tied yourself up in knots.
I had enough trouble with my 'byte' and 'char' types. The casts needed
to convert between ref byte and ref char started to poison everything.
In the end I just allowed implicit conversions between them.
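C has the same friction between char * and unsigned char *; a small
sketch of how the casts pile up (the checksum function is invented):

#include <stddef.h>
#include <string.h>

/* A byte-oriented routine naturally wants unsigned char. */
static unsigned char checksum(const unsigned char *p, size_t n) {
    unsigned char sum = 0;
    while (n--)
        sum += *p++;
    return sum;
}

/* Text arrives as char *, so every call across the divide needs a cast. */
static unsigned char checksum_str(const char *s) {
    return checksum((const unsigned char *)s, strlen(s));
}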
Maybe it will suit your work (why don't you use it anyway?). But it
doesn't suit my stuff.
On 03/09/2021 19:30, Bart wrote:
Yes it has modulo wrapping for unsigned types. But that happens at this
point:
18446744073709551615 + 1
Those 8-bit or 2-bit values are promoted to u64, and both 255 and 3 are
a long way below that threshold.
Different languages, different rules.
I don't think it is appropriate to say that one way is "right" and the
other is "broken" - both are "broken" when viewed with the other set of rules.
When I'm working with just numbers, then I don't need restrictions:
x := 1e1'000'000L
y := 3
println x
println x + y.[0]
No fuss. Ada on the other hand would probably have kittens.
Ada is designed to be strict - you get exactly what you ask for, but you
have to ask exactly for what you want. The idea is that it should be
very difficult to write (and compile) incorrect code, at the cost of
making it a bit more difficult to write correct code. Your language is
at the other end of the scale, where you make it as easy as possible to
write correct code, at the cost of making it easier to write incorrect
code.
It's not so useful now, in an 'open' expression when the result is not
put back into a byte type, but used more generally. And it gives
surprising results with:
A + 1
when A is 255 for example.
That will depend on how you view the type of literals like "1".
As I say, there's room for different types of language.
You don't use a forgiving, easy-to-write language
like Python when you are making a car engine controller or medical
equipment.
You don't use a strict, hard-to-write language like Ada when
making a little utility program on a PC.
There is also room for
different kinds of programmers - someone who feels "workarounds" are
appropriate should not be doing the kind of programming where Ada is a
typical language choice, whereas someone who insists that every function
written has full documentation and unit tests and is reviewed by at
least two independent groups is not going to work on a task where
time-to-market and development costs are more important than quality.
On 04/09/2021 12:04, David Brown wrote:
On 03/09/2021 19:30, Bart wrote:
Ada is designed to be strict - you get exactly what you ask for, but you
have to ask exactly for what you want. The idea is that it should be
very difficult to write (and compile) incorrect code, at the cost of
making it a bit more difficult to write correct code. Your language is
at the other end of the scale, where you make it as easy as possible to
write correct code, at the cost of making it easier to write incorrect
code.
For the static language, yes. For the dynamic one, in which I wrote that example (it was not worthwhile to add arbitrary precision into the
other), it actually does quite a lot of runtime checks, while still
letting you do lots of naughty or underhand things if you want.
It's not so useful now, in an 'open' expression when the result is not
put back into a byte type, but used more generally. And it gives
surprising results with:
A + 1
when A is 255 for example.
That will depend on how you view the type of literals like "1".
Exactly. There can be a surprising amount of confusion about the types
of literals.
Because of the way my language treats mixed signed/unsigned arithmetic
(as signed operations), and because literals 0 to 2**63-1 are signed,
this can give unexpected results here:
u64 A = 0xC000'0000'0000'0000
if A > 1 then ...
A is treated as signed, so a negative value in this case: it will give
the opposite behaviour to what is expected (A <= 1). So I had to
introduce some special rules for certain operators such as relative
compares:
A > B Not allowed for mixed types (cast required)
A > 1 When A is unsigned then so is the literal
(Yeah, I got bitten by this... But it shows I can recognise a potential
source of bugs and do something about it.)
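C resolves such mixed comparisons the other way (the signed operand is
converted to unsigned), which has its own classic trap; a minimal sketch:

#include <stdio.h>

int main(void) {
    unsigned int u = 1;
    int s = -1;
    /* s is converted to unsigned int, becoming a huge value, so the
       comparison concludes that -1 > 1: */
    if (s > u)
        printf("-1 > 1 under C's usual arithmetic conversions\n");
    return 0;
}

(Most compilers will flag this with -Wsign-compare or the like.)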
As I say, there's room for different types of language.
Not according to DAK. There could only be one language (ideally a lot
better than Ada, but Ada will do if the ideal doesn't exist); it doesn't matter how big or complex or slow to compile or inefficient it is; and
it can only employ the latest concepts in distributed computing (or
whatever he was on about) in every application, even if it doesn't need it.
You don't use a forgiving, easy-to-write language
like Python when you are making a car engine controller or medical
equipment.
Apparently it's used a lot in all sorts of 'enterprise' areas like
finance, or apps like Instagram.
But I don't like it (mostly the language, but also 'Pythonistas').
You don't use a strict, hard-to-write language like Ada when
making a little utility program on a PC.
See above!
There is also room for
different kinds of programmers - someone who feels "workarounds" are
appropriate should not be doing the kind of programming where Ada is a
typical language choice, whereas someone who insists that every function
written has full documentation and unit tests and is reviewed by at
least two independent groups is not going to work on a task where
time-to-market and development costs are more important than quality.
Agreed (for a change).
Ada also solves this kind of problem by not allowing comparisons between different types. (I don't know how it handles literals - that's beyond
my rather limited knowledge of the language.)
On 2021-09-04 15:39, David Brown wrote:
Ada also solves this kind of problem by not allowing comparisons between
different types. (I don't know how it handles literals - that's beyond
my rather limited knowledge of the language.)
When operations can be overloaded in the result type, that simplifies a
lot. Literals are semantically overloaded parameterless functions: the
literal 1 overloads as Integer, as Unsigned_16, as Long_Integer, as
My_Custom_Integer, etc.
On 04/09/2021 17:16, Dmitry A. Kazakov wrote:
On 2021-09-04 15:39, David Brown wrote:
Ada also solves this kind of problem by not allowing comparisons between
different types. (I don't know how it handles literals - that's beyond
my rather limited knowledge of the language.)
When operations can be overloaded in the result type that simplifies a
lot. Literals are semantically overloaded parameterless functions. 1
Integer overloads, 1 Unsigned_16, 1 Long_Integer, 1 My_Custom_Integer etc.
I'm not very keen on overloading in the result type - it feels to me
that it would be too easy to lose track of what is going on, and too
easy to have code that appears identical (same expression, same
variables, same types, etc.) but completely different effects.
I must admit I haven't tried result type overloading (I've only played
very briefly with Ada). But I'm sceptical.
On 2021-09-05 10:54, David Brown wrote:
On 04/09/2021 17:16, Dmitry A. Kazakov wrote:
On 2021-09-04 15:39, David Brown wrote:
Ada also solves this kind of problem by not allowing comparisons
between
different types. (I don't know how it handles literals - that's beyond
my rather limited knowledge of the language.)
When operations can be overloaded in the result type that simplifies a
lot. Literals are semantically overloaded parameterless functions. 1
Integer overloads, 1 Unsigned_16, 1 Long_Integer, 1 My_Custom_Integer
etc.
I'm not very keen on overloading in the result type - it feels to me
that it would be too easy to lose track of what is going on, and too
easy to have code that appears identical (same expression, same
variables, same types, etc.) but completely different effects.
Why:
declare
X : T;
Y : S;
begin
Foo (X);
Foo (Y);
is OK, but
declare
X : T := Create;
Y : S := Create;
begin
is not?
Anyway, isn't it nice to have 1 written as 1 instead of ugly 1ULL?
On 05/09/2021 10:39, Dmitry A. Kazakov wrote:
On 2021-09-05 10:54, David Brown wrote:
On 04/09/2021 17:16, Dmitry A. Kazakov wrote:
On 2021-09-04 15:39, David Brown wrote:
Ada also solves this kind of problem by not allowing comparisons
between
different types. (I don't know how it handles literals - that's
beyond
my rather limited knowledge of the language.)
When operations can be overloaded in the result type that simplifies a
lot. Literals are semantically overloaded parameterless functions. 1
Integer overloads, 1 Unsigned_16, 1 Long_Integer, 1
My_Custom_Integer etc.
I'm not very keen on overloading in the result type - it feels to me
that it would be too easy to lose track of what is going on, and too
easy to have code that appears identical (same expression, same
variables, same types, etc.) but completely different effects.
Why:
declare
X : T;
Y : S;
begin
Foo (X);
Foo (Y);
is OK, but
declare
X : T := Create;
Y : S := Create;
begin
is not?
Anyway, isn't it nice to have 1 written as 1 instead of ugly 1ULL?
That is merely a consequence of most C implementations being capped at a 32-bit type. The language allows a wider int.
I can write this:
u64 a:=0
u128 b:=18446744073709551615
println a + 1
println b + 1
Output is:
1
18446744073709551616
Promotion rules will widen the literal when used in a binary op.
However in other cases it won't; here it displays zero:
println 18446744073709551615 + 1
In this case, I need to use casts, eg:
println 18446744073709551615 + u128(1)
In Ada however it doesn't compile.
On 2021-09-05 12:08, Bart wrote:
In Ada however it doesn't compile.
Of course it does:
with Ada.Text_IO; use Ada.Text_IO;
procedure Test is
begin
   Put_Line (Long_Long_Long_Integer'Image (18446744073709551615 + 1));
end Test;
Prints:
18446744073709551616
No casts needed.
On 05/09/2021 11:43, Dmitry A. Kazakov wrote:
On 2021-09-05 12:08, Bart wrote:
In Ada however it doesn't compile.
Of course it does:
with Ada.Text_IO; use Ada.Text_IO;
procedure Test is
begin
Put_Line (Long_Long_Long_Integer'Image (18446744073709551615 + 1));
end Test;
Prints:
18446744073709551616
No casts needed.
OK, my last post just crossed yours. I'd figured this out. Except my
GNAT implementation doesn't support Long_Long_Long_Integer (neither
does rextester.com).
But even if it did, really? Apart from being uglier than 1ULL, what sort
of type is that?
BTW it looks to me like you're having to use an i128 type in order to represent a value that fits into u64.
On 05/09/2021 11:08, Bart wrote:
On 05/09/2021 10:39, Dmitry A. Kazakov wrote:
Anyway, isn't it nice to have 1 written as 1 instead of ugly 1ULL?
In this case, I need to use casts, eg:
println 18446744073709551615 + u128(1)
In Ada however it doesn't compile.
Ada can't display that literal anyway. I can try something like this,
but only for i64.max not u64.max:
Put_Line (Long_Long_Integer'Image (9223372036854775807));
This of course is not ugly at all.
Isn't it wonderful to just do 'print x', whether x is a named constant, variable or literal, and not worry about what type it happens to be?
BTW it looks to me like you're having to use an i128 type in order to
represent a value that fits into u64.
No.
On 05/09/2021 12:11, Dmitry A. Kazakov wrote:
BTW it looks to me like you're having to use an i128 type in order to
represent a value that fits into u64.
No.
So what type is Long_Long_Integer, and how does it differ from
Long_Integer?
On 2021-09-05 13:16, Bart wrote:
On 05/09/2021 12:11, Dmitry A. Kazakov wrote:
BTW it looks to me like you're having to use an i128 type in order
to represent a value that fits into u64.
No.
So what type is Long_Long_Integer, and how does it differ from
Long_Integer?
Defined in GNAT Ada package Standard:
type Long_Integer is range -2**31..2**31-1;
type Long_Long_Integer is range -2**63..2**63-1;
type Long_Long_Long_Integer is range -2**128..2**128-1;
On 2021-09-05 14:18, Dmitry A. Kazakov wrote:
On 2021-09-05 13:16, Bart wrote:
On 05/09/2021 12:11, Dmitry A. Kazakov wrote:
BTW it looks to me like you're having to use an i128 type in order
to represent a value that fits into u64.
No.
So what type is Long_Long_Integer, and how does it differ from
Long_Integer?
Defined in GNAT Ada package Standard:
type Long_Integer is range -2**31..2**31-1;
type Long_Long_Integer is range -2**63..2**63-1;
type Long_Long_Long_Integer is range -2**128..2**128-1;
Correction:
type Long_Long_Long_Integer is range -2**127..2**127-1;
On 05/09/2021 13:19, Dmitry A. Kazakov wrote:
On 2021-09-05 14:18, Dmitry A. Kazakov wrote:
On 2021-09-05 13:16, Bart wrote:
On 05/09/2021 12:11, Dmitry A. Kazakov wrote:
BTW it looks to me like you're having to use an i128 type in order
to represent a value that fits into u64.
No.
So what type is Long_Long_Integer, and how does it differ from
Long_Integer?
Defined in GNAT Ada package Standard:
type Long_Integer is range -2**31..2**31-1;
type Long_Long_Integer is range -2**63..2**63-1;
type Long_Long_Long_Integer is range -2**128..2**128-1;
Correction:
type Long_Long_Long_Integer is range -2**127..2**127-1;
So, I was right, it's using a 128-bit type.
But that 2**127-1 is interesting, assuming this is actual code and not a representation of a built-in type.
On 05/09/2021 16:10, Dmitry A. Kazakov wrote:
On 2021-09-05 15:59, Bart wrote:
On 05/09/2021 13:19, Dmitry A. Kazakov wrote:
On 2021-09-05 14:18, Dmitry A. Kazakov wrote:
On 2021-09-05 13:16, Bart wrote:
On 05/09/2021 12:11, Dmitry A. Kazakov wrote:
BTW it looks to me like you're having to use an i128 type in
order to represent a value that fits into u64.
No.
So what type is Long_Long_Integer, and how does it differ from
Long_Integer?
Defined in GNAT Ada package Standard:
type Long_Integer is range -2**31..2**31-1;
type Long_Long_Integer is range -2**63..2**63-1;
type Long_Long_Long_Integer is range -2**128..2**128-1;
Correction:
type Long_Long_Long_Integer is range -2**127..2**127-1;
So, I was right, it's using a 128-bit type.
But that 2**127-1 is interesting, assuming this is actual code and
not a representation of a built-in type.
There is no such distinction in Ada. I could write
type My_Integer is range -2**127..2**127-1;
with the same effect. This is the power of a properly designed
language. Here is the complete program:
with Ada.Text_IO;
procedure Test is
   type My_Integer is range -2**127..2**127-1;
   package IO is new Ada.Text_IO.Integer_IO (My_Integer);
   use IO;
begin
   Put (18446744073709551615 + 1);
end Test;
My point was, could you write 2**127 in any other context than a range
specifier, without having to pedantically define everything about the
types of the literals involved as well as result types?
On 2021-09-05 17:37, Bart wrote:
On 05/09/2021 16:10, Dmitry A. Kazakov wrote:
On 2021-09-05 15:59, Bart wrote:
On 05/09/2021 13:19, Dmitry A. Kazakov wrote:
On 2021-09-05 14:18, Dmitry A. Kazakov wrote:
On 2021-09-05 13:16, Bart wrote:
On 05/09/2021 12:11, Dmitry A. Kazakov wrote:
BTW it looks to me like you're having to use an i128 type in
order to represent a value that fits into u64.
No.
So what type is Long_Long_Integer, and how does it differ from
Long_Integer?
Defined in GNAT Ada package Standard:
type Long_Integer is range -2**31..2**31-1;
type Long_Long_Integer is range -2**63..2**63-1;
type Long_Long_Long_Integer is range -2**128..2**128-1;
Correction:
type Long_Long_Long_Integer is range -2**127..2**127-1;
So, I was right, it's using a 128-bit type.
But that 2**127-1 is interesting, assuming this is actual code and
not a representation of a built-in type.
There is no such distinction in Ada. I could write
type My_Integer is range -2**127..2**127-1;
with the same effect. This is the power of a properly designed
language. Here is the complete program:
with Ada.Text_IO;
procedure Test is
   type My_Integer is range -2**127..2**127-1;
   package IO is new Ada.Text_IO.Integer_IO (My_Integer);
   use IO;
begin
   Put (18446744073709551615 + 1);
end Test;
My point was,
Point or question?
could you write 2**127 in any other context than a range specifier,
Silly question:
X : constant := 2**127;
without having to pedantically define everything about the types of
the literals involved as well as result types?
Anyway, in this case it is not a range specifier, it is the definition
of an integer type.
All numeric types, all string types come with literals, naturally.
type Roman_Digit is ('I', 'V', 'X', 'L', 'C', 'D');
type Roman_Number is array (Positive range <>) of Roman_Digit;
Nine : Roman_Number := "IX";
That is why there are no built-in types in Ada. There are some types
predefined for convenience.
On 2021-09-05 10:54, David Brown wrote:
On 04/09/2021 17:16, Dmitry A. Kazakov wrote:
On 2021-09-04 15:39, David Brown wrote:
Ada also solves this kind of problem by not allowing comparisons
between
different types. (I don't know how it handles literals - that's beyond
my rather limited knowledge of the language.)
When operations can be overloaded in the result type that simplifies a
lot. Literals are semantically overloaded parameterless functions. 1
Integer overloads, 1 Unsigned_16, 1 Long_Integer, 1 My_Custom_Integer
etc.
I'm not very keen on overloading in the result type - it feels to me
that it would be too easy to lose track of what is going on, and too
easy to have code that appears identical (same expression, same
variables, same types, etc.) but completely different effects.
Why:
declare
X : T;
Y : S;
begin
Foo (X);
Foo (Y);
is OK, but
declare
X : T := Create;
Y : S := Create;
begin
is not?
Anyway, isn't it nice to have 1 written as 1 instead of ugly 1ULL?
Same with string and character literals. You just write:
S1 : String := "abc"; -- Latin-1
S2 : Wide_String := "abc"; -- UCS-2
S3 : Wide_Wide_String := "abc"; -- UCS-4
begin
if S2(3) = 'c' then -- Here 'c' is resolved to 16-bit character ...
I must admit I haven't tried result type overloading (I've only played
very briefly with Ada). But I'm sceptical.
It is an arbitrary limitation made by lazy compiler writers (we know
some (:-)). Without result overloading they can resolve all types
strictly bottom-up.
BTW, there is a similar case with overriding. In C++ only the first
[hidden] argument supports overriding. In Ada terms it is a controlling
argument. In Ada any argument and/or the result can be controlling. So
you could dispatch on the result.
On 05/09/2021 11:39, Dmitry A. Kazakov wrote:
On 2021-09-05 10:54, David Brown wrote:
On 04/09/2021 17:16, Dmitry A. Kazakov wrote:
On 2021-09-04 15:39, David Brown wrote:
Ada also solves this kind of problem by not allowing comparisons
between
different types. (I don't know how it handles literals - that's beyond
my rather limited knowledge of the language.)
When operations can be overloaded in the result type that simplifies a
lot. Literals are semantically overloaded parameterless functions. 1
Integer overloads, 1 Unsigned_16, 1 Long_Integer, 1 My_Custom_Integer
etc.
I'm not very keen on overloading in the result type - it feels to me
that it would be too easy to lose track of what is going on, and too
easy to have code that appears identical (same expression, same
variables, same types, etc.) but completely different effects.
Why:
declare
X : T;
Y : S;
begin
Foo (X);
Foo (Y);
is OK, but
declare
X : T := Create;
Y : S := Create;
begin
is not?
(Forgive me if I've misunderstood the Ada syntax here.)
This is creating new instances of a type - that's not the same as
overloading on the return type as a general feature. Certainly if you
first have overloading on return type as a feature of a language, you
can use that for initialisation or object creation. But it is not
necessary.
C++ does give you result type overloading, if you want it, via classes
and conversion operators :
#include <stdint.h>

class A {
    int x_;
public:
    A(int x) : x_(x) {}
    operator int8_t()  { return x_ + 100; }
    operator int16_t() { return x_ + 200; }
    operator int32_t() { return x_ + 400; }
};

int8_t  b1 = A(5);
int16_t b2 = A(5);
int32_t b3 = A(5);
On 05/09/2021 17:13, Dmitry A. Kazakov wrote:
On 2021-09-05 17:37, Bart wrote:
could you write 2**127 in any other context than a range specifier,
Silly question:
X : constant := 2**127;
Well, the rules keep changing!
Here you managed to define X with an i128
value without needing to also define a long_long_long_integer type. So
what's the type of X?
without having to pedantically define everything about the types of
the literals involved as well as result types?
Anyway in this case it is not a range specifier, it a definition of an
integer type.
All numeric types, all string types come with literals, naturally.
type Roman_Digit is ('I', 'V', 'X', 'L', 'C', 'D');
type Roman_Number is array (Positive range <>) of Roman_Digit;
Nine : Roman_Number := "IX";
That is why there is no built-in types in Ada. There are some
predefined for convenience types.
Presumably you need to allow built-in literals such as "2", "127", and
operations such as "**", which suggests they are not just a convenience
but a necessity.
On 05/09/2021 11:43, Dmitry A. Kazakov wrote:
On 2021-09-05 12:08, Bart wrote:
However in other cases it won't; here it displays zero:
println 18446744073709551615 + 1
In this case, I need to use casts, eg:
println 18446744073709551615 + u128(1)
[Presumably you need to compile/interpret and run this code?]
[...]
with Ada.Text_IO; use Ada.Text_IO;
procedure Test is
begin
Put_Line (Long_Long_Long_Integer'Image (18446744073709551615 + 1));
end Test;
Prints:
18446744073709551616
No casts needed.
~$ a68g -p "LONG 18446744073709551615 + 1"
+18446744073709551616
No separate compilation/execution needed. You guys seem to like
writing excessive code. There is a reason why some languages are
more productive than others!
On 2021-09-05 18:45, Bart wrote:
On 05/09/2021 17:13, Dmitry A. Kazakov wrote:
On 2021-09-05 17:37, Bart wrote:
could you write 2**127 in any other context than a range specifier,
Silly question:
X : constant := 2**127;
Well, the rules keep changing!
You just do not understand them because you cannot think outside the
box of your language. There are many ways to skin a cat.
Here you managed to define X with an i128 value without needing to
also define a long_long_long_integer type. So what's the type of X?
Universal_Integer
On 05/09/2021 18:22, Dmitry A. Kazakov wrote:
Here you managed to define X with an i128 value without needing to
also define a long_long_long_integer type. So what's the type of X?
Universal_Integer
I tried to create some variables of type 'universal_integer', but it
said that was undefined.
On 2021-09-05 21:36, Bart wrote:
On 05/09/2021 18:22, Dmitry A. Kazakov wrote:
Here you managed to define X with an i128 value without needing to
also define a long_long_long_integer type. So what's the type of X?
Universal_Integer
I tried to create some variables of type 'universal_integer', but it
said that was undefined.
No object can have a universal type. A named constant is not an object.
On 05/09/2021 21:10, Dmitry A. Kazakov wrote:
On 2021-09-05 21:36, Bart wrote:
On 05/09/2021 18:22, Dmitry A. Kazakov wrote:
Here you managed to define X with an i128 value without needing to
also define a long_long_long_integer type. So what's the type of X?
Universal_Integer
I tried to create some variables of type 'universal_integer', but it
said that was undefined.
No object can have a universal type. A named constant is not an object.
So how can I print that X defined as 2**127?
[Presumably you need to compile/interpret and run this code?]
That's usually how it's done.
You had to use a special option "-p".
If I spend 10 minutes adding a
similar option, then I can do the same:
C:\qx>qq -p:"print 2L**512-1"
134078079299425970995740249982058461274793658205923933777235
614437217640300735469768018742981669034276900318581864860508
53753882811946569946433649006084095
Furthermore, I can use double quotes which A68G had some trouble with:
C:\qx>qq -p:"print ""hello"""
hello
This is actually not unusual; gcc can take input from the console [...].
On 2021-09-05 18:44, David Brown wrote:
On 05/09/2021 11:39, Dmitry A. Kazakov wrote:
On 2021-09-05 10:54, David Brown wrote:
On 04/09/2021 17:16, Dmitry A. Kazakov wrote:
On 2021-09-04 15:39, David Brown wrote:
Ada also solves this kind of problem by not allowing comparisons
between
different types. (I don't know how it handles literals - that's
beyond
my rather limited knowledge of the language.)
When operations can be overloaded in the result type that simplifies a
lot. Literals are semantically overloaded parameterless functions. 1
Integer overloads, 1 Unsigned_16, 1 Long_Integer, 1 My_Custom_Integer
etc.
I'm not very keen on overloading in the result type - it feels to me
that it would be too easy to lose track of what is going on, and too
easy to have code that appears identical (same expression, same
variables, same types, etc.) but completely different effects.
Why:
declare
X : T;
Y : S;
begin
Foo (X);
Foo (Y);
is OK, but
declare
X : T := Create;
Y : S := Create;
begin
is not?
(Forgive me if I've misunderstood the Ada syntax here.)
This is creating new instances of a type - that's not the same as
overloading on the return type as a general feature. Certainly if you
first have overloading on return type as a feature of a language, you
can use that for initialisation or object creation. But it is not
necessary.
Overloading is not necessary, only convenient. The argument is that
there is no logical reason to allow it for arguments but not for results.
C++ does give you result type overloading, if you want it, via classes
and conversion operators :
#include <stdint.h>

class A {
    int x_;
public:
    A(int x) : x_(x) {}
    operator int8_t()  { return x_ + 100; }
    operator int16_t() { return x_ + 200; }
    operator int32_t() { return x_ + 400; }
};

int8_t  b1 = A(5);
int16_t b2 = A(5);
int32_t b3 = A(5);
This is a slightly different thing.
But I always admired the C++ mechanism of creating ad-hoc subtypes like
above, in effect:
A
|
int8_t
If extended it could become a very powerful thing, e.g. to create a
common sub/supertype for two unrelated types via implicit type conversions.
A B
| ^
V |
Ad-hoc parent/child
And then, via user-provided implicit conversions:
A -> ad-hoc parent -> B
One could call f(B) with A.
On 05/09/2021 12:08, Bart wrote:
On 05/09/2021 10:39, Dmitry A. Kazakov wrote:
Anyway, isn't it nice to have 1 written as 1 instead of ugly 1ULL?
That is merely a consequence of most C implementations being capped at a
32-bit type.
No, it is not.
You rarely have any need to use the suffixes on literals. It typically
only matters if you are using the literal in an expression where its
natural type would overflow.
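The classic case is a shift, where the result type comes from the
operands rather than from the destination; a hedged sketch:

#include <stdint.h>
#include <stdio.h>

int main(void) {
    /* "1" has type int; with a 32-bit int the next line would be
       undefined behaviour, regardless of the 64-bit destination: */
    /* uint64_t bad = 1 << 40; */

    /* the ULL suffix widens the literal before the shift happens: */
    uint64_t good = 1ULL << 40;
    printf("%llu\n", (unsigned long long)good);
    return 0;
}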
However in other cases it won't; here it displays zero:
println 18446744073709551615 + 1
You complained earlier about how terrible it was for "a + 1" to give 0
when "a" is an 8-bit unsigned modulo type containing 255. Yet here your language does exactly the same thing, just with 64-bit types.
You are
doing the same thing, just drawing your arbitrary lines in different
places - unlike Ada which is consistent and clear.
(And yes, I know C
has an equally arbitrary line and that its size is
implementation-dependent.)
On 06/09/2021 09:10, David Brown wrote:
On 05/09/2021 12:08, Bart wrote:
On 05/09/2021 10:39, Dmitry A. Kazakov wrote:
Anyway, isn't it nice to have 1 written as 1 instead of ugly 1ULL?
That is merely a consequence of most C implementations being capped at a >>> 32-bit type.
No, it is not.
You rarely have any need to use the suffixes on literals. It typically
only matters if you are using the literal in an expression where its
natural type would overflow.
You need to use it when literals in the range of approx 0..2**32 are
used within expressions with 64-bit results. Such as common ones like 1
or 2.
That's because int is capped at 32 bits on every C I've ever used. A
64-bit int would mean literals of 0..2**31-1 or 0..2**32-1 having i64 or
u64 types.
However in other cases it won't; here it displays zero:
println 18446744073709551615 + 1
You complained earlier about how terrible it was for "a + 1" to give 0
when "a" is an 8-bit unsigned modulo type containing 255. Yet here your
language does exactly the same thing, just with 64-bit types.
Yes, because it hits the upper limit of the machine's integer type;
there is no way of representing the next value (without the incredibly wasteful technique of making all 64-bit operations have 128-bit results,
and then you'd complain that u128.max + 1 gives 0 too.).
There is no artificial limit like 255 or 65535. If you're calculating a
hash expression from an 8-bit character value, you don't want a
modulo-256 result just because characters are stored as only 8 bits.
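In C terms, what you want is for the accumulator's width to govern, not
the character's; a small sketch:

#include <stdint.h>
#include <stddef.h>

/* Simple multiplicative string hash. Each 8-bit character widens to
   the 64-bit accumulator, so no modulo-256 truncation creeps in. */
static uint64_t hash_bytes(const uint8_t *s, size_t n) {
    uint64_t h = 5381;
    for (size_t i = 0; i < n; i++)
        h = h * 33 + s[i];
    return h;
}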
Didn't you say elsewhere that values requiring between 33 and 64 bits
to represent were incredibly rare? You don't want ordinary expressions
to overflow when those results are a long way from 2**64.
You are
doing the same thing, just drawing your arbitrary lines in different
places - unlike Ada which is consistent and clear.
One thing I've learned is that Ada is anything but that! In some
contexts, the ordinary rules go out the window: literals can have any
magnitude, results can have any magnitude (apart from being under 2**64
on my machine, and 2**128 on DAK's), ALL literals and intermediate and
final results have the same universal integer type.
But you can't use such a type and such freedoms in any useful contexts.
(And yes, I know C
has an equally arbitrary line and that its size is
implementation-dependent.)
C's arbitrary line hasn't adapted in the 32- to 64-bit migration as
it did in the 16- to 32-bit one.
Which among other things, means limiting multi-character constants to
'ABCD'. Mine go up to 'ABCDEFGHIJKLMNOP' (up to 8 chars is 64 bits,
up to 16 is 128).
On 06/09/2021 12:03, Bart wrote:
C's arbitrary line hasn't adapted in the 32- to 64-bit migration as
it did in the 16- to 32-bit one.
The choice of size to use for "int" has its pros and cons. 64-bit
reduces the risk of overflow from "very rarely a risk" to "extremely
rarely a risk". But the cost is more precious L0 cache space, and
slower code for some operations (especially on earlier 64-bit systems).
Which among other things, means limiting multi-character constants to
'ABCD'. Mine go up to 'ABCDEFGHIJKLMNOP' (up to 8 chars is 64 bits,
up to 16 is 128).
I have never seen - or even heard of - a sensible use of multi-character constants.
On 06/09/2021 12:14, David Brown wrote:
On 06/09/2021 12:03, Bart wrote:
C's arbitrary line hasn't adapted in the 32- to 64-bit migration as
it did in the 16- to 32-bit one.
The choice of size to use for "int" has its pros and cons. 64-bit
reduces the risk of overflow from "very rarely a risk" to "extremely
rarely a risk". But the cost is more precious L0 cache space, and
slower code for some operations (especially on earlier 64-bit systems).
You can apply the same argument to 16 vs 32 bits.
But my use of a default 64 bits applies to in-register calculations; registers are 64 bits anyway, and stack slots are 64 bits too.
So passing or returning 64-bit values is not a real overhead. Using
individual 'int' variables on the stack frame is not much of one either
(and many locals will reside in registers anyway).
So memory issues only come with arrays and structs, and there you
can choose to use narrower types.
But bear in mind also that pointers will be 64 bits anyway, which also
puts pressure on memory.
Which among other things, means limiting multi-character constants to
'ABCD'. Mine go up to 'ABCDEFGHIJKLMNOP' (up to 8 chars is 64 bits,
up to 16 is 128).
I have never seen - or even heard of - a sensible use of multi-character
constants.
Here's an example from my code:
case a.value
when 'PROC' then ...
when 'MODULE' then ...
when 'CMDNAME' then ...
...
case target
when 'X64' then ...
println target:"m" # might display X64
It's using a 64-bit integer instead of a string, which can be compared
more easily. You can use it as a quick and dirty enum value too.
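In C, where multi-character constants stop at 'ABCD' (with an
implementation-defined value), the equivalent trick needs the bytes
packed by hand; a sketch:

#include <stdint.h>
#include <string.h>

/* Pack up to 8 characters into one uint64_t so that short names can
   be compared as a single integer. Byte order is consistent within a
   build, so equality tests are safe. */
static uint64_t pack8(const char *s) {
    uint64_t v = 0;
    size_t n = strlen(s);
    memcpy(&v, s, n > 8 ? 8 : n);
    return v;
}

/* usage: if (pack8(target) == pack8("X64")) ... */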
On 06/09/2021 16:26, Bart wrote:
On 06/09/2021 12:14, David Brown wrote:
I have never seen - or even heard of - a sensible use of multi-character
constants.
Here's example from my code:
case a.value
when 'PROC' then ...
when 'MODULE' then ...
when 'CMDNAME' then ...
...
case target
when 'X64' then ...
println target:"m" # might display X64
It's using a 64-bit integer instead of a string, which can be compared
more easily. You can use it as a quick and dirty enum value too.
I prefer to use proper enums - there is no advantage to making them
dirty, as it certainly won't be faster and you lose all the static
checking benefits of real enumerations (with a good compiler or linter).
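A minimal C illustration of that checking (names invented):

enum target { T_X64, T_ARM64, T_RISCV };

const char *target_name(enum target t) {
    switch (t) {
    case T_X64:   return "X64";
    case T_ARM64: return "ARM64";
    /* with -Wswitch, gcc and clang warn here that T_RISCV is not
       handled - a check that integer codes like 'X64' cannot give */
    }
    return "?";
}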
On 29/08/2021 21:17, James Harris wrote:
On 29/08/2021 20:17, Bart wrote:
On 29/08/2021 20:02, James Harris wrote:
...
On the topic we were discussing, if you have a C parser why not use
it (along with a C preprocessor) to extract the info you need - such
as type definitions and structure offsets - from each target
environment's C header files? That's what I thought you wanted to do
before.
Did you read the rest of my post? A few lines down I linked to a file
which was generated by my compiler. But it cannot do a completely
automatic translation.
Yes, I did. I looked at the two files you linked and even the
definition of the macros which did not convert.
But translating your own sources is a new topic and was NOT what we
were talking about.
Well, you asked why I didn't use my parser to extract that info - but
how do you think that second file got generated?
That was the point: other environments do NOT have configuration files
but they often DO have C headers. To determine the configuration for
those environments you need to get the info from the C headers. And
that's best done by
Cpreprocessor < header | bart_parse env.conf
where bart_parse is your program which parses the output from the C
preprocessor and updates env.conf with the required info.
For this purpose (creating C system header files to go with your own C compiler), you need to end up with an actual set of header files.
Which existing compilers do you look at for information? Ones like gcc
have incredibly elaborate headers, full of compiler-specific built-ins
and attributes and predefined macros.
On 2021-08-29 14:39, James Harris wrote:
That would be fine. You could have an OO HLL store
length, pointer to dispatch table, offset
You do not need length, because it can be deduced from the tag. And a
tag is more universal than a pointer to the table. Pointers do not work
with multiple dispatch, for example.
That's probably a better way to explain it (i.e. in terms of
polymorphism).
One issue is representation. Each OO language (e.g. Ada and C++) could
represent such an object differently. The OS would not necessarily
support the OO layouts of any particular language. And non-OO
languages would also need access.
If the OS is built with OOP in mind, it would make the type tags a part of
its calling convention.
It is just the history of UNIX and DOS/Windows that they grew out of
hobby projects. It need not be this way. E.g. in VMS all languages used the
same convention. You could call whatever function from any language
straight away.
Say you had programs written in Ada, C and C++ all accessing the same
open file. A change in file position or max offset made by one of them
would need to be propagated to the other two. The C code could update
the system structure directly. How would Ada and C++ keep their
objects in sync?
Class-wide is only the interface. It will dispatch to the FS
implementation which will use a specific type.
C++ unsigned
|
V
API (<unsigned-tag>, unsigned-value)
|
| dispatch to NTFS
V
NTFS uint64_t (unsigned-value) converts to the native type
|
V
API (<uint64_t-tag>, uint64_t-value)
|
V
Ada Integer (uint64_t-value) converts to the desired type
On 29/08/2021 22:10, Bart wrote:
Well, you asked why I didn't use my parser to extract that info, but
how you think that second file got generated!
I suggested using your parser to /extract/ data, not translate header
files. And I suggested using it on the C headers of target environments,
not on your own headers.
I tend to avoid saying what I would do as each person's goals are
different but as you keep thinking of something other than what I am
saying I think I'll have to in this case so as to be as clear as I can.
If I wanted to get data types and struct layouts for a target
environment and the only machine-readable description of that
environment was in C headers I'd
1. create a .c file with the required #includes
2. run it through the target's preprocessor
3. parse the output to extract the data I needed
4. store the extracted data in a configuration file for the target (see
the probe sketch after this list)
5. use the configuration file to set up my own types and structures for
the target environment.
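Where you do have access to a target, steps 2-4 can be cross-checked by
compiling and running a small probe with the target's own compiler; a
minimal sketch, assuming a POSIX target and inventing the config format:

/* probe.c - emit one fragment of env.conf for this target */
#include <stdio.h>
#include <stddef.h>
#include <sys/stat.h>

int main(void) {
    printf("struct_stat.size = %zu\n", sizeof(struct stat));
    printf("struct_stat.st_size.offset = %zu\n",
           offsetof(struct stat, st_size));
    printf("struct_stat.st_size.size = %zu\n",
           sizeof(((struct stat *)0)->st_size));
    return 0;
}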
Further, since I may not even have access to a given target environment,
if the above process was unable to parse anything it needed to I'd have
the parser produce a report of what it could not handle for sending back
to me so I could update the parser or take remedial steps.
At the end of the day, I thought you were lamenting that there's no
master config file and all info is in C headers. The above steps are
intended to remedy that and create the master config file I thought you wanted.
...
That was the point: other environments do NOT have configuration
files but they often DO have C headers. To determine the
configuration for those environments you need to get the info from
the C headers. And that's best done by
Cpreprocessor < header | bart_parse env.conf
where bart_parse is your program which parses the output from the C
preprocessor and updates env.conf with the required info.
For this purpose (creating C system header files to go with your own C
compiler), you need to end up with an actual set of header files.
Which existing compilers do you look at for information? Ones like gcc
have incredibly elaborate headers, full of compiler-specific built-ins
and attributes and predefined macros.
Even if a compiler has a hundred versions of stat.h and they depend on a thousand other files it does not matter. All you'd need to do, AISI, is
run the preprocessor in the correct environment. It will process the
many headers and produce _one_ copy of the info you need.
On 29/08/2021 14:11, Dmitry A. Kazakov wrote:
If the OS is built with OOP in mind, it would make the type tags a part of
its calling convention.
That may not be practical. Different languages - and even different
compilers - can implement OOP with different memory layouts.
It is just the history of UNIX and DOS/Windows that they grew out of
hobby projects. It need not be this way. E.g. in VMS all languages
used the same convention. You could call whatever function from any
language straight away.
Do you have a link which shows how that was implemented? I can only find overviews ATM.
Say you had programs written in Ada, C and C++ all accessing the same
open file. A change in file position or max offset made by one of
them would need to be propagated to the other two. The C code could
update the system structure directly. How would Ada and C++ keep
their objects in sync?
Class-wide is only the interface. It will dispatch to the FS
implementation which will use a specific type.
C++ unsigned
|
V
API (<unsigned-tag>, unsigned-value)
|
| dispatch to NTFS
V
NTFS uint64_t (unsigned-value) converts to the native type
|
V
API (<uint64_t-tag>, uint64_t-value)
|
V
Ada Integer (uint64_t-value) converts to the desired type
I am not sure what that diagram is meant to show. (Should the arrows be bidirectional, for example?)
Alternatively, if you are saying that each language implementation would
have its own API/ABI which would allow that language to /model/ OS
objects but translate them to and from OS structures ... then I would agree.
Incidentally, perhaps your diagram shows the value of having a tag or
pointer separate from an object's data - something mentioned in the
"small tuples" thread.
On 2021-09-09 13:44, James Harris wrote:
On 29/08/2021 14:11, Dmitry A. Kazakov wrote:
If the OS is built with OOP in mind, it would make the type tags a part
of its calling convention.
That may not be practical. Different languages - and even different
compilers - can implement OOP with different memory layouts.
Irrelevant. The OS defines its API, conventions, and representations.
Whatever else a language does, the OS does not care.
It is just the history of UNIX and DOS/Windows that they grew out of
hobby projects. It need not be this way. E.g. in VMS all languages
used the same convention. You could call whatever function from any
language straight away.
Do you have a link which shows how that was implemented? I can only
find overviews ATM.
That would not help you, because the VAX-11 and Alpha architectures are
unfortunately dead.
Say you had programs written in Ada, C and C++ all accessing the
same open file. A change in file position or max offset made by one
of them would need to be propagated to the other two. The C code
could update the system structure directly. How would Ada and C++
keep their objects in sync?
Class-wide is only the interface. It will dispatch to the FS
implementation which will use a specific type.
C++ unsigned
|
V
API (<unsigned-tag>, unsigned-value)
|
| dispatch to NTFS
V
NTFS uint64_t (unsigned-value) converts to the native type
|
V
API (<uint64_t-tag>, uint64_t-value)
|
V
Ada Integer (uint64_t-value) converts to the desired type
I am not sure what that diagram is meant to show. (Should the arrows
be bidirectional, for example?)
Nope, you said C++ writes a value and Ada reads it.
On 09/09/2021 15:14, Dmitry A. Kazakov wrote:
On 2021-09-09 13:44, James Harris wrote:
On 29/08/2021 14:11, Dmitry A. Kazakov wrote:
It is just the history of UNIX and DOS/Windows that they grew out of
hobby projects. It need not be this way. E.g. in VMS all
languages used the same convention. You could call whatever function
from any language straight away.
Do you have a link which shows how that was implemented? I can only
find overviews ATM.
That would not help you, because the VAX-11 and Alpha architectures are
unfortunately dead.
No worries. But just because a system is no longer in use does not mean
that it contained only bad ideas.
Say you had programs written in Ada, C and C++ all accessing the
same open file. A change in file position or max offset made by one
of them would need to be propagated to the other two. The C code
could update the system structure directly. How would Ada and C++
keep their objects in sync?
Class-wide is only the interface. It will dispatch to the FS
implementation which will use a specific type.
C++ unsigned
|
V
API (<unsigned-tag>, unsigned-value)
|
| dispatch to NTFS
V
NTFS uint64_t (unsigned-value) converts to the native type
|
V
API (<uint64_t-tag>, uint64_t-value)
|
V
Ada Integer (uint64_t-value) converts to the desired type
I am not sure what that diagram is meant to show. (Should the arrows
be bidirectional, for example?)
Nope, you said C++ writes a value and Ada reads it.
Nope, I said they both access the file.
On 2021-09-02 18:47, James Harris wrote:
On 02/09/2021 17:01, Bart wrote:
On 02/09/2021 15:52, Dmitry A. Kazakov wrote:
On 2021-09-02 14:50, Bart wrote:
...
A fourth might be that arithmetic operations may not work directly
on all of those (but on x64 they do).
And?
OK. So according to you these types are no problem at all. You can
use a u17 type just as easily as a 16-bit or 32-bit type.
Perhaps you'd like to show some actual assembly code for this fragment:
u17 a,b,c
a := b + c
I'd be particularly interested in how a,b,c are laid out in memory.
I'd be interested to see Dmitry's assembly code - but I suspect he'll
not answer that part.
Why do you want it?
What is supposed to happen on overflow
What overflow? It is a modular number; they never overflow. It is up
to you to implement arithmetic correctly using appropriate instructions.
and are there any particular optimisation goals?
When doing modular arithmetic you must minimize checks by proving that
the intermediates are correct regardless the arguments. Say, you decided
to implement arithmetic using 32-bit machine numbers. Then with b+c you
have nothing to worry about. You load b and c into 32-bit registers, you
sum them. Then you verify if the result is greater than 2**17-1, if yes,
you subtract 2**17. Difficult?
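Rendered as C, that sequence is something like this (a sketch, with
invented names):

#include <stdint.h>

#define U17_MODULUS (1u << 17)

/* u17 addition using 32-bit machine arithmetic: since b and c are both
   below 2**17, one conditional subtraction renormalises the sum. */
static uint32_t u17_add(uint32_t b, uint32_t c) {
    uint32_t sum = b + c;
    if (sum > U17_MODULUS - 1)
        sum -= U17_MODULUS;
    return sum;
}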
My thinking was that if Bart is setting the problem he is at liberty to define the parameters thereof.
He did not need to follow your preferred
concept of unsigned numbers.
On 02/09/2021 18:47, James Harris wrote:
On 02/09/2021 17:01, Bart wrote:
Perhaps you'd like to show some actual assembly code for this fragment:
u17 a,b,c
a := b + c
I'd be particularly interested in how a,b,c are laid out in memory.
I'd be interested to see Dmitry's assembly code - but I suspect he'll
not answer that part.
Presumably it is roughly the same as you'd get in C with :
uint32_t a, b, c;
a = (b + c) % (1u << 17);
Types in a high level language are not constructs in assembly. They
don't have to correspond to matching hardware or assembly-level
features. Just as a "bool" or an enumerated type in C is going to be
stored in a register or memory in exactly the same way as a the
processor might store a number, so a "u17" type (assuming that means a 0
.. 2 ^ 17 - 1 modulo type) will be stored in the same way as some
integer. That's likely to be the same storage as a uint32_t. (Though
one processor I use has 20-bit registers, which would be more efficient
than using two of its 16-bit registers.)
On 02/09/2021 18:12, Dmitry A. Kazakov wrote:
On 2021-09-02 18:47, James Harris wrote:
On 02/09/2021 17:01, Bart wrote:
On 02/09/2021 15:52, Dmitry A. Kazakov wrote:
On 2021-09-02 14:50, Bart wrote:
...
A fourth might be that arithmetic operations may not work directly
on all of those (but on x64 they do).
And?
OK. So according to you these types are no problem at all. You can
use a u17 type just as easily as a 16-bit or 32-bit type.
Perhaps you'd like to show some actual assembly code for this
fragment:
u17 a,b,c
a := b + c
I'd be particularly interested in how a,b,c are laid out in memory.
I'd be interested to see Dmitry's assembly code - but I suspect he'll
not answer that part.
Why do you want it?
I didn't say I wanted it. I said I'd be interested to see it as I
suspected you'd not answer that bit. I was right, wasn't I!
What is supposed to happen on overflow
What overflow? It is a modular number; they never overflow. It is
up to you to implement arithmetic correctly using appropriate
instructions.
My thinking was that if Bart is setting the problem he is at liberty to define the parameters thereof. He did not need to follow your preferred concept of unsigned numbers.
In fact, I asked him what he meant because using modular arithmetic
looked too easy.
and are there any particular optimisation goals?
When doing modular arithmetic you must minimize checks by proving that
the intermediates are correct regardless the arguments. Say, you
decided to implement arithmetic using 32-bit machine numbers. Then
with b+c you have nothing to worry about. You load b and c into 32-bit
registers, you sum them. Then you verify if the result is greater than
2**17-1, if yes, you subtract 2**17. Difficult?
Again, that looked too easy. I suspected Bart had some other criteria in mind.
On 02/09/2021 19:56, David Brown wrote:
On 02/09/2021 18:47, James Harris wrote:
On 02/09/2021 17:01, Bart wrote:
...
Perhaps you'd like to show some actual assembly code for this
fragment:
u17 a,b,c
a := b + c
I'd be particularly interested in how a,b,c are laid out in memory.
I'd be interested to see Dmitry's assembly code - but I suspect he'll
not answer that part.
Presumably it is roughly the same as you'd get in C with :
uint32_t a, b, c;
a = (b + c) % (1u << 17);
Yes, could be.
Types in a high level language are not constructs in assembly. They
don't have to correspond to matching hardware or assembly-level
features. Just as a "bool" or an enumerated type in C is going to be
stored in a register or memory in exactly the same way as the
processor might store a number, so a "u17" type (assuming that means a 0
.. 2 ^ 17 - 1 modulo type) will be stored in the same way as some
integer. That's likely to be the same storage as a uint32_t. (Though
one processor I use has 20-bit registers, which would be more efficient
than using two of its 16-bit registers.)
That's weird! Which processor is it?
On 2021-09-25 20:38, James Harris wrote:
My thinking was that if Bart is setting the problem he is at liberty
to define the parameters thereof.
No, he must stay within the rational framework. Redefining mathematics
is irrational.
He did not need to follow your preferred concept of unsigned numbers.
Different categories of numbers exist independently of any preferences.
He must say what he means in commonly accepted terms. That will clarify
and determine everything.
On 25/09/2021 19:49, Dmitry A. Kazakov wrote:
On 2021-09-25 20:38, James Harris wrote:
My thinking was that if Bart is setting the problem he is at liberty
to define the parameters thereof.
No, he must stay within the rational framework. Redefining mathematics
is irrational.
AIUI Bart specified unsigned numbers, not modular arithmetic.
Besides, computing is related to but not the same as mathematics.
He did not need to follow your preferred concept of unsigned numbers.
Different categories of numbers exist independently of any
preferences. He must say what he means in commonly accepted terms.
That will clarify and determine everything.
Hence my request for specification.
On 2021-10-01 19:54, James Harris wrote:
On 25/09/2021 19:49, Dmitry A. Kazakov wrote:
On 2021-09-25 20:38, James Harris wrote:
My thinking was that if Bart is setting the problem he is at liberty
to define the parameters thereof.
No, he must stay within the rational framework. Redefining
mathematics is irrational.
AIUI Bart specified unsigned numbers, not modular arithmetic.
There is no such thing. He might mean "natural" numbers, then he should
have said so. Using the term "unsigned" is misguided because it is C's nomenclature. There is nothing natural in C... (:-))
1. There is the set N of natural numbers which is a subset of the set of integer numbers Z. There is nothing special about N that could
distinguish it from any other subrange of Z, like 100..1000. Implement
ranges and be done with that.
2. There are modular numbers of modulo K=1, 2, ...
Implementation of natural numbers using modular machine arithmetic for
the one quite idiotic purpose of squeezing one more bit of
representation is possible, but expensive.
This is why reasonable languages do not bother
with that mess. They just provide modular numbers of modulo 2**K using machine instructions. It is as simple and efficient as shoe polish.
All this is especially bizarre because he is ready to drop all small
numeric types and go full 64-bit on 16-bit microcontrollers. Come on!
On 01/10/2021 19:54, Dmitry A. Kazakov wrote:
On 2021-10-01 19:54, James Harris wrote:
On 25/09/2021 19:49, Dmitry A. Kazakov wrote:
On 2021-09-25 20:38, James Harris wrote:
My thinking was that if Bart is setting the problem he is at
liberty to define the parameters thereof.
No, he must stay within the rational framework. Redefining
mathematics is irrational.
AIUI Bart specified unsigned numbers, not modular arithmetic.
There is no such thing. He might mean "natural" numbers, then he
should have said so. Using the term "unsigned" is misguided because it
is C's nomenclature. There is nothing natural in C... (:-))
Fortunately I'm not a mathematician so can get on with things without
getting bogged down with such stuff.
On 01/10/2021 21:19, Bart wrote:
On 01/10/2021 19:54, Dmitry A. Kazakov wrote:
On 2021-10-01 19:54, James Harris wrote:
On 25/09/2021 19:49, Dmitry A. Kazakov wrote:
On 2021-09-25 20:38, James Harris wrote:
My thinking was that if Bart is setting the problem he is at
liberty to define the parameters thereof.
No, he must stay within the rational framework. Redefining
mathematics is irrational.
AIUI Bart specified unsigned numbers, not modular arithmetic.
There is no such thing. He might mean "natural" numbers, then he
should have said so. Using the term "unsigned" is misguided because
it is C's nomenclature. There is nothing natural in C... (:-))
Fortunately I'm not a mathematician so can get on with things without
getting bogged down with such stuff.
Be careful not to accept Dmitry's berating! Programming is not
mathematics: it's both a superset and a subset thereof. Where we use
small integers, mathematics and programming are consistent, and that
leads to false expectations of wider harmony; but that harmony is
misleading when values approach the limits of a certain type. Your use
of "unsigned" is better than Dmitry's "natural" for that reason.
Natural integers are a theoretical concept useful in mathematics. By
contrast, unsigned integers, because they are familiar from their use
in C, and especially if we can assume a certain representation and,
thus, behaviour, are an engineering concept better suited to
programming. As a consequence IMO they can help towards a clear
specification.
On 03/10/2021 09:00, James Harris wrote:
On 01/10/2021 21:19, Bart wrote:
On 01/10/2021 19:54, Dmitry A. Kazakov wrote:
On 2021-10-01 19:54, James Harris wrote:
On 25/09/2021 19:49, Dmitry A. Kazakov wrote:
On 2021-09-25 20:38, James Harris wrote:
My thinking was that if Bart is setting the problem he is at
liberty to define the parameters thereof.
No, he must stay within the rational framework. Redefining
mathematics is irrational.
AIUI Bart specified unsigned numbers, not modular arithmetic.
There is no such thing. He might mean "natural" numbers, then he
should have said so. Using the term "unsigned" is misguided because
it is C's nomenclature. There is nothing natural in C... (:-))
Fortunately I'm not a mathematician so can get on with things without
getting bogged down with such stuff.
Be careful not to accept Dmitry's berating! Programming is not
mathematics: it's both a superset and a subset thereof. Where we use
small integers, mathematics and programming are consistent, and that
leads to false expectations of wider harmony; but that harmony is
misleading when values approach the limits of a certain type. Your use
of "unsigned" is better than Dmitry's "natural" for that reason.
Natural integers are a theoretical concept useful in mathematics. By
contrast, unsigned integers, because they are familiar from their use
in C, and especially if we can assume a certain representation and,
thus, behaviour, are an engineering concept better suited to
programming. As a consequence IMO they can help towards a clear
specification.
I'm sorry, but that is just wrong. The only explanation I have is that
you think "mathematics" means "arithmetic I learned in primary school".
The way integer types work in programming is all defined
mathematically.
It doesn't matter whether you are looking at
implementations in terms of bits and representations, or the higher
level usage of the integers. It doesn't matter if you are talking
signed, unsigned, wrapping, overflowing, saturating, or anything else -
it's all mathematics. Even the concept of undefined behaviour is
solidly embedded in mathematics.
Obviously some of the behaviours and characteristics of the integer
types in any real programming language will differ from those of
everyday numbers (mostly due to size limits), but that does not mean
they are not defined mathematically.
On 03/10/2021 09:00, James Harris wrote:
On 01/10/2021 21:19, Bart wrote:
On 01/10/2021 19:54, Dmitry A. Kazakov wrote:
On 2021-10-01 19:54, James Harris wrote:
On 25/09/2021 19:49, Dmitry A. Kazakov wrote:
On 2021-09-25 20:38, James Harris wrote:
My thinking was that if Bart is setting the problem he is at
liberty to define the parameters thereof.
No, he must stay within the rational framework. Redefining
mathematics is irrational.
AIUI Bart specified unsigned numbers, not modular arithmetic.
There is no such thing. He might mean "natural" numbers, then he
should have said so. Using the term "unsigned" is misguided because
it is C's nomenclature. There is nothing natural in C... (:-))
Fortunately I'm not a mathematician so can get on with things without
getting bogged down with such stuff.
Be careful not to accept Dmitry's berating! Programming is not
mathematics: it's both a superset and a subset thereof. Where we use
small integers, mathematics and programming are consistent, and that
leads to false expectations of wider harmony; but that harmony is
misleading when values approach the limits of a certain type. Your use
of "unsigned" is better than Dmitry's "natural" for that reason.
Natural integers are a theoretical concept useful in mathematics. By
contrast, unsigned integers, because they are familiar from their use
in C, and especially if we can assume a certain representation and,
thus, behaviour, are an engineering concept better suited to
programming. As a consequence IMO they can help towards a clear
specification.
I'm sorry, but that is just wrong. The only explanation I have is that
you think "mathematics" means "arithmetic I learned in primary school".
The way integer types work in programming is all defined
mathematically. It doesn't matter whether you are looking at
implementations in terms of bits and representations, or the higher
level usage of the integers. It doesn't matter if you are talking
signed, unsigned, wrapping, overflowing, saturating, or anything else -
it's all mathematics. Even the concept of undefined behaviour is
solidly embedded in mathematics.
On 03/10/2021 19:06, David Brown wrote:
On 03/10/2021 09:00, James Harris wrote:
On 01/10/2021 21:19, Bart wrote:
On 01/10/2021 19:54, Dmitry A. Kazakov wrote:
On 2021-10-01 19:54, James Harris wrote:
On 25/09/2021 19:49, Dmitry A. Kazakov wrote:
On 2021-09-25 20:38, James Harris wrote:
My thinking was that if Bart is setting the problem he is at
liberty to define the parameters thereof.
No, he must stay within the rational framework. Redefining
mathematics is irrational.
AIUI Bart specified unsigned numbers, not modular arithmetic.
There is no such thing. He might mean "natural" numbers, then he
should have said so. Using the term "unsigned" is misguided because
it is C's nomenclature. There is nothing natural in C... (:-))
Fortunately I'm not a mathematician so can get on with things without
getting bogged down with such stuff.
Be careful not to accept Dmitry's berating! Programming is not
mathematics: it's both a superset and a subset thereof. Where we use
small integers, mathematics and programming are consistent, and that
leads to false expectations of wider harmony; but that harmony is
misleading when values approach the limits of a certain type. Your use
of "unsigned" is better than Dmitry's "natural" for that reason.
Natural integers are a theoretical concept useful in mathematics. By
contrast, unsigned integers, because they are familiar from their use
in C, and especially if we can assume a certain representation and,
thus, behaviour, are an engineering concept better suited to
programming. As a consequence IMO they can help towards a clear
specification.
I'm sorry, but that is just wrong. The only explanation I have is that
you think "mathematics" means "arithmetic I learned in primary school".
And what's wrong with that? AFAICS arithmetic is exactly what a
processor is designed to do.
It's no different from doing sums with an abacus which uses a fixed
number of digits.
How do signed, unsigned, wrapping, overflow, saturating etc work with an abacus?
The way integer types work in programming is all defined
mathematically.
How they work in a processor depends on the ALU design. Most I've looked
at work the same way. Departing significantly from that in a language
can be expensive as you'd have to emulate the behaviour.
It doesn't matter whether you are looking at
implementations in terms of bits and representations, or the higher
level usage of the integers. It doesn't matter if you are talking
signed, unsigned, wrapping, overflowing, saturating, or anything else -
it's all mathematics. Even the concept of undefined behaviour is
solidly embedded in mathematics.
Personally I'd prefer if mathematics kept its nose out of things.
Especially in programming language design with its type theory and
lambda calculus and all the rest.
You only end up with exotic FP languages that only professors of
computer science can understand.
Obviously some of the behaviours and characteristics of the integer
types in any real programming language will differ from those of
everyday numbers (mostly due to size limits), but that does not mean
they are not defined mathematically.
On 03/10/2021 19:06, David Brown wrote:
The way integer types work in programming is all defined
mathematically. It doesn't matter whether you are looking at
implementations in terms of bits and representations, or the higher
level usage of the integers. It doesn't matter if you are talking
signed, unsigned, wrapping, overflowing, saturating, or anything else -
it's all mathematics. Even the concept of undefined behaviour is
solidly embedded in mathematics.
I have to disagree. For sure, one can express computer arithmetic in mathematical terms but because of the limits imposed by fixed
representations there will be many caveats which are not normally
present in mathematics.
But as everyone here
knows, that is not always the case, as was shown in the example I saw discussed recently of
255 + 1
What that results in is decided by what I would call 'engineering',
and not by the normal rules of mathematics.
On 03/10/2021 19:06, David Brown wrote:
On 03/10/2021 09:00, James Harris wrote:
On 01/10/2021 21:19, Bart wrote:
On 01/10/2021 19:54, Dmitry A. Kazakov wrote:
On 2021-10-01 19:54, James Harris wrote:
On 25/09/2021 19:49, Dmitry A. Kazakov wrote:
On 2021-09-25 20:38, James Harris wrote:
My thinking was that if Bart is setting the problem he is at
liberty to define the parameters thereof.
No, he must stay within the rational framework. Redefining
mathematics is irrational.
AIUI Bart specified unsigned numbers, not modular arithmetic.
There is no such thing. He might mean "natural" numbers, then he
should have said so. Using the term "unsigned" is misguided because
it is C's nomenclature. There is nothing natural in C... (:-))
Fortunately I'm not a mathematician so can get on with things without
getting bogged down with such stuff.
Be careful not to accept Dmitry's berating! Programming is not
mathematics: it's both a superset and a subset thereof. Where we use
small integers, mathematics and programming are consistent, and that
leads to false expectations of wider harmony; but that harmony is
misleading when values approach the limits of a certain type. Your use
of "unsigned" is better than Dmitry's "natural" for that reason.
Natural integers are a theoretical concept useful in mathematics. By
contrast, unsigned integers, because they are familiar from their use
in C, and especially if we can assume a certain representation and,
thus, behaviour, are an engineering concept better suited to
programming. As a consequence IMO they can help towards a clear
specification.
I'm sorry, but that is just wrong. The only explanation I have is that
you think "mathematics" means "arithmetic I learned in primary school".
No need to apologise! Disagreement is the grist of discussion!
The way integer types work in programming is all defined
mathematically. It doesn't matter whether you are looking at
implementations in terms of bits and representations, or the higher
level usage of the integers. It doesn't matter if you are talking
signed, unsigned, wrapping, overflowing, saturating, or anything else -
it's all mathematics. Even the concept of undefined behaviour is
solidly embedded in mathematics.
I have to disagree. For sure, one can express computer arithmetic in mathematical terms but because of the limits imposed by fixed
representations there will be many caveats which are not normally
present in mathematics. Two cases:
1. Integer arithmetic where all values - including intermediate results
- remain in range for the data type. In this, the computer implements
normal mathematics.
2. Integer arithmetic where either a result or an intermediate value
does not fit in the range assigned. For these a decision has to be made
(by hardware, by language or by compiler) as to what to do with the non-compliant value. As you say, there are various options but they have
to be cast semantically in terms of "if this happens then do that"
rather than following the normal rules of mathematics. Worse, exactly
where the limits apply can even depend on implementation.
Essentially, you and I appear to have a difference over what one should
see as 'mathematics' but I don't think we disagree over substance.
Unfortunately, many programming tutorials encourage programmers to
simply assume that any value is 'large enough' and so will behave
according to the normal rules of mathematics. But as everyone here
knows, that is not always the case, as was shown in the example I saw discussed recently of
255 + 1
What that results in is decided by what I would call 'engineering',
and not by the normal rules of mathematics.
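A minimal C illustration of that 255 + 1 case, assuming an 8-bit
unsigned type:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint8_t x = 255;
    x = x + 1;          /* the addition happens in int; storing the
                           result back into x reduces it modulo 256 */
    printf("%u\n", x);  /* prints 0 */
    return 0;
}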
On 03/10/2021 20:27, Bart wrote:
Processors are designed to do many things. Exactly duplicating standard mathematical integers is not one of those things. Being usable to model
a limited version of those integers - following somewhat different mathematical rules and definitions - /is/ one of those things.
Personally I'd prefer if mathematics kept its nose out of things.
Especially in programming language design with its type theory and
lambda calculus and all the rest.
Yes, I know you are happy with a duct-tape and string solution - if it
looks okay, and tests okay, it's okay by you. Many of us prefer a more
solid theoretical and mathematical foundation to what we do - we'd
rather /know/ it is okay, in addition to testing that it is okay (since
we all get our sums wrong occasionally).
No. The point of having mathematicians, computer scientists, and other
more theoretical people involved is so that you have the right base for
what you are doing. /Then/ you can let more practical-minded (but
perhaps more problem-focused, or user-focused) people build on it.
On 03/10/2021 22:20, James Harris wrote:
1. Integer arithmetic where all values - including intermediate results
- remain in range for the data type. In this, the computer implements
normal mathematics.
2. Integer arithmetic where either a result or an intermediate value
does not fit in the range assigned. For these a decision has to be made
(by hardware, by language or by compiler) as to what to do with the
non-compliant value. As you say, there are various options but they have
to be cast semantically in terms of "if this happens then do that"
rather than following the normal rules of mathematics. Worse, exactly
where the limits apply can even depend on implementation.
And it is all defined mathematically.
We are talking finite sets with partial operations (for C-style signed integers) or closed operations (for C-style unsigned integers), rather
than infinite sets, but it is all mathematics.
Mathematics on standard integers doesn't define 1/0. Mathematics on a
finite set for C-style signed integers leaves a lot more values
undefined on more operations. It doesn't mean it is not mathematical.
Essentially, you and I appear to have a difference over what one should
see as 'mathematics' but I don't think we disagree over substance.
Yes. But then, I am mathematically trained, and know a lot more about
what it means than most people, who usually think of school-level sums
and possibly weird stuff involving letters instead of numbers.
Unfortunately, many programming tutorials encourage programmers to
simply assume that any value is 'large enough' and so will behave
according to the normal rules of mathematics. But as everyone here
knows, that is not always the case, as was shown in the example I saw
discussed recently of
255 + 1
What that results in is decided by what I would call 'engineering',
and not by the normal rules of mathematics.
Engineering is about applying the mathematical (and perhaps physical, chemical, etc.) laws to practical situations. An engineer who does not understand that there is a mathematical basis for what they do is in the wrong profession. (I certainly don't mean that they should understand
the mathematics involved - but they should understand that there /is/ mathematics involved, and that the mathematics is what justifies the
rules and calculations they apply.)
On 2021-10-03 22:20, James Harris wrote:
On 03/10/2021 19:06, David Brown wrote:
The way integer types work in programming is all defined
mathematically. It doesn't matter whether you are looking at
implementations in terms of bits and representations, or the higher
level usage of the integers. It doesn't matter if you are talking
signed, unsigned, wrapping, overflowing, saturating, or anything else -
it's all mathematics. Even the concept of undefined behaviour is
solidly embedded in mathematics.
I have to disagree. For sure, one can express computer arithmetic in
mathematical terms but because of the limits imposed by fixed
representations there will be many caveats which are not normally
present in mathematics.
You took it upside down. Any computer arithmetic is an approximation
(model) of some mathematical structure. There is simply nothing else
and cannot be anything else. If, in some very improbable case,
something really new were discovered, it would be studied using the
mathematical apparatus, not by buggy code.
As a rule of thumb take that you will never find anything new, only your
bugs and errors.
Note the word *model*. Where the model becomes inadequate certain well-defined actions are fired at compile time (desirable) or as a last resort at run time.
But as everyone here knows, that is not always the case, as was shown
in the example I saw discussed recently of
255 + 1
What that results in is decided by what I would call 'engineering',
and not by the normal rules of mathematics.
It is *always* decided by the model. The engineer selects the best
fitting model for the case at hand using various criteria of choice
(performance, resource limitations, lack of time and qualification,
economic viability, maintainability etc).
The model is normally adequate for all inputs considered legal. Illegal
inputs are detected and processed, e.g. by handling exceptions. They
are especially a subject of risk estimation.
I just have a different way of treating numeric types. So i64 is a
signed integer type, and i8 i16 i32 are just narrower, storage versions
of the /same type/.
On 03/10/2021 23:05, David Brown wrote:
On 03/10/2021 20:27, Bart wrote:
Processors are designed to do many things. Exactly duplicating standard
mathematical integers is not one of those things. Being usable to model
a limited version of those integers - following somewhat different
mathematical rules and definitions - /is/ one of those things.
No, it's just arithmetic with a limited number of digits. And usually in binary.
It's engineering, not maths. But of course you can apply maths to anything.
Personally I'd prefer if mathematics kept its nose out of things.
Especially in programming language design with its type theory and
lambda calculus and all the rest.
Yes, I know you are happy with a duct-tape and string solution -
Look at the recent thread on clc where I compare my 'tabledata' feature
for defining parallel data sets with C's X-macros to do the same job.
Which one was more analogous to using duct-tape and string?!
if it
looks okay, and tests okay, it's okay by you. Many of us prefer a more
solid theoretical and mathematical foundation to what we do - we'd
rather /know/ it is okay, in addition to testing that it is okay (since
we all get our sums wrong occasionally).
As I said, it's engineering. And also, for programming languages,
aesthetic design.
It's not that easy devising languages that are simple, clear and easy to reason about. Far better than pages of arcane symbols.
(Look at Knuth's MIX language. Being an academic doesn't mean you're a
whizz at language design.)
No. The point of having mathematicians, computer scientists, and other
more theoretical people involved is so that you have the right base for
what you are doing. /Then/ you can let more practical-minded (but
perhaps more problem-focused, or user-focused) people build on it.
Fine, let the mathematicians come up with the formulae and algorithms,
but stay out of how I want to design /my/ language.
Most of the maths I did at school (pure and applied maths) did come in
very useful for my work, so I appreciate some of it.
But not when I'm brow-beaten with it by the likes of DAK.
On 03/10/2021 23:06, Dmitry A. Kazakov wrote:
On 2021-10-03 22:20, James Harris wrote:
On 03/10/2021 19:06, David Brown wrote:
The way integer types work in programming is all defined
mathematically. It doesn't matter whether you are looking at
implementations in terms of bits and representations, or the higher
level usage of the integers. It doesn't matter if you are talking
signed, unsigned, wrapping, overflowing, saturating, or anything else -
it's all mathematics. Even the concept of undefined behaviour is
solidly embedded in mathematics.
I have to disagree. For sure, one can express computer arithmetic in
mathematical terms but because of the limits imposed by fixed
representations there will be many caveats which are not normally
present in mathematics.
You took it upside down. Any computer arithmetic is an approximation
(model) of some mathematical structure. There is simply nothing else
and cannot be anything else. Just because in a very improbable case
that something really new is discovered, it is studied using the
mathematical apparatus, not by buggy code.
As a rule of thumb take that you will never find anything new, only
your bugs and errors.
Note the word *model*. Where the model becomes inadequate certain
well-defined actions are fired at compile time (desirable) or as a
last resort at run time.
I'm not sure I agree with any of that! Integer arithmetic does not
generate approximations.
In a digital computer all results on integer
operands are precisely defined.
On 2021-10-04 10:29, James Harris wrote:
I'm not sure I agree with any of that! Integer arithmetic does not
generate approximations.
Approximation means that the model is inexact, integers are
incomputable, so you must give up something, typically the range.
The case of real numbers is more showing. There is the fixed-point model
with exact addition and multiplication and there is the floating-point
model with all operations inexact and even non-associative.
On 04/10/2021 10:15, Dmitry A. Kazakov wrote:
On 2021-10-04 10:29, James Harris wrote:
...
I'm not sure I agree with any of that! Integer arithmetic does not
generate approximations.
Approximation means that the model is inexact, integers are
incomputable, so you must give up something, typically the range.
The case of real numbers is more showing. There is the fixed-point
model with exact addition and multiplication and there is the
floating-point model with all operations inexact and even
non-associative.
Non-associativity can also apply to integers. Consider
A + B + C
Even in such a simple expression if overflow of intermediate results is
to be detected then (A + B) + C is not A + (B + C).
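A small C sketch of that effect, using a hypothetical add_checked
helper that reports overflow instead of performing it:

#include <limits.h>
#include <stdbool.h>

/* Signed addition that reports overflow instead of wrapping. */
static bool add_checked(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;   /* would overflow */
    *sum = a + b;
    return true;
}

/* With A = INT_MAX, B = 1, C = -1: (A + B) + C fails at the first
   step, while A + (B + C) succeeds with result INT_MAX - so the
   checked operation is not associative. */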
On 04/09/2021 13:35, Bart wrote:
...
I just have a different way of treating numeric types. So i64 is a
signed integer type, and i8 i16 i32 are just narrower, storage
versions of the /same type/.
That's an intriguing comment. Dmitry and I once had a good argument
about what constitutes a type.
Would you accept that i8, i16 etc are different concrete types even if
they are the same abstract type?
On 03/10/2021 23:14, David Brown wrote:
On 03/10/2021 22:20, James Harris wrote:
...
1. Integer arithmetic where all values - including intermediate results
- remain in range for the data type. In this, the computer implements
normal mathematics.
2. Integer arithmetic where either a result or an intermediate value
does not fit in the range assigned. For these a decision has to be made
(by hardware, by language or by compiler) as to what to do with the
non-compliant value. As you say, there are various options but they have
to be cast semantically in terms of "if this happens then do that"
rather than following the normal rules of mathematics. Worse, exactly
where the limits apply can even depend on implementation.
And it is all defined mathematically.
We are talking finite sets with partial operations (for C-style signed
integers) or closed operations (for C-style unsigned integers), rather
than infinite sets, but it is all mathematics.
OK, then how would you define integer computing's
A - B
in terms of mathematics?
Mathematics on standard integers doesn't define 1/0. Mathematics on a
finite set for C-style signed integers leaves a lot more values
undefined on more operations. It doesn't mean it is not mathematical.
OK, then how would you define integer computing's
A / B
in terms of mathematics?
No need to reply but I'd suggest to you that because of the limits of computer fixed representation both of those are much more complex than
just 'mathematics'!
Essentially, you and I appear to have a difference over what one should
see as 'mathematics' but I don't think we disagree over substance.
Yes. But then, I am mathematically trained, and know a lot more about
what it means than most people, who usually think of school-level sums
and possibly weird stuff involving letters instead of numbers.
That's intriguing. What do you mean by "weird stuff involving letters
instead of numbers"?
Unfortunately, many programming tutorials encourage programmers to
simply assume that any value is 'large enough' and so will behave
according to the normal rules of mathematics. But as everyone here
knows, that is not always the case, as was shown in the example I saw
discussed recently of
255 + 1
What that results in is decided by what I would call 'engineering',
and not by the normal rules of mathematics.
Engineering is about applying the mathematical (and perhaps physical,
chemical, etc.) laws to practical situations. An engineer who does not
understand that there is a mathematical basis for what they do is in the
wrong profession. (I certainly don't mean that they should understand
the mathematics involved - but they should understand that there /is/
mathematics involved, and that the mathematics is what justifies the
rules and calculations they apply.)
Well, I would say that engineering includes being aware of and
accommodating limits - including those limits where simple mathematics
breaks down and no longer applies. YMMV.
On 04/10/2021 10:39, James Harris wrote:
On 03/10/2021 23:14, David Brown wrote:
On 03/10/2021 22:20, James Harris wrote:
...
1. Integer arithmetic where all values - including intermediate results
- remain in range for the data type. In this, the computer implements
normal mathematics.
2. Integer arithmetic where either a result or an intermediate value
does not fit in the range assigned. For these a decision has to be made
(by hardware, by language or by compiler) as to what to do with the
non-compliant value. As you say, there are various options but they have
to be cast semantically in terms of "if this happens then do that"
rather than following the normal rules of mathematics. Worse, exactly
where the limits apply can even depend on implementation.
And it is all defined mathematically.
We are talking finite sets with partial operations (for C-style signed
integers) or closed operations (for C-style unsigned integers), rather
than infinite sets, but it is all mathematics.
OK, then how would you define integer computing's
A - B
in terms of mathematics?
Mathematics on standard integers doesn't define 1/0. Mathematics on a
finite set for C-style signed integers leaves a lot more values
undefined on more operations. It doesn't mean it is not mathematical.
OK, then how would you define integer computing's
A / B
in terms of mathematics?
No need to reply but I'd suggest to you that because of the limits of
computer fixed representation both of those are much more complex than
just 'mathematics'!
I'd agree that they are somewhat complicated by the limited sizes of fixed-size types - but they are still just "mathematics".
How about:
1. For a given fixed size of computer integer, "A - B" is defined as the result of normal mathematical integer subtraction as long as that result
fits within the type.
(That's basically how C defines it, if you stick to "int" or ignore the promotion stuff.)
or
2. "A - B" is defined as the result of normal mathematical integer subtraction, reduced modulo 2^n as necessary to fit in the range of the
type.
(That's how "gcc -fwrapv" defines it.)
or
3. "A - B" is defined as the result of mid(int_min, A - B, int_max).
or
4. "A - B" is defined as either the result of normal integer subtraction
if that fits within the range of the type, or an exception condition otherwise.
These are all perfectly reasonable mathematical definitions for
subtraction. Note that partial functions, such as in definition 1, are
quite standard in mathematics.
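As a rough C sketch (mine, not from the thread), definitions 2 and 3
might look like:

#include <limits.h>
#include <stdint.h>

/* Definition 2: wrap modulo 2**32, as "gcc -fwrapv" does. The
   conversion back to int32_t is implementation-defined in C, but
   modular on the usual two's complement implementations. */
static int32_t sub_wrap(int32_t a, int32_t b)
{
    return (int32_t)((uint32_t)a - (uint32_t)b);
}

/* Definition 3: clamp the exact result to the ends of the range. */
static int32_t sub_sat(int32_t a, int32_t b)
{
    int64_t r = (int64_t)a - (int64_t)b;   /* exact in 64 bits */
    if (r > INT32_MAX) return INT32_MAX;
    if (r < INT32_MIN) return INT32_MIN;
    return (int32_t)r;
}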
Well, I would say that engineering includes being aware of and
accommodating limits - including those limits where simple mathematics
breaks down and no longer applies. YMMV.
Mathematics doesn't break down. That's the point.
There are other ways of defining these operations.
With a word size of 8 bits, where operands A, B and results are of a u8
type, you can just enumerate all possible results of A + B. Maybe use a function F(A,B) or a table [0..255,0..255]u8 T.
The results will generally correspond to doing that operation on a real
ALU, but you could in theory define the operations however you like.
See, you don't need maths, unless you're one of those who say you need
mathematics to show that 1+1 is 2.
On 2021-10-04 21:55, Bart wrote:
There are other ways of defining these operations.
With a word size of 8 bits, where operands A, B and results are of a u8
type, you can just enumerate all possible results of A + B. Maybe use
a function F(A,B) or a table [0..255,0..255]u8 T.
The results will generally correspond to doing that operation on a
real ALU, but you could in theory define the operations however you like.
See, you don't need maths, unless you're one of those who say you need
mathematics to show that 1+1 is 2.
You must be trolling again, because nobody could be so ignorant.
Never heard of:
- multiplication table
- function table
- logarithmic tables
- artillery tables
On 04/10/2021 19:19, David Brown wrote:
On 04/10/2021 10:39, James Harris wrote:
On 03/10/2021 23:14, David Brown wrote:
On 03/10/2021 22:20, James Harris wrote:
...
1. Integer arithmetic where all values - including intermediate results
- remain in range for the data type. In this, the computer implements
normal mathematics.
2. Integer arithmetic where either a result or an intermediate value
does not fit in the range assigned. For these a decision has to be made
(by hardware, by language or by compiler) as to what to do with the
non-compliant value. As you say, there are various options but they have
to be cast semantically in terms of "if this happens then do that"
rather than following the normal rules of mathematics. Worse, exactly
where the limits apply can even depend on implementation.
And it is all defined mathematically.
We are talking finite sets with partial operations (for C-style signed
integers) or closed operations (for C-style unsigned integers), rather
than infinite sets, but it is all mathematics.
OK, then how would you define integer computing's
A - B
in terms of mathematics?
Mathematics on standard integers doesn't define 1/0. Mathematics on a
finite set for C-style signed integers leaves a lot more values
undefined on more operations. It doesn't mean it is not mathematical.
OK, then how would you define integer computing's
A / B
in terms of mathematics?
No need to reply but I'd suggest to you that because of the limits of
computer fixed representation both of those are much more complex than
just 'mathematics'!
I'd agree that they are somewhat complicated by the limited sizes of
fixed-size types - but they are still just "mathematics".
How about:
1. For a given fixed size of computer integer, "A - B" is defined as the
result of normal mathematical integer subtraction as long as that result
fits within the type.
(That's basically how C defines it, if you stick to "int" or ignore the
promotion stuff.)
or
2. "A - B" is defined as the result of normal mathematical integer
subtraction, reduced modulo 2^n as necessary to fit in the range of the
type.
(That's how "gcc -fwrapv" defines it.)
or
3. "A - B" is defined as the result of mid(int_min, A - B, int_max).
or
4. "A - B" is defined as either the result of normal integer subtraction
if that fits within the range of the type, or an exception condition
otherwise.
These are all perfectly reasonable mathematical definitions for
subtraction. Note that partial functions, such as in definition 1, are
quite standard in mathematics.
There are other ways of defining these operations.
With a word size of 8 bits, where operands A, B and results are of a u8
type, you can just enumerate all possible results of A + B. Maybe use a function F(A,B) or a table [0..255,0..255]u8 T.
The results will generally correspond to doing that operation on a real
ALU, but you could in theory define the operations however you like.
See, you don't need maths, unless you're one of those who say you need
mathematics to show that 1+1 is 2.
Well, I would say that engineering includes being aware of and
accommodating limits - including those limits where simple mathematics
breaks down and no longer applies. YMMV.
Mathematics doesn't break down. That's the point.
No? How would you define mathematically this function:
function f(n) =
    if n = 666 then
        return "ABC"
    else
        return n
    fi
end
If you don't like that string, change it to some arbitrary integer. Once you've done that, insert:
system("format c:")
before that first return. (I won't try this one in case it actually does
it!)
On 04/10/2021 21:55, Bart wrote:
There are other ways of defining these operations.
Indeed there are. I wasn't trying to be complete.
With a word size of 8 bits, where operands A, B and results are of a u8
type, you can just enumerate all possible results of A + B. Maybe use a
function F(A,B) or a table [0..255,0..255]u8 T.
The results will generally correspond to doing that operation on a real
ALU, but you could in theory define the operations however you like.
You are mixing a definition of the operation (the specification, if you prefer) with implementations. The implementation is irrelevant to the definition of the function or operation.
You could, I suppose, write out all the results of the addition in a
huge table and call that your specification.
See, you don't need maths, unless you're one of those who say you need
mathematics to show that 1+1 is 2.
Are you trolling,
or do you not understand what mathematics is?
If you
are giving precise definitions of the operations and functions
(including partial functions - that is, leaving the results for some
inputs as undefined) then it is /maths/. If you define the operation A
+ B on numbers from two sets, based on a complete or partial enumeration
of all the results, then it is /maths/.
If I say I want my language to have an operation ¤ that works on two-bit numbers according to the table:
| 00 | 01 | 10 | 11
----------------------
00 | 00 | 11 | xx | 11
01 | 10 | 11 | xx | 11
10 | 11 | xx | xx | 11
11 | xx | 10 | 00 | 11
where "xx" means "the function is not defined on these inputs", then
that is /maths/. It is a mathematical definition. It tells you what combinations of inputs are valid, and how to calculate the results of
any given valid set of inputs.
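Such a table translates directly into code; a C sketch, with -1
standing in for the undefined "xx" entries:

/* The 2-bit operation from the table above; -1 marks the "xx"
   (undefined) input combinations, which callers must reject. */
static const int OP[4][4] = {
    /*        00  01  10  11 */
    /* 00 */ { 0,  3, -1,  3 },
    /* 01 */ { 2,  3, -1,  3 },
    /* 10 */ { 3, -1, -1,  3 },
    /* 11 */ {-1,  2,  0,  3 },
};

static int op(unsigned a, unsigned b)   /* a, b in 0..3 */
{
    return OP[a][b];
}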
No? How would you define mathematically this function:
function f(n) =
    if n = 666 then
        return "ABC"
    else
        return n
    fi
end
That in itself is nearly a mathematical definition of the function. It
is precise and unambiguous, merely missing information about the set of
input values. (The set of output values is implied by the function definition, once the possible inputs are defined.)
Why would you think it was /not/ mathematics? Do you think "if" is not allowed in mathematics?
On 05/10/2021 15:13, David Brown wrote:
On 04/10/2021 21:55, Bart wrote:
There are other ways of defining these operations.
Indeed there are. I wasn't trying to be complete.
I meant defining your categories.
With a word size of 8 bits, where operands A, B and results are of a u8
type, you can just enumerate all possible results of A + B. Maybe use a
function F(A,B) or a table [0..255,0..255]u8 T.
The results will generally correspond to doing that operation on a real
ALU, but you could in theory define the operations however you like.
You are mixing a definition of the operation (the specification, if you
prefer) with implementations. The implementation is irrelevant to the
definition of the function or operation.
You could, I suppose, write out all the results of the addition in a
huge table and call that your specification.
See, you don't need maths, unless you're one of those who say you need
mathematics to show that 1+1 is 2.
Are you trolling,
Again with the trolling...
or do you not understand what mathematics is?
Maybe I don't. I stopped 'getting it' since I started being involved
with computers.
If you
are giving precise definitions of the operations and functions
(including partial functions - that is, leaving the results for some
inputs as undefined) then it is /maths/. If you define the operation A
+ B on numbers from two sets, based on a complete or partial enumeration
of all the results, then it is /maths/.
If I say I want my language to have an operation ¤ that works on two-bit
numbers according to the table:
| 00 | 01 | 10 | 11
----------------------
00 | 00 | 11 | xx | 11
01 | 10 | 11 | xx | 11
10 | 11 | xx | xx | 11
11 | xx | 10 | 00 | 11
(Is there supposed to be a pattern here?)
where "xx" means "the function is not defined on these inputs", then
that is /maths/. It is a mathematical definition. It tells you what
combinations of inputs are valid, and how to calculate the results of
any given valid set of inputs.
If you want to call it maths, then fine. (Is there anything that isn't
maths then?)
But if there existed a huge table to define the possible values of i64 +
i64, where overflow wraps as it does on an x64 processor, or one where
overflow is xx as it is in C, that still wouldn't satisfy DAK despite it
being apparently valid mathematical behaviour.
There is something else about it that he just doesn't like. But he
brings his superior knowledge of maths to it in an effort to prove that
this is wrong behaviour, and the right behaviour can only be what he
says it is.
No? How would you define mathematically this function:
function f(n) =
    if n = 666 then
        return "ABC"
    else
        return n
    fi
end
That in itself is nearly a mathematical definition of the function. It
is precise and unambiguous, merely missing information about the set of
input values. (The set of output values is implied by the function
definition, once the possible inputs are defined.)
If I run this program, n can be any value that can be represented by the types of that dynamic language, which include numeric types, strings,
lists, dicts etc. "=" is defined between any two types.
It will only return "ABC" when n is:
* Integer 666
* Unsigned integer 666
* Float 666.0
* Decimal 666L
* String "ABC"
Otherwise it returns n. (There is no overload mechanism for "+", which
/would/ make it poorly defined.)
However, this is still too lax for certain people with an irrational
dislike of dynamic typing, even when you show that these types+values
can all be considered runtime data of the one variant type.
Why would you think it was /not/ mathematics? Do you think "if" is not
allowed in mathematics?
Not in the maths I did.
You never came across functions like Dirac's delta function (very
popular in engineering circles) :
𝛿(x) = ⎧ +∞, if x = 0
⎨
⎩ 0, otherwise
or
abs(x) = ⎧ x, if x >= 0
⎨
⎩ -x, if x < 0
?
On 2021-10-04 11:38, James Harris wrote:
On 04/10/2021 10:15, Dmitry A. Kazakov wrote:
On 2021-10-04 10:29, James Harris wrote:
...
I'm not sure I agree with any of that! Integer arithmetic does not
generate approximations.
Approximation means that the model is inexact, integers are
incomputable, so you must give up something, typically the range.
The case of real numbers is more showing. There is the fixed-point
model with exact addition and multiplication and there is the
floating-point model with all operations inexact and even
non-associative.
Non-associativity can also apply to integers. Consider
A + B + C
Even in such a simple expression if overflow of intermediate results
is to be detected then (A + B) + C is not A + (B + C).
No, overflow is outside the model. Inside the model computer integer arithmetic is associative.
[ Equivalent modification of computing algorithms to keep the model
adequate is yet another issue, and also mathematics. ]
On 04/10/2021 11:26, Dmitry A. Kazakov wrote:
On 2021-10-04 11:38, James Harris wrote:
On 04/10/2021 10:15, Dmitry A. Kazakov wrote:
On 2021-10-04 10:29, James Harris wrote:
...
I'm not sure I agree with any of that! Integer arithmetic does not
generate approximations.
Approximation means that the model is inexact, integers are
incomputable, so you must give up something, typically the range.
The case of real numbers is more showing. There is the fixed-point
model with exact addition and multiplication and there is the
floating-point model with all operations inexact and even
non-associative.
Non-associativity can also apply to integers. Consider
A + B + C
Even in such a simple expression if overflow of intermediate results
is to be detected then (A + B) + C is not A + (B + C).
No, overflow is outside the model. Inside the model computer integer
arithmetic is associative.
Then the model is inadequate - and that's partly my point. Computers do
not implement the normal mathematical model for integers, but a subset
and a superset thereof, even for something as simple as addition of
integers.
On 05/10/2021 18:35, David Brown wrote:
...
You never came across functions like Dirac's delta function (very
popular in engineering circles) :
𝛿(x) = ⎧ +∞, if x = 0
⎨
⎩ 0, otherwise
or
abs(x) = ⎧ x, if x >= 0
⎨
⎩ -x, if x < 0
?
I haven't read the messages which led to this one, yet, but this one
caught my eye due to the graphics - the delta symbol and the clever
extended bracing. Very impressive!
But something else stood out particularly in the context of this
subthread. The definition of abs(x) would fail on a computer if it were
using 2's complement representation and x was the most negative number.
It's a classic case in point where the computer won't follow the
accepted mathematical definition.
On 04/10/2021 11:26, Dmitry A. Kazakov wrote:
On 2021-10-04 11:38, James Harris wrote:
On 04/10/2021 10:15, Dmitry A. Kazakov wrote:
On 2021-10-04 10:29, James Harris wrote:
...
I'm not sure I agree with any of that! Integer arithmetic does not
generate approximations.
Approximation means that the model is inexact, integers are
incomputable, so you must give up something, typically the range.
The case of real numbers is more showing. There is the fixed-point
model with exact addition and multiplication and there is the
floating-point model with all operations inexact and even
non-associative.
Non-associativity can also apply to integers. Consider
A + B + C
Even in such a simple expression if overflow of intermediate results
is to be detected then (A + B) + C is not A + (B + C).
No, overflow is outside the model. Inside the model computer integer
arithmetic is associative.
Then the model is inadequate - and that's partly my point.
Computers do
not implement the normal mathematical model for integers,
On 04/10/2021 10:39, James Harris wrote:
On 03/10/2021 23:14, David Brown wrote:
On 03/10/2021 22:20, James Harris wrote:
...
1. Integer arithmetic where all values - including intermediate results
- remain in range for the data type. In this, the computer implements
normal mathematics.
2. Integer arithmetic where either a result or an intermediate value
does not fit in the range assigned. For these a decision has to be made
(by hardware, by language or by compiler) as to what to do with the
non-compliant value. As you say, there are various options but they have
to be cast semantically in terms of "if this happens then do that"
rather than following the normal rules of mathematics. Worse, exactly
where the limits apply can even depend on implementation.
And it is all defined mathematically.
We are talking finite sets with partial operations (for C-style signed
integers) or closed operations (for C-style unsigned integers), rather
than infinite sets, but it is all mathematics.
OK, then how would you define integer computing's
A - B
in terms of mathematics?
Mathematics on standard integers doesn't define 1/0. Mathematics on a
finite set for C-style signed integers leaves a lot more values
undefined on more operations. It doesn't mean it is not mathematical.
OK, then how would you define integer computing's
A / B
in terms of mathematics?
No need to reply but I'd suggest to you that because of the limits of
computer fixed representation both of those are much more complex than
just 'mathematics'!
I'd agree that they are somewhat complicated by the limited sizes of fixed-size types - but they are still just "mathematics".
How about:
1. For a given fixed size of computer integer, "A - B" is defined as the result of normal mathematical integer subtraction as long as that result
fits within the type.
(That's basically how C defines it, if you stick to "int" or ignore the promotion stuff.)
or
2. "A - B" is defined as the result of normal mathematical integer subtraction, reduced modulo 2^n as necessary to fit in the range of the
type.
(That's how "gcc -fwrapv" defines it.)
or
3. "A - B" is defined as the result of mid(int_min, A - B, int_max).
or
4. "A - B" is defined as either the result of normal integer subtraction
if that fits within the range of the type, or an exception condition otherwise.
Unfortunately, many programming tutorials encourage programmers to
simply assume that any value is 'large enough' and so will behave
according to the normal rules of mathematics. But as everyone here
knows, that is not always the case, as was shown in the example I saw
discussed recently of
255 + 1
What that results in is decided by what I would call 'engineering', and
not by the normal rules of mathematics.
Engineering is about applying the mathematical (and perhaps physical,
chemical, etc.) laws to practical situations. An engineer who does not
understand that there is a mathematical basis for what they do is in the
wrong profession. (I certainly don't mean that they should understand
the mathematics involved - but they should understand that there /is/
mathematics involved, and that the mathematics is what justifies the
rules and calculations they apply.)
Well, I would say that engineering includes being aware of and
accommodating limits - including those limits where simple mathematics
breaks down and no longer applies. YMMV.
Mathematics doesn't break down. That's the point.
On 05/10/2021 18:35, David Brown wrote:
...
You never came across functions like Dirac's delta function (very
popular in engineering circles) :
𝛿(x) = ⎧ +∞, if x = 0
⎨
⎩ 0, otherwise
or
abs(x) = ⎧ x, if x >= 0
⎨
⎩ -x, if x < 0
?
But something else stood out particularly in the context of this
subthread. The definition of abs(x) would fail on a computer if it were
using 2's complement representation and x was the most negative number.
It's a classic case in point where the computer won't follow the
accepted mathematical definition.
On 2021-10-05 21:13, James Harris wrote:
On 05/10/2021 18:35, David Brown wrote:
...
You never came across functions like Dirac's delta function (very
popular in engineering circles) :
𝛿(x) = ⎧ +∞, if x = 0
⎨
⎩ 0, otherwise
or
abs(x) = ⎧ x, if x >= 0
⎨
⎩ -x, if x < 0
?
[...]
But something else stood out particularly in the context of this
subthread. The definition of abs(x) would fail on a computer if it
were using 2's complement representation and x was the most negative
number.
You do not understand the difference between definition and
implementation?
Definitions never fail.
It's a classic case in point where the computer won't follow the
accepted mathematical definition.
Of course it will. You can easily verify it in any reasonable language.
All of them follow the implementation principle: the result is either
mathematically correct (within the model constraints and tolerance) or
else some exceptional action happens, which could be an exception or
some ideal non-numeric value like NaN.
For purely educational purposes, I suggest you read the classic book
"Computer Methods for Mathematical Computations"
by George Elmer Forsythe, Michael A. Malcolm and Cleve B. Moler. I have
no hope for Bart, but it could open your eyes.
The book begins with an instructive example of finding roots of a
quadratic equation using the school formula. It shows how awful such an implementation would be and how to fix that.
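The gist of that example, as a C sketch of the standard remedy
(assuming a != 0 and a non-negative discriminant; the function name is
mine):

#include <math.h>

/* The school formula (-b ± sqrt(b*b - 4*a*c)) / (2*a) loses precision
   when b*b >> 4*a*c: one root subtracts two nearly equal values. The
   classic fix computes the numerically safe root first and derives
   the other from the product of the roots, c/a. */
static void quadratic_roots(double a, double b, double c,
                            double *r1, double *r2)
{
    double q = -0.5 * (b + copysign(sqrt(b * b - 4.0 * a * c), b));
    *r1 = q / a;
    *r2 = c / q;   /* valid when q != 0, i.e. b and c not both zero */
}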
On 05/10/2021 20:36, James Harris wrote:
On 04/10/2021 11:26, Dmitry A. Kazakov wrote:
On 2021-10-04 11:38, James Harris wrote:
On 04/10/2021 10:15, Dmitry A. Kazakov wrote:
On 2021-10-04 10:29, James Harris wrote:
...
I'm not sure I agree with any of that! Integer arithmetic does not
generate approximations.
Approximation means that the model is inexact, integers are
incomputable, so you must give up something, typically the range.
The case of real numbers is more showing. There is the fixed-point
model with exact addition and multiplication and there is the
floating-point model with all operations inexact and even
non-associative.
Non-associativity can also apply to integers. Consider
A + B + C
Even in such a simple expression if overflow of intermediate results
is to be detected then (A + B) + C is not A + (B + C).
No, overflow is outside the model. Inside the model computer integer
arithmetic is associative.
Then the model is inadequate - and that's partly my point. Computers do
not implement the normal mathematical model for integers, but a subset
and a superset thereof, even for something as simple as addition of
integers.
Well, mathematics cheats a little. Because A + B in maths is just:
A + B
It usually doesn't need to evaluate it, so overflow is irrelevant!
But suppose, given some concrete values for A and B, it DID need to
evaluate A + B into a concrete result.
Who or what does that calculation: a human? a machine? Whatever actual
physical thing is used may well have the same limitations as a
programming language.
On 05/10/2021 21:02, Dmitry A. Kazakov wrote:
On 2021-10-05 21:13, James Harris wrote:
On 05/10/2021 18:35, David Brown wrote:
...
You never came across functions like Dirac's delta function (very
popular in engineering circles) :
𝛿(x) = ⎧ +∞, if x = 0
⎨
⎩ 0, otherwise
or
abs(x) = ⎧ x, if x >= 0
⎨
⎩ -x, if x < 0
?
[...]
But something else stood out particularly in the context of this
subthread. The definition of abs(x) would fail on a computer if it
were using 2's complement representation and x was the most negative
number.
You do not understand the difference between definition and
implementation?
I understand the difference very well. Better than you, it seems. ;-)
Definitions never fail.
How would /you/ mathematically define abs() on integers?
Well, mathematics cheats a little. Because A + B in maths is just:
A + B
It usually doesn't need to evaluate it, so overflow is irrelevant!
But suppose, given some concrete values for A and B, it DID need to
evaluate A + B into a concrete result.
Who or what does that a calculation: a human? a machine? Whatever
actual physical thing is used, may well have the same limitations as
a programming language.
To me that's the key difference between maths equations that just
look pretty, and code that is actually executed. In the latter you
have to make some compromises.
On 05/10/2021 17:47, Bart wrote:
On 05/10/2021 15:13, David Brown wrote:
or do you not understand what mathematics is?
Maybe I don't. I stopped 'getting it' since I started being involved
with computers.
But you still feel qualified to argue that computing is not mathematics,
or that computer operations are not mathematically defined?
Different behaviours of, for example, signed integer arithmetic can have perfectly good mathematical definitions. In that sense it is not
possible to call wrapping semantics, or undefined overflow semantics,
"right" or "wrong" - they are both valid, and you can give clear
mathematical definitions in both cases. You can use these definitions
to prove things about the arithmetic, such as commutative laws. Some
things can be proved about one version of the definition but not others
- with C-style signed arithmetic, you can prove that "x + 1 > x" is
always true.
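Concretely, since signed overflow is undefined in C, a compiler is
entitled to fold the comparison below to a constant, and optimising
compilers do:

/* May legitimately compile to "return 1": the compiler can assume
   x + 1 never overflows, so x + 1 > x always holds. */
int always_greater(int x)
{
    return x + 1 > x;
}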
A dislike of dynamic typing is not irrational - nor is a dislike of
strong static typing. (And you really mean "weak" typing, rather than "dynamic" typing here.)
You never came across functions like Dirac's delta function (very
popular in engineering circles) :
𝛿(x) = ⎧ +∞, if x = 0
⎨
⎩ 0, otherwise
On 05/10/2021 18:35, David Brown wrote:
...
You never came across functions like Dirac's delta function (very
popular in engineering circles) :
𝛿(x) = ⎧ +∞, if x = 0
⎨
⎩ 0, otherwise
or
abs(x) = ⎧ x, if x >= 0
⎨
⎩ -x, if x < 0
?
I haven't read the messages which led to this one, yet, but this one
caught my eye due to the graphics - the delta symbol and the clever
extended bracing. Very impressive!
But something else stood out particularly in the context of this
subthread. The definition of abs(x) would fail on a computer if it were
using 2's complement representation and x was the most negative number.
It's a classic case in point where the computer won't follow the
accepted mathematical definition.
You never came across functions like Dirac's delta function (very
popular in engineering circles) :
𝛿(x) = ⎧ +∞, if x = 0
⎨
⎩ 0, otherwise
That looks like a useful function!
On 05/10/2021 20:45, Bart wrote:
Well, mathematics cheats a little. Because A + B in maths is just:
A + B
It usually doesn't need to evaluate it, so overflow is irrelevant!
But suppose, given some concrete values for A and B, it DID need to
evaluate A + B into a concrete result.
Others have partially addressed this point. But there are
several other partial answers:
(a) This [suitably generalised] is the entire purpose of the
branch of mathematics called "numerical analysis"*. We
needed to get concrete results long before we had computing
machines.
These were examples of function definitions with conditionals in common mathematics using real numbers - they cannot be implemented directly in computer code. If you want a mathematical definition of "abs" for fixed
size integer types in a programming language, you must adapt it to a different mathematical definition that is suitable for the domain you
are using (i.e., the input and output sets are the range of your
computer type, rather than the real numbers). It is, however, still
maths. Two possibilities for n-bit two's complement signed integers
could be :
abs(x) = ⎧ x, if x >= 0
⎨ -x, if x < 0 and x > int_min
⎩ int_min, if x = int_min
or
abs(x) = ⎧ x, if x >= 0
⎨ -x, if x < 0 and x > int_min
⎩ undefined, if x = int_min
Both are good, solid mathematical definitions - and both can be
implemented. They have slightly different characteristics, each with
their pros and cons.
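For concreteness, both definitions map directly onto code. A minimal C sketch, assuming a 32-bit two's complement int (the function names are illustrative, not from any library):

#include <limits.h>

/* First definition: a total function; abs(int_min) yields int_min. */
int abs_total(int x)
{
    if (x == INT_MIN)
        return INT_MIN;
    return x >= 0 ? x : -x;
}

/* Second definition: a partial function; the int_min case is left
   undefined, so the caller must guarantee x != INT_MIN. */
int abs_partial(int x)
{
    return x >= 0 ? x : -x;   /* undefined behaviour if x == INT_MIN */
}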
On 05/10/2021 23:21, Andy Walker wrote:
On 05/10/2021 20:45, Bart wrote:
Well, mathematics cheats a little. Because A + B in maths is just:
A + B
It usually doesn't need to evaluate it, so overflow is irrelevant!
But suppose, given some concrete values for A and B, it DID need to
evaluate A + B into a concrete result.
Others have partially addressed this point. But there are
several other partial answers:
(a) This [suitably generalised] is the entire purpose of the
branch of mathematics called "numerical analysis"*. We
needed to get concrete results long before we had computing
machines.
My approach to solving problems would be computation or trial and error rather than doing things analytically, for which I just don't have the ability.
I sometimes like to solve puzzles, but I'm not good enough to do it
manually, and don't have the maths skills to do it that way.
So I use brute force, with a computer program, if I think it is
practical in a reasonable time.
Here's an example of a puzzle [...].
Is what I did to solve this (aside from designing and implementing
the language used) maths? Not in my view, as I was trying to avoid
using it. But apparently it was.
On 04/10/2021 01:58, Bart wrote:
On 03/10/2021 23:05, David Brown wrote:
On 03/10/2021 20:27, Bart wrote:
Processors are designed to do many things. Exactly duplicating standard mathematical integers is not one of those things. Being usable to model a limited version of those integers - following somewhat different
mathematical rules and definitions - /is/ one of those things.
No, it's just arithmetic with a limited number of digits. And usually in
binary.
It's engineering, not maths. But of course you can apply maths to anything.
Engineering /is/ applied maths!
On 2021-10-05 21:36, James Harris wrote:
On 04/10/2021 11:26, Dmitry A. Kazakov wrote:
On 2021-10-04 11:38, James Harris wrote:
On 04/10/2021 10:15, Dmitry A. Kazakov wrote:
On 2021-10-04 10:29, James Harris wrote:
...
I'm not sure I agree with any of that! Integer arithmetic does not generate approximations.
Approximation means that the model is inexact, integers are
incomputable, so you must give up something, typically the range.
The case of real numbers is more telling. There is the fixed-point
model with exact addition and multiplication and there is the
floating-point model with all operations inexact and even
non-associative.
Non-associativity can also apply to integers. Consider
A + B + C
Even in such a simple expression if overflow of intermediate results
is to be detected then (A + B) + C is not A + (B + C).
No, overflow is outside the model. Inside the model computer integer
arithmetic is associative.
Then the model is inadequate - and that's partly my point.
Only if you deploy it falsely. This is the core of engineering. Solid mechanics and strength of materials is inadequate in the general case, but works perfectly well for building bridges.
Do not overflow your numbers, OK?
Computers do not implement the normal mathematical model for integers,
They always do, within well-defined constraints and tolerances.
On 05/10/2021 18:35, David Brown wrote:
On 05/10/2021 17:47, Bart wrote:
On 05/10/2021 15:13, David Brown wrote:
or do you not understand what mathematics is?
Maybe I don't. I stopped 'getting it' since I started being involved
with computers.
But you still feel qualified to argue that computing is not mathematics,
or that computer operations are not mathematically defined?
I don't know why everyone seems determined to question my credentials.
I do my stuff without using maths or knowingly using it; I don't care so
long as I get results.
To me it just seems obvious and intuitive.
But FWIW, I didn't study it beyond 'A' level pure maths at school
(though getting a top grade in it), then I decided to do a CS degree
rather than pursue mathematics further.
I then forgot most of it, except for the subjects I needed, for which I
had to go out and re-purchase the textbooks I'd thrown away, in order to
get the necessary formulae.
I did spend rather a lot of time in the early 80s programming 3D
floating point graphics, on machines with no floating point, not even
integer multiply and divide, for which I had to write emulation code
(yeah, those Taylor series or whatever it was proved useful after all,
but you also needed some ingenuity).
I don't really care about anyone else's background, however, can I just
ask: how many here having a go at me for my irreverent approach to mathematics, have actually coded anything like arbitrary precision
floating point, and have incorporated it into a language?
Different behaviours of, for example, signed integer arithmetic can have
perfectly good mathematical definitions. In that sense it is not
possible to call wrapping semantics, or undefined overflow semantics,
"right" or "wrong" - they are both valid, and you can give clear
mathematical definitions in both cases. You can use these definitions
to prove things about the arithmetic, such as commutative laws. Some
things can be proved about one version of the definition but not others
- with C-style signed arithmetic, you can prove that "x + 1 > x" is
always true.
The arbitrary precision library I mentioned above is limited only by
memory space and runtime.
Which actually makes it harder to define or predict behaviour since it depends on environmental factors - and the user's patience.
But for the sorts of, by comparison, minuscule values represented by i64
and u64, those would never be a problem.
So, I provide a choice: use efficient types if you know they will not overflow, or use a big-number type.
A dislike of dynamic typing is not irrational - nor is a dislike of
strong static typing. (And you really mean "weak" typing, rather than
"dynamic" typing here.)
It's usually strong. It's weak for equal/not-equal, but Python has that
same behaviour: you can compare any two objects; the result will be
false if not compatible.
You never came across functions like Dirac's delta function (very
popular in engineering circles) :
𝛿(x) = ⎧ +∞, if x = 0
⎨
⎩ 0, otherwise
That looks like a useful function!
[Bart:] You never came across functions like Dirac's delta function [...]
That looks like a useful function!
Yes, it's a great one - it's useful in many cases. It is the derivative of the step function, which is another useful function defined using conditionals. [...]
On 04/10/2021 10:50, David Brown wrote:
Engineering /is/ applied maths!
/includes/
!
Engineering is a lot of things: science, materials, chemistry,
mathematics, biology, etc.
On 05/10/2021 18:35, David Brown wrote:
On 05/10/2021 17:47, Bart wrote:
On 05/10/2021 15:13, David Brown wrote:
or do you not understand what mathematics is?
Maybe I don't. I stopped 'getting it' since I started being involved
with computers.
But you still feel qualified to argue that computing is not mathematics,
or that computer operations are not mathematically defined?
I don't know why everyone seems determined to question my credentials.
On 05/10/2021 21:13, James Harris wrote:
On 05/10/2021 18:35, David Brown wrote:
abs(x) = ⎧ x, if x >= 0
⎨
⎩ -x, if x < 0
But something else stood out particularly in the context of this
subthread. The definition of abs(x) would fail on a computer if it were
using 2's complement representation and x was the most negative number.
It's a classic case in point where the computer won't follow the
accepted mathematical definition.
These were examples of function definitions with conditionals in common mathematics using real numbers - they cannot be implemented directly in computer code. If you want a mathematical definition of "abs" for fixed
size integer types in a programming language, you must adapt it to a different mathematical definition that is suitable for the domain you
are using (i.e., the input and output sets are the range of your
computer type, rather than the real numbers). It is, however, still
maths. Two possibilities for n-bit two's complement signed integers
could be :
abs(x) = ⎧ x, if x >= 0
⎨ -x, if x < 0 and x > int_min
⎩ int_min, if x = int_min
On 05/10/2021 21:01, Dmitry A. Kazakov wrote:
On 2021-10-05 21:36, James Harris wrote:
On 04/10/2021 11:26, Dmitry A. Kazakov wrote:
On 2021-10-04 11:38, James Harris wrote:
On 04/10/2021 10:15, Dmitry A. Kazakov wrote:
On 2021-10-04 10:29, James Harris wrote:
...
I'm not sure I agree with any of that! Integer arithmetic does
not generate approximations.
Approximation means that the model is inexact, integers are
incomputable, so you must give up something, typically the range.
The case of real numbers is more telling. There is the fixed-point model with exact addition and multiplication and there is the
floating-point model with all operations inexact and even
non-associative.
Non-associativity can also apply to integers. Consider
A + B + C
Even in such a simple expression if overflow of intermediate
results is to be detected then (A + B) + C is not A + (B + C).
No, overflow is outside the model. Inside the model computer integer
arithmetic is associative.
Then the model is inadequate - and that's partly my point.
Only if you deploy it falsely. This is the core of engineering. Solid mechanics and strength of materials is inadequate in the general case, but works perfectly well for building bridges.
Do not overflow your numbers, OK?
That's maths. Engineering includes how to respond to overflow (or the potential thereof).
On 2021-09-05 10:54, David Brown wrote:
On 04/09/2021 17:16, Dmitry A. Kazakov wrote:
On 2021-09-04 15:39, David Brown wrote:
Ada also solves this kind of problem by not allowing comparisons
between
different types. (I don't know how it handles literals - that's beyond my rather limited knowledge of the language.)
When operations can be overloaded in the result type, that simplifies a lot. Literals are semantically overloaded parameterless functions: the literal 1 has an overload returning Integer, one returning Unsigned_16, one Long_Integer, one My_Custom_Integer, etc.
I'm not very keen on overloading in the result type - it feels to me
that it would be too easy to lose track of what is going on, and too
easy to have code that appears identical (same expression, same
variables, same types, etc.) but completely different effects.
Why:
declare
X : T;
Y : S;
begin
Foo (X);
Foo (Y);
is OK, but
declare
X : T := Create;
Y : S := Create;
begin
is not?
On 06/10/2021 16:39, Dmitry A. Kazakov wrote:
On 2021-10-06 16:56, James Harris wrote:
On 05/10/2021 21:01, Dmitry A. Kazakov wrote:
On 2021-10-05 21:36, James Harris wrote:
On 04/10/2021 11:26, Dmitry A. Kazakov wrote:
No, overflow is outside the model. Inside the model computer
integer arithmetic is associative.
Then the model is inadequate - and that's partly my point.
Only if you deploy it falsely. This is the core of engineering. Solid mechanics and strength of materials is inadequate in the general case, but works perfectly well for building bridges.
Do not overflow your numbers, OK?
That's maths. Engineering includes how to respond to overflow (or the
potential thereof).
No, engineering is how to *avoid* overflows.
Fine, Dmitry. You try to write code to avoid
A * B
overflowing before you execute the multiply.
On 2021-10-06 16:51, James Harris wrote:
On 04/10/2021 10:50, David Brown wrote:
Engineering /is/ applied maths!
/includes/
!
Engineering is a lot of things: science, materials, chemistry,
mathematics, biology, etc.
Engineering is the application of science to solving practical problems.
Mathematics is a science and is the basis of all other sciences (as well
as many pseudo-sciences).
Ergo, the statement stands.
On 2021-10-06 16:56, James Harris wrote:
On 05/10/2021 21:01, Dmitry A. Kazakov wrote:
On 2021-10-05 21:36, James Harris wrote:
On 04/10/2021 11:26, Dmitry A. Kazakov wrote:
No, overflow is outside the model. Inside the model computer
integer arithmetic is associative.
Then the model is inadequate - and that's partly my point.
Only if you deploy it falsely. This is the core of engineering. Solid mechanics and strength of materials is inadequate in the general case, but works perfectly well for building bridges.
Do not overflow your numbers, OK?
That's maths. Engineering includes how to respond to overflow (or the
potential thereof).
No, engineering is how to *avoid* overflows.
On 05/09/2021 10:39, Dmitry A. Kazakov wrote:
On 2021-09-05 10:54, David Brown wrote:
On 04/09/2021 17:16, Dmitry A. Kazakov wrote:
On 2021-09-04 15:39, David Brown wrote:
Ada also solves this kind of problem by not allowing comparisons
between
different types. (I don't know how it handles literals - that's
beyond
my rather limited knowledge of the language.)
When operations can be overloaded in the result type, that simplifies a lot. Literals are semantically overloaded parameterless functions: the literal 1 has an overload returning Integer, one returning Unsigned_16, one Long_Integer, one My_Custom_Integer, etc.
I'm not very keen on overloading in the result type - it feels to me
that it would be too easy to lose track of what is going on, and too
easy to have code that appears identical (same expression, same
variables, same types, etc.) but completely different effects.
Why:
declare
X : T;
Y : S;
begin
Foo (X);
Foo (Y);
is OK, but
declare
X : T := Create;
Y : S := Create;
begin
is not?
Assuming that Create is the same as Create() contrast
T := Create;
with
T := Create + 0;
Why should the former work and the latter, presumably, fail?
On 2021-10-06 18:15, James Harris wrote:
On 06/10/2021 16:39, Dmitry A. Kazakov wrote:
On 2021-10-06 16:56, James Harris wrote:
On 05/10/2021 21:01, Dmitry A. Kazakov wrote:
On 2021-10-05 21:36, James Harris wrote:
On 04/10/2021 11:26, Dmitry A. Kazakov wrote:
No, overflow is outside the model. Inside the model computer
integer arithmetic is associative.
Then the model is inadequate - and that's partly my point.
Only if you deploy it falsely. This is the core of engineering. Solid mechanics and strength of materials is inadequate in the general case, but works perfectly well for building bridges.
Do not overflow your numbers, OK?
That's maths. Engineering includes how to respond to overflow (or
the potential thereof).
No, engineering is how to *avoid* overflows.
Fine, Dmitry. You try to write code to avoid
A * B
overflowing before you execute the multiply.
Here is how it is done:
1. The problem domain. The numeric type is selected from there. E.g. typically A is a measurement of something you know the range of.
2. The algorithm. The formula A*B is a part of some larger algorithm.
E.g. some iterative approximation etc. Here comes the mathematics. Most good algorithms allow estimation of the upper and lower bounds. It is not that difficult; the really difficult part is rounding-error analysis. You may have no overflows, but the result is garbage.
So from #1 and #2 you know the maximum range of the intermediates and
declare the corresponding type. In Ada one would take that type for A
and B but also constrain them to the domain's range to prevent wrong
inputs.
That is the method most people would use.
A more advanced but also more difficult approach is static analysis. You could prove that no overflow happens. Usually it requires transformation of the algorithm, because automatic provers have serious limitations. For Ada there is such a framework: SPARK Ada.
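A minimal C sketch of steps 1 and 2, with bounds invented purely for illustration: suppose domain analysis says 0 <= A <= 40 and 0 <= B <= 30000. Then A*B is at most 1,200,000, which provably fits a 32-bit type, so the multiply cannot overflow:

#include <assert.h>

/* long is guaranteed at least 32 bits, so 1,200,000 always fits */
long scaled_product(int a, int b)
{
    assert(0 <= a && a <= 40);      /* constrain inputs to the      */
    assert(0 <= b && b <= 30000);   /* domain's range (Ada would    */
                                    /* use range-constrained types) */
    return (long)a * b;             /* provably in range            */
}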
On 06/10/2021 01:04, Bart wrote:
It's not something I have ever had cause to do - and while it could be interesting, there are too many other interesting things in life. One
thing that is not interesting, however, is a pissing competition about
who has done what.
The mathematics of making a straightforward arbitrary precision library, floating point or integer, is not particularly advanced - it's just
simple arithmetic. Making a nice interface is hard. Making a system
where it is easy to get right, hard to get wrong, and you don't end up
with memory leaks is hard. Making efficient implementations is hard.
Making algorithms that scale well is hard. Doing long multiplication
and multi-digit addition is primary school arithmetic - you just have to realise that, and not be scared by the big numbers.
So an arbitrary precision floating point library is a significant
achievement - but /not/ because of the mathematics.
If you have included error analysis and correctness proofs, then the
maths gets hard. If you have included FFT multiplication algorithms,
the maths gets hard. If you have included partitioning to support
parallel implementations - with correctness proofs, of course - the
maths gets hard.
None of that is particularly related to the kind of mathematics that was under discussion here.
Yes, that's all hard stuff. And yes, you can make a "cheap and
cheerful" system without bothering about being sure it is correct - especially when there is no dividing line between the library/language
user and the author, and you can change the toolchain to fix any issues
as you go along. And yes, such "cheap and cheerful" solutions do have real-world practical uses.
On 2021-10-06 18:11, James Harris wrote:
On 05/09/2021 10:39, Dmitry A. Kazakov wrote:
Why:
declare
X : T;
Y : S;
begin
Foo (X);
Foo (Y);
is OK, but
declare
X : T := Create;
Y : S := Create;
begin
is not?
Assuming that Create is the same as Create() contrast
T := Create;
with
T := Create + 0;
Why should the former work and the latter, presumably, fail?
It would not. I assume that T is an integer or modular type. So, if
T := T + 0;
does not fail
T := Create + 0;
would not fail either. Here is a complete example:
type T is range 0..100;
function Create return T is
begin
return 0;
end Create;
type S is range -100..200;
function Create return S is
begin
return 0;
end Create;
X : T := Create + 0;
Y : S := Create + 0;
What about
X := 0 + 1 * Create * 2 + 3;
Does the type of X effectively propagate through the constants so that
Ada knows which version of Create to call?
FWIW, I would have to write your original code as
X = T.Create()
On 06/10/2021 11:32, Bart wrote:
I sometimes like to solve puzzles, but I'm not good enough to do it
manually, and don't have the maths skills to do it that way.
The maths skills required to solve puzzles are almost never
very advanced. I find it hard to believe that anyone with A-level
maths [final year of secondary education, for non-UK readers] would
find the problems usually described as "puzzles" at all difficult
in terms of the skills needed.
I worked my way through most of the
books by [eg] Dudeney, Loyd and Gardner long before A-level.
So I use brute force, with a computer program, if I think it is
practical in a reasonable time.
Sure. So do we all [FSVO "we"].
tackling the "Project Euler" problems:
https://projecteuler.net/archives
The problems range from trivial to extremely difficult, and from
really interesting to bafflingly boring, but there are over 700 to
choose from.
You seem to think that there is some rigid dividing line
between "maths" and "not-maths". Not so.
I sometimes like to solve puzzles, but I'm not good enough to do it manually, and don't have the maths skills to do it that way.
The maths skills required to solve puzzles are almost never very advanced. I find it hard to believe that anyone with A-level maths [final year of secondary education, for non-UK readers] would find the problems usually described as "puzzles" at all difficult in terms of the skills needed.
Maybe you've forgotten the boring topics that constituted A-level maths in the 70s. Little would have been any help at all with the more advanced puzzles.
I worked my way through most of the books by [eg] Dudeney, Loyd and Gardner long before A-level.
I had some of those (don't know Loyd though), I can't remember how hard they were, except that Dudeney especially seemed to be for fun.
So I use brute force, with a computer program, if I think it is practical in a reasonable time.
Sure. So do we all [FSVO "we"].
I don't think that's the usual reason for setting the puzzle; you're supposed to solve it by being clever, not cheating! Or at least, not dumbly trying every possible combination until you hit the right answer; that's not playing the game.
https://projecteuler.net/archives
The problems range from trivial to extremely difficult, and from really interesting to bafflingly boring, but there are over 700 to choose from.
Problem #413 was discussed on comp.lang.c recently. (See 'Losing my mind' thread from around 2nd July.) I have to say that finding a way to solve it via a computer within the one-minute limit was far beyond my abilities.
You seem to think that there is some rigid dividing line between "maths" and "not-maths". Not so.
What I have in mind is the complicated stuff, the jargon, the proofs, basically everything that's beyond me. Then I don't really want to be reminded of my shortcomings.
On 04/10/2021 10:50, David Brown wrote:
On 04/10/2021 01:58, Bart wrote:
On 03/10/2021 23:05, David Brown wrote:
On 03/10/2021 20:27, Bart wrote:
Processors are designed to do many things. Exactly duplicating standard mathematical integers is not one of those things. Being usable to model a limited version of those integers - following somewhat different mathematical rules and definitions - /is/ one of those things.
No, it's just arithmetic with a limited number of digits. And usually in binary.
It's engineering, not maths. But of course you can apply maths to
anything.
Engineering /is/ applied maths!
/includes/
!
Engineering is a lot of things: science, materials, chemistry,
mathematics, biology, etc.
On 06/10/2021 16:39, Dmitry A. Kazakov wrote:
On 2021-10-06 16:56, James Harris wrote:
On 05/10/2021 21:01, Dmitry A. Kazakov wrote:
On 2021-10-05 21:36, James Harris wrote:
On 04/10/2021 11:26, Dmitry A. Kazakov wrote:
No, overflow is outside the model. Inside the model computer
integer arithmetic is associative.
Then the model is inadequate - and that's partly my point.
Only if you deploy it falsely. This is the core of engineering. Solid mechanics and strength of materials is inadequate in the general case, but works perfectly well for building bridges.
Do not overflow your numbers, OK?
That's maths. Engineering includes how to respond to overflow (or the
potential thereof).
No, engineering is how to *avoid* overflows.
Fine, Dmitry. You try to write code to avoid
A * B
overflowing before you execute the multiply.
On 06/10/2021 07:20, David Brown wrote:
These were examples of function definitions with conditionals in common
mathematics using real numbers - they cannot be implemented directly in
computer code. If you want a mathematical definition of "abs" for fixed
size integer types in a programming language, you must adapt it to a
different mathematical definition that is suitable for the domain you
are using (i.e., the input and output sets are the range of your
computer type, rather than the real numbers). It is, however, still
maths. Two possibilities for n-bit two's complement signed integers
could be :
abs(x) = ⎧ x, if x >= 0
⎨ -x, if x < 0 and x > int_min
⎩ int_min, if x = int_min
or
abs(x) = ⎧ x, if x >= 0
⎨ -x, if x < 0 and x > int_min
⎩ undefined, if x = int_min
Both are good, solid mathematical definitions - and both can be
implemented. They have slightly different characteristics, each with
their pros and cons.
These are somewhat unsatisfactory.
I guess you only have one actual
definition of abs()?
In practice, it would be different for each different type of x. For
example, the representation might be twos complement, or it might be
signed magnitude [as used in floats].
Further, there might be different sizes of int, so different values of int_min.
Also, you might need to consider putting the check for int_min first, or
at least second, depending on whether problems are anticipated with
doing 'x > int_min' when x is negative.
There is also a question over exactly what 'undefined' means: would it require abs() to return a sum-type now rather than an int? If so, abs()
might need such a type as input too: abs(abs(x)).
So, the reality is a bit more involved. But it depends on what the
purpose of your mathematical definitions is: is it just to 'look
pretty'; or is it informal user documentation; or would it actually be
input to some compiler generator?
On 06/10/2021 18:15, James Harris wrote:
On 06/10/2021 16:39, Dmitry A. Kazakov wrote:
On 2021-10-06 16:56, James Harris wrote:
On 05/10/2021 21:01, Dmitry A. Kazakov wrote:
On 2021-10-05 21:36, James Harris wrote:
On 04/10/2021 11:26, Dmitry A. Kazakov wrote:
No, overflow is outside the model. Inside the model computer
integer arithmetic is associative.
Then the model is inadequate - and that's partly my point.
Only if you deploy it falsely. This is the core of engineering. Solid mechanics and strength of materials is inadequate in the general case, but works perfectly well for building bridges.
Do not overflow your numbers, OK?
That's maths. Engineering includes how to respond to overflow (or the
potential thereof).
No, engineering is how to *avoid* overflows.
Agreed.
Fine, Dmitry. You try to write code to avoid
A * B
overflowing before you execute the multiply.
Would it make sense to ask a baker to "mix two ingredients"? The baker would want to know what they are, what the quantities are, perhaps how
well they need to be mixed. The same applies in programming. It makes
no sense to say "take two things and multiply them, avoiding overflow"
with no concept of what the things are, what types, what ranges within
those types, what outputs you want, what language, and so on.
The answer could be "there's no way it could ever overflow", or "use __builtin_mul_overflow", or "use a bigger type", or many other
possibilities.
In the real world, in real programming tasks, you usually have a lot
more information than just "a number". Maybe "A" is the number of kids
in a school class and "B" is the number of bikes they own - you can
avoid overflow by using 32-bit integers and multiplying because you
already know there are not two billion bikes in the class.
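Where the operands really are unconstrained run-time data, a C sketch of the __builtin_mul_overflow option looks like this (a GCC/Clang extension, not standard C):

#include <limits.h>
#include <stdio.h>

int main(void)
{
    int a = INT_MAX, b = 2, r;

    /* the builtin returns true if the exact product does not fit
       in r; r then holds the wrapped result */
    if (__builtin_mul_overflow(a, b, &r))
        puts("overflow detected");
    else
        printf("%d\n", r);
    return 0;
}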
On 05/10/2021 23:21, Andy Walker wrote:
On 05/10/2021 20:45, Bart wrote:
Well, mathematics cheats a little. Because A + B in maths is just:
A + B
It usually doesn't need to evaluate it, so overflow is irrelevant!
But suppose, given some concrete values for A and B, it DID need to
evaluate A + B into a concrete result.
Others have partially addressed this point. But there are
several other partial answers:
(a) This [suitably generalised] is the entire purpose of the
branch of mathematics called "numerical analysis"*. We
needed to get concrete results long before we had computing
machines.
My approach to solving problems would be computation or trial and error rather than doing things analytically, for which I just don't have the ability.
I sometimes like to solve puzzles, but I'm not good enough to do it
manually, and don't have the maths skills to do it that way.
So I use brute force, with a computer program, if I think it is
practical in a reasonable time.
Here's an example of a puzzle where you have to fit different pieces
into an outlined grid; this shows one solution:
https://github.com/sal55/langs/blob/master/delta.png
(I've made the pieces different colours.)
Is what I did to solve this (aside from designing and implementing the language used) maths? Not in my view, as I was trying to avoid using it.
But apparently it was.
On 06/10/2021 16:10, David Brown wrote:
[Bart:] You never came across functions like Dirac's delta function [...]
That looks like a useful function!
Yes, it's a great one - it's useful in many cases. It is the derivative
of the step function, which is another useful function defined using
conditionals. [...]
I wonder whether this is the right place to point out to our
readers that step functions are uncomputable? Of course, that could
be the opportunity for some to decide that "uncomputable" is a daft concept; OTOH, no-one seems to have proposed anything better.
On 06/10/2021 07:20, David Brown wrote:
On 05/10/2021 21:13, James Harris wrote:
On 05/10/2021 18:35, David Brown wrote:
...
abs(x) = ⎧ x, if x >= 0
⎨
⎩ -x, if x < 0
...
But something else stood out particularly in the context of this
subthread. The definition of abs(x) would fail on a computer if it were
using 2's complement representation and x was the most negative number.
It's a classic case in point where the computer won't follow the
accepted mathematical definition.
These were examples of function definitions with conditionals in common
mathematics using real numbers - they cannot be implemented directly in
computer code. If you want a mathematical definition of "abs" for fixed
size integer types in a programming language, you must adapt it to a
different mathematical definition that is suitable for the domain you
are using (i.e., the input and output sets are the range of your
computer type, rather than the real numbers). It is, however, still
maths. Two possibilities for n-bit two's complement signed integers
could be :
abs(x) = ⎧ x, if x >= 0
⎨ -x, if x < 0 and x > int_min
⎩ int_min, if x = int_min
Yes. I would consider that a valid and correct definition given the
criteria. It describes what a programmer can expect from a computer's
abs function (again, given the criteria).
I would add, however, that it describes something which is not the mathematical |x| or 'absolute value'. Instead, it /uses/ mathematics to describe what happens in different scenarios. But it does not implement
a mathematical abs operation, because a computer cannot.
Again, I don't think we disagree on the substance so much as the nomenclature, so I don't see a need to pursue this further, but I will append an anecdote.
I remember a documentary about Charles Babbage in which he showed some
dinner guests an early version of one of his machines. In the
documentary Babbage had the machine generate a series of numbers but one
of the numbers did not fit the mathematical series: for the presumed computation that value was mathematically incorrect. Babbage explained
that that was the point: with his machine such step-outs could be
configured by him as the machine's controller.
Now, one could think of him as making an excuse for an incorrect
computation but presuming he really did mean that to happen I see it as similar to the abs(int_min) case: a step-out from mathematics put in by
an engineer.
On 07/10/2021 11:52, David Brown wrote:
On 06/10/2021 18:15, James Harris wrote:
On 06/10/2021 16:39, Dmitry A. Kazakov wrote:
On 2021-10-06 16:56, James Harris wrote:
On 05/10/2021 21:01, Dmitry A. Kazakov wrote:
On 2021-10-05 21:36, James Harris wrote:
On 04/10/2021 11:26, Dmitry A. Kazakov wrote:
No, overflow is outside the model. Inside the model computer
integer arithmetic is associative.
Then the model is inadequate - and that's partly my point.
Only if you deploy it falsely. This is the core of engineering. Solid mechanics and strength of materials is inadequate in the general case, but works perfectly well for building bridges.
Do not overflow your numbers, OK?
That's maths. Engineering includes how to respond to overflow (or the potential thereof).
No, engineering is how to *avoid* overflows.
Agreed.
Fine, Dmitry. You try to write code to avoid
A * B
overflowing before you execute the multiply.
Would it make sense to ask a baker to "mix two ingredients"? The baker
would want to know what they are, what the quantities are, perhaps how
well they need to be mixed. The same applies in programming. It makes
no sense to say "take two things and multiply them, avoiding overflow"
with no concept of what the things are, what types, what ranges within
those types, what outputs you want, what language, and so on.
The answer could be "there's no way it could ever overflow", or "use
__builtin_mul_overflow", or "use a bigger type", or many other
possibilities.
In the real world, in real programming tasks, you usually have a lot
more information than just "a number". Maybe "A" is the number of kids
in a school class and "B" is the number of bikes they own - you can
avoid overflow by using 32-bit integers and multiplying because you
already know there are not two billion bikes in the class.
Real programs may also have to end up multiplying two values that are
runtime inputs. They may not have any constraints either.
Example: a calculator program. Or a compiler that reduces constant expressions.
Then it is up to you what degree of quality of implementation you want
to apply.
To do something about it, you might need to look at what features the language provides. If none, then it is up to your application code.
If you are creating the language to write the application (even the one
used to write that compiler), then you have to decide how much more complicated and difficult you want to make both language and implementation, to
solve what is in reality a minor problem.
You can go to a LOT of trouble so that if someone types 2**3**4**5, it
will do something sensible. That can mean diverting attention from much
more productive matters.
On 06/10/2021 00:04, Bart wrote:
On 05/10/2021 18:35, David Brown wrote:
On 05/10/2021 17:47, Bart wrote:
On 05/10/2021 15:13, David Brown wrote:
or do you not understand what mathematics is?
Maybe I don't. I stopped 'getting it' since I started being involved
with computers.
But you still feel qualified to argue that computing is not mathematics, or that computer operations are not mathematically defined?
I don't know why everyone seems determined to question my credentials.
Not everyone! IMO you are correct and they are wrong.
I don't necessarily disagree with David. My point is that (largely
because of fixed-size integers and representation issues) computers do
not implement the simple mathematics that Dmitry seemed to suggest in
his earlier post when he spoke about using natural numbers (which is, I think, where this subthread originated). David's point is that computer arithmetic can be mathematically defined. Those two statements are not actually in conflict.
What I do think is wrong is other people insisting on their definitions
when they just have a different viewpoint.
On 06/10/2021 18:15, James Harris wrote:
[... overflow ...]
On 06/10/2021 16:39, Dmitry A. Kazakov wrote:
On 04/10/2021 19:19, David Brown wrote:
On 04/10/2021 10:39, James Harris wrote:
On 03/10/2021 23:14, David Brown wrote:
On 03/10/2021 22:20, James Harris wrote:
...
1. Integer arithmetic where all values - including intermediate results - remain in range for the data type. In this, the computer implements normal mathematics.
2. Integer arithmetic where either a result or an intermediate value does not fit in the range assigned. For these a decision has to be made (by hardware, by language or by compiler) as to what to do with the non-compliant value. As you say, there are various options but they have to be cast semantically in terms of "if this happens then do that" rather than following the normal rules of mathematics. Worse, exactly where the limits apply can even depend on implementation.
And it is all defined mathematically.
We are talking finite sets with partial operations (for C-style signed integers) or closed operations (for C-style unsigned integers), rather than infinite sets, but it is all mathematics.
OK, then how would you define integer computing's
A - B
in terms of mathematics?
OK, then how would you define integer computing's
A / B
in terms of mathematics?
Mathematics on standard integers doesn't define 1/0. Mathematics on a finite set for C-style signed integers leaves a lot more values undefined on more operations. It doesn't mean it is not mathematical.
No need to reply, but I'd suggest to you that because of the limits of fixed-size computer representation, both of those are much more complex than
just 'mathematics'!
I'd agree that they are somewhat complicated by the limited sizes of
fixed-size types - but they are still just "mathematics".
How about:
These are good but I would dispute that they are mathematical! Comments below.
1. For a given fixed size of computer integer, "A - B" is defined as the
result of normal mathematical integer subtraction as long as that result
fits within the type.
(That's basically how C defines it, if you stick to "int" or ignore the
promotion stuff.)
As you say, that's only partial. Fine for a limited domain, though the
domain would be hard to specify; and it's incomplete due to the limited domain.
or
2. "A - B" is defined as the result of normal mathematical integer
subtraction, reduced modulo 2^n as necessary to fit in the range of the
type.
(That's how "gcc -fwrapv" defines it.)
No mathematics that I am aware of has the concept of "to fit in the
range of the type" but maybe you know different.
or
3. "A - B" is defined as the result of mid(int_min, A - B, int_max).
That's an interesting one! I'm not sure what it means but it's
definitely interesting. ;-)
or
4. "A - B" is defined as either the result of normal integer subtraction
if that fits within the range of the type, or an exception condition
otherwise.
Again, "an exception condition" is surely not mathematics.
I put it to you that you are thinking like an engineer and producing definitions which are suitable for engineering. That's the right thing
to do, IMO, but it should be called engineering not mathematics.
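Whatever one calls it, the definitions are implementable. A rough C sketch of definitions 2 and 4 for a 32-bit type (definition 1 is plain "-" restricted to in-range results; definition 3, saturation, amounts to a clamp; C has no exceptions, so definition 4 is rendered here as a status return - names are illustrative):

#include <stdbool.h>
#include <stdint.h>

/* Definition 2: wrap, i.e. reduce modulo 2^32 (what gcc -fwrapv gives).
   Converting the unsigned result back to int32_t picks the two's
   complement reading of the bits (implementation-defined before C23,
   universal in practice). */
int32_t sub_wrap(int32_t a, int32_t b)
{
    return (int32_t)((uint32_t)a - (uint32_t)b);
}

/* Definition 4: the exact result, or an "exception condition" signalled
   by the return value. The subtraction is widened so it cannot itself
   overflow. */
bool sub_checked(int32_t a, int32_t b, int32_t *out)
{
    int64_t wide = (int64_t)a - (int64_t)b;
    if (wide < INT32_MIN || wide > INT32_MAX)
        return false;              /* the exceptional output */
    *out = (int32_t)wide;
    return true;
}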
...
Unfortunately, many programming tutorials encourage programmers to
simply assume that any value is 'large enough' and so will behave
according to the normal rules of mathematics. But as everyone here
knows, that is not always the case, as was shown in the example I saw discussed recently of
255 + 1
what that results in is decided by what I would call 'engineering',
and
not by the normal rules of mathematics.
Engineering is about applying the mathematical (and perhaps physical,
chemical, etc.) laws to practical situations. An engineer who does not understand that there is a mathematical basis for what they do is in the wrong profession. (I certainly don't mean that they should understand the mathematics involved - but they should understand that there /is/
mathematics involved, and that the mathematics is what justifies the
rules and calculations they apply.)
Well, I would say that engineering includes being aware of and
accommodating limits - including those limits where simple mathematics
breaks down and no longer applies. YMMV.
Mathematics doesn't break down. That's the point.
I said "simple mathematics" breaks down.
In fact, it breaks down so much that even something as simple as plain subtraction is better described by an algorithm.
On 05/10/2021 21:55, James Harris wrote:
No mathematics that I am aware of has the concept of "to fit in the
range of the type" but maybe you know different.
It's just modulo arithmetic.
3. "A - B" is defined as the result of mid(int_min, A - B, int_max).
That's an interesting one! I'm not sure what it means but it's
definitely interesting. ;-)
It is saturation - if the result of "A - B" with normal (infinite)
integer arithmetic is outside the range of the type, it gets saturated
to the limit of the type.
Again, "an exception condition" is surely not mathematics.
Surely it is.
You would have the exception as part of the set of outputs for the function.
The fact is that mathematics is not a programming language, otherwise
we would all be coding in it. Real code has a dynamic element that is
missing from maths.
And if source code was mathematics, then you wouldn't need to run it
(saving the bother of writing compilers and implementations, or even
needing to buy a computer); you'd just look at it!
On 06/10/2021 13:08, Bart wrote:
These are somewhat unsatisfactory.
So, the reality is a bit more involved. But it depends on what the
purpose of your mathematical definitions is: is it just to 'look
pretty'; or is it informal user documentation; or would it actually be
input to some compiler generator?
They can certainly be used in code generation, such as for code
optimisation and simplification. They can be used in proving code correctness (which can sometimes be done automatically). They can be
used to reason about code and algorithms, or document them, or specify them.
On 07/10/2021 11:52, David Brown wrote:
On 06/10/2021 18:15, James Harris wrote:
[... overflow ...]
On 06/10/2021 16:39, Dmitry A. Kazakov wrote:
In the early days of computing, array-bound checks and
overflow checks were automatic unless you actively turned them
off. These days, you ought to be able to add [eg] use of
uninitialised variables, use of storage after "free", following
a null pointer, and probably others. It used to be important
to be able to turn them off, as they dragged in extra code and
extra time into limited storage and run-time. These days, such
factors really, really don't impinge on almost all normal work.
For all but a tiny proportion of work, time is dominated by
disc/network transfers or waiting for the user to type/click,
and space by large data structures rather than a bit of extra
code -- esp when errors are detected by hardware interrupt
rather than by user checks.
Perhaps we should get back closer to the old days?
If your program has a bug, would you rather that the program
stops when the bug is first manifest, or that it continues
until something more catastrophic happens? Or, even worse,
that it continues but gives wrong results with no indication?
How much developer/user time is wasted dealing with malware
that couldn't exist in a safer environment?
On 07/10/2021 14:02, David Brown wrote:
On 05/10/2021 21:55, James Harris wrote:
No mathematics that I am aware of has the concept of "to fit in the
range of the type" but maybe you know different.
It's just modulo arithmetic.
Modulo arithmetic seems to be mainly defined over a range that starts
from zero.
To have a range between any two limits, you need to start applying offsets.
The sort of modulo arithmetic used to give a free overflow pass to C's unsigned operations is much more rigid: the size of the (inclusive) range is a power of two, and it starts from 0.
Which I always thought was too specific; if someone wants modulo
behaviour in the range 1 to 100 inclusive (100+1 wraps to 1), it will
not have language support.
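A sketch of what such general-range wrapping could look like (illustrative C, not an existing language feature): wrap x into the inclusive range lo..hi, so that with lo = 1, hi = 100, the value 101 wraps to 1 and 0 wraps back to 100:

/* assumes hi > lo and that the intermediate arithmetic fits the type */
long long wrap_range(long long x, long long lo, long long hi)
{
    long long span = hi - lo + 1;
    long long r = (x - lo) % span;
    if (r < 0)                /* C's % can yield a negative remainder */
        r += span;
    return lo + r;
}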
3. "A - B" is defined as the result of mid(int_min, A - B, int_max).
That's an interesting one! I'm not sure what it means but it's
definitely interesting. ;-)
It is saturation - if the result of "A - B" with normal (infinite)
integer arithmetic is outside the range of the type, it gets saturated
to the limit of the type.
That's a confusing way of expressing it. Presumably 'mid' refers to the middle argument, but it won't be the middle numerically if outside the
range.
Also, if actually implementing code that does this, checking whether it
is in-range might be tricky if done /after/ you've evaluated A-B, if
using int-sized arithmetic.
I use an actual operator called clamp, where your example becomes:
clamp(A-B, int.min, int.max)
But for those limits, the calculation must be done with a type of wider
range than int. (Clamp is defined on top of min/max ops.)
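A C sketch of that clamp arrangement (names illustrative; the point is that A-B is widened to 64 bits before the comparisons, so the clamp itself cannot overflow):

#include <stdint.h>
#include <limits.h>

static int64_t min64(int64_t a, int64_t b) { return a < b ? a : b; }
static int64_t max64(int64_t a, int64_t b) { return a > b ? a : b; }

/* clamp defined on top of the min/max ops */
static int64_t clamp64(int64_t x, int64_t lo, int64_t hi)
{
    return max64(lo, min64(x, hi));
}

/* saturating subtraction: clamp(A-B, int.min, int.max) */
int sat_sub(int a, int b)
{
    return (int)clamp64((int64_t)a - (int64_t)b, INT_MIN, INT_MAX);
}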
Again, "an exception condition" is surely not mathematics.
Surely it is.
You would have the exception as part of the set of outputs for the
function.
So, how do you get from deep inside one formula to one elsewhere, or can mathematics also define 'goto'?
The fact is that mathematics is not a programming language, otherwise we would all be coding in it. Real code has a dynamic element that is
missing from maths.
And if source code was mathematics, then you wouldn't need to run it
(saving the bother of writing compilers and implementations, or even
needing to buy a computer); you'd just look at it!
On 07/10/2021 15:59, Bart wrote:
Which I always thought was too specific; if someone wants modulo
behaviour in the range 1 to 100 inclusive (100+1 wraps to 1), it will
not have language support.
I believe Ada has this kind of language support.
On 2021-10-07 22:10, David Brown wrote:
On 07/10/2021 15:59, Bart wrote:
Which I always thought was too specific; if someone wants modulo
behaviour in the range 1 to 100 inclusive (100+1 wraps to 1), it will
not have language support.
I believe Ada has this kind of language support.
No, Ada has only normal modular numbers. Though it would be easy to implement a user-defined numeric type with these properties, as you can override +, -, *, / etc.
On 06/10/2021 17:37, James Harris wrote:
On 06/10/2021 00:04, Bart wrote:
I don't know why everyone seems determined to question my credentials.
Not everyone! IMO you are correct and they are wrong.
I for one am only questioning /some/ of Bart's credentials in relation
to /some/ of the things he has said, not as a general point. For many
things in this group, his credentials include experience and
achievements well beyond anything I have done in practice. But we all
have occasions when we mistake our own subjective opinions for objective facts, or when we have strong opinions that are not based on knowledge
or experience.
On 07/10/2021 13:38, David Brown wrote:
On 06/10/2021 17:37, James Harris wrote:
On 06/10/2021 00:04, Bart wrote:
...
I don't know why everyone seems determined to question my credentials.
Not everyone! IMO you are correct and they are wrong.
I for one am only questioning /some/ of Bart's credentials in relation
to /some/ of the things he has said, not as a general point. For many
things in this group, his credentials include experience and
achievements well beyond anything I have done in practice. But we all
have occasions when we mistake our own subjective opinions for objective
facts, or when we have strong opinions that are not based on knowledge
or experience.
OT, but I never understand why, when discussing ideas, people care about credentials. Ideas should be judged on their merits, not on the qualifications of the person putting them forward.
In fact, if one wants to break new ground one really needs to invite
ideas which have not come from established schools of thought.
On 07/10/2021 11:52, David Brown wrote:
On 06/10/2021 18:15, James Harris wrote:
[... overflow ...]
On 06/10/2021 16:39, Dmitry A. Kazakov wrote:
In the early days of computing, array-bound checks and
overflow checks were automatic unless you actively turned them
off. These days, you ought to be able to add [eg] use of
uninitialised variables, use of storage after "free", following
a null pointer, and probably others. It used to be important
to be able to turn them off, as they dragged in extra code and
extra time into limited storage and run-time. These days, such
factors really, really don't impinge on almost all normal work.
For all but a tiny proportion of work, time is dominated by
disc/network transfers or waiting for the user to type/click,
and space by large data structures rather than a bit of extra
code -- esp when errors are detected by hardware interrupt
rather than by user checks.
Perhaps we should get back closer to the old days?
If your program has a bug, would you rather that the program
stops when the bug is first manifest, or that it continues
until something more catastrophic happens? Or, even worse,
that it continues but gives wrong results with no indication?
How much developer/user time is wasted dealing with malware
that couldn't exist in a safer environment?
On 07/10/2021 11:52, David Brown wrote:
On 06/10/2021 18:15, James Harris wrote:
On 06/10/2021 16:39, Dmitry A. Kazakov wrote:
On 2021-10-06 16:56, James Harris wrote:
That's maths. Engineering includes how to respond to overflow (or the potential thereof).
No, engineering is how to *avoid* overflows.
Fine, Dmitry. You try to write code to avoid
A * B
overflowing before you execute the multiply.
Real programs may also have to end up multiplying two values that are
runtime inputs. They may not have any constraints either.
On 06/10/2021 17:26, James Harris wrote:
On 06/10/2021 07:20, David Brown wrote:
On 05/10/2021 21:13, James Harris wrote:
On 05/10/2021 18:35, David Brown wrote:
...
abs(x) = ⎧ x, if x >= 0
⎨
⎩ -x, if x < 0
...
But something else stood out particularly in the context of this
subthread. The definition of abs(x) would fail on a computer if it were using 2's complement representation and x was the most negative number. It's a classic case in point where the computer won't follow the
accepted mathematical definition.
These were examples of function definitions with conditionals in common
mathematics using real numbers - they cannot be implemented directly in
computer code. If you want a mathematical definition of "abs" for fixed size integer types in a programming language, you must adapt it to a
different mathematical definition that is suitable for the domain you
are using (i.e., the input and output sets are the range of your
computer type, rather than the real numbers). It is, however, still
maths. Two possibilities for n-bit two's complement signed integers
could be :
abs(x) = ⎧ x, if x >= 0
⎨ -x, if x < 0 and x > int_min
⎩ int_min, if x = int_min
Yes. I would consider that a valid and correct definition given the
criteria. It describes what a programmer can expect from a computer's
abs function (again, given the criteria).
What criteria?
On 08/10/2021 14:35, James Harris wrote:
On 07/10/2021 13:38, David Brown wrote:
On 06/10/2021 17:37, James Harris wrote:
On 06/10/2021 00:04, Bart wrote:
...
I don't know why everyone seems determined to question my credentials.
Not everyone! IMO you are correct and they are wrong.
I for one am only questioning /some/ of Bart's credentials in relation
to /some/ of the things he has said, not as a general point. For many
things in this group, his credentials include experience and
achievements well beyond anything I have done in practice. But we all
have occasions when we mistake our own subjective opinions for objective facts, or when we have strong opinions that are not based on knowledge
or experience.
OT but I never understand that when discussing ideas people care about
credentials. Ideas should be judged on their merits, not on the
qualifications of the person putting them forward.
Ideas can stand on their own merits. The relevance of discussing them,
the effort you put into the discussion, and the weight or consideration
you give to ideas can depend on qualifications or credentials (and I
mean these terms in a general sense - not formal academic qualifications).
So if you have a sore tooth, and know nothing about dentistry, you might solicit ideas and opinions from several people. A professional dentist
might suggest you need a filling.
In fact, if one wants to break new ground one really needs to invite
ideas which have not come from established schools of thought.
That's fine - as long as you understand that the /vast/ majority of such ideas will be wrong.
Being educated and/or experienced in a field is no
guarantee that you will be right or have the best ideas, but it is a
pretty good guide in practice.
On 08/10/2021 14:35, James Harris wrote:
If I am listening in to a discussion on a topic that I know little about
(I can't think of many examples - football, perhaps :-) ) and am not in
a position to judge the merit of two competing ideas, then the qualifications of the people putting them forward help one judge.
So if you have a sore tooth, and know nothing about dentistry, you might solicit ideas and opinions from several people. A professional dentist
might suggest you need a filling. A toothologist might suggest tying a
live frog to your jaw for a couple of days. You don't know which is the
best idea, so you look at the qualifications. The dentist has a
professional qualification - clearly he is just after your money. The toothologist is quoting from Pliny the Elder - a naturalist and author
who is still well known after two thousand years. Obviously you judge the toothologist's idea as the better qualified.
In fact, if one wants to break new ground one really needs to invite
ideas which have not come from established schools of thought.
That's fine - as long as you understand that the /vast/ majority of such ideas will be wrong. Being educated and/or experienced in a field is no guarantee that you will be right or have the best ideas, but it is a
pretty good guide in practice.
In the early days of computing, array-bound checks and overflow checks were automatic [...].
Perhaps we should get back closer to the old days?
If your program has a bug, would you rather that the program stops when the bug is first manifest, [...]?
I appreciate your point, but disagree somewhat.
First off, I agree that for a lot of code, run-time speed of the code
itself is (or should be) a minor concern. But the answer is not to have run-time checking in a language like C - the answer is not to use a
language like C (or Ada, or Java, or C++) for such tasks. Rather,
languages like Python or other higher level languages should be used -
with the language choice depending on the type of task.
Then there
simply isn't a question of overflows of integers or buffers, and array
bound errors are caught at run-time (though most opportunities for such errors are avoided by having proper strings, high-level structures like hashmaps and queues, etc.).
Secondly, I /don't/ want the bug to be found as soon as it manifests
itself at run-time. I want it to be found /before/ run-time.
And one
of the things that lets C and C++ tools mark something like integer
overflow as an error (if it can be seen at compile-time) is precisely
the fact that it is undefined behaviour. If there is a defined
behaviour - including throwing a run-time exception of some sort - then
the compiler or more advanced static analysis can't stop you and say
it's a mistake.
Thirdly, modern C and C++ (and Ada and other language) tools /do/
support checking at run-time. People just have to choose to use them.
(And again, these rely on undefined behaviour.)
I do agree that hiding the mistake and carrying on is often the worst
option. You get that when you try to define the behaviour of everything
in a language.
The number of wrong ideas doesn't matter much. It's fairly easy to
filter them out.
It's *far* better to encourage free thinking (and to filter what comes
back) than to discourage free thinking by lauding tradition.
On 07/10/2021 19:57, David Brown wrote:
[I wrote:]
In the early days of computing, array-bound checks and overflow checks were automatic [...].
Perhaps we should get back closer to the old days?
If your program has a bug, would you rather that the program stops when the bug is first manifest, [...]?
I appreciate your point, but disagree somewhat.
First off, I agree that for a lot of code, run-time speed of the code
itself is (or should be) a minor concern. But the answer is not to have
run-time checking in a language like C - the answer is not to use a
language like C (or Ada, or Java, or C++) for such tasks. Rather,
languages like Python or other higher level languages should be used -
with the language choice depending on the type of task.
Yes, but that is putting the responsibility in the wrong
place. Programmers know C [or think they do]; no point telling
them to use Python [which by hypothesis they don't know] instead.
Further, almost all of the software I use was written by someone
else; I have no control over how it was written. The best I can
hope for, [somewhat] realistically, is that my computer will tell
me when that software does something buggy. For example, if my
computer has a hardware interrupt when an overflow occurs, that
overflow cannot sneak through undetected. Computers of the '60s
[speaking generalities] had that; in the '70s, it was replaced
by setting overflow flags, which had to be tested for after each
arithmetic operation. When space and time were tight, testing
went by the board; in my experiments on the ICL 1906A, adding
overflow checks to a program slowed it down by 32% and cost a
significant amount of [tight] storage. By contrast, on Atlas,
the hardware interrupt was essentially a free service to all
programs. Guess what happened on the '6A, and even more so on
the PDP-11 when Unix came along.
Given suitable hardware, most of the more egregious
errors can be caught "free"; it just requires a co-operating
computer that controls and checks array accesses, storage
management, pointers out of range, reading uninitialised
storage and so on. Then it becomes more expensive to by-pass
the checks than just to use them [and eliminate most of the
malware that relies on exploiting their lack].
On 05/10/2021 21:55, James Harris wrote:
On 04/10/2021 19:19, David Brown wrote:
3. "A - B" is defined as the result of mid(int_min, A - B, int_max).
That's an interesting one! I'm not sure what it means but it's
definitely interesting. ;-)
It is saturation - if the result of "A - B" with normal (infinite)
integer arithmetic is outside the range of the type, it gets saturated
to the limit of the type.
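A minimal C sketch of that clamping idea (sat_sub32 is an invented name
for illustration): compute the exact result in a wider type, then pin
it to the limits of the narrower one.

#include <stdint.h>

int32_t sat_sub32(int32_t a, int32_t b)
{
    int64_t r = (int64_t)a - (int64_t)b;   /* exact in 64 bits */
    if (r > INT32_MAX) return INT32_MAX;   /* clamp at the top */
    if (r < INT32_MIN) return INT32_MIN;   /* clamp at the bottom */
    return (int32_t)r;
}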
Algorithms are mathematical recipes. You can't escape :-)
So, all those checks are really largely a waste of time.
The language can help by providing some explicit way of doing it, for example:
(c, overflow) := checkedadd(a, b)
But it should not interfere with:
c := a + b
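For what it's worth, today's C can already express roughly that shape
via a compiler builtin (GCC/Clang; the wrapper name checkedadd simply
mirrors the proposal above):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

bool checkedadd(int64_t a, int64_t b, int64_t *c)
{
    return __builtin_add_overflow(a, b, c);   /* true if it overflowed */
}

int main(void)
{
    int64_t c;
    if (checkedadd(INT64_MAX, 1, &c))
        printf("overflow\n");
    else
        printf("%lld\n", (long long)c);
    return 0;
}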
On 07/10/2021 14:02, David Brown wrote:
On 05/10/2021 21:55, James Harris wrote:
On 04/10/2021 19:19, David Brown wrote:
...
3. "A - B" is defined as the result of mid(int_min, A - B, int_max).
That's an interesting one! I'm not sure what it means but it's
definitely interesting. ;-)
It is saturation - if the result of "A - B" with normal (infinite)
integer arithmetic is outside the range of the type, it gets saturated
to the limit of the type.
OK, though if it's saturation why is it called mid?
I have an interpreter that does all sorts of runtime checks, [...].
Outside of development, it is very rare for a production program to
trigger an error. (And I've had past versions working hours each day
at 1000 customer sites.)
So, all those checks are really largely a waste of time.
Perhaps this is the kind of reasoning that led to the switch from
interrupt handling to setting flags; I don't know what rationale
applied there.
(Flags can often be /usefully/ triggered, and might
have a lot fewer overheads than servicing an interrupt.)
And also, why some language implementations offer runtime checking
for debug versions, that can be disabled for production versions.
Checking overflow is useful in certain situations where the numbers
are runtime data like my calculator and compiler examples, but there
I think that check belongs in user-code, in the application, and not
be built into all the language's arithmetic ops, especially without
proper means of dealing with the overflow when it happens.
Interrupts and signals are heavy ways of doing that!
The language can help by providing some explicit way of doing it, for example:
(c, overflow) := checkedadd(a, b)
But it should not interfere with:
c := a + b
On 08/10/2021 18:27, Bart wrote:
I have an interpreter that does all sorts of runtime checks, [...].
Outside of development, it is very rare for a production program to
trigger an error. (And I've had past versions working hours each day
at 1000 customer sites.)
Key phrase: "outside of development". In that case, there ideally ought to be /no/ errors triggered. But, of course, that
depends on having perfect development. You may be a near-perfect developer; the evidence from the real world is that many are not,
esp given the pressures to get software out of the door ASAP, and
hang the final testing when deadlines loom.
So, all those checks are really largely a waste of time.
Yes. In the same way that having seat belts in my car is
"largely" a waste of time. In fact, so far it has been a complete
and utter waste of time. Likewise, insuring my house; and many
other precautionary activities. In fact, no-one has ever broken
into my computer or stolen my credit-card details [AFAIK!], so it
is a waste of time having passwords and PINs. Or perhaps not.
Much malware derives from exploiting things that ought to
be unexploitable.
Checking overflow is useful in certain situations where the numbers
are runtime data like my calculator and compiler examples, but there
I think that check belongs in user-code, in the application, and not
be built into all the language's arithmetic ops, especially without
proper means of dealing with the overflow when it happens.
But it shouldn't happen! If there is a genuine need for the
sort of thing you mentioned [getting the bottom 64 bits of a 64-bit
by 64-bit multiply], then it should be met by suitable instructions
at the machine-code level, not by switching off checks. [Eg, IIRC,
Atlas had a double-length accumulator; normally a f-p multiply
returned the top half after normalisation, but you could ask for
an un-normalised multiply and then read out the bottom half.]
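For instance (a sketch assuming a GCC/Clang 64-bit target, where the
__int128 extension exists), both halves of a 64x64 multiply are
available without any overflow question arising:

#include <stdint.h>

void mul64x64(uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo)
{
    unsigned __int128 p = (unsigned __int128)a * b;   /* exact product */
    *lo = (uint64_t)p;           /* bottom 64 bits */
    *hi = (uint64_t)(p >> 64);   /* top 64 bits */
}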
Interrupts and signals are heavy ways of doing that!
An interrupt is, in normal use, free in terms of the code
and of the time taken, /unless/ it is triggered, which /normally/
means that your program has just done something rather bad.
The language can help by providing some explicit way of doing it, for
example:
(c, overflow) := checkedadd(a, b)
What normal person is going to write that? After all,
/my/ code is bug-free, so that's a waste of time and space as
well as unnecessary typing. [Ha!]
On 08/10/2021 15:09, David Brown wrote:
On 08/10/2021 14:35, James Harris wrote:
...
If I am listening in to a discussion on a topic that I know little about
(I can't think of many examples - football, perhaps :-) ) and am not in
a position to judge the merit of two competing ideas, then the
qualifications of the people putting them forward help you judge.
I disagree big time. I can see where you are coming from but IMO such an approach is horribly limiting. I'd suggest instead to look for logic in
each argument and to develop a feel for lines of enquiry that might lead somewhere useful. The best answer may be neither of those presented but something in one argument or the other may trigger a new way of thinking which results in a useful direction of travel.
So if you have a sore tooth, and know nothing about dentistry, you might
solicit ideas and opinions from several people. A professional dentist
might suggest you need a filling. A toothologist might suggest tying a
live frog to your jaw for a couple of days. You don't know which is the
best idea, so you look at the qualifications. The dentist has a
professional qualification - clearly he is just after your money. The
toothologist is quoting from Pliny the Elder - a naturalist and author
who is still well known after two thousand years. Obviously you judge
the toothologist as the better qualified idea.
If you want to know what's best ask a toothsayer. :-)
In fact, if one wants to break new ground one really needs to invite
ideas which have not come from established schools of thought.
That's fine - as long as you understand that the /vast/ majority of such
ideas will be wrong. Being educated and/or experienced in a field is no
guarantee that you will be right or have the best ideas, but it is a
pretty good guide in practice.
The number of wrong ideas doesn't matter much. It's fairly easy to
filter them out.
It's *far* better to encourage free thinking (and to filter what comes
back) than to discourage free thinking by lauding tradition.
On 07/10/2021 12:32, David Brown wrote:
On 06/10/2021 17:26, James Harris wrote:
On 06/10/2021 07:20, David Brown wrote:
On 05/10/2021 21:13, James Harris wrote:
On 05/10/2021 18:35, David Brown wrote:
...
abs(x) = ⎧ x, if x >= 0
⎨
⎩ -x, if x < 0
...
But something else stood out particularly in the context of this
subthread. The definition of abs(x) would fail on a computer if it
were
using 2's complement representation and x was the most negative
number.
It's a classic case in point where the computer won't follow the
accepted mathematical definition.
These were examples of function definitions with conditionals in common
mathematics using real numbers - they cannot be implemented directly in
computer code. If you want a mathematical definition of "abs" for
fixed
size integer types in a programming language, you must adapt it to a
different mathematical definition that is suitable for the domain you
are using (i.e., the input and output sets are the range of your
computer type, rather than the real numbers). It is, however, still
maths. Two possibilities for n-bit two's complement signed integers
could be :
abs(x) = ⎧ x, if x >= 0
⎨ -x, if x < 0 and x > int_min
⎩ int_min, if x = int_min
Yes. I would consider that a valid and correct definition given the
criteria. It describes what a programmer can expect from a computer's
abs function (again, given the criteria).
What criteria?
Those above: 2's complement, etc.
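In C, the second definition quoted above comes out as something like
this (a sketch for a 32-bit int; abs_sat is an invented name):

#include <limits.h>

int abs_sat(int x)
{
    if (x >= 0)      return x;    /* x, if x >= 0 */
    if (x > INT_MIN) return -x;   /* -x, if x < 0 and x > int_min */
    return INT_MIN;               /* int_min, if x = int_min */
}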
On 07/10/2021 14:02, David Brown wrote:
On 05/10/2021 21:55, James Harris wrote:
On 04/10/2021 19:19, David Brown wrote:
...
3. "A - B" is defined as the result of mid(int_min, A - B, int_max).
That's an interesting one! I'm not sure what it means but it's
definitely interesting. ;-)
It is saturation - if the result of "A - B" with normal (infinite)
integer arithmetic is outside the range of the type, it gets saturated
to the limit of the type.
OK, though if it's saturation why is it called mid?
Indeed. But you also get it when the tools you [rightly]
describe aren't used; IOW when slapdash programmers are let loose
on important projects.
You might think that. Have a look at the world around you, and see the number of people who think magnetic bracelets improve their blood circulation, or organic food is healthier, or that their favourite state leader or candidate really cares about them, or that you can't get
pointer or memory errors if you switch to Rust.
On 2021-10-09 12:43, David Brown wrote:
You might think that. Have a look at the world around you, and see the
number of people who think magnetic bracelets improve their blood
circulation, or organic food is healthier, or that their favourite state
leader or candidate really cares about them, or that you can't get
pointer or memory errors if you switch to Rust.
OT
Central and East European medicine beginning with 30's invented a huge
number of physiotherapeutic medical procedures like electrophoresis,
exposing blood to the UV light, ultrasound warming etc. Magnets are an accessible and inexpensive remnant of that glorious epoch! (:-))
BTW, German health insurances still pay for some of these, and some do
for homeopathy as well.
On 09/10/2021 13:18, Dmitry A. Kazakov wrote:
On 2021-10-09 12:43, David Brown wrote:
You might think that. Have a look at the world around you, and see the
number of people who think magnetic bracelets improve their blood
circulation, or organic food is healthier, or that their favourite state
leader or candidate really cares about them, or that you can't get
pointer or memory errors if you switch to Rust.
OT
Central and East European medicine beginning with 30's invented a huge
number of physiotherapeutic medical procedures like electrophoresis,
exposing blood to the UV light, ultrasound warming etc. Magnets are an
accessible and inexpensive remnant of that glorious epoch! (:-))
Nah - modern magnetic bracelets are based on the logic that your blood contains iron, and iron is magnetic, so magnetic bracelets are good for
your blood flow.
It's that simple - and that stupid.
On 08/10/2021 18:33, Andy Walker wrote:
[...]
Indeed. But you also get it when the tools you [rightly]
describe aren't used; IOW when slapdash programmers are let loose
on important projects.
This is the key point - program quality and bugs are a people problem,
more than a language or implementation problem. [...]
I don't know the answer to this one. But I think "add automatic safety checks to the language" is not it - that kind of thing reduces the
efficiency of the code and makes life harder for the good programmers,
but will not really make a significant difference for the poor
programmers.
What we need is a linter for the programmers, not the programs.
So, all those checks are really largely a waste of time.
Yes. In the same way that having seat belts in my car is
"largely" a waste of time. In fact, so far it has been a complete
and utter waste of time. Likewise, insuring my house; and many
other precautionary activities. In fact, no-one has ever broken
into my computer or stolen my credit-card details [AFAIK!], so it
is a waste of time having passwords and PINs. Or perhaps not.
Not a good analogy. The probabilities are very different with a huge
amount of environmental factors that are different on each journey,
or each day your house/cards are at risk.
Running the same code each time should give the expected results
provided consideration has been given to kinds of inputs expected. If
the inputs are likely to be wild, then use the explicit checking I
show below.
Much malware derives from exploiting things that ought to
be unexploitable.
If the idea is to protect against malicious attacks, then detecting
that a number is bigger than approx 9000000000000000000 is not going
to help much, if the number is expected to be no bigger than 90!
So, then what?
The language can help by providing some explicit way of doing it, for
example:
(c, overflow) := checkedadd(a, b)
What normal person is going to write that? After all,
/my/ code is bug-free, so that's a waste of time and space as
well as unnecessary typing. [Ha!]
You /don't/ write that in normal code. Only when inputs are unknown.
On 2021-10-09 15:21, David Brown wrote:
On 09/10/2021 13:18, Dmitry A. Kazakov wrote:
On 2021-10-09 12:43, David Brown wrote:
You might think that. Have a look at the world around you, and see the
number of people who think magnetic bracelets improve their blood
circulation, or organic food is healthier, or that their favourite
state
leader or candidate really cares about them, or that you can't get
pointer or memory errors if you switch to Rust.
OT
Central and East European medicine beginning with 30's invented a huge
number of physiotherapeutic medical procedures like electrophoresis,
exposing blood to the UV light, ultrasound warming etc. Magnets are an
accessible and inexpensive remnant of that glorious epoch! (:-))
Nah - modern magnetic bracelets are based on the logic that your blood
contains iron, and iron is magnetic, so magnetic bracelets are good for
your blood flow.
You think that other physiotherapeutic methods are better justified?
80% of medicine is trial-and-error stuff without knowing anything about
the underlying processes. In many cases it is not possible to do
statistically meaningful experiments, e.g. in the case of magnets: you
simply do not know what to look for, or what the test group is, if you
wanted to check for any effects.
Yes, an educated guess would be: no effect. But this is only a guess.
So, do not judge people too harshly.
It's that simple - and that stupid.
Exposing blood to UV light is supposed to boost the immune system;
how is that more clever/stupid than exposing it to a magnetic field?
On 09/10/2021 00:52, Bart wrote:
So, all those checks are really largely a waste of time.
Yes. In the same way that having seat belts in my car is
"largely" a waste of time. In fact, so far it has been a complete
and utter waste of time. Likewise, insuring my house; and many
other precautionary activities. In fact, no-one has ever broken
into my computer or stolen my credit-card details [AFAIK!], so it
is a waste of time having passwords and PINs. Or perhaps not.
Not a good analogy. The probabilities are very different with a huge
amount of environmental factors that are different on each journey,
or each day your house/cards are at risk.
Running the same code each time should give the expected results
provided consideration has been given to kinds of inputs expected. If
the inputs are likely to be wild, then use the explicit checking I
show below.
As so often, you are telling us what /you/ could, might or
do do in your own code. That tells us nothing about what checking
is done by [eg] Firefox or Gcc or ...; and life is too short for
each of us to go scrabbling to find and examine the source code for
all the software on our computers.
You /don't/ write that in normal code. Only when inputs are unknown.
But it is usual for inputs to be unknown. That's why they
are inputs! Again, /you/ may write careful checks when "inputs
are unknown", but you can't make the authors of mail agents, word
processors, search engines, ... do the same. Hardware support
/could/, in many cases, enforce, or at least encourage, them, and
would be /much/ cheaper [in terms of efficiency and reliability]
than language support.
On 09/10/2021 19:08, Dmitry A. Kazakov wrote:
Yes, an educated guess would be, no effect, but this is only a guess.
So, do not judge people too harshly.
You don't need to guess. You can measure, you can do research, you can
do statistics, you can do experiments.
Here's a clue - anything that claims to "boost your immune system",
doesn't work. /Nothing/ will boost it if it is working properly, though there are plenty of ways to /weaken/ your immune system (nutritional deficiencies, stress, cold, infections, etc.).
And it's a good thing
that no "immune system booster" works, because you don't want it boosted
- that would be a guarantee of anaphylactic shock, autoimmune diseases, allergies, and many other unpleasantnesses.
On 08/10/2021 17:52, James Harris wrote:
On 07/10/2021 12:32, David Brown wrote:
On 06/10/2021 17:26, James Harris wrote:
On 06/10/2021 07:20, David Brown wrote:
On 05/10/2021 21:13, James Harris wrote:
On 05/10/2021 18:35, David Brown wrote:
...
abs(x) = ⎧ x, if x >= 0
⎨
⎩ -x, if x < 0
...
But something else stood out particularly in the context of this
subthread. The definition of abs(x) would fail on a computer if it were
using 2's complement representation and x was the most negative
number.
It's a classic case in point where the computer won't follow the
accepted mathematical definition.
These were examples of function definitions with conditionals in common
mathematics using real numbers - they cannot be implemented directly in
computer code. If you want a mathematical definition of "abs" for fixed
size integer types in a programming language, you must adapt it to a
different mathematical definition that is suitable for the domain you
are using (i.e., the input and output sets are the range of your
computer type, rather than the real numbers). It is, however, still
maths. Two possibilities for n-bit two's complement signed integers
could be :
abs(x) = ⎧ x, if x >= 0
⎨ -x, if x < 0 and x > int_min
⎩ int_min, if x = int_min
Yes. I would consider that a valid and correct definition given the
criteria. It describes what a programmer can expect from a computer's
abs function (again, given the criteria).
What criteria?
Those above: 2's complement, etc.
Again - you are mixing implementation and specification.
On 09/10/2021 11:30, David Brown wrote:
On 08/10/2021 17:52, James Harris wrote:
On 07/10/2021 12:32, David Brown wrote:
On 06/10/2021 17:26, James Harris wrote:
On 06/10/2021 07:20, David Brown wrote:
On 05/10/2021 21:13, James Harris wrote:
On 05/10/2021 18:35, David Brown wrote:
...
abs(x) = ⎧ x, if x >= 0
⎨
⎩ -x, if x < 0
...
But something else stood out particularly in the context of this
subthread. The definition of abs(x) would fail on a computer if it were
using 2's complement representation and x was the most negative
number.
It's a classic case in point where the computer won't follow the
accepted mathematical definition.
These were examples of function definitions with conditionals in common
mathematics using real numbers - they cannot be implemented directly in
computer code. If you want a mathematical definition of "abs" for fixed
size integer types in a programming language, you must adapt it to a
different mathematical definition that is suitable for the domain you
are using (i.e., the input and output sets are the range of your
computer type, rather than the real numbers). It is, however, still
maths. Two possibilities for n-bit two's complement signed integers
could be :
abs(x) = ⎧ x, if x >= 0
         ⎨ -x, if x < 0 and x > int_min
         ⎩ int_min, if x = int_min
Yes. I would consider that a valid and correct definition given the
criteria. It describes what a programmer can expect from a computer's
abs function (again, given the criteria).
What criteria?
Those above: 2's complement, etc.
Again - you are mixing implementation and specification.
On the contrary, part of the specification of the problem, at least as I
had it in mind, was to define an abs which a computer might implement.
Therefore the representation was an essential part - even if only as a
model. Someone could pick a representation and define how abs would work
on that representation - which is what you seemed to have done, above.
The point being that the /mathematical/ |x| is not what most computers implement. Because most computers use 2's complement representation they cannot do so.
But we've discussed this somewhat to death. If you still believe
computers implement mathematics consider defining
sin(x)
as a computer might implement it. As you know, the result of that
function is necessarily not the sine of x but an approximation thereof.
And, yes, you could create a mathematical specification for what a
computer would do in all kinds of situation but it would still not be
the mathematical sine operation.
As I've said more than once, I don't think you and I agree on the
essentials. This is only about what one defines as 'mathematics' as
opposed, I would argue, to 'engineering'. As with the integer cases I
would call accounting for the sin(x) result's accuracy /engineering/
just as I would a loss of info at the top end; they are both due to limitations imposed by the representation.
On 2021-10-10 13:02, David Brown wrote:
On 09/10/2021 19:08, Dmitry A. Kazakov wrote:
Yes, an educated guess would be, no effect, but this is only a guess.
So, do not judge people too harshly.
You don't need to guess. You can measure, you can do research, you can
do statistics, you can do experiments.
Well, the point is that you cannot do any of these in a scientifically
meaningful way for such a broad category of effects and such diverse samples.
Here's a clue - anything that claims to "boost your immune system",
doesn't work. /Nothing/ will boost it if it is working properly, though
there are plenty of ways to /weaken/ your immune system (nutritional
deficiencies, stress, cold, infections, etc.).
One certainly can influence the immune system, e.g. by using
immunosuppressants and vaccinations.
And it's a good thing
that no "immune system booster" works, because you don't want it boosted
- that would be a guarantee of anaphylactic shock, autoimmune diseases,
allergies, and many other unpleasantnesses.
No, these are malfunctions of the immune system. "Boosting" means faster
and stronger immune response without false positives. Wear larger,
green, CO2 neutral magnets! (:-))
On 09/10/2021 11:30, David Brown wrote:
But we've discussed this somewhat to death. If you still believe
computers implement mathematics consider defining
sin(x)
as a computer might implement it. As you know, the result of that
function is necessarily not the sine of x but an approximation thereof.
And, yes, you could create a mathematical specification for what a
computer would do in all kinds of situation but it would still not be
the mathematical sine operation.
On 10/10/2021 16:10, James Harris wrote:
On 09/10/2021 11:30, David Brown wrote:
On 08/10/2021 17:52, James Harris wrote:
On 07/10/2021 12:32, David Brown wrote:
On 06/10/2021 17:26, James Harris wrote:
On 06/10/2021 07:20, David Brown wrote:
On 05/10/2021 21:13, James Harris wrote:
On 05/10/2021 18:35, David Brown wrote:
...
abs(x) = ⎧ x, if x >= 0
⎨
⎩ -x, if x < 0
...
But something else stood out particularly in the context of this
subthread. The definition of abs(x) would fail on a computer if it were
using 2's complement representation and x was the most negative
number.
It's a classic case in point where the computer won't follow the
accepted mathematical definition.
These were examples of function definitions with conditionals in common
mathematics using real numbers - they cannot be implemented directly in
computer code. If you want a mathematical definition of "abs" for fixed
size integer types in a programming language, you must adapt it to a
different mathematical definition that is suitable for the domain you
are using (i.e., the input and output sets are the range of your
computer type, rather than the real numbers). It is, however, still
maths. Two possibilities for n-bit two's complement signed integers
could be :
abs(x) = ⎧ x, if x >= 0
         ⎨ -x, if x < 0 and x > int_min
         ⎩ int_min, if x = int_min
Yes. I would consider that a valid and correct definition given the
criteria. It describes what a programmer can expect from a computer's
abs function (again, given the criteria).
What criteria?
Those above: 2's complement, etc.
Again - you are mixing implementation and specification.
On the contrary, part of the specification of the problem, at least as I
had it in mind, was to define an abs which a computer might implement.
The point of having a mathematical specification is that there is no
such thing as what you had "in mind" - it is /exact/.
But we've discussed this somewhat to death. If you still believe
computers implement mathematics consider defining
sin(x)
as a computer might implement it. As you know, the result of that
function is necessarily not the sine of x but an approximation
thereof. And, yes, you could create a mathematical specification for
what a computer would do in all kinds of situation but it would still
not be the mathematical sine operation.
As I've said more than once, I don't think you and I agree on the
essentials. This is only about what one defines as 'mathematics' as
opposed, I would argue, to 'engineering'.
On 10/10/2021 15:10, James Harris wrote [to DB]:
But we've discussed this somewhat to death. If you still believe
computers implement mathematics consider defining
sin(x)
as a computer might implement it. As you know, the result of that
function is necessarily not the sine of x but an approximation
thereof. And, yes, you could create a mathematical specification for
what a computer would do in all kinds of situation but it would still
not be the mathematical sine operation.
Further to Dmitry's reply -- As I have pointed out more than
once [recently in this thread] this whole topic is something that
every mathematician and engineer ought to understand, viz [part of]
numerical analysis. The ancient Greeks knew that; Newton was a
major contributor, and so were most of the other best mathematicians
of the 17th-19thC. Sadly, NA is, IME, rarely taught to programmers
these days, as it is thought to be too mathematical. But that's the
point -- it /is/ mathematics. Bart may not have studied it,
but shedloads of people did, esp in the early days of computing;
eg for one of the major early applications of computing, namely the
preparation of [error free!] mathematical tables.
Further, whereas you might be slightly lucky if
real pi = 4 * arctan (1);
print (sin(pi/3)^2)
printed an exact representation of 3/4 rather than [say] 0.7499...9,
you would be entitled to your money back if a symbolic algebra
package gave anything other than 3/4 for the same expression.
Modern symbolic packages know an amazing amount of maths, more
than all but a handful of mathematicians in terms of being able
to do calculus, algebra and numerical work generally.
On 10/10/2021 16:01, David Brown wrote:
On 10/10/2021 16:10, James Harris wrote:
On 09/10/2021 11:30, David Brown wrote:
On 08/10/2021 17:52, James Harris wrote:
On 07/10/2021 12:32, David Brown wrote:
On 06/10/2021 17:26, James Harris wrote:
On 06/10/2021 07:20, David Brown wrote:
On 05/10/2021 21:13, James Harris wrote:
On 05/10/2021 18:35, David Brown wrote:
...
abs(x) = ⎧ x, if x >= 0
⎨
⎩ -x, if x < 0
...
But something else stood out particularly in the context of this
subthread. The definition of abs(x) would fail on a computer if it were
using 2's complement representation and x was the most negative
number.
It's a classic case in point where the computer won't follow the
accepted mathematical definition.
These were examples of function definitions with conditionals in common
mathematics using real numbers - they cannot be implemented directly in
computer code. If you want a mathematical definition of "abs" for fixed
size integer types in a programming language, you must adapt it to a
different mathematical definition that is suitable for the domain you
are using (i.e., the input and output sets are the range of your
computer type, rather than the real numbers). It is, however, still
maths. Two possibilities for n-bit two's complement signed integers
could be :
abs(x) = ⎧ x, if x >= 0
         ⎨ -x, if x < 0 and x > int_min
         ⎩ int_min, if x = int_min
Yes. I would consider that a valid and correct definition given the
criteria. It describes what a programmer can expect from a computer's
abs function (again, given the criteria).
What criteria?
Those above: 2's complement, etc.
Again - you are mixing implementation and specification.
On the contrary, part of the specification of the problem, at least as I
had it in mind, was to define an abs which a computer might implement.
The point of having a mathematical specification is that there is no
such thing as what you had "in mind" - it is /exact/.
Well, at the start of this subthread I said that your "definition of
abs(x) would fail on a computer if it were using 2's complement representation and x was the most negative number." It's in the text,
above. I would have thought that was clear enough that I was talking
about 2's complement, at least.
The newsgroup is about languages (and language design I guess). Who
knows what languages are used in those big applications.
Those big apps have endless problems which IMV are cause by being so
large, complex and multi-layered.
(Eg. both Firefox and Opera [...])
Writing simple software or keeping it small enough to keep on top of
is another subject.
You really don't want the loop that turns "12345" into 12345 to cause
a hardware interrupt or to require the language to create some
exception when doing a*10+c.
(Would you need language assistance also
when encountering "123?5"? This is just lexing.)
It's once those values have been loaded that the problems start. They
might be within range of the type, but might not be valid for the
application (eg. they might need to be even numbers). And then there
are the calculations that might be done.
If you do the rest of the job properly, then checking those machine
limits becomes less important.
Or to put it another way, I don't want to invest too much effort and sacrifice too much efficiency for that tiny benefit.
Further, whereas you might be slightly lucky if
real pi = 4 * arctan (1);
print (sin(pi/3)^2)
printed an exact representation of 3/4 rather than [say] 0.7499...9,
I had to try it:
C:\qapps>type t.q
println sqr sin(pi/3)
C:\qapps>qq t
0.750000
How about that? Of course, it helps that it rounds to 6 decimals. I
only get 0.749... at 16 decimal places or more.
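The same experiment in C gives the same picture (a sketch; the exact
trailing digits depend on the platform's maths library):

#include <stdio.h>
#include <math.h>

int main(void)
{
    double pi = 4 * atan(1.0);
    double s  = sin(pi / 3);
    printf("%.6f\n",  s * s);   /* 0.750000 after rounding to 6 places */
    printf("%.17f\n", s * s);   /* the 0.749... digits show up here */
    return 0;
}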
you would be entitled to your money back if a symbolic algebra
package gave anything other than 3/4 for the same expression.
Modern symbolic packages know an amazing amount of maths, more
than all but a handful of mathematicians in terms of being able
to do calculus, algebra and numerical work generally.
You said why; they're symbolic.
They will keep an expression as an
expression as much as possible, only applying reductions as needed,
so that 'sqrt(a)**2' is just a, it won't evaluate sqrt(a) first.
Presumably sin(pi/3) involves some sort of square root (sin 60° is
0.866..., which is sqrt(3)/2), so it all cancels out.
It's conceivable here that an ordinary compiler can do this kind of
simple analysis.
On 10/10/2021 11:36, Bart wrote:
The newsgroup is about languages (and language design I guess). Who
knows what languages are used in those big applications.
In the first place, anyone can look at [and indeed help with]
the applications thus far mentioned; Firefox, Thunderbird and Gcc
and others are open source or close relative thereto.
You really don't want the loop that turns "12345" into 12345 to cause
a hardware interrupt or to require the language to create some
exception when doing a*10+c.
??? I would if "a*10+c" exceeded "maxint" [assuming "int"
to be the relevant type]. The arithmetic would overflow, which
/should/ normally spring an overflow trap. It's up to you whether
you program a preliminary check to prevent such overflows or don't
bother and let the trap either terminate your program or divert to
a user-defined trap routine.
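To make the "preliminary check" concrete, here is a minimal sketch of
the "12345" loop with the overflow test folded in (str_to_int is an
invented name; it reports failure rather than trapping):

#include <limits.h>
#include <stdbool.h>

bool str_to_int(const char *s, int *out)
{
    int a = 0;
    for (; *s >= '0' && *s <= '9'; s++) {
        int c = *s - '0';
        if (a > (INT_MAX - c) / 10)   /* would a*10+c exceed maxint? */
            return false;
        a = a * 10 + c;
    }
    *out = a;
    return true;
}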
(Would you need language assistance also
when encountering "123?5"? This is just lexing.)
The point about /hardware/ support is that it's /not/
language assistance, and requires no action on your part as a
programmer or as a compiler writer. If you choose to take no
action, then the program will simply terminate, presumably with
some pre-defined message, if the overflow occurs. That's the
way it used to be in the '60s!
Machine limits are only a tiny part of debugging, and
especially of fire-proofing major applications. Things like
buffer overflow, use of storage after freeing it, writing
outside an array, using uninitialised storage, dereferencing
null pointers, ... are not to do with machine limits, but
could all be detected by suitable hardware, esp [but not only]
in symbiosis with compilers.
Or to put it another way, I don't want to invest too much effort and
sacrifice too much efficiency for that tiny benefit.
Sadly, that attitude is all too common. Which is why
doing as much as possible in hardware saves all of us from
sacrificing /any/ efficiency.
The newsgroup is about languages (and language design I guess). Who
knows what languages are used in those big applications.
In the first place, anyone can look at [and indeed help with]
the applications thus far mentioned; Firefox, Thunderbird and Gcc
and others are open source or close relative thereto.
They are so big and complex that they might as well be closed source.
You really don't want the loop that turns "12345" into 12345 to cause
a hardware interrupt or to require the language to create some
exception when doing a*10+c.
??? I would if "a*10+c" exceeded "maxint" [assuming "int"
to be the relevant type]. The arithmetic would overflow, which
/should/ normally spring an overflow trap. It's up to you whether
you program a preliminary check to prevent such overflows or don't
bother and let the trap either terminate your program or divert to
a user-defined trap routine.
No need. For turning text into binary, the necessary checks can be
done on the string.
What you seem to want in hardware support is to generate interrupts,
which is a really heavy-duty way to deal with such matters, and quite difficult to use effectively from a language (eg. I've got no idea
how to trap them from mine).
And unless there's some way of selectively disabling it, it means
programs at risk of crashing, and losing customer's work and data,
for something that might be harmless.
Machine limits are only a tiny part of debugging, and
especially of fire-proofing major applications. Things like
buffer overflow, use of storage after freeing it, writing
outside an array, using uninitialised storage, dereferencing
null pointers, ... are not to do with machine limits, but
could all be detected by suitable hardware, esp [but not only]
in symbiosis with compilers.
But you've just said that the hardware can only do a small part!
Or to put it another way, I don't want to invest too much effort and
sacrifice too much efficiency for that tiny benefit.
Sadly, that attitude is all too common. Which is why
doing as much as possible in hardware saves all of us from
sacrificing /any/ efficiency.
Take array bounds: your runtime requests a pool of virtual memory
from the OS. Part of that is used to allocate the array.
If you access outside the array bounds, you will still be inside the
OS-allocated block, so the hardware check is no use here. Not unless
you're well outside the array.
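A sketch of why the hardware misses it (the stray write is undefined
behaviour, shown only to make the point; tools such as AddressSanitizer,
gcc -fsanitize=address, add the missing red zones in software):

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int *a = malloc(16 * sizeof *a);   /* runtime grabs a larger pool */
    if (!a) return 1;
    a[16] = 42;   /* one past the end: still inside the pool, no trap */
    printf("%d\n", a[16]);
    free(a);
    return 0;
}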
If people complain that 'print 2**3**4' displays 0 instead of 2417851639229258349412352, then I just play the UB card like C does.
Users of the language need to take care around those limits.
On 12/10/2021 17:41, Bart wrote:
No need. For turning text into binary, the necessary checks can be
done on the string.
In which case the trap won't be sprung. It costs you nothing,
so you don't need even to think about it.
And unless there's some way of selectively disabling it, it means
programs at risk of crashing, and losing customer's work and data,
for something that might be harmless.
It /might/ be harmless. How would anyone know? You asked
the computer to work out, in your example, "a*10+c", and it failed
to do so. If you don't care what "a*10+c" is, why is that code in
there in the first place?
BTW, A68G displays 4096 for the result of 2**3**4. Now /that/ is
unexpected!
It uses the wrong precedence for **.
On 13/10/2021 12:13, Bart wrote:
BTW, A68G displays 4096 for the result of 2**3**4. Now /that/ is
unexpected!
To you, perhaps. But it's in accord with the Report, and
with the precedents of Algol 60 and IAL, so it's been "expected"
for over 60 years. Traditional mathematics has no equivalent, as exponentiation is normally indicated by indexes, not by operators,
and many languages don't implement exponentiation operators either [preferring to use functions instead] so there is no solid general
prior expectation to guide us.
It uses the wrong precedence for **.
If you meant that it associates the same way around as
other operators, that may not be your expectation but Algol took
the view that consistency was more important. Operators of equal
precedence always associate such that "a op b op c" means
"(a op b) op c". You could make a case for it always to mean
"a op (b op c)" [tho' then (eg) "a-b-c" would surprise most of
us], but you would be hard pushed to make a coherent case for
mixing them. You would have to add new syntax, and it would be
quite hard to read/express.
No need. For turning text into binary, the necessary checks can be
done on the string.
In which case the trap won't be sprung. It costs you nothing,
so you don't need even to think about it.
You do need to think about it. My apps used to consist of a compiled
main application, with hot-loaded bytecode modules to execute
user-commands.
If a module went wrong [...].
What you don't want is some silly overflow in a module (which may be
written in this scripting language by a third party), from bringing
down the entire application, and losing the user's work for that
day.
It /might/ be harmless. How would anyone know? You asked
the computer to work out, in your example, "a*10+c", and it failed
to do so. If you don't care what "a*10+c" is, why is that code in
there in the first place?
So we're back to this. a*10+c isn't some theoretical mathematical
term which is always going to have some exact numeric result, so long
as you don't actually have to evaluate it.
It's doing the calculation within the limitations of, say, the 64-bit
ALU of a processor which represents integers using two's complement.
Then how it behaves near the boundaries of those limitations, how
much it insulates those realities from the user, is a choice
involving both the language and application.
If people complain that 'print 2**3**4' displays 0 instead of 2417851639229258349412352, then I just play the UB card like C does.
I understand that if /you/ were to implement a language, it would do
things perfectly.
It would display 2**3**4**5 exactly (even if it
takes a while, or exhausts your machine's memory).
I tend to cut a
few corners.
[...] Traditional mathematics has no equivalent, as
exponentiation is normally indicated by indexes, not by operators,
and many languages don't implement exponentiation operators either
[preferring to use functions instead] so there is no solid general
prior expectation to guide us.
I think it has. The operators are implicit: 3x² has always been
3*(x**2) or 3*(x^2).
If I type 2**3**4 or 2^3^4 into Google, it gives me the larger value
(so using right-to-left precedence) rather than the smaller.
So that appears to be the expectation these days.
Also, if I do:
print *, 2**3**4
in Fortran (via rextester.com) it appears to parse as 2**(3**4) too.
Fortran is even older than Algol (I don't know if at some point it
did something different; but it's unlikely they would have changed
it).
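The two readings are easy to compare in C, where pow forces the grouping
to be explicit (both results happen to be exactly representable in a
double, so %.0f prints them in full):

#include <stdio.h>
#include <math.h>

int main(void)
{
    printf("(2**3)**4 = %.0f\n", pow(pow(2, 3), 4));   /* 4096 */
    printf("2**(3**4) = %.0f\n", pow(2, pow(3, 4)));   /* 2417851639229258349412352 */
    return 0;
}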
On 13/10/2021 12:01, Bart wrote:
No need. For turning text into binary, the necessary checks can be
done on the string.
In which case the trap won't be sprung. It costs you nothing,
so you don't need even to think about it.
You do need to think about it. My apps used to consist of a compiled
main application, with hot-loaded bytecode modules to execute
user-commands.
If a module went wrong [...].
What you don't want is some silly overflow in a module (which may be
written in this scripting language by a third party), from bringing
down the entire application, and losing the user's work for that
day.
Repeat -- /if/ you have done the checks, /then/ the trap
won't be sprung [and it has cost you nothing and it doesn't matter
what would have happened if the trap had been sprung]. If your
users care what their work actually does, then an overflow is not
"silly", it means they are getting the wrong answers. It must be
a bizarre application if getting wrong answers doesn't matter.
Otherwise, you either [again] build in the checks, and the trap
won't be sprung, /or/ you spend a few moments /once/ writing a
trap handler [which in this case would presumably be "print an
error message and go to the end of this module"]. It's easier
to write one simple trap handler than to add checks to all the
modules as and when you import them. Shell scripts [eg] can use
the "trap" command; C programs can use the stuff defined [eg]
in N1570 section 7.14 and/or Annex H [on "language independent
arithmetic", which might interest James].
[...]
It /might/ be harmless. How would anyone know? You asked
the computer to work out, in your example, "a*10+c", and it failed
to do so. If you don't care what "a*10+c" is, why is that code in
there in the first place?
So we're back to this. a*10+c isn't some theoretical mathematical
term which is always going to have some exact numeric result, so long
as you don't actually have to evaluate it.
It's doing the calculation within the limitations of, say, the 64-bit
ALU of a processor which represents integers using two's complement.
Then how it behaves near the boundaries of those limitations, how
much it insulates those realities from the user, is a choice
involving both the language and application.
Yes, but when "a*10+c" overflows, you [apparently] aren't
checking, nor are you trapping the operations, so your users have
no idea that their answers are completely bogus. Yes, it's your
choice; I hope your users know what your decision was, and know
not to use your code to work out their payroll ....
If people complain that 'print 2**3**4' displays 0 instead of
2417851639229258349412352, then I just play the UB card like C does.
C doesn't. Some implementations of C do. Guess what,
those are the implementations that David and I complain about.
C can be made to behave properly [see the above-mentioned parts
of N1570, for example]; but it's harder than it should be.
I understand that if /you/ were to implement a language, it would do
things perfectly.
Thanks for your confidence.
It would display 2**3**4**5 exactly (even if it
takes a while, or exhausts your machine's memory).
Doesn't need me to implement anything:
$ a68g -p 2**3**4**5
a68g: runtime error: 1: INT math error [...]
$ a68g -p "LONG 2**3**4**5"
+1152921504606846976
I tend to cut a
few corners.
Yes, we'd noticed. Nothing wrong with cutting corners
as long as everyone concerned knows that's how you operate and
is happy that the computing work they are paying for is of that
standard.
On 13/10/2021 12:01, Bart wrote:
I understand that if /you/ were to implement a language, it would do
things perfectly.
Thanks for your confidence.
It would display 2**3**4**5 exactly (even if it
takes a while, or exhausts your machine's memory).
Doesn't need me to implement anything:
$ a68g -p 2**3**4**5
a68g: runtime error: 1: INT math error [...]
$ a68g -p "LONG 2**3**4**5"
+1152921504606846976
C programs can use the stuff defined [eg]
in N1570 section 7.14 and/or Annex H [on "language independent
arithmetic", which might interest James].
On 21/08/2021 21:31, James Harris wrote:
Isn't it odd to have the maximum size of a file decided by a language's
implementation (or an OS's implementation)?
It is the OS's choice, not the language.
If an OS does not support file systems that can handle files bigger than
4 GB, why should it make every file operation bigger and slower with a pointlessly large "off_t" size?
Put another way, the C/Posix or whatever concept of off_t seems to me to
be broken.
It only seems that way because you don't understand it.
On 09/09/2021 11:57, James Harris wrote:
1. create a .c file with the required #includes
2. run it through the target's preprocessor
3. parse the output to extract the data I needed
4. store the extracted data in a configuration file for the target
5. use the configuration file to set up my own types and structures
for the target environment.
Further, since I may not even have access to a given target
environment, if the above process was unable to parse anything it
needed to I'd have the parser produce a report of what it could not
handle for sending back to me so I could update the parser or take
remedial steps.
At the end of the day, I thought you were lamenting that there's no
master config file and all info is in C headers. The above steps are
intended to remedy that and create the master config file I thought
you wanted.
In general, the process is non-trivial, even if you have a C compiler
that can successfully process the headers (which itself can be
problematical as it can have extra dependencies).
For example, at some point, something is defined in terms of C
executable code, and not declarations. Now you are having to translate
chunks of program code.
It is also an unreasonable degree of effort. I think most people would
rather buy fish in a supermarket than having to lease a North Sea
trawler to find it themselves!
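For what the sizes-and-offsets step might look like in practice, here
is a minimal sketch of a generated probe program, using raylib's Color
struct as the example (the member list itself would have to come from
parsing the header, which is the hard part being argued about):

#include <stdio.h>
#include <stddef.h>
#include "raylib.h"

#define PROBE(T, m) \
    printf("%s.%s offset=%zu size=%zu\n", #T, #m, \
           offsetof(T, m), sizeof(((T *)0)->m))

int main(void)
{
    printf("sizeof(Color) = %zu\n", sizeof(Color));
    PROBE(Color, r);   /* unsigned char fields r, g, b, a */
    PROBE(Color, g);
    PROBE(Color, b);
    PROBE(Color, a);
    return 0;
}

The output is trivially machine-readable, which is the point: it is the
kind of per-target configuration data the thread is arguing about.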
On 29/08/2021 19:21, James Harris wrote:
On 29/08/2021 15:50, David Brown wrote:
On 29/08/2021 13:47, James Harris wrote:
The limit of a file's size would naturally be defined by the filesystem
on which it was stored or on which it was being written. Such a value
would be known by, and a property of, the FS driver.
"Proof by repetitive assertion" is not convincing.
There's nothing to prove. It is simply factual (and well known) that
different filesystems have different maximum file sizes. FAT12 has
different limits from FAT32, for example. Ergo, the maximum permitted
file size /is/ a natural property of the formatted filesystem. I guess
that's repetitive again but I cannot imagine what you think would need
to be added to that to establish the point.
Of course different filesystems have different maximum file sizes - no
one has disputed that! All that is in dispute is your silly idea that
the /OS/ does not have limits here.
What systems use file sizes that are smaller than "types smaller that
32-bit" ?
I thought you were the microcontroller man!
I am. What type would you use for file sizes here?
Something Dmitry said makes, I think, this easier to explain. In an OO
language you could think of the returns from your functions as objects.
The objects would be correctly sized for the filesystems to which they
related. They could have different classes but they would all respond
polymorphically to the same methods. The class of each would know the
maximum permitted offset.
Right. Arbitrary precision integers. They are written in a nice OO
language so that the language hides the ugly mechanics of allocations, deallocations, memory management, etc., and gives you operator overloads instead of ugly prefix function calls or macros for everything.
But they are still arbitrary precision integers if you refuse to allow limits. And they are still massively less efficient than a simple large integer of fixed size.
Again, I have to disagree. The question is: What defines how large a
file's offset can be?
The answer is just as simple: each filesystem has its own range of max
sizes.
Your concept of "basic engineering" is severely lacking here.
Oh? What, specifically, is lacking?
Supposing someone asked you to build a bridge for a two-lane road
passing over a river. Basic engineering is to look at the traffic on
the road, the size of the crossing, and figure out a reasonable maximum
load weight that could realistically be on the bridge at any given time.
Then you extrapolate for future growth based on the best available data
and predictions. Then you multiply and add in safety factors. You tell
the town planners that the bridge should have a total weight limit of,
say, 100 tons and you tell them the price.
That is basic engineering.
C programs can use the stuff defined [eg]
in N1570 section 7.14 and/or Annex H [on "language independent
arithmetic", which might interest James].
Yes, interesting stuff which I didn't know about. The wording is
largely for 'language lawyers', though. I'd rather specify such
operations in a more user-friendly way that would work for everyone
including non-mathematician newbie programmers.
On 23/10/2021 11:18, James Harris wrote:
[I wrote, a propos traps:]
C programs can use the stuff defined [eg]
in N1570 section 7.14 and/or Annex H [on "language independent
arithmetic", which might interest James].
Yes, interesting stuff which I didn't know about. The wording is
largely for 'language lawyers', though. I'd rather specify such
operations in a more user-friendly way that would work for everyone
including non-mathematician newbie programmers.
You seem to be confusing a language specification with a
language primer or even advanced text. The specification must be sufficiently precise that no two civilised readers can disagree
about what it means; ordinary programmers who find that hard to
understand can look instead at books that explain what is going
on. You don't learn C from N1570; but you might write compilers
from it.
Sadly, since then, few people have even tried to define
languages formally.
You seem to be confusing a language specification with a
language primer or even advanced text. The specification must be
sufficiently precise that no two civilised readers can disagree
about what it means; ordinary programmers who find that hard to
understand can look instead at books that explain what is going
on. You don't learn C from N1570; but you might write compilers
from it.
Don't get me wrong. I'm a fan of what is probably now an
old-fashioned idea of there being an instruction manual and a
reference manual. It's just that IMO the reference manual should be
for programmers rather than for compiler writers.
Whether a reference manual could be good enough for compiler writers
as well as programmers is, for me, an open question.
Sadly, since then, few people have even tried to define
languages formally.
That's surprising. I would have thought that all languages needed a
formal definition.
On 23/10/2021 16:06, James Harris wrote:
[I wrote:]
You seem to be confusing a language specification with a
language primer or even advanced text. The specification must be
sufficiently precise that no two civilised readers can disagree
about what it means; ordinary programmers who find that hard to
understand can look instead at books that explain what is going
on. You don't learn C from N1570; but you might write compilers
from it.
Don't get me wrong. I'm a fan of what is probably now an
old-fashioned idea of there being an instruction manual and a
reference manual. It's just that IMO the reference manual should be
for programmers rather than for compiler writers.
I didn't say "reference manual", but "specification".
If a
language is not precisely defined, then different compilers will
either disagree despite conforming to the spec, or else [and just
as bad] there will be a consensus among compilers that is merely
"folklore" about what the language means and that cannot be found
out by programmers who want/need to know. It's that definition
that compiler writers need to have access to.
Whether a reference manual could be good enough for compiler writers
as well as programmers is, for me, an open question.
If the reference manual is /not/ good enough, then a better
one is needed, otherwise what are compiler writers supposed to use?
I don't see why that's "open", other than for toy/private languages.
IOW, you can do what you like with your own language, but languages
like C, Fortran, Algol, ... that aspire to a degree of universality
need a reliable definition. [To some extent, "C is what runs on
DMR's computer" is such a definition, but it leaves ambiguous
whether some features are as they are because DMR made arbitrary
choices for his own compiler or because they are the essence of C.]
[...] It's just that IMO the reference manual should be
for programmers rather than for compiler writers.
I didn't say "reference manual", but "specification".
I know. I brought up the topic of a reference manual.
Whether a reference manual could be good enough for compiler writers
as well as programmers is, for me, an open question.
If the reference manual is /not/ good enough, then a better
one is needed, otherwise what are compiler writers supposed to use?
Are you arguing for there being a reference manual and a
specification, or just one text which is good enough for both
purposes (programmer's reference and compiler-writer's
specification)? By 'one text' I don't mean one book with both
sections but a single exposition which serves both purposes.
One thing I am convinced of is that it helps a human if a text
follows the pattern: principle, examples and maybe rationale.
Have to say the language specifications I've seen are not easy to
read. I do wonder if the same information could be written in an
easier form.
AISI a compiler writer wants to know what's required at
each point in his code but I am not sure that specifications are
written that way. I've never tried to write a compiler from a spec
but I get the impression that to write any piece of compiler code one
would have to take into account points from across the specification
rather than those in the one relevant place. That's not much help to
a compiler writer.