I've loads of other messages to get back to but while I think of it I'd
like to post a suggestion for you guys to shoot down in flames. ;-)
The printf approach to printing is flexible and fast at rendering
inbuilt types - probably better than anything which came before it - but
it's not perfect. In particular, it means that the code inside printf
which does the rendering needs to know about all types it may be asked
to render in character form. But there are many other types. E.g. a programmer could want a print routine to render a boolean, an integer, a float, a record, a table, a list, a widget, etc.
So here's another potential approach. What do you think of it?
The idea is, as with the printf family, to have a controlling string
where normal characters are copied verbatim and special fields are
marked with a % sign or similar. The difference is what would come after
the % sign and how it would be handled.
What I am thinking of is a format specification something like
%EB;
where "E" is a code which identifies the rendering engine, "B" is the
body of the format and ";" marks the end of the format and a return to
normal printing.
The /mechanical/ difference is that rather than the print function doing
all the formatting itself it would outsource any it didn't know. For outsourcing, the rendering engine would be sent both the value to be
printed AND a pointer to the format string.
As for where rendering engines could come from:
* Some rendering engines could be inbuilt.
* Some could be specified earlier in the code.
* Some could be supplied in the parameter list (see below).
What would the formats look like? As some examples:
%i; a plain, normal, signed integer
%iu; a plain, normal, unsigned integer
%iu02x; a 2-digit zero-padded unsigned hex integer
%Kabc; a type K unknown to the print function
The latter would need the print function to have been previously told
about a rendering engine for "K". The print function would pass to the rendering engine the format specification and a pointer to the value. Finally,
%*abc; a dynamic render
The * would indicate that the address of the rendering engine had been supplied as a parameter as in
printit("This is %*abc;, OK?", R, v)
where R is the rendering engine and v is the value to be rendered
according to the specification abc.
That's it. It's intended to be convenient, efficient, flexible and about
as simple as possible to use. Whether it achieves that is up for debate.
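A minimal sketch of how such a dispatcher might look in C, under some assumptions that are not part of the proposal: the names put_fn, render_fn, register_engine, render_int and struct buf are all hypothetical, output goes through a caller-supplied character sink, and the example "i" engine understands only an optional "x" in the body.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* A sink receives one character at a time; ctx is sink-specific state. */
typedef void (*put_fn)(char c, void *ctx);

/* A rendering engine gets the body of the format (the text between the
   engine code and the ';'), its length, and a pointer to the value. */
typedef void (*render_fn)(const char *spec, size_t len, const void *val,
                          put_fn put, void *ctx);

static render_fn engines[256];      /* hypothetical registry, indexed by code */

void register_engine(char code, render_fn fn) {
    engines[(unsigned char)code] = fn;
}

/* Copies plain text verbatim and hands each %EB; field to engine E.
   For "%*B;" the engine is taken from the argument list, then the value. */
void printit(put_fn put, void *ctx, const char *fmt, ...) {
    va_list ap;
    va_start(ap, fmt);
    while (*fmt) {
        if (*fmt != '%') { put(*fmt++, ctx); continue; }
        char code = *++fmt;
        if (code == '\0') break;                    /* stray '%' at the end */
        const char *body = ++fmt;
        while (*fmt && *fmt != ';') fmt++;
        size_t len = (size_t)(fmt - body);
        if (*fmt == ';') fmt++;
        render_fn fn = (code == '*') ? va_arg(ap, render_fn)
                                     : engines[(unsigned char)code];
        const void *val = va_arg(ap, const void *);
        if (fn) fn(body, len, val, put, ctx);
        else put('?', ctx);                         /* no engine registered */
    }
    va_end(ap);
}

/* Example engine: renders an int; a body containing 'x' selects hex. */
void render_int(const char *spec, size_t len, const void *val,
                put_fn put, void *ctx) {
    char tmp[32];
    int hex = 0;
    for (size_t i = 0; i < len; i++) if (spec[i] == 'x') hex = 1;
    int v = *(const int *)val;
    int n = hex ? snprintf(tmp, sizeof tmp, "%x", (unsigned)v)
                : snprintf(tmp, sizeof tmp, "%d", v);
    for (int i = 0; i < n; i++) put(tmp[i], ctx);
}

/* Sink that appends to a fixed buffer; a console sink would call putchar. */
struct buf { char data[128]; size_t used; };

void buf_put(char c, void *ctx) {
    struct buf *b = ctx;
    if (b->used + 1 < sizeof b->data) {
        b->data[b->used++] = c;
        b->data[b->used] = '\0';
    }
}
```

With register_engine('i', render_int) done beforehand, a call such as printit(buf_put, &b, "n=%i; d=%*x;", (const void *)&v, render_int, (const void *)&v) would render v once through the registered engine and once through the engine passed as a parameter. Values are passed as const void * here, so the caller has to cast.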
On 27/12/2021 14:01, James Harris wrote:
What would the formats look like? As some examples:
%i; a plain, normal, signed integer
%iu; a plain, normal, unsigned integer
%iu02x; a 2-digit zero-padded unsigned hex integer
%Kabc; a type K unknown to the print function
%*abc; a dynamic render
The * would indicate that the address of the rendering engine had been
supplied as a parameter as in
printit("This is %*abc;, OK?", R, v)
where R is the rendering engine and v is the value to be rendered
according to the specification abc.
Look at that last example. You have to give it three things other than
the surrounding context: v, R and *abc.
At the simplest, you want to do it with just v. Then:
* It will apply the renderer previously associated with that user-type
* If there isn't one, it will use a default rendering for that
generic type (array, struct etc)
* Or it can say it doesn't know how to do it, or just prints a reference
to v
Next simplest is when you specify some parameters to control the
formatting. How these work depends on what it does above.
Supplying a renderer each time you print I would say is not the simplest
way of doing it! If you're going to do that, you might as well do:
printit("This is %s; OK?", R(v,"abc"))
Then no special features are needed. Except that if R returns a string,
then you need some means of disposing of that string after printing, but there are several ways of dealing with that.
%i; a plain, normal, signed integer
%iu; a plain, normal, unsigned integer
%iu02x; a 2-digit zero-padded unsigned hex integer
%Kabc; a type K unknown to the print function
This is too C-like. C-style formats have all sorts of problems
associated with hard-coding a type-code into a format string:
* What is the code for arbitrary expression X?
* What will it be when X changes, or the type of the terms change?
* What is it for clock_t, or some other semi-opaque type?
* What is it for uint64_t? (Apparently, it is PRId64 - a macro that
expands to a string)
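For reference, in C's <inttypes.h> the macro for printing a uint64_t is PRIu64; PRId64 is its counterpart for the signed int64_t. A small sketch of how it is spliced into a format string (format_u64 is a made-up helper name):

```c
#include <inttypes.h>
#include <stdio.h>

/* Formats a uint64_t portably into buf; returns snprintf's result. */
int format_u64(char *buf, size_t size, uint64_t v) {
    /* PRIu64 expands to a string literal holding the right length
       modifier and 'u' for this platform; adjacent literals are joined. */
    return snprintf(buf, size, "%" PRIu64, v);
}
```

The type code still lives in the format string, which is the maintenance problem being described: if v later becomes a signed or narrower type, the macro has to change with it.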
If your Print function is implemented as a regular user-code function,
which your language knows nothing about, then you will need some scheme
which imparts type information to the function, as well as a way of
dealing with such variadic parameter lists.
But if the language does know about Print, which would be the more modern way of doing it, then the compiler already knows the types involved. Then the formatting is about
the display, so for an integer:
* Perhaps override signed to unsigned
* Plus sign
* Width
* Justification
* Zero-padding (and padding character)
* Base
* Separators (and grouping)
* Prefix and/or suffix
* Upper/lower case (digits A-F)
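Several of these display options can be collected in a small record that is separate from the type. The following is only a sketch in C: struct int_fmt and fmt_uint are hypothetical names, it covers unsigned values only, and sign, prefix and suffix are left out.

```c
#include <stddef.h>

/* Hypothetical display options for an unsigned integer. */
struct int_fmt {
    unsigned base;      /* 2..16 */
    size_t width;       /* minimum field width, 0 = none */
    char pad;           /* padding character, e.g. ' ' or '0' */
    char sep;           /* digit-group separator, 0 = none */
    unsigned group;     /* digits per group when sep is set */
    int upper;          /* nonzero: A-F instead of a-f */
};

/* Writes v right-justified into out (NUL-terminated). Returns the length
   written, or 0 if the options are invalid or out is too small. */
size_t fmt_uint(char *out, size_t size, unsigned long v,
                const struct int_fmt *f) {
    const char *digits = f->upper ? "0123456789ABCDEF" : "0123456789abcdef";
    char tmp[128];      /* room for 64 binary digits plus separators */
    size_t n = 0, count = 0;
    if (f->base < 2 || f->base > 16) return 0;
    do {                /* digits are produced least significant first */
        if (f->sep && f->group && count && count % f->group == 0)
            tmp[n++] = f->sep;
        tmp[n++] = digits[v % f->base];
        v /= f->base;
        count++;
    } while (v);
    size_t total = n > f->width ? n : f->width;
    if (total + 1 > size) return 0;
    size_t i = 0;
    for (; i < total - n; i++) out[i] = f->pad;   /* left padding */
    while (n) out[i++] = tmp[--n];                /* reverse into place */
    out[i] = '\0';
    return total;
}
```

For example, options {16, 8, '0', 0, 0, 0} would give a zero-padded 8-character hex field, and {10, 0, ' ', ',', 3, 0} would group decimal digits in threes. Padding is not grouped in this sketch; that is a simplification.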
Or maybe this number represents a certain quantity (eg. a day of the
week), which will need displaying in a special way. If you're not using
a special type for that, then it will need a way to override the default, perhaps using an R() function.
You should also look at how the current crop of languages do it. They
also still tend to use format strings, and some like to put the expressions to be printed inside the format string.
-----------
(My approach in dynamic code is that there is an internal function
tostr(), fully overloaded for different types, with optional format
data, that is applied to Print items. So that:
print a, b, c:"h" # last bit means hex
is the same as:
print tostr(a), tostr(b), tostr(c, "h")
There is a crude override mechanism, which links 'tostr' and a type T,
to a regular user-code function F.
Then, when printing T, it will call F().
In static code, this part is poorly developed. But Print (which is again known to the language as it is a statement), can deal with regular
types, including most of those options for integers:
print a:"z 8 h s_" # leading zeros, 8-char field, hex, "_" separator
)
On 27/12/2021 15:49, Bart wrote:
printit("This is %s; OK?", R(v,"abc"))
Then no special features are needed. Except that if R returns a
string, then you need some means of disposing of that string after
printing, but there are several ways of dealing with that.
Dealing with memory is indeed a problem with that approach. The %s could
be passed a string which needs to be freed or one which must not be
freed. One option is
%s - a string which must not be freed
%M - a string which must be freed
; %i; a plain, normal, signed integer
; %iu; a plain, normal, unsigned integer
; %iu02x; a 2-digit zero-padded unsigned hex integer
; %Kabc; a type K unknown to the print function
This is too C-like. C-style formats have all sorts of problems
associated with hard-coding a type-code into a format string:
* What is the code for arbitrary expression X?
It would have to be something to match the type of X.
* What will it be when X changes, or the type of the terms change?
The format string would need to be changed to reflect the type change.
* What is it for clock_t, or some other semi-opaque type?
Perhaps %sdhh:mm:ss.fff; where d indicates datetime.
* What is it for uint64_t? (Apparently, it is PRId64 - a macro that
expands to a string)
That said, it would make sense for the elements of the format string to appear in some sort of logical order - possibly the order in which they
would be needed by the renderer.
Maybe unfairly I have an antipathy to copying other languages but maybe
in this case it would be useful. Are there any you would recommend?
(My approach in dynamic code is that there is an internal function
tostr(), fully overloaded for different types, with optional format
data, that is applied to Print items. So that:
print a, b, c:"h" # last bit means hex
is the same as:
print tostr(a), tostr(b), tostr(c, "h")
Maybe that's better: the ability to specify custom formatting on any argument. I presume that's not just available for printing, e.g. you
could write
string s := c:"h"
and that where you have "h" you could have an arbitrarily complex format specification.
There looks to be a potential issue, though. In C one can build up the control string at run time. Could you do that with such as
string fmt := format_string(....)
s := c:(fmt)
?
The printf approach to printing is flexible and fast at rendering
inbuilt types - probably better than anything which came before it -
but it's not perfect.
So here's another potential approach. What do you think of it?
The idea is, as with the printf family, to have a controlling string
where normal characters are copied verbatim and special fields are
marked with a % sign or similar. The difference is what would come
after the % sign and how it would be handled.
On 27/12/2021 14:01, James Harris wrote:
The printf approach to printing is flexible and fast at rendering
inbuilt types - probably better than anything which came before it -
but it's not perfect.
No, it's rubbish. If you need formatted transput [not
entirely convinced, but chacun a son gout],
then instead of all
the special casing, the easiest and most flexible way is to
convert everything to strings.
Thus, for each type, you need
an operator that converts values of that type into strings [and
vv for reading].
So here's another potential approach. What do you think of it?
The idea is, as with the printf family, to have a controlling string
where normal characters are copied verbatim and special fields are
marked with a % sign or similar. The difference is what would come
after the % sign and how it would be handled.
Then what you've done is to use "%" where you should
instead simply be including a string. So the specification of
"printf" becomes either absurdly complicated [as indeed it is
in most languages] or too limited [because some plausible
conversions are not catered for]. The "everything is a string"
approach has the advantage that for specialised use, eg if you
want to read/write your numbers as Roman numerals, you just have
to write the conversion routines that you would need anyway, no
need to change anything in "printf".
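A sketch of the Roman-numeral case in C. The name to_roman, the 1..3999 range, and the caller-supplied buffer are assumptions of this example; the post does not specify how the string would be stored.

```c
#include <stddef.h>

/* Converts n (1..3999) to Roman numerals in buf. Returns buf, or NULL if
   n is out of range or buf is too small. */
char *to_roman(int n, char *buf, size_t size) {
    static const int values[] =
        {1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1};
    static const char *const symbols[] =
        {"M", "CM", "D", "CD", "C", "XC", "L", "XL", "X", "IX", "V", "IV", "I"};
    size_t used = 0;
    if (n < 1 || n > 3999) return NULL;
    for (size_t i = 0; i < sizeof values / sizeof values[0]; i++) {
        while (n >= values[i]) {            /* greedy: largest symbol first */
            for (const char *s = symbols[i]; *s; s++) {
                if (used + 1 >= size) return NULL;   /* keep room for NUL */
                buf[used++] = *s;
            }
            n -= values[i];
        }
    }
    buf[used] = '\0';
    return buf;
}
```

The result is an ordinary string, so it can be handed to any existing print routine, for example printf("%s\n", to_roman(year, buf, sizeof buf)), without that routine knowing anything about Roman numerals.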
If you convert to strings then what reclaims the memory used by those strings?
Not all languages have dynamic memory management, and dynamic
memory management is not ideal for all compilation targets.
On 2022-01-02 17:06, James Harris wrote:
If you convert to strings then what reclaims the memory used by those
strings?
What reclaims memory used by those integers?
Not all languages have dynamic memory management, and dynamic memory
management is not ideal for all compilation targets.
No dynamic memory management is required for handling temporary objects.
On 02/01/2022 16:37, Dmitry A. Kazakov wrote:
On 2022-01-02 17:06, James Harris wrote:
If you convert to strings then what reclaims the memory used by those
strings?
What reclaims memory used by those integers?
Integers are passed by value at this level of language.
Not all languages have dynamic memory management, and dynamic memory
management is not ideal for all compilation targets.
No dynamic memory management is required for handling temporary objects.
If memory is allocated for the temporary object, then at some point it
needs to be reclaimed. Preferably just after the print operation is completed.
If your language takes care of those details, then lucky you. It means someone else has had the job of making it work.
On 2022-01-02 18:08, Bart wrote:
On 02/01/2022 16:37, Dmitry A. Kazakov wrote:
On 2022-01-02 17:06, James Harris wrote:
If you convert to strings then what reclaims the memory used by
those strings?
What reclaims memory used by those integers?
Integers are passed by value at this level of language.
This has nothing to do with the question: what reclaims integers?
FORTRAN-IV passed everything by reference, yet calls
FOO (I + 1)
were OK almost human life span ago.
Every *normal* language takes care of the objects it creates. And every *normal* language lets identityless objects (like integers, strings,
records etc) be created ad-hoc and passed around in a unified manner.
If this
Put_Line ("X=" & X'Image & ", Y=" & Y'Image);
is a problem in your language, then the job is not done.
On 27/12/2021 17:26, James Harris wrote:
On 27/12/2021 15:49, Bart wrote:
printit("This is %s; OK?", R(v,"abc"))
Then no special features are needed. Except that if R returns a
string, then you need some means of disposing of that string after
printing, but there are several ways of dealing with that.
Dealing with memory is indeed a problem with that approach. The %s
could be passed a string which needs to be freed or one which must not
be freed. One option is
%s - a string which must not be freed
%M - a string which must be freed
This has similar problems to hardcoding a type. Take this:
printit("%s", F())
F returns a string, but is it one that needs freeing or not? That
depends on what happens inside F. Whatever you choose, if the implementation of F later changes, do 100 formats have to change too?
This is a more general problem of memory-managing strings. It's more
useful to be able to solve it for the language, then it will work for
Print too.
(Personally, I don't get involved with this at all, not in low level code.
Functions that return strings generally return a pointer to a local
static string that needs to be consumed ASAP. Or sometimes there is a circular list of them to allow several such calls per Print. It's not
very sophisticated, but that's why the language is low level.)
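A sketch of that "circular list of static strings" idea in C. The names int_to_str, NBUF and BUFLEN are invented for the example, and the ring size of 4 is arbitrary.

```c
#include <stdio.h>

#define NBUF 4       /* how many results can be alive at once */
#define BUFLEN 32    /* ample for any int in decimal */

/* Returns a pointer into a small ring of static buffers. A result stays
   valid only until NBUF further calls have been made, so it should be
   consumed promptly. Not thread-safe. */
const char *int_to_str(int v) {
    static char ring[NBUF][BUFLEN];
    static unsigned next;
    char *buf = ring[next];
    next = (next + 1) % NBUF;
    snprintf(buf, BUFLEN, "%d", v);
    return buf;
}
```

This allows something like printf("%s %s\n", int_to_str(x), int_to_str(y)) with nothing to free afterwards, at the cost of silently reusing a buffer if more than NBUF results are held at once.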
If you want a more rigorous approach, perhaps try this:
printit("%H", R(v, "abc"))
H means handler. R() is not the handler itself, but returns a descriptor
(eg. X) that contains a reference to a handler function, and references
to those two captured values (a little like a lambda or closure I think).
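One way the descriptor could look in C, as a rough sketch only. struct render_desc, int_handler, print_handled and the buffer sink are hypothetical names, and R is fixed to int values here for brevity.

```c
#include <stddef.h>
#include <stdio.h>

typedef void (*put_fn)(char c, void *ctx);

/* The descriptor: a handler plus the two captured values. */
struct render_desc {
    void (*handler)(const struct render_desc *d, put_fn put, void *ctx);
    const void *value;    /* captured v */
    const char *spec;     /* captured "abc" */
};

/* Example handler: prints an int, in hex when the spec starts with 'x'. */
void int_handler(const struct render_desc *d, put_fn put, void *ctx) {
    char tmp[32];
    int v = *(const int *)d->value;
    int n = (d->spec[0] == 'x') ? snprintf(tmp, sizeof tmp, "%x", (unsigned)v)
                                : snprintf(tmp, sizeof tmp, "%d", v);
    for (int i = 0; i < n; i++) put(tmp[i], ctx);
}

/* Plays the role of R(v, "abc"): builds the descriptor, renders nothing. */
struct render_desc R(const int *v, const char *spec) {
    struct render_desc d = { int_handler, v, spec };
    return d;
}

/* What the print routine would do on reaching %H. */
void print_handled(struct render_desc d, put_fn put, void *ctx) {
    d.handler(&d, put, ctx);
}

/* Sink that appends to a fixed buffer, for demonstration. */
struct buf { char data[64]; size_t used; };

void buf_put(char c, void *ctx) {
    struct buf *b = ctx;
    if (b->used + 1 < sizeof b->data) {
        b->data[b->used++] = c;
        b->data[b->used] = '\0';
    }
}
```

Because the descriptor is a small value returned by copy, nothing has to be freed after printing; the rendering happens only when the print routine calls the handler.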
* What is it for uint64_t? (Apparently, it is PRId64 - a macro that
expands to a string)
Again this is for C; the problem being that a format string should not
need to include type information:
* The compiler knows the type
* You may not know the type (eg. clock_t)
* You may not know the format needed (eg. uint64_t)
* You don't want to have to maintain 1000 format strings as
expressions and types of variables change
I've exercised my print formatting recently and found some weak areas,
to do with tabulation. Getting things lined up in columns is tricky, especially with a header.
I do have formatted print which looks like this:
fprint "#(#, #) = #", fnname, a, b, result
If I want field widths, they are written as:
fprint "#: # #", a:w1, b:w2, c:w3
where w1/w2/w3 are "12" etc. Here, the first problem is a disconnect
between each #, and the corresponding print item. This is why some
languages bring them inside.
But the main thing here is that I don't get a sense of what it looks
like until I run the program. Something I've seen in the past would look
a bit like:
fprint "###: ####### #############", a, b, c
The widths are the number of # characters. That same string could be
used for headings:
const format = "###: ####### #############"
fprint format, "No.", "Barcode", "Description"
....
fprint format, i, item[i].code, item[i].descr
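A rough C sketch of such a picture format. The function name picture is made up, all items are assumed to be already strings, and left-justification and truncation of over-long items are choices made for this example rather than part of the idea above.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Each run of '#' in pic is one field whose width is the length of the
   run. Fields are filled, left-justified, from the string arguments in
   order; other characters are copied as they are. Writes at most size-1
   characters plus a NUL to out and returns the length written. */
size_t picture(char *out, size_t size, const char *pic, ...) {
    va_list ap;
    size_t used = 0;
    if (size == 0) return 0;
    va_start(ap, pic);
    while (*pic && used + 1 < size) {
        if (*pic != '#') { out[used++] = *pic++; continue; }
        size_t width = strspn(pic, "#");          /* length of the # run */
        const char *s = va_arg(ap, const char *);
        size_t len = strlen(s);
        for (size_t i = 0; i < width && used + 1 < size; i++)
            out[used++] = (i < len) ? s[i] : ' '; /* truncate or pad */
        pic += width;
    }
    va_end(ap);
    out[used] = '\0';
    return used;
}
```

The same picture string can then serve for the heading and for each row, so the columns line up by construction.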
I haven't implemented this, it's just an idea. This is an actual example of
the kind of output I'm talking about, but done the hard way by trial and error:
    Type      Seg   Offset    Symbol/Target+Offset
-------------------------------------------------------
 1: imprel32  code  00000024  MessageBoxA
 2: locabs64  code  00000015  idata 02E570B0
 3: locabs64  code  0000000B  idata 02E570B6
On 2021-12-27 23:02, Andy Walker wrote:
then instead of all
the special casing, the easiest and most flexible way is to
convert everything to strings.
Sure, though, usually not everything is converted to string. For
example, formatting symbols or extensions of the idea: meta/tagged
formats like HTML, XML etc are inherently bad.
The most flexible is a combination of a string that carries most of the information specific to the datatype (an OO method) and some commands to
the rendering environment.
On 02/01/2022 17:21, Dmitry A. Kazakov wrote:
If this
Put_Line ("X=" & X'Image & ", Y=" & Y'Image);
is a problem in your language, then the job is not done.
That would be poor for small-machine targets. Shame on Ada! ;-)
On 28/12/2021 09:21, Dmitry A. Kazakov wrote:
On 2021-12-27 23:02, Andy Walker wrote:
...
then instead of all
the special casing, the easiest and most flexible way is to
convert everything to strings.
Sure, though, usually not everything is converted to string. For
example, formatting symbols or extensions of the idea: meta/tagged
formats like HTML, XML etc are inherently bad.
The most flexible is a combination of a string that carries most of
the information specific to the datatype (an OO method) and some
commands to the rendering environment.
That sounds interesting. How would it work?
On 02/01/2022 17:21, Dmitry A. Kazakov wrote:
On 2022-01-02 18:08, Bart wrote:
On 02/01/2022 16:37, Dmitry A. Kazakov wrote:
On 2022-01-02 17:06, James Harris wrote:
If you convert to strings then what reclaims the memory used by
those strings?
What reclaims memory used by those integers?
Integers are passed by value at this level of language.
This has nothing to do with the question: what reclaims integers?
FORTRAN-IV passed everything by reference, yet calls
FOO (I + 1)
were OK almost human life span ago.
Fortran didn't allow recursion either.
If this
Put_Line ("X=" & X'Image & ", Y=" & Y'Image);
is a problem in your language, then the job is not done.
My static language is of the lower-level kind described above, yet this example is merely:
println =X, =Y
On 02/01/2022 17:21, Dmitry A. Kazakov wrote:
On 2022-01-02 18:08, Bart wrote:
On 02/01/2022 16:37, Dmitry A. Kazakov wrote:
On 2022-01-02 17:06, James Harris wrote:
If you convert to strings then what reclaims the memory used by
those strings?
What reclaims memory used by those integers?
Not the print function. See below.
Integers are passed by value at this level of language.
This has nothing to do with the question: what reclaims integers?
FORTRAN-IV passed everything by reference, yet calls
FOO (I + 1)
were OK almost human life span ago.
I disagree slightly with both of you. AISI it doesn't matter whether the objects to be printed are integers or structures or arrays or widgets.
If there's any reclaiming to be done then it would be carried out by
other language mechanisms which would happen anyway; it would not be
required by the print function. The print function would simply use
them. In reclamation terms it would neither increase nor decrease any reference count.
For example,
complex c
function F
widget w
....
print(w, c)
endfunction
Widget w would be created at function entry and reclaimed at function
exit. The global c would be created at program load time and destroyed
when the program terminates. The print function would not get involved
in any of that stuff.
On 2022-01-02 18:50, Bart wrote:
On 02/01/2022 17:21, Dmitry A. Kazakov wrote:
On 2022-01-02 18:08, Bart wrote:
On 02/01/2022 16:37, Dmitry A. Kazakov wrote:
On 2022-01-02 17:06, James Harris wrote:
If you convert to strings then what reclaims the memory used by
those strings?
What reclaims memory used by those integers?
Integers are passed by value at this level of language.
This has nothing to do with the question: what reclaims integers?
FORTRAN-IV passed everything by reference, yet calls
FOO (I + 1)
were OK almost human life span ago.
Fortran didn't allow recursion either.
Irrelevant. What reclaims integer I+1?
If this
Put_Line ("X=" & X'Image & ", Y=" & Y'Image);
is a problem in your language, then the job is not done.
My static language is of the lower-level kind described above, yet
this example is merely:
println =X, =Y
No it is not. The example creates temporary strings, a possibility which stunned James and you as something absolutely unthinkable, or as requiring the
heap, which is rubbish; it is pretty normal in any decent language.
But since you say it requires temporary strings, where does it put them?
They have to go somewhere!
On the stack? The stack is typically 1-4MB; what if the strings are
bigger? What if some of the terms are strings returned from a function;
those will not be on the stack. Example:
println "(" + tostr(123456)*1'000'000 + ")"
this creates an intermediate string of 6MB; too big for a stack.
On 2022-01-02 19:51, Bart wrote:
But since you say it requires temporary strings, where does it put
them? They have to go somewhere!
On the stack?
The stack is as big as you specify, it is not the program stack, normally.
The stack is typically 1-4MB; what if the strings are
bigger? What if some of the terms are strings returned from a
function; those will not be on the stack. Example:
println "(" + tostr(123456)*1'000'000 + ")"
this creates an intermediate string of 6MB; too big for a stack.
Are you going to spill 6MB character long single line on a terminal
emulator? Be realistic.
On 02/01/2022 19:23, Dmitry A. Kazakov wrote:
On 2022-01-02 19:51, Bart wrote:
But since you say it requires temporary strings, where does it put
them? They have to go somewhere!
On the stack?
The stack is as big as you specify, it is not the program stack,
normally.
So, what are you going to do, have a stack which is 1000 times bigger than normal just in case?
Anyway, if the string is returned from a function, it is almost
certainly on the heap.
Are you going to spill 6MB character long single line on a terminal
emulator? Be realistic.
Who knows what a user-program will do?
It could also be on multiple lines if you want:
It depends on contents of the string data the user wants to output,
which as I said could be anything and of any size:
println reverse(readstrfile(filename))
On 2022-01-02 20:42, Bart wrote:
On 02/01/2022 19:23, Dmitry A. Kazakov wrote:
On 2022-01-02 19:51, Bart wrote:
But since you say it requires temporary strings, where does it put
them? They have to go somewhere!
On the stack?
The stack is as big as you specify, it is not the program stack,
normally.
So, what are you going to do, have a stack which is 1000 times bigger
than normal just in case?
No, I will never ever print anything that does not fit into a page of
72-80 characters wide.
BTW formatting output is meant to format, which includes text wrapping,
you know.
Anyway, if the string is returned from a function, it is almost
certainly on the heap.
No, it is certainly on the secondary stack.
Are you going to spill 6MB character long single line on a terminal
emulator? Be realistic.
Who knows what a user-program will do?
The developer:
https://spec.cs.miami.edu/cpu2017/flags/gcc.2021-07-21.html#user_Wl-stack
It could also be on multiple lines if you want:
Now it is time to learn about cycles.
It depends on contents of the string data the user wants to output,
which as I said could be anything and of any size:
println reverse(readstrfile(filename))
It could even be fetching whole git repository tree of the Linux kernel...
1. Why would anybody ever do that?
2. Why would anybody program it in such a stupid way?
Which boils down to the requirements of the application program and its behavior upon broken input constraints from these requirements.
On 02/01/2022 20:25, Dmitry A. Kazakov wrote:
On 2022-01-02 20:42, Bart wrote:
On 02/01/2022 19:23, Dmitry A. Kazakov wrote:
On 2022-01-02 19:51, Bart wrote:
But since you say it requires temporary strings, where does it put
them? They have to go somewhere!
On the stack?
The stack is as big as you specify, it is not the program stack,
normally.
So, what are you going to do, have a stack which is 1000 times bigger
than normal just in case?
No, I will never ever print anything that does not fit into a page of
72-80 characters wide.
You're not even going to allow anything that might spill over multiple
lines?
Say, displaying factorial(1000);
BTW formatting output is meant to format, which includes text
wrapping, you know.
Anyway, if the string is returned from a function, it is almost
certainly on the heap.
No, it is certainly on the secondary stack.
You'll have to explain what a secondary stack is.
Are you going to spill 6MB character long single line on a terminal
emulator? Be realistic.
Who knows what a user-program will do?
The developer:
https://spec.cs.miami.edu/cpu2017/flags/gcc.2021-07-21.html#user_Wl-stack
If your program needs to ROUTINELY increase the stack size, then it is probably broken.
So large files are OK in some circumstances, but not as an argument to
Print?
1. Why would anybody ever do that?
2. Why would anybody program it in such a stupid way?
Somebody does this:
println dirlist("*.c")
The output could be anything from 10 characters to 100,000 or more. (And
yes, the default routine to print a list of strings could write one per line.)
As I said I've lost track of what we were discussing.
On 2022-01-02 23:31, Bart wrote:
Right, except that it is your program that tries to create 6MB strings
for no reason.
So large files are OK in some circumstances, but not as an argument to
Print?
Exactly.
1. Why would anybody ever do that?
2. Why would anybody program it in such a stupid way?
Somebody does this:
println dirlist("*.c")
Nobody prints file lists this way.
The output could be anything from 10 characters to 100,000 or more.
(And yes, the default routine to print a list of strings could write
one per line.)
Wrong. For printing lists of files, if anybody cares, there will be a subprogram with parameters:
- Files path / array of paths
- Wildcard/pattern use flag
- Number of columns
- Output direction: column first vs row first
- First line decorator text
- Consequent lines decorator text
- Filter object
- Sorting order object
On 2022-01-03 12:19, Bart wrote:
As I said I've lost track of what we were discussing.
We were discussing inability to return a string from a function in your language in a reasonable way.
You argued that there is no need to have it due to danger that some user might miss his or her medication or confuse that with certain mushrooms
and so came to an idea of creating terabyte large temporary strings...
The printf approach to printing is flexible and fast at rendering
inbuilt types - probably better than anything which came before it -
but it's not perfect.
No, it's rubbish. If you need formatted transput [not
entirely convinced, but chacun a son gout],
If you are not convinced by formatted io then what kind of io do you prefer?
then instead of all
the special casing, the easiest and most flexible way is to
convert everything to strings.
If you convert to strings then what reclaims the memory used by those strings? Not all languages have dynamic memory management, and
dynamic memory management is not ideal for all compilation targets.
The form I proposed had no need for dynamic allocations. That's part
of the point of it.
I'm not sure you understand the proposal. To be clear, the print
routine would be akin to
print("String with %kffff; included", val)
where the code k would be used to select a formatter. The formatter
would be passed two things:
1. The format from k to ; inclusive.
2. The value val.
As a result, the format ffff could be as simple as someone could
design it.
Note that there would be no requirement for dynamic memory. The
formatter would just send for printing each character as it was
generated.
What's wrong with that? (Genuine question!)
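To illustrate the "send each character as it is generated" point, here is a small C sketch of a formatter for unsigned decimal values that uses no buffer at all. The names put_fn, emit_unsigned and the buffer sink are hypothetical; the sink stands in for whatever the print routine writes to.

```c
#include <stddef.h>

/* A sink receives one character at a time; ctx is sink-specific state. */
typedef void (*put_fn)(char c, void *ctx);

/* Emits v in decimal, most significant digit first, one character at a
   time, with no intermediate string and no dynamic memory. */
void emit_unsigned(unsigned v, put_fn put, void *ctx) {
    unsigned div = 1;
    while (v / div >= 10) div *= 10;   /* largest power of 10 not above v */
    for (; div; div /= 10)
        put((char)('0' + v / div % 10), ctx);
}

/* Sink that appends to a fixed buffer, used here only to show the
   result; a console sink would simply call putchar. */
struct buf { char data[32]; size_t used; };

void buf_put(char c, void *ctx) {
    struct buf *b = ctx;
    if (b->used + 1 < sizeof b->data) {
        b->data[b->used++] = c;
        b->data[b->used] = '\0';
    }
}
```

The cost is a little extra arithmetic to find the leading digit; the gain is that the formatter's memory use does not depend on the size of the rendered value.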
On 02/01/2022 16:06, James Harris wrote:
The printf approach to printing is flexible and fast at rendering
inbuilt types - probably better than anything which came before it -
but it's not perfect.
No, it's rubbish. If you need formatted transput [not
entirely convinced, but chacun a son gout],
If you are not convinced by formatted io then what kind of io do you
prefer?
Unformatted transput, of course. Eg,
print this, that, these, those and the many other things
[with whatever syntax, quoting, separators, etc you prefer]. Much
Yes, that's what I thought you meant. C is thataway -->.
You call it "simple"; C is one of the simpler languages of this
type, yet the full spec of "printf" and its friends is horrendous.
Build in some version [exact syntax up to you] of
print "String with ", val, " included"
and you're mostly done.
For the exceptional cases, use a library
procedure or your own procedure to convert "val" to a suitable
array of characters, with whatever parameters are appropriate.
Note that there would be no requirement for dynamic memory. The
formatter would just send for printing each character as it was
generated.
What's wrong with that? (Genuine question!)
Nothing. How on earth do you think we managed in olden
times, before we had "dynamic memory"? [Ans: by printing each
character in sequence.]
If I do this in A68G:
print(("<",1,">"))
the output is:
< +1>
The "<>" are just to help show the problem: how to get rid of
those leading spaces and that plus sign? Or write the number with a
field width of your choice?
(It gets worse with wider, higher precision numbers, as it uses the
maximum value as the basis for the field width, so that one number
could take up most of the line. Now you will need to start calling
functions that return strings to get things done properly.)
The main problem with C's printf is having to tell it the exact type
of each expression.
But those formatting facilities are genuinely useful, and would be harder to
emulate in user code if they didn't exist.
For the exceptional cases, use a library
procedure or your own procedure to convert "val" to a suitable
array of characters, with whatever parameters are appropriate.
Well, this is the problem. Where will the array of characters be
located especially if the size is unpredictable?
How on earth do you think we managed in olden
times, before we had "dynamic memory"? [Ans: by printing each
character in sequence.]
Probably the printing tasks weren't that challenging. As my A68G
example showed, output tended to be tabulated.
On 04/01/2022 11:31, Bart wrote:
The main problem with C's printf is having to tell it the exact type
of each expression.
The main problem is its complexity! It's as bad as A68,
while not being /anywhere near/ as comprehensive. I suspect you
haven't read N2731 [or near equivalent]. If it takes scores of
pages to describe a relatively simple facility, there's surely
something wrong.
Well, this is the problem. Where will the array of characters be
located especially if the size is unpredictable?
Why [as a user] do you care? Your language either has
strings as a useful type or it doesn't.
[In response to James:]
How on earth do you think we managed in olden
times, before we had "dynamic memory"? [Ans: by printing each
character in sequence.]
Probably the printing tasks weren't that challenging. As my A68G
example showed, output tended to be tabulated.
Perhaps you could write that in a more patronising form?
Format codes form a little language of their own. Bear in mind all
the different parameters that could be used to control the appearance
of an integer or float value.
Probably the printing tasks weren't that challenging. As my A68G
example showed, output tended to be tabulated.
Perhaps you could write that in a more patronising form?
I've lost track here of your argument.
You say that a language ought to have strings as a proper type. But
you also say it doesn't need them. So which is it?
I think the thread is partly about how to add custom printing to a
language that doesn't have automatically managed string types.
deallocate it sooner or later, preferably sooner, but via which
mechanism? It could also mean arbitrarily large strings that can cause
issues.
I don't think it's helpful to suggest that either the language needs
to be transformed into a higher level one, just for Print, or that it
doesn't need any such features, because decades ago we all seemed to
manage to print dates with basic Print.
On 05/01/2022 13:12, Bart wrote:
You're back to implementation issues. Not the concern
of the user who wants to print dates that look nice. Meanwhile,
the implementation issues were solved more than half a century
ago. I don't know why you and James are so opposed to the use
of heap storage [and temporary files, if you really want strings
that are many gigabytes]?
I don't think it's helpful to suggest that either the language needs
to be transformed into a higher level one, just for Print, or that it
doesn't need any such features, because decades ago we all seemed to
manage to print dates with basic Print.
"Need" is an exaggeration. But in any case no-one here
has suggested either part of that [esp not if you replace "need"
by "desirable" (with appropriate changes to the grammar)]. It is
indeed desirable for a modern language to have the ability to
allocate [and deallocate] off-stack storage and the ability to
print characters. Is there any major general-purpose computing
language of the past sixty years that has not had such abilities?
Meanwhile, you're proposing adding a "little language"
to the language spec "just for Print". Why is that any better
than adding "fixed", "float" and "whole" to the library?
On 2022-01-02 18:54, James Harris wrote:
On 02/01/2022 17:21, Dmitry A. Kazakov wrote:
If this
Put_Line ("X=" & X'Image & ", Y=" & Y'Image);
is a problem in your language, then the job is not done.
That would be poor for small-machine targets. Shame on Ada! ;-)
That works perfectly well on small targets. You seem unaware of what
actually happens on I/O. For a "small" target it would be some sort of
networking stack with a terminal emulation on top of it. Believe me,
creating a temporary string on the stack is nothing in comparison to
that. Furthermore, the implementation would likely be no less efficient
than printf which has no idea how large the result is and would have
to reallocate the output buffer or keep it between calls and lock it
from concurrent access. Locking on an embedded system is a
catastrophic event because switching threads is expensive as hell.
Note also that you cannot stream output, because networking protocols
and terminal emulators are much more efficient if you do bulk
transfers. All that is the infamous premature optimization.
On 2022-01-02 17:06, James Harris wrote:
If you convert to strings then what reclaims the memory used by those
strings?
What reclaims memory used by those integers?
Not all languages have dynamic memory management, and dynamic memory
management is not ideal for all compilation targets.
No dynamic memory management is required for handling temporary objects.
----------
As if that were relevant in the case of formatted output, which has such
a massive overhead that even using the heap (which is in no way
necessary) would leave little or no dent. I remember a SysV C
compiler which modified the format string of printf in a misguided
attempt to save a little bit of memory, while the linker put string
constants in the read-only memory...
On 2022-01-02 19:05, James Harris wrote:
On 28/12/2021 09:21, Dmitry A. Kazakov wrote:
The most flexible is a combination of a string that carries most of
the information specific to the datatype (an OO method) and some
commands to the rendering environment.
That sounds interesting. How would it work?
With single dispatch you have an interface, say, 'printable'. The
interface has an abstract method 'image' with the profile:
function Image (X : Printable) return String;
Integer, float, string, whatever has to be printable inherits from
Printable and thus overrides Image. That is it.
The same goes with serialization/streaming etc.
If this
Put_Line ("X=" & X'Image & ", Y=" & Y'Image);
is a problem in your language, then the job is not done.
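As a rough illustration of the single-dispatch idea above, here is a minimal sketch in Python rather than Ada. The class names Celsius and Point are hypothetical examples, not anything from the thread, and the sketch says nothing about where the resulting string lives, which is the point disputed elsewhere in this discussion.

```python
from abc import ABC, abstractmethod


class Printable(ABC):
    """Interface with one abstract method, as described above."""

    @abstractmethod
    def image(self) -> str:
        """Return the textual form of the value."""


class Celsius(Printable):
    # Hypothetical user-defined type, for illustration only.
    def __init__(self, degrees: float):
        self.degrees = degrees

    def image(self) -> str:
        return f"{self.degrees:.1f} C"


class Point(Printable):
    # Hypothetical user-defined type, for illustration only.
    def __init__(self, x: int, y: int):
        self.x, self.y = x, y

    def image(self) -> str:
        return f"({self.x}, {self.y})"


def describe(x: Printable, y: Printable) -> str:
    # Plain concatenation, mirroring "X=" & X'Image & ", Y=" & Y'Image.
    return "X=" + x.image() + ", Y=" + y.image()
```

Any type that overrides image can then be passed to describe without the output routine knowing about it in advance.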
On 02/01/2022 16:06, James Harris wrote:
The printf approach to printing is flexible and fast at rendering
inbuilt types - probably better than anything which came before it -
but it's not perfect.
No, it's rubbish. If you need formatted transput [not
entirely convinced, but chacun a son gout],
If you are not convinced by formatted io then what kind of io do you
prefer?
Unformatted transput, of course. Eg,
print this, that, these, those and the many other things
[with whatever syntax, quoting, separators, etc you prefer]. Much
the same for "read". Most of the time the default is entirely
adequate. If not, then the choice for the language designer is
either an absurdly complicated syntax that still probably doesn't
meet some plausible needs, or to provide simple mechanisms that
allow programmers to roll their own. Guess which I prefer.
then instead of all
the special casing, the easiest and most flexible way is to
convert everything to strings.
If you convert to strings then what reclaims the memory used by those
strings? Not all languages have dynamic memory management, and
dynamic memory management is not ideal for all compilation targets.
AIUI, you are designing your own language. If it doesn't
have strings, eg as results of procedures, then you have much worse
problems than designing some transput procedures. There are lots
of ways of implementing strings, but they are for the compiler to
worry about, not the language designer [at least, once you know it
can be done].
The form I proposed had no need for dynamic allocations. That's part
of the point of it.
There's no need for "dynamic allocations" merely for transput.
Most of the early languages that I used didn't have them, but still
managed to print and read things. You're making mountains out of
molehills.
I'm not sure you understand the proposal. To be clear, the print
routine would be akin to
print("String with %kffff; included", val)
where the code k would be used to select a formatter. The formatter
would be passed two things:
1. The format from k to ; inclusive.
2. The value val.
As a result, the format ffff could be as simple as someone could
design it.
Yes, that's what I thought you meant. C is thataway -->.
You call it "simple"; C is one of the simpler languages of this
type, yet the full spec of "printf" and its friends is horrendous.
Build in some version [exact syntax up to you] of
print "String with ", val, " included"
and you're mostly done. For the exceptional cases, use a library
procedure or your own procedure to convert "val" to a suitable
array of characters, with whatever parameters are appropriate.
Note that there would be no requirement for dynamic memory. The
formatter would just send for printing each character as it was
generated.
What's wrong with that? (Genuine question!)
Nothing. How on earth do you think we managed in olden
times, before we had "dynamic memory"? [Ans: by printing each
character in sequence.]
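To make the proposal being discussed concrete, here is a minimal sketch in Python of a print routine that copies ordinary characters and hands each "%k...;" field to a formatter selected by the code k. The names print_fmt, render_int and render_point, the engine table, and the "P" code are all assumptions made for illustration. Each formatter receives the format from the code to the ";" inclusive plus the value, and emits one character at a time, so no intermediate string is built. Escaping of "%" and error handling are left out.

```python
def print_fmt(emit, engines, control, *values):
    """Copy ordinary characters; hand each %E...; field to engine E."""
    args = iter(values)
    i = 0
    while i < len(control):
        ch = control[i]
        if ch != "%":
            emit(ch)
            i += 1
            continue
        end = control.index(";", i)       # ";" closes the field
        spec = control[i + 1:end + 1]     # engine code up to ";" inclusive
        engines[spec[0]](emit, spec, next(args))
        i = end + 1


def render_int(emit, spec, value):
    """Hypothetical engine for "i": emits decimal digits one at a time."""
    if value < 0:
        emit("-")
        value = -value
    divisor = 1
    while value // divisor >= 10:
        divisor *= 10
    while divisor:
        emit("0123456789"[value // divisor % 10])
        divisor //= 10


def render_point(emit, spec, value):
    """Hypothetical user-supplied engine for "P": a pair shown as (x,y)."""
    emit("(")
    render_int(emit, "i;", value[0])
    emit(",")
    render_int(emit, "i;", value[1])
    emit(")")


out = []
print_fmt(out.append, {"i": render_int, "P": render_point},
          "X=%i;, Y=%i; at %P;", 3, -42, (1, 20))
print("".join(out))  # X=3, Y=-42 at (1,20)
```

Here emit is just list.append so the result can be inspected; on a small target it could equally be a routine that writes one character to a device.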
On 04/01/2022 00:35, Andy Walker wrote:
On 02/01/2022 16:06, James Harris wrote:
If you are not convinced by formatted io then what kind of io do you
prefer?
Unformatted transput, of course. Eg,
print this, that, these, those and the many other things
If I do this in A68G:
print(("<",1,">"))
the output is:
< +1>
print val3(f())
where f itself includes a 'print val'.
On 02/01/2022 18:40, Dmitry A. Kazakov wrote:
On 2022-01-02 18:54, James Harris wrote:
On 02/01/2022 17:21, Dmitry A. Kazakov wrote:
If this
Put_Line ("X=" & X'Image & ", Y=" & Y'Image);
is a problem in your language, then the job is not done.
That would be poor for small-machine targets. Shame on Ada! ;-)
That works perfectly well on small targets. You seem unaware of what
actually happens on I/O. For a "small" target it would be some sort of
networking stack with a terminal emulation on top of it. Believe me,
creating a temporary string on the stack is nothing in comparison to
that. Furthermore, the implementation would likely be no less efficient
than printf which has no idea how large the result is and would have
to reallocate the output buffer or keep it between calls and lock it
from concurrent access. Locking on an embedded system is a
catastrophic event because switching threads is expensive as hell.
Note also that you cannot stream output, because networking protocols
and terminal emulators are much more efficient if you do bulk
transfers. All that is the infamous premature optimization.
I don't know why you bring locking into it. It's neither necessary nor relevant.
Furthermore, the only use for an output buffer is to make output more efficient; it's not fundamental.
On 02/01/2022 16:37, Dmitry A. Kazakov wrote:
On 2022-01-02 17:06, James Harris wrote:
If you convert to strings then what reclaims the memory used by those
strings?
What reclaims memory used by those integers?
Depends on where they are:
1. Globals - reclaimed when the program exits.
2. Locals on the stack or in activation records - reclaimed at least by
the time the function exits.
3. Dynamic on the heap - management required.
Your solution of creating string forms of values is reasonable in a
language and an environment which already have dynamic memory management
but not otherwise.
Not all languages have dynamic memory management, and dynamic memory
management is not ideal for all compilation targets.
No dynamic memory management is required for handling temporary objects.
Where would you put the string forms?
Whether formatted or not, all IO tends to have higher costs than
computation and for most applications the cost of printing doesn't
matter. But when designing a language or a standard library it's a bad
idea to effectively impose a scheme which has a higher cost than
necessary because the language designer doesn't know what uses his
language will be put to.
On 02/01/2022 17:21, Dmitry A. Kazakov wrote:
...
If this
Put_Line ("X=" & X'Image & ", Y=" & Y'Image);
is a problem in your language, then the job is not done.
What's wrong with
put_line("X=%i;, Y=%i;", X, Y)
?
On 02/01/2022 18:25, Dmitry A. Kazakov wrote:
On 2022-01-02 19:05, James Harris wrote:
On 28/12/2021 09:21, Dmitry A. Kazakov wrote:
...
The most flexible is a combination of a string that carries most of
the information specific to the datatype (an OO method) and some
commands to the rendering environment.
That sounds interesting. How would it work?
With single dispatch you have an interface, say, 'printable'. The
interface has an abstract method 'image' with the profile:
function Image (X : Printable) return String;
Integer, float, string, whatever has to be printable inherits from
Printable and thus overrides Image. That is it.
The same goes with serialization/streaming etc.
OK, that was my preferred option, too. The trouble with it is that it
needs somewhere to put the string form. And you know the problems
therewith (lack of recursion OR lack of thread safety OR dynamic memory management).
This didn't work (I guess ' has higher precedence than +?). But neither
did:
Put_Line ((X+Y)'Image);
'Image' can only be applied to a name, not an expression; why?
I think you're not in a position to tell people how to implement Print!
You need to be able to just do this:
println X+Y
On 2022-01-08 17:29, James Harris wrote:
No dynamic memory management is required for handling temporary objects.
Where would you put the string forms?
The same place you put integer, float, etc. That place is called stack
or LIFO.
On 08/01/2022 20:01, Dmitry A. Kazakov wrote:
On 2022-01-08 17:29, James Harris wrote:
No dynamic memory management is required for handling temporary
objects.
Where would you put the string forms?
The same place you put integer, float, etc. That place is called stack
or LIFO.
(1) integer, float etc are a fixed size known at compile-time
(2) integer, float etc are usually manipulated by value
Apparently Ada strings have a fixed length.
It would be very constraining if you had to know, in advance, the size
of a string returned from a function for a complex user-type conversion.
Which I guess also means the function has the same limit for all
possible calls.
But Ada also has unbounded strings:
"Unbounded strings are allocated using heap memory, and are deallocated automatically."
On 2022-01-08 17:36, James Harris wrote:
On 02/01/2022 17:21, Dmitry A. Kazakov wrote:
...
If this
Put_Line ("X=" & X'Image & ", Y=" & Y'Image);
is a problem in your language, then the job is not done.
What's wrong with
put_line("X=%i;, Y=%i;", X, Y)
?
Untyped, unsafe, messy, non-portable garbage that does not work with user-defined types.
And again, that is not the point. The point is that in any decent
language there is no need for either the printf mess or print statements,
because the language abstractions are powerful enough to express
formatted I/O in language terms.
On 2022-01-08 21:22, Bart wrote:
This didn't work (I guess ' has higher precedence than +?). But
neither did:
Put_Line ((X+Y)'Image);
'Image' can only be applied to a name, not an expression; why?
Because 'Image is a type attribute:
<subtype>'Image (<value>)
So
Integer_32'image (X + Y)
I think you're not in a position to tell people how to implement Print!
I am, just don't.
You need to be able to just do this:
println X+Y
Nope, I don't need that at all. In Ada it is just this:
Put (X + Y);
See the package Integer_IO (ARM A.10.8). The point is that it is almost
never used, because, again, it is not needed for real-life software.
On 2022-01-08 21:46, Bart wrote:
On 08/01/2022 20:01, Dmitry A. Kazakov wrote:
On 2022-01-08 17:29, James Harris wrote:
No dynamic memory management is required for handling temporary
objects.
Where would you put the string forms?
The same place you put integer, float, etc. That place is called
stack or LIFO.
(1) integer, float etc are a fixed size known at compile-time
So what?
(2) integer, float etc are usually manipulated by value
Irrelevant.
Apparently Ada strings have a fixed length.
Apparently not:
function Get_Line (File : File_Type) return String;
It would be very constraining if you had to know, in advance, the size
of a string returned from a function for a complex user-type conversion.
String is an indefinite type, the size of an object is unknown until run-time.
Which I guess also means the function has the same limit for all
possible calls.
Wrong. Indefinite types are returned just the same as definite types are,
on the stack, which means memory management policy LIFO.
To widen your horizon a little bit, a stack LIFO can be implemented by
many various means: using machine stack, using machine registers,
using thread local storage as well as various combinations thereof.
But Ada also has unbounded strings:
"Unbounded strings are allocated using heap memory, and are
deallocated automatically."
Unbounded_String is practically never needed and its use is discouraged,
because heap is a bad idea and because text processing algorithms almost
never require changing the length/content of a string.
On 2022-01-08 17:50, James Harris wrote:
On 02/01/2022 18:40, Dmitry A. Kazakov wrote:
On 2022-01-02 18:54, James Harris wrote:
On 02/01/2022 17:21, Dmitry A. Kazakov wrote:
If this
Put_Line ("X=" & X'Image & ", Y=" & Y'Image);
is a problem in your language, then the job is not done.
That would be poor for small-machine targets. Shame on Ada! ;-)
That works perfectly well on small targets. You seem unaware of what
actually happens on I/O. For a "small" target it would be some sort
of networking stack with a terminal emulation on top of it. Believe
me, creating a temporary string on the stack is nothing in comparison
to that. Furthermore, the implementation would likely be no less
efficient than printf which has no idea how large the result is and
would have to reallocate the output buffer or keep it between calls
and lock it from concurrent access. Locking on an embedded system is
a catastrophic event because switching threads is expensive as hell.
Note also that you cannot stream output, because networking protocols
and terminal emulators are much more efficient if you do bulk
transfers. All that is the infamous premature optimization.
I don't know why you bring locking into it. It's neither necessary nor
relevant.
Because this is how I/O works.
Furthermore, the only use for an output buffer is to make output more
efficient; it's not fundamental.
It is fundamental, there is no hardware anymore where you could just
send a single character to.
A small target will write to the network
stack, e.g. use socket send over TCP, that will coalesce output into
network packets, these would be buffered into transport layer frames,
these will go to physical layer packets etc.
There is no such thing as a character stream without a massive overhead
beneath it. So creating a string on the secondary stack is nothing in
comparison to that, especially when you skip the stream abstraction. Most
embedded software does. It tends to do I/O directly in packets, e.g.
sending application level packets over TCP or using UDP.
Tracing, the only place where text output is actually used,
On 2022-01-08 17:29, James Harris wrote:
On 02/01/2022 16:37, Dmitry A. Kazakov wrote:
On 2022-01-02 17:06, James Harris wrote:
If you convert to strings then what reclaims the memory used by
those strings?
What reclaims memory used by those integers?
Depends on where they are:
1. Globals - reclaimed when the program exits.
2. Locals on the stack or in activation records - reclaimed at least
by the time the function exits.
3. Dynamic on the heap - management required.
Now replace integer with string. The work is done.
Your solution of creating string forms of values is reasonable in a
language and an environment which already have dynamic memory
management but not otherwise.
You do not need that.
Not all languages have dynamic memory management, and dynamic memory
management is not ideal for all compilation targets.
No dynamic memory management is required for handling temporary objects.
Where would you put the string forms?
The same place you put integer, float, etc. That place is called stack
or LIFO.
Whether formatted or not, all IO tends to have higher costs than
computation and for most applications the cost of printing doesn't
matter. But when designing a language or a standard library it's a bad
idea to effectively impose a scheme which has a higher cost than
necessary because the language designer doesn't know what uses his
language will be put to.
Did you do actual measurements?
printf obviously imposes higher costs than direct conversions. And also
costs that cannot be easily optimized since the format can be an
expression and even if a constant it is difficult to break down into
direct conversions.
On 2022-01-08 17:18, James Harris wrote:
On 02/01/2022 18:25, Dmitry A. Kazakov wrote:
On 2022-01-02 19:05, James Harris wrote:
On 28/12/2021 09:21, Dmitry A. Kazakov wrote:
...
The most flexible is a combination of a string that carries most of
the information specific to the datatype (an OO method) and some
commands to the rendering environment.
That sounds interesting. How would it work?
With single dispatch you have an interface, say, 'printable'. The
interface has an abstract method 'image' with the profile:
function Image (X : Printable) return String;
Integer, float, string, whatever has to be printable inherits from
Printable and thus overrides Image. That is it.
The same goes with serialization/streaming etc.
OK, that was my preferred option, too. The trouble with it is that it
needs somewhere to put the string form. And you know the problems
therewith (lack of recursion OR lack of thread safety OR dynamic
memory management).
No idea why you think there is something special about string format or
that any of the mentioned issues would ever apply. Conversion to string
needs no recursion, is as thread safe as any other call, needs no
dynamic memory management.
There should be no format specifications at all. You just need a few parameters for Image regarding type-specific formatting and a few
parameters regarding rendering context in the actual output call.
The former are like put + if positive, base, precision etc; the latter
are like output field width, alignment, fill character etc.
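The split described above, type-specific parameters on the conversion and rendering parameters on the output call, might look roughly like this in Python. The function names image and put and their parameter names are assumptions chosen for illustration, not an existing API.

```python
import io


def image(value: int, base: int = 10, plus: bool = False) -> str:
    """Type-specific formatting: the radix and whether to show '+'."""
    if not 2 <= base <= 16:
        raise ValueError("base must be in 2..16")
    digits = "0123456789ABCDEF"
    n = abs(value)
    text = digits[n % base]
    while n >= base:
        n //= base
        text = digits[n % base] + text
    sign = "-" if value < 0 else "+" if plus else ""
    return sign + text


def put(out, text: str, width: int = 0, align: str = "right",
        fill: str = " ") -> None:
    """Rendering context: field width, alignment and fill character."""
    if align == "left":
        out.write(text.ljust(width, fill))
    else:
        out.write(text.rjust(width, fill))


buf = io.StringIO()
put(buf, image(255, base=16), width=6, fill="0")
put(buf, image(5, plus=True), width=5, align="left", fill=".")
print(buf.getvalue())  # 0000FF+5...
```

Note that image knows nothing about field widths and put knows nothing about integers, which is the separation being argued for.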
On 08/01/2022 19:48, Dmitry A. Kazakov wrote:
On 2022-01-08 17:50, James Harris wrote:
On 02/01/2022 18:40, Dmitry A. Kazakov wrote:
On 2022-01-02 18:54, James Harris wrote:
On 02/01/2022 17:21, Dmitry A. Kazakov wrote:
If this
Put_Line ("X=" & X'Image & ", Y=" & Y'Image);
is a problem in your language, then the job is not done.
That would be poor for small-machine targets. Shame on Ada! ;-)
That works perfectly well on small targets. You seem unaware of what
actually happens on I/O. For a "small" target it would be some sort
of networking stack with a terminal emulation on top of it. Believe
me, creating a temporary string on the stack is nothing in
comparison to that. Furthermore, the implementation would likely be no
less efficient than printf which has no idea how large the result is
and would have to reallocate the output buffer or keep it between
calls and lock it from concurrent access. Locking on an embedded
system is a catastrophic event because switching threads is
expensive as hell. Note also that you cannot stream output, because
networking protocols and terminal emulators are much more efficient
if you do bulk transfers. All that is the infamous premature
optimization.
I don't know why you bring locking into it. It's neither necessary
nor relevant.
Because this is how I/O works.
Why? Where there's no contention and just one task producing a certain stream's output what is there to lock from?
I have even designed a lock-free way
Furthermore, the only use for an output buffer is to make output more
efficient; it's not fundamental.
It is fundamental, there is no hardware anymore where you could just
send a single character to.
Of course there is. For example, a 7-segment display. Another: an async serial port.
A small target will write to the network stack, e.g. use socket send
over TCP, that will coalesce output into network packets, these would
be buffered into transport layer frames, these will go to physical
layer packets etc.
In a couple of replies recently you've mentioned a communication stack.
I don't know why you are thinking of such a thing but not all
communication uses the OSI 7-layer model! ;-)
On 08/01/2022 20:49, Dmitry A. Kazakov wrote:
On 2022-01-08 21:22, Bart wrote:
This didn't work (I guess ' has higher precedence than +?). But
neither did:
Put_Line ((X+Y)'Image);
'Image' can only be applied to a name, not an expression; why?
Because 'Image is a type attribute:
<subtype>'Image (<value>)
And yet it works with X'Image when X is a variable, not a type.
You need to be able to just do this:
println X+Y
Nope, I don't need that at all. In Ada it is just this:
Put (X + Y);
So it's overloading Put() with different types. But the language doesn't similarly overload Put_Line()?
See the package Integer_IO (ARM A.10.8). The point is that it is
almost never used, because, again, it is not needed for real-life software.
Huh? Have you never written data to a file?
On 08/01/2022 19:35, Dmitry A. Kazakov wrote:
No idea why you think there is something special about string format
or that any of the mentioned issues would ever apply. Conversion to
string needs no recursion,
It does if a to-string function invokes another to-string function.
is as thread safe as any other call, needs no dynamic memory management.
Unless you know the maximum size of the string
However, if the formatter is passed the value and the format (my
suggestion) then it (the formatter) can print the characters one by one
On 08/01/2022 21:04, Dmitry A. Kazakov wrote:
On 2022-01-08 21:46, Bart wrote:
On 08/01/2022 20:01, Dmitry A. Kazakov wrote:
On 2022-01-08 17:29, James Harris wrote:
No dynamic memory management is required for handling temporary
objects.
Where would you put the string forms?
The same place you put integer, float, etc. That place is called
stack or LIFO.
(1) integer, float etc are a fixed size known at compile-time
So what?
(2) integer, float etc are usually manipulated by value
Irrelevant.
Relevant because you are suggesting that strings can be manipulated just
like a 4- or 8-byte primitive type.
Apparently Ada strings have a fixed length.
Apparently not:
function Get_Line (File : File_Type) return String;
Yet I can't do this:
S: String;
"unconstrained subtype not allowed". It needs a size or to be
initialised from a literal of known length.
To widen your horizon a little bit, a stack LIFO can be implemented by
many various means: using machine stack, using machine registers,
using thread local storage as well as various combinations thereof.
Suppose you have this:
Put_Line(Get_Line(...));
Can you go into some detail as to what, exactly, is passed back from
Get_Line(), what, exactly, is passed to Put_Line(), bearing in mind that
64-bit ABIs frown on passing by value any args more than 64-bits, and
where, exactly, the actual string data, which can be of any length,
resides during this process, and how that string data is destroyed when
it is no longer needed?
Then perhaps you might explain in what way that is identical to passing
an Integer to Put().
But Ada also has unbounded strings:
"Unbounded strings are allocated using heap memory, and are
deallocated automatically."
Unbounded_String is practically never needed and discouraged to use.
Because heap is a bad idea and because text processing algorithm
almost never require changing length/content of a string.
It seems you've never written a text editor either!
On 2022-01-08 22:05, Bart wrote:
On 08/01/2022 20:49, Dmitry A. Kazakov wrote:
On 2022-01-08 21:22, Bart wrote:
This didn't work (I guess ' has higher precedence than +?). But
neither did:
Put_Line ((X+Y)'Image);
'Image' can only be applied to a name, not an expression; why?
Because 'Image is a type attribute:
<subtype>'Image (<value>)
And yet it works with X'Image when X is a variable, not a type.
That is another attribute. The type of a variable is known. The type of
an expression is not.
You need to be able to just do this:
println X+Y
Nope, I don't need that at all. In Ada it is just this:
Put (X + Y);
So it's overloading Put() with different types. But the language
doesn't similarly overload Put_Line()?
Why should it? It never happens that you print one data point per line
except when the whole line is printed and then that is a string.
See the package Integer_IO (ARM A.10.8). The point is that it is
almost never used, because, again, it is not needed for real-life software.
Huh? Have you never written data to a file?
Not this way.
Besides it is about formatted output, not about data. Data are written
in binary formats from simple to very complex like in the databases.
Formatted output is performed into a string buffer which is then output. Always, no exceptions.
On 08/01/2022 19:54, Dmitry A. Kazakov wrote:
On 2022-01-08 17:36, James Harris wrote:
On 02/01/2022 17:21, Dmitry A. Kazakov wrote:
...
If this
Put_Line ("X=" & X'Image & ", Y=" & Y'Image);
is a problem in your language, then the job is not done.
What's wrong with
put_line("X=%i;, Y=%i;", X, Y)
?
Untyped, unsafe, messy, non-portable garbage that does not work with
user-defined types.
That's just wrong. It is typesafe, clean and portable. What's more, per
the suggestion I made to start this thread it will work with
user-defined types.
On 08/01/2022 20:01, Dmitry A. Kazakov wrote:
On 2022-01-08 17:29, James Harris wrote:
On 02/01/2022 16:37, Dmitry A. Kazakov wrote:
On 2022-01-02 17:06, James Harris wrote:
If you convert to strings then what reclaims the memory used by
those strings?
What reclaims memory used by those integers?
Depends on where they are:
1. Globals - reclaimed when the program exits.
2. Locals on the stack or in activation records - reclaimed at least
by the time the function exits.
3. Dynamic on the heap - management required.
Now replace integer with string. The work is done.
Strings are, in general, not of fixed length.
Whether formatted or not, all IO tends to have higher costs than
computation and for most applications the cost of printing doesn't
matter. But when designing a language or a standard library it's a
bad idea to effectively impose a scheme which has a higher cost than
necessary because the language designer doesn't know what uses his
language will be put to.
Did you do actual measurements?
Did you?
printf obviously imposes higher costs than direct conversions. And
also costs that cannot be easily optimized since the format can be an
expression and even if a constant it is difficult to break down into
direct conversions.
I am not defending printf.
On 2022-01-08 22:18, Bart wrote:
Relevant because you are suggesting that strings can be manipulated
just like a 4- or 8-byte primitive type.
Regarding memory management they can.
Suppose you have this:
Put_Line(Get_Line(...));
Can you go into some detail as to what, exactly, is passed back from
Get_Line(),
That depends on the compiler. Get_Line could allocate a doped string
vector on the secondary stack and return a reference to it on the
primary stack. Or the dope and reference on the primary stack and the
body on the secondary stack.
what, exactly, is passed to Put_Line(),
Reference to the vector on the secondary stack or else the dope and the reference.
bearing in mind that 64-bit ABIs frown on passing by value any args
more than 64-bits, and where, exactly, the actual string data, which
can be of any length, resides during this process, and how that string
data is destroyed when it is no longer needed?
The object's length is computable from the vector's dope. E.g.
8 + high-bound - low-bound + 1 + rounding
assuming 32-bit bounds. The secondary stack containing arguments of
Put_Line is popped upon return from it.
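The size calculation above can be written out as a small Python sketch. The function name doped_string_size is hypothetical, and the sketch assumes one byte per character and that "rounding" means padding up to a 4-byte boundary; neither assumption is stated in the post.

```python
def doped_string_size(low: int, high: int, align: int = 4) -> int:
    """Bytes for two 32-bit bounds plus the characters, rounded up."""
    dope = 8                             # two 32-bit bounds, 4 bytes each
    length = max(high - low + 1, 0)      # an empty string has high < low
    raw = dope + length                  # one byte per character (assumed)
    return (raw + align - 1) // align * align


print(doped_string_size(1, 5))   # 16: 8 bytes of dope + 5 chars + 3 padding
```

Because the bounds travel with the data, the caller can work out how much of the secondary stack to pop without the callee having to report a separate length.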
Then perhaps you might explain in what way that is identical to
passing an Integer to Put().
Just the same. Push arguments on the stack, pop them upon return. In the
case of Integer it could be the primary stack instead of secondary stacks.
Which one GNAT uses for the Ada calling convention I don't know.
Hint, it is an incredibly bad idea to use Unbounded_String for a text
buffer.
On 08/01/2022 23:12, Dmitry A. Kazakov wrote:
On 2022-01-08 22:05, Bart wrote:
On 08/01/2022 20:49, Dmitry A. Kazakov wrote:
On 2022-01-08 21:22, Bart wrote:
This didn't work (I guess ' has higher precedence than +?). But
neither did:
Put_Line ((X+Y)'Image);
'Image' can only be applied to a name, not an expression; why?
Because 'Image is a type attribute:
<subtype>'Image (<value>)
And yet it works with X'Image when X is a variable, not a type.
That is another attribute. The type of a variable is known. The type
of an expression is not.
But it is known with Put(X+Y) ?
You need to be able to just do this:
println X+Y
Nope, I don't need that at all. In Ada it is just this:
Put (X + Y);
So it's overloading Put() with different types. But the language
doesn't similarly overload Put_Line()?
Why should it? It never happens that one data point is printed per line,
except when the whole line is printed, and then that is a string.
The common-sense way of doing this is to define Put/Putln to work identically, but the latter writes a newline at the end.
I'm not sure where you get the idea that one item per line never happens unless it's a string. If I wanted to output N numbers, I can do N calls
to Put, but each has to be followed by Put_Line("") or Put_Line(" ") to separate them? Instead of just doing N calls of Put_Line(x).
It's an unnecessary restriction. (Have a look at a FizzBuzz program.)
See the package Integer_IO (ARM A.10.8). The point is that it is
almost never used, because, again, it is not needed for real-life software.
Huh? Have you never written data to a file?
Not this way.
Besides it is about formatted output, not about data. Data are written
in binary formats from simple to very complex like in the databases.
Formatted output is performed into a string buffer which is then
output. Always, no exceptions.
Nonsense. There are no rules for what someone may want to write to a
text file.
There are also files with mixed text and binary content, eg. PGM files.
It's also not clear, as James pointed out, how a string created on a secondary stack in a called function manages to move to the secondary
stack of the caller,
This is also highly specific to machine, language and ABI.
In other words, nothing like as simple as passing an integer.
Hint, it is an incredibly bad idea to use Unbounded_String for a text
buffer.
The basis on which most programs work is that memory is a huge array of mutable bytes.
On 2022-01-09 01:15, Bart wrote:
It's also not clear, as James pointed out, how a string created on a
secondary stack in a called function manages to move to the secondary
stack of the caller,
One technique is to use two stacks and swap them over. The arguments and locals stack vs the results stack so that the callee's arguments and
locals stack is the caller's results stack. The call sequence would be
this:
push arguments onto S1
call and swap stacks
push locals on S2 (former S1)
push result onto S1 (former S2)
return and swap stacks
pop S1
Now the result of the call is on the arguments and locals stack as
expected.
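One reading of the sequence above can be modelled with two Python lists. This is only a toy sketch: the names `Machine`, `call` and `get_line` are invented for illustration, and a real implementation would swap stack pointers rather than list references.

```python
class Machine:
    """Toy model of two stacks whose roles are exchanged on call and return."""

    def __init__(self):
        self.s1 = []   # currently plays the "arguments and locals" role
        self.s2 = []   # currently plays the "results" role

    def swap(self):
        self.s1, self.s2 = self.s2, self.s1


def get_line(m):
    """Callee: runs after the call-time swap, so the roles are exchanged."""
    m.s2.append("local scratch")          # push locals on S2 (former S1)
    m.s1.append("variable-length text")   # push result onto S1 (former S2)


def call(m, callee, *args):
    frame_base = len(m.s1)
    m.s1.extend(args)                     # push arguments onto S1
    m.swap()                              # call and swap stacks
    callee(m)
    m.swap()                              # return and swap stacks
    del m.s1[frame_base:]                 # pop S1: arguments and locals go
    return m.s2[-1]                       # result survived on the other stack


m = Machine()
print(call(m, get_line, "prompt"))        # variable-length text
```

The point of the sketch is that the result never has to be moved: it was built on the stack that the frame pop does not touch.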
This is also highly specific to machine, language and ABI.
This is absolutely non-specific. A machine may provide some support, but there is no problem if it does not.
In other words, nothing like as simple as passing an integer
It is exactly the same stack handling for everything.
The text buffers have certain requirements like effective insertion and removing portions of text as well as effective text tagging. A character array is incredibly miserable on these.
If you are not convinced by formatted io then what kind of io do
you prefer?
Unformatted transput, of course. Eg,
print this, that, these, those and the many other things
OK. I would say that that has /default/ formatting but is still
formatted.
You have a preferred syntax in which users can express how they want
values to be formatted? Suggestions welcome!
The design of the language supports dynamically sized strings. That's
fine for applications which need them. But that's different from
imposing such strings and the management thereof on every print
operation.
You're making mountains out of
molehills.
Oh? How would you handle
widget w
print(w)
?
ISTM you are suggesting that w be converted to a string (or array of characters) and then printed.
a. Where would you store the array of characters?
b. What's wrong with
print "String with %v; included" % val
Cool. But if you agree with my suggestion of a means which can be
used to render each character in sequence I don't know why you
suggested conversion to an array of characters.
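The difference between the two approaches can be sketched in Python with a generator. This is a hypothetical illustration: `render_widget`, `emit` and the widget's `name` field are invented here and are not part of either proposal.

```python
import sys


def render_widget(w):
    """Hypothetical renderer: yields a widget's text one character at a time."""
    yield from "widget<"
    yield from w["name"]
    yield ">"


def emit(chars, write=sys.stdout.write):
    """Pass characters to the output as they arrive; no full string is built."""
    count = 0
    for ch in chars:
        write(ch)
        count += 1
    return count


emit(render_widget({"name": "ok-button"}))
```

With this shape the print routine never needs to know where an array of characters would be stored, because none is created.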
[...] I don't know why you and James are so opposed to the use
of heap storage [and temporary files, if you really want strings
that are many gigabytes]?
Because heap storage requires a more advanced language to manage
properly, ie. automatically. (I don't care for speculative GC
methods.)
On 07/01/2022 18:14, Bart wrote:
[I wrote:]
[...] I don't know why you and James are so opposed to the use
of heap storage [and temporary files, if you really want strings
that are many gigabytes]?
Because heap storage requires a more advanced language to manage
properly, ie. automatically. (I don't care for speculative GC
methods.)
Oh. Well, a language either provides heap storage or it
does not. Even C does, even if in an unsafe and rather primitive
way. The techniques involved aren't exactly cutting-edge recent.
For the limited requirements of strings-for-transput, procedures
such as [in C terms] "malloc" and "free" are entirely sufficient
and safe. If you have control [as you do for your own language]
of the compiler, then you can implement strings-as-results even
without any off-stack storage [simply copy them, or indeed any
data structures that don't involve pointers, down into the stack
space of the calling procedure as part of exiting the returning
function]. But of course it's easier, from the PoV of the
programmer, if heaps and strings are already built in. [Of
course, as discussed elsewhere in the thread, strings aren't
necessary either, merely convenient.]
So what you actually seem to be saying is that your [or
James's] language "should" be more primitive than C, and should
be incapable of implementing reasonably general data structures,
such as trees, that require heaps, or near equivalents. That
would be a pretty definite deal-breaker for me. I can understand
that you might want a better implementation than C's; but that's
another [and perhaps inspiring] matter.
On 08/01/2022 17:15, James Harris wrote:
b. What's wrong with
print "String with %v; included" % val
Nothing, except that "%" has become a privileged
character [with two different syntaxes in your example].
But why should anyone prefer it to
print "String with ", val, " included"
[...] But why should anyone prefer [James's proposal] to
print "String with ", val, " included"
But sometimes I decide to use formatted print, or 'fprint', rather
than 'print'. 'fprint' starts with a format string. Just see for
yourself whether my choice was justified:
[... examples snipped ...]
The 'fprint' versions give a nice overall picture of the layout of
the line, with variable parts represented by '#'. It is very easy
to get that right and to maintain.
I can very easily control the spacing between items; unspaced is
"##", spaced is "# #".
But also, I can instantly change the format to anything else, eg. an
extra space is "# #"; with a comma as well it's "#, #".
Another advantage is being able to use a generic format like "(#,
#, #)", and applying to a sequence of 3-element prints, if I wanted
them to all be displayed in the same style. And to change the style,
I change it one place.
Still not convinced? This is an easy and low-cost language feature
for the extra convenience provided.
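For readers who have not seen this style, the '#' placeholder idea can be approximated in a few lines of Python. This is a rough sketch, not the actual 'fprint' implementation: the name `fformat` is invented, and every value is simply rendered with `str()`.

```python
def fformat(fmt: str, *values) -> str:
    """Replace each '#' in fmt with the next value, rendered by str()."""
    it = iter(values)
    out = []
    for ch in fmt:
        if ch == "#":
            try:
                out.append(str(next(it)))
            except StopIteration:
                raise ValueError("more '#' fields than values") from None
        else:
            out.append(ch)
    return "".join(out)


triple = "(#, #, #)"                 # one shared layout, changed in one place
print(fformat(triple, 1, 2, 3))      # (1, 2, 3)
print(fformat("# #", "a", "b"))      # a b
```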
On 10/01/2022 01:25, Bart wrote:
[I wrote:]
[...] But why should anyone prefer [James's proposal] to
print "String with ", val, " included"
But sometimes I decide to use formatted print, or 'fprint', rather
than 'print'. 'fprint' starts with a format string. Just see for
yourself whether my choice was justified:
[... examples snipped ...]
Obviously, if your language has /both/ "print" and "printf"
/and/ you're well versed in the details of both, then you can and
should use whichever suits you in particular cases. Most users
are not so familiar with the details of whatever languages they're
using, and have no control over what the language provides. So
the more important question is whether a /new/ language /should/
have both, and if not which is the one that should go. Bearing
in mind that "printf" necessitates the invention of a "little
language" [or not so little!] and perhaps additional syntax, and
will still not be comprehensive [cf your "date" example], ISTM
that the decision is straightforward.
The 'fprint' versions give a nice overall picture of the layout of
the line, with variable parts represented by '#'. It is very easy
to get that right and to maintain.
Yes, but there are other ways to do that without adding
to the size of the language and its description. Left as an
exercise.
But also, I can instantly change the format to anything else, eg. an
extra space is "# #"; with a comma as well it's "#, #".
Not "anything else" [or, at least, not "instantly"]. As
you came close to pointing out, there are dozens of ways in which
dates are commonly [or less commonly!] printed, not all of which
are trivial re-arrangements of "print day, month, year".
[...]
Another advantage is being able to use a generic format like "(#,
#, #)", and applying to a sequence of 3-element prints, if I wanted
them to all be displayed in the same style. And to change the style,
I change it one place.
Yes, but you don't need "printf" to do that. A procedure
that takes three parameters [or an array, and if you like strings
to act as start, finish and separator] and prints the appropriate
output is essentially trivial to write in any sensible language.
Providing formatting for triples of integers probably isn't the
language's job, and should be left to the users.
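The kind of procedure meant here might look like the following Python sketch. The name `print_row` and its default strings are assumptions chosen for illustration; the idea is only that an array plus start, finish and separator strings covers the "(#, #, #)" case without a format language.

```python
def print_row(items, start="(", finish=")", sep=", "):
    """Print the items between start and finish, joined by sep."""
    line = start + sep.join(str(x) for x in items) + finish
    print(line)
    return line


print_row([1, 2, 3])                   # (1, 2, 3)
print_row([1, 2, 3], "[", "]", "; ")   # [1; 2; 3]
```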
Still not convinced? This is an easy and low-cost language feature
for the extra convenience provided.
Providing simple procedures such as the one just suggested
is indeed easy, but not to do with writing little languages, or
adding syntax, and esp not with doubling the documentation needed.
I did an experiment: I removed support for 'fprint/fprintln' from my compiler.
I left in the library support for it, since this will still be needed
whether it's built-in, or implemented via functions.
It made the compiler executable about 0.2% smaller. It made the
source code 50 lines smaller.
However, it leaves a problem: exactly how do I achieve the same thing
using only user-functions?
That is, implement a function like this:
formattedprint(dest, formatstring, x, y, fmtxx(z,"z11"))
where x, y, z are of arbitrary types. And it introduces the
problems that have been discussed of what to do about the string
generated from fmtxx(). It has the -xx designation because that is
also type-specific.
I don't have variadic parameters either.
It is little to do with the language knowing how to deal with:
print d
when 'd' is some user-defined type.
See my example above. Not all formats are just "#, #, #". And in my
example of applying one format to multiple prints, not all those
prints will have the same types.
Providing simple procedures such as the one just suggested
is indeed easy, but not to do with writing little languages, or
adding syntax, and esp not with doubling the documentation needed.
You will still need to document those functions. Or perhaps you are
suggesting every user has to reinvent the same functions for turning
numbers into strings, padding to a given width, justifying left or
right etc etc.
On 12/01/2022 13:22, Bart wrote:
I did an experiment: I removed support for 'fprint/fprintln' from my
compiler.
I left in the library support for it, since this will still be needed
whether it's built-in, or implemented via functions.
That depends on what the support entails. If it's anything
even remotely like Algol support for formats, it's simply not needed
if you don't have formats [Algol "$ ... $", often one of the first
things to go in subset languages (inc early versions of A68R)]. If
OTOH you really mean the support for "print", then of course that
will be needed, but that suggests that your idea of formatted
transput is, to say the least, minimal.
It made the compiler executable about 0.2% smaller. It made the
source code 50 lines smaller.
However, it leaves a problem: exactly how do I achieve the same thing
using only user-functions?
Well, I jotted down A68G code for what seems to be roughly
your "fprint[ln]" example. It comes to 19 lines, inc six lines to
implement a couple of bells-and-whistles [optional trailing
newline, and repeat last format]:
STRING ditto = "";
STRING lastfstring := ditto;
BOOL donewline := TRUE; # set FALSE to suppress trailing newline #
PROC fprint = (STRING s, [] UNION (INT, LONG INT, REAL, STRING) a) VOID:
   ( STRING fmt = ( s = ditto | lastfstring | lastfstring := s );
     INT i := 0;
     FOR j TO UPB fmt
        DO IF fmt[j] = "#"
              THEN CASE a[i +:= 1]
                     IN (UNION (INT, LONG INT) k): print (whole (k, 0)),
                        (REAL r): print (fixed (r, -8, 5)),
                        (STRING w): print (w)
                   ESAC
              ELSE print (fmt[j])
           FI
        OD;
     donewline | print (newline)
   );
fprint ( "Hello #! Sqrt # is #, to 5dp", ("World", 2, sqrt(2)) );
fprint ( ditto, ("Bart", 1.69, "1.3") );
fprint ( "Longmaxint is #.", longmaxint )
[prints:
Hello World! Sqrt 2 is 1.41421, to 5dp
Hello Bart! Sqrt 1.69000 is 1.3, to 5dp
Longmaxint is 999999999999999999999999999999999999999999.
[...]
I don't have variadic parameters either.
Nor does Algol.
[...]
It is little to do with the language knowing how to deal with:
print d
when 'd' is some user-defined type.
That's not to do with formatted printing either.
[...]
See my example above. Not all formats are just "#, #, #". And in my
example of applying one format to multiple prints, not all those
prints will have the same types.
See my code above.
[...]
Providing simple procedures such as the one just suggested
is indeed easy, but not to do with writing little languages, or
adding syntax, and esp not with doubling the documentation needed.
You will still need to document those functions. Or perhaps you are
suggesting every user has to reinvent the same functions for turning
numbers into strings, padding to a given width, justifying left or
right etc etc.
How much documentation do you suggest the code above needs?
Surely nothing like the scores of pages needed to describe formats
in C and Algol? [Formatted transput is ~15% of the RR, four times
as much as unformatted (which easily does all the things you're
"suggesting" users might have to reinvent), a similar fraction of
the A68 /syntax/, and is a nightmare to lex, as formats can
contain ordinary code, potentially with embedded formats nested
to arbitrary depth.]
On 15/01/2022 19:53, Andy Walker wrote:
Well, I jotted down A68G code for what seems to be roughly
your "fprint[ln]" example. It comes to 19 lines, inc six lines to
implement a couple of bells-and-whistles [optional trailing
newline, and repeat last format]:
STRING ditto = "";
STRING lastfstring := ditto;
BOOL donewline := TRUE; # set FALSE to suppress trailing newline #
PROC fprint = (STRING s, [] UNION (INT, LONG INT, REAL, STRING) a) VOID:
   ( STRING fmt = ( s = ditto | lastfstring | lastfstring := s );
     INT i := 0;
     FOR j TO UPB fmt
        DO IF fmt[j] = "#"
              THEN CASE a[i +:= 1]
                     IN (UNION (INT, LONG INT) k): print (whole (k, 0)),
                        (REAL r): print (fixed (r, -8, 5)),
                        (STRING w): print (w)
                   ESAC
              ELSE print (fmt[j])
           FI
        OD;
     donewline | print (newline)
   );
fprint ( "Hello #! Sqrt # is #, to 5dp", ("World", 2, sqrt(2)) );
fprint ( ditto, ("Bart", 1.69, "1.3") );
fprint ( "Longmaxint is #.", longmaxint )
[prints:
Hello World! Sqrt 2 is 1.41421, to 5dp
Hello Bart! Sqrt 1.69000 is 1.3, to 5dp
Longmaxint is 999999999999999999999999999999999999999999.
That's a reasonable attempt at emulating 'fprint/ln'. It's an approach I can't use in my systems language, because it doesn't have automatic
tagged unions as used here; arbitrary array constructors; nor that
automatic 'rowing' feature to turn one item into a list.
Dmitry A. Kazakov <mailbox@dmitry-kazakov.de> wrote:
On 2022-01-02 18:50, Bart wrote:
On 02/01/2022 17:21, Dmitry A. Kazakov wrote:
On 2022-01-02 18:08, Bart wrote:
On 02/01/2022 16:37, Dmitry A. Kazakov wrote:
On 2022-01-02 17:06, James Harris wrote:
If you convert to strings then what reclaims the memory used by
those strings?
What reclaims memory used by those integers?
Integers are passed by value at this level of language.
This has nothing to do with the question: what reclaims integers?
FORTRAN-IV passed everything by reference, yet calls
??? FOO (I + 1)
were OK almost human life span ago.
Fortran didn't allow recursion either.
Irrelevant. What reclaims integer I+1?
Relevant. Early Fortrans statically allocated storage to I + 1.
Storage was "reclaimed" by OS at program termination, but
reserved for the whole run. Such static allocation is
impossible without static bound on maximal size of object
and it does not work in case of recursion.
On 2022-01-02 18:50, Bart wrote:
On 02/01/2022 17:21, Dmitry A. Kazakov wrote:
On 2022-01-02 18:08, Bart wrote:
On 02/01/2022 16:37, Dmitry A. Kazakov wrote:
On 2022-01-02 17:06, James Harris wrote:
If you convert to strings then what reclaims the memory used by
those strings?
What reclaims memory used by those integers?
Integers are passed by value at this level of language.
This has nothing to do with the question: what reclaims integers?
FORTRAN-IV passed everything by reference, yet calls
??? FOO (I + 1)
were OK almost human life span ago.
Fortran didn't allow recursion either.
Irrelevant. What reclaims integer I+1?
Well, I jotted down A68G code for what seems to be roughly
your "fprint[ln]" example. It comes to 19 lines, inc six lines to
implement a couple of bells-and-whistles [...].
[Bart:]
That's a reasonable attempt at emulating 'fprint/ln'. It's an
approach I can't use in my systems language, because it doesn't
have automatic tagged unions as used here; arbitrary array
constructors; nor that automatic 'rowing' feature to turn one item
into a list.
Actually your example demonstrates your argument in reverse: it's
full of complex language features which I decided are not necessary
to build in to my systems language. (Not just that: they are hard to implement, and inefficient.)
You are saying a language shouldn't have this type of formatting
control built-in, because it's so easy to emulate a half-working
version with different behaviour using other features.
Provided the language has those features, as A68 coincidentally
happens to have!
Meanwhile it also demonstrates one or two features that are missing
from A68, which I do have in my systems language and consider more
useful: local static variables (that retain their value between
calls), and optional function parameters with default values.
On 2022-01-18 20:19, antispam@math.uni.wroc.pl wrote:
Dmitry A. Kazakov <mailbox@dmitry-kazakov.de> wrote:
On 2022-01-02 18:50, Bart wrote:
On 02/01/2022 17:21, Dmitry A. Kazakov wrote:
On 2022-01-02 18:08, Bart wrote:
On 02/01/2022 16:37, Dmitry A. Kazakov wrote:
On 2022-01-02 17:06, James Harris wrote:
If you convert to strings then what reclaims the memory used by
those strings?
What reclaims memory used by those integers?
Integers are passed by value at this level of language.
This has nothing to do with the question: what reclaims integers?
FORTRAN-IV passed everything by reference, yet calls
??? FOO (I + 1)
were OK almost human life span ago.
Fortran didn't allow recursion either.
Irrelevant. What reclaims integer I+1?
Relevant. Early Fortrans statically allocated storage to I + 1.
Storage was "reclaimed" by OS at program termination, but
reserved for the whole run. Such static allocation is
impossible without static bound on maximal size of object
and it does not work in case of recursion.
And still irrelevant, whatever reclaims integers can reclaim strings and conversely. Furthermore all that has nothing to do with the parameter
passing mode or where and how call frames get allocated.
On 07/01/2022 18:14, Bart wrote:
[I wrote:]
[...] I don't know why you and James are so opposed to the use
of heap storage [and temporary files, if you really want strings
that are many gigabytes]?
Because heap storage requires a more advanced language to manage
properly, ie. automatically. (I don't care for speculative GC
methods.)
Oh. Well, a language either provides heap storage or it
does not. Even C does, even if in an unsafe and rather primitive
way. The techniques involved aren't exactly cutting-edge recent.
To put it differently, returning an integer on a modern machine
requires very little code in the compiler and at runtime:
one just puts the value in the designated return register. The
whole call-return machinery and stack frame allocation
can be done in, say, 200 lines of compiler code. The
machinations needed to handle variable-sized objects are
one or two orders of magnitude more complicated.
So saying that "whatever reclaims integers can reclaim strings"
is at best misleading.
P.S. I used to like very much the idea of allocating variable
size objects on stack(s). But after looking at various
tradeoffs I am no longer convinced that it makes sense.
On 2022-01-19 02:25, antispam@math.uni.wroc.pl wrote:
To put it differently, returning an integer on a modern machine
requires very little code in the compiler and at runtime:
one just puts the value in the designated return register. The
whole call-return machinery and stack frame allocation
can be done in, say, 200 lines of compiler code. The
machinations needed to handle variable-sized objects are
one or two orders of magnitude more complicated.
Or reverse, advanced algorithms of register optimizations are incredibly complicated while dealing with arrays is very straightforward. You make assumptions about certain implementations which confirm nothing but
these assumptions.
So
saying that "whatever reclaims integers can reclaim strings"
is at best misleading.
No, it is exactly the point. Integers and strings and all other objects managed by the language are created and reclaimed by the language memory management system. Which in turn can operate in a LIFO policy even if
the object sizes are indeterminable.
Your loss. Using the heap is a crime on a multi-core architecture.
On 16/01/2022 11:01, Bart wrote:
[I wrote:]
Well, I jotted down A68G code for what seems to be roughly
your "fprint[ln]" example. It comes to 19 lines, inc six lines to
implement a couple of bells-and-whistles [...].
[Bart:]
That's a reasonable attempt at emulating 'fprint/ln'. It's an
approach I can't use in my systems language, because it doesn't
have automatic tagged unions as used here; arbitrary array
constructors; nor that automatic 'rowing' feature to turn one item
into a list.
Tagged unions: They aren't exactly an unusual feature of
languages [Wiki gives examples from Algol, Pascal, Ada, ML, Haskell,
Modula, Rust, Nim, ..., tho' sometimes, as in C, a certain amount
of pushing and squeezing is needed]. If your unions aren't tagged,
it's a recipe for unsafe use of types [as in C].
Arbitrary array constructors: ??? You mean the ability to
write down an actual row of things and have the language treat it
as a row of things? How awful.
Rowing: Well, you're on better ground here, as the rowing
coercion is one of the features of Algol that has been touted as
something that perhaps ought not to have been included. But it's
only syntactic sugar, so it's easily worked around if your language
doesn't have it.
Actually your example demonstrates your argument in reverse: it's
full of complex language features which I decided are not necessary
to build in to my systems language. (Not just that: they are hard to
implement, and inefficient.)
Complex? Did you find my code difficult to follow?
You
[and more importantly, in context, James] don't need to re-invent
wheels, any more than you need to invent new parsing techniques
or new ways of sorting lists. Just copy the code from any of the
many PD compilers out there.
Inefficient? ???
You are saying a language shouldn't have this type of formatting
control built-in, because it's so easy to emulate a half-working
version with different behaviour using other features.
Half-working?
our continental friends], alternatives to "#", ..., but they
"Local static variables" were in Algol 60, were problematic
in detailed specification and were therefore dropped in Algol 68 [RR 0.2.6f]. Some of the reasons are expanded in the well-known book on
"Algol 60 Implementation" by Randell and Russell, esp in relation to
"own" arrays. The effects are easy to obtain in other ways. [Most
other prominent languages don't have them either.]
On 19/01/2022 07:42, Dmitry A. Kazakov wrote:
On 2022-01-19 02:25, antispam@math.uni.wroc.pl wrote:
No, it is exactly the point. Integers and strings and all other
objects managed by the language are created and reclaimed by the
language memory management system. Which in turn can operate in a LIFO
policy even if the object sizes are indeterminable.
At the point where a Print routine has finished with a string associated
with an item in print-list, how will the language know how to recover
any resource used by that string, or if it needs to do so? Remember that
at different times through the same code:
* The string might be a literal (so it can be left)
* It can be constructed just for this purpose (so must be recovered)
* It could belong to a global entity (so can be left)
* It could be shared (so a reference count may need adjusting)
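The four cases can be made concrete by carrying an ownership tag with each string, so that the print routine can decide what to do when it is finished. The Python below is a hypothetical sketch only; `Owner`, `Str` and `release_after_print` are invented names, and no claim is made that any particular language works this way.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Owner(Enum):
    LITERAL = auto()     # lives in static data; leave it alone
    TEMPORARY = auto()   # built just for this print; must be recovered
    GLOBAL = auto()      # owned by a global entity; leave it alone
    SHARED = auto()      # reference counted; drop one reference


@dataclass
class Str:
    text: str
    owner: Owner
    refcount: int = 1
    freed: bool = False


def release_after_print(s: Str) -> None:
    """What a print routine would have to decide once it is done with s."""
    if s.owner is Owner.TEMPORARY:
        s.freed = True
    elif s.owner is Owner.SHARED:
        s.refcount -= 1
        if s.refcount == 0:
            s.freed = True
    # LITERAL and GLOBAL: nothing to do


tmp = Str("x = 42", Owner.TEMPORARY)
release_after_print(tmp)
print(tmp.freed)   # True
```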
Your loss. Using the heap is a crime on a multi-core architecture.
Huh? Pretty much everything is multi-core now other than on small devices.
How else are you going to use all those GB of memory other than using a
heap?
You may as well insist that file storage on a disk is allocated in a
LIFO manner too; that would get rid of all that pesky fragmentation!
People who invent file systems seem to have missed a trick.
On 19/01/2022 12:24, Dmitry A. Kazakov wrote:
On 2022-01-19 11:48, Bart wrote:
On 19/01/2022 07:42, Dmitry A. Kazakov wrote:
On 2022-01-19 02:25, antispam@math.uni.wroc.pl wrote:
No, it is exactly the point. Integers and strings and all other
objects managed by the language are created and reclaimed by the
language memory management system. Which in turn can operate in a
LIFO policy even if the object sizes are indeterminable.
At the point where a Print routine has finished with a string
associated with an item in print-list, how will the language know how
to recover any resource used by that string, or if it needs to do so?
Remember that at different times through the same code:
* The string might be a literal (so it can be left)
* It can be constructed just for this purpose (so must be recovered)
* It could belong to a global entity (so can be left)
* It could be shared (so a reference count may need adjusting)
Which exactly applies to integer. Moreover things are far more
complicated to integers. An integer can be
- packed and misaligned in a container
- in a register
- optimized away value
- atomic access value
- mapped to an I/O port non-relocatable value
This is an integer being passed to a function. The Win64 ABI specifies
that a /copy/ of its value is passed in register RCX if it is the first argument.
What does it say about the values of strings?
Your loss. Using the heap is a crime on a multi-core architecture.
Huh? Pretty much everything is multi-core now other than on small
devices.
See, could not even claim the green grapes.
Huh?
How else are you going to use all those GB of memory other than using
a heap?
For doing something useful, maybe?
Next you're going to tell me that all those languages that need garbage collection are doing it all wrong.
You may as well insist that file storage on a disk is allocated in a
LIFO manner too; that would get rid of all that pesky fragmentation!
Well, if you looked how journaling file systems function or how flash
does you might experience another revelation...
I'm talking about normal, random-access, read-write devices.
People who invent file systems seem to have missed a trick.
Sure. None uses heap to transfer memory blocks, I hope. Oh, don't tell
me you just wrote one. You'll get me a PTSD...
Huh? again. Are you on something?
On 2022-01-19 11:48, Bart wrote:
On 19/01/2022 07:42, Dmitry A. Kazakov wrote:
On 2022-01-19 02:25, antispam@math.uni.wroc.pl wrote:
No, it is exactly the point. Integers and strings and all other
objects managed by the language are created and reclaimed by the
language memory management system. Which in turn can operate in a
LIFO policy even if the object sizes are indeterminable.
At the point where a Print routine has finished with a string
associated with an item in print-list, how will the language know how
to recover any resource used by that string, or if it needs to do so?
Remember that at different times through the same code:
* The string might be a literal (so it can be left)
* It can be constructed just for this purpose (so must be recovered)
* It could belong to a global entity (so can be left)
* It could be shared (so a reference count may need adjusting)
Which exactly applies to integer. Moreover things are far more
complicated to integers. An integer can be
- packed and misaligned in a container
- in a register
- optimized away value
- atomic access value
- mapped to an I/O port non-relocatable value
Your loss. Using the heap is a crime on a multi-core architecture.
Huh? Pretty much everything is multi-core now other than on small
devices.
See, could not even claim the green grapes.
How else are you going to use all those GB of memory other than using
a heap?
For doing something useful, maybe?
You may as well insist that file storage on a disk is allocated in a
LIFO manner too; that would get rid of all that pesky fragmentation!
Well, if you looked how journaling file systems function or how flash
does you might experience another revelation...
People who invent file systems seem to have missed a trick.
Sure. None uses heap to transfer memory blocks, I hope. Oh, don't tell
me you just wrote one. You'll get me a PTSD...
On 2022-01-19 14:34, Bart wrote:
On 19/01/2022 12:24, Dmitry A. Kazakov wrote:
- packed and misaligned in a container
- in a register
- optimized away value
- atomic access value
- mapped to an I/O port non-relocatable value
This is an integer being passed to a function. The Win64 ABI specifies
that a /copy/ of its value is passed in register RCX if it is the
first argument.
If your language does not support out and in/out parameters it is your problem and this has nothing to do with ABI, at all. The following is
legal Ada:
function Foo (X : String) return String;
pragma Convention (Stdcall, Foo);
Takes string returns string and has Win32 calling convention. The
compiler will give you a friendly warning that it would be tricky to use
it from C, but otherwise there is no problem.
What does it say about the values of strings?
Win32 says pretty much same: LPSTR, LPLONG.
On 19/01/2022 13:59, Dmitry A. Kazakov wrote:
On 2022-01-19 14:34, Bart wrote:
What does it say about the values of strings?
Win32 says pretty much same: LPSTR, LPLONG.
Those types are 64-bit pointers on Win64.
LPSTR points to a sequence of bytes, terminated with zero to represent a crude C-style string.
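The layout being described can be inspected from Python with the standard `ctypes` module. This is just an illustration of a pointer to zero-terminated bytes; it calls no Windows API, and the variable names are arbitrary.

```python
import ctypes

buf = ctypes.create_string_buffer(b"abcd")   # adds the terminating zero byte
ptr = ctypes.cast(buf, ctypes.c_char_p)      # roughly what LPSTR is: char *

print(ctypes.sizeof(buf))   # 5: four characters plus the terminating zero
print(ctypes.sizeof(ptr))   # size of the pointer itself, 8 on a 64-bit build
print(ptr.value)            # b'abcd', read up to the terminating zero
```

What gets passed is the pointer, whose size is fixed; the length of the bytes it points at is not part of it.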
On 2022-01-19 15:38, Bart wrote:
On 19/01/2022 13:59, Dmitry A. Kazakov wrote:
On 2022-01-19 14:34, Bart wrote:
What does it say about the values of strings?
Win32 says pretty much same: LPSTR, LPLONG.
Those types are 64-bit pointers on Win64.
For example this:
procedure Foo (X : in out LONG);
pragma Convention (Stdcall, Foo);
will deploy LPLONG for X. The value will be passed by reference (LPLONG)
as Win32 Stdcall prescribes. See any pointers? Right, there is none.
LPSTR points to a sequence of bytes, terminated with zero to represent
a crude C-style string.
Yes, and the following works perfectly well:
declare
procedure Foo (X : char_array);
pragma Convention (Stdcall, Foo);
X : char_array := "abc";
begin
Foo (X & "d" & NUL);
Foo will get null-terminated "abcd". If implemented in C, it would use
LPSTR. Again, no pointers in sight. And, no, heap will not be used.
On 19/01/2022 15:20, Dmitry A. Kazakov wrote:
On 2022-01-19 15:38, Bart wrote:
On 19/01/2022 13:59, Dmitry A. Kazakov wrote:
On 2022-01-19 14:34, Bart wrote:
What does it say about the values of strings?
Win32 says pretty much same: LPSTR, LPLONG.
Those types are 64-bit pointers on Win64.
For example this:
procedure Foo (X : in out LONG);
pragma Convention (Stdcall, Foo);
will deploy LPLONG for X. The value will be passed by reference
(LPLONG) as Win32 Stdcall prescribes. See any pointers? Right, there
is none.
Actually, yes. A reference is a pointer, just an implicit one. The 'P'
in LPLONG stands for 'Pointer'; what did you think it was?
LPSTR points to a sequence of bytes, terminated with zero to
represent a crude C-style string.
Yes, and the following works perfectly well:
declare
procedure Foo (X : char_array);
pragma Convention (Stdcall, Foo);
X : char_array := "abc";
begin
Foo (X & "d" & NUL);
Foo will get null-terminated "abcd". If implemented in C, it would use
LPSTR. Again, no pointers in sight. And, no, heap will not be used.
Not even for Foo(X*1000000)?
Sure, if you say so:
* No pointers are involved
* No heap storage is necessary, no matter what language
* No memory resources are involved
* Passing an arbitrary string expression is just like passing an integer expression
On 2022-01-19 17:00, Bart wrote:
On 19/01/2022 15:20, Dmitry A. Kazakov wrote:
On 2022-01-19 15:38, Bart wrote:
On 19/01/2022 13:59, Dmitry A. Kazakov wrote:
On 2022-01-19 14:34, Bart wrote:
What does it say about the values of strings?
Win32 says pretty much same: LPSTR, LPLONG.
Those types are 64-bit pointers on Win64.
For example this:
procedure Foo (X : in out LONG);
pragma Convention (Stdcall, Foo);
will deploy LPLONG for X. The value will be passed by reference
(LPLONG) as Win32 Stdcall prescribes. See any pointers? Right, there
is none.
Actually, yes. A reference is a pointer, just an implicit one. The
'P' in LPLONG stands for 'Pointer'; what did you think it was?
LONG, see the declaration of Foo.
LPSTR points to a sequence of bytes, terminated with zero to
represent a crude C-style string.
Yes, and the following works perfectly well:
declare
procedure Foo (X : char_array);
pragma Convention (Stdcall, Foo);
X : char_array := "abc";
begin
Foo (X & "d" & NUL);
Foo will get null-terminated "abcd". If implemented in C, it would
use LPSTR. Again, no pointers in sight. And, no, heap will not be used.
Not even for Foo(X*1000000)?
Yes, for that invoking a contract termination clause would be the best choice.
However this:
------------------------------ test.adb -------
with Ada.Text_IO; use Ada.Text_IO;
with Interfaces.C.Strings; use Interfaces.C, Interfaces.C.Strings;
with Ada.Unchecked_Conversion, System;
procedure Test is
procedure Foo (X : char_array);
pragma Convention (Stdcall, Foo);
procedure Foo (X : char_array) is
function "+" is
new Ada.Unchecked_Conversion (System.Address, chars_ptr);
begin
Put_Line ("Length:" & size_t'Image (strlen (+X'Address)));
end Foo;
function "*" (X : char_array; Y : size_t) return char_array is
begin
return Result : char_array (0..X'Length * Y) do
for I in 1..Y loop
Result ((I - 1) * X'Length..I * X'Length - 1) := X;
end loop;
Result (Result'Last) := NUL;
end return;
end "*";
X : char_array := "abc";
begin
Foo (X * 1000000);
end Test;
----------------------------------------
compiles and works just fine:
----------------------------------------
gcc -c test.adb
test.adb:7:19: warning: type of argument "Foo.X" is unconstrained array [-gnatwx]
test.adb:7:19: warning: foreign caller must pass bounds explicitly [-gnatwx]
gnatbind -x test.ali
gnatlink test.ali
D:\Temp\y\test>test
Length: 3000000
----------------------------------------
Questions?
Sure, if you say so:
* No pointers are involved
Right. None. In Ada a pointer is a distinct type whose declaration
contains the keyword "access". Saw any?
* No heap storage is necessary, no matter what language
Right. Since avoiding heap for dealing with indefinite object is a
computable problem there is no necessity.
* No memory resources are involved
Wrong. Memory is always required for expression evaluation.
* Passing an arbitrary string expression is just like passing an
integer expression
Neither happens. We are not talking about closures.
On 19/01/2022 17:17, Dmitry A. Kazakov wrote:
On 2022-01-19 17:00, Bart wrote:
On 19/01/2022 15:20, Dmitry A. Kazakov wrote:
On 2022-01-19 15:38, Bart wrote:
On 19/01/2022 13:59, Dmitry A. Kazakov wrote:
On 2022-01-19 14:34, Bart wrote:
What does it say about the values of strings?
Win32 says pretty much same: LPSTR, LPLONG.
Those types are 64-bit pointers on Win64.
For example this:
procedure Foo (X : in out LONG);
pragma Convention (Stdcall, Foo);
will deploy LPLONG for X. The value will be passed by reference
(LPLONG) as Win32 Stdcall prescribes. See any pointers? Right, there
is none.
Actually, yes. A reference is a pointer, just an implicit one. The
'P' in LPLONG stands for 'Pointer'; what did you think it was?
LONG, see the declaration of Foo.
The 'P' in 'LPLONG' stands for 'LONG'? Okay.....
LPSTR points to a sequence of bytes, terminated with zero to
represent a crude C-style string.
Yes, and the following works perfectly well:
declare
procedure Foo (X : char_array);
pragma Convention (Stdcall, Foo);
X : char_array := "abc";
begin
Foo (X & "d" & NUL);
Foo will get null-terminated "abcd". If implemented in C, it would
use LPSTR. Again, no pointers in sight. And, no, heap will not be used.
Not even for Foo(X*1000000)?
Yes, for that invoking a contract termination clause would be the best
choice.
However this:
------------------------------ test.adb -------
with Ada.Text_IO; use Ada.Text_IO;
with Interfaces.C.Strings; use Interfaces.C, Interfaces.C.Strings;
with Ada.Unchecked_Conversion, System;
procedure Test is
procedure Foo (X : char_array);
pragma Convention (Stdcall, Foo);
procedure Foo (X : char_array) is
function "+" is
new Ada.Unchecked_Conversion (System.Address, chars_ptr);
begin
Put_Line ("Length:" & size_t'Image (strlen (+X'Address)));
end Foo;
function "*" (X : char_array; Y : size_t) return char_array is
begin
return Result : char_array (0..X'Length * Y) do
for I in 1..Y loop
Result ((I - 1) * X'Length..I * X'Length - 1) := X;
end loop;
Result (Result'Last) := NUL;
end return;
end "*";
X : char_array := "abc";
begin
Foo (X * 1000000);
end Test;
----------------------------------------
compiles and works just fine:
----------------------------------------
gcc -c test.adb
test.adb:7:19: warning: type of argument "Foo.X" is unconstrained array [-gnatwx]
test.adb:7:19: warning: foreign caller must pass bounds explicitly [-gnatwx]
gnatbind -x test.ali
gnatlink test.ali
D:\Temp\y\test>test
Length: 3000000
----------------------------------------
Questions?
I asked if no heap was used. How can you tell what your Ada code is
doing and how?
I also asked, if memory need to be recovered, which bit of code did
that, and where.
Sure, if you say so:
* No pointers are involved
Right. None. In Ada a pointer is a distinct type whose declaration
contains the keyword "access". Saw any?
So because no explicit pointer denotations are in the source code, that
means that none are used in the implementation?
* No heap storage is necessary, no matter what language
Right. Since avoiding heap for dealing with indefinite object is a
computable problem there is no necessity.
You really, really don't like heaps don't you?
* Passing an arbitrary string expression is just like passing an
integer expression
Neither happens. We are not talking about closures.
OK, since you are being pedantic now, I mean passing the result of
evaluating those respective expressions.
The result of an integer expression will be an integer. The result of
a string expression may be one of half a dozen categories of strings
(literal, owned, shared, slice etc), and exactly what is passed depends
on a dozen different ways that strings may be implemented.
On 2022-01-19 23:09, Bart wrote:
Actually, yes. A reference is a pointer, just an implicit one. The
'P' in LPLONG stands for 'Pointer'; what did you think it was?
LONG, see the declaration of Foo.
The 'P' in 'LPLONG' stands for 'LONG'? Okay.....
?
So because no explicit pointer denotations are in the source code,
that means that none are used in the implementation?
Exactly so. No pointers means no pointers,
This has nothing to do with love. It is a statement of fact. You asked
whether heap is necessary for a problem X. The answer is no. It is a
formal question:
Whether the ordering of allocation/deallocation used in X is "random", actually indeterminable? No it is not, it is well defined in fact.
Formally the language Ada mandates this:
- scalar objects must be passed by value
- tagged and limited objects must be passed by reference
- other objects (and arrays fall in this category) are passed at will
Of course all this goes out of the window when Windows calling
convention is requested, sorry for pun.
Or another example, if FORTRAN calling convention is requested, then,
of course, integer will be passed by reference not by value.
The result of an integer expression will be an integer. The result of
a string expression may be one of half a dozen categories of strings
(literal, owned, shared, slice etc), and exactly what is passed
depends on a dozen different ways that strings may be implemented.
Rubbish. The result of string expression is string.
Values are not passed, objects representing values do. It is quite no
matter what sort of value an object holds when passing the object.
On 20/01/2022 08:31, Dmitry A. Kazakov wrote:
On 2022-01-19 23:09, Bart wrote:
Actually, yes. A reference is a pointer, just an implicit one. The
'P' in LPLONG stands for 'Pointer'; what did you think it was?
LONG, see the declaration of Foo.
The 'P' in 'LPLONG' stands for 'LONG'? Okay.....
?
FFS, it's very simple: 'LP' in MS types stands for 'Long Pointer', so
the 'P' stands for 'Pointer', you know, the thing you said doesn't exist.
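To make the by-reference point concrete, here is a minimal Python sketch using the standard ctypes module. It is only an illustration under stated assumptions: `add_one` is a hypothetical callee standing in for a Stdcall routine that takes an LPLONG, and `c_long` stands in for LONG. Nothing here comes from the Ada runtime or the Win32 headers.

```python
import ctypes

def add_one(p_value):
    """Hypothetical callee: gets the address of a C long and updates it in place."""
    p_value.contents.value += 1

x = ctypes.c_long(41)          # stands in for a LONG variable
p = ctypes.pointer(x)          # stands in for an LPLONG: a pointer to that LONG
add_one(p)                     # the callee modifies x through the pointer

pointer_size = ctypes.sizeof(p)      # size of the pointer itself, in bytes
address = ctypes.addressof(x)        # the machine address that gets passed
print(x.value, pointer_size, hex(address))
```

Whatever the source language calls the parameter mode, what crosses the call boundary in this sketch is an address the size of a machine pointer.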
So because no explicit pointer denotations are in the source code,
that means that none are used in the implementation?
Exactly so. No pointers means no pointers,
Not even in the underlying implementation? (That's that bit /I/ have to code.)
This has nothing to do with love. It is a statement of fact. You asked
whether heap is necessary for a problem X. The answer is no. It is a
formal question:
Whether the ordering of allocation/deallocation used in X is "random",
actually indeterminable? No it is not, it is well defined in fact.
Not even when X calls random() to decide which parts of a data structure
to allocate/deallocate?
Remember, I am not writing X, I have to implement the language in which
X is written.
Formally the language Ada mandates this:
- scalar objects must be passed by value
- tagged and limited objects must be passed by reference
- other objects (and arrays fall in this category) are passed at will
So, assuming a string is not classed as a scalar, a non-small string is passed differently from an integer, possibly 'at will'.
Thank you for finally admitting that integers and strings might need different passing mechanisms.
Of course all this goes out of the window when Windows calling
convention is requested, sorry for pun.
So what does SYS V ABI do that is so different?
Or another example, if FORTRAN calling convention is requested, then,
of course, integer will be passed by reference not by value.
'By-reference' is an extra level of indirection that can be specified
within the code; it can be applied to both integers normally passed by
value, and objects normally passed by reference anyway.
The language specifies which objects can be modified by a callee without
a formal 'by-reference', if any.
The result of an integer expression will be an integer. The result of
a string expression may be one of half a dozen categories of
strings (literal, owned, shared, slice etc), and exactly what is
passed depends on a dozen different ways that strings may be
implemented.
Rubbish. The result of string expression is string.
Yes, which can be one of several categories, not several types, for
example, the string data is owned by the object, or it can be a view
into a separate substring.
And that string can be implemented in a dozen different ways depending
on language.
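The difference between a string that owns its data and one that is a view into other storage can be sketched in a few lines of Python. This is an assumption-laden illustration, not how any particular language implements strings: `bytearray` plays the backing store, `bytes` the owned copy, and `memoryview` the slice.

```python
buf = bytearray(b"onetwothree")   # backing storage, as in S := "onetwothree"

owned = bytes(buf[3:6])           # "owned": a private copy of the bytes of "two"
view = memoryview(buf)[3:6]       # "slice": a view into buf, nothing is copied

# Overwrite the middle of the backing storage in place, one byte at a time.
for i, ch in enumerate(b"TWO"):
    buf[3 + i] = ch

# The owned copy is unaffected; the view sees the change.
print(owned, bytes(view))
```

An implementation that passes either kind to a callee has to know which one it holds, because only the owned one may need its memory recovered afterwards.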
You said, long ago, that dealing with strings is no harder than dealing
with fixed-width integers.
Take this example:
string S:="onetwothree"
case random(1..10)
when 1 then S := "ABC"
when 2 then S := T
when 3 then S := T.[10..20]
when 4 then S := S.[4..6]
when 5 then S := F()*10
esac
Values are not passed, objects representing values do. It is quite no
matter what sort of value an object holds when passing the object.
I don't understand what you are saying.
Of course it is very easy to utterly dismiss low-level implementation
details, and talk only in HLL terms (presumably Ada terms).
So you can claim that an implementation does not involve registers or addresses or pointers or stack or heap or even memory, because those
terms do not appear in the source code.
Which all seems to be a desperate attempt to win an argument.
On 2022-01-20 11:42, Bart wrote:
*where* is 'LP'?
Not even in the underlying implementation? (That's that bit /I/ have
to code.)
No idea. What is the underlying implementation? Gates, capacitors,
silicon wafers?
This has nothing to do with love. It is a statement of fact. You asked
whether heap is necessary for a problem X. The answer is no. It is a
formal question:
Whether the ordering of allocation/deallocation used in X is
"random", actually indeterminable? No it is not, it is well defined
in fact.
Not even when X calls random() to decide which parts of a data
structure to allocate/deallocate?
Not even then. The ordering remains same.
Thank you for finally admitting that integers and strings might need
different passing mechanisms.
Two integers might.
Of course all this goes out of the window when Windows calling
convention is requested, sorry for pun.
So what does SYS V ABI do that is so different?
I don't remember SysV calling conventions.
Why do you care? What is different? Why would it shatter the earth
under your feet?
And that string can be implemented in a dozen different ways depending
on language.
And integer can be implemented in two dozens of ways, maybe in three...
You said, long ago, that dealing with strings is no harder than
dealing with fixed-width integers.
Take this example:
string S:="onetwothree"
case random(1..10)
when 1 then S := "ABC"
when 2 then S := T
when 3 then S := T.[10..20]
when 4 then S := S.[4..6]
when 5 then S := F()*10
esac
This is illegal in Ada and irrelevant anyway. Try to stay focused on
parameter passing methods and managing temporary objects.
The discussion was about the language features and your amazement that
it can have strings with string expressions. I don't even understand
what you are trying to say. That compiler vendors lie? That they are in
a secret cabal with programmers who actually do not use their compilers
in production code?
So you can claim that an implementation does not involve registers or
addresses or pointers or stack or heap or even memory, because those
terms do not appear in the source code.
Exactly.
On 20/01/2022 12:34, Dmitry A. Kazakov wrote:
On 2022-01-20 11:42, Bart wrote:
*where* is 'LP'?
Here, where you said:
"will deploy LPLONG for X. The value will be passed by reference
(LPLONG) as Win32 Stdcall prescribes. See any pointers? Right, there is none."
And I asked, So what is the P in LPLONG?
Not even in the underlying implementation? (That's that bit /I/ have
to code.)
No idea. What is the underlying implementation? Gates, capacitors,
silicon wafers?
I implement languages, which usually means going from HLL source code
down to the native code executed by a processor. That doesn't mean going
down to microprogramming or bitslicing or logic gates or transistors or
right down to the movement of electrons.
Stop being silly.
This has nothing to do with love. It is a statement of fact. You asked
whether heap is necessary for a problem X. The answer is no. It is a
formal question:
Whether the ordering of allocation/deallocation used in X is
"random", actually indeterminable? No it is not, it is well defined
in fact.
Not even when X calls random() to decide which parts of a data
structure to allocate/deallocate?
Not even then. The ordering remains same.
That's very glib, but I notice you don't explain how it's done.
It's like you have a pile of books to be put in order on a shelf. But at
some point, you need to remove a book from the middle of those already
on the shelf.
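The bookshelf problem can be modelled with a toy allocator that only permits last-in, first-out release. This is a sketch under obvious assumptions: `StackArena` and its methods are hypothetical names, and it models neither Ada's runtime nor any real implementation.

```python
class StackArena:
    """Toy LIFO allocator: blocks must be released in reverse order of allocation."""

    def __init__(self):
        self._blocks = []              # names of live blocks, bottom to top

    def alloc(self, name):
        self._blocks.append(name)
        return name

    def free(self, name):
        # Only the most recently allocated block may be released.
        if not self._blocks or self._blocks[-1] != name:
            raise RuntimeError(f"{name!r} is not on top of the stack")
        self._blocks.pop()

    def live(self):
        return list(self._blocks)


arena = StackArena()
for book in ("A", "B", "C"):
    arena.alloc(book)

try:
    arena.free("B")                    # the book in the middle of the shelf
    middle_removed = True
except RuntimeError:
    middle_removed = False

arena.free("C")                        # releasing in LIFO order works
arena.free("B")
print(middle_removed, arena.live())
```

A pure stack discipline refuses the out-of-order release; supporting it needs either copying the blocks above it or an allocator that tracks free gaps, which is what a heap does.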
Thank you for finally admitting that integers and strings might need
different passing mechanisms.
Two integers might.
No, why should they if they are the same size?
Of course all this goes out of the window when Windows calling
convention is requested, sorry for pun.
So what does SYS V ABI do that is so different?
I don't remember SysV calling conventions.
Funny that you can remember Win64 ones!
Why do you care? What is different? Why would it shatter the earth
under your feet?
You had a sly dig at Windows calling conventions. I asked you to clarify
what is so bad about it compared with others, but now you don't care!
And that string can be implemented in a dozen different ways
depending on language.
And integer can be implemented in two dozens of ways, maybe in three...
Really, there is greater diversity in implementing a fixed width small integer than a string?
You said, long ago, that dealing with strings is no harder than
dealing with fixed-width integers.
Take this example:
string S:="onetwothree"
case random(1..10)
when 1 then S := "ABC"
when 2 then S := T
when 3 then S := T.[10..20]
when 4 then S := S.[4..6]
when 5 then S := F()*10
esac
This is illegal in Ada
OK, so the only language that counts is Ada? I'm glad that point has
been settled.
Actually plenty of languages have mutable variables and allow
conditional assignments of strings like the above.
and irrelevant anyway. Try to stay focused on parameter passing
methods and managing temporary objects.
'S' was a temporary object;
The discussion was about the language features and your amazement that
it can have strings
I've given plenty of examples where string data is not created or
destroyed in a LIFO manner, thus requiring ad hoc allocations and
deallocations (so, a heap).
You have chosen to ignore the examples, or dismiss languages where that
is routinely done.
Oh. Well, a language either provides heap storage or it
does not. Even C does, even if in an unsafe and rather primitive
way. The techniques involved aren't exactly cutting-edge recent.
Well, there are shades of gray here. To explain, there is concern
about small machines. Small means different things for various
people, some think 256M is small. But I mean really small,
think about 4k storage in total (program + data). [...]
So I do not understand why James wants fancy Print on small
systems. But desire to run without heap storage is IMHO
quite reasonable.
Two integers might.
No, why should they if they are the same size?
No they don't. Integer_32 and Integer_16 have different sizes.
Sure. There exist hundreds of different integer encodings purposed for different goals.
-----------------------------------------
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Numerics.Discrete_Random;
procedure Test is
type Choice is (Red, Black, Blue);
package Random_Choices is new Ada.Numerics.Discrete_Random (Choice);
use Random_Choices;
procedure Foo (X : String) is
begin
Put_Line (X);
end Foo;
Seed : Generator;
X : constant String := "abcdefgh";
begin
Reset (Seed);
Foo
( (case Random (Seed) is
when Red => "red",
when Black => X (2..6),
when Blue => "Hello " & X & "!")
);
end Test;
The discussion was about the language features and your amazement
that it can have strings
I've given plenty of examples where string data is not created or
destroyed in a LIFO manner, thus requiring ad hoc allocations and
deallocations (so, a heap).
No you claimed that
- it is impossible to return string from a subprogram without using heap
- it is impossible to pass a string to a subprogram without using heap
- it is impossible to have a string expression as an argument of a
subprogram without using heap
- it is impossible to use strings with Win32 API interfaces
All that is plain wrong, demonstrated by working examples.
You have chosen to ignore the examples, or dismiss languages where
that is routinely done.
Right, because they all are irrelevant to the claims you made.
On 20/01/2022 14:35, Dmitry A. Kazakov wrote:
Sure. There exist hundreds of different integer encodings purposed for
different goals.
At the low level, there are very few encodings (mainly to do with
endianness, so on a specific target, there might be just one).
Strings however don't really exist at that level, so there could be
myriad ways of representing them.
-----------------------------------------
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Numerics.Discrete_Random;
procedure Test is
type Choice is (Red, Black, Blue);
package Random_Choices is new Ada.Numerics.Discrete_Random (Choice);
use Random_Choices;
procedure Foo (X : String) is
begin
Put_Line (X);
end Foo;
Seed : Generator;
X : constant String := "abcdefgh";
begin
Reset (Seed);
Foo
( (case Random (Seed) is
when Red => "red",
when Black => X (2..6),
when Blue => "Hello " & X & "!")
);
end Test;
(So why was my example illegal in Ada?)
But also, I don't know the semantics for Ada strings. What happens in a situation like this (I don't know Ada syntax so I will use a made-up one):
function Bar => String is
String X;
if random()<0.5 then
X := <compute some string at runtime>;
else
X := "ABC";
end if
return X;
end Bar
Bar returns the string stored in X, but X is a local that is destroyed
when it exits, so how does that string persist?
And I will ask again, how does it know when the string is done with, and
how does it know whether it is necessary to destroy the string and
recover its memory?
Suppose also that Foo() is changed to copy the string to a global.
Does Ada handle strings by value, which means it copies strings rather
than share them?
The discussion was about the language features and your amazement
that it can have strings
You claim it does it without using heap memory, so either you're not
telling me what restrictions that imposes, or you've carefully avoided
tricky cases in your examples.
No you claimed that
- it is impossible to return string from a subprogram without using heap
- it is impossible to pass a string to a subprogram without using heap
- it is impossible to have a string expression as an argument of a
subprogram without using heap
These are all more difficult to do, and impose more language
restrictions, without using heap-like allocation.
I would find it almost impossible to implement interpreters for example
as I would not be able to model the chaotic allocation patterns of the programs they run.
All that is plain wrong, demonstrated by working examples.
You have chosen to ignore the examples, or dismiss languages where
that is routinely done.
Right, because they all are irrelevant to the claims you made.
I notice you didn't respond to my bookshelf analogy.
The sort of tagged unions /I/ would want, would need a tag value that
is a global enum.
Different cases could also have the same type.
Rowing: Well, you're on better ground here, as the rowing
coercion is one of the features of Algol that has been touted as
something that perhaps ought not to have been included. But it's
only syntactic sugar, so it's easily worked around if your language
doesn't have it.
It's a type issue; such a feature reduces type safety. It stops a
language detecting the use of scalar rather than a list, which could
be an error on the user's part.
Inefficient? ???
Yeah. Earlier discussion touched on the inefficiency of turning
something into an explicit string before printing (say, an entire
array, instead of one element at time).
Here, you're turning a set of N print-items into an array, so that it
can traverse them, deal with them, then discard the array.
Half-working?
Well, how big a UNION would be needed for all possible printable
types? My static language limits them, but could easily decide to
print arrays and user-defined records if it wants. The diversity is
handled within the compiler.
It's also missing per-item formatting codes. A solution using only
user-code, even in Algol68, is unwieldy and ugly.
"Local static variables" were in Algol 60, were problematic
in detailed specification and were therefore dropped in Algol 68 [RR
0.2.6f]. Some of the reasons are expanded in the well-known book on
"Algol 60 Implementation" by Randell and Russell, esp in relation to
"own" arrays. The effects are easy to obtain in other ways. [Most
other prominent languages don't have them either.]
It's not clear what is problematic about them, other than making
functions impure.
On 19/01/2022 11:40, Bart wrote:
The sort of tagged unions /I/ would want, would need a tag value that
is a global enum.
That makes it [unnecessarily] hard to compile independent
modules. But perhaps I've misunderstood your notion of "global"?
Here, you're turning a set of N print-items into an array, so that it
can traverse them, deal with them, then discard the array.
And the time spent doing this, compared with (a) the sum of
the times taken to deal with the N items one at a time, and (b) the
time, in a real application, taken to construct the N items in the
first place is? 0.1% of the total running time of the program? It's
a load of fuss about nothing.
late to "rescue" Algol or C from this; but James is developing a
new language.
Yes. So is formatted transput in general. The degree of
ugliness is comparable, however it's done. Simple things are easy,
as demonstrated; complicated things are complicated; and formats
require users to learn syntax and semantics that add seriously to
the difficulty of learning and using the language.
On 2022-01-20 21:13, Bart wrote:
I notice you didn't respond to my bookshelf analogy.
Because it is a wrong analogy. The right one is the poles in the Tower
of Hanoi game with the difference that instead of moving rings one moves
the poles. Though one could move the rings instead, in some rather inefficient implementation.
On 20/01/2022 21:25, Dmitry A. Kazakov wrote:
On 2022-01-20 21:13, Bart wrote:
I notice you didn't respond to my bookshelf analogy.
Because it is a wrong analogy. The right one is the poles in the Tower
of Hanoi game with the difference that instead of moving rings one
moves the poles. Though one could move the rings instead, in some
rather inefficient implementation.
OK. But if I've got a pile of books on one pole (using a hole near one
corner to minimise the inconvenience), how do I remove a book in the middle of the pile?
Here's a very similar one which is about actual code:
On 2022-01-21 12:46, Bart wrote:
On 20/01/2022 21:25, Dmitry A. Kazakov wrote:
On 2022-01-20 21:13, Bart wrote:
I notice you didn't respond to my bookshelf analogy.
Because it is a wrong analogy. The right one is the poles in the
Tower of Hanoi game with the difference that instead of moving rings
one moves the poles. Though one could move the rings instead, in some
rather inefficient implementation.
OK. But if I've got a pile of books on one pole (using a hole near one
corner to minimise the inconvenience), how do I remove a book in
the middle of the pile?
Simple, do not pile up your books!
"Algorithms + Data Structures = Programs"
-- Niklaus Wirth
Here's a very similar one which is about actual code:
[...]
Irrelevant stuff. The case is about string expressions and passing parameters.
Sounds like you are admitting the need for heap-like allocations in some cases.
Which means that if they exist, then objects that could be conditionally heap-allocated could be passed to functions and specifically to Print.
The sort of tagged unions /I/ would want, would need a tag value that
is a global enum.
That makes it [unnecessarily] hard to compile independent
modules. But perhaps I've misunderstood your notion of "global"?
Typically it might be used like this:
record unitrec_etc =
byte tag # or opcode, opndtype etc
...
end
'tag' is an instance of some global enumeration: token type, AST mode
type, IL or target language opcode.
[...]
However I've done a test, and the results are interesting. This is
with dynamic code with 1 million iterations of this, directed to a
file:
println a, b, c, d # took 1.8 seconds
F((a, b, c, d)) # F iterates over a list and prints
# one at a time + discrete space
was 2.8 seconds.
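A comparison of this kind can be reproduced in outline with Python's timeit module. This is a rough sketch, not the original test: `print_direct` and `print_via_list` are hypothetical stand-ins for `println a, b, c, d` and `F((a, b, c, d))`, the output goes to an in-memory buffer rather than a file, and the timings will differ by machine and language.

```python
import io
import timeit

a, b, c, d = 10, 20, 30, 40

def print_direct(out):
    # One call handles all four items, like "println a, b, c, d".
    print(a, b, c, d, file=out)

def print_via_list(out, items):
    # Hypothetical F: walk a list, printing one item at a time plus a space.
    for item in items:
        print(item, end=" ", file=out)
    print(file=out)

def run(fn, *args, n=10_000):
    # Time n calls of fn writing into a fresh in-memory buffer.
    out = io.StringIO()
    seconds = timeit.timeit(lambda: fn(out, *args), number=n)
    return seconds, out.getvalue()

t_direct, text_direct = run(print_direct)
t_list, text_list = run(print_via_list, (a, b, c, d))
print(f"direct: {t_direct:.4f}s  via list: {t_list:.4f}s")
```

The two routines produce the same items per line (the list version leaves a trailing space), so any difference in the timings reflects the per-item calls rather than the amount of output.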
On 21/01/2022 00:28, Bart wrote:
The sort of tagged unions /I/ would want, would need a tag value that
is a global enum.
That makes it [unnecessarily] hard to compile independent
modules. But perhaps I've misunderstood your notion of "global"?
Typically it might be used like this:
record unitrec_etc =
byte tag # or opcode, opndtype etc
...
end
'tag' is an instance of some global enumeration: token type, AST mode
type, IL or target language opcode.
Yes, but does "global" mean "throughout this module" or
"throughout this program" [including perhaps independent modules
that were compiled a decade ago] or "global throughout the entire
visible universe" [IOW, a tag that can be derived universally
from the declaration of the type]?
Separately, I don't see what
legitimate use there is for "tag" for union types that makes
case fred.tag in
...
tagforreals: somecode,
...
esac
interestingly different from
case fred in
...
(real): somecode,
...
esac
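The two styles being compared, switching on an explicit tag field versus switching on the type of the value, can be put side by side in a short Python sketch. All names here (`Tag`, `Tagged`, `describe_by_tag`, `describe_by_type`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum, auto

class Tag(Enum):                # hypothetical program-wide enumeration
    INT = auto()
    REAL = auto()

@dataclass
class Tagged:                   # union carrying an explicit, user-visible tag
    tag: Tag
    value: object

def describe_by_tag(u):
    # Corresponds to "case fred.tag in ... tagforreals: ..."
    if u.tag is Tag.REAL:
        return "real"
    if u.tag is Tag.INT:
        return "int"
    raise ValueError(u.tag)

def describe_by_type(v):
    # Corresponds to "case fred in ... (real): ..."
    # (bool is a subclass of int in Python; not handled specially here.)
    if isinstance(v, float):
        return "real"
    if isinstance(v, int):
        return "int"
    raise TypeError(type(v))

fred = Tagged(Tag.REAL, 2.5)
print(describe_by_tag(fred), describe_by_type(fred.value))
```

For this case the two give the same answer. They differ only when two tags share one payload type, which is the situation raised earlier in the thread ("Different cases could also have the same type").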
IOW, the extra cost, on your system, of using some sort of cleverly-formatted user-code transput instead of the default [which
I would call unformatted, as you have not provided the "mould" into
which "a, b, c, d" are to be cast] was one second per million lines,
call it per 15000 pages, ~50 reasonably-sized books. How does that
compare with the typical time taken to generate those four million
variables in the first place? Does anyone care about that extra
microsecond per line in real practical use? [Note that we're here
talking about transput for human consumption, not things like file
copying, which would normally be buffered and be much faster.]
On 21/01/2022 00:28, Bart wrote:
[...]
However I've done a test, and the results are interesting. This is
with dynamic code with 1 million iterations of this, directed to a
file:
println a, b, c, d # took 1.8 seconds
F((a, b, c, d)) # F iterates over a list and prints
# one at a time + discrete space
was 2.8 seconds.
IOW, the extra cost, on your system, of using some sort of cleverly-formatted user-code transput instead of the default [which
I would call unformatted, as you have not provided the "mould" into
which "a, b, c, d" are to be cast] was one second per million lines,
call it per 15000 pages, ~50 reasonably-sized books. How does that
compare with the typical time taken to generate those four million
variables in the first place? Does anyone care about that extra
microsecond per line in real practical use?
[snip further examples of *programs* and *computers* running 50% faster]
IOW, the extra cost, on your system, of using some sort of
cleverly-formatted user-code transput instead of the default [which
I would call unformatted, as you have not provided the "mould" into
which "a, b, c, d" are to be cast] was one second per million lines,
call it per 15000 pages, ~50 reasonably-sized books. How does that
compare with the typical time taken to generate those four million
variables in the first place? Does anyone care about that extra
microsecond per line in real practical use? [Note that we're here
talking about transput for human consumption, not things like file
copying, which would normally be buffered and be much faster.]
This is all nonsense. When you need the speed, and most people apart
from you do want it, then 50% is a big deal:
* The Z80 used to run at 4MHz, until eventually there was a 6MHz part
that could run programs 50% faster.
And here you're just dismissing it because it merely means that some
program runs in 3 seconds instead of 2.
The fact is that when you are implementing a language feature, you
don't know what use a programmer will put it to; it might be speed
critical, or maybe not.
On 22/01/2022 23:41, Bart wrote:
This is all nonsense. When you need the speed, and most people apart
from you do want it, then 50% is a big deal:
[snip further examples of *programs* and *computers* running 50% faster]
Did you read what I wrote? Which was nothing about computers
or programs running faster, but about "transput for human consumption".
I don't know how fast you read; I normally manage ~100 pages/hour.
Nor how fast your lineprinter is; mine does 36 pages/minute.
IOW, your 1000000 lines would take me around 150 hours [call it three
working weeks] to read or around 7 hours to print. You would have
to be able to read or print 1000x faster for the saved second to be
of even the slightest interest.
*Note again*: this is for /human/ use;
Even that extra second is not comparing like with like; you
were comparing a default "print a b c d" with a procedure call that
formatted "a", "b", "c", "d" first; we have to assume that the
programmer wanted something different from the default, and it's not surprising that it takes extra time to compute that. You're also
still ignoring what would normally be the dominant time, which would
be that taken to compute the four million variables in the first
place; it's 50% faster only if that time is negligible.
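The arithmetic behind those figures can be checked with a few lines of Python. This is a sketch only: the lines-per-page value is an assumption chosen to match the "15000 pages" estimate for a million lines, and the variable names are illustrative.

```python
LINES = 1_000_000
EXTRA_SECONDS = 2.8 - 1.8        # difference between the two timings in the test
LINES_PER_PAGE = 66              # assumption: gives roughly 15000 pages
READ_PAGES_PER_HOUR = 100        # reading speed quoted above
PRINT_PAGES_PER_MINUTE = 36      # lineprinter speed quoted above

pages = LINES / LINES_PER_PAGE
extra_us_per_line = EXTRA_SECONDS / LINES * 1e6
read_hours = pages / READ_PAGES_PER_HOUR
print_hours = pages / PRINT_PAGES_PER_MINUTE / 60

print(f"pages:               {pages:.0f}")
print(f"extra cost per line: {extra_us_per_line:.2f} microseconds")
print(f"time to read:        {read_hours:.0f} hours")
print(f"time to print:       {print_hours:.1f} hours")
```

Under these assumptions the reading time comes out at about 150 hours and the printing time at about 7 hours, against one extra second of formatting.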
Nor how fast your lineprinter is; mine does 36 pages/minute.
I think you're just not getting it.
Imagine if your editor only updated the display at 1200 cps because
that's the speed at which humans can read text. So it would take several seconds per page; a long time to scroll through a file!
For example, when gcc compiles a C program, it produces a .s file
containing assembly code, code that no human is ever going to read.
And yes, you will still need formatting to get it right. [...]
*Note again*: this is for /human/ use;
No, it isn't. A tiny, tiny fraction of such output will actually be
processed by a human.
[...] You're also
still ignoring what would normally be the dominant time, which would
be that taken to compute the four million variables in the first
place; it's 50% faster only if that time is negligible.
That's not generally the case either. The data may already exist, or
the generation is negligible compared with printing it to a text file.
IME that latter part is the bottleneck.
I've done another experiment: I took my compiler, and a 700Kloc test
file that normally takes 1.1 seconds to turn into an 8MB executable.
I got it to generate an ASM file instead; now it took 3.7 seconds in
all (and the result still needs to be assembled into a binary). Text processing is slow!
Yes, ideally you would avoid using text, but sometimes it's the
easiest way to get things done; it's portable, and also human
readable when you need to debug something on line 873,900 out of 2M
lines. But you're not going to read all of it! (My test above was
2.3M lines.)
Also, WHY can your printer do 36 pages per minute, if you cannot read
that fast? Isn't the output for human consumption too?
The answer to that will help in understanding why a language might
have output facilities that not only work faster than 20cps, but
magnitudes faster.
On 24/01/2022 20:14, Bart wrote:
For example, when gcc compiles a C program, it produces a .s file
containing assembly code, code that no human is ever going to read.
And yes, you will still need formatting to get it right. [...]
The question wasn't about whether you need formatting, but
whether you need formatting beyond what "print" can produce. I
don't dispute that some programs produce output by the megaline;
whether they need fancy output [which was the root cause of the
"wasted" extra second per megaline] is another matter.
*Note again*: this is for /human/ use;
No, it isn't. A tiny, tiny fraction of such output will actually be
processed by a human.
Of course! I'm not going to spend three weeks actually
reading all that fancy output. It makes more sense to spend a
tiny, tiny fraction
I've done another experiment: I took my compiler, and a 700Kloc test
file that normally takes 1.1 seconds to turn into an 8MB executable.
And you're off again! How long does it take you to write 700Kloc? A year? How often do you turn it into an executable?
Ten times a day? Wow, ...
I got it to generate an ASM file instead; now it took 3.7 seconds in
all (and the result still needs to be assembled into a binary). Text
processing is slow!
..., you could have saved ~30s per day. You could have used
that to get a cup of coffee. Or perhaps not. It really, really is
not worth worrying about. Half an hour per day might be worthwhile,
half a minute isn't.
Leading to another bugbear. There shouldn't be a line
873,900. Large projects are normally broken down into separate
modules. They're easier to debug that way. You can't possibly
understand 2.3M lines;
facility. But the effect over the decades is that the browser on
my computer is 1225 times as big as its predecessor ~20 years ago
for only very marginal improvements in utility.
On 28/01/2022 00:29, Andy Walker wrote:
Of course! I'm not going to spend three weeks actually
reading all that fancy output. It makes more sense to spend a
tiny, tiny fraction
No, you still don't get it. The rest of the output isn't discarded,
it's just not read by a human. MOST TEXT OUTPUT IS FOR MACHINE
CONSUMPTION.
Usually it's never read at all by a human (who reads the billions of
lines of XML, HTML or JS code, or billions of lines of .s files
produced by gcc?)
And you're off again! How long does it take you to write
700Kloc? A year? How often do you turn it into an executable?
Ten times a day? Wow, ...
Ten times a day, seriously? Who runs a compiler only once an hour?!
Unless the compiler takes 59 minutes to build the program...
I can run it every few seconds. (Most times in response to compile
errors, but mainly to very large numbers of incremental changes.
Also, I might compile anyway, if I can't remember if it needs to. Not
a big deal, as it finishes in the time it takes to press and release
the Enter key.)
But since you've asked, I've just now put a counter into my main
compiler, and a counter into my current project which is a new
compiler. Since when you compile a new version of the new compiler
(count A), you will then run that version at least once (count B).
I'll report back in the next day or so.
The point was that, with the text option this app spent 75% of its
runtime doing text output. A counter-example to your claim that
formatted text output was an insignificant factor in most programs.
You don't seem to get the fact that you can't isolate one feature in
a language, eg. Print, and proclaim that it will only ever be used to generate textual output to be consumed in real time by a single
human, therefore it doesn't matter how slow it is so long as it meets
that requirement.
By making throughput a priority, that opened the door to new possibilities:
* Whole-program compilation
* A trivial build process
* The ability to run directly from source
In other words, allow rapid development techniques just like a script language.
On 28/01/2022 17:29, Bart wrote:
The question wasn't about running the compiler, but about
creating the executable of a 700Kloc test program. Why do you need
to keep on creating that executable?
I would have replied "Don't bother", but as it turned out
the results were interesting. You claim in a nearby article to be
running "up to 1000" compilations per working day, many of them
producing thousands of lines of diagnostics. What? That's around
a compilation every 30s, hour in, hour out. That's enough of a gap
to permit the correction of an obvious typo, but it doesn't allow
much, if any, time for reflexion and careful analysis.
/New/ possibilities? That's straight back to the first 13
years or so of my career in computing. Load-and-go was all there
was in those days. No file systems, no editors, no storage available
to users unless you were one of the prominenti who managed to qualify
for a mag tape. We developed from there as proper operating systems,
file systems, editors, ... allowed modern project management.
[The Schulz-Evler "Blue Danube" -- see below --, btw (Strauss[...]
as you've never heard him before!) was reputedly the most difficult
piece ever put onto piano rolls, ~100 years ago.]
Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Schulz-Evler
Woah. It's like Paganini and Liszt had a baby, godfathered by
Prokofiev and Conlon Nancarrow. Vintage black midi! Oh so pretty.
On 05/02/2022 03:50, luserdroog wrote:
Woah. It's like Paganini and Liszt had a baby, godfathered by
Prokofiev and Conlon Nancarrow. Vintage black midi! Oh so pretty.
Yes, but did you like it?
Prokofiev is still in copyright [for another 13 months!],
and Nancarrow won't be freely available for decades. So I can't
do anything much with them. But Paganini and Liszt are an astute observation. But the Schulz-Evler is a factor 1000 or so short of
being "Black MIDI"! Actually, I had forgotten about CN; I should
perhaps have said "the most difficult piece ever /performed/ onto
piano rolls"? Of course it's easy to write "unplayable" MIDI.
Personally, I'm primarily interested in /real/ music.
Yes, but did you like it?
Yes, very much so. Thanks for sharing.
Personally, I'm primarily interested in /real/ music.
I get what you mean. But it can be difficult to establish a rigorous
definition. [...]
The Paganini piece I was particularly reminded of is Caprice No. 6,
"the Trill". I've been trying to play it on the guitar with limited success.