Forum: >>> Magnum BBS <<<

Re: What does = mean, was Why are ambiguous grammars usually a bad idea

From Jan Ziak@21:1/5 to Jan Ziak on Thu Dec 30 17:10:39 2021

On Friday, December 31, 2021 at 12:45:56 AM UTC+1, Jan Ziak wrote:

On Thursday, December 30, 2021 at 7:56:15 PM UTC+1, Kaz Kylheku wrote:

When I started programming from nothing, I saw BASIC examples in a
book which was doing things like:

10 X = 2
20 X = X + 1

The only language with formulas that I was coming from was math.

So, I thought, what? How can X be equal to X + 1; you cannot solve
this absurdity!

From then I knew that the people who program computers to understand symbols are free thinkers who make them mean anything they want.

"X = X + Y" means "X[t+1] = X[t] + Y[t]" where t is time. Time had to be omitted from the notation of the BASIC programming language because otherwise the source code would consume a much larger amount of computer memory and it would complicate GOTO and FOR/NEXT statements.

-atom

[Interesting take. In reality, of course, BASIC borrowed that from Fortran. Algol used := for assignment, different from = for equality comparison. -John]

@John: Indeed, BASIC wasn't the first programming language. To generalize, I wanted to point out that the notion of time is implicit in almost all programming languages, not just BASIC. In my opinion, contrary to Kaz's opinion, most children who will later become programmers can quite easily understand what "X=X+1" means in a language like BASIC/Python/etc. (Thus, I disagree with the belief that "people who program computers to understand symbols are free thinkers who make them mean anything they want".)

-atom
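
For readers who want the time-indexed reading spelled out, here is a minimal Python sketch. The function name `trace_updates` and the sample values are made up for illustration. It keeps every past value of X in a list, so the list plays the role of X[0], X[1], ... while the ordinary statement `x = x + y` only ever holds the latest one.

```python
def trace_updates(x0, ys):
    """Return [X[0], X[1], ...] where X[t+1] = X[t] + Y[t]."""
    history = [x0]
    x = x0
    for y in ys:
        x = x + y          # the BASIC-style statement "X = X + Y"
        history.append(x)  # the same step with the time index kept
    return history


print(trace_updates(2, [1, 1, 1]))  # prints [2, 3, 4, 5]
```

The assignment form throws the history away, which is the saving in memory the post alludes to; the list form keeps it.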

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Jan Ziak@21:1/5 to Kaz Kylheku on Thu Dec 30 20:19:31 2021

On Wednesday, December 29, 2021 at 11:28:34 PM UTC+1, Kaz Kylheku wrote:

On 2021-12-16, Roger L Costello wrote:

Question: Opine about why languages are usually defined and implemented with
ambiguous grammars.

Novice programmers have historically been attracted to cryptic-looking languages. It is one of the main reasons for the success of languages
like C and Perl.
....

I know that what I am about to write does not answer the original question about ambiguous grammars, but I feel I have to respond to the claim that novices are attracted to cryptic-looking languages. If that were true, then the brainf**k language would be in the top 10 languages in use today.

People new to programming aren't attracted to C because it is cryptic, but because - for example - in the 1990s they learned that C was used to implement the game Doom with only a few elements of assembly (https://en.wikipedia.org/wiki/Development_of_Doom#Programming). Doom was implemented in C, not in Lisp/Pascal/Smalltalk, which increased the popularity of C and decreased the popularity of Lisp/Pascal/Smalltalk.

Some young programmers were attracted to Smalltalk after the year 2002 because they watched the Squeakers movie (I believe it is this one: https://www.imdb.com/title/tt2172065/).

In summary: Novice programmers are attracted to particular programming languages because those languages are popular in their social networks.

-atom
[Sigh. You're probably right. Historically, novices started with a toy
language which left out more advanced but important ideas like data
structures and name scope, and gave them an unfortunately blinkered
idea of what programming involves. One time when I was a grad student
I had to explain to one of the undergrads why you really didn't want
to write all your programs in Tiny Basic. -John]


From gah4@21:1/5 to mac on Mon Jan 3 21:07:07 2022

On Monday, January 3, 2022 at 11:58:39 AM UTC-8, mac wrote:

[Interesting take. In reality, of course, BASIC borrowed that from Fortran. Algol
used := for assignment, different from = for equality comparison. -John]

Indeed.
Unfortunately, assignment is probably the single most common operator.
The ASCII committee should have kept the left-arrow character instead of replacing it with underscore.

The assignment statement in BASIC, at least the ones I know, has an
(optional) LET keyword, so it might say:

10 LET A=3

Most people leave it off, though.

Is PL/I the only language that uses = for both assignment and the relational operator?
Since expressions are not statements, it avoids the ambiguity that would otherwise occur.
I believe some BASIC also use = for both.
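
A toy illustration of why this works, written in Python rather than PL/I. The helper names are invented for the sketch, and it only handles statements of the form `name = name` or `name = name = name`. The rule assumed here is that the first `=` of a statement is the assignment and any later `=` belongs to the expression, so it is a comparison. Real PL/I has more cases (multiple targets, subscripts) that this ignores.

```python
def split_assignment(stmt):
    """Split 'A = B = C' into the target 'A' and the expression 'B = C'."""
    target, sep, expr = stmt.partition("=")
    if not sep:
        raise ValueError("not an assignment statement")
    return target.strip(), expr.strip()


def evaluate(expr, env):
    """Evaluate a toy expression: either 'name' or 'name = name'."""
    if "=" in expr:
        left, _, right = expr.partition("=")
        return env[left.strip()] == env[right.strip()]  # '=' as comparison
    return env[expr.strip()]


def run(stmt, env):
    """Execute one toy statement and return the updated environment."""
    target, expr = split_assignment(stmt)
    env[target] = evaluate(expr, env)  # first '=' as assignment
    return env


print(run("A = B = C", {"B": 1, "C": 1}))  # A becomes True
```

Because a bare expression can never start a statement, the position of the `=` is enough to tell the two meanings apart.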

Underscore is a pretty useful character.

The two ASCII characters that don't exist in EBCDIC are ^ and ~.
Two EBCDIC characters that don't exist in ASCII are ¢ (cent)
and ¬ (logical NOT sign). Conversion tables usually cross
map those pairs. (PL/I, at least, uses ¬ and ¬= operators.)

[In original Dartmouth BASIC the LET was mandatory, but it was a considerably smaller and fully compiled language than the later dialects. On the other hand, PL/I made a fetish of nothing being a reserved word, e.g.

IF IF = THEN THEN ELSE = BEGIN; ELSE END = IF;

-John]


From Thomas Koenig@21:1/5 to All on Tue Jan 4 19:23:04 2022

[In original Dartmouth BASIC the LET was mandatory, but it was a considerably smaller and fully compiled language than the later dialects. On the other hand, PL/I made a fetish of nothing being a reserved word, e.g.

IF IF = THEN THEN ELSE = BEGIN; ELSE END = IF;

Fortran shares this property.

This may sound slightly odd to people brought up on languages with
reserved keywords, but it has a big advantage: You can extend the
language with new keywords without making existing programs invalid.

There is a cost to this, in several aspects:

- More CPU time needed for parsing (important earlier, now it
can generally be neglected).

- More complexity in the parser. This cost is paid once, and
by the compiler developers, not the users.

- Similarity to FORTRAN may induce fear and loathing in computer
scientists (the last remark is not to be taken too seriously :-)

[Fortran barely had tokens since it ignored spaces outside of Hollerith strings. Having written a few F77 parsers, I can say it was possible
to tokenize using hints from the parser about what lexical kludge to
apply, but it wasn't pleasant. The yacc parser was straightforward
other than figuring out when to send which kludge ID to the lexer.

It also meant that one character typos could cause large semantic
changes, notably DO 5 I = 1,10 is a loop while DO 5 I = 1.10
is an assignment. Legend says we lost a satellite to that one.

-John]
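
The DO example can be sketched in a few lines of Python. This is only an illustration of the kind of lookahead involved, not how any particular F77 compiler does it; the function name `classify_do` is made up, and the sketch ignores Hollerith and character constants, which a real lexer has to handle. After blanks are removed, the statement is a loop only if there is a comma outside parentheses to the right of the `=`.

```python
def classify_do(stmt):
    """Classify a fixed-form Fortran statement that begins with DO."""
    text = stmt.replace(" ", "").upper()  # blanks are not significant
    if not text.startswith("DO") or "=" not in text:
        return "other"
    rhs = text.split("=", 1)[1]
    depth = 0
    for ch in rhs:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            return "loop"        # e.g. DO 5 I = 1,10
    return "assignment"          # e.g. DO5I = 1.10


print(classify_do("DO 5 I = 1,10"))  # prints loop
print(classify_do("DO 5 I = 1.10"))  # prints assignment
```

The one-character difference between `,` and `.` is the only thing that separates the two results, which is the hazard described above.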


From gah4@21:1/5 to our moderator on Tue Jan 4 13:26:00 2022

On Tuesday, January 4, 2022 at 10:15:42 AM UTC-8, gah4 wrote:

(snip, our moderator wrote)

[In original Dartmouth BASIC the LET was mandatory, but it was a considerably smaller and fully compiled language than the later dialects. On the other hand, PL/I made a fetish of nothing being a reserved word, e.g.

IF IF = THEN THEN ELSE = BEGIN; ELSE END = IF;

-John]

I never used anything close to the original BASIC, but did use, for some time,
the HP TSB2000 version. HP stores programs after tokenizing, so I suspect
that even if you don't put in LET, the tokenizer will add it.

As for PL/I, it borrowed many features from COBOL, but not the use
of reserved words. For one, they wanted people not to have to know the whole language, and not even the words. Stories are that COBOL programmers always keep the list of reserved words nearby, to avoid using them.

Counting from a recent IBM web page on their COBOL compiler, there are
over 400 reserved words, many common English words that people might
like to use. Somehow out of 50 years of programming, I have managed
never to even type in and run a COBOL program, and especially not to
write one.

As for Fortran parsing, I do remember that WATFIV reserves the sequence 'FORMAT(' at the beginning of a statement for actual FORMAT statements.
You can't assign to elements of an array named FORMAT. That might not
be so bad, except that Fortran 66, in its run-time format feature, requires the format data to be in an array. And the obvious name is FORMAT!
[COBOL doesn't have that many reserved words. See https://www.ibm.com/docs/en/i/7.1?topic=list-reserved-words
Re FORMAT statements, WATFOR/FIV punted for some reason. It's not that
hard to tell a format statement from a statement like FORMAT(I5,A4) =
42 but I realize no sane programmer would do that. -John]


From Martin Ward@21:1/5 to All on Wed Jan 5 10:25:37 2022

On 04/01/2022 21:26, gah4 wrote:

Stories are that COBOL programmers always
keep the list of reserved words nearby, to avoid using them.

Our esteemed moderator claims:

[COBOL doesn't have that many reserved words

I count 510 reserved words in IBM COBOL. Adding a few other dialects
can push the total to 700 or more. By comparison, C has about 32
reserved words.

The story I heard was of a COBOL shop where it was mandatory to
include a hyphen in every data name: in effect, *every* unhyphenated
word was treated as a reserved word. The slightly more manageable list
of *hyphenated* reserved words (149 in IBM COBOL, but 46 of these are
of the form COMP-0, COMP-1, COMP-2 etc) was printed out and posted on
the wall.

I just noticed that if you include a digit in the part of the name
before the first hyphen, you can guarantee to avoid all
the reserved words!
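
A rough Python sketch of that naming rule, for anyone who wants to check names mechanically. The function name and the `SAMPLE_RESERVED` set are made up for this sketch; the set is a tiny sample of COBOL reserved words, not the full IBM list, so it illustrates the idea and proves nothing about the complete list.

```python
# A tiny illustrative sample of COBOL reserved words, NOT the full IBM list.
SAMPLE_RESERVED = {
    "MOVE", "ADD", "END-IF", "COMP-1", "COMP-2",
    "LINE-COUNTER", "PAGE-COUNTER",
}


def follows_digit_rule(name):
    """True if the name is hyphenated and has a digit before the first hyphen."""
    head, sep, _ = name.partition("-")
    return bool(sep) and any(ch.isdigit() for ch in head)


print(follows_digit_rule("W1-TOTAL"))  # prints True
print(follows_digit_rule("TOTAL-1"))   # prints False
# None of the sample reserved words satisfies the rule:
print(any(follows_digit_rule(w) for w in SAMPLE_RESERVED))  # prints False
```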

PL/I went to the other extreme of no reserved words in reaction
to COBOL. Also, the aim of PL/I was to be a language which does
everything: business programming (like COBOL) and scientific
programming (like FORTRAN). In theory, if you only wanted
to do, say, business programming, you only needed to learn
part of the language and you would not get tripped up by keywords
from the other part of the language that you didn't know about yet.

Using a language that you don't know in its entirety might seem
dangerous, but everybody seems to do it these days:
how many C programmers have read the entire 500+ pages of
the latest C standard and memorised the 200+ varieties
of "undefined behaviour" so that they can avoid all of them
in every line of code that they write?
--
Martin

Dr Martin Ward | Email: martin@gkc.org.uk | http://www.gkc.org.uk
G.K.Chesterton site: http://www.gkc.org.uk/gkc | Erdos number: 4
[IBM hoped everyone would switch from Fortran and COBOL to PL/I and
it was obvious Fortran programmers would not put up with reserved
words, particularly ones unrelated to scientific programming.
As far as the size of languages, that seems a matter of point of
view. Python is a large language if you consider the standard
library to be part of the language, a very small one if you don't.
-John]


From David Brown@21:1/5 to Martin Ward on Thu Jan 6 09:11:29 2022

On 05/01/2022 11:25, Martin Ward wrote:

Using a language that you don't know in its entirety might seem
dangerous, but everybody seems to do it these days:
how many C programmers have read the entire 500+ pages of
the latest C standard and memorised the 200+ varieties
of "undefined behaviour" so that they can avoid all of them
in every line of code that they write?

I think it is normal not to know everything about the language you use.
And if you include the language's standard library, then there are very
few currently used languages where it would even be possible to learn it
all. By the time you learned all of the language and default libraries
of C++, Java, Python, etc., there would be a new version out and you'd
have more to learn.

The important things for writing code are to know enough to be able to
write the kind of code you are doing, and to avoid accidentally doing
things you didn't intend. Static warning tools are vital here - from syntax-highlighting and check-as-you-type editors and IDEs, through
compiler warning flags, to stand-alone checkers. Your tools should
tell you if you are accidentally using a reserved word as an
identifier.

There is no need to memorize undefined behaviours for a language -
indeed, such a thing is impossible since everything not defined by a
language standard is, by definition, undefined behaviour. (C and C++
are not special here - the unusual thing is just that their standards
say this explicitly.)

The trick is to memorize the /defined/ behaviours, and stick to them.
You generally don't need to know if a language leaves (1 / 0) as
undefined, or gives a specific value, or prints an error message -
usually it is sufficient to know the values for which (x / y) /is/
defined, and stick to those values.
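
As a concrete sketch of "know where (x / y) is defined", here is a small Python model of signed integer division in C. It assumes a 32-bit two's complement `int`, which is an assumption of the sketch and not something the C standard guarantees; the function name is made up. The two cases that fall outside the defined domain are a zero divisor and the one quotient that overflows.

```python
INT_MIN = -2**31      # assumes a 32-bit int
INT_MAX = 2**31 - 1


def c_int_div_is_defined(x, y):
    """Model of the operand pairs for which C's `x / y` on int is defined."""
    if not (INT_MIN <= x <= INT_MAX and INT_MIN <= y <= INT_MAX):
        return False              # not representable as int at all
    if y == 0:
        return False              # division by zero
    if x == INT_MIN and y == -1:
        return False              # quotient would exceed INT_MAX
    return True


print(c_int_div_is_defined(7, 2))         # prints True
print(c_int_div_is_defined(7, 0))         # prints False
print(c_int_div_is_defined(INT_MIN, -1))  # prints False
```

Code that checks this precondition before dividing never needs to care what happens outside it.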

Basically, trying to execute undefined behaviour is no more and no less
than a bug in the program - whether it is "undefined" in terms of the
language, the library, the code you wrote yourself, the customer's specification, or anything else. People program primarily by trying to
write correct code - not by trying to think of all the ways they could
write incorrect code!

The real challenge from big languages and big standard libraries is not /writing/ code, it is /reading/ it. It doesn't really matter if a C programmer, when writing some code, does not know what the syntax "void
foo(int a[static 10]);" means. (Most C programmers don't know it, and
never miss it.) But it can be a problem if they have to read and
understand code that uses something they don't know.


From Robert Prins@21:1/5 to David Brown on Thu Jan 6 19:07:20 2022

On 2022-01-06 08:11, David Brown wrote:

On 05/01/2022 11:25, Martin Ward wrote:

Your tools should tell you if you are accidentally using a reserved word as an
identifier.

Your language should not have reserved words, if PL/I (AD 1964) could already do
without them...

'nuff said!

Robert
--
Robert AH Prins
robert(a)prino(d)org
The hitchhiking grandfather - https://prino.neocities.org/
Some REXX code for use on z/OS - https://prino.neocities.org/zOS/zOS-Tools.html
[Just because it's possible to do something doesn't mean it is a good
idea. A lot of us think a reasonable number of reserved words are fine
and make it less likely that a typo will silently change the meaning
of a program. -John]

