Forum: >>> Magnum BBS <<<

Hardening Defined Words

From Krishna Myneni@21:1/5 to All on Fri Aug 5 22:06:11 2022

Summary: for some non-native Forth systems, it should be possible to
relocate the compiled code of a colon definition into memory which can
be marked read-only, to protect against corruption. For this to be
feasible the run-time xt for a word should have at least one level of indirection to the code being executed by the virtual machine.

Consider the following ordinary colon definition in kForth, an indirect threaded code Forth interpreter/compiler:

: foo 0 ;

' foo execute . \ same as typing FOO
0 ok

see foo
565403F101F0 #0
565403F101F9 RET
ok

Now, let's do some bad things to FOO.

0 ' foo ! \ store a zero at the execution address for FOO
foo
Segmentation fault (core dumped)
$

Start kForth again and define FOO as above.

' foo a@ execute-bc . \ execute the byte code for FOO

One may infer that the xt for FOO is an address at which the compiled
byte code for FOO resides. The byte code is the code executed by
kForth's virtual machine.

Now, let's define a word BAR and demonstrate that we can modify the byte
code for BAR directly from the Forth interpreter.

: bar 10 0 do i . loop ;

see bar
560DD5DABDA0 #10
560DD5DABDA9 #0
560DD5DABDB2 >R
560DD5DABDB3 >R
560DD5DABDB4 IP>R
560DD5DABDB5 I
560DD5DABDB6 .
560DD5DABDB7 LOOP
560DD5DABDB8 RET
ok

To see the actual byte code of BAR,

' bar a@ 32 dump

560DD5DABDA0 : 49 0A 00 00 00 00 00 00 00 49 00 00 00 00 00 00  I........I......
560DD5DABDB0 : 00 00 DC DC DE 69 2E E9 EE 00 00 00 00 00 00 00  .....i..........

( the RET instruction for the virtual machine is byte EE ).

Now, we may corrupt the byte code, for example, by changing the loop
count to 5, instead of 10:

5 ' bar a@ 1+ !

Now, when BAR is executed, it will output "0 1 2 3 4 ok"

It is possible to use mmap and mprotect system calls (or equivalents
under Windows) to relocate the byte code to a new memory region and mark
that memory region as read-only, thereby avoiding this type of
corruption. It is relatively simple to do this from Forth itself,
although the details are obviously system-dependent. In this way, we
can, in principle, protect the executed code for a colon definition.
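As a rough illustration of the relocation step, here is a minimal C sketch for a POSIX system. It is not kForth's code: word_t and protect_def are hypothetical names, and the struct only stands in for the part of a dictionary entry that points at the byte code. The pointer field is the level of indirection mentioned in the summary; without it there would be nothing to repoint after the copy.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical stand-in for a dictionary entry: the xt leads to a
   pointer to the byte code rather than to the byte code itself. */
typedef struct {
    unsigned char *code;   /* address of the byte code run by the VM */
    size_t         size;   /* byte-code length in bytes */
} word_t;

/* Copy w's byte code into freshly mapped pages, make those pages
   read-only, and repoint the word at the copy.
   Returns 0 on success, -1 on failure (w is then left unchanged). */
int protect_def(word_t *w)
{
    assert(w != NULL && w->code != NULL && w->size > 0);

    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len  = (w->size + page - 1) / page * page;  /* whole pages */

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memcpy(p, w->code, w->size);
    if (mprotect(p, len, PROT_READ) != 0) {
        munmap(p, len);
        return -1;
    }

    /* The old copy is left alone here; a real system would have to
       decide whether and how to reclaim it. */
    w->code = p;
    return 0;
}
```

After protect_def succeeds, a store through w->code raises SIGSEGV, while a VM that fetches its instructions through the updated pointer keeps running the same byte code.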

It's important to note that the dictionary structure for the word itself
is not able to be protected from being overwritten in this scheme.
Protecting the dictionary headers for colon definitions would require a significant change in architecture, but it's not out of the question.

Although I used kForth as the example system since I'm familiar with its internals, other systems may be able to do the same. I don't know the
internals of Gforth, but one can see that at least one level of
indirection appears to be involved in going from the xt to the executed
code, e.g., in Gforth,

see execute
Code execute
404AB9: mov $50[r13],r15
404ABD: mov rdx,[r14]
404AC0: add r14,$08
404AC4: mov rcx,-$10[rdx]
404AC8: jmp ecx
end-code

Here, the assembly code gives us the hint that r14 is the TOS (top of
stack) and there seems to be one level of indirection from the xt on top
of the stack to the code which is subsequently executed. The code
pointed to by xt can be overwritten, e.g., in Gforth,

: bar 10 0 do i . loop ; ok
bar 0 1 2 3 4 5 6 7 8 9 ok

0 ' bar @ ! ok
bar
*the terminal*:3:1: error: Stack underflow

I don't know enough about Gforth internals to be able to say that a
relocation of the code for BAR to a region which can be protected as
read only is possible. Perhaps one of the Gforth developers can say definitively whether or not this is possible.

--
Krishna Myneni

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From dxforth@21:1/5 to All on Sat Aug 6 14:09:13 2022

The only protected system I've used is FlashForth. It attempts to protect
the kernel on the basis that a user should be able to restart Forth after a
crash without having to re-flash the system. It's hard for me to evaluate
the benefits of such a system without disabling the protection (not easy).
The costs are known but the gain remains nebulous. Is there a developer who
wouldn't have access to a programmer should re-flashing become necessary?
And what failure rate are we talking about - once a day, once a month?


From Anton Ertl@21:1/5 to Krishna Myneni on Sat Aug 6 05:48:23 2022

Krishna Myneni <krishna.myneni@ccreweb.org> writes:

> It is possible to use mmap and mprotect system calls (or equivalents
> under Windows) to relocate the byte code to a new memory region and mark
> that memory region as read-only, thereby avoiding this type of
> corruption. It is relatively simple to do this from Forth itself,
> although the details are obviously system-dependent. In this way, we
> can, in principle, protect the executed code for a colon definition.
>
> It's important to note that the dictionary structure for the word itself
> is not able to be protected from being overwritten in this scheme.

I don't see a reason why not. Compile-to-flash systems do it. If you
don't want to change protection on every IMMEDIATE, DOES> etc., keep
the most recent header in writeable memory, and only move it to
read-only memory when the next header is created.
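The "keep the newest header writable, seal it when the next one is created" idea can be sketched in C for a POSIX system. Everything named here (header_t, arena_init, create_header, set_immediate, the one-page arena) is a hypothetical illustration rather than the layout of any existing Forth; the point is only that words such as IMMEDIATE touch ordinary writable memory, and protection changes happen once per new header.

```c
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define F_IMMEDIATE 1u

typedef struct {
    char     name[32];
    unsigned flags;
} header_t;

static header_t  latest;        /* newest header, still writable */
static int       have_latest;   /* 0 until the first header exists */
static header_t *sealed;        /* read-only arena of settled headers */
static size_t    n_sealed, capacity, arena_len;

/* Map one page for settled headers; it stays read-only between seals. */
int arena_init(void)
{
    arena_len = (size_t)sysconf(_SC_PAGESIZE);
    capacity  = arena_len / sizeof(header_t);
    void *p = mmap(NULL, arena_len, PROT_READ,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    sealed = p;
    return 0;
}

/* Move the newest header into the arena, opening it only briefly. */
static int seal_latest(void)
{
    if (n_sealed == capacity)
        return -1;
    if (mprotect(sealed, arena_len, PROT_READ | PROT_WRITE) != 0)
        return -1;
    sealed[n_sealed++] = latest;
    return mprotect(sealed, arena_len, PROT_READ);
}

/* Start a new header; the previous one becomes read-only first. */
int create_header(const char *name)
{
    assert(sealed != NULL);     /* arena_init must have succeeded */
    if (have_latest && seal_latest() != 0)
        return -1;
    memset(&latest, 0, sizeof latest);
    strncpy(latest.name, name, sizeof latest.name - 1);
    have_latest = 1;
    return 0;
}

/* Like IMMEDIATE: modifies only the newest, still-writable header. */
void set_immediate(void) { latest.flags |= F_IMMEDIATE; }
```

In this sketch a call sequence such as create_header("foo"), set_immediate(), create_header("bar") leaves FOO's header, immediate flag included, in the read-only arena, and only BAR's header writable.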

> I don't know the
> internals of Gforth, but one can see that at least one level of
> indirection appears to be involved in going from the xt to the executed
> code, e.g., in Gforth,
>
> see execute
> Code execute
> 404AB9: mov $50[r13],r15
> 404ABD: mov rdx,[r14]
> 404AC0: add r14,$08
> 404AC4: mov rcx,-$10[rdx]
> 404AC8: jmp ecx
> end-code
>
> Here, the assembly code gives us the hint that r14 is the TOS (top of
> stack) and there seems to be one level of indirection from the xt on top
> of the stack to the code which is subsequently executed.

As far as EXECUTE is concerned, Gforth uses indirect-threaded code.
That's the indirection you are seeing.

It would require substantial changes to make the threaded code and/or
the headers read-only; for the native code it would be relatively straight-forward to make all but the most recent native-code page
unwriteable.

Bugs where code or headers were overwritten have not been problematic
enough in our experience to take any such action. I have not had such
a request by users, either.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: https://forth-standard.org/
EuroForth 2022: https://euro.theforth.net


From minforth@arcor.de@21:1/5 to Krishna Myneni on Fri Aug 5 23:19:32 2022

Krishna Myneni wrote on Saturday, 6 August 2022 at 05:06:15 UTC+2:

> Summary: for some non-native Forth systems, it should be possible to
> relocate the compiled code of a colon definition into memory which can
> be marked read-only, to protect against corruption. For this to be
> feasible the run-time xt for a word should have at least one level of
> indirection to the code being executed by the virtual machine.

The easiest way in a VM-based Forth would be to just add
address-checking to all words that write to memory.
Eg
! (store sanitized, safe but slow)
_! (store naked, fast and unaccessible to the user)
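A minimal C sketch of the two stores, assuming the VM owns one contiguous data space and that only addresses inside it are legal targets for the user-visible store. The names data_space, store_raw and store_checked are made up for illustration, and the alignment test is an extra assumption of this sketch (some Forths permit unaligned stores).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef intptr_t cell_t;
static_assert(sizeof(cell_t) == sizeof(void *), "a cell must hold an address");

/* Hypothetical VM data space: the only region the checked store may write. */
#define DATA_CELLS 1024
static cell_t data_space[DATA_CELLS];

/* Naked store: no checks, for the VM's own use (the "_!" above). */
static inline void store_raw(cell_t *addr, cell_t x) { *addr = x; }

/* Checked store (the user-visible "!"): returns true on success,
   false if addr is outside the data space or not cell-aligned. */
bool store_checked(cell_t *addr, cell_t x)
{
    uintptr_t a  = (uintptr_t)addr;
    uintptr_t lo = (uintptr_t)data_space;
    uintptr_t hi = (uintptr_t)(data_space + DATA_CELLS);

    if (a < lo || a > hi - sizeof(cell_t) || (a - lo) % sizeof(cell_t) != 0)
        return false;
    store_raw(addr, x);
    return true;
}
```

The check costs two comparisons and a remainder per store. A real VM would turn a false result into a THROW or an error message instead of returning a flag.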


From Marcel Hendrix@21:1/5 to Anton Ertl on Sat Aug 6 00:49:31 2022

On Saturday, August 6, 2022 at 8:02:26 AM UTC+2, Anton Ertl wrote:
[..]

> Bugs where code or headers were overwritten have not been problematic
> enough in our experience to take any such action. I have not had such
> a request by users, either.

I have had a problem with this a few times. Actually, I'm extremely glad that
code is *not* read protected: how would I have noticed that something was
wrong? A slightly off final result in a big program is not straightforward
to spot.

An overwrite of native code causes an almost immediate crash
that leads to a useful exception report. It can be problematic
to instrument the calling code to catch the reason (e.g. if the
overwrite happens very infrequently under special conditions). I have
had one or two cases (in 40 years) where I had to use an external
debugger which supports break on memory access. The steps are: start
Forth first, then attach the debugger to the image. Run Forth until you
get the exception address, switch to the debugger and set up the
breakpoint there, then go back to Forth and restart or halt the program
in the vicinity of the problem. Inspecting memory and data is much more
convenient at the Forth end.

-marcel


From dxforth@21:1/5 to minf...@arcor.de on Sat Aug 6 20:14:24 2022

On 6/08/2022 16:19, minf...@arcor.de wrote:

> Krishna Myneni wrote on Saturday, 6 August 2022 at 05:06:15 UTC+2:
>
>> Summary: for some non-native Forth systems, it should be possible to
>> relocate the compiled code of a colon definition into memory which can
>> be marked read-only, to protect against corruption. For this to be
>> feasible the run-time xt for a word should have at least one level of
>> indirection to the code being executed by the virtual machine.
>
> The easiest way in a VM-based Forth would be to just add
> address-checking to all words that write to memory.
> Eg
> ! (store sanitized, safe but slow)
> _! (store naked, fast and unaccessible to the user)

! need not be slow - at least not for RAM - where it matters.
If application RAM in a system is segregated then it's a simple
test for ! to determine. Storing to CODE/FLASH/EEPROM can afford to
be slower as such operations are either atypical or inherently slow.


From Anton Ertl@21:1/5 to Marcel Hendrix on Sat Aug 6 09:21:07 2022

Marcel Hendrix <mhx@iae.nl> writes:

> Actually, I'm extremely glad that
> code is *not* read protected: how would I have noticed that something was
> wrong?

If the code was write-protected and you tried to write to it, you
would get a SIGSEGV on Unix. E.g., when you do SEE FSIN in Gforth,
you see the code for FSIN coming from the gcc, which is
write-protected. Now let's see what happens when I try to write
there:

see fsin
Code fsin
5586FF58AFFB: movapd xmm0,xmm15
5586FF58B000: mov $20[rsp],r8
5586FF58B005: add r15,$08
5586FF58B009: call $5586FF5876E0
5586FF58B00E: mov r8,$20[rsp]
5586FF58B013: movapd xmm15,xmm0
5586FF58B018: mov rcx,-$08[r15]
5586FF58B01C: jmp ecx
end-code
ok
1 $5586FF58AFFB c!
*the terminal*:3:17: error: Invalid memory address
1 $5586FF58AFFB >>>c!<<<

> An overwrite of native code causes an almost immediate crash
> that leads to a useful exception report.

If you are unlucky, the code is executed long after the write, and you
have to puzzle out what went wrong. With write-protected code you see
the write that would otherwise cause the problem, as shown above.

So write-protecting the code can have an advantage. The question is
if the advantage is big enough to justify the effort.

- anton

From Marcel Hendrix@21:1/5 to Anton Ertl on Sat Aug 6 04:12:14 2022

On Saturday, August 6, 2022 at 11:34:00 AM UTC+2, Anton Ertl wrote:
[..]

> If the code was write-protected and you tried to write to it, you
> would get a SIGSEGV on Unix. E.g., when you do SEE FSIN in Gforth,
> you see the code for FSIN coming from the gcc, which is
> write-protected. Now let's see what happens when I try to write
> there:

[..]

> 1 $5586FF58AFFB c!
> *the terminal*:3:17: error: Invalid memory address
> 1 $5586FF58AFFB >>>c!<<<

Indeed useful: the exception is generated immediately
when the overwrite happens. The stack trace shows
who tried to do that.

FORTH> ' fsin idis
$01250FE0 : FSIN
$01250FEA call REDUCE.2PI ( $0124AD00 ) offset NEAR
$01250FEF fsin
$01250FF1 ;
FORTH> 1 $01250FEA c! ok
FORTH> ' fsin idis
$01250FE0 : FSIN
$01250FEA add [ecx] dword, rdx
$01250FEC popfq
$01250FED ??? rdi
$01250FEF fsin
$01250FF1 ;
FORTH> 0e fsin
Caught exception 0xc0000005
ACCESS VIOLATION
instruction pointer = $0000000001250FEA
RAX = $01253425 RBX = $01250FE0
RCX = $00000000 RDX = $0000028E
RSI = $01155C00 RDI = $2C9EF798
RBP = $01125F88 RSP = $2C9EF7D8
R8 = $01099A20 R9 = $00000020
R10 = $01046F50 R11 = $011128A5
R12 = $01099AC0 R13 = $01156FF0
R14 = $01136000 R15 = $01110000
Hardware exception in ``FSIN''+$0000000A
**** RETURN STACK DUMP **** for MAIN-THREAD

Only shows the problem after the fact, needing an external
debugger to set up a break.
Knowing how much effort/nuisance it is to make words
r/o during compilation and debugging would make it possible
to weigh advantages and disadvantages. It seems that
code and data can't be interleaved at all.

-marcel


From Krishna Myneni@21:1/5 to Anton Ertl on Sat Aug 6 10:56:42 2022

On 8/6/22 04:21, Anton Ertl wrote:

> Marcel Hendrix <mhx@iae.nl> writes:
>
>> Actually, I'm extremely glad that
>> code is *not* read protected: how would I have noticed that something was
>> wrong?
>
> If the code was write-protected and you tried to write to it, you
> would get a SIGSEGV on Unix. ...
> So write-protecting the code can have an advantage. The question is
> if the advantage is big enough to justify the effort.

Yes, even when the code to be executed is in the form of tokenized byte
code and executed by a virtual machine, protecting the memory containing
the byte code by making it read only will cause an overwrite to fail immediately and unambiguously.

I've run into this type of memory corruption problem a few times and I
remember they were extremely difficult to debug. It seems more common to
see memory corruption of the dictionary headers via a stray address, so
that problem may be more pressing than overwriting the byte code. In
kForth, corruption of the dictionary headers is often indicated by a
core dump upon performing bye, when the dynamically allocated dictionary
space is freed.

The machine code part of all code words in kForth is stored in memory
which is read-execute except when new code is added (see mc.4th). It is
not terribly difficult to protect ordinary colon definitions in kForth,
although it is a bit of a hack right now. The dictionary header needs
another field or two, one of which indicates whether or not the word's
executable code is read-only, and another to store the executable code size.

I wrote a variant of mc.4th, called protect.4th, which allows a colon definition to be protected. Going back to our BAR example,

: bar 10 0 do i . loop ;
ok
see bar
55C34CBC89C0 #10
55C34CBC89C9 #0
55C34CBC89D2 >R
55C34CBC89D3 >R
55C34CBC89D4 IP>R
55C34CBC89D5 I
55C34CBC89D6 .
55C34CBC89D7 LOOP
55C34CBC89D8 RET
ok
bar
0 1 2 3 4 5 6 7 8 9 ok

' bar 32 Protect-Def \ relocate BAR's byte code to protected memory
ok

\ Protect-Def also updates the dictionary header for BAR

see bar
7F09C30E3000 #10
7F09C30E3009 #0
7F09C30E3012 >R
7F09C30E3013 >R
7F09C30E3014 IP>R
7F09C30E3015 I
7F09C30E3016 .
7F09C30E3017 LOOP
7F09C30E3018 RET
ok
\ Note the new address space of the relocated byte code.
bar
0 1 2 3 4 5 6 7 8 9 ok

Now, unlike before, when we try to overwrite the byte code memory there
is an immediate and hard failure.

5 ' bar a@ 1+ !
Segmentation fault (core dumped)
$

The details of PROTECT-DEF are, of course, Forth system and OS system-dependent. The source code for protect.4th is posted at

https://github.com/mynenik/kForth-64/blob/master/forth-src/protect.4th

One problem with the current approach is a segmentation fault on
executing BYE , because the cleanup code executed upon BYE tries to free
the new byte code memory. This is why a protection flag is needed in the dictionary header, which involves changes to the source code for the
Forth system. However, these are relatively simple changes to kForth.

--
Krishna


From Anton Ertl@21:1/5 to Marcel Hendrix on Sat Aug 6 16:21:16 2022

Marcel Hendrix <mhx@iae.nl> writes:

Knowing how much effort/nuisance it is to make words
r/o during compilation and debugging would make it possible
to weigh advantages and disadvantages. It seems that
code and data can't be interleaved at all.

If you want to write-protect code, you cannot have writable data in
the same page. You can have read-only data (e.g., settled headers,
constant values) in the same page (and that's good enough to avoid the
cache consistency performance problem on the Pentium Pro, Athlon, and
later CPUs), but cache utilization is better if you separate code and
data.
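Because protection is applied per page, a layout can be checked by asking whether a code range and a writable-data range touch a common page. A small C sketch of that check follows; share_page and page_floor are hypothetical helper names, and the page size is passed in explicitly (on a POSIX system it would come from sysconf(_SC_PAGESIZE)).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* First address of the page containing a; page must be a power of two. */
static uintptr_t page_floor(uintptr_t a, size_t page)
{
    return a & ~((uintptr_t)page - 1);
}

/* True if the byte ranges [a, a+alen) and [b, b+blen) touch a common
   page, i.e. write-protecting one range would also affect the other. */
bool share_page(uintptr_t a, size_t alen,
                uintptr_t b, size_t blen, size_t page)
{
    assert(page != 0 && (page & (page - 1)) == 0);
    assert(alen > 0 && blen > 0);

    uintptr_t a_first = page_floor(a, page);
    uintptr_t a_last  = page_floor(a + alen - 1, page);
    uintptr_t b_first = page_floor(b, page);
    uintptr_t b_last  = page_floor(b + blen - 1, page);

    /* Two page intervals overlap unless one ends before the other starts. */
    return a_first <= b_last && b_first <= a_last;
}
```

With 4096-byte pages, code at 0x1000..0x100F and a variable at 0x1800 share a page, so the variable would become read-only along with the code; moving the variable to 0x2000 or beyond separates them.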

- anton

From Krishna Myneni@21:1/5 to Anton Ertl on Sat Aug 6 14:39:34 2022

On 8/6/22 00:48, Anton Ertl wrote:

> Krishna Myneni <krishna.myneni@ccreweb.org> writes:

...

>> It's important to note that the dictionary structure for the word itself
>> is not able to be protected from being overwritten in this scheme.
>
> I don't see a reason why not. Compile-to-flash systems do it. If you
> don't want to change protection on every IMMEDIATE, DOES> etc., keep
> the most recent header in writeable memory, and only move it to
> read-only memory when the next header is created.

All dictionary headers don't correspond to ordinary colon definitions.
If one were to protect all headers, there may be issues with relocation affecting previously compiled code, such as with DEFERred words. I
haven't thought through this problem enough yet to say with certainty
that all headers can be protected as read-only. It may be highly system-dependent.

...

> It would require substantial changes to make the threaded code and/or
> the headers read-only; for the native code it would be relatively
> straight-forward to make all but the most recent native-code page
> unwriteable.

I expect that your use of memory segments in Gforth should simplify the
problem of placing the threaded code in write-protected segments.

> Bugs where code or headers were overwritten have not been problematic
> enough in our experience to take any such action. I have not had such
> a request by users, either.

Well, such bugs may be occurring more often than you realize. Such bugs
often don't have immediate consequences. I can run Forth code which
works perfectly fine because it hasn't made use of corrupt parts of the
system, but if such corruption has occurred I often find out when I type
BYE and then see a Seg Fault as memory is freed while the application is terminating. This tells me to go back and find the bugs in my
application Forth code.

--
Krishna


From Krishna Myneni@21:1/5 to dxforth on Sat Aug 6 14:45:20 2022

On 8/5/22 23:09, dxforth wrote:

> The only protected system I've used is FlashForth. It attempts to protect
> the kernel on the basis that a user should be able to restart Forth after a
> crash without having to re-flash the system. It's hard for me to evaluate
> the benefits of such a system without disabling the protection (not easy).
> The costs are known but the gain remains nebulous. Is there a developer who
> wouldn't have access to a programmer should re-flashing become necessary?
> And what failure rate are we talking about - once a day, once a month?

I expect the failure rate is highly application dependent. An
alternative to write-protecting the executable code and constant data,
is to store hash/checksum of the data within a read-only region (even an
EEPROM for a release application). Then, the user can check for
corruption on demand by recomputing the hashes/checksums and comparing
with the read-only data.
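The checksum alternative might look like the following C sketch. FNV-1a is used only because it fits in a few lines; it is a non-cryptographic hash chosen for this illustration, not something proposed in the thread, and region_t, seal_region and region_intact are hypothetical names. In a real system the region_t record itself would have to live in read-only storage for the check to mean anything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* 64-bit FNV-1a hash of len bytes starting at buf. */
uint64_t fnv1a64(const void *buf, size_t len)
{
    const unsigned char *p = buf;
    uint64_t h = 0xcbf29ce484222325ULL;      /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;               /* FNV prime */
    }
    return h;
}

/* Reference record for one region of code or constant data. */
typedef struct {
    const void *start;
    size_t      len;
    uint64_t    hash;
} region_t;

/* Record the hash of a region while it is known to be good. */
region_t seal_region(const void *start, size_t len)
{
    assert(start != NULL || len == 0);
    region_t r = { start, len, fnv1a64(start, len) };
    return r;
}

/* Recompute on demand and compare with the recorded hash. */
bool region_intact(const region_t *r)
{
    return fnv1a64(r->start, r->len) == r->hash;
}
```

Unlike page protection, this detects corruption only when the check is run, and it does not identify the code that did the overwriting.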

I expect the probability of a failure to be highly dependent on the
complexity of the application, and possibly on the hardware operating
environment, e.g. if the power cycles on and off frequently while writing
to the storage medium.

--
Krishna


From Krishna Myneni@21:1/5 to minf...@arcor.de on Sat Aug 6 14:24:15 2022

On 8/6/22 01:19, minf...@arcor.de wrote:

> Krishna Myneni wrote on Saturday, 6 August 2022 at 05:06:15 UTC+2:
>
>> Summary: for some non-native Forth systems, it should be possible to
>> relocate the compiled code of a colon definition into memory which can
>> be marked read-only, to protect against corruption. For this to be
>> feasible the run-time xt for a word should have at least one level of
>> indirection to the code being executed by the virtual machine.
>
> The easiest way in a VM-based Forth would be to just add
> address-checking to all words that write to memory.
> Eg
> ! (store sanitized, safe but slow)
> _! (store naked, fast and unaccessible to the user)

Not sure what you mean. How does a VM-based Forth distinguish between
addresses which are data space vs addresses which contain the virtual
machine code?

kForth uses a separate type stack to distinguish between ordinary
numbers and addresses. This has proven useful in flagging common
mistakes caused by incorrect stack manipulation. It is quite useful, I
think, in aiding beginning Forth programmers to reveal the source of the problem. The added complexity of the type stack gives a performance hit
of about 15% in kForth.
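To make the idea of a type stack concrete, here is a small C sketch in which every data-stack cell has a parallel tag and the store operation refuses to run unless the top item is tagged as an address. This is an illustration of the general technique only; the names (tag_t, push, vm_store) and the two-tag scheme are assumptions of the sketch and are not taken from kForth's source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { T_NUM, T_ADDR } tag_t;

#define STACK_DEPTH 64
static intptr_t data_stack[STACK_DEPTH];
static tag_t    type_stack[STACK_DEPTH];   /* one tag per data cell */
static int      sp = 0;                    /* number of items on the stack */

/* Push a value together with its tag. */
void push(intptr_t v, tag_t t)
{
    assert(sp < STACK_DEPTH);
    data_stack[sp] = v;
    type_stack[sp] = t;
    sp++;
}

/* Forth "!" ( x addr -- ): stores x at addr only if the top item is
   tagged as an address. On failure the stack is left untouched. */
bool vm_store(void)
{
    if (sp < 2 || type_stack[sp - 1] != T_ADDR)
        return false;
    intptr_t *addr = (intptr_t *)data_stack[sp - 1];
    *addr = data_stack[sp - 2];
    sp -= 2;
    return true;
}
```

A swapped argument order (address first, then value) is rejected because the top of the stack carries the number tag. The extra tag write on every push is the kind of overhead that a performance hit of this sort would come from.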

--
Krishna


From minforth@arcor.de@21:1/5 to Krishna Myneni on Sat Aug 6 22:33:35 2022

Krishna Myneni wrote on Saturday, 6 August 2022 at 21:24:20 UTC+2:

> On 8/6/22 01:19, minf...@arcor.de wrote:
>
>> Krishna Myneni wrote on Saturday, 6 August 2022 at 05:06:15 UTC+2:
>>
>>> Summary: for some non-native Forth systems, it should be possible to
>>> relocate the compiled code of a colon definition into memory which can
>>> be marked read-only, to protect against corruption. For this to be
>>> feasible the run-time xt for a word should have at least one level of
>>> indirection to the code being executed by the virtual machine.
>>
>> The easiest way in a VM-based Forth would be to just add
>> address-checking to all words that write to memory.
>> Eg
>> ! (store sanitized, safe but slow)
>> _! (store naked, fast and unaccessible to the user)
>
> Not sure what you mean. How does a VM-based Forth distinguish between
> addresses which are data space vs addresses which contain the virtual
> machine code?

When the dataspace is allocated at program start, it will contain no VM code.

> kForth uses a separate type stack to distinguish between ordinary
> numbers and addresses. This has proven useful in flagging common
> mistakes caused by incorrect stack manipulation. It is quite useful, I
> think, in aiding beginning Forth programmers to reveal the source of the
> problem. The added complexity of the type stack gives a performance hit
> of about 15% in kForth.

IIRC StrongForth used a type stack to type-tag _all_ numeric types. But I
don't know how much performance it cost.

For pure stack and address checking (hard-coded in the words) I measured
a runtime penalty of about 5%-7% in my system. But it is C-based and
therefore slower anyhow, so those different performance hits cannot be
compared.


From Anton Ertl@21:1/5 to Krishna Myneni on Sun Aug 7 14:11:58 2022

Krishna Myneni <krishna.myneni@ccreweb.org> writes:

> On 8/6/22 00:48, Anton Ertl wrote:
>
>> Krishna Myneni <krishna.myneni@ccreweb.org> writes:
>
> ...
>
>>> It's important to note that the dictionary structure for the word itself
>>> is not able to be protected from being overwritten in this scheme.
>>
>> I don't see a reason why not. Compile-to-flash systems do it. If you
>> don't want to change protection on every IMMEDIATE, DOES> etc., keep
>> the most recent header in writeable memory, and only move it to
>> read-only memory when the next header is created.
>
> All dictionary headers don't correspond to ordinary colon definitions.

?

> If one were to protect all headers, there may be issues with relocation
> affecting previously compiled code, such as with DEFERred words.

It's unclear to me how relocation comes in here, but for DEFERed words
the thing that IS changes is not part of the header, just like for a
VALUE the thing the TO changes is not part of the header. And of
course you would need one indirection to get from the header to the
data in these cases.

> I
> haven't thought through this problem enough yet to say with certainty
> that all headers can be protected as read-only. It may be highly
> system-dependent.

If you put in enough effort, they can.

> I expect that your use of memory segments in Gforth should simplify the
> problem of placing the threaded code in write-protected segments.

Yes, one can use sections for that purpose, but there is still a
substantial amount of changes to make.

> Well, such bugs may be occurring more often than you realize. Such bugs
> often don't have immediate consequences. I can run Forth code which
> works perfectly fine because it hasn't made use of corrupt parts of the
> system, but if such corruption has occurred I often find out when I type
> BYE and then see a Seg Fault as memory is freed while the application is
> terminating. This tells me to go back and find the bugs in my
> application Forth code.

Apart from your use as a debugging tool, freeing before exit()ing the
process is a waste of time.

One extreme case was Mozilla, which
apparently leaked memory, and that memory was paged out over time
(that was at a time when we still used swap space). When exiting
Mozilla, it took several minutes to page the leaked memory back in in
order to free() it. Only then it performed the exit(). It would have
exited much faster if it had not freed first.

My guess is that a manager saw the memory leaks, and demanded that the programmers fix them; so they dutifully recorded all memory that they allocated, and freed it when the user wanted to quit Mozilla. As a
result, the memory leak tool they used for finding leaks reported that
no memory was leaked, and the manager was satisfied. In reality the
leaks were still there, and Mozilla was now sluggish on termination.

- anton

From Krishna Myneni@21:1/5 to Anton Ertl on Sun Aug 7 11:03:22 2022

On 8/7/22 09:11, Anton Ertl wrote:

> Krishna Myneni <krishna.myneni@ccreweb.org> writes:
>
>> On 8/6/22 00:48, Anton Ertl wrote:
>>
>>> Krishna Myneni <krishna.myneni@ccreweb.org> writes:
>>
>> ...
>>
>>>> It's important to note that the dictionary structure for the word itself
>>>> is not able to be protected from being overwritten in this scheme.
>>>
>>> I don't see a reason why not. Compile-to-flash systems do it. If you
>>> don't want to change protection on every IMMEDIATE, DOES> etc., keep
>>> the most recent header in writeable memory, and only move it to
>>> read-only memory when the next header is created.
>>
>> All dictionary headers don't correspond to ordinary colon definitions.
>
> ?

I was thinking of CREATEd words and whether or not protecting the
dictionary headers for such words could cause problems for subsequently
defined words which call the earlier words. It seems that as long as
each dictionary header is protected after the corresponding word is
protected (relocated) there shouldn't arise a problem with incorrect
references to header entry fields, referenced by subsequent code.

>> If one were to protect all headers, there may be issues with relocation
>> affecting previously compiled code, such as with DEFERred words.
>
> It's unclear to me how relocation comes in here, but for DEFERed words
> the thing that IS changes is not part of the header, just like for a
> VALUE the thing the TO changes is not part of the header. And of
> course you would need one indirection to get from the header to the
> data in these cases.

As long as the indirection is not bypassed by a compiler, there
shouldn't be a problem.

>> I
>> haven't thought through this problem enough yet to say with certainty
>> that all headers can be protected as read-only. It may be highly
>> system-dependent.
>
> If you put in enough effort, they can.

I agree that a Forth system architecture which provides memory
protection for dictionary headers, non-native executable code of colon definitions, and for native code of CODE words/ordinary definitions is possible.

>> I expect that your use of memory segments in Gforth should simplify the
>> problem of placing the threaded code in write-protected segments.
>
> Yes, one can use sections for that purpose, but there is still a
> substantial amount of changes to make.

I'm taking a cautious approach, focusing on protecting the non-native (tokenized) executable code for colon definitions for now, and seeing if
there are any issues which come up. If there aren't any significant
issues, then I can tackle dictionary header protection.

>> Well, such bugs may be occurring more often than you realize. Such bugs
>> often don't have immediate consequences. I can run Forth code which
>> works perfectly fine because it hasn't made use of corrupt parts of the
>> system, but if such corruption has occurred I often find out when I type
>> BYE and then see a Seg Fault as memory is freed while the application is
>> terminating. This tells me to go back and find the bugs in my
>> application Forth code.
>
> Apart from your use as a debugging tool, freeing before exit()ing the
> process is a waste of time.

My recollection is that, in Linux, it wasn't always the case that the OS cleaned up dynamically allocated memory for an application after
termination -- that was the original reason for freeing memory prior to
exit(). In almost all of my use cases, the exit() time is ignorable/not noticeable and, thus, there was no need to remove the memory freeing
step. But it has the highly useful benefit of warning me via a seg fault
that my session was corrupt and that any results from the session should
be checked after fixing the bug(s) causing the corruption.

--
Krishna


From Krishna Myneni@21:1/5 to Krishna Myneni on Sun Aug 7 17:41:24 2022

On 8/6/22 10:56, Krishna Myneni wrote:
...

> The details of PROTECT-DEF are, of course, Forth system and OS
> system-dependent. The source code for protect.4th is posted at
>
> https://github.com/mynenik/kForth-64/blob/master/forth-src/protect.4th

I revised protect.4th to provide general purpose data buffer protection,
e.g. create write-protected data tables. In addition to PROTECT-DEF for
write protecting the executable data of colon definitions, PROTECT-DATA
may be used to protect a data buffer of constant values,

PROTECT-DATA ( aold u -- anew u )

aold is the address of the existing read/write-able data buffer and u is
its size in bytes. PROTECT-DATA copies the constant data into a
read-only buffer of the same size. Currently there is a size limit of 1K
bytes for the constant data buffer -- the memory management code in
protect.4th may be revised to overcome this limit. PROTECT-DATA does not
tamper with the original data buffer.

An example (in kForth) where this is useful is in the checksum program,
sha512.4th, which uses a table of constants for computing the 512-bit
checksum of a data buffer.

-----
include ans-words
include strings
include modules
include utils
include ssd
include protect

\ Hash constant words K for SHA-512
HEX
428A2F98D728AE22 7137449123EF65CD B5C0FBCFEC4D3B2F E9B5DBA58189DBBC
3956C25BF348B538 59F111F1B605D019 923F82A4AF194F9B AB1C5ED5DA6D8118
D807AA98A3030242 12835B0145706FBE 243185BE4EE4B28C 550C7DC3D5FFB4E2
72BE5D74F27B896F 80DEB1FE3B1696B1 9BDC06A725C71235 C19BF174CF692694
E49B69C19EF14AD2 EFBE4786384F25E3 0FC19DC68B8CD5B5 240CA1CC77AC9C65
2DE92C6F592B0275 4A7484AA6EA6E483 5CB0A9DCBD41FBD4 76F988DA831153B5
983E5152EE66DFAB A831C66D2DB43210 B00327C898FB213F BF597FC7BEEF0EE4
C6E00BF33DA88FC2 D5A79147930AA725 06CA6351E003826F 142929670A0E6E70
27B70A8546D22FFC 2E1B21385C26C926 4D2C6DFC5AC42AED 53380D139D95B3DF
650A73548BAF63DE 766A0ABB3C77B2A8 81C2C92E47EDAEE6 92722C851482353B
A2BFE8A14CF10364 A81A664BBC423001 C24B8B70D0F89791 C76C51A30654BE30
D192E819D6EF5218 D69906245565A910 F40E35855771202A 106AA07032BBD1B8
19A4C116B8D2D0C8 1E376C085141AB53 2748774CDF8EEB99 34B0BCB5E19B48A8
391C0CB3C5C95A63 4ED8AA4AE3418ACB 5B9CCA4F7763E373 682E6FF3D6B2B8A3
748F82EE5DEFB2FC 78A5636F43172F60 84C87814A1F0AB72 8CC702081A6439EC
90BEFFFA23631E28 A4506CEBDE82BDE9 BEF9A3F7B2C67915 C67178F2E372532B
CA273ECEEA26619C D186B8C721C0C207 EADA7DD6CDE0EB1E F57D4F7FEE6ED178
06F067AA72176FBA 0A637DC5A2C898A6 113F9804BEF90DAE 1B710B35131C471B
28DB77F523047D84 32CAAB7B40C72493 3C9EBE0A15C9BEBC 431D67C49C100D4C
4CC5D4BECB3E42B6 597F299CFC657E2A 5FCB6FAB3AD6FAEC 6C44198C4A475817
50 table K512[]

\ Read-only version of K512[]
K512[] 50 cells Protect-Data drop constant K512P[]

cr .( Base is HEX ) cr
-----

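The word TABLE is defined in utils.4th -- it creates the ordinary data
buffer, the address of which is returned by the word K512[]. To make a
read-only version of this data buffer, PROTECT-DATA is applied to it and
the address of the read-only copy is defined as the constant K512P[].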

Example of using protect.4th (in kForth-64)
-----
include protect-ex1

Base is HEX
ok
K512P[] @ .
428A2F98D728AE22 ok

K512P[] 8 cells + @ u.
D807AA98A3030242 ok
ok
0 K512[] ! \ the original buffer still exists and we can modify it.
ok
0 K512P[] ! \ attempt to modify the read-only version
Segmentation fault (core dumped)
$
-----

One problem with the current approach is a segmentation fault on
executing BYE, because the cleanup code executed upon BYE tries to free
the new byte code memory. This is why a protection flag is needed in the
dictionary header, which involves changes to the source code for the
Forth system. However, these are relatively simple changes to kForth.
...

Unlike protecting the executable code of colon definitions, PROTECT-DATA
does not require any changes to the internals of the Forth system, so it
is immediately useful for creating read-only data buffers. There is no segmentation fault upon exiting the Forth system when using
PROTECT-DATA, because K512P[] is simply a constant, and there is no
attempt to free the read-only buffer.
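
The same pattern should carry over to any small constant table. What
follows is only a minimal sketch: it assumes the same include files as
the sha512 example and that TABLE and PROTECT-DATA behave as they do for
K512[]. The names SMALL[] and SMALLP[] are hypothetical, and the item
left by PROTECT-DATA is simply dropped, as in the K512P[] line.

```forth
\ Sketch only: mirrors the K512[] / K512P[] example above.
\ SMALL[] and SMALLP[] are hypothetical names for illustration.
include ans-words
include strings
include modules
include utils
include ssd
include protect

DECIMAL
\ Four constants, well under the 1K byte limit for protected data.
10 20 30 40  4 table small[]       \ ordinary read/write table (TABLE is from utils.4th)

\ Copy the table into a read-only buffer and keep its address.
small[] 4 cells Protect-Data drop constant smallP[]

smallP[] 2 cells + @ .             \ read the third entry from the read-only copy
```

As with K512P[], a store such as `0 smallP[] !` would be expected to
fault, while SMALL[] itself stays writable.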

--
Krishna

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From antispam@math.uni.wroc.pl@21:1/5 to Krishna Myneni on Mon Aug 8 01:07:29 2022

Krishna Myneni <krishna.myneni@ccreweb.org> wrote:

kForth uses a separate type stack to distinguish between ordinary
numbers and addresses. This has proven useful in flagging common
mistakes caused by incorrect stack manipulation. It is quite useful, I
think, in aiding beginning Forth programmers to reveal the source of the problem. The added complexity of the type stack gives a performance hit
of about 15% in kForth.

If you limit this to stack, then it may help beginners, but
will miss more "interesting" errors, like having a record
with numbers in some fields and xt's in other. To detect
access to wrong field you would need tags on _all_ data.

--
Waldek Hebisch


From albert@21:1/5 to krishna.myneni@ccreweb.org on Tue Aug 16 22:05:54 2022

In article <tconob$klc8$1@dont-email.me>,
Krishna Myneni <krishna.myneni@ccreweb.org> wrote:
<SNIP>

I agree that a Forth system architecture which provides memory
protection for dictionary headers, non-native executable code of colon
definitions, and for native code of CODE words/ordinary definitions is
possible.

Note that all this effort expended is for the case of defects in the
program. It is much more useful to prevent defects.
Making the architecture more complicated doesn't help for preventing defects.


Groetjes Albert
--
"in our communism country Viet Nam, people are forced to be
alive and in the western country like US, people are free to
die from Covid 19 lol" duc ha
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst


From S Jack@21:1/5 to none albert on Wed Aug 17 05:42:27 2022

On Tuesday, August 16, 2022 at 3:05:59 PM UTC-5, none albert wrote:

In article <tconob$klc8$1...@dont-email.me>,
Krishna Myneni <krishna...@ccreweb.org> wrote:

Note that all this effort expended is for the case of defects in the
program. It is much more useful to prevent defects.

I don't discount hardening of "perfect" code not because the code may
not be perfect but because hardware in the field under stress doesn't
always follow the code.
--
me


From Hans Bezemer@21:1/5 to Krishna Myneni on Wed Aug 17 08:27:14 2022

On Saturday, August 6, 2022 at 5:06:15 AM UTC+2, Krishna Myneni wrote:

Well, 4tH doesn't have that issue.

: foo 0 ;
' foo execute . \ same as typing FOO
0 ' foo ! \ store a zero at the execution address for FOO
foo

Compiles to:

4tH message: No errors at word 10
Object size: 10 words
String size: 0 chars
Variables : 0 cells
Strings : 0 chars
Symbols : 1 names
Reliable : Yes

Addr| Opcode   Operand  Argument

   0| branch   2        foo
   1| literal  0
   2| exit     0
   3| literal  0
   4| execute  0
   5| .        0
   6| literal  0
   7| literal  0
   8| !        0
   9| call     0        foo

And executes: "0 Executing; Word 8: Bad variable"
Where of course, the leading zero is generated by the program.

What ' returns is the address of "foo" in the Code Segment, but "!" treats this as an address in the Integer Segment.
Unfortunately, that address (0) corresponds to a location in the Integer Segment that is protected and
cannot be overwritten. Hence, it is a "bad variable".

4tH segmentation has been there since its very inception - and some segments are r/o (like the code- or the string
segment), some are partially r/o (system vars in the integer segment) and some are completely r/w (like the
character segment). Every access to these segments is either closely guarded or just impossible, since there
are no words to write anything there.

I don't have to go into "bar", because that one is equally impossible.

Maybe one could design an equally segmented Forth compiler, idunno. Never tried. But it *has* worked for me the last
30-odd years.

Hans Bezemer


From Krishna Myneni@21:1/5 to albert on Thu Aug 18 08:30:42 2022

On 8/16/22 15:05, albert wrote:

In article <tconob$klc8$1@dont-email.me>,
Krishna Myneni <krishna.myneni@ccreweb.org> wrote:
<SNIP>

I agree that a Forth system architecture which provides memory
protection for dictionary headers, non-native executable code of colon
definitions, and for native code of CODE words/ordinary definitions is
possible.

Note that all this effort expended is for the case of defects in the
program. It is much more useful to prevent defects.

Write protecting the virtual threaded code using low-level OS methods is
a means of *detecting* program defects which corrupt the Forth system's
code. Otherwise, a defective word may corrupt a part of the Forth system
for which the consequences may not be readily apparent when executing
words. With low level memory protection of the virtual threaded code,
such corruption becomes immediately obvious.

Making the architecture more complicated doesn't help for preventing defects.

The people who write link loaders may disagree with you -- such
protection usually exists for native code on desktop systems -- the
suggestion here is to extend memory protection to virtual threaded code
on the same type of systems.

--
Krishna


From Krishna Myneni@21:1/5 to antispam@math.uni.wroc.pl on Thu Aug 18 08:43:55 2022

On 8/7/22 20:07, antispam@math.uni.wroc.pl wrote:

Krishna Myneni <krishna.myneni@ccreweb.org> wrote:

kForth uses a separate type stack to distinguish between ordinary
numbers and addresses. This has proven useful in flagging common
mistakes caused by incorrect stack manipulation. It is quite useful, I
think, in aiding beginning Forth programmers to reveal the source of the
problem. The added complexity of the type stack gives a performance hit
of about 15% in kForth.

If you limit this to stack, then it may help beginners, but
will miss more "interesting" errors, like having a record
with numbers in some fields and xt's in other. To detect
access to wrong field you would need tags on _all_ data.

The programmer must keep track of which member fields of a structure are addresses (pointers) and which are not when accessing them through fetch operators. kForth provides distinct fetch operators for single cell
non-address values and for addresses: @ for non-address values, and A@
for address values. This allows the value to be type-tagged when it is
fetched onto the data stack and for subsequent errors to be caught. For example, the sequences "@ @" and "@ A@" will result in a virtual machine
error, while "A@ @" or "A@ A@" are legal. It is up to the programmer to
use @ and A@ appropriately when accessing member fields of a structure.
kForth does not catch usage errors when a field is accessed, but such
errors are likely to result in the VM reporting a type mismatch error
further down the execution chain.
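
A minimal sketch of the distinction, for kForth only, since A@ is not a
standard word. The variable names X and P are hypothetical and stand in
for a number field and a pointer field of a structure.

```forth
\ Sketch for kForth: A@ tags the fetched cell as an address, @ does not.
\ X and P are hypothetical names used only for illustration.
variable x          \ holds an ordinary number
variable p          \ holds an address (a "pointer field")

42 x !              \ store a number in x
x p !               \ store the address of x in p

p a@ @ .            \ "A@ @": fetch p as an address, then fetch the number at x

\ p @ @             \ "@ @": per the description above, the second @ sees a
                    \ non-address on the stack and the VM reports an error
```

The commented-out line is the case the type stack is meant to catch: the
cell fetched by @ is tagged as a number, so using it as an address is
flagged rather than silently dereferenced.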

--
Krishna


From dxforth@21:1/5 to Krishna Myneni on Fri Aug 19 10:19:35 2022

On 18/08/2022 23:30, Krishna Myneni wrote:

On 8/16/22 15:05, albert wrote:

In article <tconob$klc8$1@dont-email.me>,
Krishna Myneni <krishna.myneni@ccreweb.org> wrote:
<SNIP>

I agree that a Forth system architecture which provides memory
protection for dictionary headers, non-native executable code of colon
definitions, and for native code of CODE words/ordinary definitions is
possible.

Note that all this effort expended is for the case of defects in the
program. It is much more useful to prevent defects.

Write protecting the virtual threaded code using low-level OS methods is
a means of *detecting* program defects which corrupt the Forth system's
code. Otherwise, a defective word may corrupt a part of the Forth system
for which the consequences may not be readily apparent when executing
words. With low level memory protection of the virtual threaded code,
such corruption becomes immediately obvious.

Lack of checking in general should mean Forth applications are the most
unreliable there are. Yet reports I've seen suggest the opposite is true.
Working 'closer to the metal' I believe Forth programmers are in a better
position to know what can go wrong. In contrast, programmers in other
languages rely on the compiler to tell them what they're doing is wrong.


From Andy Valencia@21:1/5 to dxforth on Thu Aug 18 18:01:18 2022

dxforth <dxforth@gmail.com> writes:

Lack of checking in general should mean Forth applications are the most
unreliable there are. Yet reports I've seen suggest the opposite is true.
Working 'closer to the metal' I believe Forth programmers are in a better
position to know what can go wrong. In contrast, programmers in other
languages rely on the compiler to tell them what they're doing is wrong.

I'll go ahead and admit it: the hardest bugs to find I've ever written
were in ForthOS. Next was VSTa (in C), and then downward from there.
I think Golang let me write the hairiest performance intensive code
while still hitting reliability with little effort.

But, admittedly, it wasn't OS kernel code. Nor was the Python code
which was not far away from Golang in ease, though its performance
and scalability are a pathetic shadow of Golang.

Andy Valencia
Home page: https://www.vsta.org/andy/
To contact me: https://www.vsta.org/contact/andy.html


From albert@21:1/5 to vandys@vsta.org on Fri Aug 19 12:06:36 2022

In article <166087087894.31034.14766942655302290779@media.vsta.org>,
Andy Valencia <vandys@vsta.org> wrote:

dxforth <dxforth@gmail.com> writes:

Lack of checking in general should mean Forth applications are the most
unreliable there are. Yet reports I've seen suggest the opposite is true.
Working 'closer to the metal' I believe Forth programmers are in a better
position to know what can go wrong. In contrast, programmers in other
languages rely on the compiler to tell them what they're doing is wrong.

I'll go ahead and admit it: the hardest bugs to find I've ever written
were in ForthOS. Next was VSTa (in C), and then downward from there.
I think Golang let me write the hairiest performance intensive code
while still hitting reliability with little effort.

But, admittedly, it wasn't OS kernel code. Nor was the Python code
which was not far away from Golang in ease, though its performance
and scalability are a pathetic shadow of Golang.

Note how a defect (bug?) in ForthOS doesn't profit from an elaborate
protection scheme, because for this type of software this is not
in place yet.


Groetjes Albert


From albert@21:1/5 to sdwjack69@gmail.com on Fri Aug 19 12:50:08 2022

In article <cf38725e-5d33-4792-a9ed-2280e9b573ean@googlegroups.com>,
S Jack <sdwjack69@gmail.com> wrote:

On Tuesday, August 16, 2022 at 3:05:59 PM UTC-5, none albert wrote:

In article <tconob$klc8$1...@dont-email.me>,
Krishna Myneni <krishna...@ccreweb.org> wrote:

Note that all this effort expended is for the case of defects in the
program. It is much more useful to prevent defects.

I don't discount hardening of "perfect" code not because the code may
not be perfect but because hardware in the field under stress doesn't
always follow the code.

I agree. However retrofitting these methods on Forth I consider
doubtful.
I would rather use strict languages that have safety built in,
possibly augmented with a periodic CRC check of code, and of course
parity memory and watchdog timers.
Modern languages in this vein are Ada and Go.

My first Algol60 experience had two errors:
array index out of bounds
memory exhausted
In each case you were given chapter and verse where the error occurred.
('Memory exhausted' was more often than not caused by infinite recursion.
You had to know that.)


Groetjes Albert


From Krishna Myneni@21:1/5 to dxforth on Fri Aug 19 08:11:30 2022

On 8/18/22 19:19, dxforth wrote:

On 18/08/2022 23:30, Krishna Myneni wrote:

On 8/16/22 15:05, albert wrote:

In article <tconob$klc8$1@dont-email.me>,
Krishna Myneni <krishna.myneni@ccreweb.org> wrote:
<SNIP>

I agree that a Forth system architecture which provides memory
protection for dictionary headers, non-native executable code of colon
definitions, and for native code of CODE words/ordinary definitions is
possible.

Note that all this effort expended is for the case of defects in the
program. It is much more useful to prevent defects.

Write protecting the virtual threaded code using low-level OS methods
is a means of *detecting* program defects which corrupt the Forth
system's code. Otherwise, a defective word may corrupt a part of the
Forth system for which the consequences may not be readily apparent
when executing words. With low level memory protection of the virtual
threaded code, such corruption becomes immediately obvious.

Lack of checking in general should mean Forth applications are the most
unreliable there are. Yet reports I've seen suggest the opposite is true.
Working 'closer to the metal' I believe Forth programmers are in a better
position to know what can go wrong. In contrast, programmers in other
languages rely on the compiler to tell them what they're doing is wrong.

The discussion up to now is unrelated to compiler features -- it's about
the Forth system design enabling detection of coding errors. In the case
of the compiler, the Forth language does not provide strict syntax rules
and strong typing to allow for compiler checking, to the extent of other languages. Perhaps this does make for better programmers in the long run through a trial by fire experience -- such claims made here on c.l.f
appear to be purely anecdotal and if there's hard evidence for working
Forth programmers producing more robust code it would certainly be
interesting to see. However, to the extent that a Forth system or
compiler can aid in detection and reporting of program errors, I fail to
see how that's a bad thing.

--
Krishna

