On Tue, 25 Apr 2023, Michael Schmitz wrote:
As to a cause for the corruption: all the calculations in setup_frame...
and sys_sigreturn use fsize, but get_sigframe() masks off the result
of usp - sizeof(sigframe) - fsize to place the entire frame at a
quadword boundary.
I wonder if we are seeing some fallout from the issue described in do_page_fault() i.e. usp is unreliable.
/* Accessing the stack below usp is always a bug. The
"+ 256" is there due to some instructions doing
pre-decrement on the stack and that doesn't show up
until later. */
if (address + 256 < rdusp())
goto map_err;
Maybe we should try modifying get_sigframe() to increase the gap between
the signal and exception frames from 0-1 long words up to 64-65 long
words.
On Tue, 25 Apr 2023, Finn Thain wrote:
On Tue, 25 Apr 2023, Michael Schmitz wrote:
As to a cause for the corruption: all the calculations in setup_frame...
and sys_sigreturn use fsize, but get_sigframe() masks off the result
of usp - sizeof(sigframe) - fsize to place the entire frame at a
quadword boundary.
I wonder if we are seeing some fallout from the issue described in
do_page_fault() i.e. usp is unreliable.
/* Accessing the stack below usp is always a bug. The
"+ 256" is there due to some instructions doing
pre-decrement on the stack and that doesn't show up
until later. */
if (address + 256 < rdusp())
goto map_err;
Maybe we should try modifying get_sigframe() to increase the gap between
the signal and exception frames from 0-1 long words up to 64-65 long
words.
It turns out that doing so (patch below) does make the problem go away.
Was the exception frame getting clobbered?
diff --git a/arch/m68k/kernel/signal.c b/arch/m68k/kernel/signal.c
index b9f6908a31bc..94104699f5a8 100644
--- a/arch/m68k/kernel/signal.c
+++ b/arch/m68k/kernel/signal.c
@@ -862,7 +862,7 @@ get_sigframe(struct ksignal *ksig, size_t frame_size)
{
unsigned long usp = sigsp(rdusp(), ksig);
- return (void __user *)((usp - frame_size) & -8UL);
+ return (void __user *)((usp - 256 - frame_size) & -8UL);
}
static int setup_frame(struct ksignal *ksig, sigset_t *set,
Maybe we should try modifying get_sigframe() to increase the gap between >>> the signal and exception frames from 0-1 long words up to 64-65 long
words.
It turns out that doing so (patch below) does make the problem go away.
Was the exception frame getting clobbered?
Might happen, if the frame gap isn't actually equal to the exception
frame extra size anymore? Aligning the start of the signal frame to the
next lower quadword boundary increases the gap size.
When setting up the sigframe, the extra is copied to the correct
location (right past struct sigframe, or into uc_filler). When moving
that exception frame into place, the assumption is that the gap is the
extra size, not more.
I'll try dropping the quadword alignment constraint - the return
trampoline still ought to remain longword aligned.
On Tue, 25 Apr 2023, Finn Thain wrote:...
I wonder if we are seeing some fallout from the issue described in
do_page_fault() i.e. usp is unreliable.
/* Accessing the stack below usp is always a bug. The
"+ 256" is there due to some instructions doing
pre-decrement on the stack and that doesn't show up
until later. */
if (address + 256 < rdusp())
goto map_err;
Maybe we should try modifying get_sigframe() to increase the gap between
the signal and exception frames from 0-1 long words up to 64-65 long
words.
It turns out that doing so (patch below) does make the problem go away.
Was the exception frame getting clobbered?
diff --git a/arch/m68k/kernel/signal.c b/arch/m68k/kernel/signal.c
index b9f6908a31bc..94104699f5a8 100644
--- a/arch/m68k/kernel/signal.c
+++ b/arch/m68k/kernel/signal.c
@@ -862,7 +862,7 @@ get_sigframe(struct ksignal *ksig, size_t frame_size)
{
unsigned long usp = sigsp(rdusp(), ksig);
- return (void __user *)((usp - frame_size) & -8UL);
+ return (void __user *)((usp - 256 - frame_size) & -8UL);
}
static int setup_frame(struct ksignal *ksig, sigset_t *set,
It turns out that doing so (patch below) does make the problem go away.
Was the exception frame getting clobbered?
diff --git a/arch/m68k/kernel/signal.c b/arch/m68k/kernel/signal.c
index b9f6908a31bc..94104699f5a8 100644
--- a/arch/m68k/kernel/signal.c
+++ b/arch/m68k/kernel/signal.c
@@ -862,7 +862,7 @@ get_sigframe(struct ksignal *ksig, size_t frame_size)
{
unsigned long usp = sigsp(rdusp(), ksig);
- return (void __user *)((usp - frame_size) & -8UL);
+ return (void __user *)((usp - 256 - frame_size) & -8UL);
On Apr 25 2023, Finn Thain wrote:
It turns out that doing so (patch below) does make the problem go away.Probably the issue is that a bus error exception should never start
Was the exception frame getting clobbered?
diff --git a/arch/m68k/kernel/signal.c b/arch/m68k/kernel/signal.c
index b9f6908a31bc..94104699f5a8 100644
--- a/arch/m68k/kernel/signal.c
+++ b/arch/m68k/kernel/signal.c
@@ -862,7 +862,7 @@ get_sigframe(struct ksignal *ksig, size_t frame_size)
{
unsigned long usp = sigsp(rdusp(), ksig);
- return (void __user *)((usp - frame_size) & -8UL);
+ return (void __user *)((usp - 256 - frame_size) & -8UL);
signal delivery when returning to user space. On the 030 returning from
a bus error resumes the execution of the faulting insn (unlike the
040/060 which restart it), and the saved USP may have the original value
from before the insn started (modified registers may not be updated
until the insn is complete or just before the final bus cycle). Signal delivery should only ever happen at insn boundaries.
Hi Andreas,
On 25/04/23 23:25, Andreas Schwab wrote:
On Apr 25 2023, Finn Thain wrote:
It turns out that doing so (patch below) does make the problem go away.Probably the issue is that a bus error exception should never start
Was the exception frame getting clobbered?
diff --git a/arch/m68k/kernel/signal.c b/arch/m68k/kernel/signal.c
index b9f6908a31bc..94104699f5a8 100644
--- a/arch/m68k/kernel/signal.c
+++ b/arch/m68k/kernel/signal.c
@@ -862,7 +862,7 @@ get_sigframe(struct ksignal *ksig, size_t
frame_size)
{
unsigned long usp = sigsp(rdusp(), ksig);
- return (void __user *)((usp - frame_size) & -8UL);
+ return (void __user *)((usp - 256 - frame_size) & -8UL);
signal delivery when returning to user space. On the 030 returning from
a bus error resumes the execution of the faulting insn (unlike the
040/060 which restart it), and the saved USP may have the original value
from before the insn started (modified registers may not be updated
until the insn is complete or just before the final bus cycle). Signal
delivery should only ever happen at insn boundaries.
Thanks - we had seen evidence that a bus error generated
mid-instruction did leave the USP at the address where the bus fault
happened (not before the instruction started, neither what it would
have been once the instruction completed), and the operation did not
complete normally after the bus error (at least the value/address seen
in the exception frame not stored). Finn had also demonstrated that
skipping signal delivery on bus errors abolishes the stack
corruption. Your patch achieves the same objective in a different
way, so I'm sure this will work as well.
I had thought the 030 could resume the interrupted instruction using
the information from the exception frame - and that does appear to
work in all other cases except where signal delivery gets in the way,
and it also works if moving the exception frame a little bit further
down the stack. So our treatment of the bus error exception frame
during signal delivery appears to be incorrect. Wouldn't you agree?
Thanks - we had seen evidence that a bus error generated mid-instruction
did leave the USP at the address where the bus fault happened (not
before the instruction started, neither what it would have been once the instruction completed), and the operation did not complete normally
after the bus error (at least the value/address seen in the exception
frame not stored).
Finn had also demonstrated that skipping signal delivery on bus errors abolishes the stack corruption. Your patch achieves the same objective
in a different way, so I'm sure this will work as well.
I had thought the 030 could resume the interrupted instruction using the information from the exception frame - and that does appear to work in
all other cases except where signal delivery gets in the way, and it
also works if moving the exception frame a little bit further down the
stack. So our treatment of the bus error exception frame during signal delivery appears to be incorrect.
Please try this patch.
Signal delivery should only happen at insn boundaries, but due to the
way the 030 handles return from bus error
exceptions (the insn is resumed, not restarted like on the 040/060) the kernel may do it in the middle of the faulting insn.
diff --git a/arch/m68k/kernel/entry.S b/arch/m68k/kernel/entry.S
index 4dd2fd7acba9..6c09a5710728 100644
--- a/arch/m68k/kernel/entry.S
+++ b/arch/m68k/kernel/entry.S
@@ -117,7 +117,11 @@ ENTRY(buserr)
movel %sp,%sp@- | stack frame pointer argument
jbsr buserr_c
addql #4,%sp
- jra ret_from_exception
+ | don't do signal delivery when interrupted while insn is in progress
+ | (on the 020/030)
+ tstl %d0
+ jeq ret_from_exception
+ RESTORE_ALL
ENTRY(trap)
SAVE_ALL_INT
diff --git a/arch/m68k/kernel/traps.c b/arch/m68k/kernel/traps.c
index a700807c9b6d..40fc408d1333 100644
--- a/arch/m68k/kernel/traps.c
+++ b/arch/m68k/kernel/traps.c
@@ -751,8 +751,10 @@ static inline void access_errorcf(unsigned int fs, struct frame *fp)
}
#endif /* CONFIG_COLDFIRE CONFIG_MMU */
-asmlinkage void buserr_c(struct frame *fp)
+asmlinkage int buserr_c(struct frame *fp)
{
+ int not_insn_boundary = 0;
+
/* Only set esp0 if coming from user mode */
if (user_mode(&fp->ptregs))
current->thread.esp0 = (unsigned long) fp;
@@ -793,8 +795,9 @@ asmlinkage void buserr_c(struct frame *fp)
break;
#endif
#if defined (CPU_M68020_OR_M68030)
- case 0xa:
case 0xb:
+ not_insn_boundary = 1;
+ case 0xa:
bus_error030 (fp);
break;
#endif
@@ -803,6 +806,8 @@ asmlinkage void buserr_c(struct frame *fp)
pr_debug("Unknown SIGSEGV - 4\n");
force_sig(SIGSEGV);
}
+
+ return not_insn_boundary;
}
On Wed, 26 Apr 2023, Michael Schmitz wrote:
Thanks - we had seen evidence that a bus error generated mid-instruction
did leave the USP at the address where the bus fault happened (not
before the instruction started, neither what it would have been once the
instruction completed), and the operation did not complete normally
after the bus error (at least the value/address seen in the exception
frame not stored).
I'm afraid I still don't fully understand how and why the user stack
(rather than the supervisor stack) gets used for processing the exception frame.
Finn had also demonstrated that skipping signal delivery on bus errors
abolishes the stack corruption. Your patch achieves the same objective
in a different way, so I'm sure this will work as well.
I had thought the 030 could resume the interrupted instruction using the
information from the exception frame - and that does appear to work in
all other cases except where signal delivery gets in the way, and it
also works if moving the exception frame a little bit further down the
stack. So our treatment of the bus error exception frame during signal
delivery appears to be incorrect.
Wouldn't that depend on the exception frame format? Perhaps it is unsafe
to treat any format 0xB exception frame in the way we do. If so, what do
we do about address error exceptions, which are to produce SIGBUS? The Programmers Reference Manual says "a long bus fault stack frame may be generated" in this case.
On 26/04/23 07:46, Michael Schmitz wrote:
I had thought the 030 could resume the interrupted instruction using
the information from the exception frame - and that does appear to
work in all other cases except where signal delivery gets in the way,
and it also works if moving the exception frame a little bit further
down the stack. So our treatment of the bus error exception frame
during signal delivery appears to be incorrect. Wouldn't you agree?
Inspection of the format b frame placed in the signal frame in both rt
and non-rt cases (at the time the signal handler runs) shows the
expected contents in the data output buffer, data fault address and ssw.
At that time, returning to user space with rte would correctly resume
the instruction execution. I had previously confirmed that the register contents saved in the rt signal frame is correct also.
That is with a kernel patched similar to above patch by Finn (using an
offset of 128 or 64 instead of 256).
On Wed, 26 Apr 2023, Michael Schmitz wrote:
On 26/04/23 07:46, Michael Schmitz wrote:
I had thought the 030 could resume the interrupted instruction using
the information from the exception frame - and that does appear to
work in all other cases except where signal delivery gets in the way,
and it also works if moving the exception frame a little bit further
down the stack. So our treatment of the bus error exception frame
during signal delivery appears to be incorrect. Wouldn't you agree?
Inspection of the format b frame placed in the signal frame in both rt
and non-rt cases (at the time the signal handler runs) shows the
expected contents in the data output buffer, data fault address and ssw.
At that time, returning to user space with rte would correctly resume
the instruction execution. I had previously confirmed that the register
contents saved in the rt signal frame is correct also.
That is with a kernel patched similar to above patch by Finn (using an
offset of 128 or 64 instead of 256).
That means things go awry during sys_sigreturn or sys_rt_sigreturn. I'm
not sure what happens to the exception frame:
1:
| stack contents now:
| [original pt_regs address] [original switch_stack address]
| [unused part of the gap] [moved switch_stack] [moved pt_regs]
| [replacement exception frame]
| return value of do_{rt_,}sigreturn() points to moved switch_stack.
movel %d0,%sp | discard the leftover junk
RESTORE_SWITCH_STACK
| stack contents now is just [syscall return address] [pt_regs] [frame]
| return pt_regs.d0
movel %sp@(PT_OFF_D0+4),%d0
rts
... but I noticed that the sys_rt_sigreturn entry point does SAVE_SWITCH_STACK while the sys_sigreturn entry point does not. Yet both
jump to label 1 above, so both syscalls do RESTORE_SWITCH_STACK. Hmmm.
.macro SAVE_SWITCH_STACK
moveml %a3-%a6/%d6-%d7,%sp@-
.endm
.macro RESTORE_SWITCH_STACK
moveml %sp@+,%a3-%a6/%d6-%d7
.endm
Well, that has to corrupt %a3, right?
In my copy of entry.S, sys_sigreturn has SAVE_SWITCH_STACK, just as sys_rt_sigreturn does.
It seems I got confused about user and kernel stack there myself. And
managed to confuse almost everyone else about this bug. Apologies for
the incessant noise.
What matters for the return from exception is an intact frame on the
kernel stack. Anything we do on the user stack (mucking around with the offset the sigframe is set up at, copying siginfo, ucontext or
sigcontext plus exception frame extra) does not change the kernel stack
one whit.
The mangle_kernel_stack stuff is needed because sys_sigreturn will place another exception frame on the kernel stack (a four word frame) that
needs to be replaced by the bus error exception frame (or any other
frame that caused the kernel mode entry prior to signal delivery) before finally returning from the bus error exception.
Only at that time will the movel instruction that took the bus fault
resume (and complete its writes correctly, I hope).
Our problem may be that, if we take the signal too late and our main
process inspects the stack that has been left partially saved only (due
to the bus error processing still in-flight), we appear to be in
trouble. After completing sys_sigreturn, everything will be OK.
I can see this cause the stack error in the test case. Not sure it also applies to the dash case ...
Wouldn't that depend on the exception frame format? Perhaps it is
unsafe to treat any format 0xB exception frame in the way we do. If
so, what do we do about address error exceptions, which are to produce SIGBUS? The Programmers Reference Manual says "a long bus fault stack
frame may be generated" in this case.
We don't handle access errors (beyond terminating the offending
process).
The exception frame is copied to the signal frame for informational
purposes only
If the long format frame was corrupted while on the user stack, the
partially completed MOVEM won't be resumed correctly. That's why I was concerned about a bug in sys_sigreturn.
Wouldn't that depend on the exception frame format? Perhaps it is
unsafe to treat any format 0xB exception frame in the way we do. If
so, what do we do about address error exceptions, which are to produce
SIGBUS? The Programmers Reference Manual says "a long bus fault stack
frame may be generated" in this case.
We don't handle access errors (beyond terminating the offending
process).
SIGBUS could be caught and handled, perhaps followed by a setcontext() or siglongjmp() rather than return. So I don't think we get to disallow frame format 0xB.
Hi Finn,Minimal gap that avoids the corruption for me (in your patch below) is
Am 25.04.2023 um 13:55 schrieb Finn Thain:
On Tue, 25 Apr 2023, Finn Thain wrote:
On Tue, 25 Apr 2023, Michael Schmitz wrote:
As to a cause for the corruption: all the calculations in setup_frame...
and sys_sigreturn use fsize, but get_sigframe() masks off the result
of usp - sizeof(sigframe) - fsize to place the entire frame at a
quadword boundary.
I wonder if we are seeing some fallout from the issue described in
do_page_fault() i.e. usp is unreliable.
/* Accessing the stack below usp is always a bug. The
"+ 256" is there due to some instructions doing
pre-decrement on the stack and that doesn't show up
until later. */
if (address + 256 < rdusp())
goto map_err;
Maybe we should try modifying get_sigframe() to increase the gap
between
the signal and exception frames from 0-1 long words up to 64-65 long
words.
It turns out that doing so (patch below) does make the problem go away.
Was the exception frame getting clobbered?
Might happen, if the frame gap isn't actually equal to the exception
frame extra size anymore? Aligning the start of the signal frame to
the next lower quadword boundary increases the gap size.
When setting up the sigframe, the extra is copied to the correct
location (right past struct sigframe, or into uc_filler). When moving
that exception frame into place, the assumption is that the gap is the
extra size, not more.
I'll try dropping the quadword alignment constraint - the return
trampoline still ought to remain longword aligned.
Cheers,
Michael
diff --git a/arch/m68k/kernel/signal.c b/arch/m68k/kernel/signal.c
index b9f6908a31bc..94104699f5a8 100644
--- a/arch/m68k/kernel/signal.c
+++ b/arch/m68k/kernel/signal.c
@@ -862,7 +862,7 @@ get_sigframe(struct ksignal *ksig, size_t
frame_size)
{
unsigned long usp = sigsp(rdusp(), ksig);
- return (void __user *)((usp - frame_size) & -8UL);
+ return (void __user *)((usp - 256 - frame_size) & -8UL);
}
static int setup_frame(struct ksignal *ksig, sigset_t *set,
Am 26.04.2023 um 16:42 schrieb Finn Thain:
If the long format frame was corrupted while on the user stack, the >partially completed MOVEM won't be resumed correctly. That's why I was >concerned about a bug in sys_sigreturn.
Yes, it turns out I hadn't read mangle_kernel_stack() carefully enough. I thought the exception frame had remained on the kernel stack to be restored, but I'd missed that it is actually being restored from the user stack copy
to the kernel stack.
Sysop: | Keyop |
---|---|
Location: | Huddersfield, West Yorkshire, UK |
Users: | 300 |
Nodes: | 16 (2 / 14) |
Uptime: | 106:13:09 |
Calls: | 6,700 |
Files: | 12,232 |
Messages: | 5,350,363 |