Moderator, Frank : I'm not sure if questions about the x87 FPU are permitted here. If not, then please just discard. If they are, then please remove this line. :-)
Hello all,
I've just been writing some basic code to parse a simple float, and realized that I had no idea how to check if the x87 FPU was empty after I was done - as a simple measure to check if my code cleaned up correctly.
I've been looking at using the ST bits in the FPU status word, but found that they (unexpectedly) didn't end at zero after I had done my thing:
minimal example:
fld1 ;Load
fld1
fstp st(2) ;Swap ST(0) and ST(1) <-- this is the culprit
fstp st(0) ;Discard
fstp st(0)
At this point all the ST bits are set, indicating a minus one, not zero.
My questions at this point are:
1) Have I done anything wrong in the above ? I don't think so, but "you never know" ....
2) How do I, for debugging purposes, check the FPU stack ?
Regards,
Rudy Wieser
Moderator, Frank : I'm not sure if questions about the x87 FPU are permitted here. If not, then please just discard. If they are, then please remove this line. :-)
Hi Rudy,
Consider the line removed.
I think x87 is on topic. If necessary, I so rule it. :)
I don't know the answer, though...
2) How do I, for debugging purposes, check the FPU stack ?
If your debugger doesn't support it
you can at least use FSAVE/FRSTOR to fill in a chunk
of data which you can then inspect.
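For illustration, a small Python sketch of how the leading words of such a dump could be inspected once it sits in memory. It assumes the 108-byte image that FSAVE/FNSAVE writes in 32-bit protected mode, where the control, status and tag words each occupy the low half of a 32-bit slot at offsets 0, 4 and 8. The function names are made up for this example.

```python
import struct

FSAVE32_SIZE = 108  # size of the 32-bit protected-mode FSAVE image


def parse_fsave_header(image):
    """Return the control, status and tag words of a 32-bit FSAVE image."""
    if len(image) < FSAVE32_SIZE:
        raise ValueError("a 32-bit FSAVE image is 108 bytes long")
    # Each 16-bit word is followed by 2 bytes of padding ("2x").
    control, status, tag = struct.unpack_from("<H2xH2xH", image, 0)
    return {"control": control, "status": status, "tag": tag}


def stack_is_empty(tag_word):
    """Two tag bits per physical register; 0b11 marks a register as empty."""
    return tag_word == 0xFFFF
```

A tag word of FFFFh means all eight registers are tagged empty. Since FSAVE re-initializes the FPU, the dump has to be followed by an FRSTOR from the same buffer if the calculation is to continue.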
DJ,
If your debugger doesn't support it
No debugger here (never liked them).
you can at least use FSAVE/FRSTOR to fill in a chunk
of data which you can then inspect.
Thanks. That one does give quite a bit of information.
It does have a drawback though: it re-initializes the FPU stack, meaning it cannot be used while in the middle of a calculation. Any idea for some non-destructive probing ?
2) How do I, for debugging purposes, check the FPU stack ?
but how did you check 1)
FCLEX or FXSAVE followed by FINIT work fine for me to clean up.
and FFREE/r is my way to empty a specific register.
I actually hate this stupid stack-up/dn design, an overall ST(n)
would work just fine with far fewer dubious quirks.
It does have a drawback though: it re-initializes the FPU stack, meaning it cannot be used while in the middle of a calculation. Any idea for some non-destructive probing ?
FXSAVE
2) How do I, for debugging purposes, check the FPU stack ?
You will need FSAVE/FRSTOR (and variants)
Your first FLD will clobber the stack top,
As another poster has said, I don't think the x87 automagically
sets value flags
Dump and examine in main memory.
Robert,
You will need FSAVE/FRSTOR (and variants)
Wolfgang gave that suggestion too. Alas, the F(N)SAVE resets
the FPU stack, and for some reason I can't get the FXSAVE to work
(my assembler shows its age by not knowing the opcode, and trying
to use a "db 0Fh,0AEH, ....." sequence crashes the program).
Your first FLD will clobber the stack top,
I don't get that - why only the first one, and why would
it clobber (the value at) the stack top ?
As another poster has said, I don't think the x87 automagically
sets value flags
I don't quite get this either. Value flags ? I'm reading the
"Status Word" and in it look at the ST bits (at 11-13).
Remark : I later found out/realized that the "Stack Top"
is just the starting offset for the ST(x) arguments. IOW :
what's in it isn't really relevant.
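As a minimal sketch of what that field is, the following Python pulls the three TOP bits (11-13) out of a status word and maps an ST(i) operand onto a physical register. The helper names are hypothetical; the status word itself would come from something like FNSTSW.

```python
def top_of_stack(status_word):
    """Extract the 3-bit TOP field (bits 11-13) from the x87 status word."""
    return (status_word >> 11) & 0b111


def physical_register(top, i):
    """Physical register addressed by ST(i): offset from TOP, wrapping at 8."""
    return (top + i) & 0b111
```

With all three bits set, `top_of_stack` returns 7. The value only says where ST(0) currently lives, not how many registers hold data.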
Dump and examine in main memory.
:-) The problem was that I had no idea that I could, or how to do that.
Of course it didn't help that I got confused by (and by it focussed on)
the "Stack Top" value. :-\
It might have some safeguards against executing data :)
I don't get that - why only the first one, and why would
it clobber (the value at) the stack top ?
The stack is eight FP registers, any load pushes the one
on the top into the bit bucket.
Actually, I believe the registers are a circular file,
and the load overwrites and decrements TOS.
Aren't those three bits (0-7) the Top-of-Stack pointer?
People sometimes compare the FPSW with the x86 flags register.
It is not.
34 years ago I wrote an extension to MS-DOS DEBUG.COM
to examine the x87.
Converting binary FP to decimal FP was hard.
and for some reason I can't get the FXSAVE to work (my assembler shows its age by not knowing the opcode, and trying to use a "db 0Fh,0AEH, ....." sequence crashes the program).
Robert,
It might have some safeguards against executing data :)
I've used the "trick" before, so I don't think so. Currently I'm torn between the possibilities that the processor I'm using might not have that instruction, that I'm simply bungling it up, or that there is some kind of memory alignment involved (the latter one would not be the first time I've run into it).
Is there any possibility you could take a look at and post what code gets generated for an "FXSAVE {register pointer}" ?
I don't get that - why only the first one, and why would
it clobber (the value at) the stack top ?
The stack is eight FP registers, any load pushes the one
on the top into the bit bucket.
True. But such a push would only clobber anything if the (circular) stack
is completely full.
Actually, I believe the registers are a circular file,
It has to be, as my example code works: after the second FLD1 the TOS is 6. But I can still execute an FSTP ST(2), which seemingly points at 6+2 = 8.
and the load overwrites and decrements TOS.
The documentation for, for instance, FLD mentions decrementing first, then storing
(which is why I didn't understand your "clobbering" remark).
Aren't those three bits (0-7) the Top-of-Stack pointer?
Yep. I was assuming that that value would (implicitly) tell me how many values were placed on the stack. Turns out it doesn't. :-\
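To make the difference between TOP and the number of occupied registers concrete, here is a toy Python model of the register file. It is only a sketch: the class and method names are invented, and exceptions (stack overflow/underflow) are deliberately ignored. A load decrements TOP and then stores; a store-and-pop writes ST(i), marks the old ST(0) empty and increments TOP.

```python
class X87StackModel:
    """Toy model of the x87 register file: TOP pointer plus empty/valid tags."""

    def __init__(self):
        self.top = 0                # state after FINIT
        self.regs = [0.0] * 8
        self.empty = [True] * 8     # one tag per physical register

    def fld(self, value):
        # Decrement TOP first (wrapping at 8), then store into the new ST(0).
        self.top = (self.top - 1) & 7
        self.regs[self.top] = value
        self.empty[self.top] = False

    def fstp(self, i):
        # Copy ST(0) into ST(i), then pop: tag old ST(0) empty, increment TOP.
        dest = (self.top + i) & 7
        self.regs[dest] = self.regs[self.top]
        self.empty[dest] = False
        self.empty[self.top] = True
        self.top = (self.top + 1) & 7

    def depth(self):
        """Number of registers currently tagged as holding a value."""
        return self.empty.count(False)
```

After two `fld(1.0)` calls the model has `top == 6` and `depth() == 2`, which matches the TOS value reported in this thread. A following `fstp(2)` writes physical register 0 (6+2 wrapped) and leaves two registers in use. Whether the stack is empty is answered by the tags, not by TOP.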
People sometimes compare the FPSW with the x86 flags register.
It is not.
Similar perhaps (both contain status flags), but (of course) not the same.
34 years ago I wrote an extension to MS-DOS DEBUG.COM
to examine the x87.
I'm not sure what you mean with an 'extension' (wasn't aware that Debug supported such a thing), but years ago I wrote something for it (using
memory patching) so it could deal with a few more opcodes.
Converting binary FP to decimal FP was hard.
That's something I still have to take a look at. Just not at this moment.
You still haven't told us what OS (DOS, Windoze, Linux) or CPU (32/64 bit) you're running this code on....
David Lindauer's GRDB (DOS) can show the contents of FPU registers
// save area must be aligned on 64-byte boundary...
{ xsave [edi] } db $0f,$ae,$27
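A sketch of how a save area could be aligned by hand when the assembler or allocator gives no guarantee. XSAVE needs a 64-byte aligned area, while FXSAVE needs 16-byte alignment. The helper name and the idea of over-allocating a buffer and rounding its address up are assumptions made for illustration.

```python
def align_up(address, alignment):
    """Round address up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a power of two")
    return (address + alignment - 1) & ~(alignment - 1)
```

For a 512-byte FXSAVE area one would reserve 512 + 15 bytes and pass `align_up(buffer_address, 16)` to the instruction; for XSAVE the same idea applies with 64.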
On 07.08.2021 17:51, R.Wieser wrote:
...
and for some reason I can't get the FXSAVE to work (my assembler shows its age by not knowing the opcode, and trying to use a "db 0Fh,0AEH, ....." sequence crashes the program).
on older CPUs 0F AE xx will raise exception 6 [illegal opcode] if:
1) bit 5 of xx is 1 (xx 20..3F, 60..7F, A0..BF)
newer CPU may show a few valid instructions (see sandpile.org)
2) mod=3 aka register operand (C0..FF) [memory only!]
3) may raise EXC_6 if not supported
0F AE 90..97 98..9F mean LDMXCSR STMXCSR [support specific]
so I'd recommend either
0F AE 06 00 xx FXSAVE [xx00h] (needs 512 byte DS: buffer !)
or shorter
0F AE 00 FXSAVE [bx+si] (ditto)
or HLL styled :)
0F AE 46 00 FXSAVE [bp+0] (needs 512 byte on SS: stack)
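The byte after 0F AE is a ModR/M byte whose middle three bits (the reg field) select the instruction. The Python sketch below splits such a byte and names the memory-operand instruction it selects. The table reflects the documented /0../7 assignments of this opcode group; the function names are made up, and register forms (mod=3) are simply reported as such.

```python
GROUP_0F_AE = {
    0: "FXSAVE", 1: "FXRSTOR", 2: "LDMXCSR", 3: "STMXCSR",
    4: "XSAVE", 5: "XRSTOR", 6: "XSAVEOPT", 7: "CLFLUSH",
}


def split_modrm(byte):
    """Split a ModR/M byte into its (mod, reg, rm) fields."""
    return (byte >> 6) & 3, (byte >> 3) & 7, byte & 7


def name_0f_ae(modrm):
    """Name the 0F AE instruction selected by a ModR/M byte (memory forms only)."""
    mod, reg, _rm = split_modrm(modrm)
    if mod == 3:
        return "register form (not a memory save instruction)"
    return GROUP_0F_AE[reg]
```

Here `name_0f_ae(0x07)`, `name_0f_ae(0x00)` and `name_0f_ae(0x46)` all give FXSAVE, while `name_0f_ae(0x27)` gives XSAVE because its reg field is 4. On a CPU without XSAVE the latter would raise the illegal-opcode exception.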
so I'd recommend either...
or shorter
0F AE 00 FXSAVE [bx+si] (ditto)
For testing purposes I tend to go with the most basic one first, so I took that one.
Remark: I'm on XP 32 bit, so the registers are EBX and ESI respectively.
Alas, same problem : crash.
Aligning [EBX+ESI] on a 64 byte boundary (as suggested by Robert Prins) did not make a difference.
I'm starting to lean towards the possibility that the command is refused (does not exist). Is there any way to check it ?
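One way to check is CPUID: leaf 1 reports FXSAVE/FXRSTOR support in bit 24 (FXSR) of EDX. Python cannot execute CPUID itself, so the sketch below only shows the bit test, assuming the EDX value was obtained elsewhere (for example by a few lines of assembly that run CPUID with EAX=1). The function name is hypothetical.

```python
FXSR_BIT = 24  # CPUID.01H:EDX bit 24 = FXSAVE/FXRSTOR supported


def supports_fxsave(cpuid1_edx):
    """True if the EDX value returned by CPUID leaf 1 has the FXSR bit set."""
    return bool((cpuid1_edx >> FXSR_BIT) & 1)
```

If the bit is clear, the instruction should be expected to raise the illegal-opcode exception mentioned earlier in the thread.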
you seem to work with 32 bit:
0F AE 07 FXSAVE [edi]
you used 27, so I was confused and had to look at my AMD docs,
you seem to work with 32 bit:
I am. Didn't think it would matter much.
0F AE 07 FXSAVE [edi]
I just tried that one, and it worked ! (got 288 bytes of data though, not 512) As a result I'm now thoroughly confused in regard to the mod, reg, r/m encoding. I tried different ones, but only got crashes.
you used 27, so I was confused and had to look at my AMD docs,
That value was suggested by Robert (in his code). And as I didn't get anywhere ...
Oh blimey - I don't know how I did it, but I just noticed that I somehow mixed up the 16 and 32-bit mod/reg/rm encodings. With the MOD and REG both being zero, the registers targeted by R/M are rather different between them. :-|
Bottom line: I made a stupid mistake, created non-working code and got myself confused as a result. And as I presumptuously forgot to mention the basics of what I was busy with (32-bit coding), I didn't really help you guys find the cause of it. My apologies for that.
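The mix-up can be made visible with the two R/M tables side by side. This Python sketch covers only the mod=00 case discussed here; the table contents follow the standard 16-bit and 32-bit ModR/M addressing forms, while the function name is made up for the example.

```python
# Meaning of the R/M field when mod = 00, per address size.
RM16_MOD0 = ["[bx+si]", "[bx+di]", "[bp+si]", "[bp+di]",
             "[si]", "[di]", "[disp16]", "[bx]"]
RM32_MOD0 = ["[eax]", "[ecx]", "[edx]", "[ebx]",
             "[SIB byte follows]", "[disp32]", "[esi]", "[edi]"]


def operand_mod0(modrm, address_size):
    """Describe the memory operand of a mod=00 ModR/M byte."""
    if (modrm >> 6) != 0:
        raise ValueError("this sketch only handles mod = 00")
    table = {16: RM16_MOD0, 32: RM32_MOD0}[address_size]
    return table[modrm & 7]
```

So the byte 00 means `[bx+si]` in 16-bit code but `[eax]` in 32-bit code, and 07 means `[bx]` versus `[edi]`. An FXSAVE encoded from the 16-bit table therefore writes through whatever happens to be in a different register when run as 32-bit code.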
I didn't really help you guys find the cause of it. My apologies for that.
IIRC we got 288 bytes with FSAVE long, 512 bytes may be just the required buffer size.
I was once there as well :) experience can't be bought!
just fine that we could help,
no need for apology.
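On the 288 versus 512 bytes: the FXSAVE area is always 512 bytes, but in 32-bit mode the defined content (header, ST0-ST7, XMM0-XMM7) ends at offset 288 and the rest is reserved, which may be why only 288 bytes of data showed up. The Python sketch below reads the x87 part of such an image. The field offsets follow the documented layout (FCW at 0, FSW at 2, abridged tag byte at 4, ST(i) at 32 + 16*i); the function and key names are invented for illustration.

```python
import struct

FXSAVE_SIZE = 512


def parse_fxsave(image):
    """Decode the x87 fields of a 512-byte FXSAVE image."""
    if len(image) != FXSAVE_SIZE:
        raise ValueError("an FXSAVE image is 512 bytes long")
    fcw, fsw, ftw = struct.unpack_from("<HHB", image, 0)
    # ST(0)..ST(7) in stack order, 10 significant bytes in each 16-byte slot.
    st = [bytes(image[32 + 16 * i: 32 + 16 * i + 10]) for i in range(8)]
    return {
        "fcw": fcw,
        "fsw": fsw,
        "top": (fsw >> 11) & 0b111,
        "abridged_tag": ftw,              # one bit per register, 1 = in use
        "in_use": bin(ftw).count("1"),
        "st": st,
    }
```

Because FXSAVE leaves the FPU state untouched, checking `in_use == 0` on such a dump is a non-destructive way to see whether the stack was left empty.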
Robert,
It might have some safeguards against executing data :)
I've used the "trick" before, so I don't think so. Currently I'm
torn between the posibilities that the processor I'm using might
not be having that command, that I'm simply bungling up or that
there is some kind of memory alignment involved (the latter one
would not be the first time I've run into it).
Actually, I believe the registers are a circular file,
It has to be, as my example code works : after the second FLD1 the TOS is 6. But I can still execute a FSTP ST(2) ,which seemingly points at 6+2 = 8.
Well, please make sure the pointer is correct
(trash easily gets caught in the upper bits in mixed-mode)
Ah, but circularity is achieved by masking, 8=0 when masked to 3 bits.
...Well, please make sure the pointer is correct
:-) And how do you propose that should be done ?
It sounds like a great idea, but ...
use MOV.
Zero origin, power-of-two size. Check on both.
Ever wonder why there are so many buffers this way?
Debugging with a MOV test (hand-assembled) could have caught it.
Robert,
Well, please make sure the pointer is correct
:-) And how do you propose that should be done ?
It sounds like a great idea, but ...
(trash easily gets caught in the upper bits in mixed-mode)
Somewhere along the line I forgot to mention that I was programming in 32-bit mode (under Win XP). So, no mixed mode and no trash in the upper bits.
Ah, but circularity is achieved by masking, 8=0 when masked at 3bits.
Well ... It /can/ be achieved that way, but only under
certain conditions (related to origin and size). :-)
The problem has been located though : I simply used the wrong R/M value while hand-encoding the FXSAVE command (likely mixing up the 16 bit table with the 32 bit one). IOW, I was providing the target address in a certain register while the command expected it in another register/form.
Robert
Well, please make sure the pointer is correct
:-) And how do you propose that should be done ?
It sounds like a great idea, but ...
use MOV.
How would that change anything ? If the target for
an FXSAVE is wrong enough that it causes an exception,
how /wouldn't/ that be in the same way wrong for a MOV ?
(lets forget about alignment for a moment)
It would even be making the problem larger, as you would
then need to pick a REG value too - and wonder if it perhaps
is having a negative influence on the result.
FWIW, I tried several R/M values, none of which wanted
to work. Bad luck I guess.
In retrospect I should perhaps have tried loading all the
common registers with the same value and tried all R/M
values until something worked. On success it would be
a case of determining which register is the source, and
then look back at the instruction set to find a match -
and from it figure out what the/my mistake was.
It is a purer memory test.
I thought there was question of whether FXSAVE was available
or supported on your CPU.
This checks opcode encoding too.
All GP registers should be available at all times.
Encoding should not be a guessing game.
x86 has quirky indirect addressing modes that
are unlikely to yield to trial-and-error.
And that is effectively what happened when Wolfgang supplied me with a working encoding for FXSAVE [EDI] : while trying to match the 0x07 to the mod,reg,r/m tables I had used I realized I had been using the wrong one. It was as simple as that.
Use
<https://defuse.ca/online-x86-assembler.htm>
for all your "db" needs.
Robert,
Use
<https://defuse.ca/online-x86-assembler.htm>
for all your "db" needs.
Thank you very much. It will certainly come in handy. :-)
... and it doesn't even need JS to "do its thing". <thumbs up>
I tried mov ax,bx and got
6689D8
I guess x86 means 32bit nowadays!
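The leading 66 is the operand-size prefix, which is what a 32-bit assembler emits to get a 16-bit operand. A small Python sketch that decodes just this one form (opcode 89 /r with a register-to-register ModR/M byte) shows the same bytes read under both default operand sizes. The function name and its `default_bits` parameter are assumptions for this example.

```python
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_mov_rr(hex_bytes, default_bits=32):
    """Decode 'MOV r/m, r' (opcode 89) with mod=11, honouring a 66 prefix."""
    data = bytes.fromhex(hex_bytes)
    bits, pos = default_bits, 0
    if data and data[0] == 0x66:
        # The operand-size prefix toggles between 16- and 32-bit operands.
        bits = 16 if default_bits == 32 else 32
        pos = 1
    if len(data) != pos + 2 or data[pos] != 0x89:
        raise ValueError("only opcode 89 /r is handled in this sketch")
    modrm = data[pos + 1]
    if (modrm >> 6) != 3:
        raise ValueError("only register-to-register forms are handled")
    names = REG16 if bits == 16 else REG32
    src, dst = (modrm >> 3) & 7, modrm & 7   # reg field is the source
    return f"mov {names[dst]},{names[src]}"
```

`decode_mov_rr("6689D8")` gives `mov ax,bx` for 32-bit code, while a 16-bit assembler would emit plain `89D8` for the same instruction.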
"Kerr-Mudd, John" <admin@nospicedham.127.0.0.1> writes:
I guess x86 means 32bit nowadays!
That's the problem with "x86": People use it to mean any of several
different ISAs. So better avoid that term, and use:
8086 (rarely called IA-16) when you mean that instruction set.
IA-32 when you mean that instruction set (first implementation: 80386)
AMD64 when you mean that instruction set (first implementation: AMD K8
(Opteron, Athlon 64))
And then there are extensions, like the additional 80186 and 80286 instructions (plus the 80286 offers protected mode), or SSE, SSE2,
AVX, ...
Now what does that mean for the name of this newsgroup.