• ICAT 'XCPT_BAD_ACCESS' - what does it mean?

    From Andi B.@21:1/5 to All on Wed Dec 27 11:30:38 2017
    I'm looking for a problem in the xwlan widget init code, and while remotely debugging with
    ICAT I came across XCPT_BAD_ACCESS. It is raised on a strcpy operation in which I do not see
    any coding error. strcpy works as expected, the pointers seem to be valid, and the code works
    as I expect. Nevertheless ICAT pops up this exception. ICAT lets me run the exception handler
    and all is fine. The strcpy operation fills the memory as expected.

    Now I ask myself whether this is really a problem with the code or just the way it works
    (and should work; uncommitted memory?). I am slowly getting the feeling that I'm on the wrong track.

    The call stack shows _validate_ptr, _get_stack_trace in module memport, _add_item 1a3,
    _int_uheap_verify in module rdbg and _chk_if_heap in module dbgstr. Picture attached, but
    in case attaching does not work...

    Can someone confirm that this is expected and nothing I need to look into any further?
    Otherwise, if this is a real problem in the code, I'll of course show the details.

    Regards, Andreas



    [Attached PNG image omitted]

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Lars Erdmann@21:1/5 to Lars Erdmann on Wed Dec 27 13:55:17 2017
    ... or you need to use ".i" to page in the data that is accessed by the
    strcpy routine.

    Lars

    On 27.12.17 13.53, Lars Erdmann wrote:
    I would not worry about it too much.
    ICAT is not a very stable tool which also might be due to uncoordinated changes in the debug kernel.
    Unfortunately, "XCPT_BAD_ACCESS" is not listed as exception in the
    Control Programming guide, therefore it is not exactly clear what it means. It's possible that code or data needs to be paged in.
    In that case, you could also try the ".i" KDB instruction in the
    passthru window. Look at the mixed source/assembly output and pass a
    linear address to ".i" somewhere into the code you intend to run.
    ".i" will always load full 4k pages aligned to 4k addresses.

    Lars


  • From Lars Erdmann@21:1/5 to Andi B. on Wed Dec 27 13:53:27 2017
    I would not worry about it too much.
    ICAT is not a very stable tool, which might also be due to uncoordinated
    changes in the debug kernel.
    Unfortunately, "XCPT_BAD_ACCESS" is not listed as an exception in the
    Control Programming guide, so it is not exactly clear what it means.
    It's possible that code or data needs to be paged in.
    In that case, you could also try the ".i" KDB instruction in the
    passthru window. Look at the mixed source/assembly output and pass a
    linear address to ".i" somewhere into the code you intend to run.
    ".i" will always load full 4k pages aligned to 4k addresses.

    Lars





  • From Steven Levine@21:1/5 to Andi B. on Wed Dec 27 14:23:15 2017
    On Wed, 27 Dec 2017 10:30:38 UTC, "Andi B." <andi.b@gmx.net> wrote:

    Hi Andi,

    I came across XCPT_BAD_ACCESS. This is with a strcpy operation which I do not see any
    coding error. strcpy works as expected, pointers seem to be valid and code works as (I)
    expect. Anyway ICAT pops up this exception. ICAT let me run the exception handler and all
    is fine. strcpy operation fills the memory as expected.

    There may be no error here. The reference to XCPT_BAD_ACCESS is in gam5lde.msg:

    Message: PMD0210
    XCPT_BAD_ACCESS

    I suspect that the developers got lazy when they created the .msg
    file. Since you mention uncommitted memory, I am going to guess that
    the actual exception was

    #define XCPT_ACCESS_VIOLATION 0xC0000005

    with ExceptionInfo[ 0 ] set to one of:

    /* ExceptionInfo[ 0 ] - Access Code: XCPT_READ_ACCESS
    XCPT_WRITE_ACCESS
    XCPT_SPACE_ACCESS
    XCPT_LIMIT_ACCESS
    XCPT_UNKNOWN_ACCESS */

    To verify this you need to look at the content of the
    ExceptionReportRecord.
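
    A minimal sketch of that check, assuming the constant values below match the
    Toolkit header bsexcpt.h (verify them against your own copy). The struct
    report_sketch_t and both function names are hypothetical stand-ins that carry
    only the two fields used here; they are not the real EXCEPTIONREPORTRECORD.

```c
#include <stdio.h>

/* Assumed to match bsexcpt.h; check against the Toolkit header. */
#define XCPT_ACCESS_VIOLATION 0xC0000005UL
#define XCPT_UNKNOWN_ACCESS   0x00000000UL
#define XCPT_READ_ACCESS      0x00000001UL
#define XCPT_WRITE_ACCESS     0x00000002UL
#define XCPT_SPACE_ACCESS     0x00000008UL
#define XCPT_LIMIT_ACCESS     0x00000010UL

/* Hypothetical cut-down record: only the fields this sketch reads. */
typedef struct {
    unsigned long ExceptionNum;
    unsigned long ExceptionInfo[4];
} report_sketch_t;

/* Map the access code in ExceptionInfo[0] to its symbolic name. */
const char *access_code_name(unsigned long code)
{
    switch (code) {
    case XCPT_UNKNOWN_ACCESS: return "XCPT_UNKNOWN_ACCESS";
    case XCPT_READ_ACCESS:    return "XCPT_READ_ACCESS";
    case XCPT_WRITE_ACCESS:   return "XCPT_WRITE_ACCESS";
    case XCPT_SPACE_ACCESS:   return "XCPT_SPACE_ACCESS";
    case XCPT_LIMIT_ACCESS:   return "XCPT_LIMIT_ACCESS";
    default:                  return "unrecognized access code";
    }
}

/* Describe a record: only access violations carry an access code. */
const char *describe_report(const report_sketch_t *r)
{
    if (r->ExceptionNum != XCPT_ACCESS_VIOLATION)
        return "not an access violation";
    return access_code_name(r->ExceptionInfo[0]);
}

/* Print the exception number and the decoded access code. */
void print_report(const report_sketch_t *r)
{
    printf("exception 0x%08lX: %s\n", r->ExceptionNum, describe_report(r));
}
```

    In a real handler the same two fields would be read from the report record
    passed to it; a write to uncommitted memory would be expected to show up as
    XCPT_WRITE_ACCESS.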

    Now I ask myself if this is really a problem with the code or this is the way it works
    (and should work, uncommited memory?). I slowly get the feeling I'm chasing the wrong track.

    It could be you are chasing a red herring. The labels shown in that
    stack trace are all part of the VAC 3.65 runtime, unless I am
    misinterpreting something.

    What is the widget doing when this exception occurs? What is the
    strcpy destination? An uncommitted stack guard page? Is your process
    running when the exception occurs?

    Some of this is just how the kernel debugger works. It appears you
    have the debugger configured to capture/report ring3 exceptions.

    Steven


    --
    ---------------------------------------------------------------------
    Steven Levine <steve53@earthlink.bogus.net>
    DIY/Warp/BlueLion etc. www.scoug.com www.arcanoae.com www.warpcave.com
    ---------------------------------------------------------------------

  • From Andi B.@21:1/5 to Steven Levine on Fri Dec 29 10:46:30 2017
    Hi,
    thanks both of you.

    Steven Levine wrote:
    On Wed, 27 Dec 2017 10:30:38 UTC, "Andi B." <andi.b@gmx.net> wrote:

    Hi Andi,

    I came across XCPT_BAD_ACCESS. This is with a strcpy operation which I do not see any
    coding error. strcpy works as expected, pointers seem to be valid and code works as (I)
    expect. Anyway ICAT pops up this exception. ICAT let me run the exception handler and all
    is fine. strcpy operation fills the memory as expected.

    There may be no error here. The reference to XCPT_BAD_ACCESS is in gam5lde.msg:

    Message: PMD0210
    XCPT_BAD_ACCESS

    I suspect that the developers got lazy when they created the .msg
    file. Since you mention uncommitted memory, I am going to guess that
    the actual exception was

    #define XCPT_ACCESS_VIOLATION 0xC0000005

    with ExceptionInfo[ 0 ] set to one of:

    /* ExceptionInfo[ 0 ] - Access Code: XCPT_READ_ACCESS
    XCPT_WRITE_ACCESS
    XCPT_SPACE_ACCESS
    XCPT_LIMIT_ACCESS
    XCPT_UNKNOWN_ACCESS */

    To verify this you need to look at the content of the
    ExceptionReportRecord.

    Have to learn about.


    Now I ask myself if this is really a problem with the code or this is the way it works
    (and should work, uncommited memory?). I slowly get the feeling I'm chasing the wrong track.

    It could be you are chasing a red herring. The labels shown in that
    stack trace are all part of the VAC 3.65 runtime, unless I am
    misinterpreting something.

    Yes. It is all in the runtime and I do not find a way back to the caller. Based on my very
    limited knowledge in that area, it seems to me that the stack is trashed and so ICAT does not
    find the way back to the calling code in my app. But as I said, I am guessing. I know that I
    know nothing.


    What is the widget doing when this exception occurs? What is the
    strcpy destination? An uncommitted stack guard page? Is your process running when the exception occurs?

    I can narrow down the problem to code like this -

    typedef struct _DIM {
    HMODULE hmod;
    ULONG ulModuleId;
    CHAR szModuleBaseName[32];
    ULONG ulDriverCount;
    PFNIDS pfnids;
    <SNIP>
    } DIM, *PDIM;

    static PDIM padim = NULL;

    padim = malloc( ulDimTableSize); // actually about 215 bytes in my case
    if (!padim)
    {
    rc = ERROR_NOT_ENOUGH_MEMORY;
    break;
    }
    memset( padim, 0xAA, ulDimTableSize);
    strcpy( padim->szModuleBaseName, "TestStringAB_TEST");

    The strcpy triggers the exception in ICAT.

    I do not see why this should trash the stack, so my assumption above is probably wrong.
    Moreover, this is old code in xwlan/wlanstat and has worked for many years. The good thing is
    that while I was at it I found out that http://trac.netlabs.org/wpstk does not compile with
    VAC anymore and needs attention too. So no problem finding new tasks.


    Some of this is just how the kernel debugger works? It appears you
    have the debugger configured to capture/report ring3 exceptions.

    I've 'set CAT_KDB_INIT="vsf *"'. At first sight I did not even find any reference to
    vsf and vc except on your page, and there is not much info in the ICAT files. If I could
    ever find the time to learn more about these basics of debugging....

    Maybe I should add exceptq to wlanstat and let your trap tool decode what's going wrong,
    rather than playing endless hours with ICAT and trying to decode it myself. Moreover, the
    original problem seems to be unrelated to what I'm looking at here anyway. Maybe you want to
    have a look at http://trac.netlabs.org/xwlan/ticket/46, which is the reason why I started
    to play with ICAT.

    Andi

  • From Steven Levine@21:1/5 to Andi B. on Sat Dec 30 12:05:10 2017
    On Fri, 29 Dec 2017 09:46:30 UTC, "Andi B." <andi.b@gmx.net> wrote:

    Hi Andi,

    To verify this you need to look at the content of the ExceptionReportRecord.

    Have to learn about.

    The structures are well documented, with the exception of the FP
    specific data. A pointer to the Exception Report Record is passed to
    the handler. If you dump the data as dwords, it's readable if you
    work out the field offsets.

    Yes. All in the runtime and I do not find a way back to the caller.

    Most likely because some of the code is not using standard stack
    frames. It's also possible the stack is corrupted. What you need to
    do in this case is dump the stack as dwords and walk the stack by
    hand.
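
    Walking the stack by hand can be sketched as follows, assuming standard
    EBP-based frames (saved EBP at [EBP], return address at [EBP+4]). The names
    frame_t and walk_frames are hypothetical, and the dump is taken to be an
    array of dwords copied from the debugger starting at linear address base.

```c
#include <stddef.h>
#include <stdint.h>

/* One recovered frame: the EBP value and the return address at [EBP+4]. */
typedef struct {
    uint32_t frame;
    uint32_t ret;
} frame_t;

/* Follow the saved-EBP chain through a dump of `count` dwords that starts at
   linear address `base`. Stops when the chain leaves the dump, is misaligned,
   or no longer moves toward higher addresses. Returns the frame count. */
size_t walk_frames(const uint32_t *dump, size_t count, uint32_t base,
                   uint32_t ebp, frame_t *out, size_t max_out)
{
    size_t n = 0;

    while (n < max_out) {
        size_t idx;
        uint32_t next;

        if (ebp < base || (ebp - base) % 4 != 0)
            break;                      /* outside the dump or misaligned */
        idx = (ebp - base) / 4;
        if (idx + 1 >= count)
            break;                      /* return address not in the dump */

        out[n].frame = ebp;
        out[n].ret = dump[idx + 1];
        n++;

        next = dump[idx];               /* caller's saved EBP */
        if (next <= ebp)
            break;                      /* chain ended or is corrupted */
        ebp = next;
    }
    return n;
}
```

    If the chain stops after a frame or two, that is consistent with code that
    does not set up standard frames; in that case the remaining dwords have to
    be checked one by one for values that look like return addresses.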

    I can narrow down the problem to code like this -
    typedef struct _DIM {
    HMODULE hmod;
    ULONG ulModuleId;
    CHAR szModuleBaseName[32];
    ULONG ulDriverCount;
    PFNIDS pfnids;
    <SNIP>
    } DIM, *PDIM;

    static PDIM padim = NULL;

    padim = malloc( ulDimTableSize); // actually about 215 bytes in my case
    if (!padim)
    {
    rc = ERROR_NOT_ENOUGH_MEMORY;
    break;
    }
    memset( padim, 0xAA, ulDimTableSize);
    strcpy( padim->szModuleBaseName, "TestStringAB_TEST");

    The strcpy triggers the exception in ICAT.

    Can I assume that this is the code near src\lib\drvapi\drvaccess.c:86?

    FWIW, I've implemented xwlan fixes in the past so I am somewhat
    familiar with the code.

    The buffer size is defined by:

    ulDimTableSize = ulModuleCount * sizeof( DIM);

    Did you check ulModuleCount? If WtkLoadModules returns 0 modules, the
    memset will succeed, but the strcpy will trap.
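
    A minimal sketch of guarding against the zero-module case before the table is
    touched. DIM_SKETCH and alloc_dim_table are hypothetical names; the struct is
    a cut-down stand-in for the DIM layout quoted above, not the xwlan source.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the DIM structure quoted in the thread. */
typedef struct {
    unsigned long hmod;
    unsigned long ulModuleId;
    char szModuleBaseName[32];
    unsigned long ulDriverCount;
} DIM_SKETCH;

/* Allocate a zero-filled table of ulModuleCount entries.
   Returns NULL for a zero count, on size overflow, or when malloc fails,
   so the caller never writes into a zero-sized block. */
DIM_SKETCH *alloc_dim_table(unsigned long ulModuleCount)
{
    DIM_SKETCH *padim;
    size_t size;

    if (ulModuleCount == 0)
        return NULL;
    if (ulModuleCount > (size_t)-1 / sizeof(DIM_SKETCH))
        return NULL;

    size = (size_t)ulModuleCount * sizeof(DIM_SKETCH);
    padim = malloc(size);
    if (padim != NULL)
        memset(padim, 0, size);
    return padim;
}
```

    The point of the explicit zero check is that malloc(0) may return a non-NULL
    pointer, so testing only the malloc result would not catch an empty table.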

    If that's not it, I would switch icat to assembly mode and step through
    the strcpy code. When you get to the movs instruction, look at ESI
    and EDI, the source and destination addresses respectively. ECX will
    be the copy count.

    I do not see why this should trash the stack so my above assumption is probably wrong.

    This looks much more like a heap issue, than a stack issue.

    The good thing is
    - while being there I found out that http://trac.netlabs.org/wpstk does no compile with
    VAC anymore and needs attention too.
    So no problem finding new tasks.

    :-)

    I've 'set CAT_KDB_INIT="vsf *"'. At the first sight I did not even find any reference to
    vsf and vc except on your page.

    The V command is pretty much fully documented in the OS/2 Debugging
    Handbook.

    And not much info in ICAT files.

    I would not expect the ICAT docs to cover this in much detail, since
    it is covered elsewhere. icatfaq.html does show how to use SET
    CAT_KDB_INIT to do what is typically done in kdb.ini

    What you want to use is:

    CAT_KDB_INIT="vsf *;vce"

    to let the kernel handle page faults normally.

    If I ever would find the
    time to learn more about these basics in debugging....

    Necessity is the mother of invention, as they say. :-)

    Maybe I should add exceptq to wlanstat an let your trap tool decode what's going wrong
    then playing endless hours with ICAT and trying to decode myself.

    It's likely to be better in the long run, especially if an issue comes
    up on someone else's system.

    Moreover the starting
    problem seems to be unrelated to what I'm looking here anyway.

    Yes, I tend to agree. If I knew EDI at the time of the trap, I would
    probably know for sure. Since the code continues normally after
    the exception this is likely. You can see the registers if you do

    r

    in the PassThru window. This will also tell you exactly what the
    kernel debugger thinks the trap is.

    Maybe you want to have a
    look at - http://trac.netlabs.org/xwlan/ticket/46 which is the reason why I'm started to
    play ICAT.

    This one is definitely stack corruption and exceptq will
    help because if you have symbols installed, it will give you a name
    for the EIP address.

    Steven

    --
    ---------------------------------------------------------------------
    Steven Levine <steve53@earthlink.bogus.net>
    DIY/Warp/BlueLion etc. www.scoug.com www.arcanoae.com www.warpcave.com
    ---------------------------------------------------------------------

  • From Andi B.@21:1/5 to Steven Levine on Sun Dec 31 12:02:50 2017
    Steven Levine wrote:
    On Fri, 29 Dec 2017 09:46:30 UTC, "Andi B." <andi.b@gmx.net> wrote:

    Hi Andi,

    To verify this you need to look at the content of the
    ExceptionReportRecord.

    Have to learn about.

    The structures are well documented, with the exception of the FP
    specific data. A pointer to the Exception Report Record is passed to
    the handler. If you dump the data as dwords, it's readable if you
    work out the field offsets.

    Yes. All in the runtime and I do not find a way back to the caller.

    Most likely because some of the code is not using standard stack
    frames. It's also possible the stack is corrupted. What you need to
    do in this case is dump the stack as dwords and walk the stack by
    hand.

    I can narrow down the problem to code like this -
    typedef struct _DIM {
    HMODULE hmod;
    ULONG ulModuleId;
    CHAR szModuleBaseName[32];
    ULONG ulDriverCount;
    PFNIDS pfnids;
    <SNIP>
    } DIM, *PDIM;

    static PDIM padim = NULL;

    padim = malloc( ulDimTableSize); // actually about 215 bytes in my case
    if (!padim)
    {
    rc = ERROR_NOT_ENOUGH_MEMORY;
    break;
    }
    memset( padim, 0xAA, ulDimTableSize);
    strcpy( padim->szModuleBaseName, "TestStringAB_TEST");

    The strcpy triggers the exception in ICAT.

    Can I assume that this is the code near src\lib\drvapi\drvaccess.c:86?

    Yes.


    FWIW, I've implemented xwlan fixes in the past so I am somewhat
    familiar with the code.

    The buffer size is defined by:

    ulDimTableSize = ulModuleCount * sizeof( DIM);

    Did you check ulModuleCount? If WtkLoadModules returns 0 modules, the
    memset will succeed, but the strcpy will trap.

    Yes. Moreover, I added the strcpy above myself to make sure. You may notice my comment
    about the 215 bytes that malloc successfully allocated a few lines above in the code I
    posted here (slightly changed from drvaccess). malloc allocates successfully, memset fills
    the block correctly, and my added strcpy line triggers the exception, but when I let the
    exception handler run, strcpy works as expected.

    My code now is -

    TraceAB("ulDimTableSize=%d\n", ulDimTableSize);
    _interrupt(3); // ICAT stops here as expected
    padim = malloc( ulDimTableSize); // ulDimTableSize is 216
    if (!padim) // padim is 0x00494130 = valid
    {
    rc = ERROR_NOT_ENOUGH_MEMORY;
    break;
    }
    memset( padim, 0xAA, ulDimTableSize); // padim including the string region is filled correctly
    TraceAB("padim=0x%08X\n", padim); // additional trace messages writes to file and com1
    TraceAB("padim->szModuleBaseName=0x%08X\n", padim->szModuleBaseName);
    strcpy( padim->szModuleBaseName, "TestStringAB_TEST"); // <--- this triggers exception
    TraceAB("padim=0x%08X\n", padim);

    I've uploaded the PassThru window content to
    https://www.pic-upload.de/view-34567254/icat_xwlan_trap.png.html
    as the newsgroup does not allow attachments. This is what I get when I try to 'Step over'
    the strcpy line, then choose 'Examine....' in the exception dialog and read from PassThru.
    The register monitor and call stack window say the same.

    The TraceAB function logs the printf-style message to a file and in parallel sends it out
    on com1. So while running ICAT I see the TraceAB messages in pmdf (or zoc) on the host at
    the same time. This is just to make sure that it really is this particular strcpy which
    triggers the problem (with or without running ICAT).

    For completeness here what pmdf says when running the above code (to rule out ICAT)
    including my TraceAB messages -

    wlanDriverAccessInitialize
    Symbols linked (genmac)
    Symbols linked (genprism)
    TrcMsgV len=48 (XWLAN: 0: Loading Driver Modules, count 2 )
    WtkLoadModules done
    ulDimTableSize=216
    eax=000b0a6b ebx=00000000 ecx=00485020 edx=000003f8 esi=00000000 edi=00000000
    eip=0003fe91 esp=000f7300 ebp=000f756c iopl=0 -- -- -- up ei pl nz ac pe nc
    cs=005b ss=0053 ds=0053 es=0053 fs=150b gs=0000 cr2=414c5758 cr3=00225000 p=00
    005b:0003fe91 cc int 3
    ##g
    padim=0x00494130
    padim->szModuleBaseName=0x00494138
    Trap 13 (0DH) - General Protection Fault 0000
    eax=61767264 ebx=61767264 ecx=61767264 edx=00000008 esi=00000000 edi=00000008
    eip=0006427c esp=000f71d0 ebp=000f71ec iopl=0 rf -- -- nv up ei pl nz na po nc
    cs=005b ss=0053 ds=0053 es=0053 fs=150b gs=0000 cr2=414c5758 cr3=00225000 p=00
    005b:0006427c 8a19 mov bl,byte ptr [ecx] ds:61767264=invalid
    ##k
    005b:000642ef 01010101 01010101 01010101 01010101 _get_stack_trace + 5f
    005b:0005ff0e 00480000 004970e4 000f7244 000665c4 _int_uheap_verify + 6de
    005b:0005fc1d 00480000 00000000 00081f3c 000001a9 _int_uheap_verify + 3ed
    005b:00061fbb 00480000 00081f3c 000001a9 00000001 _chk_if_heap + bb
    005b:0005a537 00494138 000f760c 00081f3c 000001a9 _debug_strcpy + 47
    005b:73656363 00632e73 6d75645f 6e6f4370 7463656e
    ##


    If that's not it, I would switch icat to assembly mode and step through
    the strcpy code.

    I've done that before and went down _chk_if_heap(dbgstr) / _int_uheap_verify(rdbg) /
    _add_item 1a3(rdbg) / _get_stack_trace(memport) / _validate_ptr(memport). I then decided
    this all is more a 'debug kernel' or debugging (ICAT - pmdf) problem/behavior than a real
    application problem.

    When you get to the movs instruction, look at ESI
    and EDI, the source and destination addresses respectively. ECX will
    be the copy count.

    IIRC these all worked fine (memory display proves the string is copied correctly) but
    afterwards the _validate_ptr thinks there is something wrong (while I think it isn't).

    We can go down this road again if you like. But I think we need realtime IRC chat in
    parallel.


    I do not see why this should trash the stack so my above assumption is probably wrong.

    This looks much more like a heap issue, than a stack issue.

    ok.


    The good thing is
    - while being there I found out that http://trac.netlabs.org/wpstk does no compile with
    VAC anymore and needs attention too.
    So no problem finding new tasks.

    :-)

    I've 'set CAT_KDB_INIT="vsf *"'. At the first sight I did not even find any reference to
    vsf and vc except on your page.

    The V command is pretty much fully documented in the OS/2 Debugging
    Handbook.

    And not much info in ICAT files.

    I would not expect the ICAT docs to cover this in much detail, since
    it is covered elsewhere. icatfaq.html does show how to use SET
    CAT_KDB_INIT to do what is typically done in kdb.ini

    What you want to use is:

    CAT_KDB_INIT="vsf *;vce"

    to let the kernel handle page faults normally.

    I found this on your page and set it that way, although I still haven't read the debugging
    handbook and don't really understand what the v* commands do. But I still hope I can live
    without knowing the deeper details ;-). Idebug has a list box where the various exceptions
    can be selected. I didn't find anything like that in ICAT.


    If I ever would find the
    time to learn more about these basics in debugging....

    Necessity is the mother of invention, as they say. :-)

    Maybe I should add exceptq to wlanstat an let your trap tool decode what's going wrong
    then playing endless hours with ICAT and trying to decode myself.

    It's likely to be better in the long run, especially if an issue comes
    up on someone else's system.

    Done, although not yet tested. The problem is that what I see here only happens with the
    debug kernel, and with that I can never run wlanstat to the point where exceptq does its
    thing. Running the same wlanstat app (without the int3) on the retail kernel works without
    problems.


    Moreover the starting
    problem seems to be unrelated to what I'm looking here anyway.

    Yes, I tend to agree. If I knew EDI at the time of the trap, I would
    probably know for sure. Since the code continues normally after
    the exception this is likely. You can see the registers if you do

    r

    in the PassThru window. This will also tell you exactly what the
    kernel debugger thinks the trap is.

    Maybe you want to have a
    look at - http://trac.netlabs.org/xwlan/ticket/46 which is the reason why I'm started to
    play ICAT.

    This one is definitely stack corruption and exceptq will
    help because if you have symbols installed, it will give you a name
    for the EIP address.

    To my eyes this looks very similar to what I see here.

    Andreas


    Steven


  • From Lars Erdmann@21:1/5 to Andi B. on Sun Dec 31 21:30:32 2017
    Hi Andy,

    I think you are right about functions not creating a proper stack frame.
    I had these issues with ICAT as well.

    Lars


  • From Paul Ratcliffe@21:1/5 to Andi B. on Wed Jan 3 19:26:30 2018
    On Sun, 31 Dec 2017 12:02:50 +0100, Andi B. <andi.b@gmx.net> wrote:

    padim=0x00494130
    padim->szModuleBaseName=0x00494138
    Trap 13 (0DH) - General Protection Fault 0000
    eax=61767264 ebx=61767264 ecx=61767264 edx=00000008 esi=00000000 edi=00000008
    eip=0006427c esp=000f71d0 ebp=000f71ec iopl=0 rf -- -- nv up ei pl nz na po nc
    cs=005b ss=0053 ds=0053 es=0053 fs=150b gs=0000 cr2=414c5758 cr3=00225000 p=00
    005b:0006427c 8a19 mov bl,byte ptr [ecx] ds:61767264=invalid
    ##k
    005b:000642ef 01010101 01010101 01010101 01010101 _get_stack_trace + 5f
    005b:0005ff0e 00480000 004970e4 000f7244 000665c4 _int_uheap_verify + 6de
    005b:0005fc1d 00480000 00000000 00081f3c 000001a9 _int_uheap_verify + 3ed
    005b:00061fbb 00480000 00081f3c 000001a9 00000001 _chk_if_heap + bb
    005b:0005a537 00494138 000f760c 00081f3c 000001a9 _debug_strcpy + 47
    005b:73656363 00632e73 6d75645f 6e6f4370 7463656e

    That value in ECX is ASCII from somewhere ("drva").
    You need to work out how it got there from the preceding code and then why.
    I expect there's a buffer overrun somewhere corrupting something, but it can be a nightmare to find, even with a decent debugger.
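
    The ASCII reading of a register value can be reproduced with a small helper.
    This is only a sketch; dword_to_ascii is a hypothetical name, and it assumes
    x86 byte order, where the lowest byte of the dword is the first character in
    memory.

```c
#include <stdint.h>

/* Write the four bytes of `value`, lowest byte first (x86 memory order),
   into out[0..3] and NUL-terminate. Non-printable bytes become '.'. */
void dword_to_ascii(uint32_t value, char out[5])
{
    int i;

    for (i = 0; i < 4; i++) {
        unsigned char c = (unsigned char)(value >> (8 * i));
        out[i] = (c >= 0x20 && c < 0x7f) ? (char)c : '.';
    }
    out[4] = '\0';
}
```

    Applied to ecx=61767264 this gives "drva". The same decoding turns
    cr2=414c5758 into "XWLA" and the last line of the stack dump into "cces",
    "s.c", "_dum", "pCon" and "nect", which would be consistent with string data
    such as the file name drvaccess.c sitting where the runtime expected pointers.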

    FWIW, using strcpy() is almost bound to end up with problems like this at
    some point. You really shouldn't be using it any more.
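
    One possible replacement is a copy that is told the destination size. This is
    a sketch using only strlen and memcpy, so it does not depend on any extension
    of a particular runtime; copy_bounded is a hypothetical name.

```c
#include <string.h>

/* Copy src into dst, never writing more than dstsize bytes.
   Returns 0 when the whole string fit, -1 when it was truncated
   (dst is still NUL-terminated) or when dstsize is 0. */
int copy_bounded(char *dst, size_t dstsize, const char *src)
{
    size_t len;

    if (dstsize == 0)
        return -1;

    len = strlen(src);
    if (len >= dstsize) {
        memcpy(dst, src, dstsize - 1);
        dst[dstsize - 1] = '\0';
        return -1;
    }
    memcpy(dst, src, len + 1);          /* includes the terminating NUL */
    return 0;
}
```

    For the case in this thread the call would look like
    copy_bounded(padim->szModuleBaseName, sizeof(padim->szModuleBaseName), "TestStringAB_TEST").
    Note that this particular string fits into the 32-byte field, so a bounded
    copy would guard against future overruns rather than explain the trap shown
    above.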
