• VMS Basic strings class D vs class S

    From Arne Vajhøj@21:1/5 to All on Sun Feb 25 16:33:33 2024
    If I have understood it correctly then VMS Basic
    strings use class D descriptors.

    That is very nice.

    But what happens if non-Basic code calls Basic
    code with a string using a class S descriptor?
    For input/read? For output/write?

    Does the Basic runtime call some STR$ function that
    understands the difference between S and D and handles
    it properly? Or will I get a runtime error due to an invalid
    string?

    Arne

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Dave Froble@21:1/5 to All on Sun Feb 25 18:55:25 2024
    On 2/25/2024 4:33 PM, Arne Vajhøj wrote:
    If I have understood it correctly then VMS Basic
    strings use class D descriptors.

    That is very nice.

    But what happens if non-Basic code calls Basic
    code with a string using a class S descriptor?
    For input/read? For output/write?

    Does the Basic runtime call some STR$ function that
    understands the difference between S and D and handles
    it properly? Or will I get a runtime error due to an invalid
    string?

    Arne

    I don't know the correct answer, but, at a guess, whatever is called to handle the string quite likely will evaluate the descriptor and do "the right thing". That would be my bet. Otherwise, why have descriptors?

    --
    David Froble Tel: 724-529-0450
    Dave Froble Enterprises, Inc. E-Mail: davef@tsoft-inc.com
    DFE Ultralights, Inc.
    170 Grimplin Road
    Vanderbilt, PA 15486

  • From Arne Vajhøj@21:1/5 to Dave Froble on Sun Feb 25 19:06:42 2024
    On 2/25/2024 6:55 PM, Dave Froble wrote:
    On 2/25/2024 4:33 PM, Arne Vajhøj wrote:
    If I have understood it correctly then VMS Basic
    strings use class D descriptors.

    That is very nice.

    But what happens if non-Basic code calls Basic
    code with a string using a class S descriptor?
    For input/read? For output/write?

    Does the Basic runtime call some STR$ function that
    understands the difference between S and D and handles
    it properly? Or will I get a runtime error due to an invalid
    string?

    I don't know the correct answer, but, at a guess, whatever is called to handle the string quite likely will evaluate the descriptor and do "the
    right thing". That would be my bet.  Otherwise, why have descriptors?

    I am pretty sure that Basic will not corrupt anything, but handling it
    and causing an exception would both be sort of acceptable behavior.

    A trivial example works:

    $ type f.for
          program f
          character*80 s
          call b1('ABC')
          call b2(s)
          write(*,*) '|'//trim(s)//'|'
          end
    $ type b.bas
    sub b1(string s)
        print "|" + s + "|"
    end sub
    !
    sub b2(string s)
        s = "ABC"
    end sub
    $ for f
    $ bas b
    $ link f + b
    $ run f
    |ABC|
    |ABC|

    But I can't help but wonder whether it always works.

    Arne

  • From Lawrence D'Oliveiro@21:1/5 to All on Mon Feb 26 03:38:29 2024
    On Sun, 25 Feb 2024 16:33:33 -0500, Arne Vajhøj wrote:

    If I have understood it correctly then VMS Basic strings use class D descriptors.

    The “VAX BASIC User Manual” mentions both dynamic and fixed-length
    strings. Chapter 13 explains that strings are fixed-length when part of
    COMMON, MAP or RECORD statements, otherwise they are dynamic. Fixed-length strings obviously cannot have their storage reallocated.

    In chapter 21, it mentions that, if you pass strings by descriptor from a language that doesn’t understand dynamic strings (e.g. Fortran), then they are passed as fixed-length strings.

  • From Stephen Hoffman@21:1/5 to All on Mon Feb 26 16:17:00 2024
    On 2024-02-25 21:33:33 +0000, Arne Vajhøj said:

    If I have understood it correctly then VMS Basic strings use class D descriptors.

    That is very nice.

    But what happens if non-Basic code calls Basic code with a string using
    a class S descriptor? For input/read? For output/write?

    Does the Basic runtime call some STR$ function that understands the
    difference between S and D and handles it properly? Or will I get a
    runtime error due to an invalid string?

    "It depends."

    Most everything in most of the traditional languages and in the RTLs
    does the right thing with both dynamic and static text strings, though
    the app code involved might not. BASIC app code works pretty well here,
    absent "heroic" efforts by the app developer.

    If the app code assumes a dynamic arriving and gets handed static, the
    RTL will either copy it, or space-pad the results into the static, or
    the RTL will return a string truncation error. BASIC space-pads into
    static string buffers if and as needed. Or truncates with an error.

    Apps written in C, C++, BLISS, MACRO32, and probably some others may or
    may not do the right thing with descriptors, as these languages can
    need to handle descriptors in app code due to the (~lack of) descriptor
    support in those languages. (Yes, I well know about dscdef.h,
    descrip.h, et al., thanks.) Some devs will use RTL calls, and some use
    explicit code.

    As for home-grown descriptor code, few apps (none?) implement all of
    the different sorts of descriptors available. Not outside of the RTL
    itself, that is.

    Apps expecting to work with dynamic descriptors might fail with the
    truncation error as mentioned, and apps expecting to massage static
    descriptors directly and not coded sufficiently cautiously around any
    arriving dynamic strings can fail with heap and other errors.

    Of the common languages, Pascal utilizes a wide variety of descriptors.
    Calling into or getting called from Pascal tends to teach much about descriptors and descriptor usage.

    It wouldn't surprise me to learn that BASIC will fail to work correctly
    with 64-bit string descriptors, though. Lots of home-grown app code
    also won't. I also wouldn't expect the RTLs to work with encodings
    other than ASCII and DEC MCS, either. And UTF-8 will fail in the
    expected places, and most searching and sorting tends not to be
    sensitive to the (written) language used within the text string.



    --
    Pure Personal Opinion | HoffmanLabs LLC

  • From Arne Vajhøj@21:1/5 to Stephen Hoffman on Wed Feb 28 09:38:03 2024
    On 2/26/2024 4:17 PM, Stephen Hoffman wrote:
    On 2024-02-25 21:33:33 +0000, Arne Vajhøj said:

    If I have understood it correctly then VMS Basic strings use class D
    descriptors.

    That is very nice.

    But what happens if non-Basic code calls Basic code with a string using
    a class S descriptor? For input/read? For output/write?

    Does the Basic runtime call some STR$ function that understands the
    difference between S and D and handles it properly? Or will I get a
    runtime error due to an invalid string?

    "It depends."

    Most everything in most of the traditional languages and in the RTLs
    does the right thing with both dynamic and static text strings, though
    the app code involved might not. BASIC app code works pretty well here, absent "heroic" efforts by the app developer.

    If the app code assumes a dynamic arriving and gets handed static, the
    RTL will either copy it, or space-pad the results into the static, or
    the RTL will return a string truncation error. BASIC space-pads into
    static string buffers if and as needed. Or truncates with an error.

    Apps expecting to work with dynamic descriptors might fail with the truncation error as mentioned, and apps expecting to massage static descriptors directly and not coded sufficiently cautiously around any arriving dynamic strings can fail with heap and other errors.

    Basic cannot stuff 200 bytes into a 100-byte fixed-length string. That
    is fair.

    It wouldn't surprise me to learn that BASIC will fail to work correctly
    with 64-bit string descriptors, though. Lots of home-grown app code also won't.

    I sort of get the impression that using 64-bit descriptors is like
    doing a bungee jump.

    :-)

    I also wouldn't expect the RTLs to work with encodings other than ASCII and DEC MCS, either. And UTF-8 will fail in the expected places,
    and most searching and sorting tends not to be sensitive to the
    (written) language used within the text string.

    I would assume that it works as long as the string is considered
    a sequence of bytes not a sequence of characters.

    Arne

  • From Scott Dorsey@21:1/5 to arne@vajhoej.dk on Wed Feb 28 14:56:52 2024
    In article <urnggb$3trjg$1@dont-email.me>,
    Arne Vajhøj <arne@vajhoej.dk> wrote:
    On 2/26/2024 4:17 PM, Stephen Hoffman wrote:
    On 2024-02-25 21:33:33 +0000, Arne Vajhøj said:

    If I have understood it correctly then VMS Basic strings use class D
    descriptors.

    That is very nice.

    But what happens if non-Basic code calls Basic code with a string using
    a class S descriptor? For input/read? For output/write?

    Does the Basic runtime call some STR$ function that understands the
    difference between S and D and handles it properly? Or will I get a
    runtime error due to an invalid string?

    "It depends."

    Most everything in most of the traditional languages and in the RTLs
    does the right thing with both dynamic and static text strings, though
    the app code involved might not. BASIC app code works pretty well here,
    absent "heroic" efforts by the app developer.

    If the app code assumes a dynamic arriving and gets handed static, the
    RTL will either copy it, or space-pad the results into the static, or
    the RTL will return a string truncation error. BASIC space-pads into
    static string buffers if and as needed. Or truncates with an error.

    Apps expecting to work with dynamic descriptors might fail with the truncation error as mentioned, and apps expecting to massage static descriptors directly and not coded sufficiently cautiously around any arriving dynamic strings can fail with heap and other errors.

    Basic cannot stuff 200 bytes into a 100-byte fixed-length string. That
    is fair.

    Fortran can! And you likely won't notice that you have damaged some other memory until you get a SIGSEGV in some totally unrelated part of your code.
    --scott
    --
    "C'est un Nagra. C'est suisse, et tres, tres precis."

  • From Arne Vajhøj@21:1/5 to Lawrence D'Oliveiro on Wed Feb 28 09:18:54 2024
    On 2/25/2024 10:38 PM, Lawrence D'Oliveiro wrote:
    On Sun, 25 Feb 2024 16:33:33 -0500, Arne Vajhøj wrote:
    If I have understood it correctly then VMS Basic strings use class D
    descriptors.

    The “VAX BASIC User Manual” mentions both dynamic and fixed-length strings. Chapter 13 explains that strings are fixed-length when part of COMMON, MAP or RECORD statements, otherwise they are dynamic. Fixed-length strings obviously cannot have their storage reallocated.

    And I guess that sort of makes Basic need to handle it
    transparently.

    $ type f.for
          program f
          character*80 s
          call print_class('ABC')
          call b1('ABC')
          call print_class(s)
          call b2(s)
          write(*,*) '|'//trim(s)//'|'
          end
    $ type b.bas
    sub b1(string s)
        print "|" + s + "|"
    end sub
    !
    sub b2(string s)
        s = "ABC"
    end sub
    $ type x.c
    #include <stdio.h>

    #include <descrip.h>

    /* Report the class of the string descriptor passed by reference. */
    void print_class(struct dsc$descriptor *s)
    {
        switch(s->dsc$b_class) {
            case DSC$K_CLASS_S:
                printf("Class S\n");
                break;
            case DSC$K_CLASS_VS:
                printf("Class VS\n");
                break;
            case DSC$K_CLASS_D:
                printf("Class D\n");
                break;
            default:
                printf("Unknown class\n");
                break;
        }
    }
    $ for f
    $ bas b
    $ link f + b + x
    $ run f
    Class S
    |ABC|
    Class S
    |ABC|

    has to work, because of:

    $ type f.bas
    program f
    map (blk1) string s1 = 3
    map (blk2) string s2 = 80
    external sub b1(string)
    external sub b2(string)
    external sub print_class(string)
    external string function trim(string)
    s1 = "ABC"
    call print_class(s1)
    call b1(s1)
    call print_class(s2)
    call b2(s2)
    print "|" + trim(s2) + "|"
    end
    !
    function string trim(string s)
        declare integer ix
        ! Scan back from the end past trailing spaces;
        ! ix > 0 so that an all-blank string trims to empty
        ix = len(s)
        while ix > 0 and mid$(s, ix, 1) = " "
            ix = ix - 1
        next
        trim = mid$(s, 1, ix)
    end function
    $ type b.bas
    sub b1(string s)
        print "|" + s + "|"
    end sub
    !
    sub b2(string s)
        s = "ABC"
    end sub
    $ type x.c
    #include <stdio.h>

    #include <descrip.h>

    /* Report the class of the string descriptor passed by reference. */
    void print_class(struct dsc$descriptor *s)
    {
        switch(s->dsc$b_class) {
            case DSC$K_CLASS_S:
                printf("Class S\n");
                break;
            case DSC$K_CLASS_VS:
                printf("Class VS\n");
                break;
            case DSC$K_CLASS_D:
                printf("Class D\n");
                break;
            default:
                printf("Unknown class\n");
                break;
        }
    }
    $ bas f
    $ bas b
    $ link f + b + x
    $ run f
    Class S
    |ABC|
    Class S
    |ABC|

    Arne

  • From Arne Vajhøj@21:1/5 to Scott Dorsey on Wed Feb 28 10:26:12 2024
    On 2/28/2024 9:56 AM, Scott Dorsey wrote:
    In article <urnggb$3trjg$1@dont-email.me>,
    Arne Vajhøj <arne@vajhoej.dk> wrote:
    On 2/26/2024 4:17 PM, Stephen Hoffman wrote:
    If the app code assumes a dynamic arriving and gets handed static, the
    RTL will either copy it, or space-pad the results into the static, or
    the RTL will return a string truncation error. BASIC space-pads into
    static string buffers if and as needed. Or truncates with an error.

    Apps expecting to work with dynamic descriptors might fail with the
    truncation error as mentioned, and apps expecting to massage static
    descriptors directly and not coded sufficiently cautiously around any
    arriving dynamic strings can fail with heap and other errors.

    Basic cannot stuff 200 bytes into a 100-byte fixed-length string. That
    is fair.

    Fortran can! And you likely won't notice that you have damaged some other memory until you get a SIGSEGV in some totally unrelated part of your code.

    :-)

    Note that VMS Fortran needs to be actively misled to do this.

    $ type bufovr.for
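          program bufovr
          character*4 s1, s2
          common /b/s1,s2
          write(*,*) %loc(s1), %loc(s2)
          s2 = 'XXXX'
          call subbo1(s1)
          write(*,*) s1//s2
          call subbo2(s1)
          write(*,*) s1//s2
          end
    c
    c     assumed length: uses the passed length (4), so the value is truncated
          subroutine subbo1(s)
          character*(*) s
          s = 'ABCDEFGH'
          end
    c
    c     declared length 8: trusts the declaration and overwrites s2 as well
          subroutine subbo2(s)
          character*8 s
          s = '12345678'
          end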
    $ for bufovr
    $ link bufovr
    $ run bufovr
    196608 196612
    ABCDXXXX
    12345678

    Arne

  • From Stephen Hoffman@21:1/5 to All on Wed Mar 6 18:44:29 2024
    On 2024-02-28 14:38:03 +0000, Arne Vajhøj said:

    On 2/26/2024 4:17 PM, Stephen Hoffman wrote:

    I also wouldn't expect the RTLs to work with encodings other than ASCII
    and DEC MCS, either. And UTF-8 will fail in the expected places, and
    most searching and sorting tends not to be sensitive to the (written)
    language used within the text string.

    I would assume that it works as long as the string is considered a
    sequence of bytes not a sequence of characters.

    The assumption that one byte is one character is embedded deeply in
    OpenVMS system and app code and APIs.

    I would assume that such code will break in various ways when presented
    with UTF-8.

    Anything assuming a correspondence between string length and displayed
    width is going to fail, for instance.

    That's before discussing sorting and searching and language
    differences, as was mentioned. And normalization.

    OpenVMS has (had) support for some of those differences with NCS and with
    ICU, though those APIs aren't (weren't) widely used by apps.


    --
    Pure Personal Opinion | HoffmanLabs LLC

  • From Lawrence D'Oliveiro@21:1/5 to Stephen Hoffman on Fri Mar 8 03:33:58 2024
    On Wed, 6 Mar 2024 18:44:29 -0500, Stephen Hoffman wrote:

    I would assume that such code will break in various ways when presented
    with UTF-8.

    Could be worse. Imagine if you had adopted Unicode at exactly that period
    in the early 1990s, like Windows NT and Java did, when it was still
    supposed to be a fixed-length 16-bit code. Then you would be saddled with
    that albatross known as UTF-16.
