• Reading unformatted data from file

    From Rudi Gaelzer@21:1/5 to All on Wed Feb 23 14:05:04 2022
    I think I still don't understand unformatted IO in Fortran...

    I wrote a program that writes the values of a complex array (vz) into a file. The file is opened for writing with unformatted and sequential attributes:
    open(unit= 100, file= 'file', form= 'unformatted')
    write(100) vz

    In another run of the program, on the same platform, I want to read that data into an array, without knowing beforehand the number of data records written in the file.
    I open the file again with:
    open(unit= 100, file= 'file', form= 'unformatted')

    Say I have a total of 100 complex numbers written in the file. If I allocate an array vz with 100 or fewer elements, say
    allocate(vz(10))
    and then read the data with either
    read(100)vz
    or
    read(100) (vz(i), i= 1, 10)
    the data seems to be read correctly.

    However, if I want to read the data and at the same time count the number of complex records, say with:
    integer :: i, npts
    complex :: z
    complex, dimension(:), allocatable :: vz
    npts= 0
    do
       read(100, iostat= i)z
       if(i < 0)exit
       vz= [ vz, z ] ; npts= npts + 1
    end do

    I get an error after the first record is read. Using inquire, the value of POSITION after the first record read is 'UNDEFINED' with the Intel compiler and 'APPEND' with gfortran.

    Does it mean that for an unformatted read, the input list is read only once, or am I making some stupid blunder here?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Rudi Gaelzer@21:1/5 to Rudi Gaelzer on Wed Feb 23 16:14:47 2022
    Well, a workaround I've found is to access:
    use iso_fortran_env, only: file_storage_size
    and then inquire:
    inquire(100, size= file_size)

    The number of records is then file_size/file_storage_size - 1. I guess I have to subtract the last entry which is the EOF record.

    Is there any other solution more elegant?

  • From gah4@21:1/5 to rgae...@gmail.com on Wed Feb 23 17:23:56 2022
    On Wednesday, February 23, 2022 at 4:14:52 PM UTC-8, rgae...@gmail.com wrote:

    (snip)
    Is there any other solution more elegant?

    I don't know about more elegant, but back to Fortran 66 days:

    WRITE(100) N,(VZ(I),I=1,N)

    then read with:

    READ(100) N,(VZ(I),I=1,N)

    The thing about unformatted is that you are allowed to read less than
    a whole record and, like formatted, the rest of the record is ignored.

    If it is written as one whole record, it has to be read back as one whole
    (or less than whole) record. Not as multiple records.

    In any case, the above is legal back to Fortran 66, and I presume still is.

    If you really need to, (and couldn't do this in Fortran 66):

    READ(100) N
    BACKSPACE 100
    ALLOCATE (VZ(N))
    READ(100) N,(VZ(I),I=1,N)

    Instead of doing that, I would write N as a separate record.

    You can make it more interesting with more than one array.

    READ(100) N,(X(I),I=1,N),M,(Y(I),I=1,M)
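
    A minimal sketch of the count-prefixed record described above, written in free-form Fortran. The file name vz_counted.dat, the sample values and the second array wz are assumptions made up for illustration; only the WRITE/READ/BACKSPACE pattern comes from the post.

```fortran
program count_prefixed_record
  implicit none
  integer, parameter :: lun = 100   ! unit number used in the thread
  integer :: i, n
  complex, allocatable :: vz(:), wz(:)

  ! Hypothetical sample data
  n = 100
  allocate (vz(n))
  vz = [(cmplx(i, -i), i = 1, n)]

  ! Write the count and the data together as ONE record
  open (unit=lun, file='vz_counted.dat', form='unformatted', status='replace')
  write (lun) n, (vz(i), i = 1, n)
  close (lun)

  ! Read only the count first; the rest of the record is skipped.
  ! Then step back one record, allocate, and read the record in full.
  open (unit=lun, file='vz_counted.dat', form='unformatted', status='old')
  read (lun) n
  backspace (lun)
  allocate (wz(n))
  read (lun) n, (wz(i), i = 1, n)
  close (lun)

  ! Self-check: the values read back must match what was written
  if (any(wz /= vz)) error stop 'round trip failed'
  print *, 'values read back:', n
end program count_prefixed_record
```

    Writing n in a record of its own, as suggested above, would remove the need for BACKSPACE: the first READ gets n, the second READ gets the array.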

  • From FortranFan@21:1/5 to rgae...@gmail.com on Wed Feb 23 21:48:56 2022
    On Wednesday, February 23, 2022 at 5:05:09 PM UTC-5, rgae...@gmail.com wrote:

    ..
    npts= 0
    do
    read(100, iostat= i)z
    if(i < 0)exit
    vz= [ vz, z ] ; npts= npts + 1
    end do

    I get an error after the first record is read. Using inquire, the value of POSITION after the first record read is 'UNDEFINED' with the Intel compiler and 'APPEND' with gfortran.
    ..

    Look into REWIND option in the language standard.

    Note READ instruction in your case above causes an IO transfer to the end of record. Subsequent READ is thus positioned wrongly.

  • From Thomas Koenig@21:1/5 to Rudi Gaelzer on Thu Feb 24 07:49:42 2022
    Rudi Gaelzer <rgaelzer@gmail.com> wrote:
    I think I still don't understand unformatted IO in Fortran...

    I wrote a program that writes the values of a complex array (vz)
    into a file. The file is opened for writing with unformatted and
    sequential attributes:

    open(unit= 100, file= 'file', form= 'unformatted')
    write(100) vz

    This writes out the whole array.

    In another run of the program, on the same platform, I want
    to read that data into an array, without knowing beforehand the
    number of data records written in the file.

    Terminology: Your write statement wrote out a single record,
    which contained data consisting of several values.

    The compiler has a representation for how it arranges
    unformatted data in the file. Some additional information
    is needed because the length of the record has to be stored.
    You can look at how gfortran does it at

    https://gcc.gnu.org/onlinedocs/gfortran/File-format-of-unformatted-sequential-files.html

    gfortran followed ifort in that respect.

    I open the file again with:
    open(unit= 100, file= 'file', form= 'unformatted')

    Yep.

    Say I have a total of 100 complex numbers written in the file.
    If I allocate an array vz with 100 or less elements, say

    allocate(vz(10))
    and then read the data with either
    read(100)vz
    or
    read(100) (vz(i), i= 1, 10)
    the data seems to be read correctly.

    That is as it should be. You are executing a sequential READ statement,
    and after the READ, the file is positioned at the next record.

    However, if I want to read the data and at the same time count the number of complex records, say with:
    integer :: i, npts
    complex :: z
    complex, dimension(:), allocatable :: vz
    npts= 0
    do
    read(100, iostat= i)z
    if(i < 0)exit
    vz= [ vz, z ] ; npts= npts + 1
    end do

    I get an error after the first record is read.

    Because you have written only one record, and are advancing to
    the next one, and there isn't one.

    What you are trying to do might be done better with an unformatted
    stream, which has no records (like a binary file in C).

    You can look at

    https://www.programming-idioms.org/idiom/228/copy-a-file

    (click on the Fortran tab) for an example on how to manipulate
    unformatted files, although what that example does is a bit
    different from what you want. It might give you a first
    idea, though.
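
    A minimal sketch of the unformatted stream idea, assuming the data file is also written with stream access (a file written with sequential access would contain record markers as well). The file name vz_stream.dat and the sample values are made up for illustration.

```fortran
program count_with_stream
  implicit none
  integer :: lun, ios, npts, i
  complex :: z
  complex, allocatable :: vz(:)

  ! Write 100 values as a stream: no record structure is added
  open (newunit=lun, file='vz_stream.dat', access='stream', &
        form='unformatted', status='replace')
  write (lun) [(cmplx(i, 0), i = 1, 100)]
  close (lun)

  ! Count the values by reading one at a time until the read fails
  open (newunit=lun, file='vz_stream.dat', access='stream', &
        form='unformatted', status='old', action='read')
  npts = 0
  do
    read (lun, iostat=ios) z
    if (ios /= 0) exit        ! end of file (or an error): stop counting
    npts = npts + 1
  end do

  ! Go back to the start and read everything in one statement
  allocate (vz(npts))
  rewind (lun)
  read (lun) vz
  close (lun)

  if (npts /= 100) error stop 'unexpected number of values'
  print *, 'values counted:', npts, ' last value:', vz(npts)
end program count_with_stream
```

    With stream access each READ continues where the previous one stopped, so the element-by-element loop from the original question behaves as intended.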

  • From Robin Vowels@21:1/5 to FortranFan on Thu Feb 24 02:09:33 2022
    On Thursday, February 24, 2022 at 4:49:01 PM UTC+11, FortranFan wrote:
    On Wednesday, February 23, 2022 at 5:05:09 PM UTC-5, rgae...@gmail.com wrote:

    ..
    npts= 0
    do
    read(100, iostat= i)z
    if(i < 0)exit
    vz= [ vz, z ] ; npts= npts + 1
    end do

    I get an error after the first record is read. Using inquire, the value of POSITION after the first record read is 'UNDEFINED' with the Intel compiler and 'APPEND' with gfortran.
    ..

    Look into REWIND option in the language standard.
    .
    That's irrelevant.
    .
    Note READ instruction in your case above causes an IO transfer to the end of record. Subsequent READ is thus positioned wrongly.
    .
    That's irrelevant. The program is in error to read the individual elements. See earlier postings.

  • From Rudi Gaelzer@21:1/5 to Robin Vowels on Thu Feb 24 05:32:18 2022
    Thanks for all replies. They confirm that the program writes the whole array as a continuous stream of bits, with the single addition of the EOF record.
    I consider the question as solved.

  • From Arjen Markus@21:1/5 to rgae on Thu Feb 24 07:28:05 2022
    On Thursday, February 24, 2022 at 2:32:23 PM UTC+1, rgae wrote:

    Thanks for all replies. They confirm that the program writes the whole array as a continuous stream of bits, with the single addition of the EOF record.
    I consider the question as solved.

    That is an inaccurate description and it can actually lead to further confusion, as there is a form of I/O that does produce a stream of bytes. You should understand that unformatted I/O uses logical records. After the last record, no information is left in the file, definitely not an EOF record. In fact the implementation of unformatted I/O should not be relied upon. It could be anything.

    Instead: simply consider such files to be built up from _logical_ records that have to be read one at a time. You can read the whole record or just a small part, but the READ statement will cause the file pointer to move to the next record. You cannot inquire the length of the record, that is information that has to come from somewhere else.

    Regards,

    Arjen

  • From Rudi Gaelzer@21:1/5 to arjen.m...@gmail.com on Fri Feb 25 07:28:50 2022
    On Thursday, February 24, 2022 at 12:28:10 PM UTC-3, arjen.m...@gmail.com wrote:
    That is an inaccurate description and it can actually lead to further confusion, as there is a form of I/O that does produce a stream of bytes. You should understand that unformatted I/O uses logical records. After the last record, no information is
    left in the file, definitely not an EOF record. In fact the implementation of unformatted I/O should not be relied upon. It could be anything.


    I am referring to unformatted IO with sequential access. I'm not considering direct or stream access in principle.
    In fact, this is another point I have doubts about. If I write a binary data file using sequential access, is it possible to read the data afterwards using either direct or stream access?

    Instead: simply consider such files to be built up from _logical_ records that have to be read one at a time. You can read the whole record or just a small part, but the READ statement will cause the file pointer to move to the next record. You cannot
    inquire the length of the record, that is information that has to come from somewhere else.

    Regards,

    Arjen

    Well, that may be so, but I can (and did) INQUIRE the size of the file and in all the tests I did, the result of file_size/file_storage_size (file_storage_size = 8) is equal to the actual number of complex (single precision) records written in binary
    form into the file PLUS one chunk of 8 bits.
    If I allocate the array with the correct number of complex elements and read the whole array with
    read(unit)vz
    the data is read correctly. So, that additional chunk of 8 bits must sit at the very end of the record.
    If this is not an EOF record, I don't know what it is.

  • From Thomas Koenig@21:1/5 to Rudi Gaelzer on Fri Feb 25 17:14:57 2022
    Rudi Gaelzer <rgaelzer@gmail.com> wrote:

    Well, that may be so, but I can (and did) INQUIRE the
    size of the file and in all the tests I did, the result of file_size/file_storage_size (file_storage_size = 8) is equal to
    the actual number of complex (single precision) records written
    in binary form into the file PLUS one chunk of 8 bits.

    8 bits or bytes?


    If I allocate the array with the correct number of complex elements and read the whole array with

    read(unit)vz

    the data is read correctly. So, that additional chunk of 8
    bits must sit at the very end of the record. If this is not a
    EOF record, I don't know what it is.

    You're correct, you do not know what it is.

    gfortran's unformatted file format (which I posted upthread) consists,
    on a physical level, of sequences of

    - one four-byte record marker containing the length of the data to
    follow
    - the data
    - one four-byte record marker containing the length of the record (same
    as the first one), used for BACKSPACE

    End of file is simply detected by the end-of-file condition from the
    operating system.

    On a logical level (as seen by the application program), the
    bytes in the file system are interpreted as records, including an
    ENDFILE record.
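
    The physical layout described above can be inspected from Fortran itself by reopening a sequential unformatted file with stream access. This is only a sketch under stated assumptions: 4-byte record markers (as documented for gfortran and ifort for records below the subrecord limit), the same platform for writing and reading, and a made-up file name vz_seq.dat. Other compilers may lay the file out differently.

```fortran
program inspect_record_markers
  use iso_fortran_env, only: int32, file_storage_size
  implicit none
  integer :: lun, i, n, units_per_value, file_size
  integer(int32) :: head, tail     ! ASSUMPTION: markers are 4-byte integers
  complex :: z
  complex, allocatable :: vz(:)

  ! Write one sequential unformatted record with 100 default complex values
  open (newunit=lun, file='vz_seq.dat', form='unformatted', status='replace')
  write (lun) [(cmplx(i, -i), i = 1, 100)]
  close (lun)

  ! Reopen the same file as a plain byte stream
  open (newunit=lun, file='vz_seq.dat', access='stream', &
        form='unformatted', status='old', action='read')
  inquire (unit=lun, size=file_size)   ! size in file storage units

  ! Leading marker: length of the data that follows
  read (lun) head
  units_per_value = storage_size(z) / file_storage_size
  n = head / units_per_value
  allocate (vz(n))

  ! The data, then the trailing marker
  read (lun) vz, tail
  close (lun)

  print *, 'leading marker :', head
  print *, 'trailing marker:', tail
  print *, 'values in record:', n
  print *, 'file size in storage units:', file_size
end program inspect_record_markers
```

    Under these assumptions, and with 8-byte default complex values, the file holds 4 + 800 + 4 = 808 bytes, which is 101 chunks of 8 bytes. That accounts for the "one extra" unit seen when dividing the file size by 8, without any EOF record being stored.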

  • From Arjen Markus@21:1/5 to rgae on Fri Feb 25 09:19:33 2022
    On Friday, February 25, 2022 at 4:28:53 PM UTC+1, rgae wrote:

    Well, that may be so, but I can (and did) INQUIRE the size of the file and in all the tests I did, the result of file_size/file_storage_size (file_storage_size = 8) is equal to the actual number of complex (single precision) records written in binary
    form into the file PLUS one chunk of 8 bits.
    If I allocate the array with the correct number of complex elements and read the whole array with
    read(unit)vz
    the data is read correctly. So, that additional chunk of 8 bits must sit at the very end of the record.
    If this is not a EOF record, I don't know what it is.

    No. I can understand your conclusion, but as a matter of fact there is a four-byte marker at the start of the record and the same one at the end. A program needs to be able to read to the end of the record (hence the first marker), and it needs to be able to go back one record (the BACKSPACE statement, although that was most useful for tape drives), which leads to the second marker. A file can have as many records as you like, and the records may contain any combination of bytes, so the extra bytes make this possible (they are never part of the data).

    Note: this is not the only possible scheme - I have seen several others over the years. And the scheme we are considering now is actually more complicated: records longer than 2 or 4 GB require a different approach!

    Regards,

    Arjen

  • From Rudi Gaelzer@21:1/5 to Thomas Koenig on Fri Feb 25 09:46:30 2022
    On Friday, February 25, 2022 at 2:15:01 PM UTC-3, Thomas Koenig wrote:
    8 bits or bytes?

    8 bits, according to the standard:
    16.10.2.11 FILE_STORAGE_SIZE
    1 The value of the default integer scalar constant FILE_STORAGE_SIZE is the size expressed in bits of the file
    storage unit (12.3.5).

  • From Rudi Gaelzer@21:1/5 to Thomas Koenig on Fri Feb 25 11:17:04 2022
    Indeed, I did not pay due attention to your previous explanation or to the manual, and so misinterpreted the results from my tests.
    If I understood correctly this time, by writing the whole array with write(unit)vz
    the total record is composed by 3 subrecords:
    1. the leading record marker, a 4 bytes integer with the number of bytes of the actual data
    2. the values of the data in binary form
    3. the trailing record marker, another integer of 4 bytes with the same value of the leading.
    Correct?
    If I write the array using an implied-do loop:
    write(unit) (vz(i), i= 1, n)
    will the record contain the same 3 subrecords, or will it be composed of several subrecords?

    Is the value in the leading marker then read by the statement
    inquire(unit, size= <size>)
    to return the total amount of data written in the file, in terms of the file storage unit?
    And is the value in the trailing marker employed by the backspace and rewind statements to go back in the file?

  • From Steve Lionel@21:1/5 to Arjen Markus on Fri Feb 25 16:33:28 2022
    On 2/25/2022 12:19 PM, Arjen Markus wrote:
    Note: this is not the only possible scheme - I have seen several others over the years. And the scheme we are considering now is actually more complicated: records longer than 2 or 4 GB require a different approach!

    At least two compilers (Intel Fortran and gfortran) use a different
    approach, with the MSB of the 32-bit length reserved to indicate a
    subrecord. This preserves compatibility with files that have records
    shorter than 2**31 bytes. The scheme is described in the section "Variable-Length Records" at https://intel.ly/3Ho9pKy
    --
    Steve Lionel
    ISO/IEC JTC1/SC22/WG5 (Fortran) Convenor
    Retired Intel Fortran developer/support
    Email: firstname at firstnamelastname dot com
    Twitter: @DoctorFortran
    LinkedIn: https://www.linkedin.com/in/stevelionel
    Blog: https://stevelionel.com/drfortran
    WG5: https://wg5-fortran.org

  • From JCampbell@21:1/5 to rgae...@gmail.com on Fri Feb 25 22:50:08 2022
    On Saturday, February 26, 2022 at 6:17:07 AM UTC+11, rgae...@gmail.com wrote:

    If I understood correctly this time, by writing the whole array with write(unit)vz
    the total record is composed by 3 subrecords:
    1. the leading record marker, a 4 bytes integer with the number of bytes of the actual data
    2. the values of the data in binary form
    3. the trailing record marker, another integer of 4 bytes with the same value of the leading.
    Correct?

    Yes, Fortran "unformatted" files consist of RECORDS. Each record is as you describe (up to records of 2^31 bytes).
    The important distinction is they are records, dressed in 3 components : record size header / data / record size trailer.
    Most compilers have the record size header as size in bytes as a 4-byte integer, but not all. The header is not defined.

    Fortran unformatted binary files are NOT portable between compilers or operating systems.
    (This is worse than the case of formatted text files between Dos / unix / linux OS)

    It can be very disappointing to find some compilers are different, which I regard as a failing of the standard.
    Just look at the myriad of data portability approaches that have been developed because of this failing of the many Fortran Standards to address data portability.
    Imagine if the standardised Fortran unformatted file format included some identification of the intrinsic data types in the header/footer.
    Unfortunately Fortran unformatted-files are just temporary files.

    Note:
    gFortran and iFort use a 4-byte header/footer to indicate the record size, (but what is size?)
    For large records, larger than 2^31 bytes, gFortran adopts "sub records" of smaller than 2^31 - 9 bytes(?) with special -ve record size header values.
    iFort might use a different approach for larger records and might use words instead of bytes for size.
    Silverfrost FTN95 uses a 1-byte header/footer for records smaller than 256 bytes, then a 5-byte header/footer for larger records.
    If only there was an adopted standard, unformatted files might be more portable !!

    If I write the array using an implied-do loop:
    write(unit) (vz(i), i= 1, n)
    will the record contain the same 3 subrecords, or will it be composed of several subrecords?

    Yes, this would be the same, if "n" was the dimension of vz. There is 1 record.

    Your earlier example is different.
    complex :: z
    do
    read(100, iostat= i)z
    if(i < 0)exit
    end do
    In this case, each read is reading the next record. It could read the first 8 bytes of the data in the record, then step to the next logical record.
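
    A small sketch of that behaviour: three records of two values each are written, and then one value is read per READ statement. The file name three_records.dat and the values are made up for illustration.

```fortran
program one_read_per_record
  implicit none
  integer :: lun, ios, nrec
  real :: x

  open (newunit=lun, file='three_records.dat', form='unformatted', &
        status='replace')
  write (lun) 1.0, 1.5      ! record 1: two values
  write (lun) 2.0, 2.5      ! record 2: two values
  write (lun) 3.0, 3.5      ! record 3: two values
  rewind (lun)

  nrec = 0
  do
    ! Each READ takes the first value of the current record and then
    ! skips to the start of the next record.
    read (lun, iostat=ios) x
    if (ios /= 0) exit
    nrec = nrec + 1
    print *, 'record', nrec, ' first value', x
  end do
  close (lun)
end program one_read_per_record
```

    The loop should report three records with first values 1.0, 2.0 and 3.0; the second value of each record is never transferred. Applied to a file holding a single record, the same loop gets one value and then hits end of file, which is what happened in the original question.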

    Both "fixed length record direct access" files and "stream I/O" files do not dress the data in record header/footer.
    As such these are more portable, although can vary due to endian format.
    I use "fixed length record direct access" files for transferring data between codes that are compiled with different compilers,
    mainly for integer and real arrays.
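
    A minimal sketch of such a fixed-length direct access file. The file name direct.dat, the array size and the values are assumptions for illustration. The record length is obtained with INQUIRE(IOLENGTH=) because the units of RECL are processor dependent.

```fortran
program direct_access_exchange
  implicit none
  integer, parameter :: n = 4
  integer :: lun, reclen, k
  real :: a(n), b(n)

  ! Ask the processor how long a record holding "a" must be,
  ! in whatever units it uses for RECL=
  inquire (iolength=reclen) a

  open (newunit=lun, file='direct.dat', access='direct', &
        form='unformatted', recl=reclen, status='replace')

  ! Write three fixed-length records, addressed by record number
  do k = 1, 3
    a = real(k)
    write (lun, rec=k) a
  end do

  ! Records can be read back in any order
  read (lun, rec=2) b
  close (lun)

  if (any(b /= 2.0)) error stop 'unexpected record contents'
  print *, 'record 2:', b
end program direct_access_exchange
```

    The reading program has to use the same record length and the same data layout as the writer; nothing in the file describes either.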

    I have no experience of how this record dressing could apply to derived type data structures.
    I will write out the intrinsic type components as individual elements of an I/O list or in separate write statement records.
    This approach provides clarity to the structure of the records being written.

  • From Robin Vowels@21:1/5 to JCampbell on Sat Feb 26 00:12:19 2022
    On Saturday, February 26, 2022 at 5:50:10 PM UTC+11, JCampbell wrote:
    On Saturday, February 26, 2022 at 6:17:07 AM UTC+11, rgae...@gmail.com wrote:

    If I understood correctly this time, by writing the whole array with write(unit)vz
    the total record is composed by 3 subrecords:
    1. the leading record marker, a 4 bytes integer with the number of bytes of the actual data
    2. the values of the data in binary form
    3. the trailing record marker, another integer of 4 bytes with the same value of the leading.
    Correct?
    Yes, Fortran "unformatted" files consist of RECORDS. Each record is as you describe (up to records of 2^31 bytes).
    The important distinction is they are records, dressed in 3 components : record size header / data / record size trailer.
    Most compilers have the record size header as size in bytes as a 4-byte integer, but not all. The header is not defined.

    Fortran unformatted binary files are NOT portable between compilers or operating systems.
    (This is worse than the case of formatted text files between Dos / unix / linux OS)

    It can be very disappointing to find some compilers are different, which I regard as a failing of the standard.
    Just look at the myriad of data portability approaches that have been developed because of this failing of the many Fortran Standards to address data portability.
    Imagine if the standardised Fortran unformatted file format included some identification of the intrinsic data types in the header/footer.
    Unfortunately Fortran unformatted-files are just temporary files.
    .
    What?
    .
    Note:
    gFortran and iFort use a 4-byte header/footer to indicate the record size, (but what is size?)
    For large records, larger than 2^31 bytes, gFortran adopts "sub records" of smaller than 2^31 - 9 bytes(?) with special -ve record size header values.
    iFort might use a different approach for larger records and might use words instead of bytes for size.
    Silverfrost FTN95 uses a 1-byte header/footer for records smaller than 256 bytes, then a 5-byte header/footer for larger records.
    If only there was an adopted standard, unformatted files might be more portable !!
    If I write the array using an implied-do loop:
    write(unit) (vz(i), i= 1, n)
    will the record contain the same 3 subrecords, or will it be composed of several subrecords?

    Yes, this would be the same, if "n" was the dimension of vz. There is 1 record.

    Your earlier example is different.
    complex :: z
    do
    read(100, iostat= i)z
    if(i < 0)exit
    end do
    In this case, each read is reading the next record. It could read the first 8 bytes of the data in the record, then step to the next logical record.

    Both "fixed length record direct access" files and "stream I/O" files do not dress the data in record header/footer.
    .
    They both have unspecified leading bytes at the start of the file.
    .
    As such these are more portable, although can vary due to endian format.
    I use "fixed length record direct access" files for transferring data between codes that are compiled with different compilers,
    mainly for integer and real arrays.

    I have no experience of how this record dressing could apply to derived type data structures.
    I will write out the intrinsic type components as individual elements of an I/O list or in separate write statement records.
    This approach provides clarity to the structure of the records being written.

  • From Thomas Koenig@21:1/5 to JCampbell on Sat Feb 26 08:36:17 2022
    JCampbell <campbelljohnd01@gmail.com> wrote:
    On Saturday, February 26, 2022 at 6:17:07 AM UTC+11, rgae...@gmail.com wrote:

    If I understood correctly this time, by writing the whole array with
    write(unit)vz
    the total record is composed by 3 subrecords:
    1. the leading record marker, a 4 bytes integer with the number of bytes of the actual data
    2. the values of the data in binary form
    3. the trailing record marker, another integer of 4 bytes with the same value of the leading.
    Correct?

    Yes, Fortran "unformatted" files consist of RECORDS. Each record
    is as you describe (up to records of 2^31 bytes).

    The important distinction is they are records, dressed in 3
    components : record size header / data / record size trailer.

    In general.

    Most compilers have the record size header as size in bytes as
    a 4-byte integer, but not all.

    In practice: Which ones don't?

    gfortran used a "long int" (which meant that file formats were
    different in 32-bit and 64-bit systems, which was clearly bad).
    In 2006, this was changed to a 4-byte integer with a negative
    value indicating continuation, following ifort's lead.

    The header is not defined.

    Not by the standard.

    Fortran unformatted binary files are NOT portable between compilers or operating systems.

    $ cat write.f90
    program main
    implicit none
    open (10,file="tst.dat",form="unformatted",action="write")
    write (10) sqrt(2.)
    end program main
    $ nagfor write.f90 && ./a.out && od -t x1 tst.dat
    NAG Fortran Compiler Release 7.1(Hanzomon) Build 7101
    [NAG Fortran Compiler normal termination]
    0000000 04 00 00 00 f3 04 b5 3f 04 00 00 00
    0000014
    $ gfortran write.f90 && ./a.out && od -t x1 tst.dat
    0000000 04 00 00 00 f3 04 b5 3f 04 00 00 00
    0000014

    and, on a POWER9 little-endian machine,

    $ xlf write.f90 && ./a.out && od -t x1 tst.dat
    ** main === End of Compilation 1 ===
    1501-510 Compilation successful for file write.f90.
    0000000 04 00 00 00 f3 04 b5 3f 04 00 00 00
    0000014

    As long as you are sticking to record lengths below
    2GiB-9 bytes, you should see no problem.

    (This is worse than the case of formatted text files between Dos /
    unix / linux OS)

    Not really (IMHO).

    It can be very disappointing to find some compilers are different, which I regard as a failing of the standard.
    Just look at the myriad of data portability approaches that have been developed because of this failing of the many Fortran Standards to address data portability.
    Imagine if the standardised Fortran unformatted file format included some identification of the intrinsic data types in the header/footer.
    Unfortunately Fortran unformatted-files are just temporary files.

    Note:
    gFortran and iFort use a 4-byte header/footer to indicate the
    record size, (but what is size?)

    Documented for both compilers, it's bytes.

    For large records, larger than 2^31 bytes, gFortran adopts "sub
    records" of smaller than 2^31 - 9 bytes(?) with special -ve record
    size header values.

    iFort might use a different approach for larger records and might use words instead of bytes for size.

    It doesn't. This is also not an accident: When the current scheme
    for gfortran was adopted, the implementor looked at existing
    implementations and chose ifort's scheme, because it offered the
    maximum flexibility for larger records and was compatible with
    4-byte schemes adopted by other compilers as long as the record
    length was small enough.

    Silverfrost FTN95 uses a 1-byte header/footer for records smaller
    than 256 bytes, then a 5-byte header/footer for larger records.

    Interesting to see that they still are in business, haven't heard
    from that compiler for a long time.

    After a bit of googling for the actual format, it seems a rather
    trivial matter to write a conversion program between the
    two formats. However, there is a better solution.

    If only there was an adopted standard, unformatted files might
    be more portable !!

    There is a way to generate portable unformatted files in Fortran,
    and has been since the adoption of the Fortran 2003 standard:
    Unformatted streams.

    These are much like C's binary files, there is no record structure,
    and you can read and write to your heart's content, and there is
    no problem interchanging the data (unless you run into big/little
    endian issues). Just be sure to remember that writes don't truncate :-)
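
    A short sketch comparing the two kinds of file for the same single value, sqrt(2.), used in the od listings above. The file names tst_seq.dat and tst_stream.dat are made up for illustration, and the sizes are reported in file storage units (bytes on the compilers discussed here).

```fortran
program sequential_vs_stream_size
  implicit none
  integer :: lun, size_seq, size_str

  ! Sequential unformatted: the processor wraps the data in a record
  open (newunit=lun, file='tst_seq.dat', form='unformatted', &
        status='replace')
  write (lun) sqrt(2.)
  close (lun)

  ! Unformatted stream: only the data itself is written
  open (newunit=lun, file='tst_stream.dat', access='stream', &
        form='unformatted', status='replace')
  write (lun) sqrt(2.)
  close (lun)

  inquire (file='tst_seq.dat', size=size_seq)
  inquire (file='tst_stream.dat', size=size_str)

  print *, 'sequential file size:', size_seq
  print *, 'stream file size    :', size_str
end program sequential_vs_stream_size
```

    With the 4-byte markers shown in the od output, the sequential file is 12 bytes long, while the stream file would be expected to hold just the 4 data bytes of a default real.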

  • From gah4@21:1/5 to JCampbell on Sat Feb 26 06:25:03 2022
    On Friday, February 25, 2022 at 10:50:10 PM UTC-8, JCampbell wrote:

    (snip)

    Fortran unformatted binary files are NOT portable between compilers or operating systems.
    (This is worse than the case of formatted text files between Dos / unix / linux OS)

    It can be very disappointing to find some compilers are different, which I regard as a failing of the standard.
    Just look at the myriad of data portability approaches that have been developed
    because of this failing of the many Fortran Standards to address data portability.
    Imagine if the standardised Fortran unformatted file format included some identification of the intrinsic data types in the header/footer. Unfortunately Fortran unformatted-files are just temporary files.

    To standardize them, you would need to standardize the underlying file system.

    Not all OS supply a file system where files are an unorganized sequence of bytes.
    (Well, even more, not all use bytes, but that is a different question.)

    OS/360, and still with current z/OS, the file system itself has records and blocks,
    and there is a special format that seems to have been designed for Fortran, though
    usable by other languages. (Though maybe not C.) It is called VBS, for variable
    blocked spanned. The OS itself keeps track of records and blocks, and supplies the record and block headers. (No trailers, as they aren't needed to BACKSPACE.)

    For Unix, the OS keeps track of blocking, such that files look like an unstructured
    series of bytes, but for tapes, files do have blocks. When you read a tape on Unix,
    you ask for some number of bytes, and the OS gives you up to that many, ignoring any of the rest in the block, and returns the number read.

    I have not tried UNFORMATTED on tapes on a recent OS, so I am not sure
    how it works.

    Back close to the beginning of tapes, tape drives knew how to read backwards. (Many modern drives don't know how to do that.) With read backwards, the
    OS (and hardware) fill the buffer from the end toward the beginning,
    stopping at the end of the block, or the beginning of the buffer. That was mostly used for sort algorithms, as it saved the rewind time for ones
    that used multiple passes through the data. (That is, external sorts.)

    As far as I know, Fortran still allows for systems with word sizes
    not a multiple of 8 bits. There is modern hardware that will run the
    TOPS-20 36 bit OS, though I am not so sure how modern the Fortran
    compilers are for it.

  • From gah4@21:1/5 to Thomas Koenig on Thu Mar 3 16:08:18 2022
    On Wednesday, February 23, 2022 at 11:49:46 PM UTC-8, Thomas Koenig wrote:

    (snip)

    What you are trying to do might be done better with an unformatted
    stream, which has no records (like a binary file in C).

    You can look at

    https://www.programming-idioms.org/idiom/228/copy-a-file

    (click on the Fortran tab) for an example on how to manipulate
    unformatted files, although what that example does is a bit
    different from what you want. It might give you a first
    idea, though.

    Note that with UNFORMATTED (or, for that matter, FORMATTED),
    and STREAM or not, the program reading the data in needs to know how
    long it is. In some cases, that is always known, but in most it is best to write the length into the file.

    As I noted above, and it should work for any type of Fortran I/O,
    you can read the length and data in the same statement:

    READ(5,*) N,(X(I), I=1,N),(Y(I), I=1,N)

    In Fortran 66 and 77 days, before dynamic allocation, that was fine.

    If you want to allocate before the read, you can write the length on
    a separate record.

    READ(5,*) N
    ALLOCATE (X(N),Y(N))
    READ(5,*) (X(I), I=1,N),(Y(I), I=1,N)

    and that works about as well for UNFORMATTED, and STREAM or not.
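
    A minimal sketch of that length-first layout for an UNFORMATTED
    sequential file, with the write side included so it is
    self-contained. The file name xy.dat, the program name and the
    sample values are assumptions for illustration only.

    ```fortran
    program length_prefixed
      implicit none
      real, allocatable :: x(:), y(:)
      integer :: u, n, i

      ! Writer: first record holds the length, second record the data.
      n = 5
      open(newunit=u, file='xy.dat', form='unformatted', status='replace')
      write(u) n
      write(u) [(real(i), i = 1, n)], [(real(i)**2, i = 1, n)]
      close(u)

      ! Reader: get the length, allocate, then read the data record.
      open(newunit=u, file='xy.dat', form='unformatted', status='old')
      read(u) n
      allocate(x(n), y(n))
      read(u) x, y
      close(u)

      print *, 'n =', n, ' last y =', y(n)
    end program length_prefixed
    ```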

    For some file formats, you can write all the lengths out at first,
    and then all the data using lengths:

    READ(5,*) L, M, N
    READ(5,*) (X(I), I=1,L),(Y(I), I=1,M),(Z(I),I=1,N)

    or almost any other combination of reading that you like.

    One that you can't do, that I tried in Fortran-77 days, and even
    had the form to report bugs to DEC:

    READ(5,*,END=10) (X(I),Y(I),I=1,M)
    I=M+1
    10 N=I-1

    The standard does not make any guarantees about the
    data read in, in the case that END= is taken.
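
    One way to stay on defined ground, sketched under the assumption
    that the input has exactly one X,Y pair per line (so it is not a
    drop-in replacement for the single READ above), is to read each pair
    into scalars and store it only after the READ has succeeded. The
    names xt, yt and the bound m = 1000 are illustrative.

    ```fortran
    program count_pairs
      implicit none
      integer, parameter :: m = 1000
      real :: x(m), y(m), xt, yt
      integer :: n, ios

      n = 0
      do while (n < m)
        ! Each READ starts a new record; a nonzero ios means EOF or error.
        read(*, *, iostat=ios) xt, yt
        if (ios /= 0) exit
        ! Store only after a successful read, so x(1:n) and y(1:n)
        ! never depend on a transfer that was cut short.
        n = n + 1
        x(n) = xt
        y(n) = yt
      end do

      print *, 'pairs read:', n
    end program count_pairs
    ```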

  • From Ron Shepard@21:1/5 to All on Fri Mar 4 00:39:56 2022
    On 3/3/22 6:08 PM, gah4 wrote:
    [...]
    If you want to allocate before the read, you can write the length on
    a separate record.

    READ(5,*) N
    ALLOCATE (X(N),Y(N))
    READ(5,*) (X(I), I=1,N),(Y(I), I=1,N)

    These days one would simply use

    read(5,*) x, y

    or

    read(5,*) x(1:n), y(1:n)


    and that works about as well for UNFORMATTED, and STREAM or not.

    For some file formats, you can write all the lengths out at first,
    and then all the data using lengths:

    READ(5,*) L, M, N
    READ(5,*) (X(I), I=1,L),(Y(I), I=1,M),(Z(I),I=1,N)

    Same here

    read(5,*) x(1:L), y(1:M), z(1:N)

    or something similar.

    or almost any other combination of reading that you like.

    One that you can't do, that I tried in Fortran-77 days, and even
    had the form to report bugs to DEC:

    READ(5,*,END=10) (X(I),Y(I),I=1,M)
    I=M+1
    10 N=I-1

    The standard does not make any guarantees about the
    data read in, in the case that END= is taken.

    This is because some systems buffer the records, and if a hardware eof
    is encountered, then the status of the arrays and of the implied do
    variable might be in some undefined state. The i/o library might have
    been trying to read the blocks in reverse order, for example, or with
    cylinder reads on a hard disk which would fill in the arrays with gaps.
    So the standard just punted and said that nothing is defined in order to
    give the i/o library the maximum flexibility.

    $.02 -Ron Shepard

  • From gah4@21:1/5 to Ron Shepard on Fri Mar 4 01:09:29 2022
    On Thursday, March 3, 2022 at 10:40:00 PM UTC-8, Ron Shepard wrote:
    On 3/3/22 6:08 PM, gah4 wrote:

    If you want to allocate before the read, you can write the length on
    a separate record.

    READ(5,*) N
    ALLOCATE (X(N),Y(N))
    READ(5,*) (X(I), I=1,N),(Y(I), I=1,N)
    These days one would simply use

    read(5,*) x, y

    Only if you are reading the whole array, but yes in this case.
    I had previously written it without the allocate, as the arrays
    could already be big enough.

    or

    read(5,*) x(1:n), y(1:n)

    Does this work exactly the same? I suppose so.
    I was too used to doing this back in Fortran 66 days.

    and that works about as well for UNFORMATTED, and STREAM or not.

    For some file formats, you can write all the lengths out at first,
    and then all the data using lengths:

    READ(5,*) L, M, N
    READ(5,*) (X(I), I=1,L),(Y(I), I=1,M),(Z(I),I=1,N)
    Same here

    read(5,*) x(1:L), y(1:M), z(1:N)

    or something similar.
    or almost any other combination of reading that you like.

    One that I thought of, but never did, in the olden days:

    READ(5,*) L, (X(MIN(I,L)),I=1,L)

    which avoids going outside the array, if the L read in is
    larger than expected. Not so easy to write that the other way.

    One that you can't do, that I tried in Fortran-77 days, and even
    had the form to report bugs to DEC:

    READ(5,*,END=10) (X(I),Y(I),I=1,M)
    I=M+1
    10 N=I-1

    The standard does not make any guarantees about the
    data read in, in the case that END= is taken.

    This is because some systems buffer the records, and if a hardware eof
    is encountered, then the status of the arrays and of the implied do
    variable might be in some undefined state. The i/o library might have
    been trying to read the blocks in reverse order, for example, or with cylinder reads on a hard disk which would fill in the arrays with gaps.
    So the standard just punted and said that nothing is defined in order to
    give the i/o library the maximum flexibility.

    Yes. That was before I ever saw a real copy of the standard.
    The only thing I had was the VAX/VMS manual, which might have said
    that if I read it close enough. I do know that I never sent in the bug
    report, but now I am not sure why.

  • From Robin Vowels@21:1/5 to All on Fri Mar 4 02:31:21 2022
    On Friday, March 4, 2022 at 8:09:32 PM UTC+11, gah4 wrote:
    On Thursday, March 3, 2022 at 10:40:00 PM UTC-8, Ron Shepard wrote:
    On 3/3/22 6:08 PM, gah4 wrote:

    If you want to allocate before the read, you can write the length on
    a separate record.

    READ(5,*) N
    ALLOCATE (X(N),Y(N))
    READ(5,*) (X(I), I=1,N),(Y(I), I=1,N)
    These days one would simply use

    read(5,*) x, y
    Only if you are reading the whole array, but yes in this case.
    I had previously written it without the allocate, as the arrays
    could already be big enough.
    or

    read(5,*) x(1:n), y(1:n)
    Does this work exactly the same? I suppose so.
    I was too used to doing this back in Fortran 66 days.
    and that works about as well for UNFORMATTED, and STREAM or not.

    For some file formats, you can write all the lengths out at first,
    and then all the data using lengths:

    READ(5,*) L, M, N
    READ(5,*) (X(I), I=1,L),(Y(I), I=1,M),(Z(I),I=1,N)
    Same here

    read(5,*) x(1:L), y(1:M), z(1:N)

    or something similar.
    or almost any other combination of reading that you like.
    One that I thought of, but never did, in the olden days:

    READ(5,*) L, (X(MIN(I,L)),I=1,L)

    which avoids going outside the array,
    .
    It does? How does it do that?
    .
    It still goes outside the array.
    .
    if the L read in is
    larger than expected. Not so easy to write that the other way.

  • From Robin Vowels@21:1/5 to All on Fri Mar 4 02:25:49 2022
    On Friday, March 4, 2022 at 11:08:20 AM UTC+11, gah4 wrote:
    On Wednesday, February 23, 2022 at 11:49:46 PM UTC-8, Thomas Koenig wrote:

    (snip)
    What you are trying to do might be done better with an unformatted
    stream, which has no records (like a binary file in C).

    You can look at

    https://www.programming-idioms.org/idiom/228/copy-a-file

    (click on the Fortran tab) for an example on how to manipulate
    unformatted files, although what that example does is a bit
    different from what you want. It might give you a first
    idea, though.
    Note that with UNFORMATTED (or, for that matter, FORMATTED),
    and STREAM or not, the program reading the data in needs to know how
    long it is. In some cases, that is always known, but in most it is best to write the length into the file.

    As I noted above, and it should work for any type of Fortran I/O,
    you can read the length and data in the same statement:

    READ(5,*) N,(X(I), I=1,N),(Y(I), I=1,N)

    In Fortran 66 and 77 days, before dynamic allocation, that was fine.

    If you want to allocate before the read, you can write the length on
    a separate record.

    READ(5,*) N
    ALLOCATE (X(N),Y(N))
    READ(5,*) (X(I), I=1,N),(Y(I), I=1,N)

    and that works about as well for UNFORMATTED, and STREAM or not.

    For some file formats, you can write all the lengths out at first,
    and then all the data using lengths:

    READ(5,*) L, M, N
    READ(5,*) (X(I), I=1,L),(Y(I), I=1,M),(Z(I),I=1,N)

    or almost any other combination of reading that you like.

    One that you can't do, that I tried in Fortran-77 days, and even
    had the form to report bugs to DEC:

    READ(5,*,END=10) (X(I),Y(I),I=1,M)
    I=M+1
    10 N=I-1

    The standard does not make any guarantees about the
    data read in, in the case that END= is taken.

    If you want to do something like that, PL/I offers:
    ON ENDFILE (SYSIN) GO TO ...
    GET LIST ( (X(I) DO I = 1 TO N) );
    and the value of I can be used to tell you how many values were read. Alternatively, the built-in function COUNT tells you how many values were read in.

  • From Ron Shepard@21:1/5 to All on Fri Mar 4 10:49:29 2022
    On 3/4/22 3:09 AM, gah4 wrote:
    One that I thought of, but never did, in the olden days:

    READ(5,*) L, (X(MIN(I,L)),I=1,L)

    which avoids going outside the array, if the L read in is
    larger than expected. Not so easy to write that the other way.

    By "outside" the array, I'm assuming you mean negative values of L.
    However, the implied do loop itself prevents those array elements from
    being accessed, so I do not think the min(I,L) expression actually does anything in that i/o list. If it were min(I,M) where M was the declared
    array upper bound, then yes, it would prevent accesses outside the array
    while still processing the correct number of elements during the data
    transfer.

    One can also skip over data within a record.

    READ(5,*) L, X(1:L), M, (IDUM,i=1,M), N

    This reads the indicated X(1:L) elements, then it reads M the number of elements to skip over, and then it gets the correct N value from the
    correct position within the input record. Like magic. I have also seen
    this written as

    READ(5,*) L, X(1:L), N, (N,i=1,N), N

    or as

    READ(5,*) L, X(1:L), N, (N,i=1,N+1)

    which is much more obscure, but accomplishes the same thing. As far as I
    know, that is legal and the operations are well defined, but it does
    cause some head scratching the first time you see it to figure out which
    N is changing and which N is not within that implied do loop.

    I also think that the array slice notation works for i/o the same as the implied do loop. Consider the following:

    implicit none
    integer :: n, a(3)
    a = -1
    read(*,*) n, a(1:n)
    write(*,*) n, a
    end

    Here, the (1:n) that is active during the read statement is the n that
    was just read. If n is <=0, then nothing is transferred, and if n>3,
    then elements that are out of bounds are accessed, generating a runtime
    i/o error. That is the same thing that happens with the equivalent
    implied do loop expression. However, I don't know how to skip over data
    in the input record with array slice notation, I think you must use the
    implied do loop for that.

    Such i/o statements can result in very complicated logic. Consider an
    i/o list like

    n, a(1:n), m, b(n:m)

    That is legal and works just like you would expect it to work. That is,
    the number of elements that are transferred and where they go depend on
    the values that are read within that data transfer. In most other
    languages, the programmer would need to manually buffer the data during the
    i/o statement, and then transfer that information into a(:) and b(:) afterwards, or execute the parts of the statement separately with a
    series of i/o operations.
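
    A small sketch of such a self-describing list, reading from an
    internal file so that it needs no input. The array sizes, the
    contents of the string and the program name are made up for
    illustration.

    ```fortran
    program dependent_list
      implicit none
      integer :: n, m, a(4), b(6)
      character(len=32) :: line

      a = -1
      b = -1
      line = '2 10 20 4 30 40 50'

      ! n is read first and then sizes a(1:n); m is read next and,
      ! together with n, selects the section b(n:m).
      read(line, *) n, a(1:n), m, b(n:m)

      print *, 'n, m =', n, m
      print *, 'a =', a
      print *, 'b =', b
    end program dependent_list
    ```

    With this input the two values after n go to a(1:2) and the three
    values after m go to b(2:4); the other elements keep their -1 fill.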

    $.02 -Ron Shepard

  • From Robin Vowels@21:1/5 to Ron Shepard on Fri Mar 4 17:52:12 2022
    On Saturday, March 5, 2022 at 3:49:35 AM UTC+11, Ron Shepard wrote:
    On 3/4/22 3:09 AM, gah4 wrote:
    One that I thought of, but never did, in the olden days:

    READ(5,*) L, (X(MIN(I,L)),I=1,L)

    which avoids going outside the array, if the L read in is
    larger than expected. Not so easy to write that the other way.
    By "outside" the array, I'm assuming you mean negative values of L.
    .
    No.
    In the golden olden days, a loop is executed at least once.
    So, if L were negative, say, -3, X(-3) would be stored.
    .
    However, the implied do loop itself prevents those array elements from
    being accessed, so I do not think the min(I,L) expression actually does anything in that i/o list. If it were min(I,M) where M was the declared
    array upper bound, then yes, it would prevent accesses outside the array while still processing the correct number of elements during the data transfer.

    One can also skip over data within a record.

    READ(5,*) L, X(1:L), M, (IDUM,i=1,M), N

    This reads the indicated X(1:L) elements, then it reads M the number of elements to skip over, and then it gets the correct N value from the
    correct position within the input record. Like magic. I have also seen
    this written as

    READ(5,*) L, X(1:L), N, (N,i=1,N), N

    or as

    READ(5,*) L, X(1:L), N, (N,i=1,N+1)

    which is much more obscure, but accomplishes the same thing. As far as I know, that is legal and the operations are well defined, but it does
    cause some head scratching the first time you see it to figure out which
    N is changing and which N is not within that implied do loop.

    I also think that the array slice notation works for i/o the same as the implied do loop. Consider the following:

    implicit none
    integer :: n, a(3)
    a = -1
    read(*,*) n, a(1:n)
    write(*,*) n, a
    end

    Here, the (1:n) that is active during the read statement is the n that
    was just read. If n is <=0, then nothing is transferred, and if n>3,
    then elements that are out of bounds are accessed, generating a runtime
    i/o error.
    .
    Perhaps, but not necessarily. Fortran does not have automatic
    array bounds checking as does PL/I.
    And in any case, it would not be an I/O error, but a storage error.
    .
    That is the same thing that happens with the equivalent
    implied do loop expression. However, I don't know how to skip over data
    in the input record with array slice notation, I think you must use the implied do loop for that.

    Such i/o statements can result in very complicated logic. Consider an
    i/o list like

    n, a(1:n), m, b(n:m)

    That is legal and works just like you would expect it to work. That is,
    the number of elements that are transferred and where they go depend on
    the values that are read within that data transfer. In most other
    languages, the programmer would need to manually buffer the data during the
    i/o statement, and then transfer that information into a(:) and b(:) afterwards, or execute the parts of the statement separately with a
    series of i/o operations.

    Not so with PL/I. What you see is what you get.

  • From gah4@21:1/5 to Ron Shepard on Fri Mar 4 21:57:32 2022
    On Friday, March 4, 2022 at 8:49:35 AM UTC-8, Ron Shepard wrote:
    On 3/4/22 3:09 AM, gah4 wrote:
    One that I thought of, but never did, in the olden days:

    READ(5,*) L, (X(MIN(I,L)),I=1,L)

    which avoids going outside the array, if the L read in is
    larger than expected. Not so easy to write that the other way.

    By "outside" the array, I'm assuming you mean negative values of L.
    However, the implied do loop itself prevents those array elements from
    being accessed, so I do not think the min(I,L) expression actually does anything in that i/o list. If it were min(I,M) where M was the declared
    array upper bound, then yes, it would prevent accesses outside the array while still processing the correct number of elements during the data transfer.

    Oh, yes, that is what I meant.

    I think now you can:

    READ(1) L, (X(I),I=1,MIN(L,M))

    But not in Fortran 66, and I think not in Fortran 77 days.
    (and back to UNFORMATTED, again)
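
    A minimal sketch of that clamped read on an UNFORMATTED sequential
    file, written so that the stored count (5) deliberately exceeds the
    array bound M = 3. The file name clamp.dat and the values are
    illustrative assumptions.

    ```fortran
    program clamp_read
      implicit none
      integer, parameter :: m = 3
      integer :: x(m), l, i, u

      ! One record: a count of 5 followed by five values.
      open(newunit=u, file='clamp.dat', form='unformatted', status='replace')
      write(u) 5, 10, 20, 30, 40, 50
      close(u)

      x = -1
      open(newunit=u, file='clamp.dat', form='unformatted', status='old')
      ! l is read first; the implied-do bound min(l, m) then uses it,
      ! so at most m elements of x are ever referenced.
      read(u) l, (x(i), i = 1, min(l, m))
      close(u)

      print *, 'stored count:', l
      print *, 'x =', x
    end program clamp_read
    ```

    The values beyond the first M are simply left unread in the record;
    comparing l with M afterwards tells the program that data was
    dropped.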

    Also, Fortran 66 doesn't allow DO or implied DO loops for 0 items.

    And for those who worry about array bounds, and also do C programming,
    how big does S need to be, not to overflow?

    sprintf(s, "%f", x);

    for double precision x? (On any machine.)

  • From Steve Lionel@21:1/5 to All on Sat Mar 5 12:40:24 2022
    On 3/5/2022 12:57 AM, gah4 wrote:
    Also, Fortran 66 doesn't allow DO or implied DO loops for 0 items.

    The wording of FORTRAN 66 is:

    "The control variable is assigned the value represented by the initial parameter. This value must be less than or equal to the value
    represented by the terminal parameter."

    In other words, if the loop would execute zero times, the program is non-conformant to the standard. Practically speaking, F66 compilers
    didn't bother checking this, so the behavior ended up being 1-trip and
    most programmers assumed that's how the language defined it.

    F77 does have zero-trip DO loops, but not for data-implied-DO.
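
    For comparison, a tiny sketch of the zero-trip behavior that
    FORTRAN 77 and later define for an ordinary DO loop (the variable
    names are illustrative):

    ```fortran
    program zero_trip
      implicit none
      integer :: i, n, count

      n = 0
      count = 0
      ! Initial value 1 exceeds terminal value 0, so the body never runs.
      do i = 1, n
        count = count + 1
      end do

      print *, 'iterations:', count
    end program zero_trip
    ```

    Here count stays 0, whereas a typical one-trip F66 implementation
    would have executed the body once.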

    --
    Steve Lionel
    ISO/IEC JTC1/SC22/WG5 (Fortran) Convenor
    Retired Intel Fortran developer/support
    Email: firstname at firstnamelastname dot com
    Twitter: @DoctorFortran
    LinkedIn: https://www.linkedin.com/in/stevelionel
    Blog: https://stevelionel.com/drfortran
    WG5: https://wg5-fortran.org

  • From gah4@21:1/5 to Steve Lionel on Sat Mar 5 10:35:29 2022
    On Saturday, March 5, 2022 at 9:40:29 AM UTC-8, Steve Lionel wrote:
    On 3/5/2022 12:57 AM, gah4 wrote:
    Also, Fortran 66 doesn't allow DO or implied DO loops for 0 items.
    The wording of FORTRAN 66 is:

    "The control variable is assigned the value represented by the initial parameter. This value must be less than or equal to the value
    represented by the terminal parameter."

    In other words, if the loop would execute zero times, the program is non-conformant to the standard. Practically speaking, F66 compilers
    didn't bother checking this, so the behavior ended up being 1-trip and
    most programmers assumed that's how the language defined it.

    IBM compilers would check it for constants, but not variables.
    Also, the constants had to be greater than zero.

    I never tried setting a variable right before the DO statement,
    where an optimizing compiler might figure it out.

    F77 does have zero-trip DO loops, but not for data-implied-DO.

    This would be io-implied-do.

    Also, for those who forgot by now, Fortran 66 doesn't allow constants
    in I/O lists. I had variables with names like ZERO and ONE, to write those values out. One of my favorite new features in Fortran 77 was expressions
    in I/O lists.

  • From Robin Vowels@21:1/5 to All on Sat Mar 5 23:45:35 2022
    On Sunday, March 6, 2022 at 5:35:32 AM UTC+11, gah4 wrote:
    On Saturday, March 5, 2022 at 9:40:29 AM UTC-8, Steve Lionel wrote:
    On 3/5/2022 12:57 AM, gah4 wrote:
    Also, Fortran 66 doesn't allow DO or implied DO loops for 0 items.
    The wording of FORTRAN 66 is:

    "The control variable is assigned the value represented by the initial parameter. This value must be less than or equal to the value
    represented by the terminal parameter."

    In other words, if the loop would execute zero times, the program is non-conformant to the standard. Practically speaking, F66 compilers
    didn't bother checking this, so the behavior ended up being 1-trip and
    most programmers assumed that's how the language defined it.
    IBM compilers would check it for constants, but not variables.
    Also, the constants had to be greater than zero.

    I never tried setting a variable right before the DO statement,
    where an optimizing compiler might figure it out.
    F77 does have zero-trip DO loops, but not for data-implied-DO.
    This would be io-implied-do.
    .
    No, it is an implied-DO.
    .
    An implied-DO can exist only in an I/O list.
    .
    Also, for those who forgot by now, Fortran 66 doesn't allow constants
    in I/O lists.
    I had variables with names like ZERO and ONE, to write those
    values out. One of my favorite new features in Fortran 77 was expressions
    in I/O lists.
    .
    Expressions in I/O lists had been already available in PL/I for more than
    a decade.

  • From Ron Shepard@21:1/5 to Robin Vowels on Sun Mar 6 11:43:52 2022
    On 3/6/22 1:45 AM, Robin Vowels wrote:
    On Sunday, March 6, 2022 at 5:35:32 AM UTC+11, gah4 wrote:
    On Saturday, March 5, 2022 at 9:40:29 AM UTC-8, Steve Lionel wrote:
    [...]
    F77 does have zero-trip DO loops, but not for data-implied-DO.
    This would be io-implied-do.
    .
    No, it is an implied-DO.
    .
    An implied-DO can exist only in an I/O list.

    As Steve Lionel points out, implied do was also used in data statements
    in F66. F66 did not have parameters, so any such data statements had to
    be defined with literal constants, so it was straightforward for the
    programmer to ensure the loop was executed at least once. But implied do
    in an i/o list could have variables, even variables that were defined
    within the i/o operation itself. F77 introduced parameters, and those parameters could be used in data statements, both as values and as the
    limits for implied do loops in data statements, so then the possibility
    of zero-trip loops was addressed. F90 extended the use of implied do
    loops to array constructors.

    Speaking of parameters in data statements in F77, I remember also that expressions were not allowed. So if ONE was a parameter, then a data
    statement like

    DATA X/-ONE/

    was not allowed. That was an expression, not a constant. The programmer
    was required to do something like

    PARAMETER (MONE=-ONE)
    DATA X/MONE/

    .
    Also, for those who forgot by now, Fortran 66 doesn't allow constants
    in I/O lists.

    Or in DO loop ranges.

    I had variables with names like ZERO and ONE, to write those
    values out. One of my favorite new features in Fortran 77 was expressions
    in I/O lists.

    Along with expressions in DO loops. In F77, there were other reasons to
    use parameters such as ZERO, IZERO, ONE, and IONE, among others, rather
    than literal constants, so that programming convention continued on
    until the present day. Now in F90+ with the ability to specify the kind
    of a literal constant, this use of parameters is less critical.

    .
    Expressions in I/O lists had been already available in PL/I for more than
    a decade.

    And along with expressions in DO loops, this was also a common extension
    to F66, which caused portability issues with codes moving between
    different vendors. Portability of PL/I codes between different vendors
    was obviously less of an issue since many vendors did not even attempt
    to support the language, so the issue never arose.

    $.02 -Ron Shepard

  • From Steve Lionel@21:1/5 to Robin Vowels on Sun Mar 6 14:45:12 2022
    On 3/6/2022 2:45 AM, Robin Vowels wrote:
    An implied-DO can exist only in an I/O list.

    Not so.

    R840 data-implied-do is ( data-i-do-object-list ,
            [ integer-type-spec :: ] data-i-do-variable =
            scalar-int-constant-expr ,
            scalar-int-constant-expr
            [ , scalar-int-constant-expr ] )

    8.6.7 DATA statement

    and...

    R774 ac-implied-do is ( ac-value-list , ac-implied-do-control )

    7.8 Construction of array values
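
    A short sketch showing the three kinds of implied DO in current
    Fortran side by side (the array names and values are made up for
    illustration):

    ```fortran
    program implied_do_kinds
      implicit none
      integer :: i
      integer :: a(5), b(5)

      ! data-implied-do: bounds must be constant expressions.
      data (a(i), i = 1, 5) /1, 2, 3, 4, 5/

      ! ac-implied-do: inside an array constructor.
      b = [(i*i, i = 1, 5)]

      ! io-implied-do: inside an I/O list.
      print *, (a(i) + b(i), i = 1, 5)
    end program implied_do_kinds
    ```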

    --
    Steve Lionel
    ISO/IEC JTC1/SC22/WG5 (Fortran) Convenor
    Retired Intel Fortran developer/support
    Email: firstname at firstnamelastname dot com
    Twitter: @DoctorFortran
    LinkedIn: https://www.linkedin.com/in/stevelionel
    Blog: https://stevelionel.com/drfortran
    WG5: https://wg5-fortran.org

  • From gah4@21:1/5 to Ron Shepard on Sun Mar 6 14:21:32 2022
    On Sunday, March 6, 2022 at 9:43:56 AM UTC-8, Ron Shepard wrote:

    (snip)

    Speaking of parameters in data statements in F77, I remember also that expressions were not allowed. So if ONE was a parameter, then a data statement like

    DATA X/-ONE/

    was not allowed. That was an expression, not a constant. The programmer
    was required to do something like

    PARAMETER (MONE=-ONE)
    DATA X/MONE/

    I almost forgot about that one.

    Note that in ordinary expressions, there are no negative constants.
    There are positive constants and a unary - operator.

    But in DATA statements, there are negative constants.

    Besides that, in Fortran 66 Hollerith constants were only allowed
    in DATA statements.

    And in IBM Fortran IV, Z constants only in DATA statements.

  • From Robin Vowels@21:1/5 to Steve Lionel on Sun Mar 6 17:56:28 2022
    On Monday, March 7, 2022 at 6:45:16 AM UTC+11, Steve Lionel wrote:
    On 3/6/2022 2:45 AM, Robin Vowels wrote:
    An implied-DO can exist only in an I/O list.
    Not so.
    .
    False.
    In F66, implied-DO existed only in I/O lists.
    See 7.2.2 in the standard.

    The following is from some later version of the standard, not F66.

    R840 data-implied-do is ( data-i-do-object-list ,
            [ integer-type-spec :: ] data-i-do-variable =
            scalar-int-constant-expr ,
            scalar-int-constant-expr
            [ , scalar-int-constant-expr ] )

    8.6.7 DATA statement

    and...

    R774 ac-implied-do is ( ac-value-list , ac-implied-do-control )

    7.8 Construction of array values
    --
    Steve Lionel

  • From gah4@21:1/5 to Robin Vowels on Sun Mar 6 18:09:08 2022
    On Sunday, March 6, 2022 at 5:51:15 PM UTC-8, Robin Vowels wrote:
    On Monday, March 7, 2022 at 4:43:56 AM UTC+11, Ron Shepard wrote:

    (snip)

    As Steve Lionel points out, implied do was also used in data statements
    in F66.

    No it wasn't. See 7.2.2 of the standard.

    F66 did not have parameters, so any such data statements had to
    be defined with literal constants, so it was straightforward for the programmer to ensure the loop was executed at least once.

    And, if you go back and look, Steve Lionel didn't mention it as being
    in Fortran 66. data-implied-do has since been added, and so the statement
    that implied-do is only in I/O statements is (currently) false.

    In your imagination.
    DATA statements in F66 could not have implied-DO.

    But implied do
    in an i/o list could have variables, even variables that were defined within the i/o operation itself. F77 introduced parameters, and those parameters could be used in data statements, both as values and as the limits for implied do loops in data statements, so then the possibility
    of zero-trip loops was addressed. F90 extended the use of implied do
    loops to array constructors.

    (snip)

    What?
    See 7.1.2.8, which states unequivocally that m1, m2, and m3 can be an "integer constant or integer variable".

    They can in I/O statements, but not in DATA statements, as DATA statements
    have to be resolved at compile time. Remember, the variables have the SAVE attribute.

  • From Robin Vowels@21:1/5 to Ron Shepard on Sun Mar 6 17:51:12 2022
    On Monday, March 7, 2022 at 4:43:56 AM UTC+11, Ron Shepard wrote:
    On 3/6/22 1:45 AM, Robin Vowels wrote:
    On Sunday, March 6, 2022 at 5:35:32 AM UTC+11, gah4 wrote:
    On Saturday, March 5, 2022 at 9:40:29 AM UTC-8, Steve Lionel wrote:
    [...]
    F77 does have zero-trip DO loops, but not for data-implied-DO.
    This would be io-implied-do.
    .
    No, it is an implied-DO.
    .
    An implied-DO can exist only in an I/O list.
    .
    As Steve Lionel points out, implied do was also used in data statements
    in F66.
    .
    No it wasn't. See 7.2.2 of the standard.
    .
    F66 did not have parameters, so any such data statements had to
    be defined with literal constants, so it was straightforward for the programmer to ensure the loop was executed at least once.
    .
    In your imagination.
    DATA statements in F66 could not have implied-DO.
    .
    But implied do
    in an i/o list could have variables, even variables that were defined
    within the i/o operation itself. F77 introduced parameters, and those parameters could be used in data statements, both as values and as the
    limits for implied do loops in data statements, so then the possibility
    of zero-trip loops was addressed. F90 extended the use of implied do
    loops to array constructors.

    Speaking of parameters in data statements in F77, I remember also that expressions were not allowed. So if ONE was a parameter, then a data statement like

    DATA X/-ONE/

    was not allowed. That was an expression, not a constant. The programmer
    was required to do something like

    PARAMETER (MONE=-ONE)
    DATA X/MONE/
    .
    Also, for those who forgot by now, Fortran 66 doesn't allow constants
    in I/O lists.
    .
    Or in DO loop ranges.
    .
    What?
    See 7.1.2.8, which states unequivocally that m1, m2, and m3 can be an
    "integer constant or integer variable".
    .
    I had variables with names like ZERO and ONE, to write those
    values out. One of my favorite new features in Fortran 77 was expressions
    in I/O lists.
    Along with expressions in DO loops. In F77, there were other reasons to
    use parameters such as ZERO, IZERO, ONE, and IONE, among others, rather
    than literal constants, so that programming convention continued on
    until the present day. Now in F90+ with the ability to specify the kind
    of a literal constant, this use of parameters is less critical.
    .
    Expressions in I/O lists had been already available in PL/I for more than
    a decade.
    .
    And along with expressions in DO loops, this was also a common extension
    to F66, which caused portability issues with codes moving between
    different vendors. Portability of PL/I codes between different vendors
    was obviously less of an issue since many vendors did not even attempt
    to support the language, so the issue never arose.
    .
    IBM was the original implementor of PL/I.
    Later, CDC, DEC, Burroughs, UNIVAC, Honeywell, and various universities produced PL/I compilers -- the most notable of which was the
    very fast compiler, PL/C (from Cornell University).

  • From Robin Vowels@21:1/5 to All on Sun Mar 6 19:01:32 2022
    On Monday, March 7, 2022 at 1:09:10 PM UTC+11, gah4 wrote:
    On Sunday, March 6, 2022 at 5:51:15 PM UTC-8, Robin Vowels wrote:
    On Monday, March 7, 2022 at 4:43:56 AM UTC+11, Ron Shepard wrote:
    (snip)
    As Steve Lionel points out, implied do was also used in data statements in F66.
    No it wasn't. See 7.2.2 of the standard.
    F66 did not have parameters, so any such data statements had to
    be defined with literal constants, so it was straightforward for the programmer to ensure the loop was executed at least once.
    .
    And, if you go back and look, Steve Lionel didn't mention it as being
    in Fortran 66.
    .
    If you go back, you will see that we were talking about F66.
    Please try to keep your eye on the ball.
    .
    data-implied-do has since been added, and so the statement
    that implied-do is only in I/O statements is (currently) false.
    .
    Try to watch the ball.
    .
    In your imagination.
    .
    DATA statements in F66 could not have implied-DO.
    But implied do
    in an i/o list could have variables, even variables that were defined within the i/o operation itself. F77 introduced parameters, and those parameters could be used in data statements, both as values and as the limits for implied do loops in data statements, so then the possibility of zero-trip loops was addressed. F90 extended the use of implied do loops to array constructors.
    (snip)
    What?
    See 7.1.2.8, which states unequivocally that m1, m2, and m3 can be an "integer constant or integer variable".
    .
    They can in I/O statements, but not in DATA statements,
    .
    At the risk of repeating myself, in F66 DATA statements could not contain implied-DO.
    .
    as DATA statements
    have to be resolved at compile time. Remember, the variables have the SAVE attribute.
    .
    In F66, the SAVE attribute did not exist.
    .
    Please pay attention.

  • From gah4@21:1/5 to Robin Vowels on Sun Mar 6 21:02:19 2022
    On Sunday, March 6, 2022 at 7:01:34 PM UTC-8, Robin Vowels wrote:

    (snip)
    And, if you go back and look, Steve Lionel didn't mention it as being
    in Fortran 66.

    If you go back, you will see that we were talking about F66.
    Please try to keep your eye on the ball.

    And if you go back, you will see that it was a different post
    that was talking about Fortran 66. People can at least change
    subject between posts, and often with different paragraphs
    within the same post.

    And most people can figure out that post was not talking
    about Fortran 66, but maybe not everyone.

  • From Ron Shepard@21:1/5 to Robin Vowels on Mon Mar 7 11:54:04 2022
    On 3/6/22 7:51 PM, Robin Vowels wrote:
    On Monday, March 7, 2022 at 4:43:56 AM UTC+11, Ron Shepard wrote:
    [...]
    In your imagination.
    DATA statements in F66 could not have implied-DO.

    You are right about this, so I stand corrected. My memory was off. When
    I used these, it must have been either an extension to f66 or perhaps it
    was an early f77 compiler. The common use for this was to initialize
    subsets of the elements of larger arrays. The different subsets could
    appear in separate data statements, keeping the number of continuation
    lines manageable. There were also the R* repeat counts in the data lists
    that were allowed to shorten the data statements.
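
    As an illustration of the usage described above, here is a minimal sketch
    of separate DATA statements that each initialize a subset of one array,
    using data-implied-DO and repeat counts. The array name, its size, and the
    values are hypothetical, and the sketch is written in modern free-form
    source rather than the fixed form of the period.

```fortran
program data_subsets
  implicit none
  integer :: i
  real :: a(10)          ! hypothetical array, for illustration only

  ! Each DATA statement covers a different subset of A, so no single
  ! statement needs many continuation lines.
  data (a(i), i = 1, 5)  / 5*0.0 /           ! repeat count: five zeros
  data (a(i), i = 6, 8)  / 1.0, 2.0, 3.0 /   ! explicit values
  data (a(i), i = 9, 10) / 2*9.0 /           ! repeat count: two nines

  print '(10f6.1)', a
end program data_subsets
```

    Every element has to be covered exactly once across the DATA statements;
    initializing the same element twice is not permitted.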


    I also remember using sometimes extensive equivalence statements in
    order to initialize subsets of elements of large arrays in data
    statements. Each of the equivalenced arrays could appear in its own data statement.

    $.02 -Ron Shepard

  • From gah4@21:1/5 to Ron Shepard on Mon Mar 7 20:38:32 2022
    On Monday, March 7, 2022 at 9:54:16 AM UTC-8, Ron Shepard wrote:

    (snip)

    In your imagination.
    DATA statements in F66 could not have implied-DO.
    You are right about this, so I stand corrected. My memory was off. When
    I used these, it must have been either an extension to f66 or perhaps it
    was an early f77 compiler. The common use for this was to initialize
    subsets of the elements of larger arrays. The different subsets could
    appear in separate data statements, keeping the number of continuation
    lines manageable. There was also the R* repeat counts in the data lists
    that were allowed to shorten the data statements.

    I found out about DATA statements early after I started learning Fortran,
    and in one program initialized a large (for the time) array to zero.

    Then I decided to punch the object program on cards. The whole array
    got punched! Some object formats compress out zeros, but not all
    of them. And if you initialize a large array to non-zero, it will all go
    into the object program on most other systems.

    I also remember using sometimes extensive equivalence statements in
    order to initialize subsets of elements of large arrays in data
    statements. Each of the equivalenced arrays could appear in its own data statement.

    One Fortran 66 feature that has since been removed is the ability
    to EQUIVALENCE arrays of more than one dimension using only one subscript.

    A few years ago, I was running IBM's ECAP on gfortran, and had to
    fix all of those.
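
    A minimal sketch of the kind of fix this involves, assuming a hypothetical
    3 x 4 array A overlaid by a rank-1 array B (these names are not from ECAP).
    The old single-subscript form counts elements in storage (column-major)
    order, so element 5 of A corresponds to A(2,2) when written with full
    subscripts.

```fortran
program equiv_fix
  implicit none
  integer :: i, j
  real :: a(3,4), b(8)   ! hypothetical arrays, for illustration only

  ! Old single-subscript form, shown as a comment only:
  !     EQUIVALENCE (A(5), B(1))
  ! Element 5 of A in column-major order is A(2,2), so the
  ! conforming form spells out both subscripts:
  equivalence (a(2,2), b(1))

  ! Fill A with its storage-order position: a(i,j) = i + 3*(j-1)
  do j = 1, 4
    do i = 1, 3
      a(i,j) = real(i + 3*(j - 1))
    end do
  end do

  ! B(1) shares storage with A(2,2)
  print *, a(2,2), b(1)
end program equiv_fix
```

    For an array dimensioned (n1, n2), the linear position p maps to row
    mod(p-1, n1) + 1 and column (p-1)/n1 + 1.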

  • From Thomas Koenig@21:1/5 to gah4@u.washington.edu on Tue Mar 8 07:03:07 2022
    gah4 <gah4@u.washington.edu> schrieb:
    On Monday, March 7, 2022 at 9:54:16 AM UTC-8, Ron Shepard wrote:

    (snip)

    In your imagination.
    DATA statements in F66 could not have implied-DO.
    You are right about this, so I stand corrected. My memory was off. When
    I used these, it must have been either an extension to f66 or perhaps it
    was an early f77 compiler. The common use for this was to initialize
    subsets of the elements of larger arrays. The different subsets could
    appear in separate data statements, keeping the number of continuation
    lines manageable. There was also the R* repeat counts in the data lists
    that were allowed to shorten the data statements.

    I found out about DATA statements early after I started learning Fortran,
    and in one program initialized a large (for the time) array to zero.

    Then I decided to punch the object program on cards. The whole array
    got punched!

    I can imagine.

    Some object formats compress out zeros, but not all
    of them.

    It is (now) a matter of what segment it goes in.

    And if you initialize a large array to non-zero, it will all go
    into the object program on most other systems.

    Correct.

    Looking at ELF, it has the .bss and .data (and .data1) sections.
    .bss does not contribute to the file space and is initialized to
    zero on program startup (usually done by mapping a zero page these
    days). The .data section holds initialized data.

    While ELF is quite modern, it took concepts from older object
    formats, so what happened when you initialized the array with
    all zeros, it got put into .data instead of .bss. (The names
    may have been different at the time, although at least the
    name BSS dates back to the IBM 704, but the meaning may
    have been subtly different).
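
    To make the distinction concrete, here is a small sketch with one array
    initialized to zero and one to a non-zero value. The names and the size
    are hypothetical. Whether the zero-initialized array ends up in .bss or in
    .data is up to the compiler, so this is something to try rather than a
    guaranteed result; on an ELF system, comparing object file sizes (for
    example with the binutils `size` utility) should show which choice was
    made.

```fortran
program init_sections
  implicit none
  integer, parameter :: n = 100000   ! hypothetical size
  real :: zeros(n), ones(n)

  ! All-zero initialization: a compiler may place this in .bss,
  ! where it takes no space in the object file.
  data zeros / n*0.0 /

  ! Non-zero initialization: the values have to be stored in the
  ! object file, typically in .data.
  data ones / n*1.0 /

  print *, zeros(n), ones(n)
end program init_sections
```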

  • From Ron Shepard@21:1/5 to Thomas Koenig on Tue Mar 8 09:33:03 2022
    On 3/8/22 1:03 AM, Thomas Koenig wrote:
    gah4 <gah4@u.washington.edu> schrieb:
    On Monday, March 7, 2022 at 9:54:16 AM UTC-8, Ron Shepard wrote:

    (snip)

    In your imagination.
    DATA statements in F66 could not have implied-DO.
    You are right about this, so I stand corrected. My memory was off. When
    I used these, it must have been either an extension to f66 or perhaps it
    was an early f77 compiler. The common use for this was to initialize
    subsets of the elements of larger arrays. The different subsets could
    appear in separate data statements, keeping the number of continuation
    lines manageable. There was also the R* repeat counts in the data lists
    that were allowed to shorten the data statements.

    I found out about DATA statements early after I started learning Fortran,
    and in one program initialized a large (for the time) array to zero.

    Then I decided to punch the object program on cards. The whole array
    got punched!

    I can imagine.

    I began programming with cards, and it does give you some insight into
    the computing process when you compile your source code from cards,
    punch object decks onto cards, put them in card boxes, and physically
    carry them back and forth to the computer center when you submit your jobs.

    When you do this with tapes or disk storage (or these days SSD or cloud storage), it all becomes a little more abstract.

    As to whether it is good or bad to put initialized arrays in your codes
    or to assign values at run time depends on how many boxes your object
    decks are, whether you have to pay to have the cards read every time you
    run a job, and the relative costs (in money, time, or general hassle) of reading the cards or executing the loops at run time. I also remember
    doing things like taking my cards (both source code and object code) and marking diagonal stripes on the edges with a felt tip marker; just in
    case the box was dropped, that made it possible to put everything back
    in order. That was an era when person time was cheap and computer time
    was expensive.

    $.02 -Ron Shepard

  • From Thomas Koenig@21:1/5 to Ron Shepard on Tue Mar 8 18:41:42 2022
    Ron Shepard <nospam@nowhere.org> schrieb:

    I began programming with cards, and it does give you some insight into
    the computing process when you compile your source code from cards,
    punch object decks onto cards, put them in card boxes, and physically
    carry them back and forth to the computer center when you submit your jobs.

    I can imagine.

    I never used punched cards, I started using large systems when terminals
    were already there. I still had to walk down and get the printout,
    though.

    When you do this with tapes or disk storage (or these days SSD or cloud storage), it all becomes a little more abstract.

    Have to watch out for the size of those PDS :-)

    As to whether it is good or bad to put initialized arrays in your codes
    or to assign values at run time depends on how many boxes your object
    decks are, whether you have to pay to have the cards read every time you
    run a job, and the relative costs (in money, time, or general hassle) of reading the cards or executing the loops at run time. I also remember
    doing things like taking my cards (both source code and object code) and marking diagonal stripes on the edges with a felt tip marker; just in
    case the box was dropped, that made it possible to put everything back
    in order. That was an era when person time was cheap and computer time
    was expensive.

    Since FORTRAN was done on the IBM 704, which could only read 72
    characters from an 80-column punched card (a computer limitation
    influencing language design if there ever was one), the last eight
    columns were in effect comments and could be filled with sequence
    numbers. I believe there were sorters for this, but maybe these
    were not available to you.

  • From gah4@21:1/5 to Thomas Koenig on Tue Mar 8 10:48:49 2022
    On Monday, March 7, 2022 at 11:03:11 PM UTC-8, Thomas Koenig wrote:

    (snip)

    Looking at ELF, it has the .bss and .data (and .data1) sections.
    .bss does not contribute to the file space and is initialized to
    zero on program startup (usually done by mapping a zero page these
    days). The .data section holds initialized data.

    While ELF is quite modern, it took concepts from older object
    formats, so what happened when you initialized the array with
    all zeros, it got put into .data instead of .bss. (The names
    may have been different at the time, although at least the
    name BSS dates back to the IBM 704, but the meaning may
    have been subtly different).

    The OS/360 object format, which still works in z/OS, has 80-byte records,
    each with a starting address and length. Uninitialized data doesn't have
    any record covering it. If you switch between DC and DS, it outputs
    some data and starts the next one when needed. But the output of
    the linkage editor has longer records. It will fill in smaller holes (uninitialized data) but leave larger ones between its output records. (Traditionally, they got filled with whatever was in the buffer at the time.)

    Then when the program is loaded into memory, uninitialized data gets
    filled with whatever is there. In the OS/360 days, that was whatever was
    left, but later was cleared to zero. (Or, on one system I knew, X'81'.)

    If you want to be sure of a specific value, you have to change both the
    linkage editor and program fetch. In any case, the object format
    has no special case for zero.
