• does the OPEN function in GNU Fortran support Unicode file names ?

    From Lynn McGuire@21:1/5 to All on Wed Jul 14 21:24:37 2021
    Does the OPEN function in GNU Fortran support Unicode file names ?

    And Unicode file paths ?

    Thanks,
    Lynn

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From gah4@21:1/5 to Lynn McGuire on Wed Jul 14 22:18:45 2021
    On Wednesday, July 14, 2021 at 7:24:44 PM UTC-7, Lynn McGuire wrote:
    > Does the OPEN function in GNU Fortran support Unicode file names ?
    >
    > And Unicode file paths ?

    Using trial and no error, it seems that 4.1.2 and 6.3.0 can do it.

    I presume you have to use it on a file system that supports them.

    open(unit=1,file='🦛🐦🐑🐑')
    write(1,*) '🦛🐦🐑🐑'
    end

  • From FortranFan@21:1/5 to All on Thu Jul 15 07:15:33 2021
    With modern Fortran, it would be better to do something along the following lines:

    integer, parameter :: ck = selected_char_kind('ISO_10646')
    integer :: lun
    .
    open( unit=lun, file=ck_'🦛🐦🐑🐑', encoding='utf-8', ..)
    write( lun, fmt=.. ) ck_'🦛🐦🐑🐑'

    Please consider this just a quick snippet to point out the language facilities; for details, consult the Fortran standard and your compiler's documentation for which of the standard's options are supported.
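
    A minimal, self-contained sketch of the same facilities, under stated assumptions: the module name ucs4_util, the helper ucs4_from_codepoints and the file name utf8_demo.txt are hypothetical, chosen only for illustration. The string is built from code points so that the source file itself stays ASCII. Note that the standard defines FILE= as a default-kind character expression, so the sketch keeps the file name in default kind and uses the ISO_10646 kind only for the data written to the unit.

```fortran
module ucs4_util
  implicit none
  ! Kind for ISO 10646 (UCS-4); -1 if the compiler does not support it,
  ! in which case the declarations below will not compile.
  integer, parameter :: ck = selected_char_kind('ISO_10646')
contains
  ! Build a UCS-4 string from integer code points (hypothetical helper).
  function ucs4_from_codepoints(cp) result(s)
    integer, intent(in) :: cp(:)
    character(kind=ck, len=size(cp)) :: s
    integer :: i
    do i = 1, size(cp)
      s(i:i) = char(cp(i), kind=ck)
    end do
  end function ucs4_from_codepoints
end module ucs4_util

program utf8_write_demo
  use ucs4_util
  implicit none
  integer :: lun
  ! FILE= is a default-kind character expression; NEWUNIT= picks a free unit.
  open(newunit=lun, file='utf8_demo.txt', encoding='utf-8', &
       status='replace', action='write')
  ! Code points for Greek alpha, sigma, tau; the unit encodes them as UTF-8.
  write(lun, '(a)') ucs4_from_codepoints([945, 963, 964])
  close(lun)
end program utf8_write_demo
```

    Whether a non-ASCII file name in default kind reaches the file system unchanged is a separate question from ENCODING=, which only governs the data transferred to and from the unit.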

  • From Lynn McGuire@21:1/5 to All on Thu Jul 15 14:19:28 2021
    On 7/15/2021 12:18 AM, gah4 wrote:
    > On Wednesday, July 14, 2021 at 7:24:44 PM UTC-7, Lynn McGuire wrote:
    >> Does the OPEN function in GNU Fortran support Unicode file names ?
    >>
    >> And Unicode file paths ?
    >
    > Using trial and no error, it seems that 4.1.2 and 6.3.0 can do it.
    >
    > I presume you have to use it on a file system that supports them.
    >
    > open(unit=1,file='🦛🐦🐑🐑')
    > write(1,*) '🦛🐦🐑🐑'
    > end

    Thanks !

    Trying to port our calculation engine to Simply Fortran is rising much
    higher on my list of things to do.
    http://simplyfortran.com/

    Simply Fortran includes GNU Fortran 10.2.0 and GCC.

    Lynn

  • From gah4@21:1/5 to FortranFan on Thu Jul 15 17:32:35 2021
    On Thursday, July 15, 2021 at 7:15:35 AM UTC-7, FortranFan wrote:
    On Thursday, July 15, 2021 at 1:18:47 AM UTC-4, gah4 wrote:

    (after I wrote)

    (snip)

    > With modern Fortran, it would be better to do something along the following lines:
    >
    > integer, parameter :: ck = selected_char_kind('ISO_10646')
    > integer :: lun
    > .
    > open( unit=lun, file=ck_'🦛🐦🐑🐑', encoding='utf-8', ..)
    > write( lun, fmt=.. ) ck_'🦛🐦🐑🐑'

    OK, now we have to actually figure it out.
    (Including what the question left out.)

    And as I noted, it depends on the file system.

    The computer I tried first uses the ext2 file system. Like many Unix and Unix-like
    file systems, it allows all bytes except '/' and '\0' in names. So, UTF-8 bytes will
    go through like any other bytes. (And, maybe important, illegal UTF-8 byte
    combinations will go through, too.)
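
    The byte-level rule above can be sketched as a small check. This is only an illustration of the rule as stated here (any byte except '/' and NUL); the module name unix_names and the function is_unix_name_ok are hypothetical, and real file systems also impose length limits and reserve the names '.' and '..'.

```fortran
module unix_names
  implicit none
contains
  ! True if NAME could be a single path component under the rule
  ! "any byte except '/' and NUL". It does not check for valid UTF-8,
  ! which is the point: the file system does not check either.
  pure logical function is_unix_name_ok(name)
    character(len=*), intent(in) :: name
    is_unix_name_ok = len(name) > 0 .and. &
                      index(name, '/') == 0 .and. &
                      index(name, achar(0)) == 0
  end function is_unix_name_ok
end module unix_names

program check_names
  use unix_names
  implicit none
  print *, is_unix_name_ok('report.txt')
  print *, is_unix_name_ok('a/b')
  ! Bytes 255 and 254 never occur in valid UTF-8, yet the rule accepts them.
  print *, is_unix_name_ok(char(255) // char(254))
end program check_names
```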

    More often I work on an NFS mounted file system, where the server seems to
    be using ext4. (I never checked this before.) As above, all bytes except '/' and '\0'.
    I presume NFS will pass them through.

    The computer I write this on has hfs+, which it seems uses UTF-16 for names.
    This computer also mounts the above-mentioned NFS partition, which is where
    I tried compiling the second time. hfs+ does allow all characters, including '\0'
    and '/', though the OS might disallow the latter. I don't think Unix-like systems
    can let it through, though NFS should be able to write it.

    It seems that newer OS X systems use APFS, which uses Unicode 9.0 stored
    as UTF-8. I don't have gfortran on that one. I think that means that it will
    disallow byte combinations that are not legal UTF-8. And again, no restrictions
    on '\0' or '/'.

    So, the question is not only what gfortran allows, but what the underlying
    file system allows, and whether gfortran adjusts based on the file system.

    And since we are writing them here, what the web interface allows through
    when we post things.

  • From Lynn McGuire@21:1/5 to All on Thu Jul 15 21:36:56 2021
    We write our software on Windows 7 Pro x64 PCs running on NTFS. NTFS is
    a UTF-16 system. We are in the process of moving to Windows 10 Pro x64.

    Thanks,
    Lynn

  • From Ev. Drikos@21:1/5 to Lynn McGuire on Fri Jul 16 14:18:57 2021
    On 15/07/2021 05:24, Lynn McGuire wrote:
    > Does the OPEN function in GNU Fortran support Unicode file names ?
    > ...

    To my understanding, gfortran accepts file names without examining
    if the characters are ASCII or UTF-8.

    The issue of non-ASCII characters in file names was discussed
    a few years ago in c.l.f:

    https://groups.google.com/g/comp.lang.fortran/c/Ac5a0YwuRo8/m/Kcj8PdcZBgAJ

    As mentioned in the last message of the thread, when one compiles in
    Cygwin, UTF-8 file names are transparently converted to UTF-16. But
    clearly such a solution isn't portable to other Fortran compilers.

    Ev. Drikos

  • From Ev. Drikos@21:1/5 to Ev. Drikos on Tue Jul 20 14:09:11 2021
    On 16/07/2021 14:18, Ev. Drikos wrote:
    > ...
    > To my understanding, gfortran accepts file names without examining
    > if the characters are ASCII or UTF-8.
    > ...

    @everyone


    Here is a fresh program along with the execution steps (the program
    couldn't echo the '*' to the file below in Windows-8.1).

    Ev. Drikos

    ------------------------------------------------
    Windows PowerShell
    Copyright (C) 2014 Microsoft Corporation. All rights reserved.

    PS C:\Users\suser> .\fecho.exe αστρονομια.txt ';'
    PS C:\Users\suser> cat αστρονομια.txt
    ;
    PS C:\Users\suser>

    ------------------------------------------------
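    !
    ! Echo a string to a certain file.
    !
    ! GNU Fortran silently accepts UTF-8 file names,
    ! and Cygwin transparently converts them to UTF-16.
    !
    program file_echo
      implicit none

      integer :: i
      character(len=64) :: fname, iname, str

      ! Expect exactly two arguments: the file name and the string to append.
      if ( command_argument_count() /= 2 ) then
        print *, "Usage is: fecho file string"
        error stop -1
      endif

      ! Standard intrinsics, equivalent to the GNU extensions iargc/getarg.
      call get_command_argument(1, fname)
      call get_command_argument(2, str)
      !write (*,*) fname, str

      ! Copy the file name, replacing any '*' with a blank.
      do i=1,len(fname)
        if (fname(i:i)=='*') then
          iname(i:i)= ' '
        else
          iname(i:i)=fname(i:i)
        end if
      end do

      ! Append the string to the file (created if it does not exist).
      open( 20, file = iname, position='append')
      write( 20, * ) trim(str)
      close( 20 )

    end program file_echo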

  • From gah4@21:1/5 to Ev. Drikos on Tue Jul 20 12:45:29 2021
    On Tuesday, July 20, 2021 at 4:09:15 AM UTC-7, Ev. Drikos wrote:
    > On 16/07/2021 14:18, Ev. Drikos wrote:
    >> ...
    >> To my understanding, gfortran accepts file names without examining
    >> if the characters are ASCII or UTF-8.
    >> ...
    >
    > @everyone
    >
    > Here is a fresh program along with the execution steps (the program
    > couldn't echo the '*' to the file below in Windows-8.1).

    I am not sure which '*' it couldn't echo, but if PowerShell is anything
    like a Unix shell, it expands * using the file names in the current
    (or otherwise appropriate) directory. Unless in quotes or apostrophes.

    echo *

    should get you a list of files in the current directory. (Useful sometimes when
    ls doesn't work.)

    I have had problems on Unix-like systems in the past, where programs assume
    files are UTF-8 when they are not. Using wc on an object program, for example,
    fails, but wc -c works. There are byte combinations that are not legal in UTF-8,
    but which are legal in names in many file systems.

  • From Ev. Drikos@21:1/5 to All on Wed Jul 21 07:22:00 2021
    On 20/07/2021 22:45, gah4 wrote:
    > ...
    > I am not sure which '*' it couldn't echo, but if PowerShell is anything
    > like a Unix shell, it expands * using the file names in the current
    > (or otherwise appropriate) directory. Unless in quotes or apostrophes.


    Unfortunately, the program can't process this, which works, e.g., in macOS:

    PS C:\Users\suser> .\fecho.exe αστρονομια.txt '*'

  • From Ev. Drikos@21:1/5 to All on Wed Jul 21 09:51:33 2021
    On 21/07/2021 09:34, gah4 wrote:
    > As for the actual program, it might be that PowerShell doesn't process
    > the '*' right, or it might not get into the argument list the right way,
    > or the I/O library might not be able to write one out, or ...

    It turned out that the problem is reproduced in the home directory of
    the user, where the '*' is expanded to all directory files. In my case
    the program had 29 arguments. Likely, it's a bug in a Windows-8.1 app.
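
    One way to see what actually reaches a program is to print its argument list. The following is a minimal sketch using only standard intrinsics; the module name arg_util, the helper has_wildcard and the program name show_args are hypothetical. If a '*' was expanded somewhere between the command line and the program, the count is larger than expected and no argument is flagged as containing a wildcard.

```fortran
module arg_util
  implicit none
contains
  ! True if ARG still contains a wildcard character ('*' or '?').
  pure logical function has_wildcard(arg)
    character(len=*), intent(in) :: arg
    has_wildcard = scan(arg, '*?') > 0
  end function has_wildcard
end module arg_util

program show_args
  use arg_util
  implicit none
  integer :: i, n, length
  character(len=:), allocatable :: arg

  n = command_argument_count()
  print '(a,i0)', 'argument count: ', n
  do i = 1, n
    ! First query the length, then fetch the argument at its exact size.
    call get_command_argument(i, length=length)
    allocate(character(len=length) :: arg)
    call get_command_argument(i, value=arg)
    print '(i0,a,a,a,l1)', i, ': [', arg, '] wildcard=', has_wildcard(arg)
    deallocate(arg)
  end do
end program show_args
```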

    Ev. Drikos

  • From gah4@21:1/5 to Ev. Drikos on Tue Jul 20 23:34:44 2021
    On Tuesday, July 20, 2021 at 9:22:05 PM UTC-7, Ev. Drikos wrote:
    > On 20/07/2021 22:45, gah4 wrote:
    >
    >> I am not sure which '*' it couldn't echo, but if PowerShell is anything
    >> like a Unix shell, it expands * using the file names in the current
    >> (or otherwise appropriate) directory. Unless in quotes or apostrophes.
    >
    > Unfortunately, the program can't process this, which works, e.g., in macOS:
    >
    > PS C:\Users\suser> .\fecho.exe αστρονομια.txt '*'

    I mostly don't use PowerShell on Windows systems, but it still leaves way too
    many questions.

    The quoting conventions in CMD have, in the cases where I used them, seemed very
    confusing, and I am not sure they always work. (Not that they are completely
    obvious on Unix-like systems, especially when expanding variables in shell
    scripts, as arguments for another program.)

    https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file

    explains the characters not allowed in file names on Windows systems,
    some of which depend on the file system, and some on which part of the OS
    disallows them. * is not allowed in file names, along with way too many
    other characters.

    In the MSDOS days, there was a library routine that you could call,
    (at least from C) which would expand * and ? in argument lists, in the
    way the Unix shells would. (At least close enough most of the time.)

    As for the actual program, it might be that PowerShell doesn't process
    the '*' right, or it might not get into the argument list the right way,
    or the I/O library might not be able to write one out, or ...

    what do

    echo '*'
    echo "*"
    echo *

    do?

    (That is, the hopefully built-in echo command.)

  • From gah4@21:1/5 to Ev. Drikos on Wed Jul 21 00:58:10 2021
    On Tuesday, July 20, 2021 at 11:51:38 PM UTC-7, Ev. Drikos wrote:
    > On 21/07/2021 09:34, gah4 wrote:
    >> As for the actual program, it might be that PowerShell doesn't process
    >> the '*' right, or it might not get into the argument list the right way,
    >> or the I/O library might not be able to write one out, or ...
    >
    > It turned out that the problem is reproduced in the home directory of
    > the user, where the '*' is expanded to all directory files. In my case
    > the program had 29 arguments. Likely, it's a bug in a Windows-8.1 app.

    OK, I have an actual Windows 10 system with PowerShell.

    I tried

    echo *

    and got an actual

    *

    so it isn't expanding them at that level.

    This one has Watcom C and Watcom Fortran, but not gcc or gfortran, so I can't
    test many things on it.

  • From Cyrmag@21:1/5 to Ev. Drikos on Wed Jul 21 08:15:17 2021
    On 7/21/2021 7:16 AM, Ev. Drikos wrote:
    > On 21/07/2021 10:58, gah4 wrote:
    >> ...
    >> I tried
    >>
    >> echo *
    >>
    >> and got an actual
    >>
    >> *
    >>
    >> ...
    >
    > Just posted a question to the Cygwin forum: https://cygwin.com/pipermail/cygwin/2021-July/248946.html
    >
    > Ev. Drikos

    ECHO is a built-in command in Windows. There is also an echo.exe in
    Cygwin, and this one expands arguments with wildcards.

    When you type "echo *" in a Cygwin shell, you will get the expansion you
    appear to want.

    -- CyrMag

  • From Ev. Drikos@21:1/5 to All on Wed Jul 21 15:16:26 2021
    On 21/07/2021 10:58, gah4 wrote:
    > ...
    > I tried
    >
    > echo *
    >
    > and got an actual
    >
    > *
    >
    > ...

    Just posted a question to the Cygwin forum: https://cygwin.com/pipermail/cygwin/2021-July/248946.html

    Ev. Drikos

  • From gah4@21:1/5 to CyrMag on Wed Jul 21 12:18:22 2021
    On Wednesday, July 21, 2021 at 6:15:21 AM UTC-7, CyrMag wrote:

    (snip)
    > ECHO is a built-in command in Windows. There is also an echo.exe in
    > Cygwin, and this one expands arguments with wildcards.
    >
    > When you type "echo *" in a Cygwin shell, you will get the expansion you appear to want.

    I think I only used PS one time before, and thought that it was supposed to have Unix-like features. It seems to have some features, but not that one.

    In Unix, the expansion is done by the shell, not by the command.
    But I suppose that would confuse Windows users too much.

    I think MS one time had a port of sh, though.

  • From Ev. Drikos@21:1/5 to All on Thu Jul 22 11:59:55 2021
    On 21/07/2021 22:18, gah4 wrote:
    > ...
    > I think I only used PS one time before, and thought that it was supposed to
    > have Unix-like features. It seems to have some features, but not that one.

    Obviously, one cannot count on wildcards in some Windows terminals.

    In my case, PowerShell removed the quotes from the quoted argument, and
    Cygwin, which emulates Bash expansion, then found a bare star. But the
    expansion took place in some directories, not all (maybe a bug).


    > In Unix, the expansion is done by the shell, not by the command.
    > But I suppose that would confuse Windows users too much.
    > ...

    Maybe you are right but the topic in this thread is Unicode support in
    file names, in business environments more likely multi-language support.

    Well, my Windows installation is likely outdated now (8.1), and so may
    be my feedback. Also, I don't claim to have deep experience in PowerShell.



    Ev. Drikos
