• Using "pure" (?) Ada, how to determine whether a file is a "text" file,

    From Kenneth Wolcott@21:1/5 to All on Sat Jul 1 10:15:25 2023
    Hi;

    Another very beginner question here...

    Using "pure" (?) Ada, how to determine whether a file is a "text" file, not a binary?

    Kind of like using the UNIX/Linux "file" command, but doesn't have to be comprehensive (yet). Something like the Perl "-T" feature.

    On the other hand, if there already exists an Ada implementation of the UNIX "file" command as a library, could you point me to that?

    As a side question, how does one read "binary" files in Ada?

    A UNIX/Linux use case for the previous sentence is the concatenation of two (or more) "binary" files that were created using the UNIX/Linux "split" command.

    So I'd be interested in emulating the UNIX "cat" command for "binary" files.

    These are just personal experiments for learning how to do all kinds of Ada I/O...

    Thanks,
    Ken Wolcott

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Jeffrey R.Carter@21:1/5 to Kenneth Wolcott on Sat Jul 1 22:39:27 2023
    On 2023-07-01 19:15, Kenneth Wolcott wrote:

    > Using "pure" (?) Ada, how to determine whether a file is a "text" file, not a binary?

    That depends on the definition of a text file. Under Unix and Windows, all files
    are sequences of bytes, and so may be considered sequences of Characters, and so
    text files.

    If you can define what distinguishes text files from binary files, then it should be fairly easy to write Ada to distinguish them.

    For example, if a text file is one in which all the characters, except line terminators, are graphic characters, then it should be clear how to determine whether a file meets that definition of a text file.
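
    A minimal sketch of that definition, assuming 8-bit stream elements interpreted as Latin-1 and treating LF and CR as the only line-terminator characters. The names Is_Text and Looks_Like_Text are made up for illustration; only the Ada.* units are standard.

```ada
with Ada.Characters.Handling;
with Ada.Characters.Latin_1;
with Ada.Command_Line;
with Ada.Streams.Stream_IO;
with Ada.Text_IO;

procedure Is_Text is
   use Ada.Streams;

   --  True if every byte of the named file is a graphic character or a
   --  line-terminator character (LF or CR).
   --  Assumption: Stream_Element is 8 bits wide and bytes are Latin-1.
   function Looks_Like_Text (Name : String) return Boolean is
      File   : Stream_IO.File_Type;
      Buffer : Stream_Element_Array (1 .. 4096);
      Last   : Stream_Element_Offset;
      Result : Boolean := True;
   begin
      Stream_IO.Open (File, Stream_IO.In_File, Name);

      --  Read the file in chunks and stop at the first offending byte.
      while Result and then not Stream_IO.End_Of_File (File) loop
         Stream_IO.Read (File, Buffer, Last);
         for I in 1 .. Last loop
            declare
               C : constant Character := Character'Val (Buffer (I));
            begin
               if not (Ada.Characters.Handling.Is_Graphic (C)
                       or else C = Ada.Characters.Latin_1.LF
                       or else C = Ada.Characters.Latin_1.CR)
               then
                  Result := False;
                  exit;
               end if;
            end;
         end loop;
      end loop;

      Stream_IO.Close (File);
      return Result;
   end Looks_Like_Text;

begin
   if Ada.Command_Line.Argument_Count /= 1 then
      Ada.Text_IO.Put_Line ("usage: is_text <file>");
      return;
   end if;

   if Looks_Like_Text (Ada.Command_Line.Argument (1)) then
      Ada.Text_IO.Put_Line ("text");
   else
      Ada.Text_IO.Put_Line ("binary");
   end if;
end Is_Text;
```

    Under this definition an empty file counts as text, and a missing file raises Name_Error from Open; both are choices of the sketch rather than requirements.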

    > As a side question, how does one read "binary" files in Ada?

    Ada has Direct_IO, Sequential_IO, and Stream_IO for reading binary files. Which you would use and how to use it depends on what's in the file and what you need to do with it.
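
    For the concatenation use case in the question, Stream_IO is probably the simplest fit, since it treats a file as a plain sequence of stream elements. A rough sketch follows; the procedure name Bin_Cat and the argument convention (output file first, then the pieces in order) are assumptions made for the example.

```ada
with Ada.Command_Line;
with Ada.Streams.Stream_IO;

--  Usage (hypothetical): bin_cat OUTPUT PART1 [PART2 ...]
procedure Bin_Cat is
   use Ada.Streams;
   use Ada.Command_Line;

   Output : Stream_IO.File_Type;
   Input  : Stream_IO.File_Type;
   Buffer : Stream_Element_Array (1 .. 8192);
   Last   : Stream_Element_Offset;
begin
   if Argument_Count < 2 then
      Set_Exit_Status (Failure);
      return;
   end if;

   --  Create (or truncate) the output file, then append each input in turn.
   Stream_IO.Create (Output, Stream_IO.Out_File, Argument (1));

   for I in 2 .. Argument_Count loop
      Stream_IO.Open (Input, Stream_IO.In_File, Argument (I));
      while not Stream_IO.End_Of_File (Input) loop
         --  Read fills as much of Buffer as it can; Last marks the end
         --  of the data actually read.
         Stream_IO.Read (Input, Buffer, Last);
         Stream_IO.Write (Output, Buffer (1 .. Last));
      end loop;
      Stream_IO.Close (Input);
   end loop;

   Stream_IO.Close (Output);
end Bin_Cat;
```

    The sketch writes to a named file rather than standard output, which keeps it within Stream_IO alone.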

    --
    Jeff Carter
    "Ada is the only language where users are
    happy to have compilation errors!"
    Jean-Pierre Rosen
    166

  • From Kenneth Wolcott@21:1/5 to Jeffrey R.Carter on Sat Jul 1 13:54:51 2023
    On Saturday, July 1, 2023 at 1:39:30 PM UTC-7, Jeffrey R.Carter wrote:
    > On 2023-07-01 19:15, Kenneth Wolcott wrote:
    >
    >> Using "pure" (?) Ada, how to determine whether a file is a "text" file, not a binary?
    > That depends on the definition of a text file. Under Unix and Windows, all files
    > are sequences of bytes, and so may be considered sequences of Characters, and so
    > text files.
    >
    > If you can define what distinguishes text files from binary files, then it should be fairly easy to write Ada to distinguish them.
    >
    > For example, if a text file is one in which all the characters, except line terminators, are graphic characters, then it should be clear how to determine whether a file meets that definition of a text file.

    I think that is the definition that I'm going to pursue as the simplest and effective definition.

    >> As a side question, how does one read "binary" files in Ada?
    > Ada has Direct_IO, Sequential_IO, and Stream_IO for reading binary files. Which
    > you would use and how to use it depends on what's in the file and what you need
    > to do with it.

    Ok, now that seems to be pretty obvious! I'll go and experiment further...

    Thank you!
    Ken

  • From Kenneth Wolcott@21:1/5 to Keith Thompson on Sat Jul 1 14:50:32 2023
    On Saturday, July 1, 2023 at 2:39:06 PM UTC-7, Keith Thompson wrote:
    > Kenneth Wolcott writes:
    >> On Saturday, July 1, 2023 at 1:39:30 PM UTC-7, Jeffrey R.Carter wrote:
    >>> On 2023-07-01 19:15, Kenneth Wolcott wrote:
    > [...]
    >>> For example, if a text file is one in which all the characters, except line
    >>> terminators, are graphic characters, then it should be clear how to determine
    >>> whether a file meets that definition of a text file.
    >>
    >> I think that is the definition that I'm going to pursue as the
    >> simplest and effective definition.
    > Think about how you want to handle tab characters (non-graphic but
    > common in some text) and carriage return characters (non-graphic but
    > part of a line terminator for Windows-style text files).
    >
    > Also think about the various ways of representing text: ASCII, Latin-1, UTF-8, UTF-16, etc.

    Thanks, Keith!

    It looks like I just need to examine the existing Ada I/O packages more carefully and experiment with the possibilities...

    Ken

  • From Keith Thompson@21:1/5 to Kenneth Wolcott on Sat Jul 1 14:39:02 2023
    Kenneth Wolcott <kennethwolcott@gmail.com> writes:
    > On Saturday, July 1, 2023 at 1:39:30 PM UTC-7, Jeffrey R.Carter wrote:
    >> On 2023-07-01 19:15, Kenneth Wolcott wrote:
    [...]
    >> For example, if a text file is one in which all the characters, except line
    >> terminators, are graphic characters, then it should be clear how to determine
    >> whether a file meets that definition of a text file.
    >
    > I think that is the definition that I'm going to pursue as the
    > simplest and effective definition.

    Think about how you want to handle tab characters (non-graphic but
    common in some text) and carriage return characters (non-graphic but
    part of a line terminator for Windows-style text files).

    Also think about the various ways of representing text: ASCII, Latin-1,
    UTF-8, UTF-16, etc.
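
    One way to make those decisions explicit is to put the policy in a single predicate. The sketch below is only an illustration: the name Is_Text_Character is hypothetical, and it assumes single-byte Latin-1 data. It accepts tab, LF, and CR in addition to graphic characters.

```ada
with Ada.Characters.Handling;
with Ada.Characters.Latin_1;

--  Hypothetical policy: which characters may appear in a "text" file.
--  Assumption: one byte per character, interpreted as Latin-1.
function Is_Text_Character (C : Character) return Boolean is
   use Ada.Characters;
begin
   case C is
      when Latin_1.HT | Latin_1.LF | Latin_1.CR =>
         --  Tab and both line-terminator characters are allowed even
         --  though they are control characters, not graphic ones.
         return True;
      when others =>
         return Handling.Is_Graphic (C);
   end case;
end Is_Text_Character;
```

    A byte-wise Latin-1 test like this does not cover the other encodings: UTF-16 text made of ASCII characters contains NUL bytes, and UTF-8 multi-byte sequences can contain bytes in the range 16#80# .. 16#9F#, which Is_Graphic rejects. Handling those would need decoding first.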

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    Will write code for food.
    void Void(void) { Void(); } /* The recursive call of the void */

  • From Chris Townley@21:1/5 to Kenneth Wolcott on Sun Jul 2 02:08:17 2023
    On 01/07/2023 22:50, Kenneth Wolcott wrote:
    > On Saturday, July 1, 2023 at 2:39:06 PM UTC-7, Keith Thompson wrote:
    >> Kenneth Wolcott writes:
    >>> On Saturday, July 1, 2023 at 1:39:30 PM UTC-7, Jeffrey R.Carter wrote:
    >>>> On 2023-07-01 19:15, Kenneth Wolcott wrote:
    >> [...]
    >>>> For example, if a text file is one in which all the characters, except line
    >>>> terminators, are graphic characters, then it should be clear how to determine
    >>>> whether a file meets that definition of a text file.
    >>>
    >>> I think that is the definition that I'm going to pursue as the
    >>> simplest and effective definition.
    >> Think about how you want to handle tab characters (non-graphic but
    >> common in some text) and carriage return characters (non-graphic but
    >> part of a line terminator for Windows-style text files).
    >>
    >> Also think about the various ways of representing text: ASCII, Latin-1,
    >> UTF-8, UTF-16, etc.
    >
    > Thanks, Keith!
    >
    > It looks like I just need to examine the existing Ada I/O packages more carefully and experiment with the possibilities...
    >
    > Ken

    Maybe worth looking at the Unix file utility; docs and source are available.
    --
    Chris

  • From Kenneth Wolcott@21:1/5 to Chris Townley on Sat Jul 1 18:48:55 2023
    On Saturday, July 1, 2023 at 6:08:21 PM UTC-7, Chris Townley wrote:
    > On 01/07/2023 22:50, Kenneth Wolcott wrote:
    >> On Saturday, July 1, 2023 at 2:39:06 PM UTC-7, Keith Thompson wrote:
    >>> Kenneth Wolcott writes:
    >>>> On Saturday, July 1, 2023 at 1:39:30 PM UTC-7, Jeffrey R.Carter wrote:
    >>>>> On 2023-07-01 19:15, Kenneth Wolcott wrote:
    >>> [...]
    >>>>> For example, if a text file is one in which all the characters, except line
    >>>>> terminators, are graphic characters, then it should be clear how to determine
    >>>>> whether a file meets that definition of a text file.
    >>>>
    >>>> I think that is the definition that I'm going to pursue as the
    >>>> simplest and effective definition.
    >>> Think about how you want to handle tab characters (non-graphic but
    >>> common in some text) and carriage return characters (non-graphic but
    >>> part of a line terminator for Windows-style text files).
    >>>
    >>> Also think about the various ways of representing text: ASCII, Latin-1, UTF-8, UTF-16, etc.
    >>
    >> Thanks, Keith!
    >>
    >> It looks like I just need to examine the existing Ada I/O packages more carefully and experiment with the possibilities...
    >>
    >> Ken
    > Maybe worth looking at the Unix file utility; docs and source are available.

    Thank you, Chris.

    I have just downloaded the source code for the UNIX/Linux file command and am browsing around...

    Thanks,
    Ken
