• Re: Weird behavior of Get character with trailing new lines.

    From Jeffrey R.Carter@21:1/5 to Blady on Fri Sep 22 22:05:55 2023
    On 2023-09-22 21:30, Blady wrote:

    A.10.7 Input-Output of Characters and Strings
    For an item of type Character the following procedures are provided:
    procedure Get(File : in File_Type; Item : out Character);
    procedure Get(Item : out Character);
    After skipping any line terminators and any page terminators, reads the next character from the specified input file and returns the value of this character
    in the out parameter Item.
    The exception End_Error is propagated if an attempt is made to skip a file terminator.

    As you have quoted, Get (Character) skips line terminators. End_Of_File returns True if there is a single line terminator before the file terminator, but False if there are multiple line terminators before the file terminator. So you either
    have to explicitly skip line terminators, or handle End_Error.

    --
    Jeff Carter
    "Unix and C are the ultimate computer viruses."
    Richard Gabriel
    99

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Blady@21:1/5 to All on Fri Sep 22 21:30:15 2023
    Hello,

    I'm reading a text file with Get character from Text_IO with a while
    loop controlled by End_Of_File.

    % cat test_20230922_get_char.adb
    with Ada.Text_IO; use Ada.Text_IO;
    procedure test_20230922_get_char is
       procedure Get is
          F : File_Type;
          Ch : Character;
       begin
          Open (F, In_File, "test_20230922_get_char.adb");
          while not End_Of_File(F) loop
             Get (F, Ch);
             Put (Ch);
          end loop;
          Close (F);
          Put_Line ("File read with get.");
       end;
    begin
       Get;
    end;



    All will be well, unfortunately not!

    Despite the End_Of_File, I got an END_ERROR exception when there are
    several trailing new lines at the end of the text:

    % test_20230922_get_char
    with Ada.Text_IO; use Ada.Text_IO;procedure test_20230922_get_char is
    procedure Get is F : File_Type; Ch : Character; begin
    Open (F, In_File, "test_20230922_get_char.adb"); while not
    End_Of_File(F) loop Get (F, Ch); Put (Ch); end
    loop; Close (F); Put_Line ("File read with get.");
    end;beginGet;end;

    Execution of ../bin/test_20230922_get_char terminated by unhandled exception raised ADA.IO_EXCEPTIONS.END_ERROR : a-textio.adb:517

    The code is compiled with GNAT, does it comply with the standard?

    A.10.7 Input-Output of Characters and Strings
    For an item of type Character the following procedures are provided:
    procedure Get(File : in File_Type; Item : out Character);
    procedure Get(Item : out Character);
    After skipping any line terminators and any page terminators, reads the
    next character from the specified input file and returns the value of
    this character in the out parameter Item.
    The exception End_Error is propagated if an attempt is made to skip a
    file terminator.

    This seems to be the case, then how to avoid the exception?

    Thanks, Pascal.

  • From Niklas Holsti@21:1/5 to Blady on Fri Sep 22 22:52:21 2023
    On 2023-09-22 22:30, Blady wrote:
    Hello,

    I'm reading a text file with Get character from Text_IO with a while
    loop controlled by End_Of_File.

    % cat test_20230922_get_char.adb
    with Ada.Text_IO; use Ada.Text_IO;
    procedure test_20230922_get_char is
       procedure Get is
          F : File_Type;
          Ch : Character;
       begin
          Open (F, In_File, "test_20230922_get_char.adb");
          while not End_Of_File(F) loop
             Get (F, Ch);
             Put (Ch);
          end loop;
          Close (F);
          Put_Line ("File read with get.");
       end;
    begin
       Get;
    end;



    All will be well, unfortunately not!

    Despite the End_Of_File, I got an END_ERROR exception when there are
    several trailing new lines at the end of the text:

    % test_20230922_get_char
    with Ada.Text_IO; use Ada.Text_IO;procedure test_20230922_get_char is procedure Get is      F : File_Type;      Ch : Character;   begin Open
    (F, In_File, "test_20230922_get_char.adb");      while not End_Of_File(F) loop         Get (F, Ch);         Put (Ch);      end
    loop;      Close (F);      Put_Line ("File read with get."); end;beginGet;end;

    Execution of ../bin/test_20230922_get_char terminated by unhandled exception
    raised ADA.IO_EXCEPTIONS.END_ERROR : a-textio.adb:517

    The code is compiled with GNAT, does it comply with the standard?

    A.10.7 Input-Output of Characters and Strings
    For an item of type Character the following procedures are provided:
    procedure Get(File : in File_Type; Item : out Character);
    procedure Get(Item : out Character);
    After skipping any line terminators and any page terminators, reads the
    next character from the specified input file and returns the value of
    this character in the out parameter Item.
    The exception End_Error is propagated if an attempt is made to skip a
    file terminator.

    This seems to be the case, then how to avoid the exception?


    In Text_IO, a line terminator is not an ordinary character, so you must
    handle it separately, for example like this:

    while not End_Of_File(F) loop
       if End_Of_Line(F) then
          --  At a line terminator: echo it and move to the next line.
          New_Line;
          Skip_Line(F);
       else
          Get (F, Ch);
          Put (Ch);
       end if;
    end loop;

  • From J-P. Rosen@21:1/5 to All on Sat Sep 23 09:02:37 2023
    On 22/09/2023 at 22:05, Jeffrey R.Carter wrote:
    On 2023-09-22 21:30, Blady wrote:

    A.10.7 Input-Output of Characters and Strings
    For an item of type Character the following procedures are provided:
    procedure Get(File : in File_Type; Item : out Character);
    procedure Get(Item : out Character);
    After skipping any line terminators and any page terminators, reads
    the next character from the specified input file and returns the value
    of this character in the out parameter Item.
    The exception End_Error is propagated if an attempt is made to skip a
    file terminator.

    As you have quoted, Get (Character) skips line terminators. End_Of_File returns True if there is a single line terminator before the file
    terminator, but False if there are multiple line terminators before the
    file terminator. So you either have to explicitly skip line terminators,
    or handle End_Error.

    And this works only if the input file is "well formed", i.e. if it has
    line terminators as the compiler expects them to be (e.g., you will be
    in trouble if the last line has no LF).
    That's why I never check End_Of_File, but handle the End_Error
    exception. It always works.
    --
    J-P. Rosen
    Adalog
    2 rue du Docteur Lombard, 92441 Issy-les-Moulineaux CEDEX
    https://www.adalog.fr https://www.adacontrol.fr

  • From Niklas Holsti@21:1/5 to J-P. Rosen on Sat Sep 23 11:39:25 2023
    On 2023-09-23 10:02, J-P. Rosen wrote:
    On 22/09/2023 at 22:05, Jeffrey R.Carter wrote:
    On 2023-09-22 21:30, Blady wrote:

    A.10.7 Input-Output of Characters and Strings
    For an item of type Character the following procedures are provided:
    procedure Get(File : in File_Type; Item : out Character);
    procedure Get(Item : out Character);
    After skipping any line terminators and any page terminators, reads
    the next character from the specified input file and returns the
    value of this character in the out parameter Item.
    The exception End_Error is propagated if an attempt is made to skip a
    file terminator.

    As you have quoted, Get (Character) skips line terminators.
    End_Of_File returns True if there is a single line terminator before
    the file terminator, but False if there are multiple line terminators
    before the file terminator. So you either have to explicitly skip line
    terminators, or handle End_Error.

    And this works only if the input file is "well formed", i.e. if it has
    line terminators as the compiler expects them to be (f.e., you will be
    in trouble if the last line has no LF).


    Hm. The code I suggested, which handles line terminators separately,
    does work without raising End_Error even if the last line has no line
    terminator, at least in the context of the OP's program.


    That's why I never check End_Of_File, but handle the End_Error
    exception. It always works.


    True, but it may not be convenient for the overall logic of the program
    that reads the file. That program often wants to do something with the
    contents, after reading the whole file, and having to enter that part of
    the program through an exception does complicate the code a little.

    On the other hand, past posts on this issue say that using End_Error
    instead of the End_Of_File function is faster, probably because the
    Text_IO code that implements Get cannot know that the program has
    already checked for End_Of_File, so Get has to check for that case
    anyway, redundantly.

    My usual method for reading text files is to use Text_IO.Get_Line, and
    (I admit) usually with End_Error termination.
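
    A minimal sketch of that line-by-line style, for illustration only. The procedure name Copy_Lines and the file name "input.txt" are placeholders, not taken from the thread. It relies on the function form of Get_Line (Ada 2005 and later), which propagates End_Error when only the file terminator remains.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Copy_Lines is
   F : File_Type;
begin
   --  "input.txt" is a placeholder file name (assumption).
   Open (F, In_File, "input.txt");

   begin
      loop
         --  Get_Line returns one whole line without its terminator;
         --  empty trailing lines come back as empty strings.
         Put_Line (Get_Line (F));
      end loop;
   exception
      when End_Error =>
         null;  --  normal end of input, fall through to Close
   end;

   Close (F);
end Copy_Lines;
```

    Because each line terminator is consumed by Get_Line itself, trailing empty lines need no special handling in this form.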

  • From Dmitry A. Kazakov@21:1/5 to Niklas Holsti on Sat Sep 23 11:25:05 2023
    On 2023-09-23 10:39, Niklas Holsti wrote:
    On 2023-09-23 10:02, J-P. Rosen wrote:

    That's why I never check End_Of_File, but handle the End_Error
    exception. It always works.

    True, but it may not be convenient for the overall logic of the program
    that reads the file. That program often wants to do something with the
    contents, after reading the whole file, and having to enter that part of
    the program through an exception does complicate the code a little.

    It rather simplifies the code. You exit the loop and do whatever is
    necessary there.

    Testing for the file end is unreliable and non-portable. Many types of
    files simply do not support that test. In other cases the test does not
    leave the file unchanged, and its side effects can change the program logic.

    It is well advised to never ever use it.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From Niklas Holsti@21:1/5 to Dmitry A. Kazakov on Sat Sep 23 17:03:14 2023
    On 2023-09-23 12:25, Dmitry A. Kazakov wrote:
    On 2023-09-23 10:39, Niklas Holsti wrote:
    On 2023-09-23 10:02, J-P. Rosen wrote:

    That's why I never check End_Of_File, but handle the End_Error
    exception. It always works.

    True, but it may not be convenient for the overall logic of the
    program that reads the file. That program often wants to do something
    with the contents, after reading the whole file, and having to enter
    that part of the program through an exception does complicate the code
    a little.

    It rather simplifies the code.


    Oh?


    You exit the loop and do whatever is necessary there.

    That is exactly what happens in the "while not End_Of_File" loop.

    If you want to use End_Error instead, you have to add an exception
    handler, and if you want to stay in the subprogram's statement sequence
    without entering the subprogram-level exception handlers, you have to
    add a block to contain the reading loop and make the exception handler
    local to that block.

    To me that looks like adding code -> more complex. Of course not much
    more complex, but a little, as I said.
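
    For comparison, a minimal sketch of the block-with-local-handler shape described above. The names Count_Characters, Read_All and "input.txt" are illustrative assumptions, not code from the thread.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Count_Characters is
   F     : File_Type;
   Ch    : Character;
   Total : Natural := 0;
begin
   --  "input.txt" is a placeholder file name (assumption).
   Open (F, In_File, "input.txt");

   Read_All :
   begin
      loop
         --  Get skips line and page terminators, so only ordinary
         --  characters are counted here.
         Get (F, Ch);
         Total := Total + 1;
      end loop;
   exception
      when End_Error =>
         null;  --  handled locally, so execution continues after the block
   end Read_All;

   --  Still in the normal statement sequence of the subprogram.
   Close (F);
   Put_Line ("Characters read:" & Natural'Image (Total));
end Count_Characters;
```

    The extra block and handler are the added code in question; whether that counts as more complex than a "while not End_Of_File" loop is the point under debate.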


    Testing for the file end is unreliable and non-portable. Many types
    of files simply do not support that test. In other cases the test is
    not file immutable with the side effects that can change the program
    logic.

    I suppose you are talking about the need for End_Of_File to possibly
    read ahead past a line terminator? If not, please clarify.

    That said, I certainly think that a program reading files should be
    prepared to handle End_Error, especially if a file is read at several
    places in the program (and not in a single loop as in the present program).

  • From Dmitry A. Kazakov@21:1/5 to Niklas Holsti on Sun Sep 24 09:50:48 2023
    On 2023-09-23 16:03, Niklas Holsti wrote:
    On 2023-09-23 12:25, Dmitry A. Kazakov wrote:

    You exit the loop and do whatever is necessary there.

    That is exactly what happens in the "while not End_Of_File" loop.

    It does not because you must handle I/O errors and close the file.

    If you want to use End_Error instead, you have to add an exception
    handler, and if you want to stay in the subprogram's statement sequence
    without entering the subprogram-level exception handlers, you have to
    add a block to contain the reading loop and make the exception handler
    local to that block.

    You always have to in order to handle I/O errors.

    To me that looks like adding code -> more complex. Of course not much
    more complex, but a little, as I said.

    No, it is simpler if the code is production code rather than an
    exercise. Consider typical case when looping implements reading some
    message, block etc. You have

    loop
       read something
       read another piece
       read some count
       read a block of count bytes
       ...

    You cannot do it this way if you use end of file test because you must
    protect each minimal input item (e.g. byte) by the test. It is massively
    obtrusive and would distort program logic. You will end up with nested
    ifs or else gotos.
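
    A rough sketch of that shape using Text_IO, under stated assumptions: the file "messages.txt" and its layout (a character count on one line, then that many characters on the next) are invented for illustration, as is the procedure name Read_Messages.

```ada
with Ada.Text_IO;         use Ada.Text_IO;
with Ada.Integer_Text_IO;

procedure Read_Messages is
   F     : File_Type;
   Count : Integer;
   Ch    : Character;
begin
   --  "messages.txt" and its record layout are assumptions.
   Open (F, In_File, "messages.txt");

   begin
      loop
         --  Read the length of the next message; this raises End_Error
         --  when only line terminators and the file terminator remain.
         Ada.Integer_Text_IO.Get (F, Count);

         --  Read the message body. No per-item end-of-file test is
         --  needed: a truncated file raises End_Error here as well.
         for I in 1 .. Count loop
            Get (F, Ch);
            Put (Ch);
         end loop;
         New_Line;
      end loop;
   exception
      when End_Error =>
         null;  --  end of input, possibly in the middle of a message
   end;

   Close (F);
end Read_Messages;
```

    A single handler covers every read in the loop body. A production version would presumably also handle Data_Error and distinguish a clean end of input from a truncated message.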

    Testing for the file end is unreliable and non-portable. Many types
    of files simply do not support that test. In other cases the test is
    not file immutable with the side effects that can change the program
    logic.

    I suppose you are talking about the need for End_Of_File to possibly
    read ahead past a line terminator? If not, please clarify.

    Yes, reading ahead and also issues with blocking and with race conditions
    in shared files. Then things like sockets do not have end of file;
    a connection drop is indicated by an empty read.

    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

  • From Blady@21:1/5 to Blady on Mon Sep 25 21:55:56 2023
    On 24/09/2023 at 09:50, Dmitry A. Kazakov wrote:
    On 2023-09-23 16:03, Niklas Holsti wrote:
    On 2023-09-23 10:02, J-P. Rosen wrote:
    On 22/09/2023 at 22:05, Jeffrey R.Carter wrote:
    On 2023-09-22 21:30, Blady wrote:

    A.10.7 Input-Output of Characters and Strings
    For an item of type Character the following procedures are provided:
    procedure Get(File : in File_Type; Item : out Character);
    procedure Get(Item : out Character);
    After skipping any line terminators and any page terminators, reads
    the next character from the specified input file and returns the
    value of this character in the out parameter Item.
    The exception End_Error is propagated if an attempt is made to skip
    a file terminator.

    Thanks all for your helpful answers.

    It actually helps.

    Especially, I was not aware of the particular behavior of End_Of_File
    with a single line terminator before the file terminator.

    In my case, I prefer to reserve exceptions for exceptional situations
    :-) so I've taken the code from Niklas's example.

    Regards, Pascal.

  • From Randy Brukardt@21:1/5 to J-P. Rosen on Tue Sep 26 00:53:53 2023
    "J-P. Rosen" <rosen@adalog.fr> wrote in message news:uem2id$moia$1@dont-email.me...
    On 22/09/2023 at 22:05, Jeffrey R.Carter wrote:
    On 2023-09-22 21:30, Blady wrote:

    A.10.7 Input-Output of Characters and Strings
    For an item of type Character the following procedures are provided:
    procedure Get(File : in File_Type; Item : out Character);
    procedure Get(Item : out Character);
    After skipping any line terminators and any page terminators, reads the
    next character from the specified input file and returns the value of
    this character in the out parameter Item.
    The exception End_Error is propagated if an attempt is made to skip a
    file terminator.

    As you have quoted, Get (Character) skips line terminators. End_Of_File
    returns True if there is a single line terminator before the file
    terminator, but False if there are multiple line terminators before the
    file terminator. So you either have to explicitly skip line terminators,
    or handle End_Error.

    And this works only if the input file is "well formed", i.e. if it has
    line terminators as the compiler expects them to be (e.g., you will be
    in trouble if the last line has no LF).
    That's why I never check End_Of_File, but handle the End_Error exception.
    It always works.

    Agreed. And if the file might contain a page terminator, things get even
    worse because you would have to mess around with End_Of_Page in order to
    avoid hitting a combination that will still raise End_Error. It's not worth
    the mental energy to avoid it, especially in a program that will be used by
    others. (I've sometimes used the simplest possible way of writing a
    "quick&dirty" program for my own use; for such programs I skip the error
    handling, as I figure I can work out what I did wrong by looking at the
    exception raised. But that's often a bad idea even in that case, as such
    programs have a tendency to get reused years later and then the intended
    usage often isn't clear.)

    Randy.
