• Dataclasses, immutability(?), and ChatGPT

    From dn@21:1/5 to All on Wed Apr 12 16:58:56 2023
    Are dataclasses (or instances thereof) mutable or immutable?
    - and in what sense?


    Have been experimenting with ChatGPT. In particular: its possibilities
    as a trainer, good ideas for methods of introducing new topics, its
    capability for drawing-up demonstrations or examples, its interpretation
    of assessment questions (ie evaluating my own work, eg do they test
    desired specifics, is it possible/likely to misinterpret the question, ...)


    Conversed with ChatGPT about the differences between (standard)
    custom-classes and dataclasses, and when one might prefer one over the
    other.

    We(?) started discussing how dataclasses might be introduced to trainees currently only familiar with Python's built-ins: "...how they are
    superior to using built-in data-constructs..."

    Amongst the four benefits ChatGPT listed was:
    «
    Immutable instances: By default, instances of dataclasses are immutable,
    which means that once created, their attributes cannot be modified. This
    can help to prevent unintended modifications to the data.
    »

    Huh? If we'd been discussing namedtuples over (say) dictionaries, I'd
    perhaps have accepted the reply.


    The next step was to write some code to double-check my understanding (reproduced below). No change there.
    (given that I'm perfect...cough, cough, splutter)


    Thus, back to ChatGPT to seek clarification:
    «
    You said "Immutable instances: By default, instances of dataclasses are immutable, which means that once created, their attributes cannot be
    modified. This can help to prevent unintended modifications to the
    data." The attribute-values within a data-class can be changed. In what
    way are Python dataclasses immutable?
    »

    ChatGPT did indeed clarify that:
    «
    You're correct that the attribute values within a dataclass can be
    changed, even though the instance itself is considered immutable.
    »

    That's better. The machine is as smart as I am - or I, as smart as it -
    or something(?)


    However, the supporting-explanation then appears to argue against its
    own clarification (and the assertion in the question):
    «
    So to clarify, when we say that instances of dataclasses are immutable,
    we mean that the attributes of an instance cannot be changed after it is created.

    In other words, if you create an instance of a dataclass, its attributes
    cannot be modified by any external code. If you want to change the value
    of an attribute, you need to create a new instance of the dataclass with
    the updated attribute value.

    ... [code excised - see below]

    This is because the instance is considered immutable, and its attributes
    cannot be changed directly. Instead, we need to create a new instance of
    the dataclass with the updated attribute value, like this...

    Now, new_person is a new instance of the Person dataclass with the
    updated attribute values, while the original person instance remains
    unchanged.

    So to summarize, while the attribute values within a dataclass can be
    changed, instances of dataclasses are considered immutable because their attributes cannot be modified directly once they are created. Any
    updates to the attributes require the creation of a new instance of the dataclass.
    »

    [Note]
    In and amongst the above explanatory-paragraphs it gives example
    code-snippets. Those are condensed into:

    from dataclasses import dataclass

    @dataclass
    class Person:
        name: str
        age: int

    person = Person(name='Alice', age=20)

    person.name = 'Bob'  # raises AttributeError: can't set attribute
    person.age = 21      # raises AttributeError: can't set attribute

    - note the erroneous comments about exceptions being raised
    (they weren't/aren't)!
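
    For comparison, the behaviour ChatGPT described does exist, but only
    when it is asked for with frozen=True. A minimal sketch: Person mirrors
    the snippet above, while FrozenPerson is a hypothetical name added here
    purely for illustration.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass
class Person:
    """Default dataclass: instances are mutable."""
    name: str
    age: int


@dataclass(frozen=True)
class FrozenPerson:
    """Hypothetical frozen variant: attribute assignment is blocked."""
    name: str
    age: int


person = Person(name="Alice", age=20)
person.name = "Bob"  # no exception: a plain dataclass allows assignment

frozen_person = FrozenPerson(name="Alice", age=20)
try:
    frozen_person.name = "Bob"
    assignment_blocked = False
except FrozenInstanceError:  # a subclass of AttributeError
    assignment_blocked = True

# replace() builds a new instance and leaves the original untouched
new_person = replace(frozen_person, name="Bob", age=21)
print(person, frozen_person, new_person, assignment_blocked)
```

    So the "create a new instance" advice only becomes necessary once the
    class has been declared frozen; it is not the default.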


    Here's a home-grown version illustrating all the points made during the investigation, by way of proof/disproof:

    """ PythonTraining:dataclass.py
        Prove/disprove claim made by Chat-GPT.
    """

    __author__ = "dn, IT&T Consultant"
    __python__ = "3.11"
    __created__ = "PyCharm, 12 Apr 2023"
    __copyright__ = "Copyright © 2023~"
    __license__ = "MIT"

    # PSL
    from dataclasses import dataclass


    @dataclass
    class Coordinates():
        """Sample dataclass. """
        x:int
        y:int


    if __name__ == "__main__":
        print( "\nCommencing execution\n" )

        coordinates = Coordinates( 1, 2, )
        print( coordinates, id( coordinates ), )
        coordinates.x = 3
        print( coordinates, id( coordinates ), )
        coordinates.z = 4
        print( coordinates, id( coordinates ), )
        print( coordinates.x, coordinates.y, coordinates.z, )


    ### output:
    Commencing execution

    Coordinates(x=1, y=2) 140436963150928
    Coordinates(x=3, y=2) 140436963150928
    Coordinates(x=3, y=2) 140436963150928
    3 2 4

    Terminating
    ###

    Not only are a dataclass instance's attribute-values mutable, but
    further attributes can be dynamically-added to the object-instance!

    Yes, if the code included:

    coordinates = Coordinates( 5, 6, )

    the new "coordinates" identifier would point to a different id()
    'address', ie a fresh immutable-instance.
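
    The difference between changing an object and re-pointing a name can be
    sketched with a second reference. The name alias is an assumption added
    for illustration. The freshly created object is a different instance,
    but it is just as mutable as the first.

```python
from dataclasses import dataclass


@dataclass
class Coordinates:
    """Same shape as the sample dataclass above."""
    x: int
    y: int


coordinates = Coordinates(1, 2)
alias = coordinates  # a second name for the same object

coordinates.x = 3  # mutation: visible through both names
same_object = alias is coordinates

coordinates = Coordinates(5, 6)  # rebinding: the name now refers to a new object
rebound = alias is not coordinates

print(alias, coordinates, same_object, rebound)
```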


    The 'book of words' (https://docs.python.org/3/library/dataclasses.html)
    does mention immutability (wrt dataclasses) in that it is possible to
    add a __hash__() method (any object defined with one is (technically)
    immutable). However, apart from the default_factory argument, there
    doesn't appear to be other discussion of [im]mutability.
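
    A small sketch of how those two pieces of the documentation behave in
    practice. The class names Plain, Frozen and Basket are hypothetical,
    chosen only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Plain:
    """Default flags: eq=True, frozen=False."""
    x: int


@dataclass(frozen=True)
class Frozen:
    """eq=True plus frozen=True: a __hash__ is generated from the fields."""
    x: int


@dataclass
class Basket:
    # default_factory gives each instance its own list; a bare [] default
    # is rejected by dataclasses with a ValueError
    items: list = field(default_factory=list)


# With eq=True and no frozen=True, dataclass sets __hash__ to None
plain_hashable = Plain.__hash__ is not None
frozen_hashable = hash(Frozen(1)) == hash(Frozen(1))

first, second = Basket(), Basket()
first.items.append("apple")  # does not leak into the second basket
print(plain_hashable, frozen_hashable, first, second)
```

    In other words, a default dataclass is unhashable precisely because it
    is treated as mutable.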


    Anything I've 'missed'?
    - or a salutary tale of not depending upon ChatGPT etc?

    --
    Regards,
    =dn

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From Roel Schroeven@21:1/5 to All on Wed Apr 12 10:02:51 2023
    On 12/04/2023 at 6:58, dn via Python-list wrote:
    Are dataclasses (or instances thereof) mutable or immutable?
    - and in what sense?

    Instances of dataclasses are mutable, just like normal classes.
    Dataclasses *are* normal classes, with some extra special methods. They
    are totally different from namedtuples, even though the use cases
    somewhat overlap. They *can* be immutable, I think, if the programmer
    takes care to make them so. I don't think adding __hash__() is enough:
    as I understand it, it's an indication that a class is immutable, but
    doesn't actually by itself make it so. "Mutability is a complicated
    property that depends on the programmer’s intent, the existence and
    behavior of __eq__(), and the values of the eq and frozen flags in
    the dataclass() decorator", says the documentation
    (https://docs.python.org/3.10/library/dataclasses.html#dataclasses.dataclass).
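
    The point that a __hash__() on its own does not make a class immutable
    can be sketched with the unsafe_hash flag. Point is a hypothetical
    class used only for this illustration.

```python
from dataclasses import dataclass


@dataclass(unsafe_hash=True)
class Point:
    """Hashable because a __hash__ is generated, yet still mutable."""
    x: int
    y: int


point = Point(1, 2)
hash_before = hash(point)
seen = {point}

point.x = 99  # allowed: nothing here freezes the instance
hash_after = hash(point)

# The set filed the object under its old hash, so lookups can now miss it.
still_found = point in seen
print(hash_before == hash_after, still_found)
```

    This is presumably why the flag carries "unsafe" in its name: the hash
    is a promise that the programmer has to keep.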

    Amongst the four benefits ChatGPT listed was:
    «
    Immutable instances: By default, instances of dataclasses are
    immutable, which means that once created, their attributes cannot be
    modified. This can help to prevent unintended modifications to the data.
    »

    Huh? If we'd been discussing namedtuples over (say) dictionaries, I'd
    perhaps have accepted the reply.

    ChatGPT is wrong.

    Anything I've 'missed'?
    - or a salutary tale of not depending upon ChatGPT etc?

    You didn't miss anything, ChatGPT is wrong. The thing to look out for is
    that when ChatGPT is wrong, it sounds just as convincing as when it's
    right; there is no indication in its tone or style that it's making
    things up. Always double check!

    --
    "A common mistake that people make when trying to design something completely foolproof is to underestimate the ingenuity of complete fools."
    -- Douglas Adams

  • From Stefan Ram@21:1/5 to Roel Schroeven on Wed Apr 12 09:41:31 2023
    Roel Schroeven <roel@roelschroeven.net> writes:
    You didn't miss anything, ChatGPT is wrong. The thing to look out for is
    that when ChatGPT is wrong, it sounds just as convincing as when it's
    right; there is no indication in its tone or style that it's making
    things up. Always double check!

    Before ChatGPT became popular, in 2011, there was
    a famous TED talk by Kathryn Schulz, where she said:

    |It does feel like something to be wrong.
    |It feels like being right.
  • From Grant Edwards@21:1/5 to Roel Schroeven on Wed Apr 12 05:00:36 2023
    On 2023-04-12, Roel Schroeven <roel@roelschroeven.net> wrote:

    Huh? If we'd been discussing namedtuples over (say) dictionaries, I'd
    perhaps have accepted the reply.

    ChatGPT is wrong.

    Anything I've 'missed'?
    - or a salutary tale of not depending upon ChatGPT etc?

    You didn't miss anything, ChatGPT is wrong. The thing to look out for is
    that when ChatGPT is wrong, it sounds just as convincing as when it's
    right; there is no indication in its tone or style that it's making
    things up.

    Yep, that's how ChatGPT works. It's a program to generate output
    language that sounds right based on a huge training set of
    text. Whether that "right sounding" language agrees with real world
    facts or not is irrelevant to the language-generating algorithm.

    --
    Grant

  • From Thomas Passin@21:1/5 to dn via Python-list on Wed Apr 12 08:26:47 2023
    On 4/12/2023 12:58 AM, dn via Python-list wrote:
    Anything I've 'missed'?
    - or a salutary tale of not depending upon ChatGPT etc?

    People need to remember that ChatGPT-like systems put words together the
    way that many humans usually do. So what they emit usually sounds
    smooth and human-like. If it's code they emit, it will tend to seem
    plausible because lines of code are basically sentences, and learning
    how to construct plausible sentences is what these systems are built to
    do. That's **plausible**, not "logical" or "correct".

    The vast size of these systems means that they can include a larger
    context in figuring out what words to place next compared with earlier,
    smaller systems.

    But consider: what if you wrote code as a stream-of-consciousness
    process? That code might seem plausible, but why would you have any
    confidence in it? Or to put it another way, what if most of ChatGPT's
    exposure to code came from StackOverflow archives?

    On top of that, ChatGPT-like systems do not know your requirements nor
    the reasons behind your requests. They only know that when other people
    put words and phrases together like you did, they tended to make
    responses that sound like what the chatbot emits next. It's basically cargo-culting its responses.

    Apparently researchers have been learning that the more parameters that
    a system like this has, the more likely it is to learn how to emit
    responses that the questioner likes. Essentially, it could become the
    ultimate yes-man!

    So there is some probability that the system will tell you interesting
    or useful things, some probability that it will try to tell you what it
    thinks you want to hear, some probability that it will tell you incorrect
    things that other people have repeated, and some probability that it
    will perseverate - simply make things up.

    If I were going to write a novel about an alternate history, I think
    that a ChatGPT-like system would be a fantastic writing assistant.
    Code? Not so much.
