I will be short and *you* will understand what I wrote.
1) Applications exchange data/text in text mode, and only in text mode. An application does not know what a byte is and, consequently, does not know the coding
of characters.
2) It is, however, possible to pass/transfer "an encoded text" as text. For this one needs a codec which covers the complete byte range. This is where Python fails.
Small illustration: piping from a "utf-16 output" to a receiver application. The transfer should always succeed, and it is up to the receiver to know what
to do with the incoming text, even if it represents bytes.
(U+8161)
>>> '腡'.encode('utf-16-le').decode('latin1')
'a\x81'
>>> '腡'.encode('utf-16-le').decode('cp1252')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "c:\Python311\Lib\encodings\cp1252.py", line 15, in decode
    return codecs.charmap_decode(input,errors,decoding_table)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 1: character maps to <undefined>
cp1252 being the os.device_encoding, i.e. the "transfer encoding".
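To make the "complete byte range" point concrete, here is a minimal sketch that asks a codec which single byte values it refuses to decode in strict mode. The helper name undecodable_bytes is my own, not a Python API. It only describes Python's own codec tables, not what any other application does with those bytes.

```python
def undecodable_bytes(codec_name):
    """Return the byte values (0..255) the codec cannot decode in strict mode."""
    missing = []
    for value in range(256):
        try:
            bytes([value]).decode(codec_name)
        except UnicodeDecodeError:
            missing.append(value)
    return missing


# latin1 maps every byte to a code point, so nothing is missing.
print(undecodable_bytes('latin1'))

# Python's cp1252 table leaves a few positions undefined, 0x81 among them.
print([hex(value) for value in undecodable_bytes('cp1252')])

# Because latin1 covers the full range, the "encoded text" survives a round trip.
data = '\u8161'.encode('utf-16-le')          # the character used above
as_text = data.decode('latin1')              # always succeeds
restored = as_text.encode('latin1').decode('utf-16-le')
print(restored == '\u8161')
```

With latin1 the list is empty and the round trip restores the original character; with cp1252 the list contains 0x81, which is exactly the byte the traceback complains about.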
It is very easy to mimic this behaviour in PowerShell with Python, just by using
a keyboard, that is to say, by typing text.
You may argue, "use bytes". This will not work, or will only work with and within Python applications (decoding the sys.stdout.buffer). Nonsense, because applications are "speaking text."
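For reference, the "use bytes" route looks roughly like the sketch below. It assumes both ends are Python, which is precisely the restriction described above: a child process writes raw bytes through sys.stdout.buffer and the parent reads them undecoded. The sender one-liner is a made-up stand-in for a real application.

```python
import subprocess
import sys

# Hypothetical sender: writes the two UTF-16-LE bytes of U+8161
# straight to the byte layer, bypassing any text encoding.
sender = "import sys; sys.stdout.buffer.write(bytes([0x61, 0x81]))"

# The receiver (this process) captures stdout as bytes, without decoding.
result = subprocess.run(
    [sys.executable, "-c", sender],
    capture_output=True,
    check=True,
)
received = result.stdout

# Only the receiver decides how to interpret the bytes.
decoded = received.decode('utf-16-le')
print(received)
print(ascii(decoded))   # ascii() avoids depending on the console encoding
```

This says nothing about what a non-Python receiver, or the PowerShell pipeline itself, does with the same stream.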
---
What I presented has nothing to do with the examples of the guy who is
not using the coding of characters correctly.
It is similar to the issues that this "US-girl" (?) presented some time ago. She
had difficulties with "piping" in PowerShell.
That's my understanding.