I am trying to use Unicode codepoints along with Unicode UTF8s in a
bash script in order to compare codepoints and their matching UTF8 in
a database of the 1071 Egyptian hieroglyphs.
So what I am trying to do is ensure that each record is accurately
validated in the 1071 lines of my file.
It works in the bash shell e.g.
codepoint=13000
printf "\U000${codepoint}\n"
𓀀
However, if I put the same code into a script, e.g.
cat > cpt
#!/bin/bash
codepoint=13000
printf "\U000${codepoint}\n"
chmod +x cpt
bash -x ./cpt
+ codepoint=13000
+ printf '\U00013000\n'
\U00013000
So instead of creating the hieroglyph, the script just ignores the
same exact code. Is there any way around this? Thanks for any
help.
bash --version
GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin20)
Copyright (C) 2007 Free Software Foundation, Inc.
paris2venice wrote:
I am trying to use Unicode codepoints along with Unicode UTF8s in a
bash script in order to compare codepoints and their matching UTF8 in
a database of the 1071 Egyptian hieroglyphs.
So what I am trying to do is ensure that each record is accurately validated in the 1071 lines of my file.
It works in the bash shell e.g.
codepoint=13000
printf "\U000${codepoint}\n"
𓀀
It sounds like you're on macOS, so I suspect the interactive shell
you're using may be zsh, not bash - and probably a newer version.
However, if I put the same code into a script, e.g.
cat > cpt
#!/bin/bash
codepoint=13000
printf "\U000${codepoint}\n"
chmod +x cpt
bash -x ./cpt
+ codepoint=13000
+ printf '\U00013000\n'
\U00013000
So instead of creating the hieroglyph, the script just ignores the
same exact code. Is there any way around this? Thanks for any
help.
bash --version
GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin20)
Copyright (C) 2007 Free Software Foundation, Inc.
That's a very old version of bash. To quote the CHANGES file from a
newer version, "Fixed several bugs with the handling of valid and
invalid unicode character values when used with the \u and \U escape
sequences to printf and $'...'." So the old version not having those
fixes might be the problem.
-Rus.
On 2023-11-29, paris2venice wrote:
It works in the bash shell e.g.
codepoint=13000
printf "\U000${codepoint}\n"
𓀀
However, if I put the same code into a script, e.g.
cat > cpt
#!/bin/bash
codepoint=13000
printf "\U000${codepoint}\n"
\U00013000
So instead of creating the hieroglyph, the script just ignores the same exact code.
That doesn't happen. Something isn't like you say it is.
bash --version
GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin20)
Copyright (C) 2007 Free Software Foundation, Inc.
Presumably that is the version you use to execute the script.
It is ancient and may not yet support the \U syntax.
What bash are you using interactively?
$ echo $BASH_VERSION
--
Christian "naddy" Weisgerber
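For what it's worth, the \u and \U escapes only arrived in bash 4.2, so a script run by a 3.2.57 /bin/bash will print them literally. Below is a minimal sketch of a guard a script could use. The function name supports_unicode_escapes is made up for illustration, and it only parses the version string rather than testing printf itself:

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the given bash version string (default:
# the running shell's $BASH_VERSION) is 4.2 or later, the release in which
# printf gained the \u and \U escapes.
supports_unicode_escapes() {
    local version="${1:-${BASH_VERSION:-0.0}}"
    local major="${version%%.*}"   # text before the first dot
    local rest="${version#*.}"     # text after the first dot
    local minor="${rest%%.*}"      # text up to the next dot
    [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 2 ]; }
}

printf 'running under bash %s\n' "${BASH_VERSION:-(not bash)}"
if supports_unicode_escapes; then
    printf '%s\n' 'printf should understand \U here'
else
    printf '%s\n' 'printf will print \U literally here'
fi
```

Running this both at the prompt and from inside the script shows whether the two are really the same shell.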
Russell Marks wrote:
paris2venice wrote:
[...]
So instead of creating the hieroglyph, the script just ignores the
same exact code. Is there any way around this? Thanks for any
help.
bash --version
GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin20)
Copyright (C) 2007 Free Software Foundation, Inc.
That's a very old version of bash. To quote the CHANGES file from a
newer version, "Fixed several bugs with the handling of valid and
invalid unicode character values when used with the \u and \U escape
sequences to printf and $'...'." So the old version not having those
fixes might be the problem.
That's interesting. Did you see my following comment about trying
it with version 5.0.17? I had the same exact results.
In any case, the UTF-8 does not fail even with the 3.2.57(1)
release:
utf8a=80 utf8b=80
utf8_hg=$( printf "\xF0\x93\x${utf8a}\x${utf8b}" )
echo $utf8_hg
𓀀
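Since the \x escapes do work in 3.2.57, one possible workaround is to compute the UTF-8 bytes from the code point arithmetically and emit them with \x. Here is a rough sketch, assuming the code point is given as a hex string such as 13000. The name codepoint_to_utf8 is hypothetical, and no check is made for surrogates or for values above 10FFFF:

```shell
#!/bin/bash
# Hypothetical helper: write the UTF-8 encoding of a hexadecimal code point
# to stdout, using only integer arithmetic and \x escapes.
codepoint_to_utf8() {
    local cp=$(( 16#$1 )) fmt
    if (( cp < 0x80 )); then
        # 1 byte: 0xxxxxxx
        fmt=$(printf '\\x%02x' "$cp")
    elif (( cp < 0x800 )); then
        # 2 bytes: 110xxxxx 10xxxxxx
        fmt=$(printf '\\x%02x\\x%02x' \
            $(( 0xC0 | (cp >> 6) )) \
            $(( 0x80 | (cp & 0x3F) )))
    elif (( cp < 0x10000 )); then
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        fmt=$(printf '\\x%02x\\x%02x\\x%02x' \
            $(( 0xE0 | (cp >> 12) )) \
            $(( 0x80 | ((cp >> 6) & 0x3F) )) \
            $(( 0x80 | (cp & 0x3F) )))
    else
        # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        fmt=$(printf '\\x%02x\\x%02x\\x%02x\\x%02x' \
            $(( 0xF0 | (cp >> 18) )) \
            $(( 0x80 | ((cp >> 12) & 0x3F) )) \
            $(( 0x80 | ((cp >> 6) & 0x3F) )) \
            $(( 0x80 | (cp & 0x3F) )))
    fi
    # %b expands the \xHH sequences built above into raw bytes.
    printf '%b' "$fmt"
}

codepoint_to_utf8 13000   # emits the bytes F0 93 80 80
echo
```

The result could then be compared as a string against the UTF-8 field of each record. How the records are laid out is not shown in the thread, so that part is left out.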
Playing around with this on Linux, one way to nearly replicate your
result with a newer bash is "LC_ALL=C printf '\U00013000\n'" which for
me will output "\u00013000". So I suppose there could be a locale
issue involved.
On 2023-11-30, Russell Marks <zgedneil@spam^H^H^H^Hgmail.com> wrote:
Playing around with this on Linux, one way to nearly replicate your
result with a newer bash is "LC_ALL=C printf '\U00013000\n'" which for
me will output "\u00013000". So I suppose there could be a locale
issue involved.
But if you execute the commands in question first on the command
line, then in a minimal script as shown, the same locale settings
will be used for both.
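One way to check this rather than assume it is to have both the interactive shell and the script report the locale they see. This is a small sketch, assuming the usual precedence of LC_ALL over LC_CTYPE over LANG. The names effective_ctype and is_utf8_locale are invented here:

```shell
#!/bin/bash
# Hypothetical helper: print the setting that governs character encoding.
# Empty variables are treated as unset; with nothing set, the locale is C.
effective_ctype() {
    printf '%s\n' "${LC_ALL:-${LC_CTYPE:-${LANG:-C}}}"
}

# Hypothetical helper: succeeds if that setting names a UTF-8 locale,
# e.g. en_US.UTF-8 or C.utf8. It only inspects the name, not the system.
is_utf8_locale() {
    case "$(effective_ctype)" in
        *[Uu][Tt][Ff]-8*|*[Uu][Tt][Ff]8*) return 0 ;;
        *) return 1 ;;
    esac
}

printf 'effective LC_CTYPE: %s\n' "$(effective_ctype)"
if is_utf8_locale; then
    printf '%s\n' 'locale name indicates UTF-8'
else
    printf '%s\n' 'locale name does not indicate UTF-8; \U may come out as a literal escape'
fi
```

If the output differs between the prompt and the script, the locale is a suspect. If it is identical, the difference has to lie elsewhere, such as in which shell runs the script.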