On Wednesday, 3 February 2021 at 20:56:57 UTC-5, J Naman wrote:

I was surprised by printf behavior when coercing strings to numbers. Not saying it is a bug, just surprised me.
GAWK Version:
BEGIN{
#CONVFMT="%.6g"; GAWK default
a="202102.01.1234"
printf("s{%s} d{%d} f{%f}\n",a,a,a) # s{202102.01.1234} d{202102} f{202102.010000}
printf("s{%s} d{%d} f{%f}\n",+a,+a,+a) # s{202102} d{202102} f{202102.010000}
printf("s{%s} d{%d} f{%f}\n",0+a,0+a,0+a) # s{202102} d{202102} f{202102.010000}
printf("s{%s} d{%d} f{%f}\n","" a,"" a,"" a) # s{202102.01.1234} d{202102} f{202102.010000}
a="20210201.1234"
printf("s{%s} d{%d} f{%f}\n",a,a,a) # s{20210201.1234} d{20210201} f{20210201.123400}
printf("s{%s} d{%d} f{%f}\n",+a,+a,+a) # s{2.02102e+007} d{20210201} f{20210201.123400}
printf("s{%s} d{%d} f{%f}\n",0+a,0+a,0+a) # s{2.02102e+007} d{20210201} f{20210201.123400}
printf("s{%s} d{%d} f{%f}\n","" a,"" a,"" a) # s{20210201.1234} d{20210201} f{20210201.123400}
exit;} # eoBegin
#============
The first set surprised me a little because there was no error.

On 04.02.2021 03:02, J Naman wrote:

Woops. The entire second set is correct. CONVFMT %.6g creates the 2.02102e+007

Janis Papanagnou wrote:

> The first set surprised me a little because there was no error.

I wouldn't expect an error, but two of the lines surprise me as well
in one of these output fields.

But it seems not an issue of the string conversions, you can also see
effects with floats. Obviously depending on the size of the number,
the number of decimals. I played around with floats, reduced number
of decimals, adjusted CONVFMT, etc. etc., like in

awk '
BEGIN{a=20219.01;print a;print a+0;a=0+a;print a;printf "%s\n", a}
'
awk -v CONVFMT="%.8g" '
BEGIN{a=20219.01;print a;print a+0;a=0+a;print a;printf "%s\n", a}
'

The behavior and observed output do not inspire confidence.

> Some CSV had YYYYMM.DD.HHMM sigh ...

That's an issue of your data. It can be fixed beforehand or when doing
the awk processing.

Janis

J Naman wrote:

CSV: I easily deal with CSV double dots. Just threw one in to see what printf() did with it. Forget CSV, try
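
As an illustration of fixing the data "when doing the awk processing", here is a minimal sketch. The helper name normalize_stamp is made up for this example, and the target layout YYYYMMDD.HHMM is an assumption taken from the second test string in the original post. The block first shows the silent prefix coercion from the first test set, then the normalization.

```shell
#!/bin/sh
# For contrast: awk silently converts the longest numeric prefix of a string,
# here "202102.01", and ignores the rest (no error is raised).
awk 'BEGIN { a = "202102.01.1234"; printf "%f\n", a }'    # 202102.010000

# normalize_stamp: hypothetical helper, not from the thread.
# Reads one value per line. A value shaped like YYYYMM.DD.HHMM is rewritten
# to YYYYMMDD.HHMM (assumed target layout); anything else passes through.
normalize_stamp() {
    awk '{
        s = $0
        # Exactly 6 digits, a dot, 2 digits, a dot, 4 digits.
        # Written without {n} intervals so that older awks accept it.
        if (s ~ /^[0-9][0-9][0-9][0-9][0-9][0-9]\.[0-9][0-9]\.[0-9][0-9][0-9][0-9]$/)
            sub(/\./, "", s)      # remove only the first dot
        print s
    }'
}

echo "202102.01.1234" | normalize_stamp    # 20210201.1234
```

After this step the whole value is a valid number, so 0+s no longer drops the trailing part. Whether to normalize or to reject such input is a design choice that depends on the data.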
Janis Papanagnou <janis_pa...@hotmail.com> writes:

> But it seems not an issue of the string conversions,

I think it is to do with string conversion -- specifically when they
occur and when the default numeric output format is used instead.

> you can also see
> effects with floats. Obviously depending on the size of the number,
> the number of decimals. I played around with floats, reduced number
> of decimals, adjusted CONVFMT, etc. etc., like in
>
> awk '
> BEGIN{a=20219.01;print a;print a+0;a=0+a;print a;printf "%s\n", a}
> '

All produce 20219 which is the expected result using the default output
and conversion formats: "%.6g".

> awk -v CONVFMT="%.8g" '
> BEGIN{a=20219.01;print a;print a+0;a=0+a;print a;printf "%s\n", a}
> '

Only the last uses CONVFMT. All the others use OFMT which remains at
the default value of "%.6g".

> The behavior and observed output do not inspire confidence.

It is surprising that there are two formats -- one for printing and one for
conversion to a string -- but having only one might not be so convenient.

The manual is not perfect here. Someone using it for reference would
see that it says that numbers are converted to strings for printing, and
the section on "How awk Converts Between Strings and Numbers" refers
(naturally enough) to CONVFMT. You have to read a few sections further
in the explanation of print to see that it has its own format.
--
Ben.
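
The OFMT/CONVFMT split described above can be checked with a small experiment. This is a minimal sketch, assuming a POSIX-conforming awk such as gawk or mawk; the helper name show_formats is made up for illustration. It sends the same non-integer number once through print (OFMT) and once through a string concatenation (CONVFMT).

```shell
#!/bin/sh
# show_formats OFMT CONVFMT: hypothetical helper for illustration only.
# Line 1 of its output is formatted by print, which uses OFMT.
# Line 2 is converted to a string by concatenation, which uses CONVFMT.
show_formats() {
    awk -v OFMT="$1" -v CONVFMT="$2" 'BEGIN {
        x = 20219.01 + 0    # a numeric (non-integer) value
        print x             # number printed directly: OFMT applies
        s = x ""            # number converted to a string: CONVFMT applies
        print s             # s is already a string, printed unchanged
    }'
}

show_formats "%.6g" "%.6g"    # 20219 then 20219
show_formats "%.6g" "%.8g"    # 20219 then 20219.01
show_formats "%.8g" "%.6g"    # 20219.01 then 20219
```

Setting both variables to the same value makes print and string conversion agree again, which is the situation with the defaults.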
J Naman wrote:

On Thursday, 4 February 2021 at 17:16:52 UTC-5, Ben Bacarisse wrote:

> It is surprising that there are two formats -- one for printing and one for
> conversion to a string -- but having only one might not be so convenient.

Excellent insight: when (and where) conversions happen versus when
they are printed. I conflated them. "but having only one might not be
so convenient." Agreed, the potential consequences of doing anything
should be well thought out. Who knows what might "break". Awk has
been doing well for 44 years without any(?) complaints about what I
stumbled across.
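
To make the "when (and where)" point concrete, here is a minimal sketch of two places where a number becomes a string before anything is printed: concatenation and array subscripts. It assumes a POSIX-conforming awk; the function name where_conversions and the deliberately coarse CONVFMT of "%.2g" are choices made only for this example.

```shell
#!/bin/sh
# where_conversions: hypothetical helper for illustration only.
# CONVFMT is set to a coarse "%.2g" so that its effect is easy to see.
where_conversions() {
    awk -v CONVFMT="%.2g" 'BEGIN {
        x = 3.14159 + 0
        arr[x] = 1                  # subscripts are strings: x goes through CONVFMT
        for (k in arr) print "subscript:", k
        key = x ""                  # concatenation also goes through CONVFMT
        print "concat:", key
        n = 12345678 + 0
        print "integer:", n ""      # integral values are converted as integers
    }'
}

where_conversions
# subscript: 3.1
# concat: 3.1
# integer: 12345678
```

In all three lines print receives strings, so OFMT plays no part; the formatting was decided earlier, at the point of conversion.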
In article <87im76m...@bsb.me.uk>,
Ben Bacarisse <ben.u...@bsb.me.uk> wrote:

> But, as it happens, the problem is a new(ish) one. The original AWK
> book (1988, a mere 33 years ago) specified only one format: OFMT.
> Printing and conversion to a string were, back then, consistent.

POSIX separated the semantics of general number to string conversion
from the semantics of printing numbers. IMHO this was a good thing.
Because CONVFMT and OFMT both have the same default value, almost
all programs continued to work unchanged.
--
Aharon (Arnold) Robbins arnold AT skeeve DOT com

J Naman wrote:

Thank you very much Aharon! Two different semantics explains that there can be a difference. And, after carefully rereading your carefully written GAWK: Effective AWK Programming, I realized that you indirectly explained part of this when you wrote, "If