• Determine size demand of (Unicode-)characters on terminal from shell

    From Janis Papanagnou@21:1/5 to All on Mon Dec 27 14:07:03 2021
    I'm using ANSI escape codes ("\033[%d;%dH") to position Unicode
    characters in a terminal window. The indices to provide for %d
    are suited for (e.g.) the Latin character sets, but not for
    character sets whose characters require more than one cell for
    the displayed glyph, e.g. the Chinese characters. So with
    a Latin character set I'd use indices 1, 2, 3, ... and for the
    Asian sets I'd use 1, 3, 5, ... to position the characters on
    the screen. My question:

    Is the size that the character glyphs need for representation
    on a terminal somehow retrievable, so that I get, say, for
    Unicode character \U0041 a value of 1 and for \U30ee a value
    of 2, so that I can automate the displaying on a terminal?
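
    As an illustration of the positioning described above, a minimal
    bash sketch (the row/column values are arbitrary, and the $'\u30ee'
    style literals assume bash 4.2 or newer):

    row=5; col=1
    for ch in A B C; do                          # single-width Latin glyphs
        printf '\033[%d;%dH%s' "$row" "$col" "$ch"
        col=$(( col + 1 ))                       # advance one cell
    done

    row=6; col=1
    for ch in $'\u30ee' $'\u30a2' $'\u30ab'; do  # double-width Katakana glyphs
        printf '\033[%d;%dH%s' "$row" "$col" "$ch"
        col=$(( col + 2 ))                       # advance two cells
    done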

    Janis

  • From Janis Papanagnou@21:1/5 to Janis Papanagnou on Mon Dec 27 16:00:03 2021
    On 27.12.2021 14:07, Janis Papanagnou wrote:
    I'm using ANSI escape codes ("\033[%d;%dH") to position Unicode
    characters in a terminal window. The indices to provide for %d
    are suited for (e.g.) the Latin character sets, but not for
    character sets whose characters require more than one cell for
    the displayed glyph, e.g. the Chinese characters. So with
    a Latin character set I'd use indices 1, 2, 3, ... and for the
    Asian sets I'd use 1, 3, 5, ... to position the characters on
    the screen. My question:

    Is the size that the character glyphs need for representation
    on a terminal somehow retrievable, so that I get, say, for
    Unicode character \U0041 a value of 1 and for \U30ee a value
    of 2, so that I can automate the displaying on a terminal?

    In another newsgroup I was directed to a StackExchange post.[*]

    'wc -L' seems to be the solution for me; it's non-standard but works
    on my system at least.

    $ printf "\U30ee" | wc -L
    2
    $ printf "\U0041" | wc -L
    1
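
    To tie 'wc -L' back to the cursor positioning from the original post,
    here's a minimal sketch assuming bash and GNU coreutils (the helper
    name put_str is mine, not anything standard):

    put_str() {                              # put_str ROW COL STRING
        local row=$1 col=$2 str=$3 ch w
        while [ -n "$str" ]; do
            ch=${str:0:1}                    # one character at a time
            str=${str:1}
            printf '\033[%d;%dH%s' "$row" "$col" "$ch"
            w=$(printf '%s' "$ch" | wc -L)   # display width of that character
            col=$(( col + w ))               # advance by the cells it occupies
        done
    }

    put_str 3 1 "AB$(printf '\U30ee')C"      # mixes single- and double-width glyphs

    If 'wc -L' reports 0 for a character (see the follow-up below), the
    column would of course not advance correctly.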


    Janis

    [*] https://unix.stackexchange.com/questions/245013/get-the-display-width-of-a-string-of-characters

  • From Janis Papanagnou@21:1/5 to Janis Papanagnou on Sat Jan 15 10:43:39 2022
    On 27.12.2021 16:00, Janis Papanagnou wrote:
    [ characters from different languages (or Unicode planes) require
    different numbers of cells to display in a terminal window ]

    'wc -L' seems to be the solution for me; it's non-standard but works
    on my system at least.

    $ printf "\U30ee" | wc -L
    2
    $ printf "\U0041" | wc -L
    1

    For the examples above it works; it's possible to determine the number
    of required cells.

    But it doesn't work correctly for other subsets of characters; e.g. for
    the smiley (emoji) characters I don't get the correct number (2) on my system.

    $ printf "\U1f600" | wc -L
    0

    I noticed that with a newer OS version the result is correct.

    $ printf "\U1f600" | wc -L
    2

    I want to fix the system files where these numbers are defined.[*]

    Q: Can anyone tell me where the character cell widths are defined?

    Thanks.

    Janis

    [*] I don't want to upgrade that system, nor selectively update files
    from a repository, with all the dependencies and possibly undesired
    effects that would bring.
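
    An aside, not from the thread: instead of relying on the system's
    width tables, a VT100-compatible terminal can also be asked directly
    how far a character moved the cursor, via the DSR "cursor position
    report" escape (\033[6n). A bash sketch (the function name
    probe_width is mine):

    probe_width() {                                   # probe_width CHAR
        printf '\r%s\033[6n' "$1" > /dev/tty          # print char, query cursor position
        IFS='[;' read -r -s -d R _ _ col < /dev/tty   # reply looks like ESC[row;colR
        printf '\r\033[K' > /dev/tty                  # wipe the probe output again
        echo $(( col - 1 ))                           # columns moved = display width
    }

    probe_width "$(printf '\U1f600')"   # reports 2 on a terminal that renders
                                        # the emoji as double-width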
