5 Commits

Author SHA1 Message Date
Hubert Głuchowski 7ec11ad67a misc/codepoint_width: handle partially ill-formed UTF-8
Previously the function just bailed on invalid input. It now counts how
many replacement characters would be shown by a terminal complying with
the Unicode specification's recommendation here:
https://www.unicode.org/versions/Unicode17.0.0/core-spec/chapter-3/#G66453

Fixes https://github.com/mpv-player/mpv/pull/17773#issuecomment-4277538534
2026-04-22 00:06:59 +02:00
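
The counting described in this commit can be sketched as follows. This is a minimal illustration, not the code from the commit: it assumes the linked recommendation is the "U+FFFD substitution of maximal subparts" practice from chapter 3 of the Unicode core specification, and the function name `count_replacement_chars` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not mpv's actual code): count how many U+FFFD
 * replacement characters a decoder would emit for the byte string s,
 * replacing each maximal ill-formed subpart with one U+FFFD. */
static size_t count_replacement_chars(const unsigned char *s, size_t len)
{
    assert(s != NULL || len == 0);
    size_t count = 0;
    size_t i = 0;
    while (i < len) {
        unsigned char b = s[i++];
        if (b < 0x80)
            continue;                       /* ASCII, always well-formed */

        size_t need;                        /* continuation bytes expected */
        unsigned char lo = 0x80, hi = 0xBF; /* allowed range of next byte */
        if (b >= 0xC2 && b <= 0xDF) {
            need = 1;
        } else if (b >= 0xE0 && b <= 0xEF) {
            need = 2;
            if (b == 0xE0) lo = 0xA0;       /* rejects overlong forms */
            if (b == 0xED) hi = 0x9F;       /* rejects surrogates */
        } else if (b >= 0xF0 && b <= 0xF4) {
            need = 3;
            if (b == 0xF0) lo = 0x90;       /* rejects overlong forms */
            if (b == 0xF4) hi = 0x8F;       /* rejects values > U+10FFFF */
        } else {
            count++;    /* stray continuation byte, C0, C1, or F5..FF */
            continue;
        }

        while (need > 0) {
            if (i >= len || s[i] < lo || s[i] > hi) {
                /* The bytes consumed so far form one maximal subpart.
                 * Decoding resumes at the offending byte. */
                count++;
                break;
            }
            i++;
            need--;
            lo = 0x80;                      /* later bytes use the full */
            hi = 0xBF;                      /* continuation range */
        }
    }
    return count;
}
```

For the byte sequence `61 F1 80 80 E1 80 C2 62 80 63 80 BF 64`, this sketch counts six replacement characters: one each for `F1 80 80`, `E1 80`, and `C2`, and one for each of the three stray continuation bytes.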
Kacper Michajłow 59d1dc43b9 various: fix typos 2025-01-04 15:59:49 +02:00
Kacper Michajłow 0c4c2caabf misc/codepoint_width: assume tabstop width to be 8
It has been hardcoded to the same value in stats.lua so keep the current
behaviour. Can be made configurable if requested in the future.
2024-10-21 20:06:48 +02:00
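
One common way to apply a fixed tab stop width is to advance the cursor to the next multiple of 8 columns. The sketch below assumes that interpretation; the commit message does not say how the width is applied, and the names `TABSTOP_WIDTH` and `tab_advance` are hypothetical.

```c
#include <assert.h>

/* Assumed constant mirroring the hardcoded value named in the commit. */
enum { TABSTOP_WIDTH = 8 };

/* Hypothetical helper: number of columns a tab occupies when the cursor
 * is at the given zero-based column, assuming tabs jump to the next
 * multiple of TABSTOP_WIDTH. */
static int tab_advance(int column)
{
    assert(column >= 0);
    return TABSTOP_WIDTH - column % TABSTOP_WIDTH;
}
```

Under this assumption, a tab at column 3 occupies 5 columns, and a tab at column 8 occupies a full 8.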
Kacper Michajłow bf025cd289 msg: allow to truncate the message to terminal width 2024-10-11 15:16:33 +02:00
Kacper Michajłow 95f0046309 misc/codepoint_width: add unicode width detection support
Add a 4-stage trie to look up Unicode code point width and grapheme join
rules.

Generated by GraphemeTableGen from Microsoft Terminal (MIT Licence):
https://github.com/microsoft/terminal/blob/a7e47b711a2adc7b9e80eddea8168089f7d3b11e/src/tools/GraphemeTableGen/Program.cs

With minor adjustments for use in a C codebase:
- Replaced constexpr with static
- Replaced auto with explicit types

Generated from Unicode 16.0.0:
ucd.nounihan.grouped.xml: sha256(b11c2d23673bae660fff8ddcd3c1de4d54bdf6c60188a07696b010282f515fcf)
2024-10-11 15:06:14 +02:00
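
To illustrate the idea of a 4-stage trie lookup, here is a toy sketch. The tables, bit splits, and names are invented for illustration and do not match the generated tables: every code point has width 1 except U+4E00..U+4E0F, which is given width 2. Each stage stores the offset of a 16-entry block in the next stage, so identical blocks are shared.

```c
#include <assert.h>
#include <stdint.h>

/* Toy tables (assumptions, not the generated data). */

/* Stage 1: indexed by cp >> 12. Only plane chunk 0x4xxx has its own block. */
static const uint8_t stage1[0x110] = { [4] = 16 };

/* Stage 2: block 0 is the shared default; block 1 (offset 16) covers 0x4xxx. */
static const uint8_t stage2[32] = { [16 + 0xE] = 16 };

/* Stage 3: block 1 (offset 16) covers 0x4Exx; entry 0 is 0x4E0x. */
static const uint8_t stage3[32] = { [16 + 0x0] = 16 };

/* Stage 4: block 0 holds width 1, block 1 holds width 2. */
static const uint8_t stage4[32] = {
    1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
    2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
};

/* Hypothetical lookup: four dependent table reads, no branches or search. */
static int toy_width(uint32_t cp)
{
    assert(cp <= 0x10FFFF);
    unsigned s1 = stage1[cp >> 12];
    unsigned s2 = stage2[s1 + ((cp >> 8) & 0xF)];
    unsigned s3 = stage3[s2 + ((cp >> 4) & 0xF)];
    return stage4[s3 + (cp & 0xF)];
}
```

The appeal of this layout is that the large, mostly uniform code point space collapses into a few shared blocks, while a lookup still costs a fixed number of array reads.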