Support %<PRIu16> and friends formatting macros from <inttypes.h> #860

Open
Artoria2e5 opened this issue Sep 9, 2024 · 6 comments

Artoria2e5 commented Sep 9, 2024

Some gettext files I get include insertions such as %<PRIu16>, which is generated by xgettext from code resembling:

// https://gitlab.com/cryptsetup/cryptsetup/-/blob/main/lib/bitlk/bitlk.c#L471
		log_err(cd, _("Unsupported sector size %" PRIu16 "."), params->sector_size);

Poedit has a tendency to highlight those as HTML tags, and the auto translator similarly has a tendency to auto-close them unnecessarily:
[image: screenshot]

Additional evidence for this gettext behavior: https://www.postgresql.org/message-id/20180425141324.111ec08e%40wp.localdomain
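
For readers unfamiliar with the pattern, here is a minimal sketch of what the C side looks like without gettext. `PRIu16` is a string-literal macro from `<inttypes.h>`, and the compiler concatenates it with the adjacent literals. Its expansion differs between platforms, which is why xgettext records the portable spelling `%<PRIu16>` in the msgid instead. The helper name `format_sector_size` is made up for illustration and is not taken from cryptsetup.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, for illustration only: formats the message from the
 * snippet above without going through gettext. */
static int format_sector_size(char *buf, size_t size, uint16_t sector_size)
{
    /* "%" PRIu16 "." is three adjacent string literals; the compiler joins
     * them into one format string (for example "%hu." on many platforms). */
    return snprintf(buf, size, "Unsupported sector size %" PRIu16 ".",
                    sector_size);
}
```

Calling `format_sector_size(buf, sizeof buf, 512)` fills `buf` with the finished English message; only the format string, not the result, is platform-dependent.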

Author

Artoria2e5 commented Sep 9, 2024

Another failure-prone pattern is in manpages, where B<xz> stands for bolding the text xz: https://translationproject.org/POT-files/xz-man-5.6.0-pre2.pot

This is probably generated by po4a.

Owner

vslavik commented Sep 9, 2024

Some gettext files I get include insertions such as %<PRIu16>, which is generated by xgettext from code resembling:

Please be more specific than vague hand-waving about "some" files, i.e. provide reproduction steps that can be followed. I have never seen this syntax. If it's something homegrown, it's not worth supporting. If it's something more commonly used, I welcome PRs improving placeholder detection.

has a tendency to highlight those as HTML tags

So what? It does a decent enough job of highlighting these custom placeholders as it is...

auto translator similarly has a tendency

Poedit doesn't have a magical "auto translator". Pre-translation is approximate and it is expected that some things will need human correction. If something actually is an HTML tag in 99.99% of cases, it is reasonable if it requires human correction in 0.01% of cases.

How such lone tags are handled is a matter of the MT engine, not Poedit, and while Microsoft Translator doubles them, neither Google nor DeepL does (though both, reasonably, treat them as a placeholder, not a prefix for the next word; xz's syntax is simply poorly designed).

Author

Artoria2e5 commented Sep 9, 2024

Please be more specific than vague hand-waving about "some" files, i.e. provide reproduction steps that can be followed. I have never seen this syntax. If it's something homegrown, it's not worth supporting. If it's something more commonly used, I welcome PRs improving placeholder detection.

The %<PRIu16> thing is shown in every file that does the code above. I've provided evidence, from an unrelated piece of software, that upstream xgettext does it.

Here's some gettext internal code that handles what it calls a "system-dependent string", i.e. the %<> stuff: https://github.com/autotools-mirror/gettext/blob/67c601e9e42f6759c57ea0ae85336dc1e61b55c8/gettext-tools/src/read-mo.c#L140

Here's how the runtime parses it: https://github.com/autotools-mirror/gettext/blob/67c601e9e42f6759c57ea0ae85336dc1e61b55c8/gettext-runtime/intl/loadmsgcat.c#L409

Here's where xgettext makes it: ttext-tools/src/x-c.c#L1960
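
To make the runtime side concrete, here is a rough sketch of the idea: each `<PRI…>` segment in a stored string is swapped for the current platform's expansion of that macro. This is not gettext's code. As far as I understand, the real `loadmsgcat.c` works on segment tables inside the MO file rather than scanning text. The names `sysdep_macros`, `lookup_sysdep` and `expand_sysdep` are hypothetical, and the table covers only three macros.

```c
#include <inttypes.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, deliberately tiny table: macro name -> platform expansion. */
struct sysdep_macro { const char *name; const char *expansion; };

static const struct sysdep_macro sysdep_macros[] = {
    { "PRIu16", PRIu16 },
    { "PRId32", PRId32 },
    { "PRIu64", PRIu64 },
};

/* Returns the expansion for a macro name of the given length, or NULL. */
static const char *lookup_sysdep(const char *name, size_t len)
{
    size_t i;
    for (i = 0; i < sizeof sysdep_macros / sizeof sysdep_macros[0]; i++) {
        if (strlen(sysdep_macros[i].name) == len
            && memcmp(sysdep_macros[i].name, name, len) == 0)
            return sysdep_macros[i].expansion;
    }
    return NULL;
}

/* Copies `in` to `out`, replacing known <NAME> segments with their
 * expansion. Unknown <...> runs (e.g. real HTML tags) are copied verbatim.
 * Returns 0 on success, -1 if `out` is too small. */
static int expand_sysdep(const char *in, char *out, size_t outsize)
{
    size_t used = 0;

    if (outsize == 0)
        return -1;

    while (*in != '\0') {
        const char *piece = in;   /* default: copy one character as-is */
        size_t piece_len = 1;
        size_t consumed = 1;

        if (*in == '<') {
            const char *close = strchr(in, '>');
            if (close != NULL) {
                const char *expansion =
                    lookup_sysdep(in + 1, (size_t)(close - in) - 1);
                if (expansion != NULL) {
                    piece = expansion;
                    piece_len = strlen(expansion);
                    consumed = (size_t)(close - in) + 1;
                }
            }
        }

        if (used + piece_len >= outsize)
            return -1;            /* keep room for the terminator */
        memcpy(out + used, piece, piece_len);
        used += piece_len;
        in += consumed;
    }

    out[used] = '\0';
    return 0;
}
```

With this sketch, expanding `"Unsupported sector size %<PRIu16>."` yields the same bytes the compiler produces for `"Unsupported sector size %" PRIu16 "."`, so the result can be passed to a `printf`-family function.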


bhaible commented Sep 9, 2024

And here is its documentation: https://www.gnu.org/software/gettext/manual/html_node/Preparing-Strings.html (search for <inttypes.h>).

Owner

vslavik commented Sep 9, 2024

The %<PRIu16> thing is shown in every file that does the code above. I've provided evidence, from an unrelated piece of software, that upstream xgettext does it.

Sorry, but w/o context or explanation, that really wasn't very comprehensible.

And here is its documentation: https://www.gnu.org/software/gettext/manual/html_node/Preparing-Strings.html (search for <inttypes.h>).

@bhaible I see, thanks a lot!

So this presumably only applies to c-format strings?

vslavik changed the title from "%<PRIu16> and friends treated as HTML" to "Support %<PRIu16> and friends placeholder macros from <inttypes.h>" on Sep 9, 2024
vslavik changed the title from "Support %<PRIu16> and friends placeholder macros from <inttypes.h>" to "Support %<PRIu16> and friends formatting macros from <inttypes.h>" on Sep 9, 2024

bhaible commented Sep 9, 2024

So this presumably only applies to c-format strings?

In theory, it can occur in any string extracted by xgettext's C parser. But usually such strings should be marked as c-format, yes. If they aren't, the POT file is broken.
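
As a starting point for anyone attempting a PR, here is a hedged sketch of how a checker might tell a `<PRI…>` placeholder apart from an HTML tag in a c-format string. It is not Poedit's code, and the accepted grammar is my assumption: `PRI`, one conversion letter from `diouxX`, then `MAX`, `PTR`, or an optional `LEAST`/`FAST` followed by 8, 16, 32 or 64. The function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Returns strlen(prefix) if s starts with prefix, otherwise 0. */
static size_t skip_prefix(const char *s, const char *prefix)
{
    size_t n = strlen(prefix);
    return strncmp(s, prefix, n) == 0 ? n : 0;
}

/* If s starts with something like "<PRIu16>", "<PRIdLEAST32>" or
 * "<PRIxMAX>", returns the length of that token including the angle
 * brackets; otherwise returns 0. Assumed grammar, see the text above. */
static size_t inttypes_placeholder_length(const char *s)
{
    static const char *const widths[] = { "8", "16", "32", "64" };
    const char *p = s;
    size_t n, i;

    if ((n = skip_prefix(p, "<PRI")) == 0)
        return 0;
    p += n;

    /* Exactly one conversion letter must follow "PRI". */
    if (*p == '\0' || strchr("diouxX", *p) == NULL)
        return 0;
    p++;

    /* MAX and PTR take no width. */
    if ((n = skip_prefix(p, "MAX>")) != 0 || (n = skip_prefix(p, "PTR>")) != 0)
        return (size_t)(p - s) + n;

    /* Optional LEAST/FAST, then a mandatory width and the closing '>'. */
    if ((n = skip_prefix(p, "LEAST")) != 0 || (n = skip_prefix(p, "FAST")) != 0)
        p += n;
    for (i = 0; i < sizeof widths / sizeof widths[0]; i++) {
        n = skip_prefix(p, widths[i]);
        if (n != 0 && p[n] == '>')
            return (size_t)(p - s) + n + 1;
    }
    return 0;
}
```

A highlighter could call this at every `<` it meets in a string flagged c-format and treat a non-zero result as part of a format placeholder rather than as markup; strings without that flag would keep the existing HTML handling.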
