digitalmars.D - Internationalization support and format strings

Bruno Haible <bruno clisp.org> writes:

Hi,

The GNU gettext package contains tools for internationalization, 
enabling a programmer to make their package "speak" to the users 
in their specific language.

GNU gettext so far supports a number of programming languages, see
https://www.gnu.org/software/gettext/manual/html_node/List-of-Programming-Languages.html

I thought it would be a good idea to make GNU gettext also support 
the D programming language. This has been a registered wish list 
item since 2017: https://savannah.gnu.org/bugs/?51291 . On the D 
side, a rudimentary interface to the gettext() function in the 
GNU C library exists as well: 
https://code.dlang.org/packages/libintl

I am now trying to implement this support. I am already done with 
the xgettext support (parsing D source code and extracting 
messages). But on the programming language side, this support 
also needs format strings with positional arguments (so that 
translators can reorder arguments in their translations of format 
strings).
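
To illustrate what reordering means, here is a minimal sketch using positional arguments in std.format. The translate() function and its single hard-coded entry are hypothetical stand-ins for a real message catalog lookup; a real program would call gettext() through a binding instead.

```d
import std.format : format;
import std.stdio : writeln;

// Hypothetical stand-in for a catalog lookup (an assumption for this
// sketch); a real program would call gettext() via a binding.
string translate(string msgid)
{
    // One hard-coded "German" entry that swaps the argument order.
    if (msgid == "%1$s is replaced with %2$s")
        return "%2$s ersetzt %1$s";
    return msgid; // untranslated messages pass through unchanged
}

void main()
{
    string oldName = "foo";
    string newName = "bar";

    // Original message: arguments are used in call order.
    writeln(format("%1$s is replaced with %2$s", oldName, newName));
    // prints: foo is replaced with bar

    // Translated message: same arguments, referenced in swapped order.
    writeln(format(translate("%1$s is replaced with %2$s"), oldName, newName));
    // prints: bar ersetzt foo
}
```

The call site stays the same for every language; only the format string coming out of the catalog changes.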

D has format strings in its standard library (Phobos): 
https://dlang.org/library/std/format.html
But this format string facility has 4 major bugs:
https://github.com/dlang/phobos/issues/10699
https://github.com/dlang/phobos/issues/10711
https://github.com/dlang/phobos/issues/10712
https://github.com/dlang/phobos/issues/10713

Two questions:

1) How can it be that a programming language that is more than 
20 years old and has been integrated into GCC for 6 years has a 
format string facility that is riddled with bugs? Is D only a 
playground for compiler hackers and not used for real 
applications, and thus the standard library is "uninteresting"?

2) How should I continue? What advice would you give me? Should I 
wait until the format string bugs are fixed (and if so, in which 
time frame)? Or should I cancel the GNU gettext support for D?

Best regards. I don't want to offend anyone. If you feel 
offended, please put it down to frustration on my side.

Mar 24

"Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:

On 25/03/2025 11:39 AM, Bruno Haible wrote:
 But this format string facility has 4 major bugs:

 https://github.com/dlang/phobos/issues/10699

Needs to be discussed.

Requires a sentinel value, and right now it's typed as a ubyte.

 https://github.com/dlang/phobos/issues/10711

I think it's more that the docs are wrong here, rather than the implementation.

 https://github.com/dlang/phobos/issues/10712

I'm not sure about this one; it looks funky, but at least it's easy 
enough to work around.

I'll leave it to someone else.

 https://github.com/dlang/phobos/issues/10713

https://github.com/dlang/phobos/pull/10714

Mar 24

Bruno Haible <bruno clisp.org> writes:

Richard (Rikki) Andrew Cattermole wrote:
 https://github.com/dlang/phobos/issues/10713

 https://github.com/dlang/phobos/pull/10714

Thanks for handling this one.

 https://github.com/dlang/phobos/issues/10711

 I think it's more that the docs are wrong here, rather than the 
 implementation.

Hmm, you mean, instead of specifying a fixed order:

Parameters:
     Position Flags Width Precision Separator

the spec should allow an arbitrary order?

Parameters:
     empty
     Parameter Parameters
Parameter:
     Position
     Flags
     Width
     Precision
     Separator

If that is intended, then

1) Is it valid to specify two positions, two widths, two 
precisions, or two separators? For example
   %1$2$d

   %.3.5d
   %,3,5d
And if it is valid, does the first 
position/width/precision/separator matter, or the last one?

2) Since Flags can start with a digit 0, how do you disambiguate 
Flags after Width, Precision, or Separator?
For example
   %40d
   %.50d
   %,50d

The current spec, with the fixed order, is better at avoiding 
these ambiguities.
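
To make the point concrete, here is a minimal sketch of a parser for the fixed order. It assumes a simplified grammar ('%' Position? Flags? Width? Precision? Separator? followed by one letter) and is not the real std.format parser; the names Spec and parseSpec are made up for this illustration, and forms such as a positional width ("*2$") are left out.

```d
import std.algorithm.searching : canFind;
import std.ascii : isAlpha, isDigit;
import std.stdio : writefln;

// Pieces of one format specifier. `ok` stays false when the text does
// not follow the fixed order. (Hypothetical type for this sketch.)
struct Spec
{
    bool ok;
    string position, flags, width, precision, separator;
    char indicator;
}

// Simplified grammar (an assumption, not the full std.format grammar):
//   '%' Position? Flags? Width? Precision? Separator? Letter
Spec parseSpec(string s)
{
    Spec r;
    if (s.length < 2 || s[0] != '%')
        return r;
    size_t i = 1;

    // Index just past a single '*' or past a (possibly empty) digit run.
    size_t number(size_t from)
    {
        if (from < s.length && s[from] == '*')
            return from + 1;
        while (from < s.length && isDigit(s[from]))
            ++from;
        return from;
    }

    // Position: digits immediately followed by '$'.
    size_t j = i;
    while (j < s.length && isDigit(s[j]))
        ++j;
    if (j > i && j < s.length && s[j] == '$')
    {
        r.position = s[i .. j + 1];
        i = j + 1;
    }

    // Flags: a '0' counts as a flag only here, before any width.
    j = i;
    while (j < s.length && "-+ 0#=".canFind(s[j]))
        ++j;
    r.flags = s[i .. j];
    i = j;

    // Width: digits or '*'.
    j = number(i);
    r.width = s[i .. j];
    i = j;

    // Precision: '.' followed by digits or '*'.
    if (i < s.length && s[i] == '.')
    {
        j = number(i + 1);
        r.precision = s[i .. j];
        i = j;
    }

    // Separator: ',' followed by digits or '*'.
    if (i < s.length && s[i] == ',')
    {
        j = number(i + 1);
        r.separator = s[i .. j];
        i = j;
    }

    // Exactly one format indicator letter must remain.
    if (i + 1 == s.length && isAlpha(s[i]))
    {
        r.indicator = s[i];
        r.ok = true;
    }
    return r;
}

void main()
{
    foreach (s; ["%40d", "%.50d", "%,50d", "%1$2$d", "%.3.5d", "%,3,5d"])
    {
        auto r = parseSpec(s);
        writefln("%s: ok=%s flags=[%s] width=[%s] precision=[%s] separator=[%s]",
                 s, r.ok, r.flags, r.width, r.precision, r.separator);
    }
}
```

With the fixed order, each component is tried exactly once, so "%40d" can only mean width 40, and the doubled forms from question 1 are rejected instead of needing a first-wins or last-wins rule.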

 https://github.com/dlang/phobos/issues/10712


Richard (Rikki) Andrew Cattermole wrote:
 I'm not sure about this one; it looks funky, but at least it's 
 easy enough to work around.

Such a workaround does not help me with internationalization.

The situation with internationalization is:

1) The programmer specifies a format string as a gettext() 
argument. For instance,
gettext("%s is replaced with %*s")
or
"%s is replaced with %*s".gettext

2) The translator decides whether they need reordering, and thus 
translates
"%s is replaced with %*s"
as
"%3$*2$d ersetzt %1$s"

3) The GNU msgfmt program verifies that the translator's 
translation "matches", based on the specification of format 
strings.

There is no programmer who could add a workaround, since the 
programmer is no longer involved after step 1. And translators 
usually do not try their translations "live".

So what is really needed here is not a workaround but an 
implementation of std.format that is in sync with its 
specification.

Mar 25

Bruno Haible <bruno clisp.org> writes:

Richard (Rikki) Andrew Cattermole wrote:
 https://github.com/dlang/phobos/issues/10711

 I think it's more that the docs are wrong here, rather than the 
 implementation.

Another reason why it's better to keep a fixed order of
     Position Flags Width Precision Separator
is the runtime execution (implemented in std/format/write.d, 
function formattedWrite). This function currently processes 
width, precision, separators in that order.

Now, think of a format string such as
     "%,*.**d"
The programmer would expect that argument 1 gives the separator 
digits, argument 2 the precision, argument 3 the width, and that 
argument 4 is the value to be formatted.

If you don't change formattedWrite, it will actually use argument 
1 for the width, argument 2 for the precision, and argument 3 for 
the separator digits, which doesn't match the programmer's 
expectations.

Whereas if you change formattedWrite to use the arguments in the 
order in which they were referenced in the format string, you are 
slowing down the formatting at runtime.

Mar 25

Paul Backus <snarwin gmail.com> writes:

On Monday, 24 March 2025 at 22:39:11 UTC, Bruno Haible wrote:
 I thought it would be a good idea to make GNU gettext also 
 support the D programming language. This has been a registered 
 wish list item since 2017: https://savannah.gnu.org/bugs/?51291 . 
 On the D side, a rudimentary interface to the gettext() function 
 in the GNU C library exists as well: 
 https://code.dlang.org/packages/libintl

Have you seen this package?

https://code.dlang.org/packages/gettext

Mar 24

Bruno Haible <bruno clisp.org> writes:

Paul Backus wrote:
 On the D side, a rudimentary interface to the gettext() 
 function in the GNU C library exists as well: 
 https://code.dlang.org/packages/libintl

 Have you seen this package?

 https://code.dlang.org/packages/gettext

Thanks for the hint. Yes, I have looked at all of these:
https://code.dlang.org/packages/libintl
https://code.dlang.org/packages/gettext
https://code.dlang.org/packages/i18nd
https://code.dlang.org/packages/mofile
https://code.dlang.org/packages/djtext

They all have very small "Download Stats", indicating that their 
actual use in applications is nonexistent or irrelevant.

Of these five packages, the one that comes closest to having a 
usable API, on par with the gettext APIs for other programming 
languages, is https://code.dlang.org/packages/libintl .

Mar 25

Steven Schveighoffer <schveiguy gmail.com> writes:

On Tuesday, 25 March 2025 at 09:23:56 UTC, Bruno Haible wrote:
 Paul Backus wrote:
 Have you seen this package?

 https://code.dlang.org/packages/gettext

 Thanks for the hint. Yes, I have looked at all of these:
 https://code.dlang.org/packages/libintl
 https://code.dlang.org/packages/gettext
 https://code.dlang.org/packages/i18nd
 https://code.dlang.org/packages/mofile
 https://code.dlang.org/packages/djtext

 They all have very small "Download Stats", indicating that 
 their actual use in applications is nonexistent or irrelevant.

https://code.dlang.org/packages/gettext is, as far as I know, 
being used in a real application, and was developed specifically 
for that application, for one of D's major users in industry (see 
Bastiaan's talk in 2023: https://dconf.org/2023/#veelob)

I would focus on that one for de-facto standardization. Likely it 
has a small number of users, but would be actively fixed if 
issues are found.

And also, the formatting bugs should all be fixed as well, 
regardless of gettext support.

-Steve

Mar 25
