Discussion:
Regression with printf 8-bit format macros
Ambroz Bizjak
2016-01-01 11:08:30 UTC
Permalink
Hello,

It appears to me that commit "Fix for pri and scn formats" [1] broke
the 8-bit printf format macros (PRI*8), for scenarios when newlib is
configured either with --enable-newlib-nano-formatted-io or without
--enable-newlib-io-c99-format is not used. Because the new code
defines these macros to the "hh" formats, which are not supported
then.

I think the code should be changed so that the PRI*8 macros are
defined correctly even when printf does not support hh. The previous
definitions of these format macros didn't have length modifiers (they
were just i, d, x, X, etc.) and seemed to have worked, presumably due
to default-promotion of int8_t/uint8_t to int. The way I understand
the standard, it is not correct to e.g. use PRId8 but pass to printf
something which is not an int8, so it shouldn't be a problem that e.g.
printf("%"PRId8, 1000) actually prints 1000; what matters is that it
prints all the possible int8_t values correctly when passed an int8_t.

[1] https://cygwin.com/git/gitweb.cgi?p=newlib-cygwin.git;a=commit;h=892cfcb7c2fbdc3b2b09c95a7a6c2065e5e0cfae

Best regards,
Ambroz
Freddie Chopin
2016-01-01 22:44:52 UTC
Permalink
Post by Ambroz Bizjak
Hello,
It appears to me that commit "Fix for pri and scn formats" [1] broke
the 8-bit printf format macros (PRI*8), for scenarios when newlib is
configured either with --enable-newlib-nano-formatted-io or without
--enable-newlib-io-c99-format is not used. Because the new code
defines these macros to the "hh" formats, which are not supported
then.
Is it actually a problem that PRI* macros don't work without support for C99?
After all, these are used for C99 types, so not really needed without C99...

Regards,
FCh
Ambroz Bizjak
2016-01-01 23:03:57 UTC
Permalink
Hi Freddie,

I hope we agree that C99 support should not be an all-or-nothing
decision, you should be able to have some parts (like the stdint types
which cost nothing), but not other parts like the full C99 support in
printf. The worst part really is that the PRI*8 macros are defined in
the mentioned configurations but they do not work. I see that on the
other hand the SCN*8 macros are not defined when they would not work
(see the top comment in inttypes.h).

But then we can have the PRI*8 macros for free anyway, so why should
they be available only when printf has "hh" support?

Best regards,
Ambroz
Post by Freddie Chopin
Post by Ambroz Bizjak
Hello,
It appears to me that commit "Fix for pri and scn formats" [1] broke
the 8-bit printf format macros (PRI*8), for scenarios when newlib is
configured either with --enable-newlib-nano-formatted-io or without
--enable-newlib-io-c99-format is not used. Because the new code
defines these macros to the "hh" formats, which are not supported
then.
Is it actually a problem that PRI* macros don't work without support for C99?
After all, these are used for C99 types, so not really needed without C99...
Regards,
FCh
Andre Vieira
2016-01-04 11:21:59 UTC
Permalink
Hi Ambroz,

Just to clarify as to why this change was made. The PRI*8 and SCN*8
macro's were printing '*' before, the pattern for 32-bit types, when the
actual size of __INT8_TYPE__ was '8 bits', i.e. _WANT_IO_C99_FORMATS was
defined. Which we agreed at the time was the wrong behavior.

Now you are not the first to stumble onto this issue, though as FCh
points out these are C99 types and the configurations you mentioned do
not support these.
Post by Ambroz Bizjak
Hi Freddie,
I hope we agree that C99 support should not be an all-or-nothing
decision, you should be able to have some parts (like the stdint types
which cost nothing), but not other parts like the full C99 support in
printf. The worst part really is that the PRI*8 macros are defined in
the mentioned configurations but they do not work. I see that on the
other hand the SCN*8 macros are not defined when they would not work
(see the top comment in inttypes.h).
I do agree that it is weird that for the SCN*8 macros to be guarded with
_WANT_IO_C99_FORMATS, whilst the PRI*8 are not. Though guarding PRI*8
would not solve your problem, as you seem to want these to be defined as
'*', i.e. defaulting to their earlier behavior of returning the 32-bit
format pattern. I would oppose doing this in all cases, since if
_WANT_IO_C99_FORMATS is defined, then this would yield the erroneous
behavior this patch was meant to fix.

There might be a case to be made for doing so if _WANT_IO_C99_FORMATS is
not defined. On the other hand, doing this for SCN*8 would be a bad idea
in my opinion and this would keep the discrepancy between PRI*8 and SCN*8.

I do not think we should encourage the use of PRI*8 and SCN*8 types if
these are not supported by the library being used. As this seems to be
the case, since _WANT_IO_C99_FORMATS is not defined, which I think would
be the appropriate macro to indicate support for such.
Post by Ambroz Bizjak
But then we can have the PRI*8 macros for free anyway, so why should
they be available only when printf has "hh" support?
Best regards,
Ambroz
Post by Freddie Chopin
Post by Ambroz Bizjak
Hello,
It appears to me that commit "Fix for pri and scn formats" [1] broke
the 8-bit printf format macros (PRI*8), for scenarios when newlib is
configured either with --enable-newlib-nano-formatted-io or without
--enable-newlib-io-c99-format is not used. Because the new code
defines these macros to the "hh" formats, which are not supported
then.
Is it actually a problem that PRI* macros don't work without support for C99?
After all, these are used for C99 types, so not really needed without C99...
Regards,
FCh
Further discussion is most welcome!

Cheers,
Andre
Ambroz Bizjak
2016-01-04 16:05:44 UTC
Permalink
Hi Andre,

I understand that defining PRI*8 to "*" rather than "hh*" is not
elegant, but it is correct.

This is the relevant text from the C99 draft (7.8.1):

Each of the following object-like macros expands to a character string
literal containing a conversion specifier, possibly modified by a
length modifier, suitable for use within the format argument of a
formatted input/output function when converting the corresponding
integer type. These macro names have the general form of PRI
(character string literals for the fprintf and fwprintf family) or SCN
(character string literals for the fscanf and fwscanf family),
followed by the conversion specifier, followed by a name corresponding
to a similar type name in 7.18.1. In these names, N represents the
width of the type as described in 7.18.1. For example, PRIdFAST32 can
be used in a format string to print the value of an integer of type
int_fast32_t.

So all that is required is that the definition is suitable for use in
combination with that type. There is nothing that specifically implies
that PRI*8 needs to be "hh*".

For example, if PRId8 is defined to "d", the following code will
behave as expected. This is because x is implicitly promoted to int,
and int is the valid type to use with %d.
uint8_t x = 255; printf("%"PRId8, x);

I think that one of these three things should be done:
- Revert the changes so that PRI*8 is again defined to "*".
- Provide the new definitions ("hh*") when C99 printf is supported,
but the old ones ("*") if it's not.
- Do not define the PRI*8 macros when C99 printf is not supported.
Post by Andre Vieira
I do agree that it is weird that for the SCN*8 macros to be guarded with _WANT_IO_C99_FORMATS, whilst the PRI*8 are not.
This is likely because the old PRI*8 definitions were known to be
correct regardless of C99 printf support. And the reason that the
SCN*8 macros must be guarded is because the scanf implementation needs
to specifically aware of the exact type that the passed pointers point
to, so that the outputs can be correctly stored. There is no implicit
promotion to "save us" in the case of scanf.

You might be surprised how Linux defines these macros - at least on
the machine I tested on, PRI*8 was "*".

Best

On Mon, Jan 4, 2016 at 12:21 PM, Andre Vieira
Post by Andre Vieira
Hi Ambroz,
Just to clarify as to why this change was made. The PRI*8 and SCN*8 macro's
were printing '*' before, the pattern for 32-bit types, when the actual size
of __INT8_TYPE__ was '8 bits', i.e. _WANT_IO_C99_FORMATS was defined. Which
we agreed at the time was the wrong behavior.
Now you are not the first to stumble onto this issue, though as FCh points
out these are C99 types and the configurations you mentioned do not support
these.
Hi Freddie,
I hope we agree that C99 support should not be an all-or-nothing
decision, you should be able to have some parts (like the stdint types
which cost nothing), but not other parts like the full C99 support in
printf. The worst part really is that the PRI*8 macros are defined in
the mentioned configurations but they do not work. I see that on the
other hand the SCN*8 macros are not defined when they would not work
(see the top comment in inttypes.h).
I do agree that it is weird that for the SCN*8 macros to be guarded with
_WANT_IO_C99_FORMATS, whilst the PRI*8 are not. Though guarding PRI*8 would
not solve your problem, as you seem to want these to be defined as '*', i.e.
defaulting to their earlier behavior of returning the 32-bit format pattern.
I would oppose doing this in all cases, since if _WANT_IO_C99_FORMATS is
defined, then this would yield the erroneous behavior this patch was meant
to fix.
There might be a case to be made for doing so if _WANT_IO_C99_FORMATS is not
defined. On the other hand, doing this for SCN*8 would be a bad idea in my
opinion and this would keep the discrepancy between PRI*8 and SCN*8.
I do not think we should encourage the use of PRI*8 and SCN*8 types if these
are not supported by the library being used. As this seems to be the case,
since _WANT_IO_C99_FORMATS is not defined, which I think would be the
appropriate macro to indicate support for such.
But then we can have the PRI*8 macros for free anyway, so why should
they be available only when printf has "hh" support?
Best regards,
Ambroz
Post by Freddie Chopin
Post by Ambroz Bizjak
Hello,
It appears to me that commit "Fix for pri and scn formats" [1] broke
the 8-bit printf format macros (PRI*8), for scenarios when newlib is
configured either with --enable-newlib-nano-formatted-io or without
--enable-newlib-io-c99-format is not used. Because the new code
defines these macros to the "hh" formats, which are not supported
then.
Is it actually a problem that PRI* macros don't work without support for C99?
After all, these are used for C99 types, so not really needed without C99...
Regards,
FCh
Further discussion is most welcome!
Cheers,
Andre
Craig Howland
2016-01-04 18:07:16 UTC
Permalink
Post by Ambroz Bizjak
Hi Andre,
I understand that defining PRI*8 to "*" rather than "hh*" is not
elegant, but it is correct.
Each of the following object-like macros expands to a character string
literal containing a conversion specifier, possibly modified by a
length modifier, suitable for use within the format argument of a
formatted input/output function when converting the corresponding
integer type. These macro names have the general form of PRI
(character string literals for the fprintf and fwprintf family) or SCN
(character string literals for the fscanf and fwscanf family),
followed by the conversion specifier, followed by a name corresponding
to a similar type name in 7.18.1. In these names, N represents the
width of the type as described in 7.18.1. For example, PRIdFAST32 can
be used in a format string to print the value of an integer of type
int_fast32_t.
So all that is required is that the definition is suitable for use in
combination with that type. There is nothing that specifically implies
that PRI*8 needs to be "hh*".
Well, actually there is, but it is "hidden" in the fprintf hh definition:
"hh Specifies that a following d, i, o, u, x, or X conversion specifier applies to a
signed char or unsigned char argument (the argument will have
been promoted according to the integer promotions, but its value shall be
converted to signed char or unsigned char before printing); or that
a following n conversion specifier applies to a pointer to a signed char
argument."
There are actually 2 reasons why it must contain "hh".
Firstly then, according to the standard (the part in parentheses),
printf("%"PRId8, 1000) really should print 232--1000 is not OK (as suggested in
the first email in this chain) because the passed value gets converted before
printing.
Secondly, it also makes a difference for %n--it is critical that the pointer is
to char and not int. (While you might think that %n is really a scan--rather
than a print--specification, the standard is explicit (as quoted above by
Ambroz) that the PRI* macros go with the fprintf family and SCN* with the scanf
family. While this is probably an oversight/mistake in the standard (i.e. SCN*
really should go with printf's %n), that is what it does say. I also checked
C11, and they did not change the language concerning this.).
Post by Ambroz Bizjak
For example, if PRId8 is defined to "d", the following code will
behave as expected. This is because x is implicitly promoted to int,
and int is the valid type to use with %d.
uint8_t x = 255; printf("%"PRId8, x);
- Revert the changes so that PRI*8 is again defined to "*".
- Provide the new definitions ("hh*") when C99 printf is supported,
but the old ones ("*") if it's not.
- Do not define the PRI*8 macros when C99 printf is not supported.
The third option is the correct approach according to the standard. From 7.8.1:
"For each type that the implementation provides in <stdint.h>, the corresponding
fprintf macros shall be defined and the corresponding fscanf macros shall be
defined unless the implementation does not have a suitable fscanf length
modifier for the type." If we're saying that we don't have a suitable modifier
(i.e. we're choosing to not support hh when _WANT_IO_C99_FORMATS is not
defined), then PRI*8 should be left undefined.
Post by Ambroz Bizjak
I do agree that it is weird that for the SCN*8 macros to be guarded with _WANT_IO_C99_FORMATS, whilst the PRI*8 are not.
This is likely because the old PRI*8 definitions were known to be
correct regardless of C99 printf support. And the reason that the
SCN*8 macros must be guarded is because the scanf implementation needs
to specifically aware of the exact type that the passed pointers point
to, so that the outputs can be correctly stored. There is no implicit
promotion to "save us" in the case of scanf.
You might be surprised how Linux defines these macros - at least on
the machine I tested on, PRI*8 was "*".
(This is bad for the reasons noted above.)
Craig
Ambroz Bizjak
2016-01-04 19:16:05 UTC
Permalink
Hi Chris, others,

The quoted part of the standard does explain how "hh" shall behave,
but I do not see any implication that PRI*N should use hh or generally
that the described conversion (after promotion) is required for PRI*N.
(note: I suspect you meant to use PRIu8 in the example not PRId8
because the conversion of 1000 to int8_t is an integer overflow.)

About the %n issue, it has no relation to the PRI* macros. The PRI*
macros cannot possibly used together with "n", because each PRI* macro
must end with a conversion specifier, which is not "n", and there can
only be one conversion specifier in a conversion. For example,
"%n"PRId8 and "%"PRId8"n" could become "%nd" and "%dn" respectively
(the first one writes out the position then prints a literal d, the
second prints a integer and a literal n). Notice that the standard
does not define any "PRIn..." macro.

Best regards,
Ambroz

On Mon, Jan 4, 2016 at 7:07 PM, Craig Howland
Post by Craig Howland
Post by Ambroz Bizjak
Hi Andre,
I understand that defining PRI*8 to "*" rather than "hh*" is not
elegant, but it is correct.
Each of the following object-like macros expands to a character string
literal containing a conversion specifier, possibly modified by a
length modifier, suitable for use within the format argument of a
formatted input/output function when converting the corresponding
integer type. These macro names have the general form of PRI
(character string literals for the fprintf and fwprintf family) or SCN
(character string literals for the fscanf and fwscanf family),
followed by the conversion specifier, followed by a name corresponding
to a similar type name in 7.18.1. In these names, N represents the
width of the type as described in 7.18.1. For example, PRIdFAST32 can
be used in a format string to print the value of an integer of type
int_fast32_t.
So all that is required is that the definition is suitable for use in
combination with that type. There is nothing that specifically implies
that PRI*8 needs to be "hh*".
"hh Specifies that a following d, i, o, u, x, or X conversion specifier applies to a
signed char or unsigned char argument (the argument will have
been promoted according to the integer promotions, but its value shall be
converted to signed char or unsigned char before printing); or that
a following n conversion specifier applies to a pointer to a signed char
argument."
There are actually 2 reasons why it must contain "hh".
Firstly then, according to the standard (the part in parentheses),
printf("%"PRId8, 1000) really should print 232--1000 is not OK (as suggested
in the first email in this chain) because the passed value gets converted
before printing.
Secondly, it also makes a difference for %n--it is critical that the pointer
is to char and not int. (While you might think that %n is really a
scan--rather than a print--specification, the standard is explicit (as
quoted above by Ambroz) that the PRI* macros go with the fprintf family and
SCN* with the scanf family. While this is probably an oversight/mistake in
the standard (i.e. SCN* really should go with printf's %n), that is what it
does say. I also checked C11, and they did not change the language
concerning this.).
Post by Ambroz Bizjak
For example, if PRId8 is defined to "d", the following code will
behave as expected. This is because x is implicitly promoted to int,
and int is the valid type to use with %d.
uint8_t x = 255; printf("%"PRId8, x);
- Revert the changes so that PRI*8 is again defined to "*".
- Provide the new definitions ("hh*") when C99 printf is supported,
but the old ones ("*") if it's not.
- Do not define the PRI*8 macros when C99 printf is not supported.
The third option is the correct approach according to the standard. From
7.8.1: "For each type that the implementation provides in <stdint.h>, the
corresponding fprintf macros shall be defined and the corresponding fscanf
macros shall be defined unless the implementation does not have a suitable
fscanf length modifier for the type." If we're saying that we don't have a
suitable modifier (i.e. we're choosing to not support hh when
_WANT_IO_C99_FORMATS is not defined), then PRI*8 should be left undefined.
Post by Ambroz Bizjak
Post by Andre Vieira
I do agree that it is weird that for the SCN*8 macros to be guarded with
_WANT_IO_C99_FORMATS, whilst the PRI*8 are not.
This is likely because the old PRI*8 definitions were known to be
correct regardless of C99 printf support. And the reason that the
SCN*8 macros must be guarded is because the scanf implementation needs
to specifically aware of the exact type that the passed pointers point
to, so that the outputs can be correctly stored. There is no implicit
promotion to "save us" in the case of scanf.
You might be surprised how Linux defines these macros - at least on
the machine I tested on, PRI*8 was "*".
(This is bad for the reasons noted above.)
Craig
Craig Howland
2016-01-04 20:33:53 UTC
Permalink
Hi Craig, others,
The quoted part of the standard does explain how "hh" shall behave,
but I do not see any implication that PRI*N should use hh or generally
that the described conversion (after promotion) is required for PRI*N.
(note: I suspect you meant to use PRIu8 in the example not PRId8
because the conversion of 1000 to int8_t is an integer overflow.)
Yes, it is implied, by the "suitable for use within the format argument of a
formatted input/output function when converting the corresponding integer type"
clause. Not only that, the very existence of the macros implies it. Why would
you bother with ""%"PRId*" rather than just using "%d"? You use the macro to
supply the correct modifier that you would use if you knew the type (and that
the type were invariant). If the macros never supplied modifiers, then they'd
be useless.
I copied the example directly from the opening email, which used d. I should
have said that -24 should be printed--the signed interpretation of 0xE8--rather
than 232. (I used an unsigned method of giving myself the decimal initially.
1000=0x3E8, which will be passed as that (assuming ints are >=16 bits), then the
cast to signed char truncates to 0xE8, is either 232 or -24.)
About the %n issue, it has no relation to the PRI* macros. The PRI*
macros cannot possibly used together with "n", because each PRI* macro
must end with a conversion specifier, which is not "n", and there can
only be one conversion specifier in a conversion. For example,
"%n"PRId8 and "%"PRId8"n" could become "%nd" and "%dn" respectively
(the first one writes out the position then prints a literal d, the
second prints a integer and a literal n). Notice that the standard
does not define any "PRIn..." macro.
My bad on that, I failed to check the list properly for PRIn... . (I guess I
shouldn't rush so much next time. Sorry for the noise.)
Best regards,
Ambroz
Loading...