Discussion:
ctype.h macros may conflict with C++11 UDL
Glenn Elliott
2018-10-10 01:03:27 UTC
Hello newlib maintainers,

ctype.h #defines the following macros: _U, _L, _N, _S, _P, _C, _X, and _B.

These macros may conflict with C++11 user-defined literals (UDLs) (https://en.cppreference.com/w/cpp/language/user_literal). It’s easy to see how C++11 UDL suffixes might conflict with the macros defined in ctype.h. For instance, one may define a UDL function of “_N” to instantiate a C++ type that represents newtons of force. Indeed, this is done by this units library: https://github.com/nholthaus/units

Must these macros leak from ctype.h?
Corinna Vinschen
2018-10-10 10:59:13 UTC
Yes, they have to, otherwise the isXXX ctype macros can't be resolved
successfully.  Am I the only one thinking it was a bit inconsiderate
of the C++ standardization committee to use the underscore in a way
that potentially collides with implementation-defined symbols?

Probably the most feasible workaround is to change those macros to
double underscores throughout.
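
A minimal sketch of what such a rename could look like, not newlib's actual code. The flag names `__ctype_U`, `__ctype_L`, `__ctype_N`, the helper `sketch_flags`, and the macro `sketch_isdigit` are all hypothetical, and the flag values are assumptions made for illustration. It only shows that a classification macro keeps working when the flags it expands to live under double-underscore names, which leaves `_N` free for user code.

```cpp
#include <cassert>

// Hypothetical double-underscore flag names standing in for _U, _L, _N.
// The values are assumptions made for this sketch.
#define __ctype_U 01  // upper-case letter
#define __ctype_L 02  // lower-case letter
#define __ctype_N 04  // decimal digit

// Hypothetical stand-in for the library's character-class lookup (ASCII only).
inline int sketch_flags(int c) {
    if (c >= 'A' && c <= 'Z') return __ctype_U;
    if (c >= 'a' && c <= 'z') return __ctype_L;
    if (c >= '0' && c <= '9') return __ctype_N;
    return 0;
}

// The classification macro expands to a flag name, so that flag has to be
// visible wherever the macro is used. This is why the flags "leak".
#define sketch_isdigit(c) (sketch_flags(c) & __ctype_N)

// Counts decimal digits in a NUL-terminated string using the macro above.
inline int count_digits(const char* s) {
    assert(s != nullptr);
    int n = 0;
    for (; *s != '\0'; ++s) {
        if (sketch_isdigit(static_cast<unsigned char>(*s))) ++n;
    }
    return n;
}
```

The flags are still visible to every includer, but under names that C++ user code is not allowed to use for anything, UDL suffixes included.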


Corinna
--
Corinna Vinschen
Cygwin Maintainer
Red Hat
Richard Damon
2018-10-10 11:40:30 UTC
The usual criterion described for the Standard is that it works very
hard not to break existing USER code, so that users need to change
little or no working code to move to a new standard. There is little
such effort for implementations. Implementations, in general, will need
potentially very significant changes to implement the features of a new
Standard, so having to make minor internal changes to get around newly
defined names is expected.

This is why, in general, any new 'keyword' is defined in the
implementation name space, and the program has to take some active step
to get a 'nicer' spelling that affects the user name space (for example,
C99 added _Bool, and bool only appears after including <stdbool.h>).
--
Richard Damon
Craig Howland
2018-10-10 15:18:26 UTC
It should be OK.  From the examples on the quoted page:

double operator"" _Z(long double); // error: all names that begin with underscore
// followed by uppercase letter are reserved
double operator""_Z(long double); // OK: even though _Z is reserved ""_Z is allowed

Notice that it is saying _[A-Z].* is still reserved (as in C).  (The second line
works because of the preprocessor maximal-munch rule, given that the ""_Z
construct is defined as a possible token.)  They comment on it in the definition
for user-defined-string-literal:

"the character sequence "" followed, without a space, by the character sequence
that becomes the ud-suffix. This special syntax makes it possible to use
language keywords and reserved identifiers as ud-suffixes, and is used by the
declaration of operator ""if from the header <complex>. Note that using this
form does not change the rules that user-defined literal operators must begin
with an underscore: declarations such as operator ""if may only appear as part
of a standard library header. However, it allows the use of an underscore
followed by a capital letter (which is otherwise a reserved identifier)"

The user-defined literals are specifically called out in the definition for
preprocessing tokens
(https://en.cppreference.com/w/cpp/language/translation_phases Phase 3, point
1d).  That is, the standard took special care to not clobber the reserved
identifiers.  So the worst thing that might happen for Newlib would be that
maybe we'd need to add some spaces where they are used, but that could be easily
checked.
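
The tokenization argument above can be sketched as follows. This is an illustration under assumptions, not newlib or units-library code: the `#define _N 04` line stands in for the ctype.h macro, and `Newtons`, `add`, `kForce`, and `self_check` are hypothetical names. It relies on `""_N` and `2.5_N` each being lexed as a single token, so the macro is never expanded inside them.

```cpp
#include <cassert>

// Stand-in for the ctype.h macro under discussion; the value is an assumption.
#define _N 04

// Hypothetical user type for a force in newtons.
struct Newtons {
    long double value;
};

// No space between "" and _N: the preprocessor sees ""_N as one token,
// so the _N macro is not expanded here.
constexpr Newtons operator""_N(long double v) { return Newtons{v}; }

// 2.5_N is likewise a single token, so the macro does not interfere.
constexpr Newtons kForce = 2.5_N;
static_assert(kForce.value == 2.5L, "suffix resolved to the literal operator");

inline Newtons add(Newtons a, Newtons b) {
    return Newtons{a.value + b.value};
}

inline void self_check() {
    Newtons sum = add(1.0_N, 2.0_N);
    assert(sum.value == 3.0L);
    // Written as its own token, _N is still the macro and expands to 04.
    assert(_N == 04);
}
```

Writing the declaration with a space, as in the first line of the cppreference example, makes `_N` a separate identifier token that the macro would replace, which is the case that needs checking.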

Craig
