On Tue, Jul 3, 2018 at 8:07 PM, Brian Inglis wrote:
Post by narwhal x
I have a question regarding newlib and the -fstrict-aliasing implied
by turning on -O2.
The strict aliasing implied by the ISO standard and enabled in gcc
with -O2 (this might be specific to gcc, but could be the case with any
compiler with aliasing optimizations) makes it so you can only cast a
pointer to a compatible type. A special case is malloc, which
should return an "undeclared type" *.
However, I did not find the -fno-strict-aliasing flag in any
configuration or makefile (if I just overlooked it, and the flag is
mandatory, that would answer my question). One example, in mallocr.c:
"top = (mchunkptr)brk;"
Here top is of type "mchunkptr" and brk is a "char *". The standard
says that you cannot just alias an incompatible type and dereference
it (unless it's a malloc'ed variable, as it would change its type
when written to, but how do you inform the compiler?)
As an example, see 4.2.1 (p. 63) in
https://www.cl.cam.ac.uk/~pes20/cerberus/notes30.pdf
So is this allowed? Or am I missing something.
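For reference, a minimal sketch of the malloc special case mentioned above, as I read the effective-type rules in C11 6.5p6-7: storage returned by malloc has no declared type, so the first store through a float lvalue gives it effective type float, and a later read through float * matches. The helper name store_and_load is hypothetical and not taken from newlib.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: stores a float in freshly allocated storage and
   reads it back. Allocated storage has no declared type, so the store
   through float * is what gives it an effective type. */
float store_and_load(float value)
{
    void *raw = malloc(sizeof(float));   /* no declared type yet */
    assert(raw != NULL);

    float *fp = raw;                     /* void * converts implicitly */
    *fp = value;                         /* effective type becomes float */
    float result = *fp;                  /* read with the matching type */

    free(raw);
    return result;
}
```

Calling store_and_load(1.5f) should simply return 1.5f. The open question in this thread is whether memory handed out by sbrk can be treated the same way.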
* After asking in the gcc IRC, they mentioned that the way they handle
the special case for malloc is to make sure libc is linked in as a
separate library and no LTO is performed.
My main reason for asking is just wanting to know how a malloc
implementation should deal with these restrictions stated by the ISO C
standard, and improve my understanding of the (sometimes confusing)
aliasing rules.
Pointer types char * and void * can be converted to other data pointer types,
and character types can alias other types. But you should not alias objects via
casts or conversions of pointers to objects stored as incompatible types:
optimization could eliminate the stores, so the underlying storage of
the object of incompatible type may not be updated, and the compiler would not
know that, because it does not track possible aliasing between incompatible
types. Roughly IMHO HTH YMMV ;^>
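A small sketch of the permitted direction described above: any object may be inspected through unsigned char *. The function name count_nonzero_bytes is hypothetical. The reverse direction, reading a declared char array through an int * or float *, is not covered by this exception.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts the nonzero bytes of any object by viewing
   it through unsigned char *, which the character-type exception allows. */
size_t count_nonzero_bytes(const void *obj, size_t size)
{
    assert(obj != NULL);
    const unsigned char *bytes = obj;   /* void * converts implicitly */
    size_t n = 0;
    for (size_t i = 0; i < size; i++) {
        if (bytes[i] != 0)
            n++;
    }
    return n;
}
```

For example, count_nonzero_bytes(&x, sizeof x) can be applied to an int, a struct, or a float without any aliasing concern.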
Implementations of malloc use char * internally and convert those to char ** and
int * to maintain their internal housekeeping data at the start of the block,
often using unions, returning a pointer to universally aligned storage following
that block prefix, often resulting in malloc overhead of one or more universally
aligned blocks per allocation; reducing space overhead takes more work: see e.g.
https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/stdlib/mallocr.c;h=ecc445f3d36365a4840e31c737db5018ddba42e9;hb=8e732f7f7f684f22b283f39a5d407375b3b0b3af
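To illustrate the block-prefix layout described above, here is a toy sketch, not newlib's actual code. All names (union header, toy_alloc, toy_size, ARENA_BYTES) are hypothetical. As an assumption to keep the sketch clearly conforming, the arena itself comes from malloc, so it has no declared type; a real malloc built on sbrk does not have that luxury, which is the question in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical block prefix: the union pads the header to the strictest
   fundamental alignment, so the payload after it is universally aligned. */
union header {
    size_t size;           /* payload size in bytes */
    max_align_t align;     /* forces size and alignment of the prefix */
};

enum { ARENA_BYTES = 4096 };

static char *arena;        /* backing storage, obtained from malloc */
static size_t arena_used;  /* always a multiple of sizeof(union header) */

/* Toy bump allocator: never frees, just carves out header + payload. */
void *toy_alloc(size_t nbytes)
{
    if (nbytes > ARENA_BYTES)
        return NULL;
    if (arena == NULL) {
        arena = malloc(ARENA_BYTES);
        if (arena == NULL)
            return NULL;
    }
    /* Round the payload up to a whole number of header-sized units. */
    size_t units = (nbytes + sizeof(union header) - 1) / sizeof(union header);
    size_t total = (units + 1) * sizeof(union header);
    if (total > ARENA_BYTES - arena_used)
        return NULL;

    union header *h = (union header *)(arena + arena_used);
    h->size = nbytes;      /* housekeeping data at the start of the block */
    arena_used += total;
    return h + 1;          /* payload starts right after the prefix */
}

/* Recover the recorded size from a payload pointer. */
size_t toy_size(const void *payload)
{
    assert(payload != NULL);
    const union header *h = (const union header *)payload - 1;
    return h->size;
}
```

The overhead here is one universally aligned unit per allocation, which matches the cost mentioned above; production allocators pack more information into that prefix.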
Forgive me if I misunderstand, but doesn't your recap regarding strict
aliasing agree with my understanding that this is an aliasing
violation?
Because you mention (correctly I think)
Post by Brian Inglis
... you should not alias objects via
casts or conversions of pointers to objects stored as incompatible types ...
And in the case I mentioned (one of many) in mallocr.c on line 2212:
"top = (mchunkptr)brk;"
Here brk is declared as: char *brk and is returned by sbrk (in my
case) which takes memory from the heap declared somewhere in a
linkerscript (or similar) AFAIK. But top is a mchunk *, which is a
struct.
Here the char * is converted to a mchunk * and that is okay, works, both will be
checked for aliasing; the inverse conversion is also allowed; no object is
accessed using the pointer here.
Post by narwhal x
So this is not a compatible type, right (so they ARE incompatible)? You
could cast from mchunk * TO char * and dereference it according to the
standard, but not the other way around.
Also if you look at the document I linked in my initial mail
Post by narwhal x
As an example, see 4.2.1 (p. 63) in
https://www.cl.cam.ac.uk/~pes20/cerberus/notes30.pdf
Isn't this exactly what is done in mallocr.c? And they state
specifically that this can only be done when strict aliasing is NOT
upheld. And that seems to be in accordance with the standard.
I get that this is how the space is managed internally, I also know a
lot of embedded applications and networking stacks do this casting
from a char * to a struct *, but these also had to disable strict
aliasing to avoid bugs.
So am I missing something? If I am talking nonsense or
misunderstanding something please let me know.
I know it works basically always, but isn't this technically undefined
behavior without -fno-strict-aliasing?
I think you may be missing that the issues are when using casts to pointer types
to access type-punned union members in a struct, or other objects, that are not
compatible types.
At the minimum at a low level, the objects should be in the same memory type or
register set for the compiler to be able to consider them possibly aliased,
although the spec is stricter, more general, and abstract, to apply on the
abstract machine, for which the compiler is required to provide an
implementation on a real machine, where the properties conform to the abstract
model.
Thanks for the replies so far, sorry for being a nuisance, but I really
want to understand this fully.
Post by Brian Inglis
Here the char * is converted to a mchunk * and that is okay, works, both will be
checked for aliasing; the inverse conversion is also allowed; no object is
accessed using the pointer here.
So I agree that this is not the point of undefined behavior, as nothing
is dereferenced. But let me compare the example from the document,
which they state is not in accordance with the aliasing rules, with a
trimmed-down version of the part in mallocr.c which I think is the
same. Could you point me to the difference? (Or argue against the
statement in the document.)
char *brk;
brk = (char*)(sbrk(sbrk_size)); // system sbrk (aligned)
top = (mchunkptr)brk;
top->size = top_size; // access the struct, violation?
Here sbrk returns a void * and casts from those to other pointer types have been
unnecessary and inadvisable in modern C compilers for over a decade; I'd write
this as top = brk = sbrk(sbrk_size); but some compilers might warn about this,
and some project builds like to treat warnings as errors, to enforce a clean
compile or require disabling compiler warnings, which to me encourages unsafe
cruft.
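A short sketch of the implicit void * conversion described above. It uses malloc instead of sbrk to stay portable, and struct chunk and init_chunk are hypothetical stand-ins, not the mallocr.c definitions. Note one caveat: only conversions to or from void * are implicit, so assigning a char * directly to a struct pointer would still need a cast.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the real chunk structure. */
struct chunk {
    size_t size;
};

/* Hypothetical helper: records a payload size in a fresh chunk header
   and reads it back. */
size_t init_chunk(size_t payload)
{
    void *raw = malloc(sizeof(struct chunk) + payload);
    assert(raw != NULL);

    char *brk = raw;             /* void * to char *: no cast needed */
    struct chunk *top = raw;     /* void * to struct chunk *: no cast needed */
    /* "top = brk;" would not be accepted without a cast, because char *
       and struct chunk * are not assignment-compatible. */
    (void)brk;

    top->size = payload;
    size_t result = top->size;
    free(raw);
    return result;
}
```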
Post by narwhal x
Document example (see first email, 4.2.1 (p. 63))
unsigned char c[sizeof(float)]; // (aligned)
float *fp = (float *)c; // example uses float, but should hold for other types
*fp = 1.0; // access, violation: "DEFACTO: defined behaviour iff -no-strict-aliasing"
Also quoting Joseph Myers regarding using unsigned char arrays to hold
values of other types:
"No, this is not safe (if it's visible to the compiler that the
memory in question has unsigned char as its declared type)."
http://www.cl.cam.ac.uk/~pes20/cerberus/notes50-survey-discussion.html [11/15]
https://gcc.gnu.org/ml/gcc/2015-04/msg00325.html
Using char arrays for other types is unsafe as often arithmetic types have
strict alignment requirements e.g. natural alignment on addresses which are
multiples of the object size, requirements not to cross a cache line, or a page
boundary, and that assumption is written into the standards spec; where no
alignment is required by an architecture, any such mismatch may result in
performance from poor to bad, so compilers and compiler and library implementers
have to be aware of and work around all restrictions, including implementing
functions in assembler where the compiler won't do what is required.
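The alignment concern above can be sketched with C11's alignas and alignof. The names is_aligned and float_storage are hypothetical. This addresses alignment only; it does not by itself settle the effective-type question raised earlier in the thread.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns nonzero if ptr is a multiple of alignment. */
int is_aligned(const void *ptr, size_t alignment)
{
    assert(alignment != 0);
    return (uintptr_t)ptr % alignment == 0;
}

/* A byte buffer explicitly given the alignment a float requires. Without
   alignas, an unsigned char array is only guaranteed alignment 1. */
alignas(float) unsigned char float_storage[sizeof(float)];
```

A check such as is_aligned(float_storage, alignof(float)) should then hold on any C11 implementation that provides uintptr_t.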
See first link above, last question, last point about memcpy: to be conforming,
use memcpy; but memcpy may be written in C for many library targets where
assembler versions are not available, so the library implementer has to be aware
of all the pitfalls.
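A minimal sketch of the memcpy approach mentioned above, applied to the unsigned char array from the document example. The helper names store_float and load_float are hypothetical. The float value is copied byte-wise, so the array is never accessed through a float lvalue.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copies the bytes of a float into a byte buffer
   of at least sizeof(float) bytes. */
void store_float(unsigned char *bytes, float value)
{
    assert(bytes != NULL);
    memcpy(bytes, &value, sizeof value);
}

/* Hypothetical helper: copies the bytes back out into a real float object. */
float load_float(const unsigned char *bytes)
{
    assert(bytes != NULL);
    float value;
    memcpy(&value, bytes, sizeof value);
    return value;
}
```

Usage would be: declare unsigned char c[sizeof(float)], call store_float(c, 1.0f), then read the value back with load_float(c). Compilers commonly turn such fixed-size memcpy calls into plain loads and stores, though that is an optimization and not something the standard promises.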
An old book, The Standard C Library, by P.J.Plauger, 1992, explained the design
and implementation in ANSI Standard C, which he followed up with The Draft
Standard C++ Library in 1995, then one on STL; he was also on the standards
committees; and his companies Whitesmiths, Intermetrics, Dinkumware have been
providing libraries to MS, IBM, and embedded companies for decades.
Post by narwhal x
As sbrk returns a pointer to a char array and the compiler can see
this, shouldn't it cause the same issue?
Thanks for bearing with me
** trying to figure out how to correctly reply to the mailinglist **
Read books and articles about library and compiler implementations that are not
just code listings, and read FAQs and discussions on groups like comp.lang.c,
where these questions would be better asked and answered.
--
Take care. Thanks, Brian Inglis, Calgary, Alberta, Canada