Limited RAM stdio replacement for newlib

Discussion:

Keith Packard

2018-09-21 00:46:22 UTC

I build embedded systems with limited RAM; ARM Cortex-M0 to M4 parts with
as little as 4kB of RAM and 32kB of ROM. With these tiny systems, the
traditional newlib stdio implementation can't be used as it requires
more memory than I have available. I suspect I'm not alone and wanted to
share what I'm using. Here's a link to my current git repository with
the code:

https://keithp.com/cgit/newlib.git/

I looked at the stdio implementation in the AVR libc:

https://www.nongnu.org/avr-libc/

and found that its architecture was a good match for my
requirements. RAM required for a simple stdin/stdout console app is as
little as 16 bytes. And it doesn't use malloc, which means I can do
fully static memory allocation.

A FILE structure contains a tiny amount of state along with function
pointers to get and/or put a single character. There's no internal
buffering, and not a lot of nested function calls, which is nice in
reducing stack usage.
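
A minimal sketch of that kind of FILE layout follows. All names here (tiny_file, tiny_fputc, console_put, and so on) are hypothetical illustrations, not the actual AVR libc or newlib definitions, and the console is simulated with a small static buffer instead of real hardware.

```c
#include <stddef.h>

/* Hypothetical stream type: a flag byte plus two function pointers. */
struct tiny_file {
    unsigned char flags;                      /* capability bits, see below */
    int (*put)(char c, struct tiny_file *f);  /* write one char, 0 on success */
    int (*get)(struct tiny_file *f);          /* read one char, -1 at end */
};

#define TINY_READ  0x01
#define TINY_WRITE 0x02

/* Simulated console device: a static buffer, no malloc involved. */
static char console_buf[32];
static size_t console_len;

static int console_put(char c, struct tiny_file *f)
{
    (void)f;
    if (console_len >= sizeof console_buf - 1)
        return -1;                            /* device "full" */
    console_buf[console_len++] = c;
    console_buf[console_len] = '\0';
    return 0;
}

/* Statically allocated stream bound to the simulated console. */
static struct tiny_file tiny_stdout = { TINY_WRITE, console_put, NULL };

/* Unbuffered single-character output: one indirect call per character. */
int tiny_fputc(int c, struct tiny_file *f)
{
    if (!(f->flags & TINY_WRITE) || f->put == NULL)
        return -1;
    return f->put((char)c, f) == 0 ? (unsigned char)c : -1;
}

/* String output built directly on tiny_fputc, with no intermediate buffer. */
int tiny_fputs(const char *s, struct tiny_file *f)
{
    while (*s)
        if (tiny_fputc(*s++, f) < 0)
            return -1;
    return 0;
}
```

Calling tiny_fputs("hi", &tiny_stdout) hands each character straight to console_put, so the only state involved is the statically allocated stream itself.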

I've also used the libc included with SDCC, and that is even smaller but
it doesn't support multiple FILEs at all, which makes it usable only for
applications with stdin/stdout as their only use of stdio.

The code as provided by AVR libc has a bunch of AVR assembly code and
dependence on AVR data types -- the AVR environment provides only 32-bit
floats, and makes 'double' an alias for that type. I've replaced the
assembly code with C code and extended it to support 64-bit floats for
both printf and scanf. I've also included a few tests to check printf
and scanf to make sure they work correctly. Those can be built natively
so that the library can be tested without needing a separate target
machine.

To build this library, I've constructed a parallel build system using
meson so that I didn't need to change the existing autotools build
system. This has a nice advantage of speeding up building the library.
Configuring the build takes less than 20 seconds. Compiling 20
variations of the library for different embedded ARM configurations
takes about 5 minutes on my low-power laptop.

I've been upstreaming the changes I'm making in the core newlib sources
so that all of the stdio and meson changes simply add new files.

I've packaged the resulting arm-none-eabi library for Debian and it's
sitting in the 'new' queue at this point. Eventually it will become part
of a Debian distribution and may become the default library for use by
the arm-none-eabi toolchain if that makes sense.

I'd love to hear from others interested in using newlib in these smaller
systems and whether it might make sense to merge these changes into the
main newlib repository.

--
-keith

Can Finner

2018-09-21 01:04:19 UTC

Post by Keith Packard
I build embedded systems with limited RAM; ARM Cortex-M0 to M4 parts with
as little as 4kB of RAM and 32kB of ROM. With these tiny systems, the
traditional newlib stdio implementation can't be used as it requires

Did you try newlib-nano options? In case it doesn't fit into your use
case either, is it possible to improve nano instead?

Thanks,
bin

--
Regards.

Keith Packard

2018-09-21 15:19:40 UTC

Post by Can Finner
Did you try newlib-nano options? In case it doesn't fit into your use
case either, is it possible to improve nano instead?

Yes, I did. Even with all of the knobs dialed to minimize memory usage it
still uses many kB of memory, requires malloc and has a fairly deep call
stack. At this point, I'm pretty convinced that there's no reasonable
way to create a stdio that targets good performance in high-capability
environments while offering options that reduce memory usage to near
zero bytes of RAM.

The other benefit of this replacement stdio is that it doesn't depend on
any POSIX interfaces like open/close or read/write, so the underlying OS
needn't have support for those, only character-at-a-time get and put
functions.
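
As a rough illustration of how little the platform has to supply, the sketch below builds a line reader on nothing but a get-one-character callback. The names dev_getc, getc_fn and read_line are hypothetical, and the "device" is an in-memory string standing in for something like a UART receive register.

```c
#include <stddef.h>

/* Simulated input device (hypothetical): yields the bytes of a fixed string. */
static const char *rx_data = "ok\nrest";

static int dev_getc(void)
{
    if (*rx_data == '\0')
        return -1;                    /* nothing more to read */
    return (unsigned char)*rx_data++;
}

/* The only interface the reader needs from the platform. */
typedef int (*getc_fn)(void);

/*
 * Read characters until newline, end of input, or a full buffer.
 * Stores a terminating NUL and returns the number of characters stored.
 * No open/close/read/write calls are involved.
 */
size_t read_line(getc_fn get, char *buf, size_t size)
{
    size_t n = 0;
    int c;

    if (size == 0)
        return 0;
    while (n < size - 1 && (c = get()) >= 0) {
        if (c == '\n')
            break;
        buf[n++] = (char)c;
    }
    buf[n] = '\0';
    return n;
}
```

On real hardware, dev_getc would be replaced by whatever routine polls or dequeues a received byte; nothing else in the reader would need to change.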

--
-keith

Joel Sherrill

2018-09-21 15:27:07 UTC

Post by Keith Packard

Post by Can Finner
Did you try newlib-nano options? In case it doesn't fit into your use
case either, is it possible to improve nano instead?

Yes I did. Even with all of the knobs dialed to minimize memory usage it
still uses many kB of memory, requires malloc and has a fairly deep call
stack. At this point, I'm pretty convinced that there's no reasonable
way to create a stdio that targets good performance in high-capability
environments while offering options that reduce memory usage to near
zero bytes of RAM.
The other benefit of this replacement stdio is that it doesn't depend on
any POSIX interfaces like open/close or read/write, so the underlying OS
needn't have support for those, only character-at-a-time get and put
functions.

This made me remember that RTEMS has a simple set of printk-style stdio
functions. Would it be possible to have a set of "kernel" stdio methods in
parallel with regular stdio? Use names like printk, sprintk, etc., and have
a user-level feature flag to enable macros that map stdio to those methods.

I am just pondering. The idea of having both in newlib sounds useful. We
use multilibs for many variants within an architecture so we have a single
toolchain for all ARM variants. Letting an application select at build
time whether the stdio names map to these methods would be desirable.

Right now the best you get is integer-only methods or a rebuild for nano,
which doesn't meet the needs of these smallest systems.

--joel

Keith Packard

2018-09-21 16:43:49 UTC

Post by Joel Sherrill
This made me remember that RTEMS has a simple set of printk-style stdio
functions. Would it be possible to have a set of "kernel" stdio methods in
parallel with regular stdio? Use names like printk, sprintk, etc., and have
a user-level feature flag to enable macros that map stdio to those methods.

That's what SDCC does as well, and it's useful as long as all you want
is console output. If you want to offer any additional files, then you
need something a tiny bit more complex, and that's where I'm at these
days.

Post by Joel Sherrill
I am just pondering. The idea of having both in newlib sounds useful. We
use multilibs for many variants within an architecture so we have a single
toolchain for all ARM variants. Letting an application select at build
time whether the stdio names map to these methods would be desirable.

Right, I'm building all 20 arm-none-eabi variants that the toolchain
lists. I guess I wonder which arm-none-eabi users would want the larger
stdio; it requires a POSIX syscall API, which seems like it would
involve an operating system with its own toolchain?

Post by Joel Sherrill
Right now the best you get is integer-only methods or a rebuild for nano,
which doesn't meet the needs of these smallest systems.

For the AVR libc stdio, I build both fp and integer-only printf/scanf
methods, with the integer-only versions named:

fp-enabled      integer-only
----------      ------------
vsnprintf       vsniprintf
vfprintf        vfiprintf
vprintf         viprintf
fprintf         fiprintf
printf          iprintf
sprintf         siprintf
snprintf        sniprintf
asprintf        asiprintf
asnprintf       asniprintf

vfscanf         vfiscanf
scanf           iscanf
fscanf          fiscanf
sscanf          siscanf

By selecting the compile-time option 'NEWLIB_INTEGER_PRINTF_SCANF', the
user application gets the 'normal' names defined as the integer-only
names so that it ends up using those functions instead. By building both
into the library, you avoid needing to pick which variant you want at
library build time and can leave it up to the application. It's not
perfect; I'd love to have the fp-enabled versions 'replace' the
integer-only ones so that another library could call iprintf and have
that use printf if the application ended up needing the full-featured
version. Maybe some trickery with weak symbols could solve this...
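
The name remapping described above can be sketched roughly as follows. This is only an illustration under stated assumptions: demo_printf, demo_iprintf and DEMO_INTEGER_PRINTF_SCANF are hypothetical stand-ins for the real functions and the real option, both stand-ins simply forward to vsnprintf, and the call counters exist only to make the selected variant observable.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Demonstration-only counters showing which variant was called. */
static int full_calls, int_calls;

/* Stand-in for the fp-enabled formatter (hypothetical name). */
int demo_printf(char *buf, size_t n, const char *fmt, ...)
{
    va_list ap;
    int r;

    full_calls++;
    va_start(ap, fmt);
    r = vsnprintf(buf, n, fmt, ap);
    va_end(ap);
    return r;
}

/* Stand-in for the integer-only formatter; a real one would omit fp code. */
int demo_iprintf(char *buf, size_t n, const char *fmt, ...)
{
    va_list ap;
    int r;

    int_calls++;
    va_start(ap, fmt);
    r = vsnprintf(buf, n, fmt, ap);
    va_end(ap);
    return r;
}

/* The application would normally set this on the compiler command line. */
#define DEMO_INTEGER_PRINTF_SCANF

/* Header-style remapping: the normal name becomes the integer-only name. */
#ifdef DEMO_INTEGER_PRINTF_SCANF
#define demo_printf demo_iprintf
#endif

/* Application code keeps using the normal name. */
int format_answer(char *buf, size_t n)
{
    return demo_printf(buf, n, "%d", 42);
}
```

Because both variants live in the library, only the application's own compile flags decide which one its calls resolve to; code compiled separately, such as another library calling the integer-only name directly, is unaffected by the macro, which is the limitation noted above.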

--
-keith