Printf is not a function
In many languages, formatted output looks like a normal function call.
But compilers often treat those calls very differently.
This post follows that strange pattern through Fortran, C, Pascal, OCaml, Python, and Rust.
It started with Fortran. Fortran had no function for formatted output;
it had a dedicated statement instead. Here is what it looked like:
      WRITE OUTPUT TAPE 6, 601, IA, IB, IC, AREA
  601 FORMAT (4H A= ,I5,5H B= ,I5,5H C= ,I5,
     &       8H AREA= ,F10.2, 13H SQUARE UNITS)
Needless to say, it was a distinguished language element
that both the programmer and the compiler had to treat with respect.
The statement itself looked quite involved,
and for a good reason: the 601 after the WRITE keyword is the statement number of the FORMAT statement that describes the output!
Going forward in history...
- Algol implemented printf as a function, but unlike the usual printf, the format string it received was not a regular string.
It was a separate kind of entity whose value had to be available at compile time
and which had its own special constructor syntax.
This restriction was later adopted by languages of the ML family, notably OCaml.
- C's printf does not stand out at the language level, but today's compilers handle it with care.
Pass an argument whose type does not conform to the format string,
or use a dynamically generated format string, and you will get a warning from GCC (given suitable -Wformat options).
No ordinary function declaration can express such behaviour,
but there is a compiler-specific extension that lets you bless any function with the same check
by adding the following magic to its declaration:
__attribute__((format(printf, ...)))
It may also be interesting to note that there are not many variadic functions in the C standard library, and their semantics tends to complicate life for compiler writers, given how simple C otherwise is.
- Pascal's write was exceptional in that it was one of the few privileged routines
that were allowed to accept a variable number of arguments.
- Even though OCaml dropped the special syntax of Algol's printf,
its format string is still not quite a normal string: the compiler gives a literal passed to printf a dedicated format type instead of string.
- In Python 2, a dynamic language with great potential in teaching,
print is a statement rather than a function.
- Rust's println! looks like a macro, but searching for the implementation we encounter this piece of code:
macro_rules! format_args_nl {
    ($fmt:expr) => {{ /* compiler built-in */ }};
    ($fmt:expr, $($args:tt)*) => {{ /* compiler built-in */ }};
}
There do not seem to be many reasons why this could not be implemented as an ordinary Rust macro.
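To make the C entry above concrete, here is a minimal sketch of a user-defined function that opts into the same format checking. The name format_into is invented for this example, and the sketch assumes GCC or a compatible compiler such as Clang. The numbers 3 and 4 say which parameter holds the format string and where the checked arguments begin.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: format_into is a name made up for this sketch.
   The attribute declares that parameter 3 is a printf-style format string
   and that the arguments to check against it start at parameter 4. */
#if defined(__GNUC__)
__attribute__((format(printf, 3, 4)))
#endif
int format_into(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int written;

    va_start(ap, fmt);
    written = vsnprintf(buf, size, fmt, ap);  /* the real formatting work */
    va_end(ap);
    return written;  /* length the full output needs, as vsnprintf reports it */
}
```

With this declaration, a call such as format_into(buf, sizeof buf, "%d", "three") should draw the same kind of -Wformat warning that printf itself would get. Without the attribute, the compiler sees only an opaque variadic call and stays silent.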
Printf is so complicated it has been considered an attack surface.
I think the reason is that the format string is not just data: it is a tiny language describing the shape and types of the following arguments.
An instructive limiting case is a dependently typed implementation of printf in Idris. There, printf can indeed be written as an ordinary function, but only because the type system is strong enough to compute the rest of the function's type from the format string. The implementation first parses the string into a Format value, then maps that value to a type: roughly, %d adds an Int ->, %s adds a String ->, and the end of the format finally returns String. The resulting signature is the revealing part: printf : (fmt : String) -> PrintfType (toFormat (unpack fmt)). So this is not a cheap trick that ordinary languages merely forgot to use. To make printf a regular function, the language must allow ordinary function types to depend on the value of an ordinary string argument.
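The mapping from format string to type can be imitated at run time even in C, which makes the shape of the computation visible. The sketch below is only an illustration: printf_type is a hypothetical helper, it understands just %d, %s and %%, and it produces the signature as text. What it cannot do, and what Idris can, is feed that result back into the type checker.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper for illustration only.
   Writes into `out` the curried signature implied by `fmt`:
   each %d contributes "Int -> ", each %s contributes "String -> ",
   and the end of the format contributes the final "String".
   Returns 0 on success, -1 if `out` is too small. */
int printf_type(const char *fmt, char *out, size_t size)
{
    static const char result[] = "String";
    size_t used = 0;

    if (size == 0) return -1;
    out[0] = '\0';

    for (const char *p = fmt; *p != '\0'; ++p) {
        const char *piece = NULL;

        if (p[0] == '%' && p[1] == 'd')      { piece = "Int -> ";    ++p; }
        else if (p[0] == '%' && p[1] == 's') { piece = "String -> "; ++p; }
        else if (p[0] == '%' && p[1] == '%') { ++p; }  /* literal percent sign */
        /* any other character is literal text and adds no argument */

        if (piece != NULL) {
            size_t len = strlen(piece);
            if (used + len >= size) return -1;
            memcpy(out + used, piece, len + 1);  /* copy with terminator */
            used += len;
        }
    }

    if (used + sizeof result > size) return -1;
    memcpy(out + used, result, sizeof result);
    return 0;
}
```

Calling printf_type on "%d apples for %s" yields the text Int -> String -> String. A C compiler does this bookkeeping internally when it checks printf calls, but the language gives the programmer no way to say that a function's type is that computed result.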