Signaling NaNs Rise Again
written by Walter Bright
March 26, 2009
Back around 1980, I was working on the 757 flight controls design for Boeing. This was in the last halcyon days when drafting was done with pen and ink on paper. Memos were handwritten and given to the secretary who machine-gunned them out on an IBM Selectric. Wite-out was worshiped. And our brilliant female engineer from MIT would get asked to serve coffee at meetings.
About the only computers in use at the time by the mechanical engineers were desktop calculators. Many complex calculations were done using geometry by a draftsman, a technique I didn’t understand. Company computers at the time were controlled by a separate organization, and were located in a “computer center” which was a chilly air-conditioned special room with access locks and a raised floor under which ran all the cables. Think of it as being like one of those ancient Egyptian temples that had an inner sanctum where only the priests were allowed in, and you get the picture.
The nearest such computer center had a glowing and humming PDP-11 in its altar room. I talked the sys admin into giving me the code for the door and a bootleg account. I wrote the programs I needed for analysis from scratch — geometry, matrices, numerical integration, finite element, etc. This was all done in FORTRAN. My boss had been burned by “computer numbers” and incomprehensible geeks before, and sensibly refused to accept any numbers generated by computer. But he thought he’d have a bit of fun with me, and challenged me to a duel - computer against the drafting method. As his champion, he selected his best draftsman. We were to calculate a table of about 40 numbers needed to prove out part of the design of the elevator system.
It took me a couple hours to write the program, produce the numbers, and double check everything. It took the draftsman maybe a couple days. There was one mismatch between the sets of numbers. He said I’d made a mistake. I asked him to humor me and recheck that number, which he good-naturedly agreed to. Two hours later, he came back and said he’d made a mistake, and the computer numbers were right. Furthermore, the computer numbers were correct to 6 digits and the drafting numbers were only good to 3. After that, I was trusted with using the computer as a tool.
A while after that, the organization that ran the computer centers got wind that I was bootlegging time off of them, and wrote me up for inappropriate and unauthorized use of the facilities. My supervisor, however, backed me up 100% and put the smackdown on them. From then on I had free rein of the facilities.
So what has all this to do with signaling NaNs? That is when I developed a soft spot for the needs and requirements of using a computer for numerical analysis. This has influenced the C and C++ compilers I’ve built, and more interestingly, has influenced the design of the D programming language.
C’s support for numerical programming in the 1980s was minimal. While Intel raised the bar with their revolutionary 8087 floating point coprocessor, C support for its capabilities was a long time coming. In particular, the coprocessor supported NaN encodings, but C did not. The C89 Standard came and went, without acknowledgment of NaNs. x86 users had this wonderful floating point unit, and little software support for it.
This motivated the formation of NCEG, the Numeric C Extensions Group. NCEG wanted a complete makeover of C to support numerical programming, including full support for the capabilities of the x87 numeric coprocessors. This included NaNs, both quiet and signaling. I was on their mailing list, and enthusiastically implemented their proposed extensions in the Zortech C and C++ compilers. After about 300 papers from 1989 to 1993, NCEG simply vanished. Little trace of them remains. (Although I have a complete set of all their papers scanned in.) With the end of NCEG came the end of signaling NaNs. The newer C99 standard did not support them. The reasons given are here. That 2002 document somewhat resurrects them, but I don’t know of any implementation of them and, more significantly, have never heard anyone ask for them or even mention them.
So although NaNs went into the D programming language, they did so mainly because of my soft spot for numerics programming. I had given up on signaling NaNs completely. At least quiet NaNs were thoroughly and explicitly supported; they were not optional like in C and C++.
And so things went until Don Clugston appeared on the scene. If there was ever a person who was geekier about the intricacies of floating point math than I am, it’s Don. Don and I are kindred spirits in caring about getting the last bit of accuracy out of math functions. He re-engineered the D math library for performance and accuracy. D has a feature where all variables are initialized to a default value if no explicit initializer is given. Floating point variables are default initialized to a quiet NaN. Don suggested that instead, the default initializer should be a signaling NaN. Not only that, he submitted patches to the compiler source to do it. Even more significantly, others chimed in wanting it.
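The default-initialization rule described above can be seen in a few lines of D. This is only a minimal sketch: the helper name `scaleUninitialized` is made up for illustration, and the checks rely only on the value being some NaN, so they hold whether the default is a quiet or a signaling NaN, as long as floating-point exceptions are left masked.

```d
import std.math : isNaN;
import std.stdio : writeln;

// Hypothetical helper for illustration: it uses a floating-point
// variable that was never given an explicit initializer.
double scaleUninitialized(double factor)
{
    double a;          // no initializer, so a starts out as double.init, a NaN
    a *= factor;       // arithmetic on a NaN yields a NaN
    return a + 1.0;    // and the NaN keeps propagating
}

void main()
{
    double x;                      // default-initialized
    assert(isNaN(x));              // the default value is a NaN
    assert(x != x);                // a NaN compares unequal even to itself

    double r = scaleUninitialized(7.0);
    assert(isNaN(r));              // the missing initializer surfaces in the result

    writeln("result is NaN: ", isNaN(r));
}
```

With exceptions masked, the NaN travels silently through the arithmetic, so the program can tell that something was left uninitialized but not where it happened. That gap is what trapping on a signaling NaN is meant to close.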
Signaling NaNs now play in D the role they were originally created for — to make the use of uninitialized floating-point variables trivial to track down. They eradicate the nightmares you get in floating-point code when code fails intermittently. The instant recognition of how useful this can be indicates a high level of awareness of numerics in the D community.
Here’s an example Don wrote illustrating their use:
    void main()
    {
        double a, b, c;
        a *= 7;        // Exactly the same as it is now, a is nan.
        enableExceptions();
        c = 6;         // ok, c is initialized now
        c *= 10;
        b *= 10;       // BANG ! Straight into the debugger
        b *= 5;
        disableExceptions();
    }

    void enableExceptions()
    {
        version(D_InlineAsm_X86)
        {
            short cont;
            asm
            {
                fclex;
                fstcw cont;
                mov AX, cont;
                and AX, 0xFFFE; // enable invalid exception
                mov cont, AX;
                fldcw cont;
            }
        }
    }

    void disableExceptions()
    {
        version(D_InlineAsm_X86)
        {
            short cont;
            asm
            {
                fclex;
                fstcw cont;
                mov AX, cont;
                or AX, 0x1; // disable invalid exception
                mov cont, AX;
                fldcw cont;
            }
        }
    }
So I have to say, signaling NaNs are back, and I hope here to stay.
Acknowledgements
Thanks to Don Clugston and Bartosz Milewski for reviewing a draft of this.