Nonre-entrant library calls are frequently made from multithreaded
programs. The potential for problems is often not caught until the code
is placed under heavy load on a multiprocessor machine. To make matters
worse, calls such as ctime(3C) can corrupt the process heap and cause
crashes in unrelated areas. In his Solaris Developer Connection(SM)
Program technical article "Eliminating nonre-entrant Library Calls in
Multithreaded Programs," Bruce Chapman provides a list of APIs to
avoid, as well as a means of checking to make sure code is not
harboring dangerous, nonre-entrant library calls.
With the Solaris(TM) Operating Environment (Solaris OE), to avoid
making the mistake of using nonre-entrant library calls, you must look
at the man page for every library call your code makes. If a re-entrant
<name>_r version of the call exists, use it; according to Chapman,
that is the only safe call to make in a multithreaded (MT) program.
However, again according to Chapman, not all engineers have the
discipline to make this approach effective.
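As a minimal sketch of what switching to an _r call looks like, the
fragment below wraps ctime_r() so that each caller supplies its own
buffer instead of sharing the one ctime() returns. The helper name
format_stamp and the STAMP_LEN constant are hypothetical, and the code
assumes the POSIX two-argument form of ctime_r(); older Solaris
releases default to a three-argument form unless POSIX semantics are
requested at compile time, so check the man page for the release in
use.

```c
/* Assumption: this selects the POSIX two-argument ctime_r(). */
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <string.h>
#include <time.h>

/* POSIX requires the ctime_r() buffer to hold at least 26 bytes. */
#define STAMP_LEN 26

/*
 * Hypothetical helper: format 'when' into the caller-owned buffer 'buf'
 * (at least STAMP_LEN bytes). No shared static storage is touched, so
 * each thread can safely call it with its own buffer.
 * Returns 0 on success, -1 on failure.
 */
int format_stamp(time_t when, char *buf)
{
    assert(buf != NULL);
    if (ctime_r(&when, buf) == NULL)
        return -1;
    buf[strcspn(buf, "\n")] = '\0';   /* drop the trailing newline */
    return 0;
}
```

A thread would typically declare char stamp[STAMP_LEN] on its own
stack and pass it to format_stamp(), so no two threads ever write to
the same storage.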
This kind of insidious bug can creep into MT code in many other ways as
well. Chapman provides illustrations of these and other bugs in the
article.
The program in Chapman's article was run with up to 1600 threads on a
single-processor machine with no problems. It was then run with 100
threads on a 4-CPU machine and also ran fine. Only when run with 200
threads on a 4-CPU machine did the nonre-entrant call to ctime()
finally cause a crash. Because ctime() corrupts the C heap, the crash
could just as easily have occurred elsewhere, in any part of a program
that manipulates the C heap on its own.
The tedious way to eliminate this type of problem is to perform source
code analysis by referring back to library call man pages for the
Solaris OE. Another approach is to use Solaris OE software tools to
look at all the libraries a binary uses. This can be done either
statically or with a running process. However, in the early stages of
code development and testing, these problems may not yet have
manifested themselves.
Chapman illustrates some basic pre-deployment checks that can be made
online, along with the source code for a simple library,
multithreaded_nonreentrant.c, that can interpose all the nonre-entrant
library calls for the Solaris OE that have re-entrant equivalents.
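The general interposition technique can be sketched as follows. This
is not Chapman's multithreaded_nonreentrant.c, whose contents are not
reproduced here; it is a hypothetical one-function illustration that
assumes the run-time linker supports dlsym() with RTLD_NEXT. The
interposed ctime() prints a warning and then forwards to the real
implementation.

```c
/* Needed for RTLD_NEXT on glibc; assumed unnecessary on Solaris. */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stdio.h>
#include <time.h>

/*
 * Interposed ctime(): report the nonre-entrant call on stderr, then
 * forward to the next definition of ctime() in the link order.
 * The lazy lookup is itself unsynchronized; every thread would store
 * the same pointer, which is tolerable for a diagnostic sketch only.
 */
char *ctime(const time_t *clock)
{
    static char *(*real_ctime)(const time_t *);

    if (real_ctime == NULL)
        real_ctime = (char *(*)(const time_t *))dlsym(RTLD_NEXT, "ctime");

    fprintf(stderr, "warning: nonre-entrant ctime() called; "
                    "consider ctime_r()\n");

    return real_ctime != NULL ? real_ctime(clock) : NULL;
}
```

Built as a shared object and loaded ahead of libc (for example through
the LD_PRELOAD environment variable), a wrapper of this kind flags
offending calls at run time without requiring changes to the program
under test. A full checking library would need one such wrapper per
nonre-entrant call.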
Examples of code and a list of nonre-entrant standard library calls to
avoid with the Solaris 8 OE in a multithreaded program are available
online:
http://soldc.sun.com/articles/multithreaded.html