FAQ


Last modified: Tue Aug 28 15:47:27 PDT 2001

  This is a list of the questions which have come up on the newsgroup with
  any answers that were given. (Somewhat edited by yours truly.)  In a few
  cases I have left in the names of the participants.  If you'd like me to
  remove your name, let me know.  If you have other comments/corrections, just
  drop me a line (Bil LambdaCS.com).  (Of course I'll expect *you* to supply 
  any corrections! :-)

  This list is a bit of a hodge-podge, containing everything that I thought
  *might* be useful. Hence it is HUGE and not very well edited. It even has
  duplicates (or worse, near-duplicates). The MFAQ is much smaller and better
  maintained. You may wish to check there first.


-Bil


     ==================================================================
    	   F R E Q U E N T L Y    A S K E D    Q U E S T I O N S 
     ==================================================================
				Also see:

     Brian's FAQ
     (Sun's Threads page and FAQ are no more.)

  Many of the most general questions can be answered by reading (a) the
  welcome message, (b) the general information on the other threads pages,
  and (c) any of the books on threads.  References to all of these can be
  found in the welcome message.

Q1:   How fast can context switching be?
Q2:   What about special purpose processors?
Q3:   What kinds of issues am I faced with in async cancellation?
Q4:   When should I use these new thread-safe "_r" functions?
Q5:   What benchmarks are there on POSIX threads?
Q6:   Has anyone used the Sparc atomic swap instruction?
Q7:   Are there MT-safe interfaces to DBMS libraries?
Q8:   Why do we need re-entrant system calls?
Q9:   Any "code-coverage" tools for MT applications?
Q10:  How can POSIX join on any thread?
Q11:  What is the UI equivalent for PTHREAD_MUTEX_INITIALIZER?
Q12:  How many threads are too many in one heavyweight process? 
Q13:  Is there an atomic mutex_unlock_and_wait_for_event()?
Q14:  Is there an archive of this newsgroup somewhere?
Q15:  Can I copy pthread_mutex_t structures, etc.?
Q16:  After 1800 calls to thr_create() the system freezes. ??
Q17:  Compiling libraries which might be used in threaded or unthreaded apps?
Q18:  What's the difference of signal handling for process and thread? 
Q19:  What about creating large numbers of threads?
Q20:  What about using sigwaitinfo()?
Q21:  How can I have an MT process communicate with many UP processes?
Q22:  Writing Multithreaded code with Sybase CTlib ver 10.x?
Q23:  Can we avoid preemption during spin locks?
Q24:  What about using spin locks instead of adaptive spin locks?
Q25:  Will thr_create(...,THR_NEW_LWP) fail if the new LWP cannot be added?
Q26:  Is the LWP released upon bound thread termination?
Q27:  What's the difference between pthread FIFO and the Solaris threads scheduling?
Q28:  I really think I need time-sliced RR.
Q29:  How important is it to call mutex_destroy() and cond_destroy()?
Q30:  EAGAIN/ENOMEM etc. apparently aren't in ?!
Q31:  What can I do about TSD being so slow?
Q32:  What happened to the pragma 'unshared' in Sun C?
Q33:  Can I profile an MT-program with the debugger?
Q34:  Sometimes the specified sleep time is SMALLER than what I want.
Q35:  Any debugger that can single-step a thread while the others are running?
Q36:  Any DOS threads libraries?
Q37:  Any Pthreads for Linux?
Q38:  Any really basic C code example(s) to get us newbies started?
Q39:  Please put some Ada references in the FAQ.
Q40:  Which signals are synchronous, and which are asynchronous?
Q41:  If we compile -D_REENTRANT, but without -lthread, will we have problems?
Q42:  Can Borland C++ for OS/2 give up a TimeSlice?
Q43:  Are there any VALID uses of suspension?
Q44:  What's the status of pthreads on SGI machines?
Q45:  Does the Gnu debugger support threads?
Q46:  What is gang scheduling?
Q47:  LinuxThreads linked with X11, calls to X11 seg fault.
Q48:  Are there Pthreads on NT?
Q49:  What about garbage collection?
Q50:  Does anyone have any information on thread programming for VMS?
Q51:  Any information on the DCE threads library?
Q52:  Can I implement pthread_cleanup_push without a macro?
Q53:  What switches should be passed to particular compilers?
Q54:  How do I find Sun's bug database?
Q55:  How do the various vendors' threads libraries compare?
Q56:  Why don't I need to declare shared variables VOLATILE?
Q57:  Do pthread_cleanup_push/pop HAVE to be macros (thus lexically scoped)?
Q58:  Analyzer Fatal Error[0]:  Slave communication failure ??
Q59:  What is the status of Linux threads?
Q60:  The Sunsoft debugger won't recognize my PThreads program!
Q61:  How are blocking syscall handled in a two-level system?
Q62:  Can one thread read from a socket while another thread writes to it?
Q63:  What's a good way of writing threaded C++ classes?
Q64:  Can thread stacks be built in privately mapped memory?
Q66:  I think I need a FIFO mutex for my program...
Q67:  Why does my multi-threaded X11 app with LinuxThreads crash?
Q68:  How would we put a C++ object into a thread?
Q69:  How different are DEC threads and Pthreads?
Q70:  How can I manipulate POSIX thread IDs?
Q71:  I'd like a "write" that allowed a timeout value...
Q72:  I couldn't get threads to work with glibc-2.0.
Q73:  Can I do dead-owner-process recovery with POSIX mutexes?
Q74:  Will IRIX distribute threads immediately to CPUs?
Q75:  IRIX pthreads won't use both CPUs?
Q76:  Are there thread mutexes, LWP mutexes *and* kernel mutexes?
Q77:  Does anyone know of a MT-safe alternative to setjmp and longjmp?
Q78:  How do I get more information inside a signal handler?
Q79:  Is there a test suite for Pthreads? 
Q80:  Flushing the Store Buffer vs. Compare and Swap
Q81:  How many threads CAN a POSIX process have? 
Q82:  Can Pthreads wait for combinations of conditions?
Q83:  Shouldn't pthread_mutex_trylock() work even if it's NOT PTHREAD_PROCESS_SHARED?
Q84:  What about having a NULL thread ID?
Q85:  Explain Traps under Solaris
Q86:  Is there anything similar to posix conditions variables in Win32 API ?
Q87:  What if a cond_timedwait() times out AND the condition is TRUE?
Q88:  How can I recover from a dying thread?
Q89:  How to implement POSIX Condition variables in Win32?
Q90:  Linux pthreads and X11
Q91:  One thread runs too much, then the next thread runs too much!
Q92:  How do priority levels work?
Q93:  C++ member function as the startup routine for pthread_create(). 
Q94:  Spurious wakeups, absolute time, and pthread_cond_timedwait()
Q95:  Conformance with POSIX 1003.1c vs. POSIX 1003.4a?
Q96:  Cleaning up when kill signal is sent to the thread.?
Q97:  C++ new/delete replacement that is thread safe and fast?
Q98:  beginthread() vs. endthread() vs. CreateThread? (Win32)
Q99:  Using pthread_yield()?
Q100: Why does pthread_cond_wait() reacquire the mutex prior to being cancelled?
Q101: HP-UX 10.30 and threads?
Q102: Signals and threads are not suited to work together?
Q103: Patches in IRIX 6.2 for pthreads support?
Q104: Windows NT Fibers?
Q105: LWP migrating from one CPU to another in Solaris 2.5.1?
Q106: What conditions would cause that thread to disappear?
Q107: What parts, if any, of the STL are thread-safe?
Q108: Do pthreads libraries support cooperative threads?
Q109: Can I avoid mutexes by using globals?
Q110: Aborting an MT Sybase SQL?
Q111: Other MT tools?
Q112: That's not a book. That's a pamphlet!
Q114: How to cleanup TSD in Win32?
Q115: Onyx1 architecture has one problem
Q116: LinuxThreads linked with X11 seg faults.
Q117: Comments about Linux and Threads and X11
Q118: Memory barriers for synchronization
Q119: Recursive mutex debate
Q120: Calling fork() from a thread
Q121: Behavior of [pthread_yield()] sched_yield()
Q122: Behavior of pthread_setspecific()
Q123: Linking under OSF1 3.2: flags and library order
Q124: What is the TID during initialization? 
Q125: TSD destructors run at exit time... and if it crashes?
Q126: Cancellation and condition variables
Q127: RedHat 4.2 and LinuxThreads?
Q128: How do I measure thread timings? 
Q129: Contrasting Win32 and POSIX thread designs
Q130: What does POSIX say about putting stubs in libc?
Q131: MT GC Issues
Q132: Some details on using CMA threads on Digital UNIX 
Q133: When do you need to know which CPU a thread is on?
Q134: Is there any difference between default and static mutex initialization?
Q135: Is there a timer for Multithreaded Programs? 
Q136: Roll-your-own Semaphores 
Q137: Solaris sockets don't like POSIX_C_SOURCE!
Q138: The Thread ID changes for my thread! 
Q139: Does X11 support multithreading ? 
Q140: Solaris 2 bizarre behavior with usleep() and poll()
Q141: Why is POSIX.1c different w.r.t. errno usage? 
Q142: printf() anywhere AFTER pthread_create() crashes on HPUX 10.x 
Q143: Pthreads and Linux 
Q144: DEC release/patch numbering 
Q145: Pthreads (almost) on AS/400 
Q146: Can pthreads & UI threads interoperate in one application?
Q147: Thread create timings 
Q148: Timing Multithreaded Programs (Solaris) 
Q149: A program which monitors CPU usage? 
Q150: standard library functions: whats safe and whats not? 
Q151: Where are semaphores in POSIX threads? 
Q152: Thread & sproc (on IRIX) 
Q153: C++ Exceptions in Multi-threaded Solaris Process 
Q154: SCHED_FIFO threads without root privileges ? 
Q155: "lock-free synchronization" 
Q156: Changing single bytes without a mutex 
Q157: Mixing threaded/non-threadsafe shared libraries on Digital Unix 
Q158: VOLATILE instead of mutexes? 
Q159: After pthread_cancel() destructors for local object do not get called?!
Q160: No pthread_exit() in Java.
Q161: Is there anyway I can make my stacks red zone protected?
Q162: Cache Architectures, Word Tearing, and VOLATILE
Q163: Can ps display thread names?
Q164: (Not!) Blocking on select() in user-space pthreads.
Q165: Getting functional tests for UNIX98
Q166: To make gdb work with linuxthreads?
Q167: Using cancellation is *very* difficult to do right...
Q168: Why do pthreads implementations differ in error conditions?
Q169: Mixing threaded/non-threadsafe shared libraries on DU
Q170: sem_wait() and EINTR
Q171: pthreads and sprocs
Q172: Why are Win32 threads so odd?
Q173: What's the point of all the fancy 2-level scheduling??
Q174: Using the 2-level model, efficiency considerations, thread-per-X
Q175: Multi-platform threading api
Q176: Condition variables on Win32 
Q177: When stack gets destroyed relative to TSD destructors?
Q178: Thousands of mutexes?
Q179: Threads and C++
Q180: Cheating on mutexes
Q181: Is it possible to share a pthread mutex between two distinct processes?
Q182: How should one implement reader/writer locks on files?
Q183: Are there standard reentrant versions of standard nonreentrant functions?
Q184: Detecting the number of cpus
Q185: Drawing to the Screen in more than one Thread (Win32)
Q186: Digital UNIX 4.0 POSIX contention scope
Q187: Dec pthreads under Windows 95/NT?
Q188: DEC current patch requirements
Q189: Is there a full online version of 1003.1c on the web somewhere?
Q190: Why is there no InterlockedGet?
Q191: Memory barrier for Solaris
Q192: pthread_cond_t vs pthread_mutex_t
Q193: Using DCE threads and java threads together on hpux(10.20)
Q194: My program returns enomem on about the 2nd create.
Q195: Does pthread_create set the thread ID before the new thread executes?
Q196: thr_suspend and thr_continue in pthread
Q197: Are there any opinions on the Netscape Portable Runtime?
Q198: Multithreaded Perl
Q199: What if a process terminates before mutex_destroy()?
Q200: If a thread performs an illegal instruction and gets killed by the system...
Q201: How to propagate an exception to the parent thread?
Q202: Discussion: "Synchronously stopping things" / Cheating on Mutexes
Q203: Discussion: Thread creation/switch times on Linux and NT.
Q204: Are there any problems with multiple threads writing to stdout?
Q205: How can I handle out-of-band communication to a remote client?
Q206: I need a timed mutex for POSIX
Q207: Does pthreads have an API for configuring the number of LWPs?
Q208: Why does Pthreads use void** rather than void*?
Q209: Should I use poll() or select()?
Q210: Where is the threads standard of POSIX ????
Q211: Is Solaris' unbound thread model braindamaged?
Q212: Releasing a mutex locked (owned) by another thread.
Q213: Any advice on using gethostbyname_r() in a portable manner?
Q214: Passing file descriptors when exec'ing a program.
Q215: Thread ID of thread getting stack overflow? 
Q216: Why aren't my (p)threads preempted?
Q217: Can I compile some modules with and others without _POSIX_C_SOURCE?
Q218: timed wait on Solaris 2.6?
Q219: Signal delivery to Java via native interface
Q220: Concerning timedwait() and realtime behavior.
Q221: pthread_attr_getstacksize on Solaris 2.6
Q222: LinuxThreads: Problem running out of TIDs on pthread_create
Q223: Mutexes and the memory model
Q224: Poor performance of AIO in Solaris 2.5?
Q225: Strategies for testing multithreaded code?
Q226: Threads in multiplatform NT 
Q227: Guarantee on condition variable predicate/pthreads?
Q228: Pthread API on NT? 
Q229: Sockets & Java2 Threads
Q230: Emulating process shared threads 
Q231: TLS in Win32 using MT run-time in dynamically loaded DLLs?
Q232: Multithreaded quicksort
Q233: When to unlock for using pthread_cond_signal()?
Q234: Multi-Read One-Write Locking problem on NT
Q235: Thread-safe version of flex scanner 
Q236: POSIX standards, names, etc
Q237: Passing ownership of a mutex?
Q238: NT fibers
Q239: Linux (v.2.0.29 ? Caldera Base)/Threads/KDE 
Q240: How to implement user space cooperative multithreading?
Q241: Tools for Java Programming 
Q242: Solaris 2.6, pthread_cond_timedwait() wakes up early
Q243: AIX4.3 and PTHREAD problem
Q244: Readers-Writers Lock source for pthreads
Q245: Signal handlers in threads 
Q246: Can a non-volatile C++ object be safely shared amongst POSIX threads?
Q247: Single UNIX Specification V2
Q248: Semantics of cancelled I/O (cf: Java)
Q249: Advice on using multithreading in C++?
Q250: Semaphores on Solaris 7 with GCC 2.8.1 
Q251: Draft-4 condition variables (HELP) 
Q252: gdb + linuxthreads + kernel 2.2.x = fixed :) 
Q253: Real-time input thread question
Q254: How does Solaris implement nice()?  
Q255: Re: destructors and pthread cancelation...  
Q256: A slight inaccuracy WRT OS/2 in Threads Primer 
Q257: Searching for an idea 
Q258: Benchmark timings from "Multithreaded Programming with Pthreads" 
Q259: Standard designs for a multithreaded applications? 
Q260: Threads and sockets: Stopping asynchronously
Q261: Casting integers to pointers, etc. 
Q262: Thread models, scalability and performance  
Q263: Write threaded programs while studying Japanese!  
Q264: Catching SIGTERM - Linux v Solaris 
Q265: pthread_kill() used to direct async signals to thread? 
Q266: Don't create a thread per client 
Q267: More thoughts on RWlocks 
Q268: Is there a way to 'store' a reference to a Java thread? 
Q269: Java's pthread_exit() equivalent?  
Q270: What is a "Thread Pool"?
Q271: Where did "Thread" come from?
Q272: How do I create threads in a Solaris driver?
Q273: Synchronous signal behavior inconsistent?
Q274: Making FORTRAN libraries thread-safe?
Q275: What is the wakeup order for sleeping threads?
Q276: Upcalls in VMS?
Q277: How to design synchronization variables?
Q278: Thread local storage in DLL?
Q279:  How can I tell what version of linux threads I've got?
Q280: C++ exceptions in a POSIX multithreaded application?
Q281: Problems with Solaris pthread_cond_timedwait()?
Q282: Benefits of threading on uni-process
Q283: What if two threads attempt to join the same thread?
Q284: Questions with regards to Linux OS?
Q285: I need to create about 5000 threads?
Q286:  Can I catch an exception thrown by a sla
Q287: _beginthread() versus CreateThread()?
Q288: Is there a select() call in Java??
Q289: Comment on use of VOLATILE in the JLS.?
Q290: Should I try to avoid GC by pooling objects myself??
Q291: Does thr_X return errno values? What's errno set to???
Q292: How can I wait on more than one condition variable in one place?
Q293: Details on MT_hot malloc()?
Q294: Bug in Bil's condWait()?
Q295: Is STL considered thread safe??
Q296: To mutex or not to mutex an int global variable ??
Q297: Stack overflow problem ?
Q298: How would you allow the other threads to continue using a "forgotten" lock?
Q299: How unfair are mutexes allowed to be?
Q300: Additionally, what is the difference between -lpthread and -pthread? ?
Q301: Handling C++ exceptions in a multithreaded environment?
Q302: Pthreads on IRIX 6.4 question?
Q303: Threading library design question ?
Q304: Lock Free Queues?
Q305: Threading library design question ?
Q306: Stack size/overflow using threads ?
Q307: correct pthread termination?
Q308: volatile guarantees??
Q309: passing messages, newbie?
Q310: solaris mutexes?
Q311: Spin locks?
Q312: AIX pthread pool problems?
Q313: iostream library and multithreaded programs ?
Q314: Design document for MT appli?
Q315: SCHED_OTHER, and priorities?
Q316: problem with iostream on Solaris 2.6, Sparcworks 5.0?
Q317: pthread_mutex_lock() bug ???
Q318: mix using thread library?
Q319: Re: My agony continues (thread safe gethostbyaddr() on FreeBSD4.0) ?
Q320: OOP and Pthreads?
Q321: query on threading standards?
Q322: multiprocesses vs multithreaded..??
Q323: CGI & Threads?
Q324: Cancelling detached threads (posix threads)?
Q325: Solaris 8 recursive mutexes broken?
Q326: sem_wait bug in Linuxthreads (version included with glibc 2.1.3)?
Q327: pthread_atfork??
Q328: Does anybody know if the GNU Pth library supports process shared mutexes?
Q329: I am trying to make a thread in Solaris to get timer signals.
Q330: How do I time individual threads?
Q331: I'm running out of IPC semaphores under Linux!
Q332: Do I have to abandon the class structure when using threads in C++?
Q333: Questions about pthread_cond_timedwait in linux.
Q334: Questions about using pthread_cond_timedwait.
Q335: What is the relationship between C++ and the POSIX cleanup handlers?
Q336: Does select() work on calls recvfrom() and sendto()?
Q337: libc internal error: _rmutex_unlock: rmutex not held.
Q338: So how can I check whether the mutex is already owned by the calling thread?
Q339: I expected SIGPIPE to be a synchronous signal.
Q340: I have a problem between select() and pthread...
Q341: Mac has Posix threading support.
Q342: Just a few questions on Read/Write for linux.
Q343: The man pages for ioctl(), read(), etc. do not mention MT-safety.
Q344: Status of TSD after fork()?
Q345: Static member function vs. extern "C" global functions?
Q346: Can I kill a thread from the main thread that created it?
Q347: What does /proc expose vis-a-vis LWPs?
Q348: What mechanism can be used to take a record lock on a file?
Q349: Implementation of a Timed Mutex in C++
Q350: Effects that gradual underflow traps have on scaling.
Q351: LinuxThreads woes on SIGSEGV and no core dump.
Q352: On timer resolution in UNIX.
Q353: Starting a thread before main through dynamic initialization.
Q354: Using POSIX threads on mac X and solaris?
Q355: Comments on ccNUMA on SGI, etc.
Q356: Thread functions are NOT C++ functions! Use extern "C"
Q357: How many CPUs do I have?
Q358: Can malloc/free allocate from a specified memory range?
Q359: Can GNU libpth utilize multiple CPUs on an SMP box?
Q360: How does Linux pthreads identify the thread control structure?
Q361: Using gcc -kthread doesn't work?!
Q362: FAQ or tutorial for multithreading in 'C++'?
Q363: WRLocks & starvation.
Q364: Reference for threading on OS/390.
Q365: Timeouts for POSIX queues (mq_timedreceive())
Q366: A subroutine that gives cpu time used for the calling thread?
Q367: Documentation for threads on Linux
Q368: Destroying a mutex that was statically initialized.
Q369: Tools for debugging overwritten data.
Q370: POSIX synchronization is limited compared to win32.
Q371: Anyone recommend us a profiler for threaded programs?
Q372: Coordinating thread timeouts with drifting clocks.
Q373: Which OS has the most conforming POSIX threads implementation?
Q374: MT random number generator function.
Q375: Can the main thread sleep without causing all threads to sleep?
Q376: Is dynamic loading of the libpthread supported in Redhat?
Q377: Are reads and writes atomic?
Q378: More discussion on fork().
Q379: Performance differences: POSIX threads vs. ADA threads?
Q380: Maximum number of threads with RedHat 255?
Q381: Best MT debugger for Windows...
Q382: Thread library with source code ? 
Q383: Async cancellation and cleanup handlers.
Q384: How easy is it to use pthreads on win32?
Q385: Does POSIX require two levels of contention scope?
Q386: Creating threadsafe containers under C++
Q387: Cancelling pthread_join() DOESN'T detach target thread?
Q388: Scheduling policies can have different ranges of priorities?
Q389: The entity life modeling approach to multi-threading.
Q390: Is there any (free) documentation?
Q391: Grafting POSIX APIs on Linux is tough!
Q392: Any companies  using pthread-win32?
Q393: Async-cancel safe function: guidelines?
Q394: Some detailed discussion of implementations.
Q395: Cancelling a single thread in a signal handler?
Q396: Trouble debugging under gdb on Linux.
Q397: Global signal handler dispatching to threads.
Q398: Difference between the Posix and the Solaris Threads?
Q399: Recursive mutexes are broken in Solaris?
Q400: pthreads and floating point attributes?
Q401: Must SIGSEGV be sent to the thread which generated the signal?
Q402: Windows and C++: How?
Q403: I have blocked all signals and don't get SEGV!
Q404: AsynchronousInterruptedException (AIE) and POSIX cancellation


=================================TOP===============================
 Q1: How fast can context switching be?  

In general purpose processors (SPARC, MIPS, ALPHA, HP-PA, POWER, x86) a
LOCAL thread context switch takes on the order of 50us.  A GLOBAL thread
context switch takes on the order of 100us.  However...

heddaya@csb.bu.edu (Abdelsalam Heddaya) writes:

>- Certain multi-threaded processor architectures, with special support
>  for on-chip caching of thread contexts can switch contexts in,
>  typically, less than 10 cycles, down to as little as one cycle.

The Tera machine switches with 0 cycles of overhead.

>  Such processors still have to incur a high cost when they run out of
>  hardware contexts and need to perform a full "context swap" with
>  memory.

Hmmm.  With 128 contexts/processors and 16 processors on the smallest
machine, we may be talking about a rare situation.  Many people doubt
we'll be able to keep the machine busy, but you propose an
embarrassment of riches/parallelism.

In any case, I disagree with the implication that a full context swap
is a problem to worry about.  We keep up to 2048 threads active at a
time, with others confined to memory.  The processors issue
instructions for the active threads and completely ignore the inactive
threads -- there's no swapping of threads between processor and memory
in the normal course of execution.  Instead, contexts are "swapped"
when one thread finishes, or blocks too long, or is swapped to disk,
etc.  In other words, at fairly significant intervals.

Preston Briggs

=================================TOP===============================

 Q2: What about special purpose processors?  

What are the distinctions between these special purpose processors and
the general purpose processors we're using?


??

=================================TOP===============================
 Q3: What kinds of issues am I faced with in async cancellation?  


Michael C. Cambria wrote:
> 
> In article <4eoe2a$kje@news.hal.com>, spike@hal.com (Spike White) wrote:
> [deleted]
> > thread2()
> > {
> >    ...
> >    while(1) {
> >       pthread_setasynccancel(CANCEL_ON);
> >       pthread_testcancel();  /* if there's a pending cancel */
> >       read(...);
> >       pthread_setasynccancel(CANCEL_OFF);
> >       ...process data...
> >    }
> > }
> >
> > Obviously, you shouldn't use any results from the read() call that was
> > cancelled -- God knows what state it was when it left.
> >
> > That's the only main use I've ever found for async cancel.
> 
> I used something quite similar to your example (quoted above) in my
> original question.
> 
> Since the read() call itself is not async cancel safe according to Posix,
> is it even safe to do the above?  In general for any posix call which is
> not async cancel safe, my guess (and many e-mails to me agree) is to
> just not use it.
> 
> Using read() as an example, I'll bet everyone will agree with you not
> to use the results of the read() call.  However, the motivation for
> my original question was that, since such a call is not async cancel
> safe, canceling a thread while it is in one of these calls _may_ screw up
> other threads in general and other threads using the same fd in
> particular.  This is why I asked why one would use it.
> 
> In your example, if read() did anything with static data, the next read on
> that fd could have problems if a thread was cancelled while in the read().
> (Note:  if you don't like the "static data" example, substitute whatever
> you like for the implementation reason for read(), or any call, not being
> async cancel safe.  I used static data as an example only.)
> 
> Mike

Specifically, NO, it is NOT safe to call read() with async cancel. On some
implementations it may work, sometimes. In general, it *MAY* work if, on
the particular release of your particular operating system, read() happens
to be implemented with no user-mode code (aside from a syscall trap). In
most cases, a user mode cancel will NOT be allowed to corrupt kernel data.

However, no implementations make any guarantees about their implementation
of read(). It may be a syscall in one version and be moved partly into
libc in the next version.

Unfortunately, the OSF DCE porting guide made reference to the possibility
of using async cancel in place of synchronous system cancel capability on
platforms that don't support the latter. That was really too bad, and it
set a very dangerous precedent.

POSIX 1003.1c-1996 encourages all routines to document whether they are
async cancel safe. (Luckily the advice is in rationale -- which is to say
it's really just commentary and not part of the standard -- because it'd
be horrendously difficult to change the documentation of every single
routine in a UNIX system.) In practice, you should always assume that a
function is NOT async cancel safe unless it says that it IS. And you won't
see that very often.

Because, as has already been commented, async cancel really isn't very
useful. There is a certain small class of application that can benefit
dramatically from async cancel, for good response to shutdown requests in
long-running compute-bound threads. In a long and tight loop it's not
practical to call pthread_testcancel(). So in cma we provided async cancel
for those cases. In retrospect I believe that's probably one of the bad
parts of cma, which POSIX should have omitted. There may well have been
"hard realtime" people in the room who wanted to use it, though (the POSIX
threads standard was developed by roughly 10 "threads people" and 40 to 50
"realtime people").

------------------------------------------------------------------------
Dave Butenhof                              Digital Equipment Corporation
butenhof@zko.dec.com                       110 Spit Brook Rd, ZKO2-3/Q18
Phone: 603.881.2218, FAX: 603.881.0120     Nashua, NH 03062-2711
                 "Better Living Through Concurrency"
------------------------------------------------------------------------
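
[A minimal sketch, not from the original posts.]  With the default DEFERRED
cancel type, POSIX defines read() as a cancellation point, so a thread that
is blocked in read() can be cancelled without enabling async cancel at all.
The names free_buffer(), reader() and cancel_blocked_reader() are made up
for illustration, and the pipe only stands in for whatever descriptor a real
thread would be reading.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Cleanup handler: runs if the thread is cancelled while it owns buf. */
static void free_buffer(void *buf)
{
    free(buf);
}

/* Reader thread using the default (deferred) cancel type. */
static void *reader(void *arg)
{
    int fd;
    char *buf;

    assert(arg != NULL);
    fd = *(int *)arg;
    buf = malloc(4096);
    if (buf == NULL)
        return NULL;

    pthread_cleanup_push(free_buffer, buf);
    for (;;) {
        /* read() is a cancellation point: a pending cancel is acted on here. */
        ssize_t n = read(fd, buf, 4096);
        if (n <= 0)
            break;
        /* ...process data... */
    }
    pthread_cleanup_pop(1);   /* also frees buf on the normal exit path */
    return NULL;
}

/* Start a reader on an idle pipe, cancel it, and report whether it ended
 * by cancellation.  Returns 0 on success, -1 otherwise. */
int cancel_blocked_reader(void)
{
    int fds[2];
    pthread_t tid;
    void *result = NULL;
    int ok;

    if (pipe(fds) != 0)
        return -1;
    if (pthread_create(&tid, NULL, reader, &fds[0]) != 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    ok = pthread_cancel(tid) == 0 && pthread_join(tid, &result) == 0;
    close(fds[0]);
    close(fds[1]);
    return (ok && result == PTHREAD_CANCELED) ? 0 : -1;
}
```

Because the cancel can only be delivered at read(), the buffer is never left
half-owned, and the cleanup handler releases it on the way out.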


> In article <3659533E.ABBA639A@hotmail.com>,
> Jose Luis Ramos Morán wrote:
> %   pthread_setcancelstate(PTHREAD_CANCEL_ENABLE,NULL);
> %   pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS,NULL);
>
> I would guess that your problem comes from this. Asynchronous cancellation
> is almost never a good idea, but if you do use it, you should be really
> careful about whether there's anything with possible side-effects in your
> code. For instance, the C++ exception handler could be screwed up for your
> whole process if you cancel at a bad moment.
>
> Anyway, try taking out the asynchronous cancellation and see if the problem
> goes with it.

I'll put it a little more strongly than Patrick. The program is illegal. You
CANNOT call any function with asynchronous cancel enabled unless that function
is explicitly specified as "async-cancel safe". There are very few such
functions, and sleep() is not one of them. In fact, within the scope of the
POSIX and UNIX98 standards, with async cancel enabled you are allowed only to

  1. Disable asynchronous cancellation (set cancel type to DEFERRED)
  2. Disable cancellation entirely (set cancel state to DISABLE)
  3. Call pthread_cancel() [This is bizarre and pointless, but it is specified
     in the standard.]

If you call any other function defined by ANSI C, POSIX, or UNIX98 with async
cancel enabled, then your program is nonportable and "non conforming". It MAY
still be "correct", but only IF you are targeting your code to one specific
implementation of the standard that makes the NON-portable and NON-standard
guarantee, in writing, that the function you're calling actually is
async-cancel safe on that implementation. Otherwise, the program is simply
broken.

You can, of course, write your own async-cancel safe functions. It's not that
hard to do. In general, like most correct implementations of pthread_cancel(),
you simply DISABLE async cancellation on entry and restore the previous
setting on exit. But it's silly to do that very often. And, of course, that's
not the same as actually allowing async cancel. THAT is a much, much harder
job, except for regions of code that own no resources of any kind.
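
[A rough sketch of the "disable on entry, restore on exit" pattern just
described; it is not code from the original post.]  The helper name
report_progress() is hypothetical.  It relies only on the fact, stated
above, that changing the cancel type is one of the few things permitted
while async cancel is enabled.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical helper that may be called with async cancel enabled,
 * because it switches to DEFERRED before doing anything else. */
int report_progress(const char *msg)
{
    int caller_type;   /* cancel type in effect when we were called */
    int ignored;       /* receives the DEFERRED type on the way out */
    int rc;

    /* Allowed even with async cancel enabled. */
    rc = pthread_setcanceltype(PTHREAD_CANCEL_DEFERRED, &caller_type);
    if (rc != 0)
        return rc;

    /* From here on, cancellation can happen only at cancellation points,
     * so ordinary library calls are legal again. */
    assert(msg != NULL);
    fprintf(stderr, "progress: %s\n", msg);

    /* Restore whatever the caller had, async or deferred. */
    return pthread_setcanceltype(caller_type, &ignored);
}
```

Note that this only makes the function safe to call; the body itself never
runs with async cancel enabled.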

Asynchronous cancellation was designed for tight CPU-bound loops that make no
calls, and therefore would suffer from the need to call pthread_testcancel()
on some regular basis in order to allow responsiveness to cancellation
requests. That's the ONLY time or place you should EVER even consider using
asynchronous cancellation.
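
[An illustrative sketch of that one legitimate case, not taken from the
original post.]  The thread below does pure arithmetic in its loop: it makes
no calls and owns no resources.  The names spin_until_cancelled() and
cancel_spinner(), and the arithmetic itself, are invented for the example.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical compute-bound thread: nothing but arithmetic in the loop. */
static void *spin_until_cancelled(void *arg)
{
    unsigned long *seed = arg;
    volatile unsigned long x;   /* volatile keeps the loop from vanishing */
    int old_type;

    assert(seed != NULL);       /* checked BEFORE async cancel is enabled */
    x = *seed;

    pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, &old_type);
    for (;;)
        x = x * 1664525UL + 1013904223UL;   /* no calls, no resources */
    /* not reached: the thread ends only by being cancelled */
}

/* Start the spinner, cancel it, and check that it really was cancelled.
 * Returns 0 on success, -1 otherwise. */
int cancel_spinner(void)
{
    unsigned long seed = 42;
    pthread_t tid;
    void *result = NULL;

    if (pthread_create(&tid, NULL, spin_until_cancelled, &seed) != 0)
        return -1;
    if (pthread_cancel(tid) != 0)
        return -1;
    if (pthread_join(tid, &result) != 0)
        return -1;
    return result == PTHREAD_CANCELED ? 0 : -1;
}
```

With the default deferred type this loop would never notice the cancel,
since it contains no cancellation point.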

/---------------------------[ Dave Butenhof ]--------------------------\
| Compaq Computer Corporation                     butenhof@zko.dec.com |
| 110 Spit Brook Rd ZKO2-3/Q18       http://members.aol.com/drbutenhof |
| Nashua NH 03062-2698  http://www.awl.com/cseng/titles/0-201-63392-2/ |
\-----------------[ Better Living Through Concurrency ]----------------/


=================================TOP===============================
 Q4: When should I use these new thread-safe "_r" functions?  


David Brownell wrote:
> 
> If the "_r" versions are available at all times, use them but
> beware of portability issues.  POSIX specifies a pretty minimal
> set and many implementations add more (e.g. gethostbyname_r).
> Some implementations only expose the "_r" versions if you
> compile in a threaded environment, too.
> 
> - Dave

POSIX 1003.1c-1995 deliberately separates _POSIX_THREAD_SAFE_FUNCTIONS
from _POSIX_THREADS so that they can be easily implemented by
non-threaded systems. The "_r" versions aren't just thread-safe, they
are also much "cleaner" and more modular than the traditional forms.
(for example, you can have several independent readdir_r or strtok_r
streams active simultaneously).
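
[A small sketch of the "several independent streams" point, not from the
original post.]  Each strtok_r() scan keeps its position in a caller-supplied
pointer, so an inner scan can run inside an outer one.  The function name
count_fields() and the "key=value;key=value" input format are assumptions
made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the fields in text such as "a=1;b=2": records are split on ';'
 * and each record is split on '='.  The buffer is modified in place. */
size_t count_fields(char *text)
{
    char *outer_state = NULL;   /* position of the ';' scan */
    char *inner_state = NULL;   /* position of the '=' scan */
    size_t fields = 0;
    char *record;
    char *field;

    assert(text != NULL);
    for (record = strtok_r(text, ";", &outer_state);
         record != NULL;
         record = strtok_r(NULL, ";", &outer_state)) {
        /* The inner scan has its own state, so it does not disturb
         * the outer one. */
        for (field = strtok_r(record, "=", &inner_state);
             field != NULL;
             field = strtok_r(NULL, "=", &inner_state)) {
            fields++;
        }
    }
    return fields;
}
```

The same nesting written with plain strtok() would lose the outer position
as soon as the inner scan started, threads or no threads.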

The grand vision is that all UNIX systems, even those without threads,
would of course want to pick up this wonderful new set of interfaces. I
doubt you'll see them in any system without threads support, of course,
but it would be nice.

=================================TOP===============================
 Q5: What benchmarks are there on POSIX threads?  

In the book on POSIX.4 by B. Gallmeister there are some very useful POSIX
benchmark programs which allow one to measure the real-time performance of
an operating system. However, there is nothing on the threads of POSIX.4a!
Does anybody know of a useful set of benchmark programs for these POSIX
threads?

Any help is greatly appreciated.

Markus Joos
CERN ECP/ESS
(markus.joos@cern.ch)


??
=================================TOP===============================
 Q6: Has anyone used the Sparc atomic swap instruction?  

Has anyone used the Sparc atomic swap instruction to safely build lists 
in a multithreaded application?  Any examples?  Any references?


Yes, but it would not help you if you use sun4c machines (no atomic
instructions).  Thus you would be forced to use atomic instructions on sun4m
or later, and spl stuff on sun4c.  Does not make a pretty picture. Why not use
mutex_lock/unlock and let the libraries worry about that?  mutex_lock uses
atomic/spl stuff.

Sinan

[sun4c are SPARC v7 machines such as 4/110, SS1, SS1+, SS2, IPC,
IPX, EPC, EPX. sun4m are v8 machines including SS10, SS20, SS4, SS5, 4/690,
SS1000, SC2000. The UltraSPARC machines are SPARC v8+ (soon to be v9), but
have the same instructions as the sun4ms.]
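
As a rough sketch of the "just use a mutex" advice, here is a list whose head
is protected by a lock.  It uses the POSIX calls (pthread_mutex_lock/unlock)
rather than the UI mutex_lock/unlock named above, and the names locked_list,
list_init, list_push and list_pop are invented for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct node {
    int          value;
    struct node *next;
};

struct locked_list {
    pthread_mutex_t lock;   /* protects head and the nodes linked from it */
    struct node    *head;
};

int list_init(struct locked_list *l)
{
    l->head = NULL;
    return pthread_mutex_init(&l->lock, NULL);
}

/* Push a value on the front of the list; returns 0, or -1 if out of memory. */
int list_push(struct locked_list *l, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->value = value;
    pthread_mutex_lock(&l->lock);
    n->next = l->head;
    l->head = n;
    pthread_mutex_unlock(&l->lock);
    return 0;
}

/* Pop the most recently pushed value; returns 0, or -1 if the list is empty. */
int list_pop(struct locked_list *l, int *value)
{
    struct node *n;

    assert(value != NULL);
    pthread_mutex_lock(&l->lock);
    n = l->head;
    if (n != NULL)
        l->head = n->next;
    pthread_mutex_unlock(&l->lock);
    if (n == NULL)
        return -1;
    *value = n->value;
    free(n);
    return 0;
}
```

The library decides how the lock is implemented on each machine, so the same
source works whether or not the hardware has an atomic swap.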
=================================TOP===============================
 Q7: Are there MT-safe interfaces to DBMS libraries?  

A: In general, no.  My current understanding is that NO major DBMS
   vendor has an MT-safe client-side library. (1996)


Peter Sylvester wrote:
> 
> In article <311855DF.41C6@steffl.aui.dec.com>, Andreas Reichenberger
>  wrote:
> 
> > Richard Moulding wrote:
> > >
> > > I need to interface to an Oracle 7 DB from DCE (non-Oracle)
> > > clients. We are planning to build our own  DCE RPC-stored
> > > procedure interface but someone must be selling something to do
> > > this, no?
> 
> I have the same problem with an Informix application which uses a DCE
> interface, and currently limit it to 1 thread coming in.  This works, but
> could be a bottleneck in busy environments, as other incoming RPCs are put
> in a queue (blocked) until the current one finishes.

... stuff deleted

 
> A potential way around this would be to fork off separate processes which
> then start their own connection to the database.  The parent then acts as a
> dispatcher for requests coming in.  I know the forking part works without
> DCE, but I suspect that you have to do all the forks before going into the
> DCE server listening mode.
> 
> I also thought I heard something about Oracle providing a thread safe
> library, maybe in 7.3.  Anyone know?
> 
> --
> Peter Sylvester
> MITRE Corp.
> Bedford, MA
> (peters@mitre.org)

This is exactly the way we handled the problem. We wrote a tool that
generates the complete dispatcher from an IDL file. The dispatcher (which is
virtually invisible to the clients and to the developers) distributes the
requests from the clients to its 'backends', which are connected to the
DB. The backends are implemented as single-threaded DCE Servers with the
Interface specified in the IDL File.

We added some features that are not in DCE, like 
  - asynchronous RPCs (the RPC returns immediately and the client can ask the
    dispatcher to return the state of the RPC (if it is done or still running)
    or request the RPC to be canceled) 
  - dividing the backends into classes. i.e. it's possible to have one class of
    backends for querying the database and another class for updates, etc. By
    assigning 2 backends to the query class and the rest of the backends to
    other classes you can limit the number of concurrent queries to 2 (because
    they are time consuming). The client has to specify which class is to be used
    for an RPC (we currently support up to 10 classes)

Context handles are used to tie a client to one backend for transactions which
require more than one RPC to be handled by the same backend (= DB Session).

The reason why the hell we had to do this anyway was to limit the number of
backend processes necessary to support a few hundred PC clients. We
currently run it on AIX and Digital UNIX with Oracle and Ingres. However,
there's no reason why it shouldn't work on any UNIX platform which supports
OSF DCE (V1.1) and with any DB.

Feel free to contact me for more details...

See 'ya

=================================TOP===============================
 Q8: Why do we need re-entrant system calls?  

A:
jbradfor@merlot.ecn.purdue.edu (Jeffrey P Bradford) wrote:
>Why do we need re-entrant system calls?  I know that it's so that
>system calls can be used in a multithreaded environment, but how often
>does one really have multiple threads executing the same system call?
>Do we really need system calls that can be executed by multiple
>threads, or would mutual exclusion be good enough?

Well, there have been some implementations that felt (feel?) that mutual
exclusion is good enough. And, in fact, that will "thread safe" the
functions. But it wreaks havoc with performance, and things like
cancelability. Turns out that real applications have multiple threads
executing the same system call all the time. read() and write() are popular,
as are send() and recv() (on UNIX).

>I'm assuming that system calls can be designed intelligently enough so
>that, for example, if a process wants to perform a disk read, the
>process performs a system call, exits the system call (so another
>thread can perform a disk read), and then is woken up when the disk
>read is done.
>
>Jeff

[I assume the behavior you reference "leave the system call" means
"return to user space"]

That all depends on the OS. On UNIX, that is not the default
system call behavior. On VMS it is (Just two examples).

Brian Silver.

=================================TOP===============================
 Q9: Any "code-coverage" tools for MT applications?  

Is there an application that can help me with "code-coverage" for
MT applications?


A:

Upon which platform are you working?  I did performance profiling last week
on a MT app using prof & gprof on a Solaris 2.4 machine.  For code coverage,
I use tcov.  I suspect that most OS's w/ kernel threads have thread-aware
gprof and tcov commands.

--
Spike White          | spike@hal.com               | Biker Nerds
HaL Software Systems | '87 BMW K75S, DoD #1347     |  From  HaL
Austin, TX           |  http://www.halsoft.com/users/spike/index.html 
Disclaimer:  HaL, want me to speak for you?  No, Dave... 
=================================TOP===============================
 Q10: How can POSIX join on any thread?  

The pthread_join() function will not allow you to wait for "any" 
thread, like the UI function thr_join() will.  How can I get this?

A:
> >: I want to create a number of threads and then wait for the first
> >: one to finish, not knowing which thread will finish first.  But
> >: it appears pthread_join() forces me to specify exactly which of
> >: my threads I want to wait for.  Am I missing something basic, or
> >: is this a fundamental flaw in pthread_join()?
> >
> >:      Rich Stevens
> >
> >Good call.  I notice Solaris native threads have this support and the
> >pthreads implementations I've seen don't.  I wondered about this myself.
> >
> 
> Same here.  The situation I ran into was a case where once the main
> created the necessary threads and completed any work it was responsible
> for, it just needed to "hang-around" until all the threads completed
> their work before exiting.  pthread_join() for "any" thread in loop using
> a counter for the number of threads seemed the logical choice.  Then I
> realized Solaris threads supported this but POSIX didn't (along with
> reader/writer locks).  Oh well.
> 
> How about the Solaris SPLIT package.  Does it support the "wait for any"
> thread join?

This "wait for any" stuff is highly misleading, and dangerous in most real
threaded applications. It is easy to compare with the traditional UNIX "wait
for any process", but there's no similarity. Processes have a PARENT -- and
when a process "waits for any" it is truly waiting only for its own
children. When your shell waits for your "make" it CANNOT accidentally chomp
down on the termination of the "cc" that make forked off!

This is NOT true with threads, in most of the common industry threading
models (including POSIX 1003.1c-1995 and the "UNIX International" threads
model supported by Solaris). Your thr_join(NULL,...) call may grab the
termination status of a thread used to parallelize an array calculation
within the math library, and thus BREAK the entire application.

Without parent/child relationships, "wait for any" is not only totally
useless, it's outright dangerous. It's like the Win32 "terminate thread"
interface. It may seem "neat" on the surface, but it arbitrarily breaks all
shared data & synchronization invariants in ways that cannot be detected or
repaired, and thus CANNOT be used in anything but a very carefully
constructed "embedded system" type environment where every aspect of the
code is tightly controlled (no third-party libraries, and so forth). The
very limited environments where they are safe & useful are dramatically
outweighed by the danger that having them there (and usually very poorly
explained) encourages their use in inappropriate ways.

It really wouldn't have been hard to devise POSIX 1003.1c-1995 with
parent/child relationships. A relatively small overhead. It wasn't even
seriously considered, because it wasn't done in any of the reference
systems, and certainly wasn't common industry practice. Nevertheless,
there are clearly advantages to "family values" in some situations...
among them being the ability to usefully support "wait for any". But
wishful thinking and a dime gets you one dime...

------------------------------------------------------------------------
Dave Butenhof                              Digital Equipment Corporation
butenhof@zko.dec.com                       110 Spit Brook Rd, ZKO2-3/Q18
Phone: 603.881.2218, FAX: 603.881.0120     Nashua, NH 03062-2711
                 "Better Living Through Concurrency"
------------------------------------------------------------------------

I find Dave's comments to be most insightful.  He hits on a big point
that I have heard a number of people express confusion about.  My 2-bits
to add:

  As a programmer we should be thinking about the availability of resources
-- when is something ready for use?  "Is the Matrix multiply complete?" "Has
the data request been satisfied?" etc.  thr_join() is often used as a cheap
substitute for those questions, because we ASSUME that when all N threads
have exited, that the computation is complete.  (Generally accurate, as long
as we control the entire program.  Should some lout get hired to maintain
our code, this assumption could become false in a hurry.)

  The only instance where we REALLY care if a thread has exited is when
the resource in question IS that thread (e.g., we want to un-mmap pages
we reserved for the stack or other rare stuff).

  So... the correct answer is "Don't do that."  Don't use thr_join()
to count threads as they exit.  Set up a barrier or a CV and have the threads
count down as they complete their work.  IE:

worker threads:

	do_work();
...  	lock(M);
	running_threads--;
	if (running_threads == 0) cond_signal(CV);
	unlock(M);
	thr_exit();




"Master" thread:

...	running_threads = N;
	create_workers(N);
	lock(M);
	while (running_threads != 0) cond_wait(CV, M);
	...
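
For POSIX threads, the same countdown might look roughly like the sketch
below.  M, CV and running_threads follow the pseudocode; run_workers,
work_done and the empty "work" step are placeholders made up for the example.
The workers are detached, so nothing ever needs to join them.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t M  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  CV = PTHREAD_COND_INITIALIZER;
static int running_threads;     /* workers that have not reported in yet */
static int work_done;           /* stand-in for the real result */

static void *worker(void *arg)
{
    (void)arg;
    /* do_work() would go here */
    pthread_mutex_lock(&M);
    work_done++;
    if (--running_threads == 0)
        pthread_cond_signal(&CV);
    pthread_mutex_unlock(&M);
    return NULL;
}

/* Start n workers and wait until all of them have finished their work.
 * Returns the number of workers that completed. */
int run_workers(int n)
{
    int finished;

    assert(n >= 0);
    pthread_mutex_lock(&M);
    running_threads = n;
    work_done = 0;
    pthread_mutex_unlock(&M);

    for (int i = 0; i < n; i++) {
        pthread_t tid;
        if (pthread_create(&tid, NULL, worker, NULL) == 0) {
            pthread_detach(tid);        /* we wait on the count, not the thread */
        } else {
            pthread_mutex_lock(&M);
            running_threads--;          /* this worker will never report in */
            pthread_mutex_unlock(&M);
        }
    }

    pthread_mutex_lock(&M);
    while (running_threads != 0)
        pthread_cond_wait(&CV, &M);
    finished = work_done;
    pthread_mutex_unlock(&M);
    return finished;
}
```

The master waits for "the work is complete", not for "some thread exited",
so a stray thread belonging to a library cannot confuse it.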


-Bil


=================================TOP===============================
 Q11: What is the UI equivalent for PTHREAD_MUTEX_INITIALIZER?  

A:

From the man page (man mutex_init):

Solaris Initialize
     The equivalent Solaris API used to initialize a mutex so that it has
     several different types of behavior is the type argument passed to
     mutex_init().  No current type uses arg although a future type may
     specify additional behavior parameters via arg.  type may be one of
     the following:

     USYNC_THREAD        The mutex can synchronize threads only in this
                         process.  arg is ignored.  The USYNC_THREAD
                         Solaris mutex type for process scope is
                         equivalent to the POSIX mutex attribute setting
                         PTHREAD_PROCESS_PRIVATE.

     USYNC_PROCESS       The mutex can synchronize threads in this process
                         and other processes.  Only one process should
                         initialize the mutex.  arg is ignored.  The
                         USYNC_PROCESS Solaris mutex type for process
                         scope is equivalent to the POSIX mutex attribute
                         setting PTHREAD_PROCESS_SHARED.  The object
                         initialized with this attribute must be allocated
                         in memory shared between processes, i.e. either
                         in Sys V shared memory (see shmop(2)) or in
                         memory mapped to a file (see mmap(2)).  It is
                         illegal to initialize the object this way and to
                         not allocate it in such shared memory.

     Initializing mutexes can also be accomplished by allocating in zeroed
     memory (default), in which case, a type of USYNC_THREAD is assumed.
     The same mutex must not be simultaneously initialized by multiple
     threads.  A mutex lock must not be re-initialized while in use by
     other threads.

     If default mutex attributes are used, the macro DEFAULTMUTEX can be
     used to initialize mutexes that are statically allocated.
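
For comparison, here is a small sketch of the POSIX side: a statically
allocated mutex set up with PTHREAD_MUTEX_INITIALIZER.  The counter and the
function name next_id are invented for the example.  Going by the man page
text above, the UI declaration would use DEFAULTMUTEX in the same position;
that line is shown only as a comment because it needs the Solaris headers.

```c
#include <assert.h>
#include <pthread.h>

/* POSIX static initialization with default attributes.  The UI (Solaris)
 * counterpart, per the man page excerpt, would be:
 *     mutex_t counter_lock = DEFAULTMUTEX;
 */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static int counter;

/* Hand out increasing ids; safe to call from several threads. */
int next_id(void)
{
    int id;

    pthread_mutex_lock(&counter_lock);
    id = ++counter;
    pthread_mutex_unlock(&counter_lock);
    assert(id > 0);
    return id;
}
```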

=================================TOP===============================
 Q12: How many threads are too many in one heavyweight process?    

How many are too many for a single machine?

A:

The answer, of course, is "it depends".

Presumably, the number of threads you're considering far outstrips the
number of processors you have available, so it's not really important
whether you're running on uni- or a multiprocessor, and it's not really
important (in this general case) whether the threads implementation has
any kernel support (presumably it doesn't on HP-UX, judging by your post
from 14 Feb 1996 14:31:42 -0500).  So, it comes down to what these
bazillion threads of yours are actually doing.  

If, for the most part, they just sit there waiting for someone to tickle
the other end of a socket connection, then you can probably create LOTS
before you hit "too many".  In this case it would depend on how much
memory is available to your process, in which to keep all of these
sleeping threads (and how much kernel resources are available to create
sockets for them ;-).

If, on the other hand, every one of these bazillion threads is hammering
away on the processor (trying to compute some fractal or something :-),
then creating any more threads than you have processors is too many.
That is, you waste time (performance, throughput, etc.) in switching
back and forth between the threads which you could be spending on
something useful.  That is, life would be better if you just created a
couple of threads and had them make their way through all the work at
hand.

Presumably, your application falls somewhere between the two extremes.
The idea is to design so that your "typical operating conditions"
involve a relatively small number of threads active at any one time.
Having extra ones running isn't a catastrophe, it just means that things
aren't quite as efficient as they otherwise might be.
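
For the compute-bound extreme, one way to follow this advice is to start only
about as many threads as there are processors and let them share the work.
The sketch below assumes sysconf(_SC_NPROCESSORS_ONLN) is available (it is
common but not required by POSIX); job_queue, pick_worker_count and
sum_of_squares are names made up for the example, and squaring a number
stands in for real work.

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

#define MAX_WORKERS 64

struct job_queue {
    pthread_mutex_t lock;
    int  next_item;     /* next work item to hand out */
    int  total_items;
    long sum;           /* accumulated result */
};

static void *pool_worker(void *arg)
{
    struct job_queue *q = arg;

    for (;;) {
        pthread_mutex_lock(&q->lock);
        if (q->next_item >= q->total_items) {
            pthread_mutex_unlock(&q->lock);
            break;
        }
        int item = q->next_item++;
        pthread_mutex_unlock(&q->lock);

        long result = (long)item * item;    /* stand-in for CPU-bound work */

        pthread_mutex_lock(&q->lock);
        q->sum += result;
        pthread_mutex_unlock(&q->lock);
    }
    return NULL;
}

/* One worker per online processor, clamped to [1, MAX_WORKERS]. */
int pick_worker_count(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    if (n < 1) n = 1;
    if (n > MAX_WORKERS) n = MAX_WORKERS;
    return (int)n;
}

long sum_of_squares(int total_items)
{
    struct job_queue q;
    pthread_t tids[MAX_WORKERS];
    int started = 0;
    int n = pick_worker_count();

    assert(total_items >= 0);
    pthread_mutex_init(&q.lock, NULL);
    q.next_item = 0;
    q.total_items = total_items;
    q.sum = 0;

    for (int i = 0; i < n; i++)
        if (pthread_create(&tids[started], NULL, pool_worker, &q) == 0)
            started++;
    if (started == 0)
        pool_worker(&q);                /* no threads: do the work inline */
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    pthread_mutex_destroy(&q.lock);
    return q.sum;
}
```

A server whose threads mostly sleep on sockets would size things very
differently; this only covers the "hammering on the processor" case.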

-- 

------------------------------------------------------------------------
Webb Scales                                Digital Equipment Corporation
scales@wtfn.enet.dec.com                   110 Spit Brook Rd, ZKO2-3/Q18
Voice: 603.881.2196, FAX: 603.881.0120     Nashua, NH 03062-2711
         Rule #12:  Be joyful -- seek the joy of being alive.
------------------------------------------------------------------------

=================================TOP===============================
 Q13: Is there an atomic mutex_unlock_and_wait_for_event()?  

Is it possible for a thread to release a mutex and begin
waiting on an "event" in one atomic operation?  I can think of a few
convoluted ways to achieve or simulate this, but am wondering if
there's an easy solution that I'm missing.


A:

This isn't how you'd really want to look at things (POSIX). Figure out what
condition you're interested in and use a CV.
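
As a sketch of what "use a CV" can mean here: pthread_cond_wait() releases
the mutex and starts waiting as one step, and re-acquires the mutex before it
returns.  A one-shot "event" built on that might look like this.  The struct
and the names event_init, event_set, event_wait and event_demo are invented
for the example.

```c
#include <assert.h>
#include <pthread.h>

struct event {
    pthread_mutex_t lock;
    pthread_cond_t  cv;
    int             signaled;   /* the condition we actually care about */
};

void event_init(struct event *ev)
{
    pthread_mutex_init(&ev->lock, NULL);
    pthread_cond_init(&ev->cv, NULL);
    ev->signaled = 0;
}

void event_set(struct event *ev)
{
    pthread_mutex_lock(&ev->lock);
    ev->signaled = 1;
    pthread_cond_broadcast(&ev->cv);    /* wake every waiter */
    pthread_mutex_unlock(&ev->lock);
}

void event_wait(struct event *ev)
{
    pthread_mutex_lock(&ev->lock);
    /* The wait gives up the lock and sleeps atomically; the loop guards
     * against spurious wakeups and against the event being set early. */
    while (!ev->signaled)
        pthread_cond_wait(&ev->cv, &ev->lock);
    pthread_mutex_unlock(&ev->lock);
}

static void *setter(void *arg)
{
    event_set(arg);
    return NULL;
}

/* Start a thread that sets the event, wait for it, and report the flag. */
int event_demo(void)
{
    struct event ev;
    pthread_t tid;

    event_init(&ev);
    if (pthread_create(&tid, NULL, setter, &ev) != 0)
        return -1;
    event_wait(&ev);
    pthread_join(tid, NULL);
    assert(ev.signaled == 1);
    return ev.signaled;
}
```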

	=================================TOP===============

The NT4.0 beta has a new Win32 API, SignalObjectAndWait that will do what you
want. Sorry, it is not available in 3.51 or earlier.
    -John

Robert V. Head
=================================TOP===============================
 Q14: Is there an archive of this newsgroup somewhere?  

I believe http://www.dejanews.com keeps a 1 year record of every
newsgroup on the Usenet.  You can search it by author to get your
articles, then pick out individual threads...

=================================TOP===============================
 Q15: Can I copy pthread_mutex_t structures, etc.?  

"Ian" == Ian Emmons  writes:
In article <32D4149F.588B@persistence.com> Ian Emmons  writes:

Ian> Variables of the data type pthread_t are, semantically speaking, a sort of 
Ian> reference, in the following sense:

Ian> 	pthread_t tid1;
Ian> 	pthread_t tid2;
Ian> 	void* ret_val;

Ian> 	pthread_create(&tid1, NULL, some_function, NULL);
Ian> 	// Now tid1 references a new thread.
Ian> 	tid2 = tid1;
Ian> 	// Now tid2 references the same thread.
Ian> 	pthread_join(tid2, &ret_val);

Ian> In other words, after creating the thread, I can assign from one pthread_t 
Ian> to another, and they all reference the same thread.  Pthread_key_t's (I 
Ian> believe) behave the same way.

	You should not copy one pthread_t structure to another pthread_t
...  it may not be portable.  In some implementations the pthread_t is
not simply a structure containing only a pointer and some keys .... it
is in fact the REAL structure, and copying it would create two independent
structures which can each be manipulated individually, wreaking havoc.

Ian> An attributes object, like pthread_attr_t (or an unnamed semaphore sem_t), 
Ian> on the other hand does not behave this way.  It has value semantics, because 
Ian> you can't copy one into another and expect to have a second valid attribute 
Ian> object.

Ian> My question is, do pthread_mutex_t's and pthread_cond_t's behave as 
Ian> references or values?

	Same statement .... I have seen enough problems where someone copied
an initialized lock then continued to lock the two mutexes independently
creating very unwanted behavior.

-- 
William E. Hannon Jr.                         internet:whannon@austin.ibm.com
AIX/DCE Technical Lead                                         whannon@austin
Austin, Texas 78758     Department ATKS/9132     Phone:(512)838-3238 T/L(678)
'Confidence is what you had, before you understood the situation.' Dr. Dobson


FOLLOWUP: For most programs, you should be passing pointers around, not
structures:


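#include <pthread.h>

pthread_mutex_t my_lock = PTHREAD_MUTEX_INITIALIZER;

/* Takes a pointer, so it locks the caller's mutex and not a copy of it. */
void foo(pthread_mutex_t *m)
{
    pthread_mutex_lock(m);
    ...
    pthread_mutex_unlock(m);
}

int main(void)
{
    ...
    foo(&my_lock);
    ...
    return 0;
}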
=================================TOP===============================
 Q16: After 1800 calls to thr_create() the system freezes. ??  

My problem is that the thread does not get freed or released back to the
system for reuse.  After 1800 calls to thr_create() the system freezes. ??
A: The default for threads in both UI and POSIX is for threads to be
   "undetached" -- meaning that they MUST be joined (thr_join()).  Otherwise
   they will not be garbage collected.  (This default is the wrong choice.  Oh
   well.)
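
   If nobody is ever going to join the thread, the usual way out is to create
   it detached.  Here is a minimal POSIX sketch (with UI threads the
   corresponding thr_create() flag is THR_DETACHED).  The helper names
   spawn_detached, short_task and run_detached_tasks are made up for the
   example.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t done_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  done_cv   = PTHREAD_COND_INITIALIZER;
static int             done_count;

/* Trivial task: report completion and exit.  Nobody joins this thread. */
static void *short_task(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&done_lock);
    done_count++;
    pthread_cond_signal(&done_cv);
    pthread_mutex_unlock(&done_lock);
    return NULL;
}

/* Create a thread that is detached from birth, so its resources can be
 * reclaimed when it exits.  Returns 0 or an error number. */
int spawn_detached(void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    pthread_t tid;
    int rc = pthread_attr_init(&attr);

    if (rc != 0)
        return rc;
    rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    if (rc == 0)
        rc = pthread_create(&tid, &attr, fn, arg);
    pthread_attr_destroy(&attr);
    return rc;
}

/* Start n detached tasks and wait until every started task has reported.
 * Returns how many completed. */
int run_detached_tasks(int n)
{
    int started = 0, completed;

    assert(n >= 0);
    pthread_mutex_lock(&done_lock);
    done_count = 0;
    pthread_mutex_unlock(&done_lock);

    for (int i = 0; i < n; i++)
        if (spawn_detached(short_task, NULL) == 0)
            started++;

    pthread_mutex_lock(&done_lock);
    while (done_count < started)
        pthread_cond_wait(&done_cv, &done_lock);
    completed = done_count;
    pthread_mutex_unlock(&done_lock);
    return completed;
}
```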
=================================TOP===============================
 Q17: Compiling libraries which might be used in threaded or unthreaded apps?  


   What *is* the straight scoop on how to compile libraries which 
   might be used in threaded or unthreaded apps?  Hopefully the 
   "errno" and "putc()" macros will continue to work even if
   libthread isn't pulled in, so that vendors can make a single
   version of any particular library.

A: Always compile *all* libraries with the reentrancy flag (_REENTRANT for
   UI threads, _POSIX_C_SOURCE=199506L for POSIX threads). Otherwise some 
   poor soul will try to use your library and get hammered.  putc() and
   getc() WILL be slower, but you may use putc_unlocked() & getc_unlocked()
   if you know the I/O stream will be used safely.

   All Solaris libraries are compiled like this.
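
   One way to "know the I/O stream will be used safely" is to hold the stream
   lock yourself around a run of unlocked calls.  A minimal sketch, assuming
   the POSIX flockfile()/funlockfile() pair is available; put_line is a name
   made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>

/* Write s plus a newline while holding the stream lock once, instead of
 * taking and dropping it for every character as putc() would.
 * Returns the number of characters written. */
int put_line(FILE *out, const char *s)
{
    int count = 0;

    assert(out != NULL && s != NULL);
    flockfile(out);
    for (; *s != '\0'; s++) {
        if (putc_unlocked((unsigned char)*s, out) == EOF)
            break;
        count++;
    }
    if (*s == '\0' && putc_unlocked('\n', out) != EOF)
        count++;
    funlockfile(out);
    return count;
}
```

   Because the lock is held for the whole line, output from two threads
   calling put_line() on the same stream cannot interleave mid-line.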
=================================TOP===============================
 Q18: What's the difference of signal handling for process and thread?   

   How does signal handling differ between processes and threads? Are
   signals divided into process-based and thread-based types which are
   treated differently in HP-RT? Are there any examples? I'd like to know how to
   initiate, mask, block, wait for, catch, ...... the signals. How can I set the
   notification list (process or thread?) of SIGIO for both socket and tty
   using fcntl or ioctl? 

A: You probably want to buy one of the books that discuss this in detail.
   Here's the short answer:



	Signal masks are per-thread.
	But the signal handlers are per-process.
	Synchronous signals like SIGSEGV, SIGILL etc. will be
	processed by the thread which caused the signal.

	The other signals will be handled by any ready thread which
	does not have the signal blocked in its mask.

	There is no special thread library for signal handling.
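
	A common pattern that follows from these rules is to block a signal
	in every thread and dedicate one thread to waiting for it with
	sigwait().  Here is a minimal POSIX sketch; the names signal_waiter
	and wait_for_usr1_demo are made up, and sending the signal to
	ourselves with kill() is only there to make the example
	self-contained.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Dedicated thread: synchronously wait for a signal in the given set and
 * hand the signal number back through the thread's return value. */
static void *signal_waiter(void *arg)
{
    sigset_t *set = arg;
    int sig = 0;

    if (sigwait(set, &sig) != 0)
        sig = -1;
    return (void *)(intptr_t)sig;
}

/* Returns the signal number received by the waiter thread, or -1. */
int wait_for_usr1_demo(void)
{
    sigset_t set;
    pthread_t tid;
    void *result = NULL;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);

    /* Block first: threads created afterwards inherit this mask, so no
     * thread takes SIGUSR1 asynchronously. */
    if (pthread_sigmask(SIG_BLOCK, &set, NULL) != 0)
        return -1;
    if (pthread_create(&tid, NULL, signal_waiter, &set) != 0)
        return -1;

    kill(getpid(), SIGUSR1);    /* process-directed; stays pending if early */
    pthread_join(tid, &result);
    assert(result != NULL);
    return (int)(intptr_t)result;
}
```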
=================================TOP===============================
 Q19: What about creating large numbers of threads?  

I've asked a question about creating 2500 unbound threads. Since then,
I have written some more test programs. I hope you can help me
solve some more problems.

1. I have written a program that creates 10 threads. Then the 10 threads
each create 10 more threads. The 100 newly created threads each create 10
more threads. On a SPARC 2000, if the concurrency level is 100, the program
takes 7 seconds to terminate. In a paper, unbound thread creation is
claimed to take only 56 usec. How come my test program is so slow on a
SPARC 2000 that has 20 CPUs? If I use a SPARC 10, the program only takes 1
second to terminate. Is a SPARC 2000 slower than a SPARC 10?

2. Instead of creating 2500 threads, I have written a program that creates
200 threads and then kills them all and creates 200 threads and kills them
all and ..... After a while of creating and killing, the program hangs. I
use sigaction to set a global signal handler for the whole process. As the
program is so simple, I don't know where the problem is.

3. In addition, I have written a program that creates 1000 bound
threads. Each thread has a simple loop:

		while (1)
		{
			randomly read an entry of an array
		}

   This time, not only my program hangs, the whole SPARC 2000 hangs. I can't
reset the machine from console. Finally, I have to power down the machine.

Thanks in advance.


A:
=================================TOP===============================
 Q20: What about using sigwaitinfo()?  

>Here is what I am doing.  I am using the early access POSIX threads.
>My main program blocks SIGUSR1 and creates a number of threads.
>One of these threads is dedicated to this signal.  All it does is a
>sigwaitinfo on this signal, sets a flag when it returns, and exits.
>If I send the SIGUSR1 signal to the process using the kill command
>from another window, it does not seem to get it and the other threads
>(which are doing a calculation in a loop) report that SIGUSR1 is not
>pending.
>
>An earlier version of the program which used a signal handler to set
>the flag worked perfectly.
>
>Do you have any ideas on this?

A:

I assume you are using sigwaitinfo(3r) from libposix4.
Unfortunately, sigwaitinfo() is not MT-safe, i.e. does not work correctly
in an MT program, on 2.3/2.4. Use sigwait(2) - it should work on 2.3/2.4.
On 2.5 beta, sigwaitinfo() works.

If you really need the siginfo on 2.3/2.4, it is going to be hard, and the 
solution depends on whether you are running 2.3 or 2.4, but here is an 
alternative suggestion:

Programmers have used signals between processes as an IPC mechanism. Sounds
like you are trying to do the same. If this is the case, I would strongly
suggest that you use shared memory (see mmap(2)) between processes and
shared memory synchronization (using the SysV shared semaphores - see
semop(2)), or POSIX synchronization objects with the PTHREAD_PROCESS_SHARED
attribute. For example, you can set-up a region of shared memory protected
by a mutex and condition variable. The mutex and condition variable would
also be allocated from the shared memory and would be initialized with the
PTHREAD_PROCESS_SHARED attribute. Now, processes which share this memory
can use the mutex and condition variable as IPC mechanisms - any information
that needs to be passed between them can be passed through the shared
memory (alternative to siginfo :-)). To make this asynchronous, you can
have a thread dedicated to monitoring the shared memory area by waiting
on the condition variable. Now, whenever the signalling process wants to
send a signal, it instead issues a cond_signal on the condition variable.
The thread sleeping on this in the other (receiving) process wakes up
now and processes the information.
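
A minimal sketch of that arrangement follows.  To stay self-contained it
uses an anonymous shared mapping and fork() rather than a file mapped by two
unrelated processes (MAP_ANONYMOUS is widely available but not in POSIX), and
it assumes the system supports PTHREAD_PROCESS_SHARED.  The mailbox struct
and shared_mailbox_demo are names invented for the example.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

struct mailbox {
    pthread_mutex_t lock;
    pthread_cond_t  cv;
    int             ready;      /* set once payload is valid */
    int             payload;    /* the "siginfo" being passed */
};

/* The child process posts value into shared memory; the parent waits on
 * the condition variable and returns what it received (or -1 on error). */
int shared_mailbox_demo(int value)
{
    pthread_mutexattr_t ma;
    pthread_condattr_t  ca;
    struct mailbox *mb = mmap(NULL, sizeof *mb, PROT_READ | PROT_WRITE,
                              MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mb == MAP_FAILED)
        return -1;

    /* Both objects live in the shared region and are marked shared. */
    pthread_mutexattr_init(&ma);
    pthread_mutexattr_setpshared(&ma, PTHREAD_PROCESS_SHARED);
    pthread_mutex_init(&mb->lock, &ma);
    pthread_mutexattr_destroy(&ma);

    pthread_condattr_init(&ca);
    pthread_condattr_setpshared(&ca, PTHREAD_PROCESS_SHARED);
    pthread_cond_init(&mb->cv, &ca);
    pthread_condattr_destroy(&ca);
    mb->ready = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(mb, sizeof *mb);
        return -1;
    }
    if (pid == 0) {                     /* child: the "signalling" process */
        pthread_mutex_lock(&mb->lock);
        mb->payload = value;
        mb->ready = 1;
        pthread_cond_signal(&mb->cv);
        pthread_mutex_unlock(&mb->lock);
        _exit(0);
    }

    pthread_mutex_lock(&mb->lock);      /* parent: the receiving process */
    while (!mb->ready)
        pthread_cond_wait(&mb->cv, &mb->lock);
    int got = mb->payload;
    pthread_mutex_unlock(&mb->lock);

    waitpid(pid, NULL, 0);
    pthread_cond_destroy(&mb->cv);
    pthread_mutex_destroy(&mb->lock);
    munmap(mb, sizeof *mb);
    assert(got == value);
    return got;
}
```

In a real program the waiting loop would sit in its own dedicated thread, as
described above, so the rest of the receiving process keeps running.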

In general, signal handlers and threads, even though the system might support
this correctly, should not be used together. Signal handlers could be
looked upon as "substitute threads" when threads were not around in UNIX, 
and now that they are, the interactions between them can be complicated. 
You should mix them together only if absolutely necessary.

=================================TOP===============================
 Q21: How can I have an MT process communicate with many UP processes?  

>I have a multithreaded process, each thread in the multithreaded
>process wants to communicate with another single-threaded process,
>what is the good way to do that?
>
>Assume each thread in the multithreaded process is identical, i.e.
>they are generated using the same function call and each thread
>creates a shared memory segment to do the communication, will the generated
>shared memory segments operate independently if no synchronization is provided?  

A:


  It sounds like you have the right idea.  For each thread/process pair,
build a shared memory segment and use that for communications.  You'll need
some sort of synchronization variable in that shared segment for
coordination.  

  There is no interaction between segments whatsoever.
=================================TOP===============================
 Q22: Writing Multithreaded code with Sybase CTlib ver 10.x?  


>A customer is trying to write a multi-threaded application that also
>uses Sybase CTlib ver 10.x, and he is facing some limitations due to
>the Sybase library. 
>
>BOTTOM LINE: CTlib is reentrant, but according to Sybase is not usable
>in a multi-threaded context. That means it does NOT seem to be usable
>in an MT application.
>
>The purpose of this mail is NOT to get a fix for CTlib, but to try to
>find a workaround, if one exists...

A:

The workaround for the moment is to use the XA library routines from
Sybase, which are, in turn, based upon the TransArc package pthread*
routines.

We should be getting an alpha version of MT safe/hot CTlib towards the first
part of June 1995.  Also of potential interest is there will also be an early
version of native-threaded OpenServer soon as well, which really opens
up a lot of possibilities.

Chris Nicholas
SunSoft Developer Engineering
--------------------------------------------------------------
=================================TOP===============================
 Q23: Can we avoid preemption during spin locks?  

> 	A while ago I asked you for information on preemption control
> interfaces (in-kernel) which might be available in Solaris2.x. I am
> looking for ways of lowering the number of context switches taken as the
> result of adaptive mutex contention. We have a number of places where a
> lock is taken and held for a few scant lines of C. It would be great
> to prevent preemption during these sections of code.

A:

  You're obviously writing a driver of some sort. (Video driver I'd guess?)
And you're VERY concerned with performance on *MP* machines (UPs be damned).
You have tested your code on a standardized, repeatable benchmark, and you
are running into a problem.  You have solid numbers which you are absolutely
certain of.  Right?

  You'll have to excuse my playing the heavy here, but you're talking deep
do-do here, and I don't want to touch it unless I'm totally convinced I (and
you) have to.

  You could set the SPL up to turn off all interrupts.  It would slow your
code down quite a bit though.  The probability of preemption occurring over "a
few scant lines of C" (i.e., a few dozen instructions) approaches zero.
Regularly suffering from preemption during just these few instructions would
be a VERY odd thing.  I am hard pressed to INVENT a situation like this.
Are you absolutely, totally, completely, 100% certain you're seeing this?
Are you willing to put $10 on it?

=================================TOP===============================
 Q24: What about using spin locks instead of adaptive spin locks?  
> 
> 	I also would like to know more about something I saw in
> /usr/include/sys/mutex.h. It would appear that it is possible to 
> create pure spinning locks (MUTEX_SPIN) as opposed to the default 
> adaptive mutexes (MUTEX_ADAPTIVE_STAT). These might provide the kind 
> of control I am looking for assuming that these are really supported 
> and not some bastard orphan left over.

A:

  If I understand the question, the answer is "no".  That's what an adaptive
mutex is for.  It optimizes a spin lock to sleep if there's no value in
spinning.  If you use a dumb spin lock instead, you are GUARANTEED to run
slower.
=================================TOP===============================
 Q25: Will thr_create(...,THR_NEW_LWP) fail if the new LWP cannot be added?  

>	Does Sun's implementation of thr_create(...,THR_NEW_LWP) fail
>to create the multiplexed thread if the new LWP cannot be added to the
>multiplexing pool?  The unixware docs indicate Novell's implementation
>of thr_create() uses THR_NEW_LWP as a hint to the implementation to
>increase the pool size.  They also do not state the behavior if the
>new lwp cannot be created.  What is the official statement?

A:

  It should not create a new thread if it returns EAGAIN.  Mind you, you're
fairly unlikely EVER to see this happen in a real program.  (You'll see it
in bugs & in testing/design.)
=================================TOP===============================
 Q26: Is the LWP released upon bound thread termination?  

>  In the sun implementation, if you create a bound
>thread, and the thread eventually terminates, is the LWP released
>upon termination, or upon thr_join with the terminated thread?

A:

  Yes, a bound thread's LWP is released.  This should not affect your
programming at all.  Use thr_setconcurrency() & leave it at that.
=================================TOP===============================
 Q27: What's the difference between pthread FIFO and the Solaris threads scheduling?  

A:  Very little.

=================================TOP===============================
 Q28: I really think I need time-sliced RR.  

>Well, I really think I need time-sliced RR, since I'm making a
>multithreaded implementation of a functional concurrent
>process-oriented language. MT support is needed to get good usage
>of multi-CPU machines and better realtime. Today processes are custom 
>user-level and the runtime system does the scheduling. And the
>language semantics are that processes are timesliced RR.
>Changing the semantics is not realistic. I really hope pthreads
>will spec RR timeslicing, it would make things easier.

A:

  Think VERY carefully.  When will you ever *REQUIRE* RR scheduling?  And
why?  Remember, you've never had it ever before, so why now?  (There may be
a reason, but it had better be good.)  Scheduling should normally be
invisible, and forcing it up to user-awareness is generally a bad thing.

>For the moment, since this will only be a prototype, bound threads
>will do, but not in a real system with a couple of hundred
>threads/processes.
>
>Convince me I don't need RR timeslicing, that would make things easier.
>Or how do I make my own scheduler in solaris, or should I stay with
>bound threads?

  OK.  (let me turn it around) Give one example of your program which will
fail should thr 3 run before thr 2 where there is absolutely NO
synchronization involved.  With arbitrary time-slicing of course.  I can't
think of an example myself.  (It's just such a weird dependency that I
can't come up with it.  But I don't know everything...)
=================================TOP===============================
 Q29: How important is it to call mutex_destroy() and cond_destroy()?  

Here is how I init several of my threading variables:

    mutex_init( &lock, USYNC_PROCESS, 0 );
    cond_init( &notBusy, USYNC_PROCESS, 0 );
   
The storage for the variables is in memory mapped file. once I have
opened the file, I call unlink to make sure it will be automatically
cleaned up. How important is it to call mutex_destroy() and
cond_destroy()? Will I wind up leaking some space in the kernel if I
do not call these functions?

A:
=================================TOP===============================
 Q30: EAGAIN/ENOMEM etc. apparently aren't in ?!  

A:
  'Course not.  :-)

  They're in errno.h.  pthread_create() will return them if something goes
wrong.  Be careful: errno is NOT set by the threads calls; the error number
comes back as the function's return value.
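
A minimal sketch of checking the return value rather than errno.  The
names worker() and run_worker() are made up for this illustration; only
pthread_create(), pthread_join() and strerror() are real library calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical thread body: hands its argument straight back. */
static void *worker(void *arg)
{
    return arg;
}

/*
 * Start one thread and wait for it.  Returns 0 on success, or the error
 * number (EAGAIN, EINVAL, ...) that the threads call itself returned.
 * errno is never consulted.
 */
int run_worker(void *arg, void **result)
{
    pthread_t tid;
    int err = pthread_create(&tid, NULL, worker, arg);

    if (err != 0) {
        /* The return value is the error code; decode it directly. */
        fprintf(stderr, "pthread_create: %s\n", strerror(err));
        return err;
    }
    return pthread_join(tid, result);
}
```

A caller would compare the result against the errno.h constants, for
example `if (run_worker(p, &out) == EAGAIN) ...`.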
=================================TOP===============================
 Q31: What can I do about TSD being so slow?  
 Q32: What happened to the pragma 'unshared' in Sun C?  

   I read about a pragma 'unshared' for the C-compiler in some Solaris-thread
   papers. The new C-3.01 doesn't support the feature anymore, I think. There is
   no hint in the Solaris 2.4 Multithread Programming Guide. But the new
   TSD is very slow. I tested a program with direct register allocation under
   gcc (asm "%g3") instead of calling the thr_getspecific procedure and it was 
   over three times faster. Can I do the same thing or something else with the 
   Sun C-compiler to make the C-3.01 Code also faster?

A:

The "thread local storage" feature that was mentioned in early papers
about MT on Solaris, and the pragma "unshared", were never
implemented.  I know what you mean about the performance of TSD.  It
isn't very fast.  I think the key here is to try to structure your
program so that you don't rely too much on thread specific data, if
that's possible.
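
One way to lean less on TSD is to look the per-thread pointer up once
and then work through the plain pointer.  The sketch below assumes the
POSIX TSD calls (pthread_getspecific() and friends) rather than the
thr_getspecific() call from the question; the per_thread type and the
function names are invented for the illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdlib.h>

typedef struct { long counter; } per_thread;   /* hypothetical per-thread state */

static pthread_key_t key;
static pthread_once_t once = PTHREAD_ONCE_INIT;

/* Runs exactly once; a real program should check the return value. */
static void make_key(void)
{
    (void)pthread_key_create(&key, free);
}

/* Slow path: one TSD lookup (plus allocation on first use) per call. */
per_thread *get_state(void)
{
    per_thread *st;

    pthread_once(&once, make_key);
    st = pthread_getspecific(key);
    if (st == NULL) {
        st = calloc(1, sizeof *st);
        if (st != NULL && pthread_setspecific(key, st) != 0) {
            free(st);
            st = NULL;
        }
    }
    return st;
}

/* Hot loop: fetch the state once, then use the pointer directly. */
long count_to(long n)
{
    per_thread *st = get_state();
    long i;

    if (st == NULL)
        return -1;
    for (i = 0; i < n; i++)
        st->counter++;          /* no TSD call inside the loop */
    return st->counter;
}
```

The same idea applies to passing the pointer down as a function
argument instead of having every callee fetch it again.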

The SPARC specification reserves those %g registers for internal use.
In general, it's dangerous to rely on using them in your code.
However, SC3.0.1 does not use the %g registers in any user code. It
does use them internally, but never across function calls, and never
in user code.  (If you do use the %g registers across function calls,
be sure to save and restore the registers.)

You can accomplish what gcc does with the "asm" statement by writing
what we call an "inline function template."  Take a look at the math
library inline templates for an idea on how to do that, and see the
inline() man page.  You might also want to take a look at the
AnswerBook for SPARC Assembly Language Programming, which is found in
the "Solaris 2.x Software Developer Answerbook".  The latest part
number for that is 

801-6649-10     SPARC ASSEMBLY LANGUAGE REFERENCE MANUAL REV.A AUG 94

The libm templates are found in /opt/SUNWspro/SC3.0.1/lib/libm.il.
Inline templates are somewhat more work to write, as compared to using
gcc's "asm" feature, but, it's safer.  I don't know about the
robustness of code that uses "asm" - I like gcc, and I use it, but
that particular feature can lead to interesting bugs.

Our next compiler, SC4.0 (coming out in late 1995) will use the %g
registers more aggressively, for performance reasons.  (Having more
registers available to the optimizer lets them do more optimizations.)
There will be a documented switch, -noregs=global (or something like
that) that you will use to tell the SC4.0 NOT to use the global
registers.    When you switch to SC4.0, be sure to read the cc(1) man
page and look for that switch.  
=================================TOP===============================
 Q33: Can I profile an MT-program with the debugger?  

   Can I profile an MT-program with the debugger and a special MT-license
   or do I need the thread-analyser?

A:

The only profiling you can do right now for an MT program is what you
get with the ThreadAnalyzer.  If you have the MT Debugger and SC3.0.1,
then, you should also have a copy of the ThreadAnalyzer (it was first
shipped on the same CD that had SC3.0.1) Look for the binary "tha"
under /opt/SUNWspro/bin.  

The "Collector" feature that you can use from inside the Debugger
doesn't work with MT programs.  Future MT-aware-profiling tools will
be integrated with the Debugger - is that where you'd like to use
profiling?
=================================TOP===============================
 Q34: Sometimes the specified sleep time is SMALLER than what I want.  

>I have a program that generates UDP datagrams at regular intervals.
>It uses real time scheduling for improved accuracy.
>(The code I used is from the Solaris realtime manual.)
>
>This helps, but once in a while I do not get the delay I wanted.
>The specified sleep time is SMALLER (i.e. faster) than what I want.
>
>I use the following procedure for microsecond delays
>
>void
>delay(int us) /* delay in microseconds */
>{
>	struct timeval tv;
>
>	tv.tv_sec = us / 1000000;
>	tv.tv_usec = us % 1000000;
>	(void)select( 0, (fd_set *)NULL, (fd_set *)NULL, (fd_set *)NULL, &tv );
>
>}
>
>
>As I said, when I select a delay, occasionally I get a much smaller delay.
>
>examples:
>	Wanted: 19,776 microseconds, got: 10,379 microseconds
>	Wanted:    910 microseconds, got:    183 microseconds
>
>
>As you can see, the error is significant when it happens.
>It does not happen often. (0.5% of the time)
>
>I could use the usleep() function, but that's in the UCB library.
>Anyone have any advice?

A:

First of all, you cannot do a sleep implementation in any increments
other than 10 milliseconds (or, more precisely, 1/HZ seconds).

Second, there is a bug in the scheduler (fixed in 2.5) that may
mess up your scheduling about once in every 300,000 schedules
or so. 

Third, a much better timing interface will be available in
Solaris 2.6 (or maybe earlier) through POSIX interfaces. That
should give you microsecond resolution with less than 
50 microseconds latency.

Sinan
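
For reference, a sketch of the same delay() written against the POSIX
nanosleep() interface.  This assumes nanosleep() is available on your
release; the name delay_us() is made up here, and nothing in it is
claimed to cure the short sleeps reported above.  It does restart the
sleep if a signal cuts it short.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/*
 * Sleep for at least `us` microseconds.  Returns 0 on success and -1 on
 * error.  If a signal interrupts the sleep, continue with the time that
 * nanosleep() reports as remaining.
 */
int delay_us(long us)
{
    struct timespec req, rem;

    if (us < 0)
        return -1;
    req.tv_sec = us / 1000000L;
    req.tv_nsec = (us % 1000000L) * 1000L;

    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;
        req = rem;              /* interrupted: sleep the remainder */
    }
    return 0;
}
```

The actual resolution still depends on the system clock tick, as the
answer above points out.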
=================================TOP===============================
 Q35: Any debugger that can single-step a thread while the others are running?  

|>  Has anyone looked into the possibility of doing a MT debugger
|> that will allow you to single step a thread while the others
|> are running? This will probably require a debugger that attaches
|> a debugger thread to each thread...

A:

This was the topic of my master's thesis. You might check:

http://plg.uwaterloo.ca/~mkarsten

and follow the link to the abstract or the full version.

Martin
	=================================TOP=================

We have used breakpoint debugging to debug threads programs. We have
implemented a debugger that enables the user to write scripts to debug
programs (not limited to threads programs). This is made possible by a Tcl
interface atop gdb and hooks in gdb, that exports some basic debugger
internals to the user domain.  Thus allowing the user to essentially write
his own Application Specific debugger.

Please see the following web page for more info on the debugger

http://www.eecs.ukans.edu/~halbhavi/debugger.html
or
http://www.tisl.ukans.edu/~halbhavi/debugger.html

Cheers
Sudhir Halbhavi
halbhavi@eecs.ukans.edu
=================================TOP===============================
 Q36: Any DOS threads libraries?  

> Is there any way or does anyone have a library that will allow to program
> multithreads.. I need it for SVGA mouse functions.. I use both C++ and
> Watcom C++, 

A:

I use DesqView for my DOS based multi-thread programs.  (Only they don't call
them threads, they call them tasks....)  I like the DesqView interface to 
threads better than the POSIX/Solaris interface, but putting up with DOS was
almost more than I could stand.
=================================TOP===============================
 Q37: Any Pthreads for Linux?  

See: http://pauillac.inria.fr/~xleroy/linuxthreads/
http://sunsite.unc.edu/pub/Linux/docs/faqs/Threads-FAQ/html

Linux has kernel-level threads now and has had a thread-safe libc for a
while.  With LinuxThreads, you don't have to worry about things like your
errno, or blocking system calls. The few standard libc functions that are
inherently not thread safe (due to using static data areas) have been
augmented with thread-safe alternatives.

LinuxThreads are not (fully) POSIX, however. 
   
                   -----------------

I'm quite familiar with Xavier's package. He's done an awesome job given
what he had to work with. Unfortunately, the holes are large, and his
valiant attempts to plug them result in a larger and more complicated
user-mode library than should be necessary, without being able to
completely resolve the problems.

Linux uses clone() which is not "kernel-level threads", though, with
some proposed (and possibly pending) extensions in a future version of
the kernel, it could become that. Right now, it's just a way to create
independent processes that share some resources. The most critical
missing component is the ability to create multiple executable entities
(threads) that share a single PID, thereby making those entities threads
rather than processes.

Linuxthreads, despite using the "pthread_" prefix, is NOT "POSIX
threads" (aka "pthreads") because of the aforementioned substantial and
severe shortcoming of the current implementation based on clone().
Without kernel extensions, a clone()-based thread package on Linux
cannot come close to conforming to the POSIX specification. The common
characterization of Linuxthreads as "POSIX threads" is incorrect and
misleading. This most definitely is not "a true pthreads
implementation", merely a nonstandard thread package that uses the
"pthread" prefix.

Note, I'm not saying that's necessarily bad. It supports much of the
interface, and unlike user-mode implementations (which also tend to be
far more buggy than Linuxthreads), allows the use of multiple
processors.  Linuxthreads is quite useful despite its substantial
deficiencies, and many reasonable programs can be created and ported
using it. But it's still not POSIX.

=================================TOP===============================
 Q38: Any really basic C code example(s) to get us newbies started?  

>Could one of you threads gods please post some really, really basic C code
>example(s) and get us newbies started?  There just doesn't seem to be any other
>way for us to learn how to program using threads.

A:

The following is a compilation of all the generous help that was posted or mailed to me 
concerning the use of threads in introductory programs.  I apologize for it not being
edited very well...  (Now I just need time to go through all of these)

Here are all of the URLs:

http://www.pilgrim.umass.edu/pub/osf_dce/contrib/contrib.html
http://www.sun.com/workshop/threads
http://www.Sun.COM/smi/ssoftpress/catalog/books_comingsoon.html
http://www.aa.net/~mtp/


--Carroll
=================================TOP===============================
 Q39: Please put some Ada references in the FAQ.  

A:

Most Ada books will introduce threading concepts.  Also, check out Windows
Tech Journal, Nov. 95 for more info on this subject.
=================================TOP===============================
 Q40: Which signals are synchronous, and which are asynchronous?  

>I have another question. Since we must clearly distinguish the
>synchronous signals from the asynchronous ones for MT, is there any
>documentation on which is which? I could not find any.

A:

In general, independent of MT, this is an often misunderstood area of
signals.  The adjectives "synchronous" and "asynchronous" cannot be applied
to a signal itself.  This is because any signal (including normally
synchronously generated signals such as SIGSEGV) could be asynchronously
generated using kill(2), _lwp_kill(2) or thr_kill(3t).

e.g. SIGSEGV, which is normally synchronously generated, can also be sent
via kill(pid, SIGSEGV), in which case it is asynchronously generated. So
labelling SIGSEGV as synchronous would be incorrect, and so would a program
that assumes this.

For MT, a question is: would a thread that caused the generation of a signal
get this signal?

If this is posed for a trap (SIGSEGV, SIGBUS, SIGILL, etc.), the answer is:
yes - the thread that caused the trap would get the signal.  But the handler
for the trap signal, i.e. a SIGSEGV handler, for example, cannot assume that
the handler was invoked for a synchronously generated SIGSEGV (unless the
application knows that it could not have received a SIGSEGV via a kill(),
or thr_kill()).

If this question is posed for any other signal (such as SIGPIPE, or the
real-time signals) the answer should not really matter since the program
should not depend on whether or not the thread that caused the signal to be
generated, receives it. For traps, it does matter, but for any other signal,
it should not matter.

FYI: On 2.4 and earlier releases, SIGPIPE, and some other signals were sent
to the thread that resulted in the generation of the signal, but on 2.5, any
thread may get the signal. The only signals that are guaranteed to be sent
to the thread that resulted in its generation, are the traps (SIGILL,
SIGTRAP, SIGSEGV, SIGBUS, SIGFPE, etc.). This change should not matter since
a correctly written MT application would not depend on the synchronicity of
the signal generation for non-traps, given the above description of signal
synchronicity that has always been true.

-Devang
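
A handler that wants to know how a particular SIGSEGV was generated can
ask, rather than assume.  The sketch below is not part of the answer
above: it assumes the POSIX SA_SIGINFO interface, where an si_code
less than or equal to 0 means the signal was sent by a process (kill(),
raise() and so on) instead of being raised by a trap.  The function
names are invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t seen = 0;
static volatile sig_atomic_t from_process = 0;

/*
 * Three-argument handler: record whether the signal was sent by a
 * process (si_code <= 0) or generated by the system, e.g. a real trap.
 * NOTE: returning from a handler after a real fault re-executes the
 * faulting instruction; a real program must not simply return there.
 */
static void on_signal(int sig, siginfo_t *info, void *context)
{
    (void)sig;
    (void)context;
    from_process = (info->si_code <= 0);
    seen = 1;
}

/* Install the handler for `sig`.  Returns 0 on success, -1 on error. */
int watch_signal(int sig)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_signal;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(sig, &sa, NULL);
}

/* 1 if the last watched signal was sent by a process, else 0. */
int last_signal_was_sent(void)
{
    return seen && from_process;
}
```

After watch_signal(SIGSEGV), a raise(SIGSEGV) is reported as "sent",
which is exactly the asynchronous case described above.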
=================================TOP===============================
 Q41: If we compile -D_REENTRANT, but without -lthread, will we have problems?  

>Hi -
>
>I had posed a question here a few weeks ago and received a response. Since
>then the customer had some follow-on questions. Can anyone address this
>customer's questions:
>
>(note: '>' refers to previous answer we provided customer)
>
>> If only mutexes are needed to make the library mt-safe, the library writer 
>> can do the following to enable a single mt-safe library to be used by both 
>> MT and single-threaded programs:
>
>Actually, we are only using the *_r(3c) functions, such as strtok_r(3c),
>getlogin_r(3c), and ctime_r(3c).  We are not actually calling thr_*,
>mutex_*, cond_*, etc. in the libraries.
>
>We want to use these *_r(3c) library functions instead of the normal
>non-MT safe versions (such as strtok(), ctime(), etc.), but if we compile
>the object files with -D_REENTRANT, but do not link with -lthread, will
>we have problems?

A:


No - you will not have any problems, if you do not link with -lthread.

But if your library is linked into a program which uses -lthread, then:

You might have problems in a threaded program because of how you allocate 
and use the buffers that are passed in to the *_r routines.

The usage of the *_r routines has to be thread-safe, or re-entrant in
the library. The *_r routines take a buffer as an argument. If the library
uses a global buffer to be passed to these routines, and does not protect
this buffer appropriately, the library would be unsafe in a threaded program.

Note that here, the customer's library has to do one of the following to ensure
that their usage of these buffers is re-entrant:

- if possible, allocate the buffers off the stack - this would be per-thread
  storage and would not require the library to do different things depending
  on whether the library is linked into a threaded program or not.

- if the above is not possible:

On any Solaris release, the following may be done: (recommended solution):

	- use mutexes, assuming that threads are present, to protect the 
	  buffers. If the threads library is not linked in, there are dummy 
	  entry points in libc for mutexes which do nothing - and so this 
	  will compile correctly and still work. If the threads library is 
	  linked in, the mutexes will be real and the buffers will be 
	  appropriately protected.

On Solaris 2.5 only:

	- if you do not want to use mutexes for some reason and want to use
	  thread-specific data (TSD) if threads are present (say), then on 2.4
          you cannot do anything. On 2.5, though, one of the following may be 
	  done:
 
	(a) on 2.5, you can use thr_main() to detect if threads are linked in 
          or not. If they are, carry out appropriate TSD allocation of buffers.

	(b) If you are sure only POSIX threads will be used (if at all), and you
	  do not like the non-portability of thr_main() which is not a POSIX
	  interface, then, on 2.5, you can use the following (hack) to detect if
	  pthreads are linked in or not: you need the #pragma weak declaration 
	  so that you can check if a pthreads symbol is present or not. If 
	  it is, then pthreads are linked in, otherwise they are not. Following
	  is a code snippet which demonstrates this. You can compile it with
	  both -lpthread and without. If compiled without -lpthread it prints
	  out the first print statement. If compiled with -lpthread, it prints
	  out the second print statement. I am not sure if this usage of
	  #pragma weak is any more portable than using thr_main().

		#include <stdio.h>
		#include <pthread.h>

		#pragma weak pthread_create

		int main(void)
		{
			/* An unresolved weak symbol has address 0. */
			if (pthread_create == 0) {
				printf("libpthread not linked\n");
			} else {
				printf("libpthread is present\n");
				/*
				 * In this case, use Thread Specific Data
				 * or mutexes to protect access to the global
				 * buffers passed to the *_r routines.
				 */
			}
			return 0;
		}




-Devang
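
The first option in the list above (buffers allocated off the stack)
can be sketched as follows.  count_tokens() and its 256-byte limit are
invented for the illustration; strtok_r() is the real *_r routine.
Nothing here is global, so the function behaves the same whether or not
the threads library is linked in.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>

/*
 * Count the tokens in `text`.  Both the scratch copy and the strtok_r()
 * save pointer live on the caller's stack, so each thread gets its own.
 * Returns 0 if the text does not fit in the (hypothetical) 256-byte
 * scratch buffer.
 */
size_t count_tokens(const char *text, const char *delims)
{
    char copy[256];             /* per-thread: lives on this stack */
    char *save = NULL;          /* strtok_r() state, also per-thread */
    char *tok;
    size_t n = 0;

    if (strlen(text) >= sizeof copy)
        return 0;
    strcpy(copy, text);         /* strtok_r() modifies its input */

    for (tok = strtok_r(copy, delims, &save);
         tok != NULL;
         tok = strtok_r(NULL, delims, &save))
        n++;
    return n;
}
```

If the buffer has to be global instead, the mutex approach from the
list is the one to use.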

=================================TOP===============================
 Q42: Can Borland C++ for OS/2 give up a TimeSlice?  

Johan>    Does anyone know if Borland C++ for OS/2 has a function that could be 
Johan>    used within a THREAD to give up a TimeSlice.

A:

	If all you want to do is give up your timeslice
		DosSleep(0)
however if you are the highest priority thread, you will be immediately dispatched
again, before other threads.  Even when all the threads are the same priority,
my understanding is that the OS/2 operating system has a degradation algorithm
for the threads in a process ... so even if you DosSleep with the "same" priority
your thread still could be dispatched immediately --- depending on the
degradation algorithm.

	If you want to sleep to next clock tick
		DosSleep(1)
works, because the system rounds the 1 up to the next clock tick value.
This should allow other threads in your process to be dispatched.

	Both are valid semantics, depending on what you would prefer.
--
William E. Hannon Jr.                         internet:whannon@austin.ibm.com
DCE Threads Development                                        whannon@austin
=================================TOP===============================
 Q43: Are there any VALID uses of suspension?  

    UI threads, OS/2 and NT all allow you to suspend a thread.  I have yet to
  see a program which does not go below the API (ie debuggers, GCs, etc.), but
  still uses suspension.  I don't BELIEVE there is a valid use.  I could be
  wrong.

A:

I'll bite.  Whether we "go below the API" or not is for you to decide.
Our product, ObjectStore, is a client-server object-oriented database
system.  For the purpose of this discussion, it functions like a
user-mode virtual memory system: We take a chunk of address space
and use it as a window onto a database; if the user touches an address
within our special range, we catch the page fault, figure out which
database page "belongs" there, and read that page from the server.  After
putting the page into place, we continue the faulting instruction, which
now succeeds, and the user's code need never know that it wasn't there
all the time.

This is all fine for a single-threaded application.  There's a potential
problem for MT applications, however; consider reading a page from a
read-only database.  Thread A comes along and reads a non-existent page.
It faults, the fault enters our handler, and we do the following:
	get data from server
	make page read-write	;open window
	copy data to page
	make page read-only	;close window
During the window between the two page operations, another thread can
come along and read invalid data from the page, or in fact write the
page, with potentially disastrous effect.

On Windows and OS/2, we do the following:
	get data from server
	suspend all other threads
	make page read-write
	copy data to page
	make page read-only
	resume all other threads
to prevent the "window" from opening.  On OS/2, we use DosEnterCritSec,
which suspends all other threads.  On NT, we use the DllMain routine
to keep track of all the threads in the app, and we call SuspendThread
on each.  We're very careful to keep the interval during which threads
are suspended as brief as possible, and on OS/2 we're careful not to call
the C runtime while holding the critical section.

On most Unix systems, we don't have to do this, because mmap() has the
flexibility to map a single physical page into two or more separate
places in the address space.  This enables us to do this:
	get data from server
	make read-write alias of page, hidden from user
	copy data to alias page
	make read-only page visible to user
The last operation here is atomic, so there's no opportunity for other
threads to see bogus data.  There's no equivalent atomic operation on
NT or OS/2, at least not one that will operate at page granularity.
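
A rough sketch of the Unix aliasing idea, under stated assumptions: it
is NOT the ObjectStore code, the temp file name and publish_page() are
invented, and it uses a single mprotect() call as the step that makes
the page visible (the description above does not say which call is
used).  The same file page is mapped twice, filled through the hidden
read-write alias, and only then opened up read-only at the address the
user sees.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns a read-only page holding a copy of `data`, or NULL on failure. */
const char *publish_page(const char *data, size_t len)
{
    char path[] = "/tmp/alias-sketch-XXXXXX";   /* placeholder name */
    long pagesz = sysconf(_SC_PAGESIZE);
    char *user, *alias;
    int fd;

    if (pagesz <= 0 || len > (size_t)pagesz)
        return NULL;
    fd = mkstemp(path);
    if (fd == -1)
        return NULL;
    unlink(path);                       /* file vanishes when unmapped */
    if (ftruncate(fd, pagesz) == -1) {
        close(fd);
        return NULL;
    }

    /* User-visible address: mapped, but inaccessible until filled. */
    user = mmap(NULL, (size_t)pagesz, PROT_NONE, MAP_SHARED, fd, 0);
    /* Hidden read-write alias of the very same page. */
    alias = mmap(NULL, (size_t)pagesz, PROT_READ | PROT_WRITE,
                 MAP_SHARED, fd, 0);
    close(fd);
    if (user == MAP_FAILED || alias == MAP_FAILED) {
        if (user != MAP_FAILED)
            munmap(user, (size_t)pagesz);
        if (alias != MAP_FAILED)
            munmap(alias, (size_t)pagesz);
        return NULL;
    }

    memcpy(alias, data, len);           /* other threads cannot see this */
    munmap(alias, (size_t)pagesz);

    /* One call flips the user's page from no-access to read-only. */
    if (mprotect(user, (size_t)pagesz, PROT_READ) == -1) {
        munmap(user, (size_t)pagesz);
        return NULL;
    }
    return user;
}
```

The caller owns the returned page and would munmap() it when done.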
	=================================TOP==============
Since you do not like Suspend/Resume to be available to user level apis,
I thought the following set of functions (available to programs)
in WinNT (Win32) might catch your interest :) :

CreateRemoteThread -- allows you to start a thread in another process's
address space.. The other process may not even know you've done it
(depending on circumstances).  Supposedly, with full security turned
on (off by default!) this won't violate C2 security.

SetThreadContext/GetThreadContext - Just like it sounds.  You can
manipulate a thread's context (CPU registers, selectors, etc!).

Also, you can forcibly map a library (2-3 different ways: CreateRemoteThread
can allow this as well) to another process's address space (that is, you
can map a DLL of yours to a running process).  Then, you can do
things like spawn off threads, after you have invisibly mapped your DLL
into the space.  Yes, there is potential for abuse (and for interesting
programs).

But, microsoft has a use for these things.  They can help you subclass
a window on the desktop for instance.  If you wanted to make say
Netscape's main window beep twice every time it repaints, you could
map a DLL into netscape's address space, subclass the main window
(subclass == "Send ME the window's messages, instead of sending it to
the window -- i'll take care of everything!"), and watch for PAINTs
to come through.

Anyway, don't mean to waste your time.  Just thought you might find
it interesting that a user can start additional threads in someone else's
process, change thread context forcibly (to a decent degree), and
even latch onto a running process in order to change its behavior, or
just latch on period to run a thread you wrote in another process's
address space.




=================================TOP===============================
 Q44: What's the status of pthreads on SGI machines?  
>> We are considering porting a large application from a Concurrent Computer
>> symmetrical multiprocessor running RTU-6 to one of the Silicon Graphics
>> multiprocessors running IRIX (5.3?).
>> 
>> Our application uses threads heavily. Both so-called user threads and 
>> kernel threads are required with a fair level of synchronization 
>> primitives support and such.
>> 
>> My question is: what kind of multi-threaded application programming 
>> support is available in IRIX? 
>> 
>> Reading some of the SGI technical papers available on their WWW page 
>> just confuses me. I know that Posix threads or Solaris-type 
>> LWP/threads supports would be OK. 

A:

POSIX thread support in IRIX is more than a rumor - pthreads are currently 
scheduled to be available shortly after release of IRIX 6.2 (IRIX 6.2 is 
currently scheduled for release in Feb 96).  If you are interested in 
obtaining pthreads under IRIX as soon as possible, I would recommend 
contacting your local SGI office.
-- 
Bruce Johnson, SGI ASD                 
Real-time Applications Engineering          
=================================TOP===============================
 Q45: Does the Gnu debugger support threads?  

A:

An engineer at Cygnus is implementing thread support in gdb for Solaris.
No date for completion is given.
=================================TOP===============================
 Q46: What is gang scheduling?  

A:

Gang Scheduling is described in a variety of ways. Generally the
consistent thread is that a GS gives a process all the processors at
the same time (or none for a time slice). This is most helpful for
"scientific apps" because the most common set up is something like

	do i=1, bignum
	   stuff
	   more stuff
	   lots more stuff
	end do

the obvious decomposition is bignum/nproc statically allocated. Stuff
and friends take very close to the same time per chunk, so if you get
lucky it all happens in one chime (viz. one big clock). Else it takes
precisely N chimes with no leftovers. When unlucky, it's N chimes +
cleanup for stragglers.
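
The static bignum/nproc split can be written out as below.  This is
only an illustration of the arithmetic; the names are made up, and
spreading the remainder over the first few processors is one common
choice, not something the text above prescribes.

```c
#include <stddef.h>

typedef struct {
    size_t begin;   /* first iteration owned by this processor */
    size_t end;     /* one past the last iteration it owns */
} chunk;

/*
 * Static decomposition of iterations 0..bignum-1 over nproc processors.
 * Processor p (0-based) gets bignum/nproc iterations, and the first
 * bignum%nproc processors get one extra, so sizes differ by at most 1.
 * nproc must be nonzero.
 */
chunk static_chunk(size_t bignum, size_t nproc, size_t p)
{
    size_t base = bignum / nproc;
    size_t extra = bignum % nproc;
    chunk c;

    c.begin = p * base + (p < extra ? p : extra);
    c.end = c.begin + base + (p < extra ? 1 : 0);
    return c;
}
```

With bignum = 10 and nproc = 4 the chunks are [0,3), [3,6), [6,8) and
[8,10).  If all four processors really do run at the same time, the
loop finishes in roughly the time of the largest chunk.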

Virtually all supercomputers do this, they may not even bother to give
it a special name. SGI makes this explicit (and supported).

On SPARC/Solaris there is no way for the compiler to know if we'll get
the processors requested or when. So you can suffer multiple chime
losses quite easily.

One can reallocate processor/code on the fly, but with increased overhead.
=================================TOP===============================
 Q47: LinuxThreads linked with X11, calls to X11 seg fault. 


You can't rely on libraries that are not at the very least compiled
with -D_REENTRANT to do anything reasonable with threads.  A vanilla
X11 build (without -D_REENTRANT and without XTHREADS enabled) 
will likely behave badly with threads.  

It's not terribly hard to build X with thread support these days,
especially if you're using libc-6 with builtin LinuxThreads.  Contact
your Linux distribution maintainer and insist on it.  Debian has just 
switched to a thread-enabled X11 for their libc6 release; has any other
distribution? 

Bill Gribble
=================================TOP===============================
 Q48: Are there Pthreads on Win32?  

Several answers here.  #1 is probably the best (most recent!).


A: Yes, there is a GNU pthreads library for Win32.  It is still under
   active development, but you can find out more by looking at
   http://sourceware.cygnus.com/pthreads-win32/

   (This is a combination of Ben Elliston & John Bossom's work. & others?)


Also:

Well, Dave Butenhof will probably kill me for saying this, but Digital has a
pthreads implementation for WIN32. I bug them occasionally about packaging
up the header and dll and selling it separately (for a reasonable price, of
course). I think it's a great idea. My company has products on NT and UNIX,
so it would solve some painful portability issues for us.  This
implementation uses the same "threads engine" that Digital uses, rather
than just some wrappers on NT system services.

So, maybe if a few potential customers join me in asking Digital for this,
we'll get somewhere.  What say, Dave?

================

I have such a beast...sort of.

I have a pthreads draft 4 wrapper that is (nearly) complete and has been
in use for a while (so it seems to work!).

About 6 weeks back I changed this code to provide a draft 10 interface. This
code has however not yet been fully tested nor folded into my projects.
Casting my mind back (a lot has happened in 6 weeks!) I seem to remember
one or two small issues where I wasn't sure of the semantics; I was working
from a document I picked up at my last job which showed how to migrate
from pthreads 4 to pthreads 10, rather than a copy of the standard.

If anyone wants this code, I can make it available.

Ian
ian.johnston@ubs.com

		================
> > As far as I know, there is no pthreads implementation for NT.  However,
> > ACE provides a C++ threads wrapper which works on pthreads, and on NT
> > (and some others).
>
> Well, Dave Butenhof will probably kill me for saying this, but Digital has a
> pthreads implementation for WIN32. I bug them occasionally about packaging up
> the header and dll and selling it separately (for a reasonable price, of
> course). I think it's a great idea. My company has products on NT and UNIX,
> so it would solve some painful portability issues for us. This implementation
> uses the same "threads engine" that Digital uses, rather than just some
> wrappers on NT system services.

Yes, DECthreads has been ported to Win32 for quite a while. It runs on Windows
NT 3.51 and 4.0, on Alpha and Intel; and also on Windows 95 (though this was
not quite as trivial as Microsoft might wish us to believe.)

The main questions are:

  1. What's the market?
  2. How do we distribute the code, and at what cost? (Not so much "cost to the
     customer", as "cost to Digital".)

The big issue is probably that, especially with existing free code such as ACE,
it seems unlikely that there'd be much interest unless it was free or "dirt
cheap". Yet, even if we disclaim support, there will still be costs associated,
which means it'd be really tricky to avoid losing money.

> So, maybe if a few potential customers join me in asking Digital for this,
> we'll get somewhere.  What say, Dave?

We'd love to hear who wants this and why. Although I haven't felt comfortable
actually advertising the possibility here, I have forwarded the requests I've
seen here, and received via mail (including Jeff's) to my manager, who is the
first (probably of several) who needs to make any decisions.

I'd be glad to forward additional requests. Indications of what sort of product
(e.g., in particular, things like "sold for profit" or "internal utility"
distinctions), and, of course, whether (and how much) you'd be willing to pay,
would be valuable information.

/---------------------------[ Dave Butenhof ]--------------------------\

From: Ben Elliston 

Matthias Block  writes:

> is there someone who knows anything about a Pthread like library for
> Windows NT. It would simplify the work here for me.

I am involved with a free software project to implement POSIX threads
on top of Win32.   For the most part, it is complete, but it's still
well and truly in alpha testing right now.

I expect to be posting an announcement in a few weeks (say, 4?) to
comp.programming.threads.  The source code will be made available via
anonymous CVS for those who want to keep up to date or submit
patches.  I'm looking forward to getting some net testing!


Over the last several months I have seen some requests for
a Win32 implementation of PThreads.  I, too, had been looking
for such an implementation but to no avail.

Back in March, I decided to write my own. It is based upon the
PThreads 1003.1c standard, however, I didn't implement everything.
Missing is signal handling and real-time priority functions.

I based the implementation on the description provided by

    Programming with POSIX Threads, by
    Dave R. Butenhof

I've created a zipped file consisting of some header files, an implib, 
a DLL and a simple test program.

I'm providing this implementation for free and as-is. You may download it
from

    http://www.cyberus.ca/~jebossom/pthread1c.html

Cheers,

John


--
John E. Bossom                                     Cognos Inc.
Voice: (613) 738-1338 x3386        O_o             P.O. Box 9707, Stn. T
  FAX: (613) 738-0002             =(  )= Ack!      OTTAWA, ON  K1G 4K9
 INET:  jebossom@cognos.COM          U             CANADA
=================================TOP===============================
 Q49: What about garbage collection?  


Please, please, please mention garbage collection when you come around
to talking about making code multithreaded.  A whole lot of
heap-allocated data needs to be explicitly reference counted *even
more* in a multithreaded program than in a single threaded program
(since it is so much harder to determine whether data is live or not),
and this leads to lots of bugs and worries and nonsense.

With garbage collection, on the other hand, you get to throw away
*all* of your worries over memory management.  This is a tremendous
win when your brain is already approaching meltdown due to the strain
of debugging subtle race conditions.

In addition, garbage collection can help to make the prepackaged
libraries you link against safer to play with (although it obviously
won't help to make them thread safe).  Xt, for example, is very badly
written and leaks like a sieve, but a conservative garbage collector
will safely kill off those memory leaks.  If you're linking against
legacy libraries and you need to write a long-running multithreaded
server, GC can make the difference between buying more RAM and killing
your server every few days so that it doesn't thrash, and simply
plugging in the threads-aware GC and sailing fairly happily along.

Bryan O'Sullivan 

[Please see: Geodesic Systems (www.geodesic.com)     -Bil]
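
To make the reference-counting burden concrete, here is a minimal sketch
in C of a mutex-protected reference count.  The shared_buf type and the
buf_new, buf_retain and buf_release functions are hypothetical names
invented for this illustration; they do not come from any library
mentioned above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical reference-counted buffer shared between threads. */
typedef struct {
    pthread_mutex_t lock;   /* protects refs */
    int             refs;   /* number of live references */
    char           *data;
} shared_buf;

/* Allocate a buffer; the caller holds the first reference. */
shared_buf *buf_new(size_t size)
{
    shared_buf *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->data = malloc(size);
    if (b->data == NULL) {
        free(b);
        return NULL;
    }
    pthread_mutex_init(&b->lock, NULL);
    b->refs = 1;
    return b;
}

/* Take an additional reference before handing b to another thread. */
void buf_retain(shared_buf *b)
{
    pthread_mutex_lock(&b->lock);
    assert(b->refs > 0);    /* retaining a dead object is a bug */
    b->refs++;
    pthread_mutex_unlock(&b->lock);
}

/* Drop a reference.  Returns 1 if this call freed the buffer, else 0. */
int buf_release(shared_buf *b)
{
    int last;

    pthread_mutex_lock(&b->lock);
    assert(b->refs > 0);
    last = (--b->refs == 0);
    pthread_mutex_unlock(&b->lock);

    if (last) {             /* nobody else can reach b any more */
        pthread_mutex_destroy(&b->lock);
        free(b->data);
        free(b);
    }
    return last;
}
```

Every thread that keeps a pointer must pair one buf_retain with one
buf_release, and a single missed call is either a leak or a use after
free.  That bookkeeping is the kind of worry a collector removes.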

=================================TOP===============================
 Q50: Does anyone have any information on thread programming for VMS?  

No ftp or web stuff, although we do have an HTML version of the Guide to
DECthreads and we'll probably try to get it outside the firewall where
it'll do y'all some good, one of these days. I've been very impressed
with Sun's "thread web site", and I'd like to get Digital moving in that
direction to help with the global work of evangelizing threads... but
not until I've finished coding, and writing, and consulting, and all
sorts of other things that seem to take 500% of my time. For general
info, and some information (though not enough) on using POSIX threads,
check Sun's library. (They need to start tapering off the UI threads.)

If you've got VMS (anything since 5.5-2), you'll have a hardcopy of the
Guide in your docset, and on the doc cdrom in Bookreader format. OpenVMS
version 7.0 has POSIX 1003.1c-1995 threads -- anything earlier has only
the old CMA and 1003.4a/D4 "DCE threads". Furthermore, OpenVMS Alpha 7.0
supports SMP threads (kernel support for dynamic "many to few"
scheduling), although "binary compatibility paranoia" has set in and it
may end up being nearly impossible to use. OpenVMS VAX 7.0 does not have
SMP or kernel integration -- integration will probably happen "soon",
but VAX will probably never have SMP threads.

------------------------------------------------------------------------
Dave Butenhof                              Digital Equipment Corporation
butenhof@zko.dec.com                       110 Spit Brook Rd, ZKO2-3/Q18
Phone: 603.881.2218, FAX: 603.881.0120     Nashua, NH 03062-2711
                 "Better Living Through Concurrency"
------------------------------------------------------------------------
=================================TOP===============================
 Q51: Any information on the DCE threads library?  

 http://www.osf.org/dce/
=================================TOP===============================
 Q52: Can I implement pthread_cleanup_push without a macro?  

I was about to use pthread_cleanup_push, when I noticed that it is
implemented as a macro (on Solaris 2.5) which forces you to have the
pthread_cleanup_pop in the same function by having an open brace { at the
end of the first macro and closing it in the second...  Since I want to
hide most of this stuff in something like a monitor (or a guard in ACE) in
C++ by using the push in a constructor and the pop in the destructor, I'm
wondering if there is something fundamental that would prevent me from doing
so, or could I just re-implement the stuff done by the macros inside some
class services?



POSIX 1003.1c-1995 specifies that pthread_cleanup_push and pthread_cleanup_pop
must be used at the same lexical scope, "as if" the former were a macro that
expands to include an opening brace ("{") and the latter were a macro that
expands to include the matching closing brace ("}").

The Solaris 2.5 definition therefore conforms quite accurately to the intent
of the standard. And so does the Digital UNIX definition, for that matter. If
you can get away with "reverse engineering" the contents of the macros, swell;
but beware that this would NOT be a service to those using your C++ package,
as the results will be extremely non-portable. In fact, no guarantees that it
would work on later versions of Solaris, even assuming strict binary
compatibility in their implementation -- because they could reasonably make
"compatible" changes that would take advantage of various assumptions
regarding how those macros are used that you would be violating.

What you want to do has merit, but you have to remember that you're writing in
C++, not C. The pthread_cleanup_push and pthread_cleanup_pop macros are the C
language binding to the POSIX 1003.1c cancellation cleanup capability. In C++,
the correct implementation of this capability is already built into the
language... destructors. That is, C++ and threads should be working together
to ensure that C++ destructors are run when a thread is cancelled. If that is
done, you've got no problem. If it's not done, you've got far worse problems
anyway since you won't be "destructing" most of your objects anyway.

/---[ Dave Butenhof ]-----------------------[ butenhof@zko.dec.com ]---\
| Digital Equipment Corporation           110 Spit Brook Rd ZKO2-3/Q18 |
| 603.881.2218, FAX 603.881.0120                  Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/
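
For reference, a minimal sketch in C of the lexically paired usage the
standard requires.  The worker and release_buffer functions and the
cleanup_calls counter are hypothetical names used only for this
illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

static int cleanup_calls = 0;   /* illustrative; read only after join */

/* Cleanup handler: runs if the thread is cancelled inside the scope,
 * or when pthread_cleanup_pop is called with a non-zero argument. */
static void release_buffer(void *p)
{
    assert(p != NULL);
    free(p);
    cleanup_calls++;
}

/* Hypothetical thread start routine. */
void *worker(void *arg)
{
    char *buf = malloc(1024);
    if (buf == NULL)
        return NULL;

    pthread_cleanup_push(release_buffer, buf);  /* opens the scope */
    buf[0] = '\0';
    pthread_testcancel();                       /* a cancellation point */
    pthread_cleanup_pop(1);                     /* closes it; 1 = run handler */

    return arg;     /* return only after the pop, never between the two */
}
```

Both calls sit in the same function at the same brace level, which is
why they cannot be split between a C++ constructor and destructor.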


=================================TOP===============================
 Q53: What switches should be passed to particular compilers?  

> Does anyone have a list of what switches should be passed to particular
> compilers to have them generate thread-safe code?  For example,
> 
> Solaris-2 & SunPro cc       : -D_REENTRANT
> Solaris-2 & gcc             : ??
> DEC Alpha OSF 3.2 & /bin/cc : -threads
> IRIX 5.x & /bin/cc          : ??
> 
> Similarly, what libraries are passed to the linker to link in threads
> support?
> 
> Solaris-2 & Solaris threads : -lthread
> DEC Alpha OSF 3.2 threads   : -lpthreads
> IRIX 5.x & IRIX threads     : (none)
> 
> And so forth.
> 
> I'm trying to get GNU autoconf to handle threads gracefully.
> 
> Bill

That would be useful information in general, I suppose. I can supply the
information for Digital UNIX (the operating system previously known as
"DEC OSF/1"), at least.

For 3.x and earlier, the proper compiler switch is -threads, which (for
cc) is effectively just -D_REENTRANT. For linking, the cc driver expands
-threads to "-lpthreads -lmach -lc_r" -- you need all three, immediately
preceding -lc (which must be at the end). -lpthreads alone isn't enough: it
will pull in libmach and libc_r implicitly and in the wrong order (after
libc, where they will fail to preempt symbols).

For 4.0, you can still use -threads if you're using the DCE threads (D4)
or cma interfaces. If you don't use -threads, the link libraries should
be changed to "-lpthreads -lpthread -lmach -lexc" (before -lc). If you
use 1003.1c-1995 threads, you use "-pthread" instead of "-threads". cc
still gets -D_REENTRANT, but ld gets -lpthread -lmach -lexc.

/---[ Dave Butenhof ]-----------------------[ butenhof@zko.dec.com ]---\
| Digital Equipment Corporation           110 Spit Brook Rd ZKO2-3/Q18 |
| 603.881.2218, FAX 603.881.0120                  Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/


=================================TOP===============================
 Q54: How do I find Sun's bug database?  

>I am trying to use Thread Analyzer in Solaris 2.4 for performance
>tuning. But after loading the trace directory, tha exits with the following
>error message: 
>Thread Analyzer Fatal Error[0]: Slave communication failure


It always helps if you state which version of the application you are
using, in this case the Thread Analyzer.

There have been a number of bugs which result in this error message
that have been fixed.  Please obtain the latest ThreadAnalyzer patch
from your Authorized Service Provider (ASP) or from our Web page:

http://access1.sun.com/recpatches/DevPro.html
=================================TOP===============================
 Q55:  How do the various vendors' threads libraries compare?  

    Fundamentally, they are all based on the same paradigm, and everything
    you can do in one library you can (pretty much) do in any other.  Ease
    of programming and efficiency will be the major distinctions.

OS                Preferred Threads POSIX Version   Kernel Support Sched model
---------------   ----------------- -------------   -------------- -------------
Solaris 2.5       UI-threads        1003.1c-1995    yes            2 level(1)
SVR4.2MP/UW 2.0   UI-threads        No
IRIX 6.1          sproc             No
IRIX 6.2          sproc             1003.1c-1995(2)
Digital UNIX 3.2  cma               Draft 4         yes            1 to 1
Digital UNIX 4.0  1003.1c-1995      1003.1c-1995    yes            2 level
DGUX 5.4          ?                 Draft 6         yes
NEXTSTEP          (cthreads?)       No
AIX 4.1           AIX Threads(3)    Draft 7         yes            1 to 1
Plan 9            rfork()           No
OpenVMS 6.2       cma               Draft 4         no
OpenVMS Alpha 7.0 1003.1c-1995      1003.1c-1995    yes            2 level
OpenVMS VAX 7.0   1003.1c-1995      1003.1c-1995    no
WinNT             Win32 threads     No
OS/2              DosCreateThread() Draft 4
Win32             Win32 threads     No              yes            1 to 1

Notes:

1) Solaris 2.5 blocks threads in kernel with LWP, but provides a signal to
allow user level scheduler to create a new LWP if desired (and
thr_setconcurrency() can create additional LWPs to minimize the chances of
losing concurrency due to blocking.)

2) According to IRIX 6.2 info on SGI's web, 1003.1c-1995 threads will be
provided only as part of the REACT/pro 3.0 Realtime Extensions kit, not in
the base O/S.

3) Can anyone clarify this? My impression is that AIX 4.1 favors 1003.4a/D7
threads; but then I've never heard the term "AIX Threads".

=================================TOP===============================
 Q56: Why don't I need to declare shared variables VOLATILE?  


> I'm concerned, however, about cases where both the compiler and the
> threads library fulfill their respective specifications.  A conforming
> C compiler can globally allocate some shared (nonvolatile) variable to
> a register that gets saved and restored as the CPU gets passed from
> thread to thread.  Each thread will have its own private value for
> this shared variable, which is not what we want from a shared
> variable.

In some sense this is true, if the compiler knows enough about the
respective scopes of the variable and the pthread_cond_wait (or
pthread_mutex_lock) functions. In practice, most compilers will not try
to keep register copies of global data across a call to an external
function, because it's too hard to know whether the routine might
somehow have access to the address of the data.

So yes, it's true that a compiler that conforms strictly (but very
aggressively) to ANSI C might not work with multiple threads without
volatile. But someone had better fix it. Because any SYSTEM (that is,
pragmatically, a combination of kernel, libraries, and C compiler) that
does not provide the POSIX memory coherency guarantees does not CONFORM
to the POSIX standard. Period. The system CANNOT require you to use
volatile on shared variables for correct behavior, because POSIX
requires nothing beyond correct use of the POSIX synchronization functions.

So if your program breaks because you didn't use volatile, that's a BUG.
It may not be a bug in C, or a bug in the threads library, or a bug in
the kernel. But it's a SYSTEM bug, and one or more of those components
will have to work to fix it.

You don't want to use volatile, because, on any system where it makes
any difference, it will be vastly more expensive than a proper
nonvolatile variable. (ANSI C requires "sequence points" for volatile
variables at each expression, whereas POSIX requires them only at
synchronization operations -- a compute-intensive threaded application
will see substantially more memory activity using volatile, and, after
all, it's the memory activity that really slows you down.)

/---[ Dave Butenhof ]-----------------------[ butenhof@zko.dec.com ]---\
| Digital Equipment Corporation           110 Spit Brook Rd ZKO2-3/Q18 |
| 603.881.2218, FAX 603.881.0120                  Nashua NH 03062-2698 |
\-----------------[ Better Living Through Concurrency ]----------------/
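
A minimal sketch in C of the pattern being described: shared data
declared without volatile and touched only while holding a mutex.  The
names state_lock, state_cv, ready, value, producer and consume are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  state_cv   = PTHREAD_COND_INITIALIZER;

/* Shared between threads and deliberately NOT volatile. */
static int ready = 0;
static int value = 0;

/* Thread start routine: publish *arg to the consumer. */
void *producer(void *arg)
{
    pthread_mutex_lock(&state_lock);
    value = *(int *)arg;
    ready = 1;
    pthread_cond_signal(&state_cv);
    pthread_mutex_unlock(&state_lock);
    return NULL;
}

/* Block until a value has been published, then return it. */
int consume(void)
{
    int v;

    pthread_mutex_lock(&state_lock);
    while (!ready)                      /* re-test: wakeups may be spurious */
        pthread_cond_wait(&state_cv, &state_lock);
    assert(ready);                      /* predicate holds under the lock */
    v = value;
    ready = 0;
    pthread_mutex_unlock(&state_lock);
    return v;
}
```

The lock, unlock and wait calls are the synchronization points at which
POSIX requires memory to be made consistent, so the compiler cannot
legitimately keep ready or value cached in a register across them.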

=================================TOP===============================
 Q57: Do pthread_cleanup_push/pop HAVE to be macros (thus lexically scoped)?  

Paul Pelletier wrote:
 
I was about to use pthread_cleanup_push, when I noticed that it is
implemented as a macro (on Solaris 2.5) which forces you to have the
pthread_cleanup_pop in the same function by having an open brace { at the
end of the first macro and closing it in the second...  Since I want to
hide most of this stuff in something like a monitor (or a guard in ACE) in
C++ by using the push in a constructor and the pop in the destructor, I'm
wondering if there is something fundamental that would prevent me from doing
so, or could I just re-implement the stuff done by the macros inside some
class services?
 

POSIX 1003.1c-1995 specifies that pthread_cleanup_push and
pthread_cleanup_pop must be used at the same lexical scope, "as if" the
former were a macro that expands to include an opening brace ("{") and the
latter were a macro that expands to include the matching closing brace
("}").

The Solaris 2.5 definition therefore conforms quite accurately to the intent
of the standard. And so does the Digital UNIX definition, for that
matter. If you can get away with "reverse engineering" the contents of the
macros, swell; but beware that this would NOT be a service to those using
your C++ package, as the results will be extremely non-portable. In fact, no
guarantees that it would work on later versions of Solaris, even assuming
strict binary compatibility in their implementation -- because they could
reasonably make "compatible" changes that would take advantage of various
assumptions regarding how those macros are used that you would be violating.

What you want to do has merit, but you have to remember that you're writing
in C++, not C. The pthread_cleanup_push and pthread_cleanup_pop macros are
the C language binding to the POSIX 1003.1c cancellation cleanup
capability. In C++, the correct implementation of this capability is already
built into the language... destructors. That is, C++ and threads should be
working together to ensure that C++ destructors are run when a thread is
cancelled. If that is done, you've got no problem. If it's not done, you've
got far worse problems anyway since you won't be "destructing" most of your
objects anyway.

/---[ Dave Butenhof ]-----------------------[ butenhof@zko.dec.com ]---\

=================================TOP===============================
 Q58: Thread Analyzer Fatal Error[0]: Slave communication failure ??  

>I am trying to use Thread Analyzer in Solaris 2.4 for performance
>tuning. But after loading the trace directory, tha exits with the following
>error message: 
>Thread Analyzer Fatal Error[0]: Slave communication failure
>
>I do not know what happened. 

It always helps if you state which version of the application you are
using, in this case the Thread Analyzer.

There have been a number of bugs which result in this error message
that have been fixed.  Please obtain the latest ThreadAnalyzer patch
from your Authorized Service Provider (ASP) or from our Web page:

http://access1.sun.com/recpatches/DevPro.html

Chuck Fisher 

=================================TOP===============================
 Q59: What is the status of Linux threads?  



=================================TOP===============================
 Q60: The Sunsoft debugger won't recognize my PThreads program!  

Nope.  The 3.0.2 version was written before the release of Sun's pthread
library.  However, if you simply include -lthread on the compile line, it
will come up and work.  It's a little bit redundant, but works fine.  Hence:

%cc -o one one.c -lpthread -lthread -lposix4 -g

=================================TOP===============================
 Q61: How are blocking syscalls handled in a two-level system?  

> Martin Cracauer wrote:
> >
> > In a thread system that has both user threads and LWPs like Solaris,
> > how are blocking syscalls handled?
> 
> Well, do you mean "like Solaris", or do you mean "Solaris"? There's no
> one answer for all systems. LWP, by the way, isn't a very general term.
> Lately I've been using the more cumbersome, but generic and relatively
> meaningful "kernel execution contexts". A process is a KEC, an LWP is a
> KEC, a "virtual processor" is a KEC, a Mach thread is a KEC, an IRIX
> sproc is a KEC, etc.
> 
> > By exchanging blocking syscalls to nonblocking like in a
> > pure-userlevel thread implementation?
> 
> Generally, only "pure user-mode" implementations, without any kernel
> support at all, resort to turning I/O into "nonblocking". It's just not
> an effective mechanism -- there are too many limitations to the UNIX
> nonblocking I/O model.
> 
> > Or by making sure a thread that calls a blocking syscall is on its own
> > LWP (the kernel is entered anyway, so what would be the cost to do
> > so)?
> 
> Solaris 2.5 "latches" a user thread onto an LWP until it blocks in user
> mode -- on a mutex, a condition variable, or until it yields. User
> threads aren't timesliced, and they stick to the LWP across kernel
> blocks. If all LWPs in a process block in the kernel, a special signal
> allows the thread library to create a new one, but other than that you
> need to rely a lot on thr_setconcurrency.
> 
> Digital UNIX 4.0 works very differently. The kernel delivers "upcalls"
> to the user mode scheduler to communicate various state changes. User
> threads, for example, are timesliced on our KECs (which are a special
> form of Mach thread). When a thread blocks in the kernel, the user mode
> scheduler is informed so that a new user thread can be scheduled on the
> virtual processor immediately. The nice thing about this model is that
> we don't need anything like thr_setconcurrency to keep things running.
> Compute-bound user threads can't lock each other out unless one is
> SCHED_FIFO policy. And instead of "fixing things up" by adding a new
> kernel execution context when the last one blocks (giving you a
> concurrency level of 1), we keep you running at the maximum level of
> concurrency supportable -- the number of runnable user threads, or the
> number of physical processors, whichever is less.
> 
> Neither model (nor implementation) is perfect, and it would be safe to
> assume that both Digital and Sun are working on improving every aspect.
> The models may easily become very different in the future.
> 
> /---[ Dave Butenhof ]-----------------------[ butenhof@zko.dec.com ]---\
> | Digital Equipment Corporation           110 Spit Brook Rd ZKO2-3/Q18 |
> | 603.881.2218, FAX 603.881.0120                  Nashua NH 03062-2698 |
> \-----------------[ Better Living Through Concurrency ]----------------/

-- 
> Georges Brun-Cottan wrote:
> > So a recursive mutex is far more than just a hack for lazy programmers
> > or just a way to incorporate non-MT-safe third-party code. It is a tool
> > that you need in an environment such as OOP, where you cannot or do not
> > want to depend on an execution context.
> 
> Sorry, but I refuse to believe that good threaded design must end where
> OOP begins. There's no reason for two independently developed packages
> to share the same mutex. There's no reason for a package to be designed
> without awareness of where and when mutexes are locked. Therefore, in
> either case, recursive mutexes remain, at best, a convenience, and, at
> worst (and more commonly), a crutch.
> 
> I created the recursive mutex for DCE threads because we were dealing
> with a brand-new world of threading. We had no support from operating
> systems or other libraries. Hardly anything was "thread safe". The DCE
> thread "global mutex" allowed any thread-safe code to lock everything
> around a call to any unsafe code. As an intellectual exercise, I chose
> to implement the global mutex by demonstrating why we'd created the
> concept of "mutex attributes" -- previously, there had been none. As a
> result of this intellectual exercise, it became possible for anyone to
> conveniently create their own recursive mutex, which is locked and
> unlocked using the standard POSIX functions. There really wasn't any
> point to removing the attribute, since it's not that hard to create your
> own recursive mutex.
> 
> Remember that whenever you use recursive mutexes, you are losing
> performance -- recursive mutexes are more expensive to lock and unlock,
> even without mutex contention (and a recursive mutex created on top of
> POSIX thread synchronization is a lot more expensive than one using the
> mutex type attribute). You are also losing concurrency by keeping
> mutexes locked so long and across so much context that you become
> tempted to use recursive mutexes to deal with lock range conflicts.
> 
> Yes, it may be harder to avoid recursive mutexes. Although I've never
> yet seen a valid case proving that recursive mutexes are NECESSARY, I
> won't deny that there may be one or two. None of that changes the fact
> that an implementation avoiding recursive mutexes will perform, and
> scale, far better than one relying on recursive mutexes. If you're
> trying to take advantage of multithreading, all the extra effort in
> analysis and design will pay off in increased concurrency.
> 
> But, like any other aspect of performance analysis, you put the effort
> where the pay is big enough. There are non-critical areas of many
> libraries where avoiding recursive mutexes would be complicated and
> messy, and where the overhead of using them doesn't hurt performance
> significantly. Then, sure, use them. Just know what you're doing, and
> why.
> 
> /---[ Dave Butenhof ]-----------------------[ butenhof@zko.dec.com ]---\
> | Digital Equipment Corporation           110 Spit Brook Rd ZKO2-3/Q18 |
> | 603.881.2218, FAX 603.881.0120                  Nashua NH 03062-2698 |
> \-----------------[ Better Living Through Concurrency ]----------------/
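
A minimal sketch in C of a recursive mutex made with the mutex type
attribute.  It assumes the PTHREAD_MUTEX_RECURSIVE type defined by later
editions of the standard, which 1003.1c-1995 itself does not include.
The names pkg_lock, counter, pkg_init, inner and outer are hypothetical
and serve only this illustration.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t pkg_lock;    /* one lock guarding the whole package */
static int counter = 0;             /* data protected by pkg_lock */

/* Abort on an unexpected error code from a pthread call. */
static void check(int rc)
{
    assert(rc == 0);
    (void)rc;
}

/* Initialize pkg_lock as a recursive mutex.  Returns 0 on success. */
int pkg_init(void)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);

    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (rc == 0)
        rc = pthread_mutex_init(&pkg_lock, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Locks pkg_lock whether or not the caller already holds it. */
void inner(void)
{
    check(pthread_mutex_lock(&pkg_lock));
    counter++;
    check(pthread_mutex_unlock(&pkg_lock));
}

/* Holds pkg_lock across the call to inner(); with a default mutex the
 * nested lock would deadlock or fail. */
void outer(void)
{
    check(pthread_mutex_lock(&pkg_lock));
    counter++;
    inner();
    check(pthread_mutex_unlock(&pkg_lock));
}
```

The alternative argued for above is to split inner() into a locking
wrapper and a body that expects the lock to be held, so that outer()
calls the body directly and an ordinary mutex suffices.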

=================================TOP===============================
 Q62: Can one thread read from a socket whil