Not sure where to ask this; I've been googling and not finding much that is helpful.
At work I've got a multi-threaded program targeted at Linux. It compiles on RHEL 2.1 and 3, and is targeted at 2.1, 3, and 4. Up until today, the binary built on 3 has worked fine on 4.
But on RHEL4 update 4 it dies a horrible death in pthread_create. I have reproduced the problem on CentOS 4.4, where I am trying to debug it.
I thought it might be some newly-introduced binary incompatibility in the U4 update, so I recompiled it on CentOS 4.4, and it does the same thing.
This code has been running for 5 years both in-house and at dozens of customers, on versions of RH Linux including 6.2 and 7.x, as well as RHEL 2.1 and 3, so I'm leaning toward something that is not a blatant bug in the code.
Anyway, on either the 2nd or the 3rd call to pthread_create (it isn't consistently one or the other) we get a SIGSEGV. Looking at a stack backtrace, the stack appears to be trashed, or at least the backtrace does not reflect reality. Single-stepping into the call doesn't get you anywhere except a SIGSEGV. I'm about to delve into it at the instruction level, but I must admit I'm not very knowledgeable about 32-bit Intel assembly language.
Needless to say I've read and re-read and re-re-read the man page for pthread_create(), and have been tweaking the calling sequence in subtle ways to see if I can change the behavior, but so far to no avail.
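For reference, here is roughly the kind of stripped-down test one could build on the affected box to see whether a bare pthread_create sequence fails on its own, outside the real program. This is only a sketch and not the actual code: worker and spawn_and_join are made-up names, the threads use default attributes, and the cap of 16 threads is arbitrary.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

#define MAX_THREADS 16   /* arbitrary cap for this sketch */

/* Hypothetical worker: does nothing and returns its argument. */
static void *worker(void *arg)
{
    return arg;
}

/*
 * Hypothetical helper: create n threads with default attributes,
 * then join every thread that was actually created.
 * Returns 0 on success, the first nonzero pthread error code on
 * failure, or -1 if n is outside 0..MAX_THREADS.
 */
int spawn_and_join(int n)
{
    pthread_t tids[MAX_THREADS];
    int i, rc;
    int created = 0;
    int first_err = 0;

    if (n < 0 || n > MAX_THREADS)
        return -1;

    for (i = 0; i < n; i++) {
        /* pthread_create returns the error number; it does not set errno. */
        rc = pthread_create(&tids[i], NULL, worker, NULL);
        if (rc != 0) {
            fprintf(stderr, "pthread_create #%d failed: %s\n",
                    i + 1, strerror(rc));
            first_err = rc;
            break;
        }
        created++;
    }

    /* Join only the threads that were successfully created. */
    for (i = 0; i < created; i++) {
        rc = pthread_join(tids[i], NULL);
        if (rc != 0 && first_err == 0)
            first_err = rc;
    }

    return first_err;
}
```

Calling spawn_and_join(3) from a small main() and building with gcc -pthread would cover the 2nd and 3rd calls. If that runs cleanly while the real program still crashes, the difference in how the real code sets up its threads (attributes, stack, arguments) is probably where to look next.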
Has anyone here any knowledge of possible problems or incompatibilities in the NPTL implementation in 4.4?
Thanks!