[CentOS] scheduling differences between CentOS 4 and CentOS 5?

Tue May 24 03:23:37 UTC 2011
Mag Gam <magawake at gmail.com>

I would like to confirm Matt's claim. I too experienced larger
latencies with CentOS 5.x compared to 4.x. My application is very
network sensitive and it's easy to prove using lat_tcp.

I am curious about identifying the problem. What tools do you
recommend to find where the latency is coming from in the application?

On Fri, May 20, 2011 at 2:46 PM, R P Herrold <herrold at owlriver.com> wrote:
> On Fri, 20 May 2011, Matt Garman wrote:
>> We have several latency-sensitive "pipeline"-style programs that have
>> a measurable performance degradation when run on CentOS 5.x versus
>> CentOS 4.x.
>> By "pipeline" program, I mean one that has multiple threads.  The
>> multiple threads work on shared data.  Between each thread, there is a
>> queue.  So thread A gets data, pushes into Qab, thread B pulls from
>> Qab, does some processing, then pushes into Qbc, thread C pulls from
>> Qbc, etc.  The initial data is from the network (generated by a 3rd
>> party).
>> We basically measure the time from when the data is received to when
>> the last thread performs its task.  In our application, we see an
>> increase of anywhere from 20 to 50 microseconds when moving from
>> CentOS 4 to CentOS 5.
>> Anyone have any experience with this?  Perhaps some more areas to investigate?
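
To make the measurement described above concrete, here is a minimal sketch of a three-stage pipeline (A -> Qab -> B -> Qbc -> C) that stamps each item on arrival and records the elapsed time when the last stage finishes with it. It is only an illustration of where the timestamps go: the thread bodies, the function name run_pipeline and the use of Python are my assumptions, and a Python version will not reproduce the absolute numbers of a native threaded program.

```python
import queue
import threading
import time

SENTINEL = None  # marks end of input; real items are (id, timestamp) tuples


def run_pipeline(n_items):
    """Push n_items through A -> Qab -> B -> Qbc -> C; return latencies in ns."""
    q_ab = queue.Queue()
    q_bc = queue.Queue()
    latencies_ns = []

    def thread_a():
        # Stand-in for "data received from the network": stamp and enqueue.
        for i in range(n_items):
            q_ab.put((i, time.perf_counter_ns()))
        q_ab.put(SENTINEL)

    def thread_b():
        # Middle stage: placeholder processing, then hand off to the next queue.
        while True:
            item = q_ab.get()
            q_bc.put(item)
            if item is SENTINEL:
                return

    def thread_c():
        # Last stage: latency is "now" minus the receive timestamp.
        while True:
            item = q_bc.get()
            if item is SENTINEL:
                return
            _, received_ns = item
            latencies_ns.append(time.perf_counter_ns() - received_ns)

    threads = [threading.Thread(target=f) for f in (thread_a, thread_b, thread_c)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return latencies_ns
```

Keeping every per-item latency, rather than a running average, is what makes it possible to look at the individual slow items afterwards.
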
> We do processing similar to this with financial markets
> datastreams.  You do not say, but I assume you are blocking on
> a select, rather than polling [polling is bad here].  Also you
> do not say if all threads are under a common process's
> ownership.  If not, modulo the added complexity of debugging
> threads, you may want to move them under one.
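
For reference, "blocking on a select" as opposed to polling could look roughly like the sketch below. The helper name wait_for_data and the 4096-byte read size are assumptions of mine for illustration; the point is only that the thread sleeps in select() until the kernel reports the socket readable instead of spinning on non-blocking reads.

```python
import select
import socket


def wait_for_data(sock, timeout_s=None):
    """Sleep in select() until sock is readable, then read from it.

    Returns the bytes read, or None if timeout_s elapsed first.
    timeout_s=None blocks indefinitely.
    """
    readable, _, _ = select.select([sock], [], [], timeout_s)
    if not readable:
        return None
    return sock.recv(4096)
```

A quick local check can be done with socket.socketpair(): send a few bytes on one end and call wait_for_data() on the other.
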
> I say this, because in our testing (both with all housed in a
> single process, and when using co-processes fed through an
> anonymous pipe), we will occasionally get hit with a context
> or process switch, which messes up the latencies something
> fierce.  An 'at' or 'cron' job firing off can ruin the day as
> well.
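
One way to check whether a slow run coincided with context switches is to read the process's switch counters before and after the section being timed. The sketch below uses getrusage(), whose ru_nvcsw and ru_nivcsw fields are filled in on Linux; the helper names are hypothetical, and note that RUSAGE_SELF counts switches for the whole process, not per thread.

```python
import resource


def context_switches():
    """Return (voluntary, involuntary) context switch counts for this process."""
    ru = resource.getrusage(resource.RUSAGE_SELF)
    return ru.ru_nvcsw, ru.ru_nivcsw


def measure(fn):
    """Run fn() and report how many context switches happened meanwhile."""
    vol_before, invol_before = context_switches()
    result = fn()
    vol_after, invol_after = context_switches()
    return result, vol_after - vol_before, invol_after - invol_before
```

A non-zero involuntary count during a measured interval suggests the scheduler preempted the process, which is the kind of event described above.
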
> Also, system calls are to be avoided, as the timing on when
> (and if, and in what order) one gets returned to, is not
> something controllable in userspace.
> Average latencies are not so meaningful here ... collection of
> all dispatch and return data and explaining the outliers is
> probably a good place to continue with after addressing the
> foregoing.  graphviz, and gnuplot are lovely for doing this
> kind of visualization.
> -- Russ herrold
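
As a small illustration of why the average hides the interesting cases, the sketch below summarizes a list of per-item latencies and pulls out the outliers by position so they can be matched against other logs. The function name summarize and the "more than 3x the median" outlier rule are arbitrary assumptions for the example, not a recommended threshold.

```python
import statistics


def summarize(latencies, outlier_factor=3.0):
    """Summarize latency samples and list (index, value) pairs for outliers.

    An outlier here is any sample above outlier_factor * median.
    """
    ordered = sorted(latencies)
    n = len(ordered)
    median = statistics.median(ordered)

    def percentile(p):
        # Nearest-rank percentile for integer p, using integer arithmetic.
        rank = -(-p * n // 100)  # ceil(p * n / 100)
        return ordered[max(rank, 1) - 1]

    outliers = [(i, x) for i, x in enumerate(latencies)
                if x > outlier_factor * median]
    return {
        "mean": statistics.mean(ordered),
        "median": median,
        "p99": percentile(99),
        "max": ordered[-1],
        "outliers": outliers,
    }
```

With 99 samples of 10 and a single sample of 500, the mean moves to 14.9 while the median and 99th percentile stay at 10; only the maximum and the outlier list show the one bad item.
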