pdflush kernel thread pops up every 10 seconds or so and video decoding grinds to a halt for 1/2 a second


Aleksey Tsalolikhin

19 Oct 2010

1:25 a.m.

Hi. A friend of mine was doing real-time video decoding on Fedora Core 13 and he had a performance glitch (1/2 a second freeze) every 5-10 seconds. "top" showed flush-253:0 process at the moment of the freeze.

Major device number 253 corresponds to device-mapper. I advised my friend to re-install his FC13 without LVM, to see if the glitch is related to LVM.

After re-installing FC13 without LVM, he is seeing the glitch every 10 seconds, and it shows flush-8:16 where before it said flush-253:0. 8 is scsi disk driver. So it's not an LVM thing... maybe a kernel thing?
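The flush-MAJOR:MINOR thread names above carry the block device's major and minor numbers, so they can be mapped back to a device name. Here is a minimal Python sketch of that lookup. The helper names are made up for illustration, and the sysfs path `/sys/dev/block/MAJOR:MINOR` is assumed to exist, which holds on reasonably recent kernels but probably not on the 2.6.18 kernel in CentOS 5. By the usual numbering, minor 16 on major 8 is the second SCSI disk (sdb).

```python
import os
import re

def parse_flush_name(name):
    """Split a thread name such as 'flush-8:16' into (major, minor)."""
    match = re.fullmatch(r"flush-(\d+):(\d+)", name.strip())
    if match is None:
        raise ValueError(f"not a flush-MAJOR:MINOR name: {name!r}")
    return int(match.group(1)), int(match.group(2))

def block_device_name(major, minor):
    """Return the kernel's name for a block device via sysfs, or None.

    Assumes /sys/dev/block/MAJOR:MINOR is a symlink to the device
    directory; returns None where that path is missing.
    """
    link = f"/sys/dev/block/{major}:{minor}"
    if not os.path.exists(link):
        return None
    return os.path.basename(os.path.realpath(link))

if __name__ == "__main__":
    # The two thread names reported in this thread.
    for thread in ("flush-253:0", "flush-8:16"):
        major, minor = parse_flush_name(thread)
        print(thread, "->", block_device_name(major, minor))
```

The result depends on the machine it runs on; a device that is not present simply prints `None`.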

I suggested he try Fedora 14 beta, as it has a newer kernel. Maybe this kernel thing is fixed in the newer kernel.

He also tried CentOS 5.5, and saw a "pdflush" process popping up with the same frequency, and resulting in a similar glitch.

Any other suggestions?

Best, -at


JohnS

19 Oct

4:08 a.m.


On Mon, 2010-10-18 at 18:25 -0700, Aleksey Tsalolikhin wrote:

...

Hi. A friend of mine was doing real-time video decoding on Fedora Core 13 and he had a performance glitch (1/2 a second freeze) every 5-10 seconds. "top" showed flush-253:0 process at the moment of the freeze.

And what is the Priority of it running at? How many Cores also?

...

He also tried CentOS 5.5, and saw a "pdflush" process popping up with the same frequency, and resulting in a similar glitch.

Any other suggestions?

I would not even be concerned. ATM I am seeing pdflush on a server pop every second. load average: 4.51, 3.41, 3.51

I would only be concerned with the Freeze. What's uname -a on the CentOS machine?

John

Aleksey Tsalolikhin

7:34 p.m.

On Mon, Oct 18, 2010 at 9:08 PM, JohnS jses27@gmail.com wrote:

...

On Mon, 2010-10-18 at 18:25 -0700, Aleksey Tsalolikhin wrote:

...
Hi. A friend of mine was doing real-time video decoding on Fedora Core 13 and he had a performance glitch (1/2 a second freeze) every 5-10 seconds. "top" showed flush-253:0 process at the moment of the freeze.

And what is the Priority of it running at? How many Cores also?

He sees this issue at normal priority and at nice -n -19 / -20.

He has 6 cores with hyperthreading on

3.8 GHz, the memory is 1850 MHz

The system is 980x Intel 6 core

He just told me he has two modes for his decoding program. In one mode the system does not write to disk at all, and there are NO GLITCHES this way. In the other mode, it writes lots of little files as it decodes, and the glitch happens every 5-20 seconds.

Would like to get to the bottom of this so he can decode with temp files and without glitches.

Cheers, Aleksey

Toby Bluhm

7:48 p.m.

On 10/19/2010 3:34 PM, Aleksey Tsalolikhin wrote:

...

On Mon, Oct 18, 2010 at 9:08 PM, JohnS jses27@gmail.com wrote:

...
On Mon, 2010-10-18 at 18:25 -0700, Aleksey Tsalolikhin wrote:

...
Hi. A friend of mine was doing real-time video decoding on Fedora Core 13 and he had a performance glitch (1/2 a second freeze) every 5-10 seconds. "top" showed flush-253:0 process at the moment of the freeze.

And what is the Priority of it running at? How many Cores also?

He sees this issue at normal priority and at nice -n -19 / -20.

He has 6 cores with hyperthreading on

3.8 Ghz, the memory is 1.850 Mhz

The system is 980x Intel 6 core

He just told me he has two modes for his decoding program, in one mode the system does not write to disk at all, and there are NO GLITCHES doing it this way; another way, it writes lots of little files as it decodes, and the glitch happens actually every 5-20 seconds.

Would like to get to the bottom of this so he can decode with temp files and without glitches.

Ext3 filesystem? Maybe altering the commit option at mount time would help:

http://www.mjmwired.net/kernel/Documentation/filesystems/ext3.txt#49
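Before changing the commit option it may help to see what each ext3 mount currently uses. The Python sketch below is only an illustration. The function names are invented, the default of 5 seconds is the one given in the ext3 documentation linked above, and it assumes that a non-default `commit=` value shows up in the options column of `/proc/mounts`.

```python
import os

def commit_interval(mount_options, default=5):
    """Return the commit= value (seconds) from a comma-separated option string.

    Falls back to `default` (5 s per the ext3 documentation) when the
    option is absent.
    """
    for option in mount_options.split(","):
        key, _, value = option.partition("=")
        if key == "commit" and value.isdigit():
            return int(value)
    return default

def ext3_commit_intervals(mounts_path="/proc/mounts"):
    """Map each ext3 mount point to its commit interval in seconds."""
    result = {}
    if not os.path.exists(mounts_path):
        return result
    with open(mounts_path) as mounts:
        for line in mounts:
            fields = line.split()
            # Fields: device, mount point, filesystem type, options, ...
            if len(fields) >= 4 and fields[2] == "ext3":
                result[fields[1]] = commit_interval(fields[3])
    return result

if __name__ == "__main__":
    for mount_point, seconds in ext3_commit_intervals().items():
        print(f"{mount_point}: commit every {seconds} s")
```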

Aleksey Tsalolikhin

9:33 p.m.

On Tue, Oct 19, 2010 at 12:48 PM, Toby Bluhm toby.bluhm@alltechmedusa.com wrote:

...

Ext3 filesystem? Maybe altering the commit option at mount time would help:

http://www.mjmwired.net/kernel/Documentation/filesystems/ext3.txt#49

Good one, Toby! We'll try that. Thanks!

Aleksey

Ross Walker

11 p.m.


On Oct 19, 2010, at 5:33 PM, Aleksey Tsalolikhin atsaloli.tech@gmail.com wrote:

...

On Tue, Oct 19, 2010 at 12:48 PM, Toby Bluhm toby.bluhm@alltechmedusa.com wrote:

...
Ext3 filesystem? Maybe altering the commit option at mount time would help:

http://www.mjmwired.net/kernel/Documentation/filesystems/ext3.txt#49

Good one, Toby! We'll try that. Thanks!

You could also reduce the dirty interval in sysctl so it flushes sooner therefore flushes less data each time.

-Ross

Aleksey Tsalolikhin

11:34 p.m.

Still seeing the glitch every 5-20 secs after remounting with "commit=6000".

On Tue, Oct 19, 2010 at 4:00 PM, Ross Walker rswwalker@gmail.com wrote:

...

You could also reduce the dirty interval in sysctl so it flushes sooner therefore flushes less data each time.

OK. It's worth a shot. Any idea what the default value is? I'm not sure what value to put in here. I know I want to reduce it but I don't want to break my friend's system either.

http://www.mjmwired.net/kernel/Documentation/sysctl/vm.txt

dirty_expire_centisecs

This tunable is used to define when dirty data is old enough to be eligible for writeout by the pdflush daemons. It is expressed in 100'ths of a second. Data which has been dirty in-memory for longer than this interval will be written out next time a pdflush daemon wakes up.

Thanks, -at

Aleksey Tsalolikhin

20 Oct

12:34 a.m.

"uname -a" shows:

Linux localhost.localdomain 2.6.18-194.17.1.el5 #1 SMP Wed Sep 29 12:50:31 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux

Ross Walker

1:21 p.m.


On Oct 19, 2010, at 7:34 PM, Aleksey Tsalolikhin atsaloli.tech@gmail.com wrote:

...

Still seeing the glitch every 5-20 secs after remounting with "commit=6000".

On Tue, Oct 19, 2010 at 4:00 PM, Ross Walker rswwalker@gmail.com wrote:

...
You could also reduce the dirty interval in sysctl so it flushes sooner therefore flushes less data each time.

OK. It's worth a shot. Any idea what the default value is? I'm not sure what value to put in here. I know I want to reduce it but I don't want to break my friend's system either.

http://www.mjmwired.net/kernel/Documentation/sysctl/vm.txt

dirty_expire_centisecs

This tunable is used to define when dirty data is old enough to be eligible for writeout by the pdflush daemons. It is expressed in 100'ths of a second. Data which has been dirty in-memory for longer than this interval will be written out next time a pdflush daemon wakes up.

There are several dirty tunables, try a 'sysctl -a | grep dirty'
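For anyone who would rather record the current values from a script before changing anything, here is a rough Python equivalent of that `sysctl -a | grep dirty`. It is a sketch that assumes the tunables are exposed as files under `/proc/sys/vm`; the function names are illustrative only. It prints whatever the running kernel reports and does not assume any particular defaults.

```python
import glob
import os

def read_dirty_tunables(vm_dir="/proc/sys/vm"):
    """Return {name: value} for every readable vm.dirty_* tunable."""
    values = {}
    for path in sorted(glob.glob(os.path.join(vm_dir, "dirty_*"))):
        try:
            with open(path) as handle:
                values[os.path.basename(path)] = int(handle.read().strip())
        except (OSError, ValueError):
            # Skip entries that cannot be read or are not plain integers.
            continue
    return values

def centisecs_to_seconds(centisecs):
    """Convert a *_centisecs tunable (hundredths of a second) to seconds."""
    return centisecs / 100.0

if __name__ == "__main__":
    for name, value in read_dirty_tunables().items():
        if name.endswith("_centisecs"):
            print(f"vm.{name} = {value} ({centisecs_to_seconds(value):g} s)")
        else:
            print(f"vm.{name} = {value}")
```

Saving this output before experimenting makes it easy to put the original values back.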

Try limiting both the amount of dirty memory to hold and the number of seconds to hold it. The defaults are way too liberal, and if you are processing a lot of data they can both expose you to extreme data loss in a system failure and bottleneck your storage during pdflush.
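One way to check whether the freezes line up with a large pile of dirty memory being written out is to sample the `Dirty` and `Writeback` counters in `/proc/meminfo` while the decoder runs. The Python sketch below is an assumption-laden illustration: the function names are invented and the half-second sampling interval is arbitrary.

```python
import time

def dirty_and_writeback_kb(meminfo_text):
    """Extract the Dirty and Writeback counters (in kB) from /proc/meminfo text."""
    wanted = {"Dirty": None, "Writeback": None}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        # Exact key match, so 'WritebackTmp' is not confused with 'Writeback'.
        if key in wanted and rest.split():
            wanted[key] = int(rest.split()[0])
    return wanted["Dirty"], wanted["Writeback"]

def watch(samples=20, interval=0.5, path="/proc/meminfo"):
    """Print timestamped Dirty/Writeback readings; interval is in seconds."""
    for _ in range(samples):
        with open(path) as handle:
            dirty, writeback = dirty_and_writeback_kb(handle.read())
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp} Dirty={dirty} kB Writeback={writeback} kB")
        time.sleep(interval)
```

Calling `watch()` during a decoding run and comparing the timestamps with the observed glitches would show whether `Dirty` climbs for several seconds and then drops sharply at the moment of each freeze.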

-Ross

JohnS

1:36 p.m.


On Tue, 2010-10-19 at 16:34 -0700, Aleksey Tsalolikhin wrote:

...

Still seeing the glitch every 5-20 secs after remounting with "commit=6000".

On Tue, Oct 19, 2010 at 4:00 PM, Ross Walker rswwalker@gmail.com wrote:

...
You could also reduce the dirty interval in sysctl so it flushes sooner therefore flushes less data each time.

OK. It's worth a shot. Any idea what the default value is? I'm not sure what value to put in here. I know I want to reduce it but I don't want to break my friend's system either.

http://www.mjmwired.net/kernel/Documentation/sysctl/vm.txt

dirty_expire_centisecs

This tunable is used to define when dirty data is old enough to be eligible for writeout by the pdflush daemons. It is expressed in 100'ths of a second. Data which has been dirty in-memory for longer than this interval will be written out next time a pdflush daemon wakes up.

Maybe add in this also:

vm.dirty_ratio = 50

Start at 0 and work your way up.

John

Lamar Owen

9:50 p.m.


On Monday, October 18, 2010 09:25:41 pm Aleksey Tsalolikhin wrote:

...

Hi. A friend of mine was doing real-time video decoding on Fedora Core 13 and he had a performance glitch (1/2 a second freeze) every 5-10 seconds. "top" showed flush-253:0 process at the moment of the freeze.

[snip]

...

He also tried CentOS 5.5, and saw a "pdflush" process popping up with the same frequency, and resulting in a similar glitch.

Any other suggestions?

What kind of hard drive is this? How is/are your drive(s) set up?

You need the iostat program (in the sysstat package, I think) to give you more detail; there are some pointers to its use in this list's archives.
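If iostat is not installed, the counters it draws on can be read directly from `/proc/diskstats`. The Python sketch below is not iostat and does not reproduce its output. It only parses the per-device write counters and computes an average time per completed write between two samples. The names are illustrative, and the field positions assume the standard 2.6-era layout (major, minor, name, then the read and write counters).

```python
from collections import namedtuple

WriteStats = namedtuple("WriteStats", "writes sectors_written ms_writing")

def parse_diskstats(text):
    """Map device name -> WriteStats from the contents of /proc/diskstats."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        # Full entries have at least 14 columns; shorter lines are skipped.
        if len(fields) < 14:
            continue
        stats[fields[2]] = WriteStats(
            writes=int(fields[7]),           # writes completed
            sectors_written=int(fields[9]),  # sectors written
            ms_writing=int(fields[10]),      # milliseconds spent writing
        )
    return stats

def average_write_ms(before, after):
    """Average milliseconds per completed write between two samples.

    Returns None when no writes completed in the interval.
    """
    completed = after.writes - before.writes
    if completed <= 0:
        return None
    return (after.ms_writing - before.ms_writing) / completed
```

Taking one sample just before a glitch and one just after it, then calling `average_write_ms` on the entries for the disk in question, gives a rough idea of how long each write waited during the stall.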

Aleksey Tsalolikhin

10:23 p.m.

Thanks, Ross, JohnS and Lamar for your kind responses. It turned out my friend is using a 7200 RPM disk for his write-lots-of-little-files activity, so we're looking at upgrading that to 15000 RPM or getting him a FusionIO memory card.
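As a back-of-the-envelope comparison of the two spindle speeds, average rotational latency is half of one revolution. The Python sketch below works only that number out. It ignores seek time, caching and queueing, so treat it as a rough illustration rather than a prediction of how much the upgrade would help.

```python
def average_rotational_latency_ms(rpm):
    """Half of one platter revolution, in milliseconds."""
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / 2.0

if __name__ == "__main__":
    for rpm in (7200, 15000):
        latency = average_rotational_latency_ms(rpm)
        print(f"{rpm} RPM: {latency:.2f} ms")
    # prints:
    # 7200 RPM: 4.17 ms
    # 15000 RPM: 2.00 ms
```

A workload that writes many small files pays this latency, plus seek time, on a large share of its writes, which is why spindle speed matters more here than it would for one large sequential file.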

Thank you!

Aleksey

On Wed, Oct 20, 2010 at 2:50 PM, Lamar Owen lowen@pari.edu wrote:

...

On Monday, October 18, 2010 09:25:41 pm Aleksey Tsalolikhin wrote:

...
Hi. A friend of mine was doing real-time video decoding on Fedora Core 13 and he had a performance glitch (1/2 a second freeze) every 5-10 seconds. "top" showed flush-253:0 process at the moment of the freeze.

[snip]

...
He also tried CentOS 5.5, and saw a "pdflush" process popping up with the same frequency, and resulting in a similar glitch.

Any other suggestions?

What kind of hard drive is this? How is/are your drive(s) set up?

You need the iostat program (in the sysstat package, I think) to give you more detail; there are some pointers to its use in this list's archives.

_______________________________________________
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos
