[CentOS] GFS performance issue

Mon Jul 19 15:41:07 UTC 2010
Fred Wittekind <rom at twister.dyndns.org>

I have two web servers, both virtualized, with CentOS Xen servers as
hosts (the two guests reside on different physical servers).
GFS is used to store the home directories, which contain the web
document roots.

The shared block device used by GFS is an iSCSI target; the iSCSI
initiator resides on the Dom-0, and the device is presented to the
Dom-U web servers as drives.
A second shared block device is provided for the quorum disk.

If I hit the web site on just one of the nodes, it behaves as expected.
If I try to load the web site from both nodes at the same time (two web
browser instances), the load average on both nodes spikes, and pages
load very slowly.  The site I am trying to host is very high traffic,
and if the servers can nearly be brought to their knees by two web
browser instances running on a single workstation, that's not going to work.

I am not seeing any error messages in the logs regarding the cluster.

Any help or suggestions on how to troubleshoot this issue would be
greatly appreciated.


[root at www3 www]# find ./ | wc -l
64815

[root at www3 www]# gfs2_tool df /home/www
/home/www:
  SB lock proto = "lock_dlm"
  SB lock table = "Web:homewww"
  SB ondisk format = 1801
  SB multihost format = 1900
  Block size = 4096
  Journals = 2
  Resource Groups = 316
  Mounted lock proto = "lock_dlm"
  Mounted lock table = "Web:homewww"
  Mounted host data = "jid=1:id=393217:first=0"
  Journal number = 1
  Lock module flags = 0
  Local flocks = FALSE
  Local caching = FALSE

  Type           Total Blocks   Used Blocks    Free Blocks    use%
  ------------------------------------------------------------------------
  data           20707148       4347237        16359911       21%
  inodes         16426386       66475          16359911       0%
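
For scale, the block counts in the "data" row above can be converted to
sizes by hand. This is only a rough sketch, assuming bash with 64-bit
arithmetic; the variable names are illustrative and do not come from any
GFS tool. It works out to roughly 79 GiB total, about 21% used.

```shell
#!/bin/bash
# Figures copied from the "data" row of the gfs2_tool df output above.
block_size=4096
total_blocks=20707148
used_blocks=4347237

# Convert block counts to MiB (integer division, so values round down).
total_mib=$(( total_blocks * block_size / 1024 / 1024 ))
used_mib=$(( used_blocks * block_size / 1024 / 1024 ))

# Percentage used, rounded to the nearest whole percent.
use_pct=$(( (used_blocks * 100 + total_blocks / 2) / total_blocks ))

echo "total: ${total_mib} MiB, used: ${used_mib} MiB, use: ${use_pct}%"
```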

[root at www3 www]# gfs2_tool gettune /home/www
new_files_directio = 0
new_files_jdata = 0
quota_scale = 1.0000   (1, 1)
logd_secs = 1
recoverd_secs = 60
statfs_quantum = 30
stall_secs = 600
quota_cache_secs = 300
quota_simul_sync = 64
statfs_slow = 0
complain_secs = 10
max_readahead = 262144
quota_quantum = 60
quota_warn_period = 10
jindex_refresh_secs = 60
log_flush_secs = 60
incore_log_blocks = 1024

[root at www3 www]# cat /etc/cluster/cluster.conf |egrep '(dlm)|(gfs)'
        <dlm plock_ownership="1" plock_rate_limit="0"/>
        <gfs_controld plock_rate_limit="0"/>


Fred Wittekind