[CentOS] db2 is running out of shared memory

Mon Mar 30 15:32:19 UTC 2015
Harold Pritchett <harold at uga.edu>

Db2 is running our of shared memory.

The environment:
      Virtual computers running under VMware ESXi hypervisor version 5.5.
Two different virtual machines, each running CentOS 5, db2 express C,
and Apache/PHP and vmware tools.  The machines each have 8 GB of memory,
4 processors, and a couple of hundred GB of disk space available.
Everything is running on the 64 bit versions. The actual hardware for VMware
is a dual processor machine with two Xeon E5410 processors, each with four
cpu cores and 32 GB of ram.

The difference in the two virtual machines is that one is running db2
version 9.7.1 and the other is running db2 version 10.5.5.

The error occurs while restoring a backup taken on another server to the test
server in question.  Worse than that, it occurs when restoring a second
database after the first restore works OK.

The error message we are getting is this:
     On the 9.7 system:
RESTORE DATABASE XYZZY USER db2inst9 using
     FROM '/db2home/restore1' INTO XYZZZ REPLACE
     EXISTING WITHOUT ROLLING FORWARD WITHOUT PROMPTING
     SQL1084C  Shared memory segments cannot be allocated. SQLSTATE=57019

     On the 10.5 system:
RESTORE DATABASE XYZZY USER db2inst9 using
      FROM '/db2home/restore1' INTO XYZZZ REPLACE
     EXISTING WITHOUT ROLLING FORWARD WITHOUT PROMPTING
     SQL1084C  The database manager failed to allocate
     shared memory because an`operating system kernel
     memory limit has been reached.  SQLSTATE=57019

Everything I have read online says to increas the value of Shared Memory Size
in the /etc/sysctl.conf file.  However, on these systems it is already set to
a value eight times larger than the actual memory size.  The command "ipcs -l"
give:

[root at cowtest etc]# ipcs -l

------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 67108864
max total shared memory (kbytes) = 17179869184
min seg size (bytes) = 1

------ Semaphore Limits --------
max number of arrays = 1024
max semaphores per array = 250
max semaphores system wide = 256000
max ops per semop call = 32
semaphore max value = 32767

------ Messages: Limits --------
max queues system wide = 2048
max size of message (bytes) = 65536
default max size of queue (bytes) = 65536

[root at cowtest etc]#

the file /etc/sysctl.conf has the following in it

# Controls the maximum size of a message, in bytes
kernel.msgmax = 65536

# Controls the maximum shared segment size, in bytes
kernel.shmmax = 68719476736

# Controls the maximum number of shared memory segments, in pages
kernel.shmall = 4294967296

These numbers just don't make sense.  kernel.shmmax is set to 64 GB, 8 times
the actual memory of the machine.  the value of "Total Max Shared Memory"
from the ipcs command is 16 TB!

I have other machine running on real hardware with only 6 GB of memory on
which this works fine.  What is so strange about the virtual machines?

I'm posting this to both the db2 and CentOS mailing lists.  Sorry for the
duplication if you are subscribed to both.

Harold

Attached below is the complete output of the backup.restore job which ran on
the db2 10.5 machine. The backup was taken on a db2 9.7 machine, hence the
message about database upgraded to current db2 version.

backup.restore 1.0.0 - submitted Mon Mar 30 05:45:01 EDT 2015 on cowtest
waiting...  1  05:45:01
waiting...  2  05:46:01
waiting...  3  05:47:01
waiting...  4  05:48:01
waiting...  5  05:49:03
backup.restore - started Mon Mar 30 05:50:03 EDT 2015 on cowtest
File to be restored is: /db2home/www/daily/ANIMALS.0.db2inst9.NODE0000.CATN0000.20150330030114.001.bz2
Testing file to be restored for integrity at Mon Mar 30 05:50:03 EDT 2015
unzipping backup file
------------ /db2home/db2inst9/archivelogs/animals ------------
total 0
------------ /db2home/db2inst9/archivelogs/animals ------------
SQL1064N  DB2STOP processing was successful.
Waiting for 1 minute...
SQL1063N  DB2START processing was successful.
ATTACH TO db2inst9

    Instance Attachment Information

  Instance server        = DB2/LINUXX8664 10.5.5
  Authorization ID       = DB2INST9
  Local instance alias   = DB2INST9


DEACTIVATE DATABASE ANIMALS
DB20000I  The DEACTIVATE DATABASE command completed successfully.

RESTORE DATABASE ANIMALS USER db2inst9 using
FROM '/db2home/restore1' REPLACE EXISTING WITHOUT
ROLLING FORWARD WITHOUT PROMPTING
SQL2555I  The database was restored and then successfully upgraded to the
current DB2 release where you issued the RESTORE DATABASE command.

CONNECT to animals user db2inst9 using

    Database Connection Information

  Database server        = DB2/LINUXX8664 10.5.5
  SQL authorization ID   = DB2INST9
  Local database alias   = ANIMALS


SELECT count(bhid) from is3.dna

1
-----------
         220

   1 record(s) selected.


DETACH
DB20000I  The DETACH command completed successfully.

DISCONNECT CURRENT
DB20000I  The SQL DISCONNECT command completed successfully.

ACTIVATE DATABASE ANIMALS
DB20000I  The ACTIVATE DATABASE command completed successfully.

TERMINATE
DB20000I  The TERMINATE command completed successfully.

ATTACH TO db2inst9

    Instance Attachment Information

  Instance server        = DB2/LINUXX8664 10.5.5
  Authorization ID       = DB2INST9
  Local instance alias   = DB2INST9


RESTORE DATABASE ANIMALS USER db2inst9 using
FROM '/db2home/restore1' INTO VERIFY REPLACE EXISTING
WITHOUT ROLLING FORWARD WITHOUT PROMPTING
SQL1084C  The database manager failed to allocate shared memory because an
operating system kernel memory limit has been reached. SQLSTATE=57019

connect to verify user db2inst9 using
SQL1084C  The database manager failed to allocate shared memory because an
operating system kernel memory limit has been reached. SQLSTATE=57019

select count(bhid) from is3.dna
SQL1024N  A database connection does not exist. SQLSTATE=08003

DETACH
DB20000I  The DETACH command completed successfully.

TERMINATE
DB20000I  The TERMINATE command completed successfully.

system is cowtest
SQL1064N  DB2STOP processing was successful.
SQL1063N  DB2START processing was successful.
DB20000I  The ACTIVATE DATABASE command completed successfully.
------------ /db2home/db2inst9/archivelogs/20150330 ------------
total 0
------------ /db2home/db2inst9/archivelogs/20150330 ------------
backup.complete file has already been removed...
backup.restore - ended Mon Mar 30 07:44:52 EDT 2015 on cowtest