[CentOS] Cluster installation CentOS 7.4 network problems

Thu Nov 23 20:51:07 UTC 2017
Vadim Bulst <vadim.bulst at uni-leipzig.de>

Hi there,

After using Foreman successfully on our clusters for more than a year, I'd 
like to reinstall a 90-node cluster with CentOS 7.4. It currently runs 
CentOS 7.3. I can't simply update to 7.4 because of zfsonlinux 
dependencies, and some nodes died and had to be reinstalled on bare metal anyway.

I was able to install these nodes successfully by PXE-booting, using a 
regular CentOS mirror with Foreman and kickstart. After the final reboot, 
however, the nodes had no network connection at all, so Puppet couldn't 
pull, of course. After logging in locally and restarting NetworkManager, 
the connection came up, sometimes on the first try, sometimes on the 
second. I never saw this behavior with CentOS 7.3 or 7.2.
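Until the root cause is found, the manual workaround could be automated at first boot. This is only a sketch under my assumptions (the "no default route" check stands in for "no network connection"); the demonstration writes to a temporary file, while on a real node the target would be /etc/rc.d/rc.local:

```shell
#!/bin/sh
# Hypothetical stop-gap: restart NetworkManager once at boot when DHCP did
# not produce a default route, mirroring the manual workaround. For this
# demonstration we write to a temp file instead of /etc/rc.d/rc.local.
target=$(mktemp)
cat > "$target" <<'EOF'
#!/bin/sh
# if DHCP did not produce a default route, kick NetworkManager once
ip route | grep -q '^default' || systemctl restart NetworkManager
EOF
chmod +x "$target"
head -n1 "$target"
rm -f "$target"
```

Obviously this only papers over the race; the interesting question is why NetworkManager needs the restart at all on 7.4.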

Network properties:

DHCP, MTU 9000

DHCP-Server not Foreman managed, on different network

TFTP-Server Foreman managed, on different network
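One thing worth checking is what anaconda wrote into the ifcfg files: if ONBOOT ended up "no", or the MTU was not persisted, the symptoms would look similar. The following is a sketch of a possible %post fix under that assumption, demonstrated on a temporary copy; on a node the directory would be /etc/sysconfig/network-scripts:

```shell
#!/bin/sh
# Sketch, assuming anaconda on 7.4 writes ONBOOT=no (or drops the MTU) into
# the generated ifcfg files. Demonstrated on a temp dir; in a real %post the
# loop would run over /etc/sysconfig/network-scripts/ifcfg-*.
dir=$(mktemp -d)
cat > "$dir/ifcfg-eno1" <<'EOF'
DEVICE=eno1
BOOTPROTO=dhcp
ONBOOT=no
EOF
for cfg in "$dir"/ifcfg-*; do
  sed -i 's/^ONBOOT=.*/ONBOOT=yes/' "$cfg"              # force interface up at boot
  grep -q '^MTU=' "$cfg" || echo 'MTU=9000' >> "$cfg"   # persist jumbo frames
done
cat "$dir/ifcfg-eno1"
rm -rf "$dir"
```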

I've read one thread on Stack Exchange which describes a similar problem 
with a kickstart installation and DHCP network configuration on CentOS 7.4.


Has anybody else seen similar problems?

This is what my provisioning template / kickstart template looks like:

url --url http://mirror.centos.org/centos/7.4.1708/os/x86_64 
lang en_US.UTF-8
selinux --enforcing
keyboard de

network --bootproto dhcp --hostname galaxy110.sc.uni-leipzig.de 
rootpw --iscrypted foo
firewall --service=ssh
authconfig --useshadow --passalgo=SHA256 --kickstart
timezone --utc Europe/Berlin
services --disabled 

bootloader --location=mbr --append="nofb quiet splash=quiet"

clearpart --initlabel --all
ignoredisk --only-use=sda
part biosboot --size 1 --fstype=biosboot --asprimary
part / --fstype=xfs --size=20480 --asprimary --ondisk=sda
part swap --size=131072 --ondisk=sda
part /var/log --fstype=xfs --size=10240 --ondisk=sda
part /home --fstype=xfs --size=10240 --grow --ondisk=sda



%post --nochroot
exec < /dev/tty3 > /dev/tty3
#changing to VT 3 so that we can see what's going on....
/usr/bin/chvt 3
(
cp -va /etc/resolv.conf /mnt/sysimage/etc/resolv.conf
/usr/bin/chvt 1
) 2>&1 | tee /mnt/sysimage/root/install.postnochroot.log
%end

%post
logger "Starting anaconda galaxy110.sc.uni-leipzig.de postinstall"
exec < /dev/tty3 > /dev/tty3
#changing to VT 3 so that we can see whats going on....
/usr/bin/chvt 3
(

#update local time
echo "updating system time"
/usr/sbin/ntpdate -sub
/usr/sbin/hwclock --systohc

# Yum proxy
echo 'proxy = http://proxy.uni-leipzig.de:3128' >> /etc/yum.conf

rpm -Uvh --httpproxy proxy.uni-leipzig.de --httpport 3128 

# update all the base packages from the updates repository
if [ -f /usr/bin/dnf ]; then
   dnf -y update
else
   yum -t -y update
fi

# SSH keys setup snippet for Remote Execution plugin
# Parameters:
# remote_execution_ssh_keys: public keys to be put in ~/.ssh/authorized_keys
# remote_execution_ssh_user: user for which remote_execution_ssh_keys will be
#                            authorized
# remote_execution_create_user: create the user if it does not already exist
# remote_execution_effective_user_method: method to switch from ssh user to
#                                         effective user
# This template sets up SSH keys in any host so that as long as your public
# SSH key is in remote_execution_ssh_keys, you can SSH into a host. This 
# works in combination with Remote Execution plugin.

# The Remote Execution plugin queries smart proxies to build the
# remote_execution_ssh_keys array which is then made available to this
# template via the host's parameters. There is currently no way of supplying this
# parameter manually.
# See http://projects.theforeman.org/issues/16107 for details.

rpm -Uvh --httpproxy proxy.uni-leipzig.de --httpport 3128 

if [ -f /usr/bin/dnf ]; then
   dnf -y install puppet-agent
else
   yum -t -y install puppet-agent
fi

cat > /etc/puppetlabs/puppet/puppet.conf << EOF
[main]
vardir = /opt/puppetlabs/puppet/cache
logdir = /var/log/puppetlabs/puppet
rundir = /var/run/puppetlabs
ssldir = /etc/puppetlabs/puppet/ssl

[agent]
pluginsync      = true
report          = true
ignoreschedules = true
ca_server       = urzlxdeploy.rz.uni-leipzig.de
certname        = galaxy110.sc.uni-leipzig.de
environment     = production
server          = urzlxdeploy.rz.uni-leipzig.de
EOF

# enable whichever puppet unit exists (some packages ship 'puppetagent')
puppet_unit=puppet
/usr/bin/systemctl list-unit-files | grep -q puppetagent && puppet_unit=puppetagent
/usr/bin/systemctl enable ${puppet_unit}
/sbin/chkconfig --level 345 puppet on

# export a custom fact called 'is_installer' to allow detection of the
# installer environment in Puppet modules
export FACTER_is_installer=true
# passing a non-existent tag like "no_such_tag" to the puppet agent only
# initializes the node
/opt/puppetlabs/bin/puppet agent --config /etc/puppetlabs/puppet/puppet.conf \
  --onetime --tags no_such_tag --server urzlxdeploy.rz.uni-leipzig.de --no-daemonize


# Inform the build system that we are done.
echo "Informing Foreman that we are built"
wget -q -O /dev/null --no-check-certificate 
) 2>&1 | tee /root/install.post.log
exit 0
%end


Thanks in advance for your suggestions.



Vadim Bulst

Universität Leipzig / URZ
04109  Leipzig, Augustusplatz 10

phone: +49-341-97-33380
mail:    vadim.bulst at uni-leipzig.de