[CentOS] Cluster installation CentOS 7.4 network problems

Thu Nov 23 20:51:07 UTC 2017
Vadim Bulst <vadim.bulst at uni-leipzig.de>

Hi there,

After using Foreman successfully on our clusters for more than a year,
I'd like to reinstall a 90-node cluster with CentOS 7.4. It is currently
running CentOS 7.3. I can't simply update to 7.4 because of zfsonlinux
dependencies, and besides, some nodes died and had to be reinstalled on
bare metal.

I was able to install these nodes successfully by PXE-booting them and
using a regular CentOS mirror with Foreman and kickstart. After the
final reboot, however, the nodes had no network connection at all, so
Puppet of course wasn't able to pull. After logging in locally and
restarting NetworkManager, the connection came up, sometimes on the
first try and sometimes on the second. I never saw this behavior with
CentOS 7.3 or 7.2.
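
For debugging, the manual workaround described above (restart
NetworkManager, possibly more than once) could be scripted roughly as
follows. This is only a sketch and not a fix for the underlying problem:
the helper names `retry`, `has_default_route` and `restart_and_check`,
the attempt count, the delays, and the choice of "an IPv4 default route
exists" as the test for a working connection are all assumptions made
for illustration.

```shell
#!/bin/sh
# retry MAX DELAY CMD...: run CMD until it succeeds, at most MAX times,
# sleeping DELAY seconds between attempts.
# Returns 0 as soon as CMD succeeds, 1 if every attempt failed.
retry() {
    max=$1; delay=$2; shift 2
    n=1
    while [ "$n" -le "$max" ]; do
        "$@" && return 0
        n=$((n + 1))
        # Only sleep if another attempt will follow.
        [ "$n" -le "$max" ] && sleep "$delay"
    done
    return 1
}

# Assumed "network is up" check: an IPv4 default route is present.
has_default_route() {
    ip -4 route show default 2>/dev/null | grep -q .
}

# One attempt: restart NetworkManager, give DHCP a moment, then check.
restart_and_check() {
    systemctl restart NetworkManager || return 1
    sleep 5
    has_default_route
}

# Usage on an affected node (as root):
#   retry 3 10 restart_and_check && echo "network up" || echo "still down"
```

Logging how many attempts were needed on each node might help to tell
whether the failure is timing-related.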

Network properties:

DHCP, MTU 9000

DHCP-Server not Foreman managed, on different network

TFTP-Server Foreman managed, on different network
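
Since the network uses an MTU of 9000, it may be worth comparing the MTU
the interfaces actually have in the failed state and after the
NetworkManager restart. Below is a minimal sketch that reads the values
from sysfs. The function names `iface_mtu` and `report_mtus` and the
optional sysfs root argument (present only so the functions can be tried
without real hardware) are assumptions; whether the MTU is related to
the problem at all is not established here.

```shell
#!/bin/sh
# iface_mtu IFACE [SYSFS_ROOT]: print the current MTU of IFACE.
iface_mtu() {
    cat "${2:-/sys/class/net}/$1/mtu"
}

# report_mtus EXPECTED [SYSFS_ROOT]: print "name mtu ok|MISMATCH" for
# every interface found under SYSFS_ROOT (default: /sys/class/net).
# Note: the loopback interface will normally be flagged, which is expected.
report_mtus() {
    expected=$1; root=${2:-/sys/class/net}
    for dir in "$root"/*; do
        [ -r "$dir/mtu" ] || continue
        name=${dir##*/}
        mtu=$(iface_mtu "$name" "$root")
        if [ "$mtu" = "$expected" ]; then
            echo "$name $mtu ok"
        else
            echo "$name $mtu MISMATCH"
        fi
    done
}

# Usage on a node:
#   report_mtus 9000
```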


I've read one thread on Stack Exchange that describes a similar problem
with a kickstart installation and DHCP network configuration on
CentOS 7.4:

https://unix.stackexchange.com/questions/396096/centos-7-network-service-failed-to-start-because-systemd-starts-the-daemon-too


Has anybody here seen similar problems?

This is what my provisioning template / kickstart template looks like:



install
url --url http://mirror.centos.org/centos/7.4.1708/os/x86_64 --proxy=http://proxy.uni-leipzig.de:3128
lang en_US.UTF-8
selinux --enforcing
keyboard de
skipx

network --bootproto dhcp --hostname galaxy110.sc.uni-leipzig.de --device=somemacaddress
rootpw --iscrypted foo
firewall --service=ssh
authconfig --useshadow --passalgo=SHA256 --kickstart
timezone --utc Europe/Berlin
services --disabled gpm,sendmail,cups,pcmcia,isdn,rawdevices,hpoj,bluetooth,openibd,avahi-daemon,avahi-dnsconfd,hidd,hplip,pcscd




bootloader --location=mbr --append="nofb quiet splash=quiet"


zerombr
clearpart --initlabel --all
ignoredisk --only-use=sda
part biosboot --size 1 --fstype=biosboot --asprimary
part / --fstype=xfs --size=20480 --asprimary --ondisk=sda
part swap --size=131072 --ondisk=sda
part /var/log --fstype=xfs --size=10240 --ondisk=sda
part /home --fstype=xfs --size=10240 --grow --ondisk=sda




text
reboot

%packages
yum
dhclient
ntp
wget
@Core
redhat-lsb-core
%end

%post --nochroot
exec < /dev/tty3 > /dev/tty3
# changing to VT 3 so that we can see what's going on....
/usr/bin/chvt 3
(
cp -va /etc/resolv.conf /mnt/sysimage/etc/resolv.conf
/usr/bin/chvt 1
) 2>&1 | tee /mnt/sysimage/root/install.postnochroot.log
%end
%post
logger "Starting anaconda galaxy110.sc.uni-leipzig.de postinstall"
exec < /dev/tty3 > /dev/tty3
# changing to VT 3 so that we can see what's going on....
/usr/bin/chvt 3
(




#update local time
echo "updating system time"
/usr/sbin/ntpdate -sub 139.18.1.2
/usr/sbin/hwclock --systohc

# Yum proxy
echo 'proxy = http://proxy.uni-leipzig.de:3128' >> /etc/yum.conf

rpm -Uvh --httpproxy proxy.uni-leipzig.de --httpport 3128 https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm


# update all the base packages from the updates repository
if [ -f /usr/bin/dnf ]; then
   dnf -y update
else
   yum -t -y update
fi


# SSH keys setup snippet for Remote Execution plugin
#
# Parameters:
#
# remote_execution_ssh_keys: public keys to be put in ~/.ssh/authorized_keys
#
# remote_execution_ssh_user: user for which remote_execution_ssh_keys will be
#                            authorized
#
# remote_execution_create_user: create user if it does not already exist
#
# remote_execution_effective_user_method: method to switch from ssh user to
#                                         effective user
#
# This template sets up SSH keys in any host so that as long as your public
# SSH key is in remote_execution_ssh_keys, you can SSH into a host. This only
# works in combination with Remote Execution plugin.

# The Remote Execution plugin queries smart proxies to build the
# remote_execution_ssh_keys array which is then made available to this template
# via the host's parameters. There is currently no way of supplying this
# parameter manually.
# See http://projects.theforeman.org/issues/16107 for details.






rpm -Uvh --httpproxy proxy.uni-leipzig.de --httpport 3128 https://yum.puppetlabs.com/puppetlabs-release-pc1-el-7.noarch.rpm




if [ -f /usr/bin/dnf ]; then
   dnf -y install puppet-agent
else
   yum -t -y install puppet-agent
fi

cat > /etc/puppetlabs/puppet/puppet.conf << EOF


[main]
vardir = /opt/puppetlabs/puppet/cache
logdir = /var/log/puppetlabs/puppet
rundir = /var/run/puppetlabs
ssldir = /etc/puppetlabs/puppet/ssl

[agent]
pluginsync      = true
report          = true
ignoreschedules = true
ca_server       = urzlxdeploy.rz.uni-leipzig.de
certname        = galaxy110.sc.uni-leipzig.de
environment     = production
server          = urzlxdeploy.rz.uni-leipzig.de

EOF

puppet_unit=puppet
/usr/bin/systemctl list-unit-files | grep -q puppetagent && puppet_unit=puppetagent
/usr/bin/systemctl enable ${puppet_unit}
/sbin/chkconfig --level 345 puppet on

# export a custom fact called 'is_installer' to allow detection of the
# installer environment in Puppet modules
export FACTER_is_installer=true
# passing a non-existent tag like "no_such_tag" to the puppet agent only
# initializes the node
/opt/puppetlabs/bin/puppet agent \
    --config /etc/puppetlabs/puppet/puppet.conf \
    --onetime --tags no_such_tag \
    --server urzlxdeploy.rz.uni-leipzig.de --no-daemonize






sync

# Inform the build system that we are done.
echo "Informing Foreman that we are built"
wget -q -O /dev/null --no-check-certificate http://urzlxdeploy.rz.uni-leipzig.de/unattended/built
) 2>&1 | tee /root/install.post.log
exit 0

%end

Thanks in advance for your suggestions.

Cheers,

Vadim


-- 
Vadim Bulst

Universität Leipzig / URZ
04109  Leipzig, Augustusplatz 10

phone: +49-341-97-33380
mail:    vadim.bulst at uni-leipzig.de