Hi,
As discussed with Attila this morning, there have been quite a few issues on the rdo-jenkins.ci.centos.org VM over the last few days (leading to the jenkins slave not running, and so no jobs running either). There were a lot of xfs issues on the /home partition, and running xfs_repair didn't seem to help. As agreed with Attila, I deployed a rdo-jenkins-2.ci.centos.org VM (on the same hypervisor) with reduced vcpu/memory so that he can configure it (through the RDO ansible playbooks) and migrate whatever needs to be migrated (probably the jenkins workspace on that node).
Once we have confirmation that it works fine, we'll shut down the previous VM and change the vcpu/memory settings on the newly configured VM.
On 09/16/2016 02:18 PM, Fabian Arrotin wrote:
> Hi,
> As discussed with Attila this morning, there have been quite a few issues on the rdo-jenkins.ci.centos.org VM over the last few days (leading to the jenkins slave not running, and so no jobs running either). There were a lot of xfs issues on the /home partition, and running xfs_repair didn't seem to help. As agreed with Attila, I deployed a rdo-jenkins-2.ci.centos.org VM (on the same hypervisor) with reduced vcpu/memory so that he can configure it (through the RDO ansible playbooks) and migrate whatever needs to be migrated (probably the jenkins workspace on that node).
> Once we have confirmation that it works fine, we'll shut down the previous VM and change the vcpu/memory settings on the newly configured VM.
First, thanks to Fabian for the awesome and quick help.
What happened from my side so far:
- copied the keys and .ssh dir from /home/rhos-ci to the new slave in /home/rdo-ci (setting this slave up with that user to be less confusing)
- added the crontab entries from the old slave to start the java slave client automatically
- installed java jdk and ran quickstart.sh --install-deps to add the necessary packages to the system
- reduced the executors on the slave from 13 to 8 so as not to overload the slave with the smaller VCPU/memory count
I started a periodic job on it and it seems to be running fine, so I added back all the regular labels to it and now it's running jobs.
David should probably install a few more things to make the weirdo jobs work as well.
The user ssh key stayed the same, but now it's for rdo-ci instead of rhos-ci, and the host is rdo-jenkins-2.ci.centos.org.
Attila
I had to install some other things that were expected on the slave (i.e., pip, tox, cicoclient) but it was otherwise okay.
I added the slave to our monitoring. When are we planning to bump the resources on the new slave? I brought up the three cloud slaves (4 threads each) in order to clear our growing job queue.
Also, can we keep the old slave around for a while longer, on a "just in case" basis? Barebones 1 core/2GB RAM should be plenty, just to make sure we don't forget anything.
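For anyone rebuilding a slave like this later, the tools named in this thread as expected on it (a Java runtime, pip, tox and cicoclient) could be checked with something like the following. This is only a rough sketch and not something that was run here: the command names are assumptions (in particular that cicoclient is reachable as "cico"), and the helper name missing_commands is made up for illustration.

```python
import shutil

# Command names assumed to correspond to the tools named in this thread.
# "cico" is assumed to be the command installed by cicoclient.
EXPECTED_COMMANDS = ["java", "pip", "tox", "cico"]


def missing_commands(commands, which=shutil.which):
    """Return the entries of `commands` that cannot be found on PATH."""
    return [name for name in commands if which(name) is None]


if __name__ == "__main__":
    missing = missing_commands(EXPECTED_COMMANDS)
    if missing:
        print("missing on this slave: " + ", ".join(missing))
    else:
        print("all expected commands found")
```

Run it as the user the jobs run under, so that the PATH being checked is the one the jobs will actually see.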
David Moreau Simard Senior Software Engineer | Openstack RDO
dmsimard = [irc, github, twitter]
On 16/09/16 19:38, David Moreau Simard wrote:
> I had to install some other things that were expected on the slave (i.e., pip, tox, cicoclient) but it was otherwise okay.
> I added the slave to our monitoring. When are we planning to bump the resources on the new slave? I brought up the three cloud slaves (4 threads each) in order to clear our growing job queue.
> Also, can we keep the old slave around for a while longer, on a "just in case" basis? Barebones 1 core/2GB RAM should be plenty, just to make sure we don't forget anything.
I can do that whenever you want: just shut down the old box (but not delete the underlying disks yet), and bump memory/vcpu on the new slave. When can I proceed?
Thanks for the quick turnaround on this, Fabian.
--
Karanbir Singh, Project Lead, The CentOS Project
+44-207-0999389 | http://www.centos.org/ | twitter.com/CentOS
GnuPG Key : http://www.karan.org/publickey.asc