Re: [Ci-users] Changes to CentOS CI: reminder of Phase 1 and 2

22 Aug 2022


On 19/08/2022 15:31, František Šumšal wrote:
...
Hey,
On 8/19/22 14:23, Camila Granella wrote:
...
Hello!
I understand that the metal machines are expensive, and I'm not 
sure how many other projects are eventually going to migrate over to 
them, but I guess in the future some balance will need to be found 
between the cost and the number of available metal nodes. Is this even 
up for discussion, or is the size of the metal pools fixed and 
can't/won't be adjusted?
We're looking to optimize resource usage with the recent changes to 
CentOS CI. From our side, the goal is to find a balance between 
adjusting to tenants' needs (there are adaptations we could make to have 
more nodes available, at the cost of higher resource consumption) and 
adjusting projects' workflows to use EC2.
I'd appreciate your suggestions on how to make workflows more 
adaptable to EC2.
The main blocker for many projects is that EC2 VMs don't support nested 
virtualization, which is really unfortunate, since using the EC2 metal 
machines is indeed a "bit" overkill in many scenarios (ours included). I 
spent a week playing with various approaches to avoid this requirement, 
but failed (in our case it would be running the VMs with TCG instead of 
KVM, but that makes the tests flaky/unreliable in many cases, and some 
of them run for several hours with this change).
Going through many online resources just confirms this - EC2 VMs don't 
support nested virt[0], which is sad, since, for example, Microsoft's 
Azure apparently supports it[1][2] (and Google's Compute Engine 
apparently supports it as well from a quick lookup).
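To illustrate the KVM-versus-TCG fallback described above, here is a minimal sketch of how a test harness might pick the QEMU accelerator at runtime. The helper names `pick_accelerator` and `qemu_accel_args` are hypothetical and not taken from any project in this thread; the only external facts assumed are that KVM is exposed as `/dev/kvm` and that QEMU accepts `-accel kvm` and `-accel tcg`.

```python
import os


def pick_accelerator(kvm_path="/dev/kvm"):
    """Return "kvm" if the KVM device is usable by this process, else "tcg".

    Hypothetical helper: on an EC2 VM without nested virt the device is
    normally absent, so this falls back to (much slower) TCG emulation.
    """
    if os.path.exists(kvm_path) and os.access(kvm_path, os.R_OK | os.W_OK):
        return "kvm"
    return "tcg"


def qemu_accel_args(kvm_path="/dev/kvm"):
    """Build the accelerator part of a QEMU command line."""
    return ["-accel", pick_accelerator(kvm_path)]
```

A harness could also use the result to scale its timeouts, since, as noted above, TCG runs can take several hours and tend to be flaky.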
I'm not really sure if there's an easy solution for this (if any). I'm 
at least trying to spread the workload on the machine "to the limits" to 
utilize as much of the metal resources as possible, which shortens the 
runtime of each job quite considerably, but even that's not ideal 
(resource-wise).
As I mentioned on IRC, maybe having Duffy changing the pool size 
dynamically based on the demand for the past hour or so would help with 
the overall balance (to avoid wasting resources in "quiet periods"), but 
that's just an idea off the top of my head; I'm not sure how feasible it 
is or if it even makes sense.
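As a rough illustration of the demand-based sizing idea above, the following sketch estimates a target pool size from the requests seen in the past hour. This is not Duffy's API or configuration: the function `target_pool_size` and all of its parameters (window length, average session duration, headroom factor, size bounds) are assumptions made up for the example.

```python
import math


def target_pool_size(request_times, now, window_s=3600.0,
                     avg_session_s=1800.0, headroom=1.25,
                     min_size=1, max_size=8):
    """Estimate how many metal nodes to keep ready (hypothetical policy).

    request_times: timestamps (seconds) of past node requests.
    now: current timestamp (seconds).
    """
    # Only consider requests inside the look-back window (past hour by default).
    recent = [t for t in request_times if now - window_s <= t <= now]
    # Average arrival rate times average session length gives the expected
    # number of nodes in use at any moment.
    arrival_rate = len(recent) / window_s
    expected_busy = arrival_rate * avg_session_s
    # Add some headroom, then clamp to the allowed pool bounds.
    wanted = math.ceil(expected_busy * headroom)
    return max(min_size, min(max_size, wanted))
```

With these assumed defaults, a quiet hour shrinks the pool to `min_size`, while a busy hour grows it only up to `max_size`, which keeps the cost bounded.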
Yes, it was always communicated that default EC2 instances don't 
support nested virt, as one requests a cloud VM, not a hypervisor :)
It was only just before migrating to EC2 that we saw it was possible to 
deploy bare-metal options on the AWS side, but at a higher cost 
(obviously) than traditional EC2 instances (VMs).
Can you explain why you'd need a hypervisor instead of VMs? I 
guess that troubleshooting comes to mind (`virsh console` to the rescue, 
which isn't possible when the EC2 instance is itself a VM)?
-- 
Fabian Arrotin
The CentOS Project | https://www.centos.org
gpg key: 17F3B7A1 | twitter: @arrfab
