Hello,
While the Cloud SIG is still being established, let's get to actual work and think of a set of features for a CentOS cloud template. I am referring here to VMs, not containers (e.g. docker).
This is how I see it so far; please feel free to chime in with suggestions/comments/questions.
A - Single partition for simplicity (and lack of good arguments against it); dracut-modules-growroot included so the template partition will expand to match the target, with cloud-init in charge of resize2fs
B - To swap or not to swap?
C - "tuned-adm profile virtual-host", which translates to:
- kernel.sched_min_granularity_ns 10ms
- kernel.sched_wakeup_granularity_ns 15ms
- vm.dirty_ratio 40%
- vm.swappiness 30
- I/O scheduler "deadline"
- fs barriers off
- CPU governor "performance"
- disk readahead 4x
D - tso and gso off on the network interfaces http://s.nux.ro/gsotso
E - network interface remapping (75-persistent-net-generator.rules, BZ 912801)
F - SELinux on. Do we relabel for uniqueness? I've seen small VMs run out of memory while relabelling...
G - PERSISTENT_DHCLIENT="1" (BZ 1011013)
H - Bundle all the paravirt drivers in the ramdisk (virtio/xen/vmware/hyperv) so the same image can boot everywhere?
I - Per "stack" requirements (e.g. CloudStack relies a lot on root user and password logins; OpenStack tends not to, favouring SSH-key-only logins; etc.)
That's about all that crosses my mind for now.
Thoughts?
Lucian
On Tue, Apr 8, 2014 at 2:24 PM, Nux! nux@li.nux.ro wrote:
Hello,
While the Cloud SIG is still being established, let's get to actual work and think of a set of features for a CentOS cloud template. I am referring here to VMs, not containers (e.g. docker).
This is how I see it so far; please feel free to chime in with suggestions/comments/questions.
A - Single partition for simplicity (and lack of good arguments against it)
I was wondering about LVM. It makes reconfiguration much easier (like adding swap). But growroot doesn't support LVM.
- dracut-modules-growroot included so the template partition will
expand to match target, cloud-init in charge of resize2fs
Only required for kernel < 3.8. Later kernels can do online partition resizing (handled by cloud-init post initrd).
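For that newer-kernel path, the cloud-init side is just configuration. A minimal cloud-config sketch (module names as in cloud-init's growpart/resizefs support; exact defaults may vary by cloud-init version):

```yaml
#cloud-config
# Grow the root partition to fill the virtual disk on boot...
growpart:
  mode: auto
  devices: ['/']
# ...then resize the root filesystem to match.
resize_rootfs: true
```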
B - To swap or not to swap?
Some service providers charge for disk IOs and nobody wants to pay for swap activity, so I vote against swap.
C - "tuned-adm profile virtual-host", which translates to:
- kernel.sched_min_granularity_ns 10ms
- kernel.sched_wakeup_granularity_ns 15ms
- vm.dirty_ratio 40%
- vm.swappiness 30
- I/O scheduler "deadline"
- fs barriers off
- CPU governor "performance"
- disk readahead 4x
Where do these come from? What's the rationale?
D - tso and gso off on the network interfaces http://s.nux.ro/gsotso
These seem to be settings on the host, not the guest.
E - network interface remapping (75-persistent-net-generator.rules, BZ 912801)
Not authorized to access that bug.
F - SELinux on. Do we relabel for uniqueness? I've seen small VMs run out of memory while relabelling...
Ack.
G - PERSISTENT_DHCLIENT="1" (BZ 1011013)
Ack.
H - Bundle all the paravirt drivers in the ramdisk (virtio/xen/vmware/hyperv) so the same image can boot everywhere?
Seems reasonable. What's the impact on the initrd size?
I - Per "stack" requirements (e.g. CloudStack relies a lot on root user and password logins; OpenStack tends not to, favouring SSH-key-only logins; etc.)
Can we have a single image that fits all the different requirements?
That's about all that crosses my mind for now.
K - No firewall. Handled by the service provider.
L - Timezone is set to UTC, hostname is set to 'centos', lang is en_US.UTF-8, keyboard is us (or whatever you guys think makes sense).
M - NOZEROCONF=yes
N - Along with the image, we'll also provide md5/sha1/sha256 checksums, gpg signed files and a manifest (list of installed packages and their versions).
...Juerg
Thoughts?
Lucian
-- Sent from the Delta quadrant using Borg technology!
Nux! www.nux.ro

_______________________________________________
CentOS-devel mailing list
CentOS-devel@centos.org
http://lists.centos.org/mailman/listinfo/centos-devel
Hi,
On Tue, Apr 8, 2014 at 11:03 PM, Juerg Haefliger juergh@gmail.com wrote:
On Tue, Apr 8, 2014 at 2:24 PM, Nux! nux@li.nux.ro wrote:
Hello,
While the Cloud SIG is still being established, let's get to actual work and think of a set of features for a CentOS cloud template. I am referring here to VMs, not containers (e.g. docker).
This is how I see it so far; please feel free to chime in with suggestions/comments/questions.
A - Single partition for simplicity (and lack of good arguments against it)
I was wondering about LVM. It makes reconfiguration much easier (like adding swap). But growroot doesn't support LVM.
Single partition is good for simplicity. LVM also adds a performance overhead, which is fine for some customer use cases but not others.
- dracut-modules-growroot included so the template partition will
expand to match target, cloud-init in charge of resize2fs
Only required for kernel < 3.8. Later kernels can do online partition resizing (handled by cloud-init post initrd).
Unless we plan to ignore CentOS 6, we need to handle kernels before 3.8 as well as CentOS 7's 3.10.
B - To swap or not to swap?
Some service providers charge for disk IOs and nobody wants to pay for swap activity, so I vote against swap.
I also don't see a need for swap in a cloud image.
C - "tuned-adm profile virtual-host" which translates to:
- kernel.sched_min_granularity_ns 10ms
- kernel.sched_wakeup_granularity_ns 15ms
- vm.dirty_ratio 40%
- vm.swappiness 30
- I/O scheduler "deadline"
- fs barriers off
- CPU governor "performance"
- disk readahead 4x
Where do these come from? What's the rationale?
This might be a good place to link to GCE's recommendations for image settings, assembled from several different teams inside Google, with a bent toward maximal security but also discussing other areas: https://developers.google.com/compute/docs/building-image
Some of them are more important than others, and clearly distributions will make the decisions that are right for them. Examples might be compiling virtio-scsi/virtio-net support as kernel modules for generality even though kernel modules would be disabled in the most single-vendor locked-down security-minded kernel.
Very few of these beyond the bare hardware support are strictly mandatory, but e.g. disabling password-based SSH, disabling root SSH login, and having root's password field locked are good cloud defaults except anywhere a specific vendor's environment needs otherwise.
We should also consider installing yum-cron by default; that adds a lot of automatic security protection for hands-off cloud users, but some behaviors or software versions occasionally change between 6.x and 6.{x+1}. Interesting tradeoff, and one that many users of a configuration management system handle through that software. GCE's image currently does preinstall yum-cron, though of course the CentOS community will eventually own the image and have the final say.
More recommendations might surface over time through things like performance testing or advice from our hypervisor or kernel hackers.
D - tso and gso off on the network interfaces http://s.nux.ro/gsotso
These seem to be settings on the host, not the guest.
No opinion here, though if this is a guest-side setting, I can ask around within Google to give a well-informed GCE perspective.
E - network interface remapping (75-persistent-net-generator.rules, BZ 912801)
Not authorized to access that bug.
Same.
F - SELinux on. Do we relabel for uniqueness? I've seen small VMs run out of memory while relabelling...
Ack.
I don't think GCE's current image does anything specific here beyond leaving SELinux on and ensuring some of our environment-specific hacks get properly labeled. No opinion on what's optimal, but we do offer small VMs as well as normal-sized ones, so handling both use cases is good.
G - PERSISTENT_DHCLIENT="1" (BZ 1011013)
Ack.
Seems reasonable based on the RHBA linked from the BZ - we haven't noticed a problem without this but it could be useful.
H - Bundle all the paravirt drivers in the ramdisk
(virtio/xen/vmware/hyperv) so the same image can boot everywhere?
Seems reasonable. What's the impact on the initrd size?
Seems good to me too. The ones GCE cares about are virtio-scsi, virtio-net, and virtio-pci/virtio-blk, but no objection to the others in the initrd if the result is reasonably sized.
I - Per "stack" requirements (e.g. CloudStack relies a lot on root user and password logins; OpenStack tends not to, favouring SSH-key-only logins; etc.)
Can we have a single image that fits all the different requirements?
We are unlikely to have that in the end, but we can certainly start with one base and customize the output slightly for each environment.
Examples: GCE currently has two (Apache2-licensed Python) daemons running in our instances: one handles SSH keys via our metadata server in a way that's tied in to Google accounts and Google Cloud project access control lists, the other one facilitates some of our advanced networking features. We also ship gcutil and gsutil, two (Apache2-licensed Python) command-line utilities which are useful for interacting with the environment. The container format varies across environments too.
That's about all that crosses my mind for now.
K - No firewall. Handled by the service provider.
Mostly the same in GCE too. To avoid breaking configs which expect the firewall on by default, we're currently going with a default-open iptables firewall (at least for TCP/UDP - I'd have to check for ICMP). If CentOS prefers to disable it entirely, no strong objection from me.
L - Timezone is set to UTC, Hostname is set to 'centos', lang is
en_US.UTF-8, keyboard is us (or whatever you guys think makes sense).
Agreed, although in the GCE case the hostname is set dynamically via DHCP based on the instance name given to the API. We also set the NTP server to metadata.google.internal, served by the host the VM is running on. While this is baked into our images via kickstart, the DHCP server also recently started providing this via the NTP option.
M - NOZEROCONF=yes
No opinion from me here. The same RHBA as before makes this seem wise to enable, although I haven't noticed a problem without it (our metadata server is at 169.254.169.254).
N - Along with the image, we'll also provide md5/sha1/sha256 checksums, gpg signed files and a manifest (list of installed packages and their versions).
Sounds reasonable.
- Jimmy
On 09.04.2014 07:48, Jimmy Kaplowitz wrote:
I was wondering about LVM. It makes reconfiguration much easier (like adding swap). But growroot doesn't support LVM.
Single partition is good for simplicity. LVM also adds a performance overhead, which is fine for some customer use cases but not others.
Swap could be easily added via a file if really needed.
- dracut-modules-growroot included so the template partition
will expand to match target, cloud-init in charge of resize2fs
Only required for kernel < 3.8. Later kernels can do online partition resizing (handled by cloud-init post initrd).
Unless we plan to ignore CentOS 6, we need to handle kernels before 3.8 as well as CentOS 7's 3.10.
Yes, EL6 is our main concern here I guess, though it'd be good if we could reuse as much as possible for future versions.
B - To swap or not to swap?
Some service providers charge for disk IOs and nobody wants to pay for swap activity, so I vote against swap.
I also don't see a need for swap in a cloud image.
Cool, nobody likes the swap. As I said above, it's trivial to add a swap file later on.
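For anyone who does need it later, here's a minimal sketch of adding a swap file post-deploy. The demo below writes to ./swapfile with a tiny 16 MiB size so it runs unprivileged; on a real instance you'd use /swapfile, a realistic size, and run the commented commands as root:

```shell
set -e
# Demo path; use /swapfile (owned by root) on a real system.
swapfile=./swapfile
# Allocate 16 MiB for the demo; a real swap file would be far larger.
dd if=/dev/zero of="$swapfile" bs=1M count=16 2>/dev/null
chmod 600 "$swapfile"
# Write the swap signature into the file.
mkswap "$swapfile"
# On the real system, as root:
#   swapon /swapfile
#   echo '/swapfile swap swap defaults 0 0' >> /etc/fstab
```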
C - "tuned-adm profile virtual-host" which translates to:
- kernel.sched_min_granularity_ns 10ms
- kernel.sched_wakeup_granularity_ns 15ms
- vm.dirty_ratio 40%
- vm.swappiness 30
- I/O scheduler "deadline"
- fs barriers off
- CPU governor "performance"
- disk readahead 4x
Where do these come from? What's the rationale?
They come from Red Hat; maybe Sam Kottler or some other RH dev can clarify some of this for us. I would have expected to see the NOOP scheduler here. Maybe it's worth opening another thread to discuss this profile. I imagine they must have some reasons for choosing it, since they build the guest OS, the host OS and the hypervisor.
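For reference, the sysctl subset of that profile translates to something like the following fragment (values converted from the list above; a sketch, not a definitive tuning recommendation):

```ini
# /etc/sysctl.conf fragment roughly matching "tuned-adm profile virtual-host"
kernel.sched_min_granularity_ns = 10000000   ; 10 ms
kernel.sched_wakeup_granularity_ns = 15000000 ; 15 ms
vm.dirty_ratio = 40
vm.swappiness = 30
; The I/O scheduler, fs barriers, CPU governor and readahead are set elsewhere
; (kernel cmdline / mount options / cpupower / blockdev), not via sysctl.
```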
This might be a good place to link to GCE's recommendations for image settings, assembled from several different teams inside Google, with a bent toward maximal security but also discussing other areas: https://developers.google.com/compute/docs/building-image
Yes, many of the modifications do make sense, but once we start "optimising", where do we stop? This could become a slippery slope. Maybe KB can weigh in on this.
D - tso and gso off on the network interfaces http://s.nux.ro/gsotso
These seem to be settings on the host, not the guest.
These settings should be off in the guest, but seeing as there is no mention of this for newer versions, maybe it's not strictly needed. AFAIK the virtio device can't do "hardware" TCP segmentation offloading and so on, but perhaps this is forwarded to the hypervisor. To be looked at later on; it doesn't seem of great importance.
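If we do end up disabling these in the guest, one persistent place for it on EL6 is the interface config. A sketch (ETHTOOL_OPTS is supported by the stock initscripts; the exact option string here is an assumption):

```ini
# /etc/sysconfig/network-scripts/ifcfg-eth0 (fragment)
DEVICE=eth0
# Run via ethtool at ifup; turns off TCP/generic segmentation offload.
ETHTOOL_OPTS="-K ${DEVICE} tso off gso off"
```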
E - network interface remapping (75-persistent-net-generator.rules, BZ 912801)
Not authorized to access that bug.
Same.
It's about preventing udev from mapping MACs to NICs, so that when the VM gets turned into a template it doesn't retain the mapping and end up with its NIC named eth1 or whatever name is available next. I'm sure everyone has hit this problem when building templates. "echo explanation > /etc/udev/rules.d/75-persistent-net-generator.rules" should do the trick.
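A minimal sketch of that cleanup step, operating on a staged image tree under ./image-root rather than the live filesystem (the udev paths are the standard ones; the staging directory is just for illustration):

```shell
set -e
# On a real build this would be the image's mounted root, not ./image-root.
root=./image-root
mkdir -p "$root/etc/udev/rules.d"
# Drop any already-generated MAC-to-NIC mapping...
rm -f "$root/etc/udev/rules.d/70-persistent-net.rules"
# ...and mask the generator so the mapping is not recreated on first boot.
echo '# masked for template builds' > "$root/etc/udev/rules.d/75-persistent-net-generator.rules"
```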
F - SELinux on. Do we relabel for uniqueness? I've seen small VMs run out of memory while relabelling...
Ack.
I don't think GCE's current image does anything specific here beyond leaving SELinux on and ensuring some of our environment-specific hacks get properly labeled. No opinion on what's optimal, but we do offer small VMs as well as normal-sized ones, so handling both use cases is good.
Ok, so this needs further debate.
G - PERSISTENT_DHCLIENT="1" (BZ 1011013)
Ack.
Seems reasonable based on the RHBA linked from the BZ - we haven't noticed a problem without this but it could be useful.
I have seen the problem first hand in CloudStack: if the virtual router (the DHCP provider) goes away, the instance loses its IP and becomes unreachable ...
H - Bundle all the paravirt drivers in the ramdisk
(virtio/xen/vmware/hyperv) so the same image can boot everywhere?
Seems reasonable. What's the impact on the initrd size?
Seems good to me too. The ones GCE cares about are virtio-scsi, virtio-net, and virtio-pci/virtio-blk, but no objection to the others in the initrd if the result is reasonably sized.
The default initrd already carries most of them, here's a normal initrd on my workstation:
17595362 Feb 12 13:13 initramfs-2.6.32-431.5.1.el6.x86_64.img
and here's another one based on the same kernel, but with: add_drivers+="vmw_pvscsi vmxnet3 hv_vmbus hv_utils hv_storvsc hv_netvsc xenfs xen-netfront xen-blkfront virtio_scsi virtio_net virtio_console virtio-rng virtio_blk virtio_pci"
17688533 Apr 9 15:34 paravirt.img
So they are almost identical in size.
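For the record, the same add_drivers line can be made permanent via a dracut drop-in, so rebuilt initrds keep the paravirt drivers (the file name is arbitrary; the rebuild command needs root):

```ini
# /etc/dracut.conf.d/paravirt.conf
add_drivers+=" vmw_pvscsi vmxnet3 hv_vmbus hv_utils hv_storvsc hv_netvsc xenfs xen-netfront xen-blkfront virtio_scsi virtio_net virtio_console virtio-rng virtio_blk virtio_pci "
# Rebuild with: dracut -f /boot/initramfs-$(uname -r).img $(uname -r)
```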
I - Per "stack" requirements (e.g. CloudStack relies a lot on root user and password logins; OpenStack tends not to, favouring SSH-key-only logins; etc.)
Can we have a single image that fits all the different requirements?
It would require building some logic into the image so the instance is aware it's running in ACS/OS/AWS/GCE ... possible, but I'm not sure how feasible.
We are unlikely to have that in the end, but we can certainly start with one base and customize the output slightly for each environment.
+1
Examples: GCE currently has two (Apache2-licensed Python) daemons running in our instances: one handles SSH keys via our metadata server in a way that's tied in to Google accounts and Google Cloud project access control lists, the other one facilitates some of our advanced networking features. We also ship gcutil and gsutil, two (Apache2-licensed Python) command-line utilities which are useful for interacting with the environment. The container format varies across environments too.
No chance to actually get involved with cloud-init instead of running different scripts? Either way, it looks like %post will have a lot of work to do for all these images. :-)
K - No firewall. Handled by the service provider.
+1 for default-open iptables
L - Timezone is set to UTC, Hostname is set to 'centos', lang is
en_US.UTF-8, keyboard is us (or whatever you guys think makes sense).
+1 - The hostname is not very important, as most people use DHCP.
NTP/ntpdate is, of course, a must.
M - NOZEROCONF=yes
No problem with that.
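For the record, items G and M are both one-line sysconfig settings; fragments below (device name assumed):

```ini
# /etc/sysconfig/network-scripts/ifcfg-eth0 (fragment, item G)
# Keep dhclient retrying forever instead of giving up on lease failure.
PERSISTENT_DHCLIENT="1"

# /etc/sysconfig/network (fragment, item M)
# Don't install the 169.254.0.0/16 zeroconf route.
NOZEROCONF=yes
```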
N - Along with the image, we'll also provide md5/sha1/sha256 checksums, gpg signed files and a manifest (list of installed packages and their versions).
+1
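A sketch of what the publishing step could look like (file names are placeholders; the rpm and gpg steps are commented out since they need the built image and a signing key):

```shell
set -e
img=centos-cloud-demo.img                 # placeholder for the real image file
printf 'demo image contents\n' > "$img"   # stand-in file so the sketch runs

# Checksums published next to the image
md5sum    "$img" > "$img.md5sum"
sha1sum   "$img" > "$img.sha1sum"
sha256sum "$img" > "$img.sha256sum"

# Package manifest and detached GPG signature (run against the built image,
# with a signing key available):
# rpm -qa | sort > manifest.txt
# gpg --armor --detach-sign "$img"
```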
It looks like we need to allow enough room in %post for the customisations required by the various platforms, but if we can have a common base, that'd be great.
KB, what's your opinion on the above and what should we do next?
Lucian