On 09.04.2014 07:48, Jimmy Kaplowitz wrote:
I was wondering about LVM. It makes reconfiguration much easier (like adding swap). But growroot doesn't support LVM.
Single partition is good for both simplicity and LVM support. LVM also adds a performance overhead which is fine for some customer use cases but not others.
Swap could be easily added via a file if really needed.
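For reference, a rough sketch of what "adding swap via a file" could look like. The path and size are placeholders of mine, not something agreed in this thread, and the DRY_RUN switch is only there so the commands can be inspected without root.

```shell
#!/bin/sh
# Sketch only: create and enable a swap file on a running instance.
# Set DRY_RUN=1 to print the commands instead of executing them.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

add_swapfile() {
    swapfile=$1   # hypothetical path, e.g. /swapfile
    size_mb=$2    # size in MiB
    run dd if=/dev/zero of="$swapfile" bs=1M count="$size_mb"
    run chmod 600 "$swapfile"   # swap files should not be world-readable
    run mkswap "$swapfile"
    run swapon "$swapfile"
}

# Example (as root): add_swapfile /swapfile 1024
# To survive a reboot, /etc/fstab would also need a line such as:
#   /swapfile swap swap defaults 0 0
```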
- dracut-modules-growroot included so the template partition
will expand to match target, cloud-init in charge of resize2fs
Only required for kernel < 3.8. Later kernels can do online partition resizing (handled by cloud-init post initrd).
Unless we plan to ignore CentOS 6, we need to handle kernels before 3.8 as well as CentOS 7's 3.10.
Yes, EL6 is our main concern here I guess, though it'd be good if we could reuse as much as possible for future versions, I imagine.
B - To swap or not to swap?
Some service providers charge for disk IOs and nobody wants to pay for swap activity, so I vote against swap.
I also don't see a need for swap in a cloud image.
Cool, nobody likes the swap. As I said above, it's trivial to add a swap file later on.
C - "tuned-adm profile virtual-host" which translates to:
- kernel.sched_min_granularity_ns 10ms
- kernel.sched_wakeup_granularity_ns 15ms
- vm.dirty_ratio 40%
- vm.swappiness 30
- IO scheduler "deadline"
- fs barriers off
- CPU governor "performance"
- disk readahead 4x
Where do these come from? What's the rationale?
They come from Red Hat; maybe Sam Kottler or some other RH dev can clarify some of this for us. I would have expected to see the NOOP scheduler here. Maybe it's worth opening another thread to discuss this profile. I imagine they must have some reasons for choosing this, since they build the guest OS, the host OS and the hypervisor.
This might be a good place to link to GCE's recommendations for image settings, assembled from several different teams inside Google, with a bent toward maximal security but also discussing other areas: https://developers.google.com/compute/docs/building-image
Yes, many of the modifications do make sense, but once we start "optimising" where do we stop? This could lead to a slippery slope. Maybe KB can weigh in on this.
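To make the profile under C a bit more concrete, here is a small sketch that prints the sysctl-expressible part of it in sysctl.conf syntax, using the values quoted above (the scheduler knobs take nanoseconds, hence the conversion). The function names are mine, and I am assuming these keys exist as sysctls on the EL6/EL7 kernels we are discussing. The scheduler, barrier, governor and readahead settings are not sysctls and are left out.

```shell
#!/bin/sh
# Convert milliseconds to nanoseconds.
ms_to_ns() {
    echo $(( $1 * 1000000 ))
}

# Print the four sysctl values from the list above, one per line.
print_profile_sysctls() {
    echo "kernel.sched_min_granularity_ns = $(ms_to_ns 10)"
    echo "kernel.sched_wakeup_granularity_ns = $(ms_to_ns 15)"
    echo "vm.dirty_ratio = 40"
    echo "vm.swappiness = 30"
}

print_profile_sysctls
```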
D - tso and gso off on the network interfaces http://s.nux.ro/gsotso
These seem to be settings on the host, not the guest.
These settings should be off on the guest, but seeing as there is no mention of this for newer versions, maybe it's not strictly needed. AFAIK the virtio device can't do "hardware" TCP segmentation offloading and so on, but perhaps this is forwarded to the hypervisor. To be looked at later on; it doesn't seem to be of great importance.
E - network interface remapping (75-persistent-net-generator.rules, BZ 912801)
Not authorized to access that bug.
Same.
It's about preventing udev from mapping MACs to NICs, so that when the VM gets transformed into a template it does not retain the mapping and end up with its NIC called eth1 or whatever name is available next. I'm sure everyone has hit this problem when building templates. "echo explanation > /etc/udev/rules.d/70-persistent-cd.rules" should do the trick.
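A commonly used variant of this, sketched here under my own assumptions rather than as the agreed method: drop any already-generated 70-persistent-net.rules and mask the generator named in E by shadowing it in /etc/udev/rules.d (as far as I know, a same-named file there takes precedence over the one shipped in /lib/udev/rules.d). The function name and the optional image-root argument are hypothetical.

```shell
#!/bin/sh
# Sketch: stop udev from pinning NIC names to MAC addresses in a template.
# $1 is the image root; leave it empty to act on the running system.
neutralise_net_rules() {
    root=${1:-}
    rules_dir="$root/etc/udev/rules.d"
    mkdir -p "$rules_dir"
    # Remove any MAC-to-name mapping recorded while building the template.
    rm -f "$rules_dir/70-persistent-net.rules"
    # Mask the generator so the mapping is not recreated on next boot.
    ln -sf /dev/null "$rules_dir/75-persistent-net-generator.rules"
}

# Example, e.g. from %post: neutralise_net_rules ""
```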
F - SELinux on. Do we relabel for uniqueness? Seen small VMs run out of memory while relabelling.
Ack.
I don't think GCE's current image does anything specific here beyond leaving SELinux on and ensuring some of our environment-specific hacks get properly labeled. No opinion on what's optimal, but we do offer small VMs as well as normal-sized ones, so handling both use cases is good.
Ok, so this needs further debate.
G - PERSISTENT_DHCLIENT="1" (BZ 1011013)
Ack.
Seems reasonable based on the RHBA linked from the BZ - we haven't noticed a problem without this but it could be useful.
I have seen the problem first hand in CloudStack; if the virtual router (DHCP provider) goes away, the instance loses its IP and becomes unreachable ...
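For illustration, this is roughly where the setting would live: a minimal ifcfg-eth0 written into an image root. The device name, the other keys and the function name are assumptions for the sketch. Only PERSISTENT_DHCLIENT="1" comes from this thread.

```shell
#!/bin/sh
# Sketch: write a DHCP ifcfg file that keeps dhclient retrying
# instead of giving up after the first failed attempt.
# $1 is the image root; leave it empty to act on the running system.
write_ifcfg_eth0() {
    root=${1:-}
    dir="$root/etc/sysconfig/network-scripts"
    mkdir -p "$dir"
    cat > "$dir/ifcfg-eth0" <<'EOF'
DEVICE="eth0"
BOOTPROTO="dhcp"
ONBOOT="yes"
TYPE="Ethernet"
PERSISTENT_DHCLIENT="1"
EOF
}

# Example, e.g. from %post: write_ifcfg_eth0 ""
```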
H - Bundle all the paravirt drivers in the ramdisk
(virtio/xen/vmware/hyperv) so the same image can boot everywhere?
Seems reasonable. What's the impact on the initrd size?
Seems good to me too. The ones GCE cares about are virtio-scsi, virtio-net, and virtio-pci/virtio-blk, but no objection to the others in the initrd if the result is reasonably sized.
The default initrd already carries most of them, here's a normal initrd on my workstation:
17595362 Feb 12 13:13 initramfs-2.6.32-431.5.1.el6.x86_64.img
and here's another one based on the same kernel, but with: add_drivers+="vmw_pvscsi vmxnet3 hv_vmbus hv_utils hv_storvsc hv_netvsc xenfs xen-netfront xen-blkfront virtio_scsi virtio_net virtio_console virtio-rng virtio_blk virtio_pci"
17688533 Apr 9 15:34 paravirt.img
So they are almost identical in size.
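To put a number on "almost identical", a quick bit of shell arithmetic on the two sizes listed above (integer maths, so the percentage is truncated rather than rounded):

```shell
#!/bin/sh
base=17595362      # initramfs-2.6.32-431.5.1.el6.x86_64.img
paravirt=17688533  # paravirt.img

diff=$(( paravirt - base ))
# Percentage scaled by 100 so two decimals survive integer division.
pct_x100=$(( diff * 10000 / base ))

printf '%d bytes larger (%d.%02d%%)\n' \
    "$diff" $(( pct_x100 / 100 )) $(( pct_x100 % 100 ))
# prints: 93171 bytes larger (0.52%)
```

So the extra drivers cost about 91 KiB, roughly half a percent.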
I - Per "stack" requirements (e.g. CloudStack relies a lot on root user and password logins, OpenStack tends not to, SSH key only logins etc etc)
Can we have a single image that fits all the different requirements?
It would require building some logic into it so the instance is aware it's running in ACS/OS/AWS/GCE ... possible, not sure how feasible.
We are unlikely to have that in the end, but we can certainly start with one base and customize the output slightly for each environment.
+1
Examples: GCE currently has two (Apache2-licensed Python) daemons running in our instances: one handles SSH keys via our metadata server in a way that's tied in to Google accounts and Google Cloud project access control lists, the other one facilitates some of our advanced networking features. We also ship gcutil and gsutil, two (Apache2-licensed Python) command-line utilities which are useful for interacting with the environment. The container format varies across environments too.
No chance to actually get involved with cloud-init instead of running different scripts? Either way, it looks like %post will have a lot of work to do for all these images. :-)
K - No firewall. Handled by the service provider.
+1 for default-open iptables
L - Timezone is set to UTC, Hostname is set to 'centos', lang is
en_US.UTF-8, keyboard is us (or whatever you guys think makes sense).
+1 - The hostname is not very important as most people use DHCP.
NTP/ntpdate is, of course, a must.
M - NOZEROCONF=yes
No problem with that.
N - Along with the image, we'll also provide md5/sha1/sha256 checksums, gpg signed files and a manifest (list of installed packages and their versions).
+1
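A rough sketch of how the artefacts in N could be produced once an image file exists. The file-name suffixes and function names are made up for the example. The signing step assumes a usable secret key in the gpg keyring, and the manifest step assumes the image root is mounted somewhere and rpm is available.

```shell
#!/bin/sh
# Write md5/sha1/sha256 checksum files next to the image.
# File names inside the checksum files are relative, so that
# "sha256sum -c" works from the download directory.
write_checksums() {
    image=$1
    dir=$(dirname "$image")
    name=$(basename "$image")
    (
        cd "$dir" || exit 1
        md5sum    "$name" > "$name.md5sum"
        sha1sum   "$name" > "$name.sha1sum"
        sha256sum "$name" > "$name.sha256sum"
    )
}

# Detached, ASCII-armoured signature (creates FILE.asc).
sign_file() {
    gpg --armor --detach-sign "$1"
}

# Package manifest from a mounted image root: one package per line, sorted.
write_manifest() {
    root=$1
    rpm --root "$root" -qa --qf '%{NAME}-%{VERSION}-%{RELEASE}.%{ARCH}\n' | sort
}

# Example: write_checksums ./centos.img && sign_file ./centos.img.sha256sum
```

Users could then verify a download with "sha256sum -c" against the published file, after checking its signature.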
It looks like we need to allow enough room in %post for the customisations required by the various platforms, but if we can have a common base, that'd be great.
KB, what's your opinion on the above and what should we do next?
Lucian