Remove Centos from AWS marketplace


JacobV

9 Mar 2014

5:05 a.m.

https://forums.aws.amazon.com/thread.jspa?messageID=481859
https://forums.aws.amazon.com/thread.jspa?messageID=453572

This is a time bomb waiting to strike the many people who take daily snapshot backups and keep them for a few weeks, not realizing that their snapshots are useless if they accidentally messed up some boot-related file earlier on.

Another scenario: you mess up the sudoers file or root's authorized keys. You'd have to lose a whole day's data and go back to the previous night's restore just for a single-file error like that?

If AWS Marketplace is unable to remove this hardcoded rule, then isn't it only prudent to remove CentOS from AWS Marketplace and release it in the community section instead?

Thoughts?

Nico Kadel-Garcia

9 Mar 2014

8:44 p.m.

Disk image snapshots are grossly inefficient as configuration backups. On any system with significant file system churn, such as a proxy server or logging server, each snapshot may consume a significant and unexpected amount of space on the back-end storage array. They are also not safe or reliable as backups of any database, including MySQL, PostgreSQL, or even source control tools like CVS and Subversion.

If you need to preserve system files for just such problems, learn to do efficient backups. I recommend the old "rsnapshot" tool, available at EPEL.
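
For readers who have not used rsnapshot, here is a minimal Python sketch of the hard-link rotation idea it is built around. It is not rsnapshot's own code (rsnapshot is a Perl program that drives rsync). The function name `snapshot`, the `keep` parameter, and the `daily.N` directory layout are assumptions chosen for illustration.

```python
import filecmp
import os
import shutil
from pathlib import Path


def snapshot(source: Path, dest_root: Path, keep: int = 7) -> Path:
    """Create dest_root/daily.0, hard-linking unchanged files to daily.1."""
    dest_root.mkdir(parents=True, exist_ok=True)

    # Rotate: drop the oldest copy, then shift daily.N to daily.N+1.
    oldest = dest_root / f"daily.{keep - 1}"
    if oldest.exists():
        shutil.rmtree(oldest)
    for n in range(keep - 2, -1, -1):
        current = dest_root / f"daily.{n}"
        if current.exists():
            current.rename(dest_root / f"daily.{n + 1}")

    newest = dest_root / "daily.0"
    previous = dest_root / "daily.1"
    for dirpath, _dirnames, filenames in os.walk(source):
        rel = Path(dirpath).relative_to(source)
        (newest / rel).mkdir(parents=True, exist_ok=True)
        for name in filenames:
            src = Path(dirpath) / name
            old = previous / rel / name
            new = newest / rel / name
            if old.is_file() and filecmp.cmp(src, old, shallow=False):
                os.link(old, new)       # unchanged: share the same inode
            else:
                shutil.copy2(src, new)  # new or changed: store a fresh copy
    return newest
```

Because unchanged files share an inode across generations, each extra day of history costs roughly the size of what changed. A broken sudoers file can then be copied back from `daily.1` on its own, without rolling the whole disk back a day.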

CentOS-virt mailing list CentOS-virt@centos.org http://lists.centos.org/mailman/listinfo/centos-virt

Digimer

8:58 p.m.

Would you mind elaborating on this? If a snapshot is a point-in-time image of a VM (or even normal FS), why would DB backups be at risk (assuming things like fsync are used)?

I'm asking in general terms... no idea if this is something AWS specific.

digimer

-- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education?

Stephen Harris

9:13 p.m.

On Sun, Mar 09, 2014 at 11:28:07AM -0400, Digimer wrote:

Would you mind elaborating on this? If a snapshot is a point-in-time image of a VM (or even normal FS), why would DB backups be at risk (assuming things like fsync are used)?

I'm asking in general terms... no idea if this is something AWS specific.

Database disk snapshots may include "transactions in flight", and the on-disk image may not be in a consistent state. Databases such as Oracle try to work around this by ensuring that writes occur in a specific order and by having a good "recovery" process (each data file has a change number; determine the best change number to start from, roll forward from there to recover, then roll back any incomplete transactions), but that is considered "crash recovery" and shouldn't be part of business-as-usual (BAU) activity. Other databases may not be so good at recovery (MySQL?), and so you run the risk of database corruption if you need to restore the snapshot.
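
As a toy illustration of the "roll forward, then roll back incomplete transactions" recovery described above, here is a short Python sketch. It is not how Oracle or any real engine stores its logs. The record layout and the name `recover` are assumptions, and it assumes in-flight transactions never write the same key as another transaction (which row locking would normally guarantee).

```python
from typing import Dict, List, Optional, Set, Tuple

# A log record is (txn_id, kind, key, value); kind is "write" or "commit".
LogRecord = Tuple[int, str, str, str]


def recover(log: List[LogRecord]) -> Dict[str, str]:
    """Rebuild state from a change log found on disk after a crash."""
    state: Dict[str, str] = {}
    undo: List[Tuple[int, str, Optional[str]]] = []  # (txn, key, previous value)
    committed: Set[int] = set()

    # Roll forward: reapply every logged change in order.
    for txn, kind, key, value in log:
        if kind == "write":
            undo.append((txn, key, state.get(key)))
            state[key] = value
        elif kind == "commit":
            committed.add(txn)

    # Roll back: undo changes from transactions that never committed, newest first.
    for txn, key, previous in reversed(undo):
        if txn not in committed:
            if previous is None:
                state.pop(key, None)
            else:
                state[key] = previous
    return state
```

A snapshot taken mid-transaction is in the same position as the input to `recover`: usable only if the engine's recovery code can tell finished work from unfinished work.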

If you rely on disk snapshots then it's recommended you do a proper db dump before the snapshot is taken, so that you can recover the database from the dump file and not the snapshot.
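
The ordering matters more than the tooling. A minimal Python sketch, assuming both steps are available as command lines on the host (the function name `dump_then_snapshot` and both parameters are hypothetical):

```python
import subprocess
from typing import Sequence


def dump_then_snapshot(dump_cmd: Sequence[str], snapshot_cmd: Sequence[str]) -> None:
    """Run the logical dump first; take the snapshot only if the dump succeeded."""
    # check=True raises CalledProcessError on a non-zero exit status,
    # so a failed dump is never captured as if it were good.
    subprocess.run(list(dump_cmd), check=True)
    subprocess.run(list(snapshot_cmd), check=True)
```

For PostgreSQL the dump command might be something like `["pg_dump", "--file", "/var/backups/app.sql", "app"]`, where the path and database name are placeholders. The snapshot command is whatever your platform provides. The dump file has to live on the volume being snapshotted for this to help.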

-- rgds Stephen

Digimer

10 p.m.

Thanks for the reply, Stephen. I also replied to Nico, and my comments there can be directed to you, as well. :)

Nico Kadel-Garcia

9:22 p.m.

On Sun, Mar 9, 2014 at 11:28 AM, Digimer lists@alteeve.ca wrote:

Would you mind elaborating on this? If a snapshot is a point-in-time image of a VM (or even normal FS), why would DB backups be at risk (assuming things like fsync are used)?

I'm asking in general terms... no idea if this is something AWS specific.

digimer

It's a general issue. If a system snapshot is used to correctly preserve both the disk image and the "state" of the VM, including memory, well and good. The state is recoverable. There's always a risk that interrupted network transactions left things in an unexpectedly inconsistent state that the VM is not equipped to handle: I'm thinking particularly of "wget" or other download transactions where the download software was not intelligent enough to verify the download before proceeding. I've been through this a lot lately with "chef" software. It's compounded by network-based filesystem transactions, such as interactions with NFS or CIFS filesystems, which cannot be synchronized with the OS snapshot.

But simply relying on the disk image from such an AWS snapshot, without recovering the full system state, is a potential adventure. I've not had the opportunity to play with this kind of restoration myself, so I'm uncertain whether AWS allows access to the plain disk image or would automatically bring the full VM state with it when the snapshot is reactivated. If you're just getting at the disk images, using "fsync" before the snapshots is helpful, but the atomicity of any transaction in progress at the time of the disk image snapshot cannot be verified. This particularly includes precisely the sort of "page mapped" data, sitting in RAM, that the "fsync" call helps write to disk.
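
To make concrete what "fsync" does and does not cover, here is a minimal Python sketch of the write-to-temp, fsync, rename pattern for a single file. It assumes a POSIX system such as Linux. The name `durable_write` and the `.tmp` suffix are illustrative.

```python
import os


def durable_write(path: str, data: bytes) -> None:
    """Replace path so a crash or disk snapshot sees old or new content, never a mix."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as handle:
        handle.write(data)
        handle.flush()             # move Python's buffer into the kernel page cache
        os.fsync(handle.fileno())  # ask the kernel to push those pages to the device
    os.replace(tmp_path, path)     # atomic rename on POSIX filesystems
    # Flush the directory entry too, so the rename itself survives a crash.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

This protects one file. It does nothing for a transaction that spans several files or is still half-built in application memory when the snapshot fires, which is the gap described above.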

And snapshots scheduled from outside controllers, such as automatic snapshots, cannot be reliably synchronized with system-specific "fsync" and database suspension commands without a great deal of integration between the outside system and the local host, which VMs are not normally supposed to need. I went through a great deal of this some years back: shutting down databases, running "LVM" to get a disk snapshot, then running "rsnapshot" against the *snapshot* to avoid getting an inconsistent state of the database into the backup system.
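
The stop, snapshot, restart, back up sequence can be sketched in Python roughly as follows. This is an outline under assumptions, not the scripts actually used back then: the function name, its parameters, the default snapshot size, and the injected `run` callable are all hypothetical. The `service`, `lvcreate`, and `lvremove` invocations use standard options.

```python
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], None]


def backup_via_lvm_snapshot(run: Runner, service: str, volume: str,
                            snap_name: str, backup_cmd: Sequence[str],
                            snap_size: str = "1G") -> None:
    """Stop the database, snapshot its volume, restart it, then back up the snapshot."""
    volume_group = volume.split("/")[0]  # "vg0/data" -> "vg0"
    run(["service", service, "stop"])
    try:
        # The database is down, so its on-disk files are consistent here.
        run(["lvcreate", "--snapshot", "--size", snap_size,
             "--name", snap_name, volume])
    finally:
        run(["service", service, "start"])  # downtime lasts only for the lvcreate
    try:
        # For example, a script that mounts the snapshot read-only
        # and runs rsnapshot against the mount point.
        run(backup_cmd)
    finally:
        run(["lvremove", "--force", f"{volume_group}/{snap_name}"])
```

Passing `run=lambda cmd: subprocess.run(cmd, check=True)` executes the steps for real, while a function that only records its argument gives a dry run. Note that every step runs inside the guest, which is exactly the integration an externally scheduled snapshot lacks.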

And there are some *funky* databases out there. Ask sometime about "Use hardlinked RCS files for source control of multiple project branches", if you'd like to wince a lot.

Digimer

9:59 p.m.

This is very useful, thank you kindly for sharing. I suppose I always considered "it's like recovering from the server losing power" to mean "usually works", and equated that with a "good enough" backup.

So I suppose that, at best, using snapshot images as a backup to the real backup method would be valid. I could see the benefit of recovering the VM and then, if anything wasn't right, using it as the target for restoring data from the proper backup.

Thanks again!
