Hi all, I plan to implement two file servers on CentOS 6 in two remote locations. I need to back up all data from the second server onto the first. The first server will be a virtual machine on ESXi, and the second server will be a physical machine.
I plan to use rsync to sync the data from the second to the first server. Is that OK? Any suggestions?
Njegos
On 8/9/2011 2:50 PM, Railic Njegos wrote:
Hi all, I plan to implement two file servers on CentOS 6 in two remote locations. I need to back up all data from the second server onto the first. The first server will be a virtual machine on ESXi, and the second server will be a physical machine.
I plan to use rsync to sync the data from the second to the first server. Is that OK? Any suggestions?
Rsync is probably the best thing you will find for this. As long as whatever you are doing can tolerate the possible differences between rsync runs, it should be fine. Rsync normally creates a new file under a temporary name and renames it only when the transfer is complete, so programs accessing the data will only see one version or the other, never an inconsistent copy while the transfer is in progress.
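For a plain mirror, something along these lines run from cron on the first server would do it (the host name and paths here are just placeholders):

    # pull everything from the second (physical) server onto the first (VM);
    # -a preserves permissions, ownership, times and symlinks, -z compresses
    # over the WAN link, --delete mirrors deletions from the source
    rsync -az --delete -e ssh second-server:/srv/data/ /backup/second-server/

    # e.g. in root's crontab on the first server, every night at 02:00
    0 2 * * * rsync -az --delete -e ssh second-server:/srv/data/ /backup/second-server/ >> /var/log/rsync-backup.log 2>&1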
On Wed, Aug 10, 2011 at 8:05 AM, Les Mikesell lesmikesell@gmail.com wrote:
On 8/9/2011 2:50 PM, Railic Njegos wrote:
Hi all, I plan to implement two file servers on CentOS 6 in two remote locations. I need to back up all data from the second server onto the first. The first server will be a virtual machine on ESXi, and the second server will be a physical machine.
I plan to use rsync to sync the data from the second to the first server. Is that OK? Any suggestions?
Rsync is probably the best thing you will find for this. As long as whatever you are doing can tolerate the possible differences between rsync runs, it should be fine. Rsync normally creates a new file under a temporary name and renames it only when the transfer is complete, so programs accessing the data will only see one version or the other, never an inconsistent copy while the transfer is in progress.
rsync has its own issues. I still use it, but I've learned not to trust it completely. If you have a deep directory hierarchy and lots of files, it may run out of memory and crash. I've also had it fail silently to copy files. In the past I've written wrapper scripts that break down the rsync into several 'chunks', and check the number of files on source and target servers at the end. Some people run rsync and then immediately run it again!
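A stripped-down sketch of that kind of wrapper (the host, directory names and list of 'chunks' are made up; adjust to your own tree):

    #!/bin/bash
    # sync each top-level directory separately so rsync's file list stays small,
    # then compare file counts on both ends as a rough sanity check
    SRC_HOST=second-server
    SRC=/srv/data
    DST=/backup/second-server

    for dir in home projects shared; do
        rsync -az --delete -e ssh "$SRC_HOST:$SRC/$dir/" "$DST/$dir/"
    done

    remote_count=$(ssh "$SRC_HOST" "find $SRC -type f | wc -l")
    local_count=$(find "$DST" -type f | wc -l)
    if [ "$remote_count" -ne "$local_count" ]; then
        echo "WARNING: file counts differ ($remote_count remote vs $local_count local)" >&2
    fi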
rsync, in spite of its idiosyncrasies, is still the best tool.
Cheers,
Cliff
On 8/9/11 7:37 PM, Cliff Pratt wrote:
On Wed, Aug 10, 2011 at 8:05 AM, Les Mikesell lesmikesell@gmail.com wrote:
On 8/9/2011 2:50 PM, Railic Njegos wrote:
Hi all, I plan to implement two file servers on CentOS 6 in two remote locations. I need to back up all data from the second server onto the first. The first server will be a virtual machine on ESXi, and the second server will be a physical machine.
I plan to use rsync to sync the data from the second to the first server. Is that OK? Any suggestions?
Rsync is probably the best thing you will find for this. As long as whatever you are doing can tolerate the possible differences between rsync runs, it should be fine. Rsync normally creates a new file under a temporary name and renames it only when the transfer is complete, so programs accessing the data will only see one version or the other, never an inconsistent copy while the transfer is in progress.
rsync has its own issues. I still use it, but I've learned not to trust it completely. If you have a deep directory hierarchy and lots of files, it may run out of memory and crash.
I'm not sure I'd blame rsync if you don't have enough RAM... But the 3.x versions are probably better about that.
I've also had it fail silently to copy files.
That's odd, unless it actually was killed by the OOM killer.
In the past I've written wrapper scripts that break down the rsync into several 'chunks', and check the number of files on source and target servers at the end. Some people run rsync and then immediately run it again!
Running twice is a reasonable thing - maybe even running until no files are changing.
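Something like this loop captures the "run until nothing changes" idea (same placeholder host and paths as earlier in the thread):

    # repeat the sync until a pass reports no changed items; give up after 5 passes
    for pass in 1 2 3 4 5; do
        changed=$(rsync -ai --delete -e ssh second-server:/srv/data/ /backup/second-server/ | wc -l)
        [ "$changed" -eq 0 ] && break
    done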
On Wed, Aug 10, 2011 at 1:05 PM, Les Mikesell lesmikesell@gmail.com wrote:
On 8/9/11 7:37 PM, Cliff Pratt wrote:
On Wed, Aug 10, 2011 at 8:05 AM, Les Mikesell lesmikesell@gmail.com wrote:
On 8/9/2011 2:50 PM, Railic Njegos wrote:
Hi all, I plan to implement two file servers on CentOS 6 in two remote locations. I need to back up all data from the second server onto the first. The first server will be a virtual machine on ESXi, and the second server will be a physical machine.
I plan to use rsync to sync the data from the second to the first server. Is that OK? Any suggestions?
Rsync is probably the best thing you will find for this. As long as whatever you are doing can tolerate the possible differences between rsync runs, it should be fine. Rsync normally creates a new file under a temporary name and renames it only when the transfer is complete, so programs accessing the data will only see one version or the other, never an inconsistent copy while the transfer is in progress.
rsync has its own issues. I still use it, but I've learned not to trust it completely. If you have a deep directory hierarchy and lots of files, it may run out of memory and crash.
I'm not sure I'd blame rsync if you don't have enough RAM... But the 3.x versions are probably better about that.
Well, up to a point I'd agree with you. However, I can't go to my boss and ask for more RAM just to get rsync to work, on top of what was specced for the app; he'd probably walk away muttering things like "Windows"....
My point, however, was not to diss a really good utility but to give some hints and tips. I started to use rsync when I had the need, and everyone told me how good it was. And it is. But it does have its little quirks.
I've also had it fail silently to copy files.
That's odd, unless it actually was killed by the OOM killer.
You are likely correct, but I didn't have time (at the time) to investigate further.
In the past I've written wrapper scripts that break down the rsync into several 'chunks', and check the number of files on source and target servers at the end. Some people run rsync and then immediately run it again!
Running twice is a reasonable thing - maybe even running until no files are changing.
Yes, indeed.
Cheers,
Cliff
On 08/09/11 12:50 PM, Railic Njegos wrote:
I plan to use rsync to sync the data from the second to the first server. Is that OK? Any suggestions?
rsync doesn't tolerate network glitches very well, in my experience. It's also an incremental file backup/copy and won't be doing a 'snapshot', so if any of the files you're copying are randomly updated, like a database, it's quite possible for the copy to be useless.
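If the data happens to live on LVM, one partial workaround is to take a short-lived snapshot and rsync from that, so the copy at least reflects a single point in time (volume and mount names below are only examples):

    # on the second (physical) server, freeze a point-in-time view of the data
    lvcreate --snapshot --size 2G --name data-snap /dev/vg0/data
    mkdir -p /mnt/data-snap
    mount -o ro /dev/vg0/data-snap /mnt/data-snap

    # copy from the frozen snapshot rather than the live filesystem
    rsync -az --delete /mnt/data-snap/ first-server:/backup/second-server/

    umount /mnt/data-snap
    lvremove -f /dev/vg0/data-snap

A database still really wants its own dump taken first, but for ordinary files this avoids copying them mid-write.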
As a backup strategy, having a single copy that you overwrite when you make a new copy is weak. You have no history; you can't recover the file that the user overwrote two days ago and forgot to tell you about until today, because you just overwrote your backup with his mistake last night.
It really depends on what the point of this replica is, what the usage patterns are, and what the data archiving expectations are.
On 8/9/11 7:54 PM, John R Pierce wrote:
On 08/09/11 12:50 PM, Railic Njegos wrote:
I plan to use rsync to sync the data from the second to the first server. Is that OK? Any suggestions?
rsync doesn't tolerate network glitches very well, in my experience. It's also an incremental file backup/copy and won't be doing a 'snapshot', so if any of the files you're copying are randomly updated, like a database, it's quite possible for the copy to be useless.
As a backup strategy, having a single copy that you overwrite when you make a new copy is weak. You have no history; you can't recover the file that the user overwrote two days ago and forgot to tell you about until today, because you just overwrote your backup with his mistake last night.
It really depends on what the point of this replica is, what the usage patterns are, and what the data archiving expectations are.
Yes, if what you really want is an online history, you might like BackupPC. It can use rsync for the transport and pools all duplicate files as hardlinks, with optional compression, so it can store more history than you would expect. But then you have to restore to get usable copies.
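BackupPC does its pooling internally, but the same hardlink trick can be sketched with plain rsync and --link-dest if you want dated copies without the framework (paths are placeholders):

    # each run lands in a dated directory; files unchanged since the previous run
    # are hardlinked against it instead of stored again, so history stays cheap
    # (the very first run just warns that 'latest' doesn't exist yet)
    today=$(date +%F)
    rsync -az --delete -e ssh \
          --link-dest=/backup/second-server/latest/ \
          second-server:/srv/data/ "/backup/second-server/$today/"
    ln -sfn "/backup/second-server/$today" /backup/second-server/latest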
On Tue, 2011-08-09 at 17:54 -0700, John R Pierce wrote:
rsync doesn't tolerate network glitches very well, in my experience. It's also an incremental file backup/copy and won't be doing a 'snapshot', so if any of the files you're copying are randomly updated, like a database, it's quite possible for the copy to be useless.
As a backup strategy, having a single copy that you overwrite when you make a new copy is weak. You have no history; you can't recover the file that the user overwrote two days ago and forgot to tell you about until today, because you just overwrote your backup with his mistake last night.
It really depends on what the point of this replica is, what the usage patterns are, and what the data archiving expectations are.
What do you suggest for scheduled incremental saves of files that change irregularly?
On 08/09/11 6:23 PM, Always Learning wrote:
What do you suggest for scheduled incremental saves of files that change irregularly?
Well, there's always the classic monthly/weekly/daily incremental/differential sequence using dump (assuming it's extfs).
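Roughly, that rotation looks like this (device and dump file names are only examples, and -u records each run in /etc/dumpdates):

    # monthly: full (level 0) dump
    dump -0u -f /backup/dumps/data-full.dump /dev/mapper/vg0-data
    # weekly: level 1 picks up everything changed since the last level 0
    dump -1u -f /backup/dumps/data-week.dump /dev/mapper/vg0-data
    # daily: level 2 (or 2..7 through the week) picks up changes since the last lower level
    dump -2u -f /backup/dumps/data-day.dump /dev/mapper/vg0-data

    # browse and extract interactively from any of the dump files
    restore -i -f /backup/dumps/data-full.dump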
BackupPC is neat, but you end up with a really huge, gnarly filesystem with bazillions of files and links, becoming unmanageable over time.
On 8/9/11 9:35 PM, John R Pierce wrote:
On 08/09/11 6:23 PM, Always Learning wrote:
What do you suggest for scheduled incremental saves of files that change irregularly?
Well, there's always the classic monthly/weekly/daily incremental/differential sequence using dump (assuming it's extfs).
BackupPC is neat, but you end up with a really huge, gnarly filesystem with bazillions of files and links, becoming unmanageable over time.
As a filesystem it continues to work just fine. It is only 'unmanageable' in the sense that it becomes impractical to back it up with file-oriented methods, because reconstructing that number of hardlinks is time consuming. But (a) it _is_ the backup, and (b) image-oriented filesystem copies work fine - or splitting and resyncing RAID mirrors.
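For example, something like this streams a block-level image of the pool offsite instead of trying to recreate millions of hardlinks file by file (device and host names are made up):

    # with the BackupPC pool filesystem unmounted, or on a read-only LVM snapshot,
    # copy the whole device rather than walking the files
    dd if=/dev/vg0/backuppc bs=1M | gzip -c | ssh offsite-host 'cat > /backup/backuppc-pool.img.gz'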
I plan to use the copy as a backup, because the second server will be an old physical computer (with about 2 TB of disk) in a remote office and the first server will be a virtual machine on storage. On the first server I plan to have one folder into which I will copy, via rsync, all the files from the second server.
On Wed, Aug 10, 2011 at 5:37 AM, Les Mikesell lesmikesell@gmail.com wrote:
On 8/9/11 9:35 PM, John R Pierce wrote:
On 08/09/11 6:23 PM, Always Learning wrote:
What do you suggest for scheduled incremental saves of files that change irregularly?
Well, there's always the classic monthly/weekly/daily incremental/differential sequence using dump (assuming it's extfs).
BackupPC is neat, but you end up with a really huge, gnarly filesystem with bazillions of files and links, becoming unmanageable over time.
As a filesystem it continues to work just fine. It is only 'unmanageable' in the sense that it becomes impractical to back it up with file-oriented methods, because reconstructing that number of hardlinks is time consuming. But (a) it _is_ the backup, and (b) image-oriented filesystem copies work fine - or splitting and resyncing RAID mirrors.
On 8/10/11 1:20 AM, Railic Njegos wrote:
I plan to use the copy as a backup, because the second server will be an old physical computer (with about 2 TB of disk) in a remote office and the first server will be a virtual machine on storage. On the first server I plan to have one folder into which I will copy, via rsync, all the files from the second server.
That plan will work, but it won't protect against things like accidental deletions or overwriting of important files that aren't noticed until after the next rsync run wipes out your copy. BackupPC or a similar backup framework can keep a history of copies online and cover both scenarios. BackupPC is particularly nice in that its compression and pooling keep the history from using much space, and it provides a web interface for browsing the backups and restoring - and you can download files directly from the browser if you want. 2 TB is a lot to copy remotely, though. You may want to use some other means to get the initial copy over, like copying to an external drive. Once the first copy is in place, rsync will only need to copy the changes.
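The seed-then-sync sequence would look roughly like this (paths and the mount point of the external drive are just examples):

    # 1) at the remote office, copy the data onto an external drive
    rsync -aH /srv/data/ /mnt/usbdisk/data/

    # 2) at the main site, load that seed into place on the first server
    rsync -aH /mnt/usbdisk/data/ /backup/second-server/

    # 3) from then on, only the changes have to cross the WAN
    rsync -az --delete -e ssh second-server:/srv/data/ /backup/second-server/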
Is BackupPC the better solution? Is there any similar software for this problem?
On Wed, Aug 10, 2011 at 3:03 PM, Les Mikesell lesmikesell@gmail.com wrote:
On 8/10/11 1:20 AM, Railic Njegos wrote:
I plan to use the copy as a backup, because the second server will be an old physical computer (with about 2 TB of disk) in a remote office and the first server will be a virtual machine on storage. On the first server I plan to have one folder into which I will copy, via rsync, all the files from the second server.
That plan will work, but it won't protect against things like accidental deletions or overwriting of important files that aren't noticed until after the next rsync run wipes out your copy. BackupPC or a similar backup framework can keep a history of copies online and cover both scenarios. BackupPC is particularly nice in that its compression and pooling keep the history from using much space, and it provides a web interface for browsing the backups and restoring - and you can download files directly from the browser if you want. 2 TB is a lot to copy remotely, though. You may want to use some other means to get the initial copy over, like copying to an external drive. Once the first copy is in place, rsync will only need to copy the changes.
On Wednesday, August 10, 2011 4:43 PM +0200, Railic Njegos railic.njegos@gmail.com wrote:
Is BackupPC the better solution? Is there any similar software for this problem?
That's what I'm using to back up Windows shares, using rsync mode to do the actual transfer. I'm using a Windows port of rsyncd to serve the files.
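The rsyncd side is just a module definition; a minimal one looks roughly like this (the module name, path and user are examples, and the cygwin-style path assumes a cwRsync-type port):

    # write a minimal rsyncd.conf on the Windows box
    cat > /etc/rsyncd.conf <<'EOF'
    [data]
        path = /cygdrive/c/data
        read only = yes
        auth users = backuppc
        secrets file = /etc/rsyncd.secrets
    EOF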