What's available for doing near-realtime master->slave file replication between two CentOS systems?
Cronjobs running rsync won't cut it.
Ideally, I'd like something that works the way Slony-I does for Postgres databases, where all file-system block-level transactions on the master get replicated to the second system on an 'as fast as practical' basis (a second or two of lag is acceptable). If the slave is offline or connectivity is interrupted, these transactions should be journaled so they can be played back when things resume.
I'm open to other approaches; so far I've found very little in this category. I briefly considered using LVM mirroring over iSCSI to the slave, but A) this is synchronous, and B) recovery from a service interruption would require remirroring the whole volume.
I've been reading about GFS but am somewhat confused as to its capabilities. The examples given on Red Hat's pages seem to involve shared storage (SAN or whatever) and distributed cluster access. I don't need any of that, just simple master->slave one-way asynchronous replication.
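To make the requirement above concrete, here is a toy sketch of the journal-and-replay behaviour being asked for: writes land on the master immediately, are forwarded to the slave when it is reachable, and otherwise wait in an ordered journal until the link returns. The class and attribute names are invented for illustration, and the "blocks" are just dictionary entries; this does not describe how DRBD, GFS, or any other tool mentioned in this thread works internally.

```python
from collections import deque


class JournalingReplicator:
    """Toy model of one-way asynchronous replication with a replay journal."""

    def __init__(self):
        self.master = {}        # block number -> data on the master
        self.replica = {}       # block number -> data on the slave
        self.journal = deque()  # writes not yet applied to the slave, in order
        self.connected = True   # whether the slave is currently reachable

    def write(self, block, data):
        # The master never waits for the slave (asynchronous).
        self.master[block] = data
        self.journal.append((block, data))
        if self.connected:
            self.flush()

    def flush(self):
        # Replay journaled writes on the slave in their original order.
        while self.journal:
            block, data = self.journal.popleft()
            self.replica[block] = data

    def reconnect(self):
        self.connected = True
        self.flush()


rep = JournalingReplicator()
rep.write(0, b"a")
rep.connected = False          # link to the slave drops
rep.write(1, b"b")
rep.write(0, b"c")
pending = len(rep.journal)     # writes waiting for the slave
rep.reconnect()                # link returns, journal is played back
in_sync = rep.replica == rep.master
```

The point of the journal is that recovery cost is proportional to the writes missed during the outage, rather than to the size of the whole volume as with a full remirror.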
John R Pierce wrote:
<snip>
CentOS mailing list CentOS@centos.org http://lists.centos.org/mailman/listinfo/centos
For GFS you don't need a SAN. You can do it simply. Or take DRBD, but for that you must compile a kernel module. http://www.drbd.org/
On 10/6/07, Frank Büttner frank-buettner@gmx.net wrote:
For GFS you don't need a SAN. You can do it simply. Or take DRBD, but for that you must compile a kernel module. http://www.drbd.org/
CentOS provides DRBD. See:
http://wiki.centos.org/HowTos/Ha-Drbd
Akemi
Akemi Yagi wrote:
CentOS provides DRBD. See:
http://wiki.centos.org/HowTos/Ha-Drbd
Akemi
But not at the latest version :( See Bug# 0002339. So you often have trouble when a new kernel update is waiting at the door.
On 10/6/07, Frank Büttner frank-buettner@gmx.net wrote:
But not at the latest version :( See Bug# 0002339. So you often have trouble when a new kernel update is waiting at the door.
I believe they are up-to-date now. My understanding is that there is some delay after a new kernel comes out, but it should not be more than a few days. But we have to remember that the whole CentOS project is based on voluntary work by both CentOS team members and the community.
Akemi
If using a file system for the file storage isn't mandatory, then I would go with an RDBMS such as Postgres, with each file stored in a blob field of a table. If the systems are in the same location and can use a shared file system, then I would look into getting a netraid SCSI card or something similar. The netraid cards have the very nice capability of all sitting on a single SCSI bus along with the datastore device (a hard drive or HDD array). No replication occurs because they all use the same data source jointly.
Geoff
Sent from my BlackBerry wireless handheld.
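As a rough illustration of the "file stored in a blob field of a table" idea, the sketch below keeps file contents in a single table keyed by file name. It uses Python's built-in sqlite3 module purely as a stand-in so the example runs without a server; the table name `files`, its columns, and the helper functions are assumptions made for this example, and a Postgres setup would need its own driver and column type.

```python
import sqlite3


def store_file(conn, name, data):
    """Insert or overwrite one file's bytes under its name."""
    conn.execute(
        "INSERT OR REPLACE INTO files (name, content) VALUES (?, ?)",
        (name, data),
    )
    conn.commit()


def load_file(conn, name):
    """Return the stored bytes for a file name, or None if it is absent."""
    row = conn.execute(
        "SELECT content FROM files WHERE name = ?", (name,)
    ).fetchone()
    return None if row is None else bytes(row[0])


# In-memory database, used here only so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (name TEXT PRIMARY KEY, content BLOB NOT NULL)")

store_file(conn, "report.csv", b"id,value\n1,42\n")
content = load_file(conn, "report.csv")
```

Once the files live in a table, whatever mechanism replicates the database would carry them along, which is the attraction of this approach over a separate file-level replication layer.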
-----Original Message-----
From: "Akemi Yagi" amyagi@gmail.com
Date: Sat, 6 Oct 2007 12:01:44
To: "CentOS mailing list" centos@centos.org
Subject: Re: [CentOS] near-realtime file system replication
<snip>
gjgowey@tmo.blackberry.net wrote:
If using a file system for the file storage isn't mandatory, then I would go with an RDBMS such as Postgres, with each file stored in a blob field of a table.
<snip>
The application for which I have an immediate requirement actually uses Postgres *and* files. The files are required for stupid external reasons: they are collected and FTP'd to another system beyond our control. I'm already intending to evaluate Slony-I for the Postgres replication.
A program that acts as an interface between the RDBMS and whatever wants the files can be constructed easily. Personally, I would look into using Java to create an applet that uses JDBC to get the files from the RDBMS and then uses regular Java libraries to "ftp" them. Stream to stream, essentially. The files extracted from the RDBMS never need to exist on the file system, only in memory.
Geoff
Sent from my BlackBerry wireless handheld.
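The stream-to-stream idea can be sketched as follows. The suggestion above was Java with JDBC; this example swaps in Python, with the built-in sqlite3 module standing in for the database and ftplib for the FTP side, so it is an illustration of the pattern rather than the proposed implementation. The `files` table, the function names, and the fake uploader used in the demonstration are all assumptions for this example.

```python
import io
import sqlite3
from ftplib import FTP


def send_file(conn, name, upload):
    """Fetch one stored file and pass it to `upload` as an in-memory stream."""
    row = conn.execute(
        "SELECT content FROM files WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    # The bytes go straight into a memory buffer; nothing is written to disk.
    upload(name, io.BytesIO(row[0]))


def ftp_uploader(ftp: FTP):
    """Wrap an already connected and logged-in FTP session as an uploader."""
    def upload(name, stream):
        ftp.storbinary(f"STOR {name}", stream)
    return upload


# Demonstration with a fake uploader, so no FTP server is needed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (name TEXT PRIMARY KEY, content BLOB NOT NULL)")
conn.execute("INSERT INTO files VALUES (?, ?)", ("report.csv", b"id,value\n1,42\n"))

sent = {}
send_file(conn, "report.csv", lambda name, stream: sent.update({name: stream.read()}))
```

Against a real server, the last call would take `ftp_uploader(ftp)` instead of the lambda, where `ftp` is an `ftplib.FTP` session opened for the remote system. Keeping the uploader as a plain callable is a design choice that lets the database side be exercised without any network access.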
-----Original Message-----
From: John R Pierce pierce@hogranch.com
Date: Sat, 06 Oct 2007 13:42:42
To: CentOS mailing list centos@centos.org
Subject: Re: [CentOS] near-realtime file system replication
<snip>
On Sat, 2007-10-06 at 12:01 -0700, Akemi Yagi wrote:
I believe they are up-to-date now. My understanding is that there is some delay after a new kernel comes out, but it should not be more than a few days. But we have to remember that the whole CentOS project is based on voluntary work by both CentOS team members and the community.
Would not dkms, as described here http://wiki.centos.org/HowTos/BuildingKernelModules
Akemi
Previous reply: wrong key. Sorry.
On Sat, 2007-10-06 at 12:01 -0700, Akemi Yagi wrote:
I believe they are up-to-date now. My understanding is that there is some delay after a new kernel comes out, but it should not be more than a few days. But we have to remember that the whole CentOS project is based on voluntary work by both CentOS team members and the community.
Would not the dkms facility, as described here
http://wiki.centos.org/HowTos/BuildingKernelModules
ameliorate version problems? I'm not sure, but it seems to be useful from my reading.
Akemi
<snip>
-- Bill
On 10/6/07, William L. Maltby CentOS4Bill@triad.rr.com wrote:
Would not the dkms facility, as described here
http://wiki.centos.org/HowTos/BuildingKernelModules
ameliorate version problems? I'm not sure, but it seems to be useful from my reading.
dkms is useful for automating the rebuilding of kernel modules upon booting a new kernel. RPMForge offers a number of dkms-based modules (like the one for the nvidia driver). dkms is, however, not the method CentOS supports or prefers (kmod is). Among other things, it requires gcc, which is not favored in a server environment (so I understand). In this particular case, drbd is provided as SRPMS, and CentOS simply rebuilds from them.
Akemi
On Sat, 2007-10-06 at 13:18 -0700, Akemi Yagi wrote:
<snip>
Aha! Thanks for the clarification. Not being an admin in a server environment, I looked at it from my workstation (home user) point of view.
-- Bill