hi Guys
One of the key problems we need to solve, and with some urgency, is the mirror problem: how do we get content 'out there' fast enough to reduce the time-to-release lag? Typically, from the time everything is ready to go and the release notes / release email are done, it takes 3 days to reach a releasable state on the mirror network.
The rsync tree that we are using at this point is clearly not good enough (either because of how we use it, or because of what it is and how it works). Also, release-time activity and how donors react to it make it tricky to rely on specific nodes - something we really need to get a handle on in order to improve the cascading rsync tree setup we have in place.
Thoughts, ideas? Discuss!
- KB
On 08/11/13 22:00, Karanbir Singh wrote:
[...]
How about some sort of private bittorrent network?
T
On Fri, Nov 08, 2013 at 10:20:24PM +0000, Trevor Hemsley wrote:
[...]
How about some sort of private bittorrent network?
Indeed, this was my idea as well. Get a nice master with gig/10G and party on, Wayne.
On 11/08/2013 10:20 PM, Trevor Hemsley wrote:
How about some sort of private bittorrent network?
Given how many people bring up the bittorrent word every time we ask 'what options are there', it might be good to actually try this. I guess the biggest challenge would be to share a torrent tracker or some torrent info at different levels:
1) internally for .centos.org edge-nodes
2) with external mirrors
3) with everyone
Technically, (1) and (2) could be merged to get better all-around 'performance', but as has been argued (and proven) in the past, when saturating links, HTTP traffic gets there faster than bittorrent does.
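To make the tiering concrete, one possible (not decided) tool pairing for levels 1 and 2; the tracker hostname and paths below are invented for illustration:

```shell
# On the master: build a torrent for the release tree, pointing at a
# tracker we would run ourselves (hostname invented).
mktorrent -a http://tracker.centos.org:6969/announce \
          -o centos-6.5-x86_64.torrent 6.5/

# On each .centos.org edge node / consenting external mirror:
# fetch and keep seeding into the given directory.
transmission-cli -w /var/www/mirror centos-6.5-x86_64.torrent
```

Whether the tracker (or just the .torrent file) is shared at level 2 or 3 is then a distribution question, not a tooling one.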
On Fri, Nov 08, 2013 at 10:45:56PM +0000, Karanbir Singh wrote:
On 11/08/2013 10:20 PM, Trevor Hemsley wrote:
How about some sort of private bittorrent network?
Given how many people bring up the bittorrent word every time we ask 'what options are there', it might be good to actually try this. I guess the biggest challenge would be to share a torrent tracker or some torrent info at different levels:
- internally for .centos.org edge-nodes
- with external mirrors
- with everyone
[...]
I think you will need some beefy initial distribution boxes to make it worth your while. What is the overall architecture / speed of each device in #1 ?
On 11/08/2013 10:51 PM, Bryan Seitz wrote:
I think you will need some beefy initial distribution boxes to make it worth your while. What is the overall architecture / speed of each device in #1 ?
It's pretty variable. We've got a few 100 Mbps boxes, a few gigabit machines, and a bunch in the middle. Also, lots of geo-holes: e.g. we don't have a .centos.org machine that is within 3 to 5 hops of anything in Africa.
Add to that, we have a pretty constant and consistent Chaos Monkey situation going on in the network, with random machines doing random things at random points, and .centos.org needing to run in spite of that.
Regards
On 08.11.2013 22:20, Trevor Hemsley wrote:
[...]
How about some sort of private bittorrent network?
Or something like BTSync? Or maybe a push-based method?
One minor change to ease the situation would be to not push everything out to the mirrors at the same time, but to do it in smaller chunks.
For example, once the RPMs are pushed to CR, those packages (and the unchanged packages from the prior release) could be pushed to the new os directory - perhaps along with a note saying that the packages are there only for seeding the mirror network for the eventual release and should not be used yet. Repodata wouldn't need to be included at this stage. The centosplus, extras etc. directories could be populated at this stage as well.
Syncing those out would take a few days. During syncing, the devs + QA team would be busy spinning and testing the .isos. Once they're deemed OK, they could be pushed to the master for syncing to mirrors, along with the few missing packages and repodata for the os directory.
The amount of data transferred would be the same as before. The advantage of this approach is that there would be less demand for bandwidth at the .iso sync stage, because the other bits were already synced to most of the mirrors some days ago.
Might it also help to incrementally sync as things are built?
On Sat, Nov 09, 2013 at 01:23:19AM +0200, Anssi Johansson wrote:
[...]
_______________________________________________
CentOS-devel mailing list
CentOS-devel@centos.org
http://lists.centos.org/mailman/listinfo/centos-devel
Bryan Seitz wrote:
Might it also help to incrementally sync as things are built?
I believe we still want to test things out a bit before the packages are made public. So, pushing things out immediately after they've been built might not be appropriate. When the packages hit CR, they've already been tested a bit by the QA team.
An addition to my previous idea -- when upstream releases the new release and we know which packages get updated, we could copy the previous release's unchanged packages to the new release's os tree immediately. This would also spread the load a bit.
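Copying the previous release's unchanged packages forward doesn't have to duplicate data on the master side: `cp -al` makes hard links instead of copying file data. A toy demonstration (paths and package names invented):

```shell
#!/bin/sh
set -eu

tree=$(mktemp -d)
mkdir -p "$tree/6.4/os/x86_64/Packages"
echo unchanged > "$tree/6.4/os/x86_64/Packages/bash-4.1.rpm"

# Seed the new release tree from the old one: -a preserves attributes,
# -l makes hard links instead of copying file contents.
mkdir -p "$tree/6.5/os/x86_64"
cp -al "$tree/6.4/os/x86_64/Packages" "$tree/6.5/os/x86_64/Packages"

# Both names now point at the same inode (link count 2).
links=$(stat -c %h "$tree/6.5/os/x86_64/Packages/bash-4.1.rpm")
echo "link count: $links"
```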
rsync's -H flag would mean that the well-behaving mirrors would need to transfer only a tiny bit of data at that point. However, there are also mirrors that don't use the -H flag. Those mirrors would need to download a few gigabytes of data each at this stage.
One more thing -- if the bandwidth of the msync servers is the bottleneck, we could reduce their bandwidth requirements by serving the CR packages from public mirrors instead of mirror.centos.org. Most of the mirror.centos.org machines are also in the msync.centos.org rotation, thus they share the same bandwidth. I don't have statistics of the bandwidth consumed by CR. Someone who has access to such stats could see if this change would be worth implementing or not.
On 11/09/2013 01:07 AM, Anssi Johansson wrote:
One more thing -- if the bandwidth of the msync servers is the bottleneck, we could reduce their bandwidth requirements by serving the CR packages from public mirrors instead of mirror.centos.org. [...]
We certainly need to get smarter about the mirror/msync split - and go back to the older model we used, where msync machines that were seeding the mirror network were taken out of the mirror.centos.org rotation - and we had dedicated boxes for some high-capacity mirrors (like kernel.org, heanet.ie and mirrorservice.org).
Again, this is a great thing to try and PoC and work out a functional setup - it would, of course, need to be automated via puppet and managed mostly via zabbix. If there are manual steps involved, it's going to fail and, as has happened already, get discarded since it's just more stuff to do.
We usually run hardlink over the entire mirror tree, so that should solve a large part of the duplicate-content-being-rsync'd-again problem.
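For reference, a toy approximation of what hardlink(1) does over a tree (checksum-only comparison here; the real tool also checks mode and ownership before linking):

```shell
#!/bin/sh
set -eu

tree=$(mktemp -d); idx=$(mktemp -d)
echo samedata > "$tree/os-copy.rpm"
echo samedata > "$tree/iso-copy.rpm"   # identical content, separate inodes

# Hard-link every file whose checksum matches an earlier file.
find "$tree" -type f | sort | while read -r f; do
    sum=$(md5sum "$f" | cut -d' ' -f1)
    if [ -e "$idx/$sum" ]; then
        ln -f "$(cat "$idx/$sum")" "$f"   # replace dupe with a hard link
    else
        printf '%s' "$f" > "$idx/$sum"    # first file with this checksum
    fi
done

echo "link count: $(stat -c %h "$tree/os-copy.rpm")"
```

After the run, the os/ and iso-side copies of an identical payload share one inode, which is exactly what makes the later rsync -H runs cheap.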
Another issue we have, and should try to work around, is that the entire tree - os/, isos/, the various repos - is moved in one go. Maybe we should spread those out a bit, allowing more rsync runs to finish sooner (and therefore allowing subtree rsyncs to start earlier).
The old speedmatrix code I wrote back in 2010 mostly still works, and gives us a fair idea of what capacity the various msync and mirror.c.o machines have at a given point in time; running it a few times through a 24-hour cycle almost always gives a true picture of reality.
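A crude stand-in for a speedmatrix-style probe: time one fetch and report bytes per second. The demo fetches a local file:// URL just to stay self-contained; in real use the URL would be a test object on each msync/mirror.c.o node (the hostname in the comment is invented).

```shell
#!/bin/sh
set -eu

probe() {
    # Print the average download speed (bytes/sec) for one URL.
    curl -s -o /dev/null -w '%{speed_download}\n' "$1"
}

# Self-contained demo against a local 1 MiB file; real use would hit
# e.g. http://mirror.example.org/speedtest.bin on each node.
sample=$(mktemp)
head -c 1048576 /dev/zero > "$sample"
speed=$(probe "file://$sample")
echo "speed_download: $speed"
```

Run from a few vantage points through the day, collecting these numbers gives the same capacity picture the speedmatrix runs did.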
I can get a bunch of VMs online to trial some of these things, but they will need to be in one play-cloud or some such. Does anyone have ideas on how one might create 'real world network problems' on a bunch of VMs? E.g. how do we cap, say, 25% of the network interfaces to 10 Mbps, and how might we generate latency? I have a bunch of ideas on how to do this between machines, but not so much on VMs running on the same (or just a few) machines.
- KB
On Sat, Nov 9, 2013 at 3:46 PM, Karanbir Singh mail-lists@karan.org wrote:
[...]
-- Karanbir Singh +44-207-0999389 | http://www.karan.org/ | twitter.com/kbsingh
GnuPG Key : http://www.karan.org/publickey.asc
Bandwidth limiting could probably be accomplished using 'tc', otherwise the "bwlimit" option to rsync might work well enough for a simulation. Looks like 'tc' can help with latency too ( http://stackoverflow.com/questions/614795/simulate-delayed-and-dropped-packe... ).
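For the VM test network, the usual netem recipe would look roughly like this (needs root; eth0 stands for whatever interface the VM actually uses, and the numbers are only examples):

```shell
# Add 100ms +/- 20ms of delay on eth0 (netem as the root qdisc)...
tc qdisc add dev eth0 root handle 1:0 netem delay 100ms 20ms

# ...and chain a token-bucket filter under it to cap the link at 10mbit.
tc qdisc add dev eth0 parent 1:1 handle 10: tbf rate 10mbit burst 32kbit latency 400ms

# Undo everything:
tc qdisc del dev eth0 root
```

For a quick trial without root, rsync's --bwlimit option caps a single transfer, though it simulates nothing at the packet level.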
What kind of control do you have over the mirrors? One option is to reduce the amount of data that needs to be transferred. This could be accomplished by:
a) transferring only package files to mirrors, then running scripts to build the ISOs in place, or
b) transferring only the ISOs, mounting them through loopback, then building the repos with symlinks into the mounted ISOs.
I am assuming that the packages in the repos are identical to those on the ISOs, so there's already a bunch of duplicate data getting sent over.
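Option b) could be sketched like this, with a plain directory standing in for the loop-mounted image (the real step would be a root-only `mount -o loop CentOS-6.5.iso ...`; the paths and package name are invented):

```shell
#!/bin/sh
set -eu

# Stand-in for the loop mount; really: mount -o loop CentOS-6.5.iso "$isomnt"
isomnt=$(mktemp -d)
mkdir -p "$isomnt/Packages"
echo pkg > "$isomnt/Packages/foo-1.0.rpm"

# Build the os/ tree as symlinks into the mounted image: no second copy.
ostree=$(mktemp -d)
mkdir -p "$ostree/Packages"
for rpm in "$isomnt"/Packages/*.rpm; do
    ln -s "$rpm" "$ostree/Packages/$(basename "$rpm")"
done
```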
❧ Brian Mathis
On 10.11.2013 06:46, Brian Mathis wrote:
[...]
One option is to reduce the amount of data that needs to be transferred. This could be accomplished by: a) transferring only package files to mirrors, then running scripts to build the ISOs in place, or b) transferring only the ISOs, mounting them through loopback, then building the repos with symlinks into the mounted ISOs.
This might not work - many mirrors do not carry all (or any) of the ISOs. I still think bittorrent/btsync/similar between the primary mirrors, or a select number of mirrors that agree to this, should work fine.
Related read: https://blog.twitter.com/2010/murder-fast-datacenter-code-deploys-using-bitt...
On Sun, Nov 10, 2013 at 09:19:45AM +0000, Nux! wrote:
[...]
This might not work - many mirrors do not carry all (or any) of the ISOs. I still think bittorrent/btsync/similar between the primary mirrors, or a select number of mirrors that agree to this, should work fine.
I was purely talking about the centos infra mirrors btw; external mirrors are still a wildcard and under their owners' control.
On 11/10/2013 06:46 AM, Brian Mathis wrote:
Bandwidth limiting could probably be accomplished using 'tc' [...]
The setup uses Open vSwitch, and for some reason I've failed to get tc working properly with OVS. Help appreciated!
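One OVS-specific workaround is to skip tc on the bridge entirely and use the switch's own per-port ingress policing (the port name vnet0 is invented; rates are in kbps):

```shell
# Cap a VM's interface to ~10 Mbps at the switch port itself.
ovs-vsctl set interface vnet0 ingress_policing_rate=10000
ovs-vsctl set interface vnet0 ingress_policing_burst=1000

# Remove the cap.
ovs-vsctl set interface vnet0 ingress_policing_rate=0
```

Latency would still need netem, e.g. on the VM-side tap device rather than on the bridge.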
- KB
On 11/09/2013 01:23 AM, Anssi Johansson wrote:
[...]
Just make sure that the top-level directory is not readable and privacy is ensured. And yes, I think that pushing the content of the /cr/ directory while we still test|fix the isos is a great idea - it halves the content that needs to be transferred just before the actual release.
On 11/09/2013 12:47 AM, Manuel Wolfshant wrote:
Just make sure that the top level directory is not accessible for reading and the privacy is ensured. [...]
The cr/ to os/ (and previous os/ + updates/) hardlinking into the new os/ release tree solves that problem, I think - but it does not help for isos/. Other repos like plus and extras are mostly just moved along.
And we can't really rely on 700 mode on the mirror root, since lots of mirrors don't run in a way that makes that usable (I guess quite a few do, but not all). Maybe the trick here is to seed up msync well in advance, so that once QA says GO, it's just a matter of opening up rsync from, say, 30-odd machines to the public mirrors.
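The seed-privately-then-open-up step could also live in rsyncd.conf rather than in filesystem modes; a sketch (module name, path and allowed network are invented):

```
# /etc/rsyncd.conf on an msync node
[msync-staging]
    path = /srv/mirror/staging
    list = no                     # hidden from module listings
    hosts allow = 10.0.0.0/8      # msync/edge nodes only while seeding
    read only = yes

# On GO: widen "hosts allow" (or drop it) and the already-seeded content
# becomes public from the same module, with no re-sync needed.
```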
regards