[CentOS-mirror] Setting up mirror from scratch

Alexandre Leonenko

alex at esecuredata.com
Sat Oct 17 07:13:21 UTC 2020


Thanks! I'll take a look and let you know if I have any questions.


From: CentOS-mirror <centos-mirror-bounces at centos.org> on behalf of Moshe M. Katz <mmkatz at umd.edu>
Sent: Thursday, October 15, 2020 8:49:39 PM
To: Mailing list for CentOS mirrors. <centos-mirror at centos.org>
Subject: Re: [CentOS-mirror] Setting up mirror from scratch

Here's a little bit about how we run mirror.umd.edu (serving almost two dozen free software projects, up to 5 TB of traffic per day).

Hardware:
We are running on a Dell PowerEdge R720xd with 10x 8TB hard drives using ZFS in a raidz2 configuration. We also have two 512GB SSDs for L2ARC caching, and two 1TB hard drives in a mirrored configuration for the operating system. We have 128GB of RAM and we let ZFS use most of it (right now ZFS is using 90GB).
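
As a rough sanity check on the storage figure, here is a minimal sketch of the usable-capacity arithmetic. It assumes all ten drives sit in a single raidz2 vdev (the message does not say how the vdevs are laid out) and ignores ZFS metadata, padding, and the TB/TiB difference; the function name is made up for illustration.

```python
def raidz_usable_tb(disks: int, disk_tb: float, parity: int) -> float:
    """Approximate usable capacity of one raidz vdev.

    raidz2 spends two disks' worth of space on parity, so roughly
    (disks - parity) disks hold data. Real pools report somewhat less.
    """
    if disks <= parity:
        raise ValueError("need more disks than parity devices")
    return (disks - parity) * disk_tb


# Assumed layout: 10 x 8 TB drives in a single raidz2 vdev (parity = 2).
approx_tb = raidz_usable_tb(disks=10, disk_tb=8, parity=2)
```

Under those assumptions the pool offers on the order of 64 TB before overhead.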

Software:
We serve files using nginx, with a pretty basic configuration that only has a few changes from the default. You can see it here: https://github.com/umd-mirror/nginx
We have symlinks in our web root directory (`/home/mirror/web`) which point to the actual storage locations (in `/pool/mirrors/NAME`).
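
The symlink layout described above could be set up with something like the following. This is only a sketch: the helper name `link_project` is hypothetical and is not taken from the umd-mirror scripts; the two directories in the usage note are the ones named in the message.

```python
import os
from pathlib import Path


def link_project(web_root: Path, storage_root: Path, name: str) -> Path:
    """Make web_root/NAME a symlink to storage_root/NAME (safe to re-run)."""
    target = storage_root / name
    link = web_root / name
    if link.is_symlink():
        if Path(os.readlink(link)) == target:
            return link  # already points at the right place
        link.unlink()  # points somewhere else: replace it
    elif link.exists():
        # Refuse to clobber a real file or directory in the web root.
        raise FileExistsError(f"{link} exists and is not a symlink")
    link.symlink_to(target, target_is_directory=True)
    return link
```

For example, `link_project(Path("/home/mirror/web"), Path("/pool/mirrors"), "centos")` would expose `/pool/mirrors/centos` under the web root.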

Management:
We use a set of Python scripts to manage syncing (mostly using rsync). You can find them here: https://github.com/umd-mirror/scripts
We use Jenkins to schedule the Python scripts because it lets us track the success or failure of each sync over time. (I don't like this, but it was this way before I became involved, and it does work pretty well.) I can't share the whole Jenkins configuration, but here is how our CentOS job is configured:

  *   Run every six hours using: Build Triggers -> Build Periodically (checked) -> Schedule: "H */6 * * *"
  *   Build -> Execute Shell: "sudo -u mirror /home/mirror/scripts/run_mirror.py -v centos"

In order to allow the Jenkins user to use `sudo`, we have the following content in `/etc/sudoers.d/jenkins`:
jenkins ALL=(mirror) NOPASSWD: /home/mirror/scripts/run_mirror.py
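
To illustrate why a scheduler like Jenkins can track success and failure, here is a minimal sketch of an rsync wrapper that hands rsync's exit status back to its caller. It is not the contents of `run_mirror.py`; the function names and the upstream URL in the usage note are placeholders, and the rsync options (`-a`, `-H`, `--delete`, `-v`) are just a common choice for mirroring.

```python
import subprocess
from typing import List


def build_rsync_command(upstream: str, dest: str, verbose: bool = False) -> List[str]:
    """Build an rsync command line that mirrors `upstream` into `dest`."""
    # -a: archive mode, -H: preserve hard links, --delete: drop files removed upstream
    cmd = ["rsync", "-aH", "--delete"]
    if verbose:
        cmd.append("-v")
    cmd += [upstream, dest]
    return cmd


def run_sync(upstream: str, dest: str, verbose: bool = False) -> int:
    """Run rsync and return its exit status (0 means success)."""
    return subprocess.run(build_rsync_command(upstream, dest, verbose)).returncode
```

A script would typically end with `sys.exit(run_sync("rsync://upstream.example.org/centos/", "/pool/mirrors/centos/", verbose=True))`, so that a failed sync produces a non-zero exit status and the Execute Shell step marks the build as failed.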

Storage:
We created a ZFS zpool at `/pool/mirrors` and then a separate "sub pool" (in ZFS terms, a child dataset) for each software project we mirror. That way we can manage disk space easily and add quotas if we think a project is growing too quickly and might use too much disk space. (We have not had to do that since we got our big new disks, but we needed it on our old server and are prepared to use it again if necessary.) It also lets us show the used space quickly and easily on our web page (look at the end of each row on our home page). To allow the `mirror` user to run ZFS commands, we created a shell script at `/usr/local/bin/get_lused.sh`, which can be seen in our `scripts` repository. There is also a comment there with the sudoers entry needed to make it work.
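
For readers unfamiliar with per-dataset accounting, the sketch below shows one way the used-space figure and a quota could be handled from Python. It is an assumption-laden illustration, not the `get_lused.sh` script: the helper names are invented, the `logicalused` property is only guessed from the script's name, and a dataset name such as `pool/mirrors/centos` is inferred from the mount path. The `zfs get -H -p -o value` and `zfs set quota=` invocations are standard ZFS commands.

```python
from typing import List


def zfs_get_command(dataset: str, prop: str = "logicalused") -> List[str]:
    """Command that prints one property as a bare number of bytes."""
    # -H: no header, -p: exact (parsable) values, -o value: print only the value
    return ["zfs", "get", "-H", "-p", "-o", "value", prop, dataset]


def zfs_quota_command(dataset: str, size: str) -> List[str]:
    """Command that caps a dataset's space, e.g. size='2T'."""
    return ["zfs", "set", f"quota={size}", dataset]


def parse_bytes(output: str) -> int:
    """Parse the single integer printed by the command from zfs_get_command."""
    return int(output.strip())


def human_size(num_bytes: int) -> str:
    """Format a byte count with binary units for display on a web page."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    size = float(num_bytes)
    for unit in units:
        if size < 1024 or unit == units[-1]:
            return f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} {units[-1]}"  # not reached; keeps type checkers happy
```

The command lists would be passed to `subprocess.run` (through `sudo` where the calling user lacks ZFS privileges), and `human_size(parse_bytes(...))` gives the string to show next to each project.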

Feel free to ask me for details if you have any questions about our configuration or scripts.

Moshe

--
Moshe Katz
mmkatz at umd.edu




On Thu, Oct 15, 2020 at 8:10 PM Alexandre Leonenko <alex at esecuredata.com> wrote:
Hey guys,

I'm setting up a new mirror because the existing one (mirror.esecuredata.com) is going EOL: it uses CentOS 6 as its base.
It's also running on an old Pentium and is very slow nowadays.

My question is this: what do you guys usually use for serving HTTP, Apache or nginx?
Also, if you have nifty scripts to share, those would be welcome.

Regards,
Alex

_______________________________________________
CentOS-mirror mailing list
CentOS-mirror at centos.org
https://lists.centos.org/mailman/listinfo/centos-mirror