[CentOS-mirror] Setting up mirror from scratch

Fri Oct 16 03:49:39 UTC 2020
Moshe M. Katz <mmkatz at umd.edu>

Here's a little bit about how we run mirror.umd.edu (serving almost two
dozen free software projects, up to 5 TB of traffic per day).

We are running on a Dell PowerEdge R720xd with 10x 8TB hard drives using
ZFS in a raidz2 configuration. We also have two 512GB SSDs for L2ARC
caching, and two 1TB hard drives in a mirrored configuration for the
operating system. We have 128GB of RAM and we let ZFS use most of it (right
now ZFS is using 90GB).
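
For a rough sense of what that disk layout yields, here is a minimal back-of-the-envelope sketch. It assumes all ten drives sit in a single raidz2 vdev (the message does not say how the vdevs are arranged) and it ignores ZFS metadata, padding, and reserved space, so the real figure will be somewhat lower.

```python
def raidz2_usable_tb(disk_count: int, disk_size_tb: float) -> float:
    """Approximate usable capacity of one raidz2 vdev, in decimal TB.

    raidz2 spends two disks' worth of space on parity, so roughly
    (disk_count - 2) disks hold data. ZFS overhead is ignored.
    """
    if disk_count < 3:
        raise ValueError("raidz2 needs at least 3 disks")
    return (disk_count - 2) * disk_size_tb


def tb_to_tib(tb: float) -> float:
    """Convert decimal terabytes (10**12 bytes) to tebibytes (2**40 bytes)."""
    return tb * 10**12 / 2**40


# Assumption: a single 10-disk raidz2 vdev built from 8 TB drives.
usable = raidz2_usable_tb(10, 8)
print(f"about {usable:.0f} TB ({tb_to_tib(usable):.1f} TiB) before ZFS overhead")
```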

We serve files using nginx, with a pretty basic configuration that only has
a few changes from the default. You can see it here:
We have symlinks in our web root directory (`/home/mirror/web`) which point
to the actual storage locations (in `/pool/mirrors/NAME`).
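
The symlink layout can be sketched as follows. This is only an illustration of the idea, not a script from our repository; the helper name `link_mirror` is made up for the example, and the two directory arguments stand in for `/home/mirror/web` and `/pool/mirrors`.

```python
import os
from pathlib import Path


def link_mirror(web_root: Path, storage_root: Path, name: str) -> Path:
    """Create web_root/name as a symlink pointing at storage_root/name.

    A link that already points at the right target is left alone;
    anything else already at that path raises FileExistsError.
    """
    target = storage_root / name
    link = web_root / name
    if link.is_symlink() and Path(os.readlink(link)) == target:
        return link  # already in place, nothing to do
    if link.is_symlink() or link.exists():
        raise FileExistsError(f"{link} exists and does not point to {target}")
    link.symlink_to(target, target_is_directory=True)
    return link
```

Calling `link_mirror(Path("/home/mirror/web"), Path("/pool/mirrors"), "centos")` would then expose the `centos` storage directory under the web root.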

We use a set of Python scripts to manage syncing (mostly using rsync). You
can find them here: https://github.com/umd-mirror/scripts
We use Jenkins to schedule the Python scripts because it lets us track the
success or failure of each sync over time. (I don't like this, but it was
this way before I became involved, and it does work pretty well.) I can't
share the whole Jenkins configuration, but here is how our CentOS job is
configured:

   - Run every six hours using: Build Triggers -> Build Periodically
     (checked) -> Schedule: "H */6 * * *"
   - Build -> Execute Shell:
     "sudo -u mirror /home/mirror/scripts/run_mirror.py -v centos"
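
To show why a scheduler like Jenkins can track success and failure, here is a minimal sketch of an rsync wrapper. It is not the real `run_mirror.py`; the upstream URL, the `UPSTREAMS` table, the function names, and the particular rsync options are all assumptions chosen for illustration.

```python
import subprocess

# Hypothetical upstream; replace with the rsync URL of a real upstream mirror.
UPSTREAMS = {"centos": "rsync://upstream.example.org/centos/"}
# Storage location described in this message.
STORAGE_ROOT = "/pool/mirrors"


def build_rsync_command(project: str, verbose: bool = False) -> list:
    """Return the rsync argument list for one mirrored project."""
    source = UPSTREAMS[project]
    dest = f"{STORAGE_ROOT}/{project}/"
    cmd = [
        "rsync",
        "-aH",              # archive mode, preserve hard links
        "--delete",         # remove files that disappeared upstream
        "--delay-updates",  # move updated files into place at the end
        "--timeout=600",    # give up if the transfer stalls for 10 minutes
    ]
    if verbose:
        cmd.append("-v")
    return cmd + [source, dest]


def run_sync(project: str, verbose: bool = False) -> int:
    """Run rsync and return its exit status (0 means success)."""
    return subprocess.run(build_rsync_command(project, verbose)).returncode
```

If a wrapper script ends with `sys.exit(run_sync("centos", verbose=True))`, a non-zero rsync status fails the Jenkins "Execute Shell" step, which is what makes the build history useful as a sync log.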

To let the `jenkins` user run that script as the `mirror` user through
`sudo`, we have the following content in `/etc/sudoers.d/jenkins`:

    jenkins ALL=(mirror) NOPASSWD: /home/mirror/scripts/run_mirror.py

We created a ZFS zpool at /pool/mirrors and then a separate "sub pool" (a
child ZFS dataset) for each software project we mirror. That way we can
manage disk space very
easily and add quotas if we think a project is growing too quickly and
might use too much disk space. (We have not had to do that now that we have
our big new disks, but we used to need it on our old server and we are
prepared to use it again in the future if needed.) It also allows us to
show the used space very quickly and easily on our web page (look at the
end of each row on our home page). To allow the `mirror` user to run ZFS
commands, we created a shell script in `/usr/local/bin/get_lused.sh` which
can be seen in our `scripts` repository. There is also a comment there with
the necessary sudoers entry to allow it to work.
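
As an illustration of how per-project usage can be read from ZFS, here is a minimal sketch that parses the tab-separated output of `zfs list -Hp`. It is not the `get_lused.sh` script from the repository; the parent dataset name `pool/mirrors`, the default property `used`, and the function names are assumptions.

```python
import subprocess


def parse_zfs_list(output: str) -> dict:
    """Parse `zfs list -Hp -o name,<property>` output into {dataset: bytes}."""
    usage = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        name, value = line.split("\t")  # -H gives tab-separated fields
        usage[name] = int(value)        # -p gives exact byte counts
    return usage


def human_size(num_bytes: int) -> str:
    """Format a byte count with binary units, e.g. 1536 -> '1.5 KiB'."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if size < 1024 or unit == "TiB":
            return f"{size:.1f} {unit}"
        size /= 1024


def dataset_usage(parent: str = "pool/mirrors", prop: str = "used") -> dict:
    """Ask ZFS for the given property on `parent` and all of its children."""
    result = subprocess.run(
        ["zfs", "list", "-Hp", "-r", "-o", f"name,{prop}", parent],
        check=True, capture_output=True, text=True,
    )
    return parse_zfs_list(result.stdout)
```

With one dataset per project, a quota is a single property on that dataset, for example `zfs set quota=2T pool/mirrors/centos` (dataset name hypothetical).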

Feel free to ask me for details if you have any questions about our
configuration or scripts.


Moshe Katz
mmkatz at umd.edu

On Thu, Oct 15, 2020 at 8:10 PM Alexandre Leonenko <alex at esecuredata.com>
wrote:

> Hey guys,
> I'm setting up a new mirror as the existing one is going to be eol due to
> using CentOS 6 as a base. (mirror.esecuredata.com)
> It's also running on old Pentium and is very slow nowadays.
> My question is this, what do you guys usually use, apache or nginx for
> serving http?
> Also if you have nifty scripts to share those would be welcome.
> Regards,
> Alex