Kudos Fabian for having taken care of this during your PTO.

On Mon, Mar 4, 2024 at 2:07 PM Amy Marrich <amy at redhat.com> wrote:

> Thank you so much Fabian for doing all that while on PTO!
>
> Amy
>
> *Amy Marrich*
> She/Her/Hers
> Principal Technical Marketing Manager - Cloud Platforms
> Red Hat, Inc <https://www.redhat.com/>
> amy at redhat.com
> Mobile: 954-818-0514
> Slack: amarrich
> IRC: spotz
>
>
> On Sun, Mar 3, 2024 at 3:47 PM Fabian Arrotin <arrfab at centos.org> wrote:
>
>> On 03/03/2024 20:27, Fabian Arrotin wrote:
>> > On 03/03/2024 19:48, Fabian Arrotin wrote:
>> >> This evening (Sunday), I got a zabbix notification that some
>> >> services hosted on the same hypervisor were down.
>> >> A quick investigation showed that, despite the server running on
>> >> a hardware raid controller, its firmware confirmed data loss and
>> >> corruption.
>> >>
>> >> Although I'm normally on PTO, I still wanted to restore services
>> >> quickly, so I'm working on redeploying the services from scratch
>> >> and restoring data from the last backup, and I hope to have news
>> >> soon ...
>> >>
>> >
>> > Status update : cbs.centos.org kojihub was fully reinstalled from
>> > scratch on a different hypervisor, reconfigured by Ansible, and its
>> > DB restored from the backup that happened earlier today.
>> >
>> > I quickly checked and all operations seem to be working fine.
>> > The only issue you might see is if you submitted a build today
>> > *after* the postgresql backup operation took place; if that's the
>> > case, consider rebuilding your rpm (but it's usually quiet during
>> > the weekend, especially on Sunday)
>> >
>> > Next item to reinstall/restore : git.centos.org
>> >
>>
>> https://git.centos.org is now also fully redeployed from scratch on a
>> different hypervisor, reconfigured fully by ansible and data restored
>> from backup (that's the step that needed more time as I had to restore
>> ~1TiB of data from remote backup server to local pagure instance)
>>
>> What I (quickly) tried after service was restored :
>> - git pull from various repositories
>> - git commit and push to one specific branch (test only)
>> - verified mqtt notifications were also working
>> - push a random file to lookaside cache (testing the identified
>>   fasjson api call that verifies whether I was allowed to push to a
>>   specific sig-infra branch)
>>
>> Everything seems to work, but here is some useful information: as we
>> fully redeployed the machine, the sshd_host_key changed and can be
>> viewed through the web ui : https://git.centos.org/ssh_info
>>
>> Also worth knowing that if you trust our CA, you shouldn't need to
>> worry about the key change, as the new sshd_host_key is also signed by
>> the same CA.
>>
>> That just means that you should trust this in your ~/.ssh/known_hosts
>> (as a single line):
>>
>> @cert-authority *.centos.org ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDXmhva/yVOS6y/sR1Pjd+Gflzkl7azfl3ZIhex5kSHilUjT3DSjfXK0TgSHT93BCKs1/mT84ZKv6s+Ulfc3kC9aykJQnkWJ6I6CjIgfIM547VT2Egx5fKJZ/7yRedYf6HoVPZSAW5WYKZ0fq/DDoAFUuZJkkp3QEzh6TUiXif9qjCu3liXNgkS2uVIWc7+1QTLRxqU3/MCD1YxuOL8ShyMSHlGJTRMMTYq6aAFmlQ/FsA8deb9HeR3PaAZx7Q7jqmiJD5cx9XtrmgM4CCZNFxP9i0s+L7yDKzFQ1ecm1/vzouOsAVcSh7MiAexuBLgbUdhmBDGVEJYQDNENKOdaoiP
>>
>>
>> WRT content/git repositories: same remark as for kojihub/cbs : we
>> restored from backup, so you may have to push commits (if any) again
>> and/or re-upload assets to the lookaside cache if you used
>> git.centos.org this Sunday
>>
>>
>> PS: I'm normally in PTO/Away/Grief mode, so I'm not paying attention
>> to the list or irc. If you encounter any issue due to this
>> unscheduled outage, feel free to open a ticket on
>> pagure.io/centos-infra/issues
>>
>> Kind Regards,
>> --
>> Fabian Arrotin
>> The CentOS Project | https://www.centos.org
>> gpg key: 17F3B7A1 | @arrfab[@fosstodon.org]
>>
>> _______________________________________________
>> CentOS-devel mailing list
>> CentOS-devel at centos.org
>> https://lists.centos.org/mailman/listinfo/centos-devel
>>
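
For anyone who would rather script the ~/.ssh/known_hosts change quoted above than edit the file by hand, here is a minimal Python sketch. It is not an official CentOS tool: the function name ensure_cert_authority and its parameters are hypothetical, and it assumes you pass in the exact @cert-authority line from Fabian's message, joined onto a single line. It appends the line only when an identical one is not already present.

```python
from pathlib import Path


def ensure_cert_authority(known_hosts: Path, ca_line: str) -> bool:
    """Append ca_line to known_hosts unless an identical line exists.

    Hypothetical helper for illustration. Returns True if the file
    was modified, False if the line was already there.
    """
    ca_line = ca_line.strip()
    # Only accept lines using the known_hosts "@cert-authority" marker.
    if not ca_line.startswith("@cert-authority "):
        raise ValueError("expected a line starting with '@cert-authority '")

    text = known_hosts.read_text() if known_hosts.exists() else ""
    if any(line.strip() == ca_line for line in text.splitlines()):
        return False  # already trusted, nothing to do

    # Create the directory if needed (permissions are left to the caller).
    known_hosts.parent.mkdir(parents=True, exist_ok=True)
    # Start on a fresh line if the file does not end with a newline.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with known_hosts.open("a") as fh:
        fh.write(prefix + ca_line + "\n")
    return True
```

A typical call would be ensure_cert_authority(Path.home() / ".ssh" / "known_hosts", line), where line holds the full entry from the message. Running it twice is harmless, since the second call finds the line and leaves the file untouched.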