[Ci-users] Unexpected outage 17:00 UTC Today - Service Restored
Ari LiVigni
alivigni at redhat.com
Wed Jun 14 17:31:55 UTC 2017
On Wed, Jun 14, 2017 at 1:28 PM, Karanbir Singh <kbsingh at centos.org> wrote:
>
> hi Ari,
>
> Absolutely! Let's see if we can get Brian for some time later this week,
> or early next week, and thrash through some options.
>
> Regards,
>
+1
Scott Hebert on our team has a lot of Jenkins knowledge and has written
plugins as well.
I have added him to the thread.
>
>
> On 14/06/17 18:05, Ari LiVigni wrote:
> > Hi KB,
> >
> > In the future our team would like to help with Jenkins maintenance and
> > issues. This is something I have spoken about with Brian.
> > Let me know if this is an option you would like to pursue in the near
> > term.
> >
> >
> >
> > On Wed, Jun 14, 2017 at 12:20 PM, Karanbir Singh <kbsingh at centos.org> wrote:
> >
> >
> >
> > On 14/06/17 10:51, Karanbir Singh wrote:
> > >
> > >
> > > On 14/06/17 08:18, Daniel Horák wrote:
> > >> Hi Brian,
> > >> I see lots of slaves offline. Is it connected to yesterday's outage,
> > >> or is it a different issue?
> > >>
> > >> Thanks,
> > >> Daniel
> > >>
> > >> On 06/13/17 19:57, Brian Stinson wrote:
> > >>> Hi Folks,
> > >>>
> > >>> Jenkins was leaking file descriptors and hit a limit today at 17:00 UTC.
> > >>> Service was degraded for about 10 minutes, and service was fully
> > >>> restored at around 17:24.
> > >>>
> > >>> I've increased the open-files limit for Jenkins and am working on tuning
> > >>> the garbage collector to mitigate this in the future.
> > >>>
> > >>> Thanks for your patience, and apologies for any inconvenience.
> > >>>
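
For anyone wanting to watch for this kind of descriptor leak, here is a minimal sketch, assuming a Linux host where /proc is readable for the target process. It is not how the CentOS CI Jenkins master is actually monitored; the function names and the idea of comparing against the soft limit are illustrative assumptions.

```python
import os


def open_fd_count(pid):
    """Count open file descriptors by listing /proc/<pid>/fd (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def soft_nofile_limit(pid):
    """Return the soft 'Max open files' limit from /proc/<pid>/limits.

    Returns None when the limit is 'unlimited' or the line is missing.
    """
    with open(f"/proc/{pid}/limits") as fh:
        for line in fh:
            if line.startswith("Max open files"):
                # Fields: "Max", "open", "files", soft, hard, units
                soft = line.split()[3]
                return None if soft == "unlimited" else int(soft)
    return None


def usage_ratio(count, limit):
    """Fraction of the limit in use, or None when there is no finite limit."""
    if not limit:
        return None
    return count / limit


if __name__ == "__main__":
    # Hypothetical usage: inspect this script's own process. For a service
    # such as Jenkins, pass its PID instead (reading another user's
    # /proc/<pid>/fd usually requires root or the same user).
    pid = os.getpid()
    count = open_fd_count(pid)
    limit = soft_nofile_limit(pid)
    print(f"pid={pid} open_fds={count} soft_limit={limit} "
          f"ratio={usage_ratio(count, limit)}")
```

A ratio that keeps climbing toward 1.0 between restarts would be consistent with a leak; raising the limit only buys time until the underlying cause is addressed.
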
> > >
> > > I noticed a lot of slaves were down, and was pointed to this by a few
> > > people on chat.openshift.io and irc.freenode. On investigation it
> > > looked like the Jenkins master had exhausted RAM, and other jobs on the
> > > machine were killing the CPU with loads up to 50.x; I had to restart the
> > > Jenkins master to bring services back.
> > >
> > > Once Brian is online, he will likely do a more thorough investigation and
> > > get back with details.
> > >
> >
> > Service went down again a few minutes back; I have restarted Jenkins and
> > it's up again.
> >
> > Brian is on a long-haul flight out of the US at the moment. I will try
> > to keep an eye on things, but we're going to need him to look when
> > he can.
> >
> >
>
>
> --
> Karanbir Singh, Project Lead, The CentOS Project
> +44-207-0999389 | http://www.centos.org/ | twitter.com/CentOS
> GnuPG Key : http://www.karan.org/publickey.asc
>
>
> _______________________________________________
> Ci-users mailing list
> Ci-users at centos.org
> https://lists.centos.org/mailman/listinfo/ci-users
>
>
--
-== @ri ==-