On Tue, Jul 8, 2014 at 2:55 PM, Reindl Harald h.reindl@thelounge.net wrote:
Unless you are offering to do that for me, for free, on all my systems, having to do it certainly does take something away.
then just don't upgrade to RHEL7 so what
I expect our systems to still have services running past 2020.
Generally speaking, if a service is broken to the point that it needs something to automatically restart it, I'd rather have it die gracefully and not do surprising things until someone fixes it. But then again, doesn't mysqld manage to accomplish that in a fully compatible manner on CentOS 6?
generally speaking, if my webserver dies for whatever reason I want it to get restarted *now*, and to look for the cause while the services are up and running
Then I hope I'm never a customer of a service whose operators don't know or care why it is failing. I consider it a much better approach to let your load balancing shift the connections to predictably working servers.
generally speaking: there is more in this world than just mysqld
generally speaking, if I restart a server I want the SSH tunnels to it on other machines to get restarted automatically, see below
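Something along these lines (a minimal sketch; host, port, and user are placeholders, and it assumes key-based ssh auth):

  [Unit]
  Description=SSH tunnel to db1.example.com (placeholder host)
  After=network.target

  [Service]
  # -N: no remote command, just the port forward; die if the forward fails
  ExecStart=/usr/bin/ssh -N -o ExitOnForwardFailure=yes -o ServerAliveInterval=30 -L 3306:localhost:3306 tunnel@db1.example.com
  Restart=always
  RestartSec=10

  [Install]
  WantedBy=multi-user.target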
Seems awkward, compared to openvpn.
generally speaking, if the OpenVPN service at a location some hundred kilometers away fails because of the poor internet connection there, I want it to be restarted
You don't have to restart openvpn to have it reconnect itself after network outages.
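A few lines in the client config handle that by themselves (a sketch; the intervals are arbitrary):

  # ping every 10 seconds; soft-restart the connection after 60 seconds of silence
  keepalive 10 60
  # keep the keys and the tun device across those soft restarts
  persist-key
  persist-tun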
On 07/08/2014 04:19 PM, Les Mikesell wrote:
On Tue, Jul 8, 2014 at 2:55 PM, Reindl Harald h.reindl@thelounge.net wrote:
Unless you are offering to do that for me, for free, on all my systems, having to do it certainly does take something away.
then just don't upgrade to RHEL7 so what
I expect our systems to still have services running past 2020.
Ah, we will be at Centos 11 by then :)
Systemd will be a thing of the past and we will be dealing with systemq.
Robert Moskowitz wrote:
On 07/08/2014 04:19 PM, Les Mikesell wrote:
On Tue, Jul 8, 2014 at 2:55 PM, Reindl Harald h.reindl@thelounge.net wrote:
Unless you are offering to do that for me, for free, on all my systems, having to do it certainly does take something away.
then just don't upgrade to RHEL7 so what
I expect our systems to still have services running past 2020.
Ah, we will be at Centos 11 by then :)
Systemd will be a thing of the past and we will be dealing with systemq.
Actually, I preferred systemm
m(ark)....
systemv? Too soon?
On July 8, 2014 4:07:43 PM CDT, m.roth@5-cent.us wrote:
Robert Moskowitz wrote:
On 07/08/2014 04:19 PM, Les Mikesell wrote:
On Tue, Jul 8, 2014 at 2:55 PM, Reindl Harald wrote:
Unless you are offering to do that for me, for free, on all my systems, having to do it certainly does take something away.
then just don't upgrade to RHEL7 so what
I expect our systems to still have services running past 2020.
Ah, we will be at Centos 11 by then :)
Systemd will be a thing of the past and we will be dealing with systemq.
Actually, I preferred systemm
m(ark)....
On Jul 8, 2014, at 2:03 PM, Robert Moskowitz rgm@htt-consult.com wrote:
Ah, we will be at Centos 11 by then :)
Systemd will be a thing of the past and we will be dealing with systemq.
It'll be named "kitchensink", there will be only one process in the process table, and every bit of computation will be handled using kernel threads. All services will have been moved into the kernel for "speed", and exceptions will be handled by everything being virtualized - when the kernel crashes, the guest will just kill itself and respawn.
I really wish I was joking or being facetious. I'm not. This is pretty much the logical end result of the abomination that's systemd, and the appallingly stupid idea of putting dbus into the kernel. There's a reason for privilege and process separation, and people seem to have forgotten it.
More facetiously, Poettering will have rejoined a BSD project after effectively having killed off Linux for any production use, and laughing all the way to the bank. :)
--Russell
On Tue, 2014-07-08 at 17:44 -0700, Russell Miller wrote:
I really wish I was joking or being facetious. I'm not. This is pretty much the logical end result of the abomination that's systemd, and the appallingly stupid idea of putting dbus into the kernel. There's a reason for privilege and process separation, and people seem to have forgotten it.
More facetiously, Poettering will have rejoined a BSD project after effectively having killed off Linux for any production use, and laughing all the way to the bank. :)
That is a fundamental worry. Everything, except the kernel, dependent on Poettering's (employed by Red Hat) windows-style gigantic systemd. Nothing can run without systemd's prior consent. One tiny bug in systemd and everything crashes. Is that RH's new "resilience" strategy?
Have I really got this wrong?
Remember the old-fashioned sayings?
*** Keep it simple stupid (KISS)
*** If it ain't broke, don't fix it (= If it is not broken, do not attempt to repair it)
M$-style script kiddies are improving Linux?
Poettering-Kraft? Nein danke. ("Poettering power? No thanks.")
On Jul 8, 2014, at 6:27 PM, Always Learning centos@u62.u22.net wrote:
That is a fundamental worry. Everything, except the kernel, dependent on Poettering's (employed by Red Hat) windows-style gigantic systemd. Nothing can run without systemd's prior consent. One tiny bug in systemd and everything crashes. Is that RH's new "resilience" strategy?
I've been not so subtly hinting that I think this kind of thing could and will destroy Linux - at least in any context other than rolling your own distribution and hoping for the best. (At least until they start putting dbus into the kernel - and I cannot say strongly enough how utterly DUMB that is.)
I don't mean that it will make it go away and people will stop using it and all that. It'll be going strong in one form or another for decades. What I mean is that the people who don't know what they're doing will be the only ones actually using it, and those who know better will have long since run off for greener pastures. And then the large distros will start catering to those inexperienced people, and mark my words, we're going to end up with a catastrophe sooner rather than later. Someone's going to put the wrong thing into the kernel, open up a security hole, and it'll be heartbleed all over again. And no one will learn the lesson.
We were worried years ago about Microsoft embracing and extending. That turned out to be the wrong worry. Looks like the right worry was people making stupid decisions and killing it from the inside. Congratulations, Red Hat and Poettering - you did (or are doing) what Microsoft couldn't.
I've been toying with the idea of rolling a distribution similar to OpenBSD - with a focus on security and doing the right thing - no matter what other stupid crap other people are doing. The problem is that rolling a distro is a lot of work.
But... it may need to be done. The inmates are running the asylum.
--Russell
On 07/08/2014 09:27 PM, Always Learning wrote:
Everything, except the kernel, dependent on Poettering's (employed by Red Hat) windows-style gigantic systemd. Nothing can run without systemd's prior consent. One tiny bug in systemd and everything crashes.
How is this any different from any other init? Init is the boss, regardless of which flavor of init, full stop.
SysV init has many, many problems. The worst is that it only does start-and-forget and stop-and-forget, with relatively fragile shell scripts running as root doing the grunt work. A resilient init should be a bit more hands-on about making sure its children continue to live... (yuck; you can tell I'm a parent (of five)!). Or, in Bill Cosby's words as Cliff Huxtable to Theo, "I brought you into this world, and I can take you out!" But an init that takes a bit more care with its offspring, making sure they stay alive until such time as they are needed to die (yuck again!), is a vast improvement over 'start it and forget it.'
And, of course, CentOS 6 doesn't use straight SysV init anyway; it uses Upstart, which has been around for quite a while.
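Even classic SysV init could supervise the handful of things listed in /etc/inittab; a supervising init just extends that respawn idea to every service. A sketch of the two notations (the encoder path is made up):

  # /etc/inittab (SysV): respawn the encoder whenever it dies
  enc:2345:respawn:/usr/local/bin/encoder

  # the same idea in a systemd unit's [Service] section, per service
  Restart=on-failure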
Incidentally, I'm old enough to remember the recursive acronym MUNG and hereby apply that acronym to this thread......
I'm also familiar with feeping creaturism.....
On Wed, Jul 9, 2014 at 12:21 PM, Lamar Owen lowen@pari.edu wrote:
But an init that takes a bit more care to its offspring, making sure they stay alive until such time as they are needed to die (yuck again!) is a vast improvement over 'start it and forget it.'
So your solution to the problems that happen in complex daemon software is to use even more complex software as a manager for all of them??? Remind me why (a) you think that will be perfect, and (b) why you think an unpredictable daemon should be resurrected to continue its unpredictable behavior.
On 07/09/2014 01:38 PM, Les Mikesell wrote:
Remind me why
Sure.
(a) you think that will be perfect,
Nothing is ever perfect, and I didn't use that word. I think it will be, after some bug-wrangling, an improvement for many use cases, but not all.
and (b) why you think an unpredictable daemon should be resurrected to continue its unpredictable behavior.
I have had services that would reliably crash under certain reproducible and consistent circumstances that were relatively harmless otherwise. Restarting the process when certain conditions were met was the vendor's documented solution.
One of those processes was a live audio stream encoder program; occasionally the input sound card would hiccup and the encoder would crash. Restarting the encoder process was both harmless and necessary. While the root cause was eventually found years later (driver problems), in the meantime the process restart was the correct method.
There are other init packages that do the same thing; it's a feature that many want.
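In systemd terms it's a per-service opt-in with rate limiting, something like this (a sketch; the path and the limits are made up, and on the EL7-era systemd the StartLimit* settings live in [Service]):

  [Service]
  # hypothetical encoder path
  ExecStart=/usr/local/bin/encoder
  # restart on a crash or non-zero exit, but not on a clean stop
  Restart=on-failure
  RestartSec=5
  # more than 5 starts within 300 seconds marks the unit as failed
  # instead of looping forever
  StartLimitInterval=300
  StartLimitBurst=5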
Lamar Owen wrote:
On 07/09/2014 01:38 PM, Les Mikesell wrote:
<snip>
and (b) why you think an unpredictable daemon should be resurrected to continue its unpredictable behavior.
I have had services that would reliably crash under certain reproducible and consistent circumstances that were relatively harmless otherwise. Restarting the process when certain conditions were met was the vendor's documented solution.
One of those processes was a live audio stream encoder program; occasionally the input sound card would hiccup and the encoder would crash. Restarting the encoder process was both harmless and necessary. While the root cause was eventually found years later (driver problems), in the meantime the process restart was the correct method.
<snip> On the other hand, restarting can be the *wrong* answer for some things. For example, a bunch of our sites use SiteMinder from CA*. I do *not* restart httpd; I stop it, and wait half a minute or so to make sure sitenanny has shut down correctly and completely, closed all of its sockets, and released all of its IPC semaphores and shared memory segments, and *then* start it up. Otherwise, no happiness.
mark
* And CA appears to have never heard of selinux, and isn't that great with linux in general....
On Wed, Jul 9, 2014 at 2:00 PM, m.roth@5-cent.us wrote:
Lamar Owen wrote:
On 07/09/2014 01:38 PM, Les Mikesell wrote:
<snip>
On the other hand, restarting can be the *wrong* answer for some things. For example, a bunch of our sites use SiteMinder from CA*. I do *not* restart httpd; I stop it, and wait half a minute or so to make sure sitenanny has shut down correctly and completely, closed all of its sockets, and released all of its IPC semaphores and shared memory segments, and *then* start it up. Otherwise, no happiness.
mark
* And CA appears to have never heard of selinux, and isn't that great with linux in general....
Automatically restarting services is always a bad idea, especially when they are customer-facing services. There is nothing worse than a problem that hides behind an automatic restart, especially while it's corrupting data: the service happily starts right back up after dying in the middle of a transaction, and potentially creates new transactions that will also be terminated when the app crashes again (and it most often will).
The least important aspect of a service dying is the state of the service itself; the most important is what has happened to the data when it abended. Restarting the service automatically after failure is a recipe for disaster.
On 9.7.2014 22:00, m.roth@5-cent.us wrote:
Lamar Owen wrote:
On 07/09/2014 01:38 PM, Les Mikesell wrote:
<snip>
On the other hand, restarting can be the *wrong* answer for some things. For example, a bunch of our sites use SiteMinder from CA*. I do *not* restart httpd; I stop it, and wait half a minute or so to make sure sitenanny has shut down correctly and completely, closed all of its sockets, and released all of its IPC semaphores and shared memory segments, and *then* start it up. Otherwise, no happiness.
mark
* And CA appears to have never heard of selinux, and isn't that great with linux in general....
My limited understanding is that you are actually describing a problem to which systemd should be the answer. It should take care of these things for you. Right now you wait a minute or two, which is the wrong way of doing it; the right way would be a script that actually checks that none of the stuff is left around. It's the same kind of hack solution as restarting a dying service. Sometimes hack solutions are needed and sometimes not.
In my, again, limited experience with systemd, running Fedora as a "hobby" server, I have gathered that you can decide on a case-by-case basis whether the process should be restarted or not.
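For example, a drop-in file can override just that one setting for a packaged unit (a sketch; the file names are made up):

  # /etc/systemd/system/httpd.service.d/norestart.conf
  [Service]
  Restart=no

  # ...or, for a service you do want resurrected:
  # /etc/systemd/system/encoder.service.d/restart.conf
  [Service]
  Restart=on-failure

followed by a systemctl daemon-reload to pick the change up.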
-vpk
On 9.7.2014 22:00, m.roth@5-cent.us wrote:
Lamar Owen wrote:
On 07/09/2014 01:38 PM, Les Mikesell wrote:
<snip> On the other hand, restarting can be the *wrong* answer for some things. For example, a bunch of our sites use SiteMinder from CA*. I do *not* restart httpd; I stop it, and wait half a minute or so to make sure sitenanny has shut down correctly and completely, closed all of its sockets, and released all of its IPC semaphores and shared memory segments, and *then* start it up. Otherwise, no happiness.
* And CA appears to have never heard of selinux, and isn't that great with linux in general....
My limited understanding is that you are actually describing a problem to which systemd should be the answer. It should take care of these things for you. Right now you wait a minute or two, which is the wrong way of doing it; the right way would be a script that actually checks that none of the stuff is left around. It's the same kind of hack solution as restarting a dying service. Sometimes hack solutions are needed and sometimes not.
No, the *correct* answer I cannot begin to push, since I don't have an account with CA, and so can't file a bug against *THEIR* commercial $$$ crap code, and the one time I tried to push the team who actually owns it, they sort of mentioned it to CA (maybe, or maybe they were just lying to me), and it got blown off.
And no, not when we have this many servers, and my job depends on doing it correctly.
mark
On 9.7.2014 22:46, m.roth@5-cent.us wrote:
On 9.7.2014 22:00, m.roth@5-cent.us wrote:
Lamar Owen wrote:
On 07/09/2014 01:38 PM, Les Mikesell wrote:
<snip> On the other hand, restarting can be the *wrong* answer for some things. For example, a bunch of our sites use SiteMinder from CA*. I do *not* restart httpd; I stop it, and wait half a minute or so to make sure sitenanny has shut down correctly and completely, closed all of its sockets, and released all of its IPC semaphores and shared memory segments, and *then* start it up. Otherwise, no happiness.
* And CA appears to have never heard of selinux, and isn't that great with linux in general....
My limited understanding is that you are actually describing a problem to which systemd should be the answer. It should take care of these things for you. Right now you wait a minute or two, which is the wrong way of doing it; the right way would be a script that actually checks that none of the stuff is left around. It's the same kind of hack solution as restarting a dying service. Sometimes hack solutions are needed and sometimes not.
No, the *correct* answer I cannot begin to push, since I don't have an account with CA, and so can't file a bug against *THEIR* commercial $$$ crap code, and the one time I tried to push the team who actually owns it, they sort of mentioned it to CA (maybe, or maybe they were just lying to me), and it got blown off.
And no, not when we have this many servers, and my job depends on doing it correctly.
So you actually go trough everytime to make sure that all the things are properly closed and shut down, instead of just waiting a few minutes? Sometimes something could go wrong, and waiting a few minutes isn't enough. I would prefer the software to do it for me. Even more, I'd prefer someone else to write it, so that I can do all the other things I need to do and not bill the customer for the busywork of reinventing the wheel. It's a pipe dream that broken software will get fixed, so I am glad of any solution that helps deal with it.
-vpk
On 9.7.2014 22:46, m.roth@5-cent.us wrote:
On 9.7.2014 22:00, m.roth@5-cent.us wrote:
Lamar Owen wrote:
On 07/09/2014 01:38 PM, Les Mikesell wrote:
<snip>
On the other hand, restarting can be the *wrong* answer for some things. For example, a bunch of our sites use SiteMinder from CA*. I do *not* restart httpd; I stop it, and wait half a minute or so to make sure sitenanny has shut down correctly and completely, closed all of its sockets, and released all of its IPC semaphores and shared memory segments, and *then* start it up. Otherwise, no happiness.
* And CA appears to have never heard of selinux, and isn't that great with linux in general....
<snip>
No, the *correct* answer I cannot begin to push, since I don't have an account with CA, and so can't file a bug against *THEIR* commercial $$$ crap code, and the one time I tried to push the team who actually owns it, they sort of mentioned it to CA (maybe, or maybe they were just lying to me), and it got blown off.
And no, not when we have this many servers, and my job depends on doing it correctly.
So you actually go trough everytime to make sure that all the things are
Trough? I don't understand. I do a service httpd stop, and then a ps -ef to grep for siteminder still running, and then start it again. If there are problems getting into the website, I shut it down again, then check using ipcs, and use ipcrm to manually get rid of their crap, then service httpd start.
properly closed and shut down, instead of just waiting a few minutes?
Waiting a few minutes is not appreciated in either a real production or a development environment. <snip>
mark
On 9.7.2014 23:07, m.roth@5-cent.us wrote:
On 9.7.2014 22:46, m.roth@5-cent.us wrote:
On 9.7.2014 22:00, m.roth@5-cent.us wrote:
Lamar Owen wrote:
On 07/09/2014 01:38 PM, Les Mikesell wrote:
<snip>
On the other hand, restarting can be the *wrong* answer for some things. For example, a bunch of our sites use SiteMinder from CA*. I do *not* restart httpd; I stop it, and wait half a minute or so to make sure sitenanny has shut down correctly and completely, closed all of its sockets, and released all of its IPC semaphores and shared memory segments, and *then* start it up. Otherwise, no happiness.
* And CA appears to have never heard of selinux, and isn't that great with linux in general....
<snip>
No, the *correct* answer I cannot begin to push, since I don't have an account with CA, and so can't file a bug against *THEIR* commercial $$$ crap code, and the one time I tried to push the team who actually owns it, they sort of mentioned it to CA (maybe, or maybe they were just lying to me), and it got blown off.
And no, not when we have this many servers, and my job depends on doing it correctly.
So you actually go trough everytime to make sure that all the things are
Trough? I don't understand. I do a service httpd stop, and then a ps -ef to grep for siteminder still running, and then start it again. If there are problems getting into the website, I shut it down again, then check using ipcs, and use ipcrm to manually get rid of their crap, then service httpd start.
"Go trough everyting" meant all of the checking you just described. (Translates more or less directly like that from my native language.) My point is that I would have made script to deal with that. Not necessary automatic. And with systemd it should automatically check for any children left behind httpd automatically. So no need for the script and not really need for the other checking provided things work. Of course if things don't work then it's reason to complain.
properly closed and shut down, instead of just waiting a few minutes?
Waiting a few minutes is not appreciated in either a real production or a development environment.
So waiting isn't appreciated, except for the waiting while you log in (dragging yourself) to the server and do the things described above? I would prefer it to be automated, with a message to me about what happened. (In this instance, that is; it's not a solution to everything.)
-vpk
On 07/09/2014 03:00 PM, m.roth@5-cent.us wrote:
Lamar Owen wrote:
On 07/09/2014 01:38 PM, Les Mikesell wrote:
<snip>
On the other hand, restarting can be the *wrong* answer for some things. For example, a bunch of our sites use SiteMinder from CA*. I do *not* restart httpd; I stop it, and wait half a minute or so to make sure sitenanny has shut down correctly and completely, closed all of its sockets, and released all of its IPC semaphores and shared memory segments, and *then* start it up. Otherwise, no happiness.
mark
* And CA appears to have never heard of selinux, and isn't that great with linux in general....
So develop your own. I have some scripts around here somewhere that I have used in the past to help me make policies for things that tell you to turn off selinux (I think I did it for roundcube).
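The usual recipe (a sketch; the module name is made up) is to harvest the AVC denials from the audit log and build a local policy module:

  # turn the denials the app generated into a loadable module
  grep denied /var/log/audit/audit.log | audit2allow -M myroundcube
  # review myroundcube.te before loading it!
  semodule -i myroundcube.pp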
On Wed, Jul 9, 2014 at 2:56 PM, Lamar Owen lowen@pari.edu wrote:
On 07/09/2014 01:38 PM, Les Mikesell wrote:
Remind me why
Sure.
(a) you think that will be perfect,
Nothing is ever perfect, and I didn't use that word. I think it will be, after some bug-wrangling, an improvement for many use cases, but not all.
and (b) why you think an unpredictable daemon should be resurrected to continue its unpredictable behavior.
I have had services that would reliably crash under certain reproducible and consistent circumstances that were relatively harmless otherwise. Restarting the process when certain conditions were met was the vendor's documented solution.
One of those processes was a live audio stream encoder program; occasionally the input sound card would hiccup and the encoder would crash. Restarting the encoder process was both harmless and necessary. While the root cause was eventually found years later (driver problems), in the meantime the process restart was the correct method.
There are other init packages that do the same thing; it's a feature that many want.
Since I missed most of the story, can you specify that it is ok for this program to restart whenever it crashes, but that this one should stop being restarted after N crashes (N > 0) and then report?
On 07/09/2014 03:03 PM, Mauricio Tavares wrote:
Since I missed most of the story, can you specify that it is ok for this program to restart whenever it crashes, but that this one should stop being restarted after N crashes (N > 0) and then report?
While I am certainly not an expert with systemd, it appears that you have an even more generic mechanism than that in the OnFailure= directive in the unit files. So you can do basically any sort of thing on a unit failure, including restarting it, starting a different unit, or whatever.
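A sketch of how the two could combine (unit names and paths are made up, and the reporting bit assumes mailx is installed): let the service restart on failure, cap the retries, and fire a reporting unit when it finally gives up:

  # /etc/systemd/system/flaky.service
  [Unit]
  Description=some flaky daemon
  # run the reporting unit when this one enters the failed state
  OnFailure=failure-report@%n.service

  [Service]
  ExecStart=/usr/local/bin/flakyd
  Restart=on-failure
  # stop retrying (and so trigger OnFailure=) after 3 crashes in 5 minutes
  StartLimitInterval=300
  StartLimitBurst=3

  # /etc/systemd/system/failure-report@.service
  [Unit]
  Description=report failure of %i

  [Service]
  Type=oneshot
  ExecStart=/bin/sh -c '/bin/systemctl status %i | /bin/mail -s "%i failed" root'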