[CentOS-virt] HCI Cluster - CentOS8 to Streams Upgrade Broken

Sandro Bonazzola

sbonazzo at redhat.com
Mon Jan 18 09:39:23 UTC 2021


On Thu, Jan 7, 2021 at 15:58, Jeremey Wise <jeremey.wise at gmail.com>
wrote:

>
> I have a test environment.  Three node HCI cluster.  CentOS8 build.
> Gluster as file system with standard cockpit deploy of HCI.
>

Hi, I would recommend reaching out on the users at ovirt.org mailing list for
oVirt-related issues.



>
>
> Converted to CentOS Stream, which seemed to go fine.  Did a yum update with
> no issues.
>
> Did a reboot.. and now engine will no longer start. So I can no longer
> start my Virtual machines.  I posted as bug
> https://bugzilla.redhat.com/show_bug.cgi?id=1911910   I posted to CentOS
> forum https://forums.centos.org/viewtopic.php?f=54&t=76716   but no
> responses.
>
> Can anyone provide a means or next step to root-cause and/or fix this?
>

We pushed a fix, which is in the current nightly build, but please keep using
CentOS Linux for oVirt until full compatibility with CentOS Stream is
officially announced.
Up to and including oVirt 4.4.4, CentOS Stream support is a tech preview.


> It would take me days to rebuild the entire solution and I really hate to
> "reload" as a fix.. but after two weeks... nothing changing... just have to
> find some means to get cluster back working.
>
> Thanks.
>
> ##### all three servers are looping below events in /var/log/messages ###
> Jan  7 09:48:03 thor journal[2375050]: ovirt-ha-broker
> ovirt_hosted_engine_ha.broker.broker.Broker ERROR Failed initializing the
> broker: [Errno 107] Transport endpoint is not connected:
> '/rhev/data-center/mnt/glusterSD/thorst.penguinpages.local:_engine/3afc47ba-afb9-413f-8de5-8d9a2f45ecde/ha_agent/hosted-engine.metadata'
> Jan  7 09:48:03 thor journal[2375050]: ovirt-ha-broker
> ovirt_hosted_engine_ha.broker.broker.Broker ERROR Traceback (most recent
> call last):#012  File
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/broker.py",
> line 64, in run#012    self._storage_broker_instance =
> self._get_storage_broker()#012  File
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/broker.py",
> line 143, in _get_storage_broker#012    return
> storage_broker.StorageBroker()#012  File
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/broker/storage_broker.py",
> line 97, in __init__#012    self._backend.connect()#012  File
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py",
> line 408, in connect#012    self._check_symlinks(self._storage_path,
> volume.path, service_link)#012  File
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/storage_backends.py",
> line 105, in _check_symlinks#012    os.unlink(service_link)#012OSError:
> [Errno 107] Transport endpoint is not connected:
> '/rhev/data-center/mnt/glusterSD/thorst.penguinpages.local:_engine/3afc47ba-afb9-413f-8de5-8d9a2f45ecde/ha_agent/hosted-engine.metadata'
> Jan  7 09:48:03 thor journal[2375050]: ovirt-ha-broker
> ovirt_hosted_engine_ha.broker.broker.Broker ERROR Trying to restart the
> broker
> Jan  7 09:48:03 thor platform-python[2375050]: detected unhandled Python
> exception in '/usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker'
> Jan  7 09:48:03 thor abrt-server[2375084]: Not saving repeating crash in
> '/usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker'
> Jan  7 09:48:03 thor systemd[1]: ovirt-ha-broker.service: Main process
> exited, code=exited, status=1/FAILURE
> Jan  7 09:48:03 thor systemd[1]: ovirt-ha-broker.service: Failed with
> result 'exit-code'.
> Jan  7 09:48:04 thor systemd[1]: ovirt-ha-broker.service: Service
> RestartSec=100ms expired, scheduling restart.
> Jan  7 09:48:04 thor systemd[1]: ovirt-ha-broker.service: Scheduled
> restart job, restart counter is at 44569.
> Jan  7 09:48:04 thor systemd[1]: Stopped oVirt Hosted Engine High
> Availability Communications Broker.
> Jan  7 09:48:04 thor systemd[1]: Started oVirt Hosted Engine High
> Availability Communications Broker.
> Jan  7 09:48:06 thor systemd[1]: ovirt-ha-agent.service: Service
> RestartSec=10s expired, scheduling restart.
> Jan  7 09:48:06 thor systemd[1]: ovirt-ha-agent.service: Scheduled restart
> job, restart counter is at 22270.
> Jan  7 09:48:06 thor systemd[1]: Stopped oVirt Hosted Engine High
> Availability Monitoring Agent.
> Jan  7 09:48:06 thor systemd[1]: Started oVirt Hosted Engine High
> Availability Monitoring Agent.
> Jan  7 09:48:06 thor systemd[1]: Started Session c44598 of user root.
> Jan  7 09:48:06 thor systemd[1]: session-c44598.scope: Succeeded.
> Jan  7 09:48:06 thor journal[2375091]: ovirt-ha-agent
> ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine ERROR Failed to
> start necessary monitors
> Jan  7 09:48:06 thor journal[2375091]: ovirt-ha-agent
> ovirt_hosted_engine_ha.agent.agent.Agent ERROR Traceback (most recent call
> last):#012  File
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py",
> line 85, in start_monitor#012    response = self._proxy.start_monitor(type,
> options)#012  File "/usr/lib64/python3.6/xmlrpc/client.py", line 1112, in
> __call__#012    return self.__send(self.__name, args)#012  File
> "/usr/lib64/python3.6/xmlrpc/client.py", line 1452, in __request#012
>  verbose=self.__verbose#012  File "/usr/lib64/python3.6/xmlrpc/client.py",
> line 1154, in request#012    return self.single_request(host, handler,
> request_body, verbose)#012  File "/usr/lib64/python3.6/xmlrpc/client.py",
> line 1166, in single_request#012    http_conn = self.send_request(host,
> handler, request_body, verbose)#012  File
> "/usr/lib64/python3.6/xmlrpc/client.py", line 1279, in send_request#012
>  self.send_content(connection, request_body)#012  File
> "/usr/lib64/python3.6/xmlrpc/client.py", line 1309, in send_content#012
>  connection.endheaders(request_body)#012  File
> "/usr/lib64/python3.6/http/client.py", line 1264, in endheaders#012
>  self._send_output(message_body, encode_chunked=encode_chunked)#012  File
> "/usr/lib64/python3.6/http/client.py", line 1040, in _send_output#012
>  self.send(msg)#012  File "/usr/lib64/python3.6/http/client.py", line 978,
> in send#012    self.connect()#012  File
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/unixrpc.py",
> line 74, in connect#012
>  self.sock.connect(base64.b16decode(self.host))#012FileNotFoundError:
> [Errno 2] No such file or directory#012#012During handling of the above
> exception, another exception occurred:#012#012Traceback (most recent call
> last):#012  File
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
> line 131, in _run_agent#012    return action(he)#012  File
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/agent/agent.py",
> line 55, in action_proper#012    return he.start_monitoring()#012  File
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
> line 437, in start_monitoring#012    self._initialize_broker()#012  File
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py",
> line 561, in _initialize_broker#012    m.get('options', {}))#012  File
> "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py",
> line 91, in start_monitor#012    ).format(t=type, o=options,
> e=e)#012ovirt_hosted_engine_ha.lib.exceptions.RequestError: brokerlink -
> failed to start monitor via ovirt-ha-broker: [Errno 2] No such file or
> directory, [monitor: 'network', options: {'addr': '172.16.100.1',
> 'network_test': 'dns', 'tcp_t_address': '', 'tcp_t_port': ''}]
> Jan  7 09:48:06 thor journal[2375091]: ovirt-ha-agent
> ovirt_hosted_engine_ha.agent.agent.Agent ERROR Trying to restart agent
> Jan  7 09:48:06 thor systemd[1]: ovirt-ha-agent.service: Main process
> exited, code=exited, status=157/n/a
> Jan  7 09:48:06 thor systemd[1]: ovirt-ha-agent.service: Failed with
> result 'exit-code'.
> ####
>
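
The tracebacks in the quoted log are hard to read because each one is folded onto a single syslog line. A minimal sketch for unfolding them, assuming the `#012` sequences are rsyslog's octal escapes for control characters (`#012` being a newline); the function name and the shortened sample line are illustrative only:

```python
import re

# rsyslog escapes control characters as '#' followed by three octal digits.
# Only the control range 000-037 is matched, so text such as "bug #107"
# is left alone.
_CONTROL_ESCAPE = re.compile(r"#(0[0-3][0-7])")


def decode_syslog_escapes(text: str) -> str:
    """Turn escapes such as '#012' back into the characters they stand for."""
    return _CONTROL_ESCAPE.sub(lambda m: chr(int(m.group(1), 8)), text)


# Shortened, hypothetical sample in the same shape as the log above.
sample = (
    "ERROR Traceback (most recent call last):#012"
    '  File "broker.py", line 64, in run#012'
    "OSError: [Errno 107] Transport endpoint is not connected"
)

if __name__ == "__main__":
    print(decode_syslog_escapes(sample))
```

Note that the mail client also re-wrapped the log lines, so the `> ` quoting and the line breaks have to be removed before decoding a full entry.
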
> Is there a means to start a VM... when oVirt / engine is offline?
>
>
> --
> penguinpages
>
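
The broker error in the quoted log is `[Errno 107] Transport endpoint is not connected`, raised on a path under the Gluster mount. A rough sketch for checking on each host whether a path answers at all or returns that same error; the function name is hypothetical, and the mount point in the example is an assumption taken from the path prefix in the log:

```python
import errno
import os


def probe_mount(path: str) -> str:
    """Return 'ok', 'disconnected', 'missing' or 'error:<NAME>' for path."""
    try:
        os.stat(path)
    except OSError as exc:
        if exc.errno == errno.ENOTCONN:
            # Same condition as "[Errno 107] Transport endpoint is not
            # connected" in the broker traceback.
            return "disconnected"
        if exc.errno == errno.ENOENT:
            return "missing"
        return "error:" + errno.errorcode.get(exc.errno, str(exc.errno))
    return "ok"


if __name__ == "__main__":
    # Assumed mount point, derived from the path shown in the log.
    mount = "/rhev/data-center/mnt/glusterSD/thorst.penguinpages.local:_engine"
    print(mount, "->", probe_mount(mount))
```

A result of `disconnected` would only confirm what the broker reports; it does not say why the mount dropped.
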


-- 

Sandro Bonazzola

MANAGER, SOFTWARE ENGINEERING, EMEA R&D RHV
