[CentOS] tigervnc-server-module crashes after EL 6.3 update

Fri Aug 24 13:53:18 UTC 2012
Cal Webster <cwebster at ec.rr.com>

On Fri, 2012-08-17 at 19:46 -0500, Johnny Hughes wrote:
> On 08/17/2012 01:40 PM, Cal Webster wrote:
> > On Thu, 2012-08-16 at 17:01 -0500, Johnny Hughes wrote:
> >> On 08/16/2012 11:43 AM, Cal Webster wrote:
> >>> On Wed, 2012-08-15 at 13:56 -0500, Johnny Hughes wrote:
> >>>> On 08/15/2012 09:47 AM, Cal Webster wrote:
> >>>>> On Tue, 2012-08-14 at 20:55 -0500, Johnny Hughes wrote:
> >>>>>> On 08/14/2012 05:23 PM, Cal Webster wrote:
> >>>>>>> We began experiencing failed vnc connections to the console display on
> >>>>>>> servers that have been updated to EL 6.3. No such failures have occurred
> >>>>>>> on similar connections to EL 6.2 servers.
> >>>>>>>
> >>>>>>> On the client machine a normal vncviewer display appears with the
> >>>>>>> expected graphical login until the mouse pointer is moved within the
> >>>>>>> boundaries of the vncviewer window. At this point the window closes and
> >>>>>>> an error message appears in both a pop-up window and in the terminal
> >>>>>>> window in which the session was initiated stating "read: Connection
> >>>>>>> reset by peer (104)".
> >>>>>>>
> > >>>>>>> On the server end, a core dump is generated and an abrt bug report is
> >>>>>>> created.
> >>>>>>>
> >>>>>>> /var/log/messages
> >>>>>>> ----------------------------------------------
> >>>>>>> Aug 14 11:00:30 jato2 abrt[11411]: File '/usr/bin/Xorg' seems to be
> >>>>>>> deleted
> >>>>>>> Aug 14 11:00:30 jato2 abrt[11411]: Saved core dump of pid 7892
> >>>>>>> (/usr/bin/Xorg) to /var/spool/abrt/ccpp-2012-08-14-11:00:30-7892
> >>>>>>> (42041344 bytes)
> >>>>>>> Aug 14 11:00:30 jato2 abrtd: Directory 'ccpp-2012-08-14-11:00:30-7892'
> >>>>>>> creation detected
> >>>>>>> ----------------------------------------------
> >>>>>>>
> >>>>>>> This bug has been reported in the CentOS bug tracker here:
> >>>>>>>
> >>>>>>> 0005824: tigervnc-server-module keep crashing
> >>>>>>> http://bugs.centos.org/view.php?id=5824
> >>>>>>>
> >>>>>>> However, this appears to be a bug upstream. The source RPM provided with
> >>>>>>> CentOS is identical to that of upstream with no modifications. Also,
> >>>>>>> there is an upstream bug reported that appears to have the same
> >>>>>>> symptoms. I have added a comment to the upstream bug report (listed
> >>>>>>> below) if anyone wishes to see the details.
> >>>>>>>
> >>>>>>> tigervnc-server-module crashes with dual screen setup
> >>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=820443
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> We have verified that rebuilding the unmodified source RPM for tigervnc
> >>>>>>> produces a tigervnc-server-module RPM that does not suffer from this
> >>>>>>> bug.
> >>>>>>>
> >>>>>>> Removing the original tigervnc-server-module package and replacing it
> >>>>>>> with the rebuilt one fixes the problem.
> >>>>>>>
> >>>>>>> I've duplicated the problem on 2 EL 6.3 x86_64 single-head display
> >>>>>>> machines and have verified the fix.
> >>>>>>>
> >>>>>>> Tomorrow, I'll duplicate the problem on a dual-head x86_64 machine that
> > >>>>>>> currently still works after updating to EL 6.2, then confirm the fix.
> >>>>>>>
> >>>>>> Did you rebuild the SRPM using mock or directly on a physical machine
> >>>>>> with rpmbuild?
> >>>>> No mock, just a simple "rpmbuild -ba SPEC/tigervnc.spec"
> >>>> OK, if you find that this solves your problems for sure, I will build
> >>>> the SRPM outside of mock and see if it is different.
> >>> I've confirmed the same faulty behavior for the update to 6.3 on our
> >>> dual-head systems.
> >>>
> >>> Also confirmed is that replacing the 6.3 base tigervnc-server-module rpm
> >>> with the rebuilt one does fix the problem on the dual-head systems.
> >>>
> >>> One disturbing difference between single and dual headed systems is that
> >>> on the dual-head systems Xorg generates a core dump and completely
> >>> freezes up when the mouse movement is detected. Single-head systems just
> >>> fail to connect. This complication could be somehow caused by our
> >>> proprietary "ATI FirePro 2270" drivers, though. Once the rebuilt module
> >>> is installed the systems run fine.
> >>>
> >>> I've also updated the upstream bug report.
> >> Can you see if either or both of these work for you:
> >>
> >> http://people.centos.org/hughesjr/tigervnc/
> >>
> >> One set was built inside of mock, the other outside of mock in a virtual
> >> machine with only the build requirements of the SRPM installed.
> > Both builds work without problems on single and dual-head systems here.
> > As with all the other tests, I only replaced the tigervnc-server-module
> > package on each host.
> >
> > I've also confirmed that i686 platforms suffer from the same bug. These
> > too, however, are easily remedied by replacing the base
> > tigervnc-server-module RPM with a locally rebuilt one.
> 
> Would you also test that these work:
> 
> http://people.centos.org/hughesjr/tigervnc/
> 
> (same link, newer files :D)
> 
> NOTE:  It is CentOS policy that we do not correct upstream bugs in our
> distributions directly ... therefore these will not be released into the
> main distro until upstream releases an update.  I know that is a PITA
> for people, however, it is our policy and we can't break it.

Got them and tested on 3 different machines, 2 x86_64 (one single and
one dual-head) and one i386. All work well. This time I updated all
tigervnc packages, not just tigervnc-server-module.

I added all the packages to our internal "localcentos" repo so all our
systems will pick up the update, thanks to your incremental version
number change. I use the localcentos repo for one-off rpmbuilds and
things like this that don't come through normal channels.
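
For anyone who wants a similar setup, a repo definition along these
lines should be enough on the client side. This is only a sketch: the
file name, the baseurl host and the directory layout are placeholders
I made up for illustration, and gpgcheck=0 assumes the rebuilt
packages are unsigned.

```ini
# /etc/yum.repos.d/localcentos.repo  (file name and URL are assumptions)
[localcentos]
name=Local CentOS $releasever - $basearch (one-off rebuilds)
baseurl=http://repo.example.com/localcentos/$releasever/$basearch/
enabled=1
# Assumes unsigned local rebuilds; set gpgcheck=1 and gpgkey= if yours are signed.
gpgcheck=0
```

After copying new RPMs into the repo directory on the server, running
createrepo on that directory regenerates the metadata, and clients
see the packages on their next "yum update".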

Thanks for the work you put into this, Johnny, as well as your continuing
efforts in this project. You often don't get the credit you deserve.

./Cal