Bug #18748

Fix behavior (was: reproduce crashes on, provide fix)

Added by Dennis Box about 2 years ago. Updated almost 2 years ago.


glideinwms/lib/ was changed in v3.2.20 to use epoll() instead of select() for #17067.

It was tested heavily on the factory but not well enough on the frontend side of things.

When a frontend was upgraded to 3.2.20, it started throwing uncaught exceptions, which eventually crashed the frontend. As a temporary fix, the code was rolled back to the previous release.

This urgently needs to be reproduced and understood.

As a side note, changes to the rrd files during the upgrade make rolling back to the previous release difficult. If there is a way to read/write the rrd files that tolerates new fields appended to the end of the metadata, it should be adopted.

Attachment: (13 KB) Marco Mambelli, 01/30/2018 05:28 PM


#1 Updated by Marco Mambelli almost 2 years ago

Findings: In 3.2.20 the code was changed to use epoll instead of select to improve scalability, still falling back on select if epoll is not available.
It was also changed to catch specific exceptions instead of the generic "except:".
There was a bug in the code: a function was returning only the first file descriptor instead of the expected list of file descriptors, so work backed up on loaded systems. This triggered an OSError down the road. Had that error been caught, the Frontend could have continued to operate, but after the switch to specific exceptions it was no longer caught.
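To illustrate the first finding, here is a minimal sketch of an epoll-based readiness check that falls back on select and returns every ready file descriptor rather than only the first one. The helper name `ready_fds` and its parameters are hypothetical and are not taken from the GlideinWMS code.

```python
import os
import select


def ready_fds(fds, timeout_s=0.1):
    """Return a list of ALL descriptors in fds that are readable.

    Hypothetical helper, not the GlideinWMS implementation.
    """
    if hasattr(select, "epoll"):
        poller = select.epoll()
        try:
            for fd in fds:
                poller.register(fd, select.EPOLLIN)
            # epoll.poll() takes its timeout in seconds
            events = poller.poll(timeout_s)
        finally:
            poller.close()
        # The kind of bug described above would be returning
        # events[0][0] here; return the whole list instead.
        return [fd for fd, _mask in events]
    # Fallback when epoll is not available on this platform
    readable, _, _ = select.select(fds, [], [], timeout_s)
    return list(readable)
```

With two pipes that both have data pending, both read ends should come back in the result, which is the behavior the callers expected.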

In the new code I’m taking care of both: fixing the epoll behavior and catching the OSError.
I'm also optimizing epoll/poll by adding a timeout of 100 milliseconds.
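As a rough sketch of the other two points (catching OSError explicitly and waiting with a 100 millisecond timeout), something along these lines could be used. The function `read_if_ready` and the status strings are assumptions made for illustration. It uses select.select for portability, where the timeout is given in seconds, as it is for epoll.poll(); select.poll().poll() takes milliseconds instead.

```python
import os
import select

POLL_TIMEOUT_S = 0.1  # 100 milliseconds, expressed in seconds


def read_if_ready(fd, nbytes=4096, timeout_s=POLL_TIMEOUT_S):
    """Return (status, data) for one descriptor.

    status is "ok", "timeout" or "error". Hypothetical helper,
    not the GlideinWMS implementation.
    """
    try:
        readable, _, _ = select.select([fd], [], [], timeout_s)
        if not readable:
            return ("timeout", b"")
        return ("ok", os.read(fd, nbytes))
    except OSError:
        # Catch the specific exception (not a bare "except:") so the
        # caller can skip this descriptor and keep running.
        return ("error", b"")
```

A caller can then drop descriptors that report "error" and continue its loop instead of letting the exception propagate and stop the process.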

Changes are in v3/18748 and attached to this ticket (new

#2 Updated by Marco Mambelli almost 2 years ago

  • Subject changed from reproduce crashes on, provide fix to Fix behavior (was: reproduce crashes on, provide fix)

#3 Updated by Marco Mambelli almost 2 years ago

To patch, you can replace the installed file with the one attached to this ticket. It is located in:

  • lib/ in the source tree
  • glideinwms/lib/ in the Python site-packages for an installed RPM, e.g. /usr/lib/python2.6/site-packages/glideinwms/lib/

#4 Updated by Dennis Box almost 2 years ago

  • Status changed from Feedback to Resolved
  • Assignee changed from Dennis Box to Marco Mambelli

#5 Updated by Parag Mhashilkar almost 2 years ago

  • Status changed from Resolved to Closed
