Support #18887

art/Utilities/LinuxProcMgr treats non-fatal (transient) errors (e.g. EINTR) as fatal.

Added by Christopher Green over 1 year ago. Updated over 1 year ago.

Status: Closed
Priority: Normal
Assignee:
Category: Infrastructure
Target version:
Start date: 02/05/2018
Due date:
% Done: 100%
Estimated time:
Spent time:
Scope: Internal
Experiment: -
SSI Package: art
Duration:

Description

Under heavy load, an art job is likely to throw an exception of the form:

%MSG-s ArtException:  Raw2HDF5:raw2hdf5@EndJob 03-Feb-2018 07:34:44 CST  ModuleEndJob
cet::exception caught in art
---- OtherArt BEGIN
  ---- Configuration BEGIN
     Failed to open: cat /proc/21650/status
  ---- Configuration END
  ---- OtherArt BEGIN
    ---- Configuration BEGIN
       Failed to open: /proc/21650/stat for schedule: 0
    ---- Configuration END
  ---- OtherArt END
---- OtherArt END
%MSG

The code should check errno for transient errors (such as EINTR) and retry the call as appropriate. See, for example, the read(2) man page for details.
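A minimal sketch of the retry pattern requested here, assuming plain POSIX open(2) and read(2) calls. The helper names `open_retrying` and `read_all` are hypothetical and are not taken from LinuxProcMgr; the sketch only illustrates looping while errno is EINTR rather than treating the first failure as fatal.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <string>
#include <unistd.h>

// Hypothetical helper (not the art implementation): open a file read-only,
// retrying for as long as the call is interrupted by a signal (EINTR).
// Returns the file descriptor, or -1 with errno set for any other error.
int open_retrying(char const* path)
{
  assert(path != nullptr);
  int fd;
  do {
    fd = ::open(path, O_RDONLY);
  } while (fd == -1 && errno == EINTR);
  return fd;
}

// Hypothetical helper: append the remaining contents of fd to 'out'.
// A read interrupted by a signal is retried; any other error is reported
// by returning false, with errno left as set by read(2).
bool read_all(int fd, std::string& out)
{
  char buf[4096];
  for (;;) {
    ssize_t const n = ::read(fd, buf, sizeof buf);
    if (n > 0) {
      out.append(buf, static_cast<std::size_t>(n));
    } else if (n == 0) {
      return true;   // end of file
    } else if (errno != EINTR) {
      return false;  // genuine failure
    }
    // n == -1 with errno == EINTR: transient, so loop and try again
  }
}
```

A caller could open `/proc/<pid>/stat` with `open_retrying`, pass the descriptor to `read_all`, and raise an exception only when one of them reports a non-transient failure. Which errno values besides EINTR deserve a retry is a judgment call that this sketch does not settle.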

History

#1 Updated by Kyle Knoepfel over 1 year ago

  • Status changed from Accepted to Assigned
  • Assignee set to Kyle Knoepfel

#2 Updated by Kyle Knoepfel over 1 year ago

  • Tracker changed from Bug to Support
  • Category set to Infrastructure
  • Status changed from Assigned to Resolved
  • % Done changed from 0 to 100
  • SSI Package art added

After discussion, it was deemed sufficient to report the value of errno upon a file-open failure.

Implemented with commit art:c28eff0.
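For illustration only, reporting errno on a file-open failure could look roughly like the sketch below. The actual change in the commit is not reproduced here; the function name `open_or_throw`, the use of `std::fopen`, and the message format are all assumptions.

```cpp
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical helper (not the art implementation): open a file for reading
// and, on failure, include the errno value and its description in the
// exception message so the cause of the failure is visible in the job log.
std::FILE* open_or_throw(std::string const& path)
{
  std::FILE* f = std::fopen(path.c_str(), "r");
  if (f == nullptr) {
    int const err = errno;  // save errno before any later call can change it
    throw std::runtime_error("Failed to open: " + path + " (errno " +
                             std::to_string(err) + ": " +
                             std::strerror(err) + ")");
  }
  return f;
}
```

With a message of this kind, a failure under load shows whether the underlying error was transient (for example EINTR) or something else, such as a missing file.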

#3 Updated by Kyle Knoepfel over 1 year ago

  • Target version set to 2.10.03

#4 Updated by Kyle Knoepfel over 1 year ago

  • Status changed from Resolved to Closed

