NOvA Error Codes

The known return values for the NOvA experiment are:

1

| Context | Symptoms | Diagnostic | Treatment | Date |
|---|---|---|---|---|
| R2R keepup | Return value equal to 1 | Problem transferring an output file | Temporary problem. Retry later. | 05/29/16 |
| | In *.err job log ---> "...error: globus_xio: System error in send: Connection reset by peer globus_xio: A system call failed: Connection reset by peer ifdh cp failed at: Sun May 29 02:04:40 2016 Traceback (most recent call last): File "/grid/fermiapp/products/nova/externals/NovaGridUtils/v01.83/NULL/bin/runNovaSAM.py", line 538, in <module> copyOutFiles(dest, args.hashDirs, args.runDirs, fileMetaDataMgr.runNum, args.noCleanup, args.declareLocations, args.declareFiles) File "/grid/fermiapp/products/nova/externals/NovaGridUtils/v01.83/NULL/bin/runNovaSAM.py", line 230, in copyOutFiles raise Exception("Copy out failed for file: " + fileWPath) Exception: Copy out failed for file: ./results/fardet_r00023120_s43_t05_S15-03-11_v1_data.artdaq.root ..." | | | 05/29/16 |

2

| Context | Symptoms | Diagnostic | Treatment | Date |
|---|---|---|---|---|
| R2R keepup | Return value equal to 2 | Problem transferring an input file | Temporary problem. Retry later. | 05/29/16 |
| | In *.err job log ---> "... error: globus_ftp_client: the server responded with an error 451 General problem: Problem while connected to [131.225.166.222:38700, 198.49.208.26:58392]: Connection reset by peer ifdh cp failed at: Sun May 29 02:08:36 2016 Exception:Copy Failed for: gsiftp://fndca1.fnal.gov:2811/pnfs/fnal.gov/usr/nova/rawdata/FarDet/000231/23119/fardet_r00023119_s21_t02.raw: exception: see error output at:Sun May 29 02:08:36 2016..." | Fails when updating the status of the input file (not enough parameters) | | 05/29/16 |
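
Both transfer failures (return values 1 and 2) leave an "ifdh cp failed at:" line in the *.err log. The following is a minimal sketch of how such lines could be picked out of a log, assuming the log is plain text and the timestamp has the same layout as in the excerpts above. The function name `find_ifdh_cp_failures` is hypothetical and is not part of NovaGridUtils or ifdh.

```python
import re

# Marker as it appears in the *.err excerpts above, e.g.
# "ifdh cp failed at: Sun May 29 02:04:40 2016".
# Assumption: the timestamp always follows this day/month/date/time/year layout.
IFDH_FAILURE = re.compile(
    r"ifdh cp failed at:\s*(\w{3} \w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2} \d{4})"
)


def find_ifdh_cp_failures(log_text):
    """Return the timestamp string of every 'ifdh cp failed at:' marker found."""
    return IFDH_FAILURE.findall(log_text)
```

An empty result for a job that returned 1 or 2 would suggest the failure is not the transfer problem described here and the log should be read by hand.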

3

| Context | Symptoms | Diagnostic | Treatment | Date |
|---|---|---|---|---|
| Reported by POMS | Not seen by OPOS monitoring | Confirm with Marc | Not available | 06/02/16 |

9

| Context | Symptoms | Diagnostic | Treatment | Date |
|---|---|---|---|---|
| Reco keepup, offsite, Nebraska, all jobs fail in the same node (7e474065a898) | Return value 9, Nebraska, worker node like 7e474065a898. The *.out log file reports missing libraries: "... library_shim not setup. Setting up... Library libGLU.so is missing. Library libXmu.so is missing. Library libXpm.so is missing. Library libXt.so is missing. Library libXxf86vm.so is missing. Library libgstapp-0.10.so is missing. Library libgstaudio-0.10.so is missing. Library libgstinterfaces-0.10.so is missing. Library libgstpbutils-0.10.so is missing. Library libgstriff-0.10.so is missing. Library libgsttag-0.10.so is missing. Library libgstvideo-0.10.so is missing. ..." | Missing required libraries on the worker node | Report the node to Ken Herner and rerun the files on another node | 05/29/16 |
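
When reporting a node, it can help to attach the exact list of libraries the *.out log complains about. The sketch below extracts them, assuming each report has the form "Library <name> is missing." as in the excerpt above. The function name `missing_libraries` is hypothetical.

```python
import re

# Pattern taken from the *.out excerpt above: "Library libGLU.so is missing."
MISSING_LIB = re.compile(r"Library (\S+) is missing\.")


def missing_libraries(log_text):
    """Return the distinct library names reported as missing, in order of appearance."""
    seen = []
    for name in MISSING_LIB.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen
```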

11

| Context | Symptoms | Diagnostic | Treatment | Date |
|---|---|---|---|---|
| Reported by POMS | Not seen by OPOS monitoring | Confirm with Marc | Not available | 06/02/16 |

65

| Context | Symptoms | Diagnostic | Treatment | Date |
|---|---|---|---|---|
| R2R keepup | Return value 65 | Corrupt file | Report to collaboration and ask production coordinator to flag it as bad | 03/14/16 |

90

| Context | Symptoms | Diagnostic | Treatment | Date |
|---|---|---|---|---|
| Listed in table of Reco keepup | NA | Not a single occurrence listed | Should remove this error code | 06/02/16 |

127

| Context | Symptoms | Diagnostic | Treatment | Date |
|---|---|---|---|---|
| Reco keepup, offsite | Return value 127. In *.out: "... nova: error while loading shared libraries: libtbb.so.2: cannot open shared object file: No such file or directory ..." | Missing library in two nodes: red-d17n15.unl.edu (Nebraska) and compute-20-44.tier2 (CIT_CMS_T2) | Files were retried and succeeded when they landed on other nodes. Report the nodes to computing division (Ken or Kevin). | 05/11/16 |

135

| Context | Symptoms | Diagnostic | Treatment | Date |
|---|---|---|---|---|
| Listed in table of Reco keepup | NA | Not a single occurrence listed | Should remove this error code | 06/02/16 |

137

| Context | Symptoms | Diagnostic | Treatment | Date |
|---|---|---|---|---|
| Reported by POMS | Not seen by OPOS monitoring | Confirm with Marc | Not available | 06/02/16 |

139

| Context | Symptoms | Diagnostic | Treatment | Date |
|---|---|---|---|---|
| Reported by POMS | Not seen by OPOS monitoring | Confirm with Marc | Not available | 06/02/16 |

141

| Context | Symptoms | Diagnostic | Treatment | Date |
|---|---|---|---|---|
| R2R keepup | Return value 141 | Not available | Retry (?) | 11/09/15 |

245

| Context | Symptoms | Diagnostic | Treatment | Date |
|---|---|---|---|---|
| R2R keepup | Return value equal to 245, file status skipped, process status bad. Not a single error message in the *.err or *.out logs. | Not clear | Retry later | 05/31/16 |

249

| Context | Symptoms | Diagnostic | Treatment | Date |
|---|---|---|---|---|
| NOvA Reco keepup | Return value equal to 249 | Not available. Frequency: about once per year. | Not available | 01/15/16 |

250

| Context | Symptoms | Diagnostic | Treatment | Date |
|---|---|---|---|---|
| Reco keepup | Return value 250 and in *.out "No mask found in DB" | Corrupt data in experiment's DB. | Report to collaboration, more specifically to Ryan Murphy and Jonathan Paley. Send the list of files failing with this error code. | 06/01/16 |

255

| Context | Symptoms | Diagnostic | Treatment | Date |
|---|---|---|---|---|
| Reco keepup | Return value 255 | Not available. Needs further exploration if seen again. | Retry (?) | 04/15/16 |
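
For scripted triage, the treatments listed on this page can be collected into a small lookup. This is only a sketch: the treatment strings are paraphrased from the entries above, the names `TREATMENTS` and `suggested_treatment` are hypothetical, and codes with no documented treatment fall back to "Not available".

```python
# Treatments paraphrased from the entries on this page.
# Codes with no documented treatment (3, 11, 137, 139, 249) and the two
# codes proposed for removal (90, 135) are deliberately left out.
TREATMENTS = {
    1: "Temporary problem. Retry later.",
    2: "Temporary problem. Retry later.",
    9: "Report the node to Ken Herner and rerun the files on another node.",
    65: "Report to collaboration and ask production coordinator to flag the file as bad.",
    127: "Retry on other nodes and report the nodes to computing division.",
    141: "Retry (?)",
    245: "Retry later.",
    250: "Report to collaboration with the list of failing files.",
    255: "Retry (?)",
}


def suggested_treatment(return_value):
    """Look up the documented treatment; undocumented codes get 'Not available'."""
    return TREATMENTS.get(return_value, "Not available")
```

For example, `suggested_treatment(245)` gives "Retry later.", while `suggested_treatment(3)` gives "Not available" because that code is only reported by POMS and has no treatment recorded yet.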