Troubleshooting » History » Version 5

Adam Lister, 10/27/2020 04:42 PM


The Big List Of Production Issues (and Solutions)

This page is expected to get long. It lists every issue we run into, along with the solution we found.


ISSUE

***** Batch is interrupted!! *****
perform_with_timeout: curl_easy_perform() failed: Failure when receiving data from the peer
perform_with_timeout: curl_easy_perform() failed: Failure when receiving data from the peer
...
perform_with_timeout: curl_easy_perform() failed: Failure when receiving data from the peer
perform_with_timeout: curl_easy_perform() failed: Failure when receiving data from the peer
Table::Load: Web Service returned HTTP status -1: 
int nova::dbi::RunHistory::DetGainSetting()No gain setting found in DB!./alister1-detgenie_tbcry_fcl_to_g4-20201023_1000.sh: line 555:  1546 Aborted                 "nova" -c "/srv/no_xfer/ifdh_501_0/tb_cry_artdaq_prod5p1_ideal-gain100_none_r1074803_s00_c0_v04.80_1_20201020_164322.fcl" "--sam-file-type=importedSimulated" "--sam-application-family=nova" "--sam-data-tier=g4" "--sam-application-version=R19-09-24-testbeam-production.f" 
Fri Oct 23 15:43:35 UTC 2020 alister1-detgenie_tbcry_fcl_to_g4-20201023_1000.sh COMPLETED with exit status 134

SOLUTION
If the release is older than production 5, you'll need to configure the job to run onsite only.
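
To confirm that a failed job matches this issue before changing where it runs, a log check along these lines may help. This is only a sketch: the function name `check_gain_failure` and the idea of passing a saved log file are assumptions for illustration, and the two search strings are copied from the output above.

```shell
# Hypothetical helper: look for the failure signatures shown above in a saved job log.
# Prints the number of curl failures and returns 0 only if the gain-setting error is present.
check_gain_failure() {
    log="$1"
    # grep -c prints 0 and exits non-zero when nothing matches, hence the "|| true".
    curl_failures="$(grep -c -F 'curl_easy_perform() failed' "$log" || true)"
    echo "curl failures: $curl_failures"
    if grep -q -F 'No gain setting found in DB!' "$log"; then
        echo "gain setting lookup failed"
        return 0
    fi
    return 1
}
```

Example use, with `job.log` standing in for wherever the job output was saved: `check_gain_failure job.log && echo "matches the gain-setting issue"`.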


ISSUE

error: globus_ftp_client: the server responded with an error
451 No write pools online for [net=147.231.25.39,protocol=GFtp/2,store=nova.scratch@enstore,cache=,linkgroup=]

SOLUTION
This appears to be transient; wait a few days and try again.
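
If the transfer is scripted, one option is to wrap it in a retry loop rather than resubmitting by hand. The wrapper below is a generic sketch, not something taken from the production scripts: `retry_with_delay` is a hypothetical name, and the attempt count and delay are whatever you choose to pass in.

```shell
# Hypothetical wrapper: run a command up to <attempts> times, sleeping <delay> seconds
# between failed attempts. Returns 0 on the first success, 1 if every attempt fails.
retry_with_delay() {
    attempts="$1"
    delay="$2"
    shift 2
    n=1
    while :; do
        # Run the wrapped command exactly as given.
        "$@" && return 0
        if [ "$n" -ge "$attempts" ]; then
            echo "giving up after $n attempts: $*" >&2
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
}
```

For example, `retry_with_delay 5 3600 <your transfer command>` would try up to five times, one hour apart. Since the outage can last days, a long delay or a later manual resubmission may still be needed.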


ISSUE
Trying to stop a project results in the following:

samweb stop-project lasquith-batch1_detgenie_rhc_fcl_to_g4-20201009_0613-testjobs
Project lasquith-batch1_detgenie_rhc_fcl_to_g4-20201009_0613-testjobs is not known to the station and has state reserved

SOLUTION
(from Marc Mengel)
The project is in a half-started state (the state "reserved"); if you first establish a consumer process, you should be able to end it. Try doing a:

samweb start-process --appversion demo --appname demo --appfamily demo lasquith-batch1_detgenie_rhc_fcl_to_g4-20201009_0613-testjobs

and then try to stop the project again.
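
The two steps can be combined into a small shell function. This is a minimal sketch: the function name `end_reserved_project` and the `SAMWEB` override variable are assumptions added for illustration, while the `samweb` subcommands and the `demo` arguments are the ones quoted above.

```shell
# Hypothetical helper: end a SAM project that is stuck in the "reserved" state.
# SAMWEB defaults to the real client; setting SAMWEB=echo (an illustrative
# convenience, not a samweb feature) prints the commands instead of running them.
end_reserved_project() {
    project="$1"
    if [ -z "$project" ]; then
        echo "usage: end_reserved_project <project-name>" >&2
        return 2
    fi
    # Step 1: establish a consumer process, as suggested above.
    ${SAMWEB:-samweb} start-process --appversion demo --appname demo --appfamily demo "$project" || return 1
    # Step 2: retry the stop request that previously failed.
    ${SAMWEB:-samweb} stop-project "$project"
}
```

A dry run such as `SAMWEB=echo end_reserved_project <project-name>` shows the two commands in order without contacting the station.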