Bug #17941

Project.py lar jobs fail to process on the grid due to 'lar' command missing

Added by Dominic Brailsford almost 2 years ago. Updated almost 2 years ago.

Status: Closed
Priority: Immediate
Start date: 10/18/2017
Due date:
% Done: 0%
Estimated time:
Duration:

Description

Using v06_53_00_SBNWorkshop1017 and v01_29_00_SBNWorkshop1017, all jobs are submitted successfully, but every one fails on the worker node with error code 127: lar command not found.
I've attached log files from one of the sub jobs.
At line 58 of the log output (not the error output), an attempt is made to set up sbndcode, but the lines that follow show no sbndcode ups product.
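
For context, exit status 127 is what a POSIX shell returns when it cannot find a command on PATH, which fits lar never having been set up on the worker node. A minimal illustration, using a deliberately made-up command name rather than lar itself:

```shell
# A command that is not on PATH makes the shell return exit status 127.
# "definitely_not_a_real_command_xyz" is a made-up name used only for illustration.
status=0
sh -c 'definitely_not_a_real_command_xyz' 2>/dev/null || status=$?
echo "exit status: $status"
```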

Associated revisions

Revision 5d718a3c (diff)
Added by Dominic Brailsford almost 2 years ago

Fix for issue #17941
Source the strict (CVMFS) version of the sbnd setup script.

History

#1 Updated by Dominic Brailsford almost 2 years ago

As a test, I've tried submitting one job using sbndcode v06_48_00_MCC, which was the version used for the last MC production (which ran successfully).
That job also failed to find the lar command.
This may be a larger problem than I originally thought... Could this be down to the recent upgrade we got an email about?

#2 Updated by Dominic Brailsford almost 2 years ago

I've managed to semi-replicate the problem outside of project.py.
I've submitted a very simple bash script via jobsub which does four things:

source the experiment environment (either sbnd_setup.sh or sbnd_setup_strict.sh)
ups active
setup sbndcode v06_48_00_MCC -q e14:prof
ups active

When sourcing the strict version of the setup script, the log file shows sbndcode as an active product, whereas with the old version (sbnd_setup.sh) the log shows only the same ups products that the broken project.py jobs show.
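
The four-step test above can be sketched roughly as follows. This is not the script that was actually submitted: the function name run_env_check is hypothetical, and the path to the setup script is left as an argument because the full path is not given here. The ups and setup commands are expected to be provided by the sourced setup script.

```shell
#!/bin/bash
# Sketch of the four-step environment check (not the original script).
# run_env_check is a hypothetical helper; $1 is the path to the experiment
# setup script (either the old one or the strict one).
run_env_check() {
    (
        # 1. Source the experiment environment.
        . "$1" || exit 1
        # 2. List the active ups products before setting up sbndcode.
        echo "--- ups active (before setup) ---"
        ups active
        # 3. Set up the sbndcode release used for the last MC production.
        setup sbndcode v06_48_00_MCC -q e14:prof
        # 4. List the active ups products again; sbndcode should now appear.
        echo "--- ups active (after setup) ---"
        ups active
    )
}

# Run the check only when a setup script path is passed on the command line.
if [ -n "${1:-}" ]; then
    run_env_check "$1"
fi
```

Running it once with each setup script and comparing the second "ups active" listing shows whether sbndcode was actually set up.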

#3 Updated by Dominic Brailsford almost 2 years ago

Changing the script sourced in experiment_utilities.py from setup_sbnd.sh to setup_sbnd_strict.sh has allowed a test job to run successfully.

#4 Updated by Dominic Brailsford almost 2 years ago

  • Status changed from New to Resolved

Fixed in commit 5d718a3c.

I'm going to assume that there was some upstream change which now prevents the old way of setting up the environment from working.

The gen stage of one of the workshop samples is now finishing successfully, so I'm marking this as resolved.

#5 Updated by Dominic Brailsford almost 2 years ago

  • Status changed from Resolved to Closed

Closing as the production is now very nearly finished.


