Support #10762

Glideins are failing at OSC and the stderr is reporting "Not enough arguments in fetch_file" errors

Added by Marco Mambelli about 4 years ago. Updated about 4 years ago.

Status: Closed
Priority: Normal
Category: -
Target version:
Start date: 11/03/2015
Due date:
% Done: 0%
Estimated time:
Stakeholders:
Duration:

Description

OSG is running using the OSG factory (glidein.grid.iu.edu) and the FIFE frontend.

Glideins are failing, and the stderr/stdout from one of them was reported.
A similar error was caused by a bug in glidein_startup.sh in the past, but the factory is running 3.2.11, where that bug should already be fixed.
I downloaded glidein_startup.sh from the factory and it is the one in branch_v3_2 (http://glidein.grid.iu.edu/factory/stage/glidein_startup.sh).

Signature OK for main:file_list.faulAe.lst.
Sat Oct 31 11:35:26 EDT 2015 Not enough arguments in fetch_file main error_gen.sh error_gen.faulAe.sh regular 0 TRUE FALSE
Sat Oct 31 11:35:26 EDT 2015 Sleeping 275
...

The file list in question seems to have the correct format:
http://glidein.grid.iu.edu/factory/stage/file_list.faulAe.lst

The error message also reports all 7 parameters (which is the correct number and should not trigger the error).
I cannot understand why this is happening.

Improving the debug messages could help in troubleshooting.

stderr.txt (9.59 KB) stderr.txt Marco Mambelli, 11/03/2015 05:13 PM
stdout.txt (3.15 KB) stdout.txt Marco Mambelli, 11/03/2015 05:13 PM
combined_v3_10762_patches_151104.patch (4.7 KB) combined_v3_10762_patches_151104.patch Marco Mambelli, 11/04/2015 05:12 PM

History

#1 Updated by Marco Mambelli about 4 years ago

Added the stdout/stderr from the glidein provided by the OSC sysadmin.

#2 Updated by Marco Mambelli about 4 years ago

  • File 0001-added-debug-messages-for-file-format.patch added

Patch adding more information to the error message, to understand why the format is considered incorrect.

#3 Updated by Marco Mambelli about 4 years ago

  • File deleted (0001-added-debug-messages-for-file-format.patch)

#4 Updated by Marco Mambelli about 4 years ago

Combined patch: it includes an improved error message, printing of the script checksum, and setting of the default IFS.
This is the recommended patch for the "Not enough arguments" error.
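
To illustrate the three measures listed above, here is a minimal bash sketch. It is not the content of the attached patch: the function names log_script_checksum and report_bad_args and the message wording are hypothetical, and md5sum is assumed to be available on the node.

```shell
#!/bin/bash
# Hypothetical sketch of the three diagnostics; NOT the attached patch.

# 1. Restore the default IFS (space, tab, newline) before any word splitting.
IFS=$' \t\n'

# 2. Log a checksum of a script so its version can be matched from the log.
#    $1: path of the script to checksum (assumes md5sum is installed).
log_script_checksum() {
    local sum
    sum=$(md5sum "$1" | cut -d ' ' -f 1)
    echo "Script checksum: $sum ($1)"
}

# 3. Report the argument count and the current IFS next to the arguments,
#    so a splitting problem is visible in the error message itself.
report_bad_args() {
    local ifs_str
    printf -v ifs_str "%q" "$IFS"   # shell-quoted form, e.g. $' \t\n'
    echo "Not enough arguments in fetch_file, 7 expected ($#/$ifs_str): $*" >&2
}
```

A startup script could call log_script_checksum "$0" once at the beginning and report_bad_args "$@" wherever the argument count check fails.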

It has been reported that one instance of the error was fixed by running "service gwms-factory upgrade".

The cause is still unknown. It may be an old version of glidein_startup.sh in the workdir (the one sent to the site).

Here is also a test script to load the files:

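#!/bin/bash
# Test script: parse a file list and show how each line is split
# into fetch_file arguments.
# $1: path of the file list to parse
fname=$1

function fetch_file {
 # Render IFS in shell-quoted form so an altered value is visible
 printf -v ifs_str "%q" "${IFS}"
 if [ "$#" -ne 7 ]; then
    [ "$#" -eq 6 ] && echo "6 args, compatibility mode, OK" || echo "*** Problem, arguments should be 7 or 6!"
 fi
 echo "Fetch file called with ($#/$ifs_str) arguments: $*"
 for var in "$@"
 do
    echo "Arg: [$var]"
 done
 echo
}

gs_id=main

echo "START - parsing file $fname"

# Skip comment lines; -r keeps backslashes in the line unchanged
while read -r file
  do
    if [ "${file:0:1}" != "#" ]; then
        # $file is intentionally unquoted: word splitting on IFS yields the arguments
        fetch_file "$gs_id" $file
    fi
  done < "${fname}"

echo "END"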

#5 Updated by Marco Mambelli about 4 years ago

  • Status changed from New to Feedback
  • Assignee changed from Marco Mambelli to Parag Mhashilkar
  • Target version set to v3_2_12

Things seem to work on https://ticket.opensciencegrid.org/27288
The problem was probably not a GWMS bug, but these changes will make it easier to understand what is going on if something similar happens in the future (did the operator forget to run gwms-factory upgrade? was IFS altered? ...)
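
As an illustration of the "was IFS altered?" question, the following sketch shows how the same file-list line can yield a different argument count depending on IFS. This is only one possible scenario, not a confirmed cause. The helper names count_args and split_count are hypothetical, and the sample line reuses the fields from the error message in the description.

```shell
#!/bin/bash
# Hypothetical helper: print how many arguments it received.
count_args() { echo "$#"; }

# Sample line with the six fields seen in the error message.
line="error_gen.sh error_gen.faulAe.sh regular 0 TRUE FALSE"

# Count the arguments produced by splitting $line.
# $1 (optional): IFS value to use; without it the default splitting applies.
split_count() {
    (
        # The subshell keeps the IFS change local to this call
        if [ "$#" -gt 0 ]; then IFS=$1; else unset IFS; fi
        # $line is intentionally unquoted so that it is split on IFS
        count_args main $line
    )
}

echo "default IFS: $(split_count) arguments"
echo "IFS set to ':': $(split_count ':') arguments"
```

With default splitting the call receives the id plus six fields. With IFS set to ':' the line is not split at all, so only two arguments arrive, yet a message that prints the arguments joined together would still look as if all seven were present.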

#6 Updated by Parag Mhashilkar about 4 years ago

  • Assignee changed from Parag Mhashilkar to Marco Mambelli

Sent feedback separately.

#7 Updated by Marco Mambelli about 4 years ago

Quotes removed as requested in the feedback, and the changes were merged.
These changes will mainly help troubleshooting (especially to establish the version of the glidein_startup.sh script from the log).
The cause of the problem was not understood, and the problem went away.

#8 Updated by Marco Mambelli about 4 years ago

  • Status changed from Feedback to Resolved

#9 Updated by Parag Mhashilkar about 4 years ago

  • Status changed from Resolved to Closed
