Afroditi Papadopoulou, 12/28/2018 10:10 AM


Running out of Disk Space on ubdaq-prod-evb?

Useful info: there are ~33 TB of disk space in /data/ on the evb machine. PUBS will try to clear data in /data/uboonedaq/TestRuns/ until the disk usage reaches 40% of /data/uboonedaq/TestRuns/.
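As a quick first check, the fill level of the /data/ filesystem can be compared against the 40% figure quoted above. The following is a minimal sketch using only the Python standard library; the function names are illustrative (not part of PUBS), and it assumes the 40% figure can be read as a percentage of the filesystem that holds the path, which the text above does not state explicitly.

```python
import shutil


def usage_percent(path):
    """Return how full the filesystem holding `path` is, in percent."""
    # shutil.disk_usage reports on the whole filesystem, not on one directory.
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return 100.0 * usage.used / usage.total


def needs_attention(path, threshold_percent=40.0):
    """True if the filesystem holding `path` is fuller than the threshold.

    The 40% default mirrors the figure quoted in this page; treating it as a
    whole-filesystem percentage is an assumption.
    """
    return usage_percent(path) > threshold_percent
```

On the evb machine this could be called as `needs_attention("/data")`.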

If this is the case there are several things one should do:

0) Make sure that dCache is operating normally:
--> Take a look at the log file for prod_transfer_binary_evb2dropbox_evb
/home/uboonepro/pubs/log/ubdaq-prod-evb.fnal.gov/prod_transfer_binary_evb2dropbox_evb.log
and search for errors / transfer failures like those below
[ ERROR ] transfer (L: 234) >> {transfer_file} Issuing deletion command: ifdh rm /pnfs/uboone/scratch/uboonepro/dropbox/data/uboone/raw/PhysicsRun-2018_12_28_6_39_57-0020483-00002.ubdaq.json
[ ERROR ] transfer (L: 238) >> {transfer_file} TRIED TO DELETE /pnfs/uboone/scratch/uboonepro/dropbox/data/uboone/raw/PhysicsRun-2018_12_28_6_39_57-0020483-00002.ubdaq.json but got an error from ifdhc
If that is the case and the errors persist, activate the bypass by following the instructions on the wiki page "What to do if dCache/enstore go down (no access to pnfs area)".
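Searching the log by eye can be slow, so the error search in step 0 could be scripted. The sketch below pulls out the lines marked `[ ERROR ]`. The line layout (`[ LEVEL ] source (L: N) >> message`) is inferred only from the two sample lines above, and the function names are hypothetical, so adjust the pattern if the real log differs.

```python
import re

# Layout assumed from the two sample lines in this page:
#   [ ERROR ] transfer (L: 234) >> {transfer_file} ...
ERROR_LINE = re.compile(
    r"^\[\s*ERROR\s*\]\s*(?P<source>\S+)\s*\(L:\s*(?P<line>\d+)\)\s*>>\s*(?P<message>.*)$"
)


def find_errors(lines):
    """Return (source, code_line, message) for every ERROR line in `lines`."""
    hits = []
    for raw in lines:
        match = ERROR_LINE.match(raw.strip())
        if match:
            hits.append(
                (match.group("source"), int(match.group("line")), match.group("message"))
            )
    return hits


def scan_log(path):
    """Read a log file and return its ERROR entries."""
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with open(path, "r", errors="replace") as handle:
        return find_errors(handle)
```

For example, `scan_log("/home/uboonepro/pubs/log/ubdaq-prod-evb.fnal.gov/prod_transfer_binary_evb2dropbox_evb.log")` would list the entries to inspect. Whether they persist over time still has to be judged by rerunning the scan.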

1) Identify who is using up the disk space. Options:
--> a) /data/uboonedaq/rawdata/ is where data from "official" runs goes. Files here are seen (and should eventually be removed) by PUBS.
--> b) /data/uboonedaq/TestRuns/ - is disk-space DAQ people use to test things. It is not seen by PUBS and needs to be removed by hand in order to be cleared.
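To see which of the two areas dominates, their sizes can be totalled and ranked. This is a rough sketch with hypothetical helper names. It sums apparent file sizes rather than allocated disk blocks, so the numbers can differ somewhat from what `du` reports, and a walk over a multi-TB area may take a while.

```python
import os


def dir_size_bytes(root):
    """Sum the sizes of all regular files under `root` (symlinks skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                # A file may vanish mid-scan (e.g. cleared meanwhile); skip it.
                pass
    return total


def rank_by_size(paths):
    """Return (path, size_in_bytes) pairs, largest first."""
    sizes = [(path, dir_size_bytes(path)) for path in paths]
    return sorted(sizes, key=lambda item: item[1], reverse=True)
```

A call such as `rank_by_size(["/data/uboonedaq/rawdata", "/data/uboonedaq/TestRuns"])` shows which area to look at first. Passing the subdirectories of /data/ instead would point at individual heavy users.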

If most of the space is not being used by /data/uboonedaq/rawdata/ we need to free space manually. If it is urgent to free up space (i.e. data-taking should not be interrupted and the disk will fill up rather soon) you are authorized to clear /data/uboonedaq/TestRuns/. Contact any other person who is using up a considerable amount of space and ask them to quickly remove contents in their /data/ folder.
If /data/uboonedaq/rawdata/ is using up a significant amount of space, the problem probably lies with PUBS.

2) Identify the cause of the problem. Why is disk space not being freed? Possible causes:
--> a) clear_binary_evb is having issues.
--> b) clear_binary_evb does not find any new files to clear. This indicates a possible problem with one of the projects that clear_binary_evb depends on. One possible cause is a network connection too slow to drain data off the evb machine.