Jagjeet Singh, 07/28/2016 04:04 AM
Disk Usage¶
A simple page to produce a snapshot of what we have stored where on the disks in no specific order.
If you have a directory on any of the disks and know what's in it please add it to the list.
We need to keep a better audit of the disk usage so as not to come unstuck when we hit quotas, as well as to keep better track of where everything is located.
Usage as of July 28th, 2016:
Filesystem                   Size  Used  Avail  Use%  Mounted on
blue3:/nova/data             140T  98T   43T    70%   /nova/data
blue3:/nova/ana              95T   80T   16T    84%   /nova/ana
blue3:/nova/prod             100T  87T   14T    87%   /nova/prod
if-nas-0.fnal.gov:/nova/app  10T   9.4T  703G   94%   /nova/app
/nova/prod (Updated July 28th, 2016)¶
Space used | Area | Lifetime |
49617 G | mc | |
33550 G | data | |
1722 G | concat | |
977 G | FTS_DropBoxes | |
344 G | reco_validation_Oct2014_tmpdir | |
mc/: Various MC tagged releases from S12-12-12 through to S14-02-05 - needs a full audit.
Clearly the biggest hog here!
Here is the breakdown of the MC directory:
480K  /nova/prod/mc/development
9.1G  /nova/prod/mc/fcl
1.1G  /nova/prod/mc/None
1.6T  /nova/prod/mc/S12-12-12
32K   /nova/prod/mc/S13-01-15
1.6G  /nova/prod/mc/S13-02-03
13T   /nova/prod/mc/S13-02-26
5.1M  /nova/prod/mc/S13-06-05
1.4T  /nova/prod/mc/S13-06-13
7.9T  /nova/prod/mc/S13-06-18
18G   /nova/prod/mc/S13-06-26
441G  /nova/prod/mc/S13-12-13
531G  /nova/prod/mc/s14-01-20
338G  /nova/prod/mc/S14-01-20
985G  /nova/prod/mc/S14-02-05
4.3G  /nova/prod/mc/S14-02-05a
102G  /nova/prod/mc/S14-03-06
25G   /nova/prod/mc/S14-03-25
96K   /nova/prod/mc/S14-05-05
11T   /nova/prod/mc/S14-05-08
12T   /nova/prod/mc/S14-05-12
178G  /nova/prod/mc/S14-07-03
40G   /nova/prod/mc/S14-07-11
100G  /nova/prod/mc/S14-08-15
312G  /nova/prod/mc/S14-08-19
data/: 5.7TB - Tagged versions of reconstructed FarDet "numi" data
/nova/data (Updated July 28th, 2016)¶
Space used | Area | Lifetime |
31562 G | mc | |
21025 G | novaroot | |
12379 G | rawdata | |
10617 G | nearline-OnMon | |
7918 G | flux | |
4988 G | nearline | |
3405 G | nearline-Ana | |
2695 G | spillserver_logs | |
2018 G | pedestal_data | |
795 G | pidlibs | |
Here is the breakdown of the MC directory:
609M  /nova/data/mc/daq_simulated_data
6.1M  /nova/data/mc/fcl
290M  /nova/data/mc/fclfiles
32K   /nova/data/mc/in_progress
152G  /nova/data/mc/in_progress_old
179G  /nova/data/mc/logfiles
86G   /nova/data/mc/S12.06.17_MDC_reco  Accessed 2014/03/31
6.5M  /nova/data/mc/S12-10-04_to_FTS  Not accessed this year, except metadata files
8.7T  /nova/data/mc/S12-11-16  Accessed 2014/04/14
32K   /nova/data/mc/S12-12-12
992G  /nova/data/mc/S13-01-15
32K   /nova/data/mc/S13-02-03
310G  /nova/data/mc/S13-02-26
25G   /nova/data/mc/S13-02-26a
551G  /nova/data/mc/S13-04-09
17T   /nova/data/mc/S13-06-05
3.5T  /nova/data/mc/S13-06-13
7.5T  /nova/data/mc/S13-06-18
386G  /nova/data/mc/S13-07-22
908M  /nova/data/mc/S13-09-17
205G  /nova/data/mc/S13-10-11
203G  /nova/data/mc/S13-12-13
/nova/ana (Updated July 28th, 2016)¶
Space used | Area | Lifetime |
67001 G | users | |
3523 G | nu_e_ana | |
2812 G | calibration | |
2220 G | nu_int_ana | |
2008 G | assembly_ana | |
758 G | nova_cvmfs | |
745 G | trigger | |
323 G | exotics_ana | |
270 G | steriles | |
114 G | MOVED | |
2 G | nu_mu_ana | |
/nova/ana/users (Updated June 9th, 2016)¶
Total Size: 65.477 T
Largest space users:
User | Space Used | Expected Space Need | Reason for Files | Expected Lifetime |
bianjm | 4406 G | |||
radovic | 2838 G | |||
nsmayer | 2078 G | |||
crisprin | 2047 G | |||
ksachdev | 1942 G | |||
edniner | 1912 G | |||
tamsett | 1894 G | |||
psihas | 1848 G | |||
rschroet | 1839 G | |||
barnali | 1632 G | |||
No longer large space users:
User | Space Used | Expected Space Need | Reason for Files | Expected Lifetime |
brunetti | 184 G |
/nova/app/users (Updated June 9th, 2016)¶
Total Size: 5.79 T
Largest space users:
User | Space Used | Expected Space Need | Reason for Files | Expected Lifetime |
blinehan | 200 G | |||
rhatcher | 194 G | |||
prabhjot | 194 G | |||
bckhouse | 184 G | |||
denis | 182 G | |||
timkudt | 178 G | |||
radovic | 169 G | |||
crisprin | 158 G | |||
tianxc | 138 G | |||
bays | 133 G | |||
No longer large space users:
User | Space Used | Expected Space Need | Reason for Files | Expected Lifetime |
arrieta1 | 47 G | Can remove more after graduating in October |
novapro Quota¶
Usage as of July 28th, 2016:
Filesystem                     blocks  quota  limit   grace  files   quota  limit  grace
blue2:/fermigrid-app           81705M  0      300G           638k    0      0
homesrv01.fnal.gov:/home       11144   0      5120M          14600k  0      0
if-nas-0.fnal.gov:/nusoft/app  1326G   0      1536G          25964k  0      0
blue3:/nusoft/data             24156G  0      28672G         528k    0      0
blue3:/nova/ana                6575G   0      8192G          25718k  0      0
blue3:/nova/prod               86203G  0      100T           4461k   0      0
if-nas-0.fnal.gov:/nova/app    37047M  0      200G           31563k  0      0
blue3:/nova/data               99434G  0      128T           5460k   0      0
blue2:/fermigrid-data          12773G  0      27648G         12043k  0      0
blue2:/fermigrid-fermiapp      2305G   0      3994G          28815k  0      0
Users files on dCache¶
If you have a bunch of important analysis files, but there just isn't room in /nova/ana/users for what you need to do ... welcome to dCache!
Please see the pnfs tutorial here: DocDB 13747
Temporarily, you can put your files at (great for files returning from grid jobs):
/pnfs/nova/scratch/users/
The best-practice method for moving your files there is to use ifdh cp. First, make the directory under /pnfs/nova/scratch/users that you want to write to group-writeable. For example, to move something to my area, I would (first time only) do:
chmod g+w /pnfs/nova/scratch/users/lein
Then, the command information for ifdh cp is:
ifdh cp args
The very simple example is:
ifdh cp test.txt /pnfs/nova/scratch/users/lein/
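The two steps above (the one-time chmod, then the copy) can be wrapped in a small script. This is only a sketch: is_group_writeable and copy_to_scratch are made-up helper names, not part of ifdh, and the only ifdh call used is the plain "ifdh cp src dest" form shown above.

```shell
#!/bin/sh
# Sketch only: is_group_writeable and copy_to_scratch are hypothetical helpers.

# is_group_writeable DIR: succeed if DIR has the group write bit set.
is_group_writeable() {
    # In the mode string printed by `ls -ld` (e.g. drwxrwxr-x),
    # the 6th character is the group write bit.
    mode=$(ls -ld "$1" | cut -c1-10)
    [ "$(printf '%s' "$mode" | cut -c6)" = "w" ]
}

# copy_to_scratch FILE DIR: make DIR group-writeable if it is not already,
# then copy FILE into it with ifdh cp.
copy_to_scratch() {
    file=$1
    dest=$2
    is_group_writeable "$dest" || chmod g+w "$dest" || return 1
    ifdh cp "$file" "$dest/"
}

# Example (commented out; needs ifdh set up and a pnfs mount):
# copy_to_scratch test.txt /pnfs/nova/scratch/users/lein
```

The check is there so the chmod only runs when it is actually needed; after the first time it is a no-op.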
For more advanced use, ifdh cp is described as a "general file copy using cpn locks dd, gridftp, or srmcp" and supports:
- basic source/dest filenames: cp src1 dest1 [';' src2 dest2 [';'...]]
- recursive directory copies: cp -r src1 dest1 [';' src2 dest2 [';'...]]
- copies to dest. directory: cp -D src1 src2 destdir1 [';' src3 src4 destdir2 [';'...]]
- copies to a list file: cp -f file_with_src_space_dest_lines
- any of the above can take --force={cpn,gridftp,srmcp,expgridftp}
- any of the file/dest arguments can be URIs
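For the cp -f form, the argument name file_with_src_space_dest_lines suggests a text file with one "src dest" pair per line. Assuming that reading is right, a list file for a whole directory could be generated along these lines. make_copy_list is a made-up helper, and it will not cope with file names containing spaces.

```shell
#!/bin/sh
# Sketch only: make_copy_list is a hypothetical helper, not part of ifdh.

# make_copy_list SRCDIR DESTDIR: print one "src dest" line for each regular
# file directly inside SRCDIR (subdirectories are skipped).
make_copy_list() {
    srcdir=$1
    destdir=$2
    for f in "$srcdir"/*; do
        [ -f "$f" ] || continue
        printf '%s %s\n' "$f" "$destdir/$(basename "$f")"
    done
}

# Example (commented out; needs ifdh set up and a pnfs mount):
# make_copy_list ./output /pnfs/nova/scratch/users/lein > copy_list.txt
# ifdh cp -f copy_list.txt
```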
Note that this is a scratch area where files have a limited lifetime: the least recently accessed files are deleted first. The lifetime is typically a few months; if you need the file set kept more permanently, you can copy it to the tape-backed area (use sam_clone_dataset from the Sam4users tools below).
More pnfs resources:
Basic dCache documentation: https://srm.fnal.gov/twiki/bin/view/DcacheCorner/DcacheFAQ
Lifetime plots: http://fndca.fnal.gov/dcache/dc_lifetime_plots.html
Sam4users: https://cdcvs.fnal.gov/redmine/projects/nova_sam/wiki/User_Datasets
Disk Usage Management¶
These are notes written by Susan on how she manages disk space usage.
The goal is to never let any of the areas get filled up.
Usually, it is relatively safe if all the areas are less than 80% full. If /nova/ana is 85% or more full, you should probably panic.
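The 80% rule of thumb is easy to check mechanically. The sketch below reads df-style output on stdin and prints every mount point at or above a given percentage. flag_full is a made-up helper name, and it assumes the usual six-column df layout with Use% in the fifth column and the mount point in the sixth.

```shell
#!/bin/sh
# Sketch only: flag_full is a hypothetical helper.

# flag_full THRESHOLD: read df-style lines on stdin (header first) and print
# "mountpoint use%" for every filesystem at or above THRESHOLD percent.
flag_full() {
    awk -v limit="$1" '
        NR > 1 {
            pct = $5
            sub(/%/, "", pct)      # strip the trailing percent sign
            if (pct + 0 >= limit + 0) print $6, $5
        }'
}

# Example:
# df -P /nova/data /nova/ana /nova/prod /nova/app | flag_full 80
```

Run against the July 28th, 2016 numbers at the top of this page with a threshold of 80, this would list /nova/ana (84%), /nova/prod (87%) and /nova/app (94%).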
Fairly frequently (a few times a week, or whenever I feel a prickle on the back of my neck that someone, somewhere is doing something horrible) I look at the "BlueArc for NOvA" plot at the bottom of this page:
https://fifemon.fnal.gov:3000/dashboard/db/experiment-overview
(log in with services account)
If any of the areas are using more space than I expect, I investigate to figure out why. Sharp upward changes in size are particularly worrisome.
Also, I update this wiki page with the information from weekly space usage emails. Updating the wiki page gives me a chance to meditate on the size of each area and think about who is using too much space.
To update the wiki page, I also run the following weekly:
df -h (for the top section)
ksu novapro
quota -u novapro -s (for section on novapro Quota - make sure nothing is changing too much, getting close to filled)
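To see at a glance how close each novapro area is to its limit, the blocks and limit columns of the quota output can be turned into a percentage. This is a rough sketch with some assumptions: quota_pct is a made-up helper, each filesystem is assumed to sit on one line as in the novapro Quota section, the M/G/T suffixes are taken as powers of 1024, and a bare number is taken as KiB.

```shell
#!/bin/sh
# Sketch only: quota_pct is a hypothetical helper.

# quota_pct: read `quota -s`-style rows on stdin
# (filesystem, blocks, quota, limit, ...) and print "filesystem percent"
# where percent is blocks as a fraction of the hard limit, rounded down.
quota_pct() {
    awk '
        # Convert a value such as 81705M, 300G or 100T to KiB.
        # Assumption: suffixes are powers of 1024; no suffix means KiB.
        function to_kib(v,    n, u) {
            n = v + 0                      # numeric prefix of the field
            u = substr(v, length(v), 1)    # last character is the unit, if any
            if (u == "M") return n * 1024
            if (u == "G") return n * 1024 * 1024
            if (u == "T") return n * 1024 * 1024 * 1024
            return n
        }
        # Header lines and rows without a limit give 0 here and are skipped.
        to_kib($4) > 0 {
            printf "%s %d%%\n", $1, 100 * to_kib($2) / to_kib($4)
        }'
}

# Example (run as novapro):
# quota -u novapro -s | quota_pct
```

With the July 28th, 2016 numbers, blue3:/nova/prod (86203G of 100T) comes out at 84%.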
/nova/ana is the area that gets filled up quickly and unexpectedly, usually due to new/rogue users. Usually, between the weekly emails and snooping on grid usage, followed by du -hs in individual user areas, I can find out who is mostly responsible. Then I send emails alerting them to the situation and urging alternative action. Feel free to send a lot of emails nagging about space - I certainly do.
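The "du -hs in individual user areas" step can be done in one go by sizing every user directory and sorting. A minimal sketch, where top_users is a made-up helper name and sizes are printed in KiB so that sort can order them numerically:

```shell
#!/bin/sh
# Sketch only: top_users is a hypothetical helper.

# top_users DIR [N]: print the N largest immediate subdirectories of DIR
# (default 10), largest first, with sizes in KiB.
top_users() {
    dir=$1
    n=${2:-10}
    du -sk "$dir"/*/ 2>/dev/null | sort -rn | head -n "$n"
}

# Example:
# top_users /nova/ana/users 10
```

On an area the size of /nova/ana/users this can take a long time, so it is probably best kept for when the weekly emails are not enough.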
/nova/prod and a few other areas just slowly grow over time - I watch this and periodically make a fuss about needing things archived.
The weekly emails also list the biggest space users of condor-tmp - I usually bug people who use more than production. This is a relatively low-priority issue.