Support #21812

MINOS Bluearc file cleanup

Added by Arthur Kreymer 8 months ago. Updated 6 months ago.

Status:
Assigned
Priority:
Normal
Start date:
01/30/2019
Due date:
12/31/2019
% Done:

0%

Estimated time:
200.00 h
Duration: 336

Description

As requested by Fermilab ( email to be attached )
we should remove files from /minos/app and /minos/data which are no longer needed.

Data files should be moved from app to data.

Obsolete data files should be removed or archived to dCache.

Issues will be opened for specific activities.

minos-app-Tree-Scan-Rpt.txt (38.4 KB) /minos/app summary, Arthur Kreymer, 01/30/2019 06:04 PM
minos-data-Tree-Scan-Rpt.txt (467 KB) /minos/data summary, Arthur Kreymer, 01/30/2019 06:04 PM

History

#1 Updated by Arthur Kreymer 8 months ago

Date: Wed, 30 Jan 2019 15:21:20 +0000
From: Andrew J. Romero <>
To: Arthur E Kreymer <>, Stuart C Fuess <>,
Margaret Votava <>, Mark O Kaletka <>,
Michael K. Rosier <>
Subject: NAS Capacity Review and Cleanup : MINOS

Hi Art

The Fermilab NAS file-service provides
high-availability, Posix oriented,
file storage services for various activities, including:

- The interactive activities of Fermilab's
engineering, operations, financial and administrative teams
- Those interactive scientific computing activities for which
reliable, Posix oriented, mutable file storage makes sense.
- Various core services (CentralWeb, WordPress, TeamCenter, DocDB, Indico, MFA ...etc)

In the past, the Fermilab NAS file-service (then known as "The BlueArc" )
provided storage for Fermilab's grid / batch scientific computing activities.

The system-admin team of the Fermilab NAS file-service
is responsible for ensuring the long term sustainability / viability of the system.
This includes monitoring and tuning the system's security, performance and capacity.
It also includes budgeting for annual maintenance and system upgrades.

To control maintenance and component upgrade costs,
it is important that we don't retain capacity for
activities that are now outside of the current NAS file-service scope.
The system-admin team of the Fermilab NAS file-service would like to
implement various relevant enhancements to the NAS service
including SSD "back-end" storage arrays; however,
this cannot be done until "out-of-scope" files have been cleaned-up.

Please, immediately review the following data:

- The attached detailed capacity reports
(created by scanning the file trees of your team's NFS volumes)
- The summary data (from the NAS quota counters) on the NAS metrics web-site
( http://metrics.fnal.gov/nas/metrics )

After reviewing the data, quickly begin the task of cleaning up the associated NFS volumes.

Files that have these characteristics are welcome on the NAS
and should stay on the NAS:

- files actively used by your team's interactive computing effort
- files that are most conveniently accessed using standard Posix / Unix I/O

Files that have these characteristics are outside of the NAS service scope

- bulk data used by grid / batch computing during the "BlueArc era" 
( Please move this data to dcache ... start moving data today )
- files that are inactive but need long term cold-storage / archive-storage
( Please move this data to tape ... start moving data today )
- files that are obsolete, duplicate and un-needed.
No Fermilab funded storage service (NAS, Dcache, local-disk, public-cloud ..etc)
should be used as a trash heap.
( Please cleanup junk ... start today !)
- certain large immutable file-sets that, even in an interactive environment,
can conveniently be used in a simple "get-put" (Grid FTP / Pseudo-NFS)
manner rather than a fully Posix compliant standard IO manner.
( Please move this data to dcache ... this will require some thought ... start thinking today )

There may be some areas of your NAS volumes whose clean-up
will require careful deliberation among senior team members;
however, for other areas (like user directories)
you should consider involving all members of your team.
If everyone does a little, a lot will get done, without any one person
being overloaded with a large un-welcome clean-up burden.

In addition to cleaning up, I need your team to decide how much
NFS based / Posix style interactive storage it needs.
I doubt that any team still needs the NFS capacity it needed when
the NAS was mounted on "the grid". Also, teams that were allocated multiple
data volumes for grid processing during this time will, in general, need to consolidate.

In addition to cleaning up and right-sizing your NFS volumes,
if you haven't already done so, I would like you to contact me and have me implement
and/or fine-tune end-user capacity quotas on both your app and data volumes (see note below).

By enthusiastically performing the clean-up and archival tasks noted above,
you are helping us continue to provide a useful interactive storage service.

Thank You !!

Andy Romero
Fermilab, Core Computing

#2 Updated by Arthur Kreymer 6 months ago

The Computing Sector has asked for our plans via
TASK0150135 03/22 Minos - NAS Cleanup Status Request.

There are 9 questions to answer.
I will prepare draft answers here.

The bottom line is that CS wants at least half the files removed by Aug 2019.

Based on file summaries at http://fndca3a.fnal.gov/cgi-bin/du_cgi.py
we could do this by moving mcimport, LEM and nue_group_files to dCache persistent.
We would need another 100 TB of persistent allocation.

Perhaps some of these files could be archived to tape, or removed.
That would take more planning.
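
As a rough aid for that planning, the sketch below totals file sizes per top-level directory and lists the largest ones whose combined size reaches a target fraction. It is only an illustration under stated assumptions: the function names are hypothetical, "half the files" is read as half the bytes rather than half the file count, and sizes are apparent file sizes from lstat, not the NAS quota counters.

```python
import os
from collections import defaultdict


def top_level_usage(root):
    """Return {top-level entry name: total bytes} for everything under root."""
    usage = defaultdict(int)
    for dirpath, dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        # Files sitting directly in root are grouped under ".".
        top = rel.split(os.sep)[0]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat does not follow symlinks, so link targets are not counted.
                usage[top] += os.lstat(path).st_size
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return dict(usage)


def candidates_for_target(usage, fraction=0.5):
    """Pick the largest entries until their sizes reach `fraction` of the total.

    Assumption: the target is measured in bytes, not in number of files.
    """
    total = sum(usage.values())
    goal = total * fraction
    picked, running = [], 0
    for name, size in sorted(usage.items(), key=lambda kv: kv[1], reverse=True):
        if running >= goal:
            break
        picked.append(name)
        running += size
    return picked, running, total
```

For example, `candidates_for_target(top_level_usage("/minos/data"))` would return the candidate directories, their combined size and the volume total. A full walk of a large NFS volume is slow, so the existing tree-scan reports are likely the better input where they suffice.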


