Feature #24491

Multi-Pronged Approach to Pilot Log Anonymization

Added by Marco Mambelli about 1 month ago.

Status: New
Priority: Normal
Assignee:
Category: GlideinMonitor
Target version:
Start date: 05/28/2020
Due date:
% Done: 0%
Estimated time:
Stakeholders:
Duration:

Description

GlideinMonitor is an application for viewing Glidein logs:
https://docs.google.com/document/d/1PqmQ_-_JtqK472DT-Y0D08yU1m-lPeFvrp8n4x9f5QE/edit?usp=sharing
https://github.com/glideinWMS/glideinmonitor

A problem with the current monitor is that the logs may contain sensitive information.
Recognition of sensitive information and its subsequent anonymization should be added.
The anonymization should support both reversible and irreversible modes and combine different techniques, such as Named Entity Recognition and regular-expression-based pattern matching.
GlideinMonitor's code includes an API for plug-ins that filter the logs; it can be used as is or modified.
The anonymization software/plug-in can be configured via config files or a Web GUI.
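
To make the regular-expression part of the proposal concrete, here is a minimal sketch of a log filter with a reversible and an irreversible mode. Everything in it is an assumption made for illustration: the `Anonymizer` class, the `PATTERNS` table, and the token format are hypothetical, and it does not use GlideinMonitor's actual plug-in API. Named Entity Recognition is not shown. Matches are replaced by keyed-hash tokens; in reversible mode a lookup table is kept so the original values can be restored.

```python
import hashlib
import hmac
import re

# Illustrative patterns only (assumption); a real filter would need a vetted list.
PATTERNS = {
    "IPV4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


class Anonymizer:
    """Hypothetical helper: replaces sensitive matches with stable tokens."""

    def __init__(self, secret: bytes, reversible: bool = True):
        self.secret = secret
        self.reversible = reversible
        self.lookup = {}  # token -> original value, filled only in reversible mode

    def _token(self, kind: str, value: str) -> str:
        # A keyed hash gives the same token for the same value, so log lines
        # stay correlatable without exposing the value itself.
        digest = hmac.new(self.secret, value.encode("utf-8"), hashlib.sha256)
        token = f"<{kind}:{digest.hexdigest()[:12]}>"
        if self.reversible:
            self.lookup[token] = value
        return token

    def anonymize(self, text: str) -> str:
        for kind, pattern in PATTERNS.items():
            text = pattern.sub(lambda m, k=kind: self._token(k, m.group(0)), text)
        return text

    def restore(self, text: str) -> str:
        # Only possible in reversible mode; otherwise the lookup is empty
        # and the text is returned unchanged.
        for token, value in self.lookup.items():
            text = text.replace(token, value)
        return text
```

A possible usage: `Anonymizer(b"site-key").anonymize(line)` for logs that operators may need to de-anonymize later, and `Anonymizer(b"site-key", reversible=False)` for logs shared more widely. In the second case no lookup table is stored, so the tokens cannot be mapped back through the filter itself.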

Steps will include:
  • Research and evaluation of the state of the art for log anonymization
  • Code development (Python) and integration with the GlideinWMS and GlideinMonitor frameworks
  • Testing the solution in a distributed computing environment (Grids, Clouds, and Supercomputers) and presenting the results
Some resources:

