Project

General

Profile

Bug #9644

Gratia is reporting to /var/log/messages

Added by Gerard Bernabeu Altayo almost 4 years ago. Updated almost 4 years ago.

Status:
Closed
Priority:
Normal
Start date:
07/16/2015
Due date:
07/23/2015
% Done:

0%

Estimated time:
Spent time:
Duration: 8

Description

Hi,

I just saw that gratia is reporting to /var/log/messages, I guess this is a log redirection missconfiguration.... This needs to be fixed so that the app reports to its own log and leave messages to the system.

[root@gratia-main-osg ~]# tail /var/log/messages
Jul 16 11:48:35 gratia-main-osg dcso__var_log_gratia-service_gratia.log: 16 Jul 2015 16:48:35,001 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 43 / 337 (condor:submit-4.chtc.wisc.edu) saved.
Jul 16 11:48:35 gratia-main-osg dcso__var_log_gratia-service_gratia.log: 16 Jul 2015 16:48:35,020 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 44 / 337 (condor:submit-4.chtc.wisc.edu) saved.
Jul 16 11:48:35 gratia-main-osg dcso__var_log_gratia-service_gratia.log: 16 Jul 2015 16:48:35,046 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 45 / 337 (condor:submit-4.chtc.wisc.edu) saved.
Jul 16 11:48:35 gratia-main-osg dcso__var_log_gratia-service_gratia.log: 16 Jul 2015 16:48:35,070 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 46 / 337 (condor:submit-4.chtc.wisc.edu) saved.
Jul 16 11:48:35 gratia-main-osg dcso__var_log_gratia-service_gratia.log: 16 Jul 2015 16:48:35,091 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 47 / 337 (condor:submit-4.chtc.wisc.edu) saved.
Jul 16 11:48:35 gratia-main-osg dcso__var_log_gratia-service_gratia.log: 16 Jul 2015 16:48:35,110 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 48 / 337 (condor:submit-4.chtc.wisc.edu) saved.
Jul 16 11:48:35 gratia-main-osg dcso__var_log_gratia-service_gratia.log: 16 Jul 2015 16:48:35,135 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 49 / 337 (condor:submit-4.chtc.wisc.edu) saved.
Jul 16 11:48:35 gratia-main-osg dcso__var_log_gratia-service_gratia.log: 16 Jul 2015 16:48:35,159 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 50 / 337 (condor:submit-4.chtc.wisc.edu) saved.
Jul 16 11:48:35 gratia-main-osg dcso__var_log_gratia-service_gratia.log: 16 Jul 2015 16:48:35,184 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 51 / 337 (condor:submit-4.chtc.wisc.edu) saved.
Jul 16 11:48:35 gratia-main-osg dcso_458ce17d2a2c2bb39fdc7bd7cb5286b6: 16 Jul 2015 16:48:35,184 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 51 / 337 (condor:submit-4.chtc.wisc.edu) saved.
[root@gratia-main-osg ~]#

I think that the app is already logging rightfully in it's own log file:

[root@gratia-main-osg ~]# tail /var/log/gratia-service/gratia.log
16 Jul 2015 16:50:38,173 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 169 / 320 (condor:red.unl.edu) saved.
16 Jul 2015 16:50:38,211 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 170 / 320 (condor:red.unl.edu) saved.
16 Jul 2015 16:50:38,236 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 171 / 320 (condor:red.unl.edu) saved.
16 Jul 2015 16:50:38,263 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 172 / 320 (condor:red.unl.edu) saved.
16 Jul 2015 16:50:38,288 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 173 / 320 (condor:red.unl.edu) saved.
16 Jul 2015 16:50:38,314 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 174 / 320 (condor:red.unl.edu) saved.
16 Jul 2015 16:50:38,335 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 175 / 320 (condor:red.unl.edu) saved.
16 Jul 2015 16:50:38,360 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 176 / 320 (condor:red.unl.edu) saved.
16 Jul 2015 16:50:38,390 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 177 / 320 (condor:red.unl.edu) saved.
16 Jul 2015 16:50:38,411 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 178 / 320 (condor:red.unl.edu) saved.
[root@gratia-main-osg ~]#

My bet is that rsyslog is misbehaving, and it makes sense with the config files I see in there:

[root@gratia-main-osg ~]# ls ltrah /etc/rsyslog.d/
total 52K
-rw-r--r-
1 root root 132 May 1 09:45 00-puppet.conf
rw-r--r- 1 root root 510 May 1 09:46 000-use-clogger.conf
rw-r--r- 1 root root 1.6K May 20 10:12 client.conf
rw-r--r- 1 root root 790 May 28 10:55 dcso_f1db4c138fd6034813539cb716232478.conf
rw-r--r- 1 root root 790 May 28 10:55 dcso_a50a002a1ec010aa83b29b04352abc01.conf
rw-r--r- 1 root root 800 May 28 10:55 dcso_458ce17d2a2c2bb39fdc7bd7cb5286b6.conf
rw-r--r- 1 root root 775 Jun 4 10:02 dcso__var_log_tomcat6_catalina.out.conf
rw-r--r- 1 root root 775 Jun 4 10:02 dcso__var_log_tomcat6_catalina.err.conf
rw-r--r- 1 root root 810 Jun 4 10:02 dcso__var_log_gratia-service_gratia.log.conf
drwxr-xr-x. 2 root root 4.0K Jun 4 10:02 .
drwxr-xr-x. 101 root root 12K Jul 16 11:51 ..
[root@gratia-main-osg ~]#

Gerard

History

#1 Updated by Gerard Bernabeu Altayo almost 4 years ago

Looks like the config is duplicated....

[root@gratia-main-osg rsyslog.d]# grep gratia.log dcso_*
dcso_458ce17d2a2c2bb39fdc7bd7cb5286b6.conf:# This file logs the following path: /var/log/gratia-service/gratia.log
dcso_458ce17d2a2c2bb39fdc7bd7cb5286b6.conf:$InputFileName /var/log/gratia-service/gratia.log
dcso__var_log_gratia-service_gratia.log.conf:# RSYSLOG: dcso__var_log_gratia-service_gratia.log.conf file for machine [gratia-main-osg.fnal.gov]
dcso__var_log_gratia-service_gratia.log.conf:# This file logs the following path: /var/log/gratia-service/gratia.log
dcso__var_log_gratia-service_gratia.log.conf:if $syslogtag 'dcso__var_log_gratia-service_gratia.log:' then @clogger.fnal.gov
dcso__var_log_gratia-service_gratia.log.conf:if $syslogtag 'dcso__var_log_gratia-service_gratia.log:' then ~
dcso__var_log_gratia-service_gratia.log.conf:$InputFileName /var/log/gratia-service/gratia.log
dcso__var_log_gratia-service_gratia.log.conf:$InputFileTag dcso__var_log_gratia-service_gratia.log:
dcso__var_log_gratia-service_gratia.log.conf:$InputFileStateFile dcso__var_log_gratia-service_gratia.log.state
[root@gratia-main-osg rsyslog.d]#

I'll rely on puppet to redo this:

[root@gratia-main-osg rsyslog.d]# puppet agent -t
Warning: Local environment: "production" doesn't match server specified node environment "itb", switching agent to "itb".
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts in /var/lib/puppet/lib/facter/rsyslog_version.rb
Info: Loading facts in /var/lib/puppet/lib/facter/iptables_persistent_version.rb
Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
Info: Loading facts in /var/lib/puppet/lib/facter/concat_basedir.rb
Info: Loading facts in /var/lib/puppet/lib/facter/facter_dot_d.rb
Info: Loading facts in /var/lib/puppet/lib/facter/condorceversion.rb
Info: Loading facts in /var/lib/puppet/lib/facter/iptables_version.rb
Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
Info: Loading facts in /var/lib/puppet/lib/facter/ip6tables_version.rb
Info: Loading facts in /var/lib/puppet/lib/facter/sudoversion.rb
Info: Caching catalog for gratia-main-osg.fnal.gov
Info: Applying configuration version '1437065643'
date
Notice: /Stage[main]/S_gratia::Logging/S_gratia::Def_monitor[/var/log/tomcat6/catalina.out]/File[Creating Rsyslog Configuration file for: /var/log/tomcat6/catalina.out within dcso__var_log_tomcat6_catalina.out.conf]/ensure: defined content as '{md5}970c2affae0ac863ada9544a6af9f3eb'
Info: /Stage[main]/S_gratia::Logging/S_gratia::Def_monitor[/var/log/tomcat6/catalina.out]/File[Creating Rsyslog Configuration file for: /var/log/tomcat6/catalina.out within dcso__var_log_tomcat6_catalina.out.conf]: Scheduling refresh of Service[rsyslog]
Notice: /Stage[main]/S_gratia::Logging/S_gratia::Def_monitor[/var/log/tomcat6/catalina.err]/File[Creating Rsyslog Configuration file for: /var/log/tomcat6/catalina.err within dcso__var_log_tomcat6_catalina.err.conf]/ensure: defined content as '{md5}3741bd9fa1cf1dc8d87c9b9c1f741b4a'
Info: /Stage[main]/S_gratia::Logging/S_gratia::Def_monitor[/var/log/tomcat6/catalina.err]/File[Creating Rsyslog Configuration file for: /var/log/tomcat6/catalina.err within dcso__var_log_tomcat6_catalina.err.conf]: Scheduling refresh of Service[rsyslog]
Notice: /Stage[main]/S_gratia::Logging/S_gratia::Def_monitor[/var/log/gratia-service/gratia.log]/File[Creating Rsyslog Configuration file for: /var/log/gratia-service/gratia.log within dcso__var_log_gratia-service_gratia.log.conf]/ensure: defined content as '{md5}c644ce26155e485bfb534b548d390181'
Info: /Stage[main]/S_gratia::Logging/S_gratia::Def_monitor[/var/log/gratia-service/gratia.log]/File[Creating Rsyslog Configuration file for: /var/log/gratia-service/gratia.log within dcso__var_log_gratia-service_gratia.log.conf]: Scheduling refresh of Service[rsyslog]
Notice: /Stage[main]/Rsyslog::Service/Service[rsyslog]: Triggered 'refresh' from 3 events
Notice: Finished catalog run in 22.83 seconds
[root@gratia-main-osg rsyslog.d]# date
Thu Jul 16 11:54:41 CDT 2015
[root@gratia-main-osg rsyslog.d]#

It's still bad:

[root@gratia-main-osg rsyslog.d]# tail /var/log/messages
Jul 16 11:55:09 gratia-main-osg dcso__var_log_gratia-service_gratia.log: 16 Jul 2015 16:55:09,508 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 213 / 319 (condor:gridgk05.racf.bnl.gov) saved.
Jul 16 11:55:09 gratia-main-osg dcso__var_log_gratia-service_gratia.log: 16 Jul 2015 16:55:09,530 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 214 / 319 (condor:gridgk05.racf.bnl.gov) saved.
Jul 16 11:55:09 gratia-main-osg dcso__var_log_gratia-service_gratia.log: 16 Jul 2015 16:55:09,557 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 215 / 319 (condor:gridgk05.racf.bnl.gov) saved.
Jul 16 11:55:09 gratia-main-osg dcso__var_log_gratia-service_gratia.log: 16 Jul 2015 16:55:09,580 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 216 / 319 (condor:gridgk05.racf.bnl.gov) saved.
Jul 16 11:55:09 gratia-main-osg dcso__var_log_gratia-service_gratia.log: 16 Jul 2015 16:55:09,668 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 217 / 319 (condor:gridgk05.racf.bnl.gov) saved.
Jul 16 11:55:09 gratia-main-osg dcso__var_log_gratia-service_gratia.log: 16 Jul 2015 16:55:09,694 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 218 / 319 (condor:gridgk05.racf.bnl.gov) saved.
Jul 16 11:55:09 gratia-main-osg dcso__var_log_gratia-service_gratia.log: 16 Jul 2015 16:55:09,723 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 219 / 319 (condor:gridgk05.racf.bnl.gov) saved.
Jul 16 11:55:09 gratia-main-osg dcso__var_log_gratia-service_gratia.log: 16 Jul 2015 16:55:09,751 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 220 / 319 (condor:gridgk05.racf.bnl.gov) saved.
Jul 16 11:55:09 gratia-main-osg dcso__var_log_gratia-service_gratia.log: 16 Jul 2015 16:55:09,777 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 221 / 319 (condor:gridgk05.racf.bnl.gov) saved.
Jul 16 11:55:09 gratia-main-osg dcso__var_log_gratia-service_gratia.log: 16 Jul 2015 16:55:09,797 gratia.service(Thread-326) [FINE]: RecordProcessor: 0: Replication JobUsageRecord 222 / 319 (condor:gridgk05.racf.bnl.gov) saved.
[root@gratia-main-osg rsyslog.d]#

But now we have a single entry at least:

[root@gratia-main-osg rsyslog.d]# ll
total 24
rw-r--r- 1 root root 510 May 1 09:46 000-use-clogger.conf
rw-r--r- 1 root root 132 May 1 09:45 00-puppet.conf
rw-r--r- 1 root root 1546 May 20 10:12 client.conf
rw-r--r- 1 root root 810 Jul 16 11:54 dcso__var_log_gratia-service_gratia.log.conf
rw-r--r- 1 root root 775 Jul 16 11:54 dcso__var_log_tomcat6_catalina.err.conf
rw-r--r- 1 root root 775 Jul 16 11:54 dcso__var_log_tomcat6_catalina.out.conf
[root@gratia-main-osg rsyslog.d]#

#2 Updated by Gerard Bernabeu Altayo almost 4 years ago

  • Due date set to 07/23/2015
  • Assignee changed from Gerard Bernabeu Altayo to Nicholas Peregonow

Nick please take a look at this, I suspect that we're either calling the rsyslog with wrong params or the rsyslog module is buggy.

Once the bug/config issue is fixed you'll have to go to all gratia collectors, wipe out their rsyslog config and make sure only the proper one is in there, I guess most will have the duplicated entries like this one did.

Gerard

#3 Updated by Nicholas Peregonow almost 4 years ago

After looking into this, this is due to rsyslog needing to be able to forward gratia logs to clogger. From my understanding, rsyslog will automatically add the log files to syslog, and then forward them off. This does make parsing syslog for error more difficult, as the logs will be filled with messages that do not pertain to the system.

However, I'm not sure of a better solution for getting java log files to clogger.

#4 Updated by Nicholas Peregonow almost 4 years ago

  • Status changed from New to Resolved

This has been resolved, and configurations are being pushed out to the collectors now.

Renamed log file conf files in the puppet define file, this allows to set priority for the rsyslog files

[root@gratia-main-osg rsyslog.d]# ls la
total 40
drwxr-xr-x. 2 root root 4096 Jul 16 15:22 .
drwxr-xr-x. 101 root root 12288 Jul 16 14:32 ..
-rw-r--r-
1 root root 510 May 1 09:46 000-use-clogger.conf
rw-r--r- 1 root root 132 May 1 09:45 00-puppet.conf
rw-r--r- 1 root root 747 Jul 16 15:22 01_dcso__var_log_gratia-service_gratia.log.conf
rw-r--r- 1 root root 717 Jul 16 15:22 01_dcso__var_log_tomcat6_catalina.err.conf
rw-r--r- 1 root root 717 Jul 16 15:22 01_dcso__var_log_tomcat6_catalina.out.conf
rw-r--r- 1 root root 1546 May 20 10:12 client.conf
[root@gratia-main-osg rsyslog.d]#

This now processors the tocmat log files first, and send them to clogger, afterwhich, tomcat files are no longer processed.

Template generates conf files that now look like this

$ModLoad imfile # Load the imfile input module

if $syslogtag == 'dcso__var_log_gratia-service_gratia.log:' then @clogger.fnal.gov
& ~

$InputFileName /var/log/gratia-service/gratia.log
$InputFileTag dcso__var_log_gratia-service_gratia.log:
$InputFileStateFile dcso__var_log_gratia-service_gratia.log.state
$InputFileSeverity info
$InputFileFacility local6
$InputRunFileMonitor
$InputFilePollInterval 10

Testing if /var/log/messages gets flooded on gratia-main-osg

Jul 16 15:22:17 gratia-main-osg rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="14560" x-info="http://www.rsyslog.com"] exiting on signal 15.
Jul 16 15:22:18 gratia-main-osg kernel: imklog 5.8.10, log source = /proc/kmsg started.
Jul 16 15:22:18 gratia-main-osg rsyslogd: [origin software="rsyslogd" swVersion="5.8.10" x-pid="18986" x-info="http://www.rsyslog.com"] start

and we don't see any gratia messages

#5 Updated by Nicholas Peregonow almost 4 years ago

  • Status changed from Resolved to Closed


Also available in: Atom PDF