Necessary Maintenance #11374

release cmssrv158 (old SL5 PhEDEx dev server)

Added by Natalia Ratnikova almost 4 years ago. Updated almost 4 years ago.

Status: Resolved
Priority: Normal
Start date: 01/08/2016
Due date:
% Done: 0%
Estimated time:
Spent time:
Scope: Internal
Experiment: -
Stakeholders:
Duration:

Description

Development was moved to cmsdev33 long ago.
The node can be released to the pool of free servers.

History

#1 Updated by Natalia Ratnikova almost 4 years ago

The node is still installed as SL5, but it is already configured as SL6 in cobbler:

[root@cmssrv201 ~]# cobbler-wrapper report cmssrv158
cmssrv158
Hostname cmssrv158.fnal.gov
Profile slf6-x86_64
Gateway 131.225.207.200
Interface eth1 (static)
IP 131.225.204.165
MAC 78:2B:CB:3F:90:F2
Gateway 131.225.207.200
Netmask 255.255.252.0
MTU
Meta disk=default
Comment
Status development
Netboot Enabled? False

[root@cmssrv158 ~]# cat /etc/redhat-release
Scientific Linux SL release 5.4 (Boron)
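
The mismatch above (cobbler profile for SL6, installed release 5.4) can be checked mechanically. The following is only a minimal sketch: the function names are hypothetical, and it assumes the profile name starts with a distribution tag followed by the major version, as in `slf6-x86_64`.

```python
import re


def installed_major(release_line):
    """Return the major version from an /etc/redhat-release line, or None."""
    m = re.search(r"release\s+(\d+)", release_line)
    return int(m.group(1)) if m else None


def profile_major(profile):
    """Return the major version from a profile name such as 'slf6-x86_64', or None.

    Assumption: the profile starts with letters followed by the major version.
    """
    m = re.match(r"[A-Za-z]+(\d+)", profile)
    return int(m.group(1)) if m else None


def needs_reinstall(release_line, profile):
    """True when both versions are known and they differ."""
    installed = installed_major(release_line)
    wanted = profile_major(profile)
    return installed is not None and wanted is not None and installed != wanted


# Values copied from the console output in this comment.
release = "Scientific Linux SL release 5.4 (Boron)"
profile = "slf6-x86_64"
print(installed_major(release), profile_major(profile), needs_reinstall(release, profile))
```

With the values from this ticket the two major versions differ, so the sketch reports that a reinstall is needed.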

#2 Updated by Natalia Ratnikova almost 4 years ago

[root@cmsadmin1 ~]# cms-shoot cmssrv158
removing host from rocks on cmsrocks51, if necessary
cmsstor24.fnal.gov: no host cmssrv158 to remove
Connection to cmsrocks51 closed.
removing host from rocks on cmsrocks52, if necessary
cmssrv26.fnal.gov: no host cmssrv158 to remove
Connection to cmsrocks52 closed.
stopping puppet on cmssrv158, if applicable
bash: puppet: command not found
telling host to netboot on next boot
cmssrv158: netboot -> True
set 1 hosts to boot
1 system(s) updated
telling cmspuppetca to remove host's cert, if present
telling cmspuppetca to update autosign information
when you're ready to start, run:
cmspower-powerit --action cycle --comment 'reinstalling' cmssrv158
don't forget to disable zabbix monitoring if applicable

[root@cmsconsole ~]# cmspower-cons cmssrv158
Connecting to node cmssrv158: ssh -x -t root:ttyS11@fcc-2-1572
[root@cmsconsole ~]# cmspower-powerit --action cycle --comment 'reinstalling cmssrv158 as SL6. NR' cmssrv158
=== cmssrv158 ===
connecting to APC APCCMS1572-1, outlet 3
Outlet state: OFF
connecting to APC APCCMS1572-1, outlet 3
Outlet state: ON
[root@cmsconsole ~]#

#3 Updated by Natalia Ratnikova almost 4 years ago

The node did not start properly after the reinstall (clock skew?).
Rebooted again; console login now works, but puppet is not running.

Added node to the ENC:

$ cat hosts/cmssrv158.fnal.gov.yaml
classes:
  role::unconfigured:
parameters:
  checkmk_extra:
    - unmonitored

Rebooted again from the command line, then started puppet by hand because it was not running. This made things worse:

[root@cmsadmin1 ~]# ssh cmssrv158
ssh: connect to host cmssrv158 port 22: No route to host

[root@cmsadmin1 ~]# ping cmssrv158
PING cmssrv158.fnal.gov (131.225.204.165) 56(84) bytes of data.
From cmssrv158.fnal.gov (131.225.204.165) icmp_seq=1 Destination Host Prohibited
From cmssrv158.fnal.gov (131.225.204.165) icmp_seq=2 Destination Host Prohibited
From cmssrv158.fnal.gov (131.225.204.165) icmp_seq=3 Destination Host Prohibited
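
For context, "Destination Host Prohibited" is how ping reports an ICMP destination-unreachable message with code 10 (host administratively prohibited). Because the replies come from the target address itself, the node is up and is rejecting traffic, which on RHEL-family systems is commonly the effect of an iptables REJECT rule; the ticket does not confirm that this was the cause here. A minimal sketch for spotting this pattern in ping output follows; the function names are hypothetical.

```python
import re

PROHIBITED = re.compile(
    r"^From (\S+) \(([\d.]+)\) icmp_seq=(\d+) Destination Host Prohibited"
)


def prohibited_replies(ping_output):
    """Return (sender_ip, icmp_seq) for each 'Destination Host Prohibited' line."""
    hits = []
    for line in ping_output.splitlines():
        m = PROHIBITED.match(line.strip())
        if m:
            hits.append((m.group(2), int(m.group(3))))
    return hits


def rejected_by_target(ping_output):
    """True if every rejection was sent by the pinged address itself."""
    header = re.search(r"^PING \S+ \(([\d.]+)\)", ping_output, re.MULTILINE)
    hits = prohibited_replies(ping_output)
    if not header or not hits:
        return False
    return all(ip == header.group(1) for ip, _ in hits)


# Lines copied from the ping session in this comment.
SAMPLE = "\n".join([
    "PING cmssrv158.fnal.gov (131.225.204.165) 56(84) bytes of data.",
    "From cmssrv158.fnal.gov (131.225.204.165) icmp_seq=1 Destination Host Prohibited",
    "From cmssrv158.fnal.gov (131.225.204.165) icmp_seq=2 Destination Host Prohibited",
    "From cmssrv158.fnal.gov (131.225.204.165) icmp_seq=3 Destination Host Prohibited",
])

print(prohibited_replies(SAMPLE))
print(rejected_by_target(SAMPLE))
```

A result of True from `rejected_by_target` points at filtering on the node rather than at a routing or hardware problem.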

Nothing alarming in the system log.

Reran puppet manually a few times until it settled into the same repeating errors, listed below.

Asked Lisa whether someone in DCSO could take over.

Here are the puppet errors:

[root@cmssrv158 ~]# puppet agent -t
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
Info: Loading facts in /var/lib/puppet/lib/facter/iptables_version.rb
Info: Loading facts in /var/lib/puppet/lib/facter/rsyslog_version.rb
Info: Loading facts in /var/lib/puppet/lib/facter/certificate_facts.rb
Info: Loading facts in /var/lib/puppet/lib/facter/afs_cache_size.rb
Info: Loading facts in /var/lib/puppet/lib/facter/facter_dot_d.rb
Info: Loading facts in /var/lib/puppet/lib/facter/code_server.rb
Info: Loading facts in /var/lib/puppet/lib/facter/concat_basedir.rb
Info: Loading facts in /var/lib/puppet/lib/facter/os_maj_version.rb
Info: Loading facts in /var/lib/puppet/lib/facter/condorceversion.rb
Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
Info: Loading facts in /var/lib/puppet/lib/facter/postgres_default_version.rb
Info: Loading facts in /var/lib/puppet/lib/facter/iptables_persistent_version.rb
Info: Loading facts in /var/lib/puppet/lib/facter/cvmfspartsize.rb
Info: Loading facts in /var/lib/puppet/lib/facter/cvmfsversion.rb
Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
Info: Loading facts in /var/lib/puppet/lib/facter/ip6tables_version.rb
Info: Caching catalog for cmssrv158.fnal.gov
Info: Applying configuration version '1452289101'
Error: /Stage[main]/P_krb5::Keytab/P_secret::File[/etc/krb5.keytab]/File[/etc/krb5.keytab]: Could not evaluate: Could not retrieve information from environment production source(s) puppet:///secrets/master/cmssrv158.fnal.gov/krb5.keytab
Notice: /Stage[main]/P_postfix/File[/etc/postfix/main.cf]/content: 
--- /etc/postfix/main.cf        2014-02-20 05:07:18.000000000 -0600
+++ /tmp/puppet-file20160108-10527-179dk4c-0    2016-01-08 15:40:54.737734503 -0600
@@ -1,676 +1,24 @@
-# Global Postfix configuration file. This file lists only a subset
-# of all parameters. For the syntax, and for a complete parameter
-# list, see the postconf(5) manual page (command: "man 5 postconf").
+# /etc/default/main.cf -- Server postfix configuration file.
 #
-# For common configuration examples, see BASIC_CONFIGURATION_README

[... lots of postfix output ... ]

Error: Could not back up /etc/postfix/main.cf: Got passed new contents for sum {md5}49b648101b0e361231a977aa89e0dd60
Error: Could not back up /etc/postfix/main.cf: Got passed new contents for sum {md5}49b648101b0e361231a977aa89e0dd60
Error: /Stage[main]/P_postfix/File[/etc/postfix/main.cf]/content: change from {md5}49b648101b0e361231a977aa89e0dd60 to {md5}293a300026b12419b1f4cd52692ee7e6 failed: Could not back up /etc/postfix/main.cf: Got passed new contents for sum {md5}49b648101b0e361231a977aa89e0dd60
Notice: /Stage[main]/P_postfix/Exec[postmap recipients]: Dependency File[/etc/postfix/main.cf] has failures: true
Warning: /Stage[main]/P_postfix/Exec[postmap recipients]: Skipping because of failed dependencies
Notice: /Stage[main]/P_postfix/File[/etc/postfix/master.cf]: Dependency File[/etc/postfix/main.cf] has failures: true
Warning: /Stage[main]/P_postfix/File[/etc/postfix/master.cf]: Skipping because of failed dependencies
Notice: /Stage[main]/P_postfix/Service[postfix]: Dependency File[/etc/postfix/main.cf] has failures: true
Warning: /Stage[main]/P_postfix/Service[postfix]: Skipping because of failed dependencies
Notice: /Stage[main]/P_postfix/Exec[postfix reload]: Dependency File[/etc/postfix/main.cf] has failures: true
Warning: /Stage[main]/P_postfix/Exec[postfix reload]: Skipping because of failed dependencies
Notice: /Stage[main]/P_postfix/P_postfix::Map[/etc/postfix/transport]/Exec[postmap hash:/etc/postfix/transport]: Dependency File[/etc/postfix/main.cf] has failures: true
Warning: /Stage[main]/P_postfix/P_postfix::Map[/etc/postfix/transport]/Exec[postmap hash:/etc/postfix/transport]: Skipping because of failed dependencies
Notice: /Stage[main]/P_postfix/Exec[postmap senders]: Dependency File[/etc/postfix/main.cf] has failures: true
Warning: /Stage[main]/P_postfix/Exec[postmap senders]: Skipping because of failed dependencies
Notice: Finished catalog run in 14.90 seconds
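
Most of the output above is fallout: only two resources actually fail (the keytab file and `/etc/postfix/main.cf`), and the rest are skipped because they depend on `main.cf`. A minimal sketch that separates failed resources from skipped dependents is shown below. The `summarize` helper is hypothetical, and it assumes the line formats seen in this run, which may differ in other puppet versions.

```python
import re

# "Error: <resource path>: <message>"; resource paths start with "/".
FAILED = re.compile(r"^Error: (/.+?): (.+)$")
# "Warning: <resource path>: Skipping because of failed dependencies"
SKIPPED = re.compile(r"^Warning: (/.+?): Skipping because of failed dependencies$")


def summarize(log_text):
    """Return (failed, skipped): failed maps resource -> message, skipped is a list."""
    failed, skipped = {}, []
    for line in log_text.splitlines():
        line = line.strip()
        m = FAILED.match(line)
        if m:
            failed[m.group(1)] = m.group(2)
            continue
        m = SKIPPED.match(line)
        if m:
            skipped.append(m.group(1))
    return failed, skipped


# A few lines copied verbatim from the run above.
LOG = "\n".join([
    "Error: /Stage[main]/P_krb5::Keytab/P_secret::File[/etc/krb5.keytab]/File[/etc/krb5.keytab]: Could not evaluate: Could not retrieve information from environment production source(s) puppet:///secrets/master/cmssrv158.fnal.gov/krb5.keytab",
    "Error: Could not back up /etc/postfix/main.cf: Got passed new contents for sum {md5}49b648101b0e361231a977aa89e0dd60",
    "Error: /Stage[main]/P_postfix/File[/etc/postfix/main.cf]/content: change from {md5}49b648101b0e361231a977aa89e0dd60 to {md5}293a300026b12419b1f4cd52692ee7e6 failed: Could not back up /etc/postfix/main.cf: Got passed new contents for sum {md5}49b648101b0e361231a977aa89e0dd60",
    "Warning: /Stage[main]/P_postfix/Exec[postmap recipients]: Skipping because of failed dependencies",
    "Warning: /Stage[main]/P_postfix/Service[postfix]: Skipping because of failed dependencies",
])

failed, skipped = summarize(LOG)
for resource in failed:
    print("FAILED ", resource)
for resource in skipped:
    print("SKIPPED", resource)
```

Reading the log this way makes it easier to see that the keytab source and the `main.cf` backup are the two items to chase, and that the skipped resources should clear once `main.cf` applies.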

#4 Updated by Natalia Ratnikova almost 4 years ago

  • Status changed from Assigned to Resolved

Krista generated a new keytab and finished the reinstall.


