Support #8828

Unresponsive pools - Provide instructions for the Primary

Added by Natalia Ratnikova over 4 years ago. Updated about 4 years ago.

Status: Resolved
Priority: Normal
Start date: 05/16/2015
Due date:
% Done: 0%
Estimated time: 2.00 h
Spent time:
component: base
Stakeholders:
Co-Assignees:
Duration:

Description

Several pools raised alarms and became unresponsive in the last week,
for different reasons.

Affected nodes, incidents/tasks in SNOW and times of incident reported:

cmsstor369 -> RITM0208696 REQ000000230111 2015-05-06 16:35:44

cmsstor339 -> INC000000544557 2015-05-13 15:32:04

cmsstor374 -> INC000000545707 Fri, 15 May 2015 07:59:40 -0500

As a follow-up:
- Review the incidents and actions taken with the dCache support team
- Determine what can be done to avoid or reduce the frequency of the incidents
- Identify and classify the causes and their impact on the system
- Review the monitoring setup and the instructions for the on-call Primary.

History

#1 Updated by Gerard Bernabeu Altayo over 4 years ago

I reviewed the tickets:

cmsstor369 -> RITM0208696 REQ000000230111 2015-05-06 16:35:44

- Saturated network, locally (2x1Gbps)

cmsstor339 -> INC000000544557 2015-05-13 15:32:04

- Unknown issue, potentially a saturated network at the router level

cmsstor374 -> INC000000545707 Fri, 15 May 2015 07:59:40 -0500

- The pool had network issues on May 13th, and many errors start showing up in the logs from then on. I think the incident was probably a build-up of that network issue and was resolved by the daemon restart.

Now, focusing on what matters here: the procedure for the Primary.

I'd like to slightly change ALL of our Storage/Data Management service procedures so that, if one of our servers misbehaves and the Primary does not know what to do, the procedure defaults to:

1. Run the script on https://github.com/onnozweers/dcache-scripts/blob/master/dcache-collect-debug-info.sh
2. Reboot the system
3. If service does not come up: escalate issue.

To apply this procedure we need to make sure this script is deployed everywhere (by puppet and probably with a different name) and make it more generic so that it's useful for dCache, EOS, etc. All this, without making it too complicated...
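The three-step default above could be wrapped roughly as follows. This is only a sketch under stated assumptions: COLLECT_CMD, DRY_RUN, run_step and primary_default_procedure are hypothetical names, the collection script is assumed to be installed on the node under the name held in COLLECT_CMD, and continuing to the reboot when collection fails is an illustrative choice, not something decided in this ticket.

```shell
#!/bin/bash
# Sketch of the proposed default procedure for the Primary.
# Assumptions (not from the ticket): COLLECT_CMD, DRY_RUN and the function
# names are hypothetical; the collection script is assumed to be installed
# on the node under the name given in COLLECT_CMD.

COLLECT_CMD="${COLLECT_CMD:-dcache-collect-debug-info}"
DRY_RUN="${DRY_RUN:-1}"   # default: only print the steps, change nothing

# Run a command, or just print it when DRY_RUN=1.
run_step() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Steps 1 and 2: collect debug information, then reboot.
# Step 3 (escalate if the service does not come up) stays a manual decision
# taken once the node is back.
primary_default_procedure() {
    # Illustrative choice: a failed collection does not block the reboot.
    run_step "$COLLECT_CMD" || echo "warning: debug collection failed" >&2
    run_step /sbin/reboot
    echo "after reboot: if the service does not come up, escalate the issue"
}
```

With the default DRY_RUN=1, calling primary_default_procedure only prints the commands it would run; setting DRY_RUN=0 as root would actually execute them.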

Gerard should verify the procedure once it's tested.

#2 Updated by Gerard Bernabeu Altayo over 4 years ago

  • Assignee set to Natalia Ratnikova

#3 Updated by Natalia Ratnikova over 4 years ago

  • Status changed from New to Assigned

As of a week ago, investigating and taking care of the dCache nodes is no longer the on-call Primary's responsibility; it should be done by the service maintainer(s). Whether the Primary or all dCache maintainers should be doing this is still under discussion; see the email thread from Krista (current Primary), Lisa, et al.

The proposed procedure in this ticket is still relevant.

#4 Updated by Natalia Ratnikova over 4 years ago

  • Estimated time set to 2.00 h

Created personal github repo:

https://github.com/nataliaratnikova/dcache-scripts

Keeping it on github will allow for easy merge of eventual contributions from the source repo.

#5 Updated by Natalia Ratnikova over 4 years ago

Test area: root@cmsstor151:root/dcache-scripts

ssh root@cmsstor151
git clone https://github.com/nataliaratnikova/dcache-scripts
cd dcache-scripts

[root@cmsstor151 ~]# df -h /var/log/
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 213G 20G 182G 10% /
[root@cmsstor151 ~]# time bash dcache-scripts/dcache-collect-debug-info.sh
+ mkdir -p /var/log/dcache-debug
+ rm -f '/var/log/dcache-debug/*'
+ HOW_MANY_THREAD_DUMPS=10
+ seq 1 10
+ for i in '`seq 1 $HOW_MANY_THREAD_DUMPS`'
+ /usr/bin/dcache dump threads
Stack traces for w-cmsstor151-disk_itb-disk1Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk1Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk2Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk2Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk3Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk3Domain.log.
Stack traces for gridftp-cmsstor151Domain have been written to
/var/log/dcache/gridftp-cmsstor151Domain.log.
+ hostname -s
'[' cmsstor151 srm ']'
+ hostname -s
'[' cmsstor151 namespace ']'
+ sleep 5
+ for i in '`seq 1 $HOW_MANY_THREAD_DUMPS`'
+ /usr/bin/dcache dump threads
Stack traces for w-cmsstor151-disk_itb-disk1Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk1Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk2Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk2Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk3Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk3Domain.log.
Stack traces for gridftp-cmsstor151Domain have been written to
/var/log/dcache/gridftp-cmsstor151Domain.log.
+ hostname -s
'[' cmsstor151 srm ']'
+ hostname -s
'[' cmsstor151 namespace ']'
+ sleep 5
+ for i in '`seq 1 $HOW_MANY_THREAD_DUMPS`'
+ /usr/bin/dcache dump threads
Stack traces for w-cmsstor151-disk_itb-disk1Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk1Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk2Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk2Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk3Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk3Domain.log.
Stack traces for gridftp-cmsstor151Domain have been written to
/var/log/dcache/gridftp-cmsstor151Domain.log.
+ hostname -s
'[' cmsstor151 srm ']'
+ hostname -s
'[' cmsstor151 namespace ']'
+ sleep 5
+ for i in '`seq 1 $HOW_MANY_THREAD_DUMPS`'
+ /usr/bin/dcache dump threads
Stack traces for w-cmsstor151-disk_itb-disk1Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk1Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk2Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk2Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk3Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk3Domain.log.
Stack traces for gridftp-cmsstor151Domain have been written to
/var/log/dcache/gridftp-cmsstor151Domain.log.
+ hostname -s
'[' cmsstor151 srm ']'
+ hostname -s
'[' cmsstor151 namespace ']'
+ sleep 5
+ for i in '`seq 1 $HOW_MANY_THREAD_DUMPS`'
+ /usr/bin/dcache dump threads
Stack traces for w-cmsstor151-disk_itb-disk1Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk1Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk2Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk2Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk3Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk3Domain.log.
Stack traces for gridftp-cmsstor151Domain have been written to
/var/log/dcache/gridftp-cmsstor151Domain.log.
+ hostname -s
'[' cmsstor151 srm ']'
+ hostname -s
'[' cmsstor151 namespace ']'
+ sleep 5
+ for i in '`seq 1 $HOW_MANY_THREAD_DUMPS`'
+ /usr/bin/dcache dump threads
Stack traces for w-cmsstor151-disk_itb-disk1Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk1Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk2Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk2Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk3Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk3Domain.log.
Stack traces for gridftp-cmsstor151Domain have been written to
/var/log/dcache/gridftp-cmsstor151Domain.log.
+ hostname -s
'[' cmsstor151 srm ']'
+ hostname -s
'[' cmsstor151 namespace ']'
+ sleep 5
+ for i in '`seq 1 $HOW_MANY_THREAD_DUMPS`'
+ /usr/bin/dcache dump threads
Stack traces for w-cmsstor151-disk_itb-disk1Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk1Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk2Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk2Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk3Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk3Domain.log.
Stack traces for gridftp-cmsstor151Domain have been written to
/var/log/dcache/gridftp-cmsstor151Domain.log.
+ hostname -s
'[' cmsstor151 srm ']'
+ hostname -s
'[' cmsstor151 namespace ']'
+ sleep 5
+ for i in '`seq 1 $HOW_MANY_THREAD_DUMPS`'
+ /usr/bin/dcache dump threads
Stack traces for w-cmsstor151-disk_itb-disk1Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk1Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk2Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk2Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk3Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk3Domain.log.
Stack traces for gridftp-cmsstor151Domain have been written to
/var/log/dcache/gridftp-cmsstor151Domain.log.
+ hostname -s
'[' cmsstor151 srm ']'
+ hostname -s
'[' cmsstor151 namespace ']'
+ sleep 5
+ for i in '`seq 1 $HOW_MANY_THREAD_DUMPS`'
+ /usr/bin/dcache dump threads
Stack traces for w-cmsstor151-disk_itb-disk1Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk1Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk2Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk2Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk3Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk3Domain.log.
Stack traces for gridftp-cmsstor151Domain have been written to
/var/log/dcache/gridftp-cmsstor151Domain.log.
+ hostname -s
'[' cmsstor151 srm ']'
+ hostname -s
'[' cmsstor151 namespace ']'
+ sleep 5
+ for i in '`seq 1 $HOW_MANY_THREAD_DUMPS`'
+ /usr/bin/dcache dump threads
Stack traces for w-cmsstor151-disk_itb-disk1Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk1Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk2Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk2Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk3Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk3Domain.log.
Stack traces for gridftp-cmsstor151Domain have been written to
/var/log/dcache/gridftp-cmsstor151Domain.log.
+ hostname -s
'[' cmsstor151 srm ']'
+ hostname -s
'[' cmsstor151 namespace ']'
+ sleep 5
+ HOW_MANY_LINES=200000
+ for file in '/var/log/*Domain.log'
+ basename '/var/log/*Domain.log'
basename='*Domain.log'
+ tail -n 200000 '/var/log/*Domain.log'
tail: cannot open `/var/log/*Domain.log' for reading: No such file or directory
+ /usr/bin/dcache status
+ grep Domain
+ awk '{print $1}'
+ for domain in '`/usr/bin/dcache status | grep '\''Domain'\'' | awk '\''{print $1}'\''`'
+ /usr/bin/dcache dump heap w-cmsstor151-disk_itb-disk1Domain /var/log/dcache-debug/dcache-dump-heap-w-cmsstor151-disk_itb-disk1Domain.txt
which: no jmap in (/usr/krb5/sbin:/usr/krb5/bin:/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin)
Could not find the jmap command, part of the Java 6 JDK. This command is
required for producing a heap dump. Please ensure that either jmap is in
the path or update JAVA_HOME.
+ for domain in '`/usr/bin/dcache status | grep '\''Domain'\'' | awk '\''{print $1}'\''`'
+ /usr/bin/dcache dump heap w-cmsstor151-disk_itb-disk2Domain /var/log/dcache-debug/dcache-dump-heap-w-cmsstor151-disk_itb-disk2Domain.txt
which: no jmap in (/usr/krb5/sbin:/usr/krb5/bin:/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin)
Could not find the jmap command, part of the Java 6 JDK. This command is
required for producing a heap dump. Please ensure that either jmap is in
the path or update JAVA_HOME.
+ for domain in '`/usr/bin/dcache status | grep '\''Domain'\'' | awk '\''{print $1}'\''`'
+ /usr/bin/dcache dump heap w-cmsstor151-disk_itb-disk3Domain /var/log/dcache-debug/dcache-dump-heap-w-cmsstor151-disk_itb-disk3Domain.txt
which: no jmap in (/usr/krb5/sbin:/usr/krb5/bin:/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin)
Could not find the jmap command, part of the Java 6 JDK. This command is
required for producing a heap dump. Please ensure that either jmap is in
the path or update JAVA_HOME.
+ for domain in '`/usr/bin/dcache status | grep '\''Domain'\'' | awk '\''{print $1}'\''`'
+ /usr/bin/dcache dump heap gridftp-cmsstor151Domain /var/log/dcache-debug/dcache-dump-heap-gridftp-cmsstor151Domain.txt
which: no jmap in (/usr/krb5/sbin:/usr/krb5/bin:/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin)
Could not find the jmap command, part of the Java 6 JDK. This command is
required for producing a heap dump. Please ensure that either jmap is in
the path or update JAVA_HOME.
+ cp /etc/dcache/dcache.conf /var/log/dcache-debug/
+ top -b -n 1
+ vmstat 1 10
+ lsof
+ netstat -nap
+ ps -efL
+ chmod 644 /var/log/dcache-debug/dcache.conf '/var/log/dcache-debug/*Domain.log-last-200000-lines-including-10-thread-dumps.txt' /var/log/dcache-debug/lsof.txt /var/log/dcache-debug/netstat.txt /var/log/dcache-debug/ps.txt /var/log/dcache-debug/top.txt /var/log/dcache-debug/vmstat.txt
+ set +x
Dumps and log files have been saved in /var/log/dcache-debug.
You can share them like this (preferably as an unprivileged user):
cd /var/log/dcache-debug ; nohup python -m SimpleHTTPServer 22222 &

real 1m18.839s
user 0m33.124s
sys 0m3.226s
[root@cmsstor151 ~]# du -hs /var/log/dcache-debug/
748K /var/log/dcache-debug/
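In the run above, the pattern /var/log/*Domain.log matched nothing, so the literal pattern was passed to tail and the domain logs were not collected. One possible way to guard that loop is sketched below. The function name, its arguments and the output file naming are illustrative assumptions, not the script's actual code; the right log directory for a given node is also left as a parameter.

```shell
#!/bin/bash
# Sketch: save the tail of each domain log, and fail cleanly when the glob
# matches nothing. collect_domain_logs and its arguments are hypothetical.

collect_domain_logs() {
    local log_dir="$1" out_dir="$2" how_many_lines="${3:-200000}"
    local file found=0

    for file in "$log_dir"/*Domain.log; do
        # Without nullglob an unmatched pattern stays literal; -e filters it out.
        [ -e "$file" ] || continue
        found=1
        tail -n "$how_many_lines" "$file" \
            > "$out_dir/$(basename "$file")-last-$how_many_lines-lines.txt"
    done

    if [ "$found" -eq 0 ]; then
        echo "no *Domain.log files found in $log_dir" >&2
        return 1
    fi
}
```

A call such as collect_domain_logs /var/log/dcache /var/log/dcache-debug would then either write one file per domain log or report that nothing matched, instead of printing a tail error.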

#6 Updated by Natalia Ratnikova over 4 years ago

From https://cmsweb.fnal.gov/bin/view/Storage/MeetingMinutes2015-07-21 :

Following up on what we'd like primary to do https://cdcvs.fnal.gov/redmine/issues/8828 . Work in progress now (7/21)

Natalia will get this new script in one of Tim's RPMs that are installed everywhere, then processes need to be updated.

#7 Updated by Natalia Ratnikova over 4 years ago

Talked to Tim S:

- the cms-global rpm is not a good place for this kind of script.
- among the rpms already installed on dcache pool nodes, the cms-disk rpm looks like a suitable place; currently it only provides one file:

[root@cmssrv201 slf6-x86_64]# rpm -qlp cms-diskbox-0-1.el6.noarch.rpm
/usr/lib/ruby/site_ruby/1.8/facter/wwn_target.rb

Otherwise, adding this script in dCache specific rpm would be a good interim solution.

Will talk about this at the next dCache meeting.

#8 Updated by Natalia Ratnikova about 4 years ago

Added the script to the cms-diskbox repo and built the rpm following instructions at
https://cmsweb.fnal.gov/bin/view/Software/Rpm

as natasha@cmsadmin1

Ran into a problem during the gpg resign step. Checked with Gerard: he did not sign the rpms he built with this procedure. Contacting Tim via https://uscms.slack.com/messages/dcso/

Here is the error message for the record:

[natasha@cmsadmin1 cms-diskbox]$ rpm --resign /home/natasha/rpmbuild/slf6-x86_64/cms-diskbox-1.1.1-1.el6.noarch.rpm
Enter pass phrase:
Pass phrase is good.
/home/natasha/rpmbuild/slf6-x86_64/cms-diskbox-1.1.1-1.el6.noarch.rpm:
can't connect to `/home/natasha/.gnupg/S.gpg-agent': No such file or directory
/usr/bin/pinentry: line 22: xprop: command not found
Please install pinentry-gui
gpg-agent[1721]: can't connect server: ec=4.16383
gpg-agent[1721]: can't connect to the PIN entry module: End of file
gpg-agent[1721]: command get_passphrase failed: No pinentry
gpg: problem with the agent: No pinentry
gpg: skipped "US-CMS T1 Signing Key <cms-t1@fnal.gov>": General error
gpg: signing failed: General error
error: gpg exec failed (2)

#9 Updated by Natalia Ratnikova about 4 years ago

Reading through gpg-agent man pages, and
http://www.linuxquestions.org/questions/linux-security-4/gpg-gpg-agent-can't-connect-to-root-gnupg-s-gpg-agent-611843/

Starting the daemon helped to overcome the missing file problem, but still failed to sign:

[natasha@cmsadmin1 ~/work]$ gpg-agent --daemon --enable-ssh-support --write-env-file "/home/natasha/.gnupg/S.gpg-agent" 
setenv GPG_AGENT_INFO /tmp/gpg-ji8boj/S.gpg-agent:7691:1
setenv SSH_AUTH_SOCK /tmp/gpg-tJXaJi/S.gpg-agent.ssh
setenv SSH_AGENT_PID 7691
[natasha@cmsadmin1 ~/work]$ echo $SHELL
/bin/tcsh
[natasha@cmsadmin1 ~/work]$ setenv GPG_AGENT_INFO /tmp/gpg-ji8boj/S.gpg-agent:7691:1
[natasha@cmsadmin1 ~/work]$ setenv SSH_AUTH_SOCK /tmp/gpg-tJXaJi/S.gpg-agent.ssh
[natasha@cmsadmin1 ~/work]$ setenv SSH_AGENT_PID 7691
[natasha@cmsadmin1 ~/work]$ rpm --resign /home/natasha/rpmbuild/slf6-x86_64/cms-diskbox-1.1.1-1.el6.noarch.rpm
Enter pass phrase: 
Pass phrase is good.
/home/natasha/rpmbuild/slf6-x86_64/cms-diskbox-1.1.1-1.el6.noarch.rpm:
gpg: problem with the agent: No pinentry
gpg: skipped "US-CMS T1 Signing Key <cms-t1@fnal.gov>": General error
gpg: signing failed: General error
error: gpg exec failed (2)

#10 Updated by Natalia Ratnikova about 4 years ago

Looking at this with Tim, he found a way to sign properly without involving the gpg-agent, and updated cms-rpmtools accordingly.
I checked that with the new version 1.0.4 of cms-rpmtools make works smoothly to the end, and creates a properly signed rpm:

[natasha@cmsadmin1 cms-diskbox]$ make confirm
rpm -qpi /home/natasha/rpmbuild/slf6-x86_64/cms-diskbox-1.1.1-1.el6.noarch.rpm
Signature   : RSA/SHA1, Thu 17 Sep 2015 12:23:10 PM CDT, Key ID c0ca30e75b9351a6
[natasha@cmsadmin1 cms-diskbox]$ 

The newly built rpm is to be added to uscms-t1 repo.

#11 Updated by Natalia Ratnikova about 4 years ago

Tested and applied some minor fixes: make the script executable, and exit if it can't create the log dir (e.g. due to insufficient privileges). Rebuilt the rpm and updated the yum repo, following instructions at

https://cmsweb.fnal.gov/bin/view/Software/YumRepo
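The "exit if it can't create the log dir" fix could look roughly like the sketch below. ensure_log_dir is a hypothetical helper name and the messages are illustrative; the actual change in the packaged script may differ.

```shell
#!/bin/bash
# Sketch: create the output directory and report failure so the caller can
# stop early. ensure_log_dir is a hypothetical helper, not the script's code.

ensure_log_dir() {
    local dir="$1"
    if ! mkdir -p "$dir" 2>/dev/null; then
        echo "cannot create $dir (insufficient privileges?)" >&2
        return 1
    fi
    # mkdir -p also succeeds on an existing directory, so check writability too.
    if [ ! -w "$dir" ]; then
        echo "$dir is not writable" >&2
        return 1
    fi
}
```

At the top of the collection script this would be used as: ensure_log_dir /var/log/dcache-debug || exit 1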

[root@cmssrv201 slf6-x86_64]# ls -latr  cms-diskbox-*
-rw-r--r-- 1 root root 3212 Sep 18  2014 cms-diskbox-0-1.el6.noarch.rpm
-rw-r--r-- 1 root root 3300 Sep 19  2014 cms-diskbox-0-2.el6.noarch.rpm
-rw-r--r-- 1 root root 3836 Apr 30 13:54 cms-diskbox-1.1.0-1.el6.noarch.rpm
[root@cmssrv201 slf6-x86_64]# pwd
/srv/repo/slf6-x86_64
[root@cmssrv201 slf6-x86_64]# ls -latr  cms-diskbox-*
-rw-r--r-- 1 root root 3212 Sep 18  2014 cms-diskbox-0-1.el6.noarch.rpm
-rw-r--r-- 1 root root 3300 Sep 19  2014 cms-diskbox-0-2.el6.noarch.rpm
-rw-r--r-- 1 root root 3836 Apr 30 13:54 cms-diskbox-1.1.0-1.el6.noarch.rpm
-rw-rw-r-- 1 root root 5284 Sep 17 13:11 cms-diskbox-1.1.1-1.el6.noarch.rpm
[root@cmssrv201 slf6-x86_64]# cd /srv/repo/
[root@cmssrv201 repo]# make uscmst1
task started: 2015-09-17_131726_reposync
task started (id=Reposync, time=Thu Sep 17 13:17:26 2015)
hello, reposync
run, reposync, run!
running: rsync -rltDv --copy-unsafe-links --delete-after  --delete --exclude-from=/etc/cobbler/rsync.exclude /srv/repo/slf6-x86_64/ /var/www/cobbler/repo_mirror/uscmst1-el6-x86_64
received on stdout: building file list ... done
./
cms-diskbox-1.1.1-1.el6.noarch.rpm
repodata/
repodata/repomd.xml
cannot delete non-empty directory: cache
deleting config.repo
deleting repodata/cbcd23bb955542cf355d230e9e308954674df1e1-primary.sqlite.bz2
deleting repodata/9a99b187ca339359d806770d17681ef917cb2350-primary.xml.gz
deleting repodata/9310ab259e970171671567773a295aa4aea8cddc-other.sqlite.bz2
deleting repodata/690422f6582151190a3091d8c357366a1aa3ed78-other.xml.gz
deleting repodata/36c41eeff0dd24560b30e7d7ab4488fd7a622dfc-filelists.sqlite.bz2
deleting repodata/32fd6c64f6e78165e3d416a7c77941f4210ef4b8-filelists.xml.gz

sent 16777 bytes  received 56 bytes  33666.00 bytes/sec
total size is 833042900  speedup is 49488.68

received on stderr: 
running: createrepo  -c cache -s sha /var/www/cobbler/repo_mirror/uscmst1-el6-x86_64
received on stdout: Spawning worker 0 with 236 pkgs
Workers Finished
Gathering worker results

Saving Primary metadata
Saving file lists metadata
Saving other metadata
Generating sqlite DBs
Sqlite DBs complete

received on stderr: 
creating: /var/www/cobbler/repo_mirror/uscmst1-el6-x86_64/config.repo
running: chown -R root:apache /var/www/cobbler/repo_mirror/uscmst1-el6-x86_64
received on stdout: 
received on stderr: 
running: chmod -R 755 /var/www/cobbler/repo_mirror/uscmst1-el6-x86_64
received on stdout: 
received on stderr: 
*** TASK COMPLETE ***
task started: 2015-09-17_131732_reposync
task started (id=Reposync, time=Thu Sep 17 13:17:32 2015)
hello, reposync
run, reposync, run!
running: rsync -rltDv --copy-unsafe-links --delete-after  --delete --exclude-from=/etc/cobbler/rsync.exclude /srv/repo/dcso-el6-x86_64/ /var/www/cobbler/repo_mirror/dcso-el6-x86_64
received on stdout: building file list ... done
./
deleting repodata/repomd.xml
deleting repodata/fd0b92b4a4f2bcbf94a26d2e2515afec122d7a92-filelists.sqlite.bz2
deleting repodata/b59abd624e19a6c3ed224a4dae79d0165f97981c-filelists.xml.gz
deleting repodata/8f856c29abbec12f80c1506c5fcd1a1a7df2cf0c-primary.sqlite.bz2
deleting repodata/54d63538e4ff6f26ec7c4f20ce7445953beed778-other.sqlite.bz2
deleting repodata/46bd9a4f4645e66a2704a31f454b8d0d4ebea68a-other.xml.gz
deleting repodata/21603f408958bdd147ed20b771bfbfd475f2d6fc-primary.xml.gz
deleting repodata/
cannot delete non-empty directory: cache
deleting config.repo

sent 521 bytes  received 15 bytes  1072.00 bytes/sec
total size is 273444  speedup is 510.16

received on stderr: 
running: createrepo  -c cache -s sha /var/www/cobbler/repo_mirror/dcso-el6-x86_64
received on stdout: Spawning worker 0 with 13 pkgs
Workers Finished
Gathering worker results

Saving Primary metadata
Saving file lists metadata
Saving other metadata
Generating sqlite DBs
Sqlite DBs complete

received on stderr: 
creating: /var/www/cobbler/repo_mirror/dcso-el6-x86_64/config.repo
running: chown -R root:apache /var/www/cobbler/repo_mirror/dcso-el6-x86_64
received on stdout: 
received on stderr: 
running: chmod -R 755 /var/www/cobbler/repo_mirror/dcso-el6-x86_64
received on stdout: 
received on stderr: 
*** TASK COMPLETE ***
[root@cmssrv201 repo]# 

#12 Updated by Natalia Ratnikova about 4 years ago

  • Status changed from Assigned to Resolved

Installed and tested on dcache testbed pool node.

[root@cmsstor151 ~]# yum list cms-diskbox
Loaded plugins: priorities, security
8389 packages excluded due to repository priority protections
Installed Packages
cms-diskbox.noarch                     1.1.0-1.el6                     @uscmst1
[root@cmsstor151 ~]# yum clean all
Loaded plugins: priorities, security
Cleaning repos: epel osg puppet puppet-deps slf slf-collections slf-security
              : slf-source slf6x slf6x-security uscmst1
Cleaning up Everything
[root@cmsstor151 ~]# yum list cms-diskbox
Loaded plugins: priorities, security
epel                                                    | 3.8 kB     00:00     
epel/primary_db                                         | 6.4 MB     00:00     
osg                                                     | 2.5 kB     00:00     
osg/primary_db                                          | 407 kB     00:00     
puppet                                                  | 2.5 kB     00:00     
puppet/primary_db                                       |  17 kB     00:00     
puppet-deps                                             | 2.5 kB     00:00     
puppet-deps/primary_db                                  |  17 kB     00:00     
slf                                                     | 3.6 kB     00:00     
slf/primary_db                                          | 4.2 MB     00:00     
slf-collections                                         | 2.9 kB     00:00     
slf-collections/primary_db                              | 1.9 MB     00:00     
slf-security                                            | 2.9 kB     00:00     
slf-security/primary_db                                 | 5.5 MB     00:00     
slf-source                                              | 2.7 kB     00:00     
slf-source/primary_db                                   | 1.9 MB     00:00     
slf6x                                                   | 3.6 kB     00:00     
slf6x/primary_db                                        | 4.2 MB     00:00     
slf6x-security                                          | 2.9 kB     00:00     
slf6x-security/primary_db                               | 5.5 MB     00:00     
uscmst1                                                 | 2.5 kB     00:00     
uscmst1/primary_db                                      | 139 kB     00:00     
8389 packages excluded due to repository priority protections
Installed Packages
cms-diskbox.noarch                     1.1.0-1.el6                     @uscmst1
Available Packages
cms-diskbox.noarch                     1.1.1-1.el6                     uscmst1 
[root@cmsstor151 ~]# yum update cms-diskbox
Loaded plugins: priorities, security
Setting up Update Process
8389 packages excluded due to repository priority protections
Resolving Dependencies
--> Running transaction check
---> Package cms-diskbox.noarch 0:1.1.0-1.el6 will be updated
---> Package cms-diskbox.noarch 0:1.1.1-1.el6 will be an update
--> Finished Dependency Resolution

Dependencies Resolved

===============================================================================
 Package             Arch           Version              Repository       Size
===============================================================================
Updating:
 cms-diskbox         noarch         1.1.1-1.el6          uscmst1         5.2 k

Transaction Summary
===============================================================================
Upgrade       1 Package(s)

Total download size: 5.2 k
Is this ok [y/N]: y
Downloading Packages:
cms-diskbox-1.1.1-1.el6.noarch.rpm                      | 5.2 kB     00:00     
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
  Updating   : cms-diskbox-1.1.1-1.el6.noarch                              1/2 
  Cleanup    : cms-diskbox-1.1.0-1.el6.noarch                              2/2 
  Verifying  : cms-diskbox-1.1.1-1.el6.noarch                              1/2 
  Verifying  : cms-diskbox-1.1.0-1.el6.noarch                              2/2 

Updated:
  cms-diskbox.noarch 0:1.1.1-1.el6                                             

Complete!
[root@cmsstor151 ~]# dcache-
dcache-collect-debug-info  dcache-info-provider
[root@cmsstor151 ~]# dcache-collect-debug-info 
+ rm -f /var/log/dcache-debug/dcache.conf /var/log/dcache-debug/dcache-dump-heap-gridftp-cmsstor151Domain.txt /var/log/dcache-debug/dcache-dump-heap-w-cmsstor151-disk_itb-disk1Domain.txt /var/log/dcache-debug/dcache-dump-heap-w-cmsstor151-disk_itb-disk2Domain.txt /var/log/dcache-debug/dcache-dump-heap-w-cmsstor151-disk_itb-disk3Domain.txt /var/log/dcache-debug/gridftp-cmsstor151Domain.log-last-200000-lines-including-10-thread-dumps.txt /var/log/dcache-debug/lsof.txt /var/log/dcache-debug/netstat.txt /var/log/dcache-debug/ps.txt /var/log/dcache-debug/top.txt /var/log/dcache-debug/vmstat.txt /var/log/dcache-debug/w-cmsstor151-disk_itb-disk1Domain.log-last-200000-lines-including-10-thread-dumps.txt /var/log/dcache-debug/w-cmsstor151-disk_itb-disk2Domain.log-last-200000-lines-including-10-thread-dumps.txt /var/log/dcache-debug/w-cmsstor151-disk_itb-disk3Domain.log-last-200000-lines-including-10-thread-dumps.txt
+ HOW_MANY_THREAD_DUMPS=10
++ seq 1 10
+ for i in '`seq 1 $HOW_MANY_THREAD_DUMPS`'
+ /usr/bin/dcache dump threads
Stack traces for w-cmsstor151-disk_itb-disk1Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk1Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk2Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk2Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk3Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk3Domain.log.
Stack traces for gridftp-cmsstor151Domain have been written to
/var/log/dcache/gridftp-cmsstor151Domain.log.
+ sleep 5
+ for i in '`seq 1 $HOW_MANY_THREAD_DUMPS`'
+ /usr/bin/dcache dump threads
Stack traces for w-cmsstor151-disk_itb-disk1Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk1Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk2Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk2Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk3Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk3Domain.log.
Stack traces for gridftp-cmsstor151Domain have been written to
/var/log/dcache/gridftp-cmsstor151Domain.log.
+ sleep 5
+ for i in '`seq 1 $HOW_MANY_THREAD_DUMPS`'
+ /usr/bin/dcache dump threads
Stack traces for w-cmsstor151-disk_itb-disk1Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk1Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk2Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk2Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk3Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk3Domain.log.
Stack traces for gridftp-cmsstor151Domain have been written to
/var/log/dcache/gridftp-cmsstor151Domain.log.
+ sleep 5
+ for i in '`seq 1 $HOW_MANY_THREAD_DUMPS`'
+ /usr/bin/dcache dump threads
Stack traces for w-cmsstor151-disk_itb-disk1Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk1Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk2Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk2Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk3Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk3Domain.log.
Stack traces for gridftp-cmsstor151Domain have been written to
/var/log/dcache/gridftp-cmsstor151Domain.log.
+ sleep 5
+ for i in '`seq 1 $HOW_MANY_THREAD_DUMPS`'
+ /usr/bin/dcache dump threads
Stack traces for w-cmsstor151-disk_itb-disk1Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk1Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk2Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk2Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk3Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk3Domain.log.
Stack traces for gridftp-cmsstor151Domain have been written to
/var/log/dcache/gridftp-cmsstor151Domain.log.
+ sleep 5
+ for i in '`seq 1 $HOW_MANY_THREAD_DUMPS`'
+ /usr/bin/dcache dump threads
Stack traces for w-cmsstor151-disk_itb-disk1Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk1Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk2Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk2Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk3Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk3Domain.log.
Stack traces for gridftp-cmsstor151Domain have been written to
/var/log/dcache/gridftp-cmsstor151Domain.log.
+ sleep 5
+ for i in '`seq 1 $HOW_MANY_THREAD_DUMPS`'
+ /usr/bin/dcache dump threads
Stack traces for w-cmsstor151-disk_itb-disk1Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk1Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk2Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk2Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk3Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk3Domain.log.
Stack traces for gridftp-cmsstor151Domain have been written to
/var/log/dcache/gridftp-cmsstor151Domain.log.
+ sleep 5
+ for i in '`seq 1 $HOW_MANY_THREAD_DUMPS`'
+ /usr/bin/dcache dump threads
Stack traces for w-cmsstor151-disk_itb-disk1Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk1Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk2Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk2Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk3Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk3Domain.log.
Stack traces for gridftp-cmsstor151Domain have been written to
/var/log/dcache/gridftp-cmsstor151Domain.log.
+ sleep 5
+ for i in '`seq 1 $HOW_MANY_THREAD_DUMPS`'
+ /usr/bin/dcache dump threads
Stack traces for w-cmsstor151-disk_itb-disk1Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk1Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk2Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk2Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk3Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk3Domain.log.
Stack traces for gridftp-cmsstor151Domain have been written to
/var/log/dcache/gridftp-cmsstor151Domain.log.
+ sleep 5
+ for i in '`seq 1 $HOW_MANY_THREAD_DUMPS`'
+ /usr/bin/dcache dump threads
Stack traces for w-cmsstor151-disk_itb-disk1Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk1Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk2Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk2Domain.log.
Stack traces for w-cmsstor151-disk_itb-disk3Domain have been written to
/var/log/dcache/w-cmsstor151-disk_itb-disk3Domain.log.
Stack traces for gridftp-cmsstor151Domain have been written to
/var/log/dcache/gridftp-cmsstor151Domain.log.
+ sleep 5
+ HOW_MANY_LINES=200000
+ for file in '/var/log/dcache/*Domain.log'
++ basename /var/log/dcache/gridftp-cmsstor151Domain.log
+ basename=gridftp-cmsstor151Domain.log
+ tail -n 200000 /var/log/dcache/gridftp-cmsstor151Domain.log
+ for file in '/var/log/dcache/*Domain.log'
++ basename /var/log/dcache/w-cmsstor151-disk_itb-disk1Domain.log
+ basename=w-cmsstor151-disk_itb-disk1Domain.log
+ tail -n 200000 /var/log/dcache/w-cmsstor151-disk_itb-disk1Domain.log
+ for file in '/var/log/dcache/*Domain.log'
++ basename /var/log/dcache/w-cmsstor151-disk_itb-disk2Domain.log
+ basename=w-cmsstor151-disk_itb-disk2Domain.log
+ tail -n 200000 /var/log/dcache/w-cmsstor151-disk_itb-disk2Domain.log
+ for file in '/var/log/dcache/*Domain.log'
++ basename /var/log/dcache/w-cmsstor151-disk_itb-disk3Domain.log
+ basename=w-cmsstor151-disk_itb-disk3Domain.log
+ tail -n 200000 /var/log/dcache/w-cmsstor151-disk_itb-disk3Domain.log
++ /usr/bin/dcache status
++ grep Domain
++ awk '{print $1}'
+ for domain in '`/usr/bin/dcache status | grep '\''Domain'\'' | awk '\''{print $1}'\''`'
+ /usr/bin/dcache dump heap w-cmsstor151-disk_itb-disk1Domain /var/log/dcache-debug/dcache-dump-heap-w-cmsstor151-disk_itb-disk1Domain.txt
which: no jmap in (/usr/krb5/sbin:/usr/krb5/bin:/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin)
Could not find the jmap command, part of the Java 6 JDK. This command is
required for producing a heap dump. Please ensure that either jmap is in
the path or update JAVA_HOME.
+ for domain in '`/usr/bin/dcache status | grep '\''Domain'\'' | awk '\''{print $1}'\''`'
+ /usr/bin/dcache dump heap w-cmsstor151-disk_itb-disk2Domain /var/log/dcache-debug/dcache-dump-heap-w-cmsstor151-disk_itb-disk2Domain.txt
which: no jmap in (/usr/krb5/sbin:/usr/krb5/bin:/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin)
Could not find the jmap command, part of the Java 6 JDK. This command is
required for producing a heap dump. Please ensure that either jmap is in
the path or update JAVA_HOME.
+ for domain in '`/usr/bin/dcache status | grep '\''Domain'\'' | awk '\''{print $1}'\''`'
+ /usr/bin/dcache dump heap w-cmsstor151-disk_itb-disk3Domain /var/log/dcache-debug/dcache-dump-heap-w-cmsstor151-disk_itb-disk3Domain.txt
which: no jmap in (/usr/krb5/sbin:/usr/krb5/bin:/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin)
Could not find the jmap command, part of the Java 6 JDK. This command is
required for producing a heap dump. Please ensure that either jmap is in
the path or update JAVA_HOME.
+ for domain in '`/usr/bin/dcache status | grep '\''Domain'\'' | awk '\''{print $1}'\''`'
+ /usr/bin/dcache dump heap gridftp-cmsstor151Domain /var/log/dcache-debug/dcache-dump-heap-gridftp-cmsstor151Domain.txt
which: no jmap in (/usr/krb5/sbin:/usr/krb5/bin:/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin)
Could not find the jmap command, part of the Java 6 JDK. This command is
required for producing a heap dump. Please ensure that either jmap is in
the path or update JAVA_HOME.
+ cp /etc/dcache/dcache.conf /var/log/dcache-debug/
+ top -b -n 1
+ vmstat 1 10
+ lsof
+ netstat -nap
+ ps -efL
+ chmod 644 /var/log/dcache-debug/dcache.conf /var/log/dcache-debug/gridftp-cmsstor151Domain.log-last-200000-lines-including-10-thread-dumps.txt /var/log/dcache-debug/lsof.txt /var/log/dcache-debug/netstat.txt /var/log/dcache-debug/ps.txt /var/log/dcache-debug/top.txt /var/log/dcache-debug/vmstat.txt /var/log/dcache-debug/w-cmsstor151-disk_itb-disk1Domain.log-last-200000-lines-including-10-thread-dumps.txt /var/log/dcache-debug/w-cmsstor151-disk_itb-disk2Domain.log-last-200000-lines-including-10-thread-dumps.txt /var/log/dcache-debug/w-cmsstor151-disk_itb-disk3Domain.log-last-200000-lines-including-10-thread-dumps.txt
+ set +x
Dumps and log files have been saved in /var/log/dcache-debug.
You can share them like this (preferably as an unprivileged user):
cd /var/log/dcache-debug ; nohup python -m SimpleHTTPServer 22222 &
[root@cmsstor151 ~]# ls -l /var/log/dcache-debug
total 3840
-rw-r--r-- 1 root root     997 Sep 17 13:30 dcache.conf
-rw-r--r-- 1 root root 1005943 Sep 17 13:30 gridftp-cmsstor151Domain.log-last-200000-lines-including-10-thread-dumps.txt
-rw-r--r-- 1 root root  453163 Sep 17 13:30 lsof.txt
-rw-r--r-- 1 root root   17223 Sep 17 13:30 netstat.txt
-rw-r--r-- 1 root root  226177 Sep 17 13:30 ps.txt
-rw-r--r-- 1 root root   24450 Sep 17 13:30 top.txt
-rw-r--r-- 1 root root    1009 Sep 17 13:30 vmstat.txt
-rw-r--r-- 1 root root  724869 Sep 17 13:30 w-cmsstor151-disk_itb-disk1Domain.log-last-200000-lines-including-10-thread-dumps.txt
-rw-r--r-- 1 root root  725696 Sep 17 13:30 w-cmsstor151-disk_itb-disk2Domain.log-last-200000-lines-including-10-thread-dumps.txt
-rw-r--r-- 1 root root  730610 Sep 17 13:30 w-cmsstor151-disk_itb-disk3Domain.log-last-200000-lines-including-10-thread-dumps.txt
[root@cmsstor151 ~]# 
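
Note on the trace above: every `dcache dump heap` call failed because `jmap` was not found in the PATH, so this run produced no heap dumps. A minimal sketch of a pre-check that could be run before the collection script is shown below. The `find_jmap` helper is hypothetical and not part of dcache-collect-debug-info.sh. It only checks the two places the error message mentions (the PATH and JAVA_HOME).

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of dcache-collect-debug-info.sh):
# locate jmap before relying on `dcache dump heap`.
find_jmap() {
    # First place named by the error message: the PATH.
    if command -v jmap >/dev/null 2>&1; then
        command -v jmap
        return 0
    fi
    # Second place named by the error message: JAVA_HOME.
    if [ -n "${JAVA_HOME:-}" ] && [ -x "${JAVA_HOME}/bin/jmap" ]; then
        echo "${JAVA_HOME}/bin/jmap"
        return 0
    fi
    return 1
}

if jmap_path=$(find_jmap); then
    echo "jmap found at ${jmap_path}"
else
    echo "jmap not found: install a JDK or set JAVA_HOME, otherwise heap dumps will be skipped" >&2
fi
```

If the check fails, the rest of the debug collection (thread dumps, log tails, lsof, netstat, ps, top, vmstat) still works, as the listing above shows. Only the heap dumps are missing.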


