Prepare a heavy workload
- Himeno benchmark
- make
$ sudo yum -y install gcc make
$ wget http://accc.riken.jp/wp-content/uploads/2015/07/himenobmt.c.zip
$ unzip himenobmt.c.zip
$ lha x himenobmt.c.lzh
Change MODEL in the Makefile to MODEL = MIDDLE, then:
$ make
- Run it
$ ./bmt
mimax = 257 mjmax = 129 mkmax = 129
imax = 256 jmax = 128 kmax =128
cpu : 7.016288 sec.
Loop executed for 200 times
Gosa : 1.245715e-03
MFLOPS measured : 3908.195787
Score based on MMX Pentium 200MHz : 121.109259
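To compare runs, it may help to pull just the MFLOPS figure out of the benchmark output. A minimal sketch, assuming the "MFLOPS measured : <value>" line format shown above; the function name extract_mflops is made up for illustration.

```shell
# Print the MFLOPS value from Himeno benchmark output read on stdin.
# Assumes a line of the form "MFLOPS measured : <value>".
extract_mflops() {
    awk -F':' '/MFLOPS measured/ { gsub(/[ \t]/, "", $2); print $2 }'
}
```

Usage would be something like `./bmt | extract_mflops`.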
bsub (submit a job)
$ su lsfadmin
$ bsub -e /tmp/err.txt -o /tmp/std.txt /usr/local/bin/bmt
Job <103> is submitted to default queue <normal>.
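The job ID in the confirmation line is what bjobs, bkill, bhist and bpeek take as an argument, so a script will usually want to capture it. A minimal sketch, assuming the "Job <N> is submitted ..." message format shown above; parse_job_id is a hypothetical helper name.

```shell
# Extract the numeric job ID from bsub's confirmation line on stdin.
# Assumes the message format "Job <N> is submitted to ...".
parse_job_id() {
    sed -n 's/^Job <\([0-9][0-9]*\)> is submitted.*/\1/p'
}
```

For example, `jobid=$(bsub /usr/local/bin/bmt | parse_job_id)` would leave the ID in `$jobid`.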
bjobs (list active jobs)
$ bsub /usr/local/bin/bmt
Job <104> is submitted to default queue <normal>.
$ bjobs
JOBID USER STAT QUEUE FROM_HOST EXEC_HOST JOB_NAME SUBMIT_TIME
104 lsfadmi RUN normal lsf1 lsf1 *l/bin/bmt Dec 3 08:13
- options
(default) | Show your own jobs |
-u user_name | Show jobs of user user_name |
-u user_group | Show jobs of group user_group |
-u all | Show jobs of all users |
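With many jobs in the list, a per-state count is often easier to read than the raw table. A rough sketch, assuming the default bjobs column layout shown above (header on the first line, STAT in the third column); count_by_stat is a hypothetical name.

```shell
# Count jobs per state from bjobs output on stdin.
# Assumes the header is the first line and STAT is the third column.
count_by_stat() {
    awk 'NR > 1 && NF >= 3 { n[$3]++ } END { for (s in n) print s, n[s] }' | sort
}
```

Usage would be something like `bjobs -u all | count_by_stat`.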
bkill (stop a job)
$ bsub /usr/local/bin/bmt
Job <105> is submitted to default queue <normal>.
$ bkill 105
Job <105> is being terminated
- Behavior
- bkill first sends SIGINT and SIGTERM
- It then sends SIGKILL after the number of seconds set by JOB_TERMINATE_INTERVAL in lsb.params
- options
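-u user_name | Stop all jobs of user user_name |
-u user_group | Stop all jobs of group user_group |
-u all | Stop all jobs of all users |
To act on every matching job, give job ID 0 together with -u (e.g. bkill -u all 0).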
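Because SIGINT and SIGTERM arrive before SIGKILL, a job script gets a short window to tidy up. A minimal sketch of a wrapper that uses that window; run_with_cleanup and the scratch-directory argument are hypothetical, and the job is run in the background so the shell can react to the signal right away.

```shell
# Run a command so that SIGINT/SIGTERM trigger cleanup before exit.
# run_with_cleanup and the scratch directory are illustrative only.
run_with_cleanup() {
    scratch=$1; shift
    mkdir -p "$scratch"
    # On the first signals, stop the child and remove scratch files.
    trap 'kill "$child" 2>/dev/null; rm -rf "$scratch"; exit 1' INT TERM
    "$@" &            # run in the background so the trap can fire during wait
    child=$!
    wait "$child"
    status=$?
    rm -rf "$scratch" # normal completion: clean up as well
    return "$status"
}
```

A job script could then call something like `run_with_cleanup /tmp/bmt.$$ /usr/local/bin/bmt`.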
bhist (job execution history)
$ bhist 105
Summary of time in seconds spent in various states:
JOBID USER JOB_NAME PEND PSUSP RUN USUSP SSUSP UNKWN TOTAL
105 lsfadmi *bin/bmt 1 0 12 0 0 0 13
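The summary columns make it easy to see how much of a job's lifetime was actually spent running. A rough sketch, assuming the column order shown above (RUN in column 6, TOTAL in column 10), plain numeric job IDs, and job names without spaces; run_percent is a hypothetical name.

```shell
# Print "<jobid> <percent of total time spent in RUN>" from bhist output on stdin.
# Assumes columns: JOBID USER JOB_NAME PEND PSUSP RUN USUSP SSUSP UNKWN TOTAL.
run_percent() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 10 && $10 > 0 {
        printf "%s %.1f\n", $1, 100 * $6 / $10
    }'
}
```

Usage would be something like `bhist 105 | run_percent`.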
bpeek (view the stdout of a running job)
$ bsub /usr/local/bin/bmt
Job <106> is submitted to default queue <normal>.
$ bpeek 106
<< output from stdout >>
Administrative commands (lsid, lshosts, lsload, bhosts)
- Show LSF cluster status
$ lsid
IBM Spectrum LSF Community Edition 10.1.0.0, Jun 15 2016
Copyright IBM Corp. 1992, 2016. All rights reserved.
US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract
with IBM Corp.
My cluster name is cluster1
My master name is lsf1
- List LSF hosts
$ lshosts
HOST_NAME type model cpuf ncpus maxmem maxswp server RESOURCES
lsf1 X86_64 PC6000 116.1 2 1023M 1.9G Yes (mg)
- Load on LSF hosts
$ lsload
HOST_NAME status r15s r1m r15m ut pg ls it tmp swp mem
lsf1 ok 0.1 0.0 0.1 1% 0.1 1 0 2587M 1.6G 338M
- Job status per LSF host
$ bhosts
HOST_NAME STATUS JL/U MAX NJOBS RUN SSUSP USUSP RSV
lsf1 ok - 2 0 0 0 0 0
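The bhosts table can be reduced to the number of free job slots per host (MAX minus NJOBS). A minimal sketch, assuming the column layout shown above and counting only hosts whose STATUS is ok and whose MAX is a number; free_slots is a hypothetical name.

```shell
# Print "<host> <free slots>" for each host in bhosts output on stdin.
# Assumes columns: HOST_NAME STATUS JL/U MAX NJOBS ...
free_slots() {
    awk 'NR > 1 && $2 == "ok" && $4 ~ /^[0-9]+$/ { print $1, $4 - $5 }'
}
```

Usage would be something like `bhosts | free_slots`.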
Queue configuration
/usr/share/lsf/conf/lsbatch/cluster1/configdir/lsb.queues
...
Begin Queue
QUEUE_NAME = normal
PRIORITY = 30
INTERACTIVE = NO
FAIRSHARE = USER_SHARES[[default,1]]
#RUN_WINDOW = 5:19:00-1:8:30 20:00-8:30
#r1m = 0.7/2.0 # loadSched/loadStop
#r15m = 1.0/2.5
#pg = 4.0/8
#ut = 0.2
#io = 50/240
#CPULIMIT = 180/hostA # 3 hours of host hostA
#FILELIMIT = 20000
#DATALIMIT = 20000 # jobs data segment limit
#CORELIMIT = 20000
#TASKLIMIT = 5 # job task limit
#USERS = all # users who can submit jobs to this queue
#HOSTS = all # hosts on which jobs in this queue can run
#PRE_EXEC = /usr/local/lsf/misc/testq_pre >> /tmp/pre.out
#POST_EXEC = /usr/local/lsf/misc/testq_post |grep -v "Hey"
#REQUEUE_EXIT_VALUES = 55 34 78
#APS_PRIORITY = WEIGHT[[RSRC, 10.0] [MEM, 20.0] [PROC, 2.5] [QPRIORITY, 2.0]] \
# LIMIT[[RSRC, 3.5] [QPRIORITY, 5.5]] \
# GRACE_PERIOD[[QPRIORITY, 200s] [MEM, 10m] [PROC, 2h]]
DESCRIPTION = For normal low priority jobs, running only if hosts are \
lightly loaded.
End Queue
...
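Most lines in the sample queue are commented out, so it can be handy to list only what is actually in effect. A rough sketch that prints each queue's name and priority, assuming the "KEY = value" layout (spaces around "=") and the Begin Queue / End Queue markers shown above; queue_priorities is a hypothetical name.

```shell
# Print "<queue name> <priority>" for each Queue block in lsb.queues text on stdin.
# Commented-out lines (starting with #) are ignored.
queue_priorities() {
    awk '
        /^[ \t]*#/                       { next }
        $1 == "Begin" && $2 == "Queue"   { name = ""; prio = "" }
        $1 == "QUEUE_NAME"               { name = $3 }
        $1 == "PRIORITY"                 { prio = $3 }
        $1 == "End" && $2 == "Queue"     { print name, prio }
    '
}
```

Usage would be something like `queue_priorities < /usr/share/lsf/conf/lsbatch/cluster1/configdir/lsb.queues`.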
Well, that's about it.