o2locktop: a top-like OCFS2 DLM lock monitor
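Introduction
o2locktop is a top-like tool for monitoring OCFS2 DLM lock usage in a cluster. It can be used to detect hot files and directories, that is, those that acquire DLM locks intensively.
The average and maximal wait times for DLM lock acquisitions can give the administrator hints when investigating OCFS2 performance, for example:
- if the workload is unbalanced among nodes.
- if a file is too hot, the related applications may need to be checked.
- if a directory is too hot, it may help to split it into smaller directories with fewer files underneath.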
To learn more about the implementation details,
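As a shared-disk cluster file system, OCFS2 allows files and directories to be accessed from different nodes at the same time. To protect data consistency, file access is coordinated through the Distributed Lock Manager (DLM). For example, the "meta DLM lock" protects file (per-inode) metadata changes, and the "write DLM lock" protects file data writes. The "open DLM lock" lets a node keep access to an opened file while another process (even on another node) deletes it; the file is finally removed after all associated file descriptors are closed. To learn more about how OCFS2 works with the DLM, please check the OCFS2 Project web page.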
o2locktop reads OCFS2 kernel debugfs statistics under /sys/kernel/debug/. That is, the OCFS2_FS_STATS kernel config option must be set (enabled) on all cluster nodes. To check:
grep OCFS2_FS_STATS < /boot/config-`uname -r`
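The same check can be scripted, for instance when verifying many nodes. The following is a minimal sketch, not part of o2locktop: the function names are hypothetical, and it assumes the option appears in the config file as CONFIG_OCFS2_FS_STATS=y when enabled.

```python
import platform


def ocfs2_stats_enabled(config_text):
    """Return True if the kernel config text enables OCFS2_FS_STATS.

    Assumption: an enabled option is written as CONFIG_OCFS2_FS_STATS=y,
    and a disabled one as a comment ("# CONFIG_... is not set").
    """
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("#"):
            continue  # commented-out (disabled) options
        if line == "CONFIG_OCFS2_FS_STATS=y":
            return True
    return False


def running_kernel_config_path():
    """Path of the running kernel's config, like /boot/config-`uname -r`."""
    return "/boot/config-" + platform.release()


def check_local_node():
    """Read the local kernel config and report whether stats are enabled."""
    with open(running_kernel_config_path()) as config_file:
        return ocfs2_stats_enabled(config_file.read())
```

Calling check_local_node() on each cluster node gives the same answer as the grep command, in a form a script can act on.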
Installation
Note: o2locktop is compatible with both Python 2 and Python 3.
- RPM:
https://download.opensuse.org/repositories/network:/ha-clustering:/Factory/
For example, download opensuse_tumbleweed/noarch/o2locktop-1.0.0…noarch.rpm
sudo zypper install <http_rpm_uri> or sudo rpm -ivh <o2locktop-1.0.0...noarch.rpm>
- Python pip:
sudo pip install o2locktop
- Or, use o2locktop directly from the source tree:
git clone https://github.com/ganghe/o2locktop.git
cd o2locktop
~/o2locktop> python o2locktop -h
Usage
For details, run o2locktop --help; the same text appears in the Reference section below. Alternatively, check the asciidemo here.
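To make the command-line interface easier to follow, here is a hypothetical argparse sketch that reproduces the options listed in the help text. It is not the project's actual source: the build_parser name, the dest names, the integer type for DISPLAY_LENGTH, and the version string are all assumptions for illustration. The repeated -n option is collected into a list, consistent with the example "o2locktop -n node1 -n node2 -n node3 /mnt/shared".

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented o2locktop options (sketch)."""
    parser = argparse.ArgumentParser(
        prog="o2locktop",
        description="A top-like tool to monitor OCFS2 DLM lock usage.",
    )
    # -n may be given several times; each value is appended to a list.
    parser.add_argument("-n", dest="node_ip", metavar="NODE_IP",
                        action="append",
                        help="OCFS2 node IP address for ssh")
    parser.add_argument("-o", dest="log_file", metavar="LOG_FILE",
                        help="log path")
    # Assumption: the display length is parsed as an integer.
    parser.add_argument("-l", dest="display_length",
                        metavar="DISPLAY_LENGTH", type=int,
                        help="number of lock records to display")
    # Placeholder version string; the real one is not known here.
    parser.add_argument("-V", "--version", action="version",
                        version="%(prog)s (sketch)")
    parser.add_argument("-d", "--debug", action="store_true",
                        help="show all inodes, including system inodes")
    parser.add_argument("mount_point", metavar="MOUNT_POINT", nargs="?",
                        help="the OCFS2 mount point, e.g. /mnt/shared")
    return parser
```

For example, build_parser().parse_args(["-n", "node1", "-n", "node2", "/mnt/shared"]) yields a namespace whose node_ip is a two-element list and whose mount_point is "/mnt/shared".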
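Known limitations
The OCFS2 file system statistics in the kernel are measured from the moment a DLM lock is requested until the request returns. If a request hangs, for example because of a deadlock caused by a bug, o2locktop does not currently reflect that situation.
o2locktop cannot display the file name of an inode. An additional step is needed to translate the inode number into a file name: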
find <YOUR_OCFS2_MOUNT_POINT> -inum <INODE_NUMBER>
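If the lookup needs to happen inside a script, the find command above can be approximated in Python. This is a minimal sketch under stated assumptions: find_paths_by_inode is a hypothetical helper, not an o2locktop function, and like find without -xdev it does not stop at file system boundaries.

```python
import os


def find_paths_by_inode(mount_point, inode_number):
    """Return every path under mount_point whose inode number matches.

    Several paths can match when the file has hard links.
    """
    matches = []
    # The starting directory itself is also a candidate, as with find.
    if os.lstat(mount_point).st_ino == inode_number:
        matches.append(mount_point)
    for dirpath, dirnames, filenames in os.walk(mount_point):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat: inspect the entry itself, do not follow symlinks.
                if os.lstat(path).st_ino == inode_number:
                    matches.append(path)
            except OSError:
                # The entry vanished or is unreadable; skip it.
                continue
    return matches
```

Walking a large shared file system this way can be slow, just as with find, so it is best run once per inode of interest rather than for every row o2locktop displays.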
TODO
- Replay o2locktop log files.
- Inside the cluster, o2locktop can be run without any arguments.
- Unit tests
Community
Reference
usage: o2locktop [-h] [-n NODE_IP] [-o LOG_FILE] [-l DISPLAY_LENGTH] [-V] [-d]
[MOUNT_POINT]
It is a top-like tool to monitor OCFS2 DLM lock usage in the cluster, and can
be used to detect hot files/directories, which intensively acquire DLM locks.
positional arguments:
  MOUNT_POINT         the OCFS2 mount point, eg. /mnt/shared
optional arguments:
  -h, --help          show this help message and exit
  -n NODE_IP          OCFS2 node IP address for ssh
  -o LOG_FILE         log path
  -l DISPLAY_LENGTH   number of lock records to display
  -V, --version       print the current version of o2locktop and exit
  -d, --debug         show all the inode including the system inode number
The average/maximal wait time for DLM lock acquisitions likely gives hints to
the administrator when concern about OCFS2 performance, for example,
- if the workload is unbalanced among nodes.
- if a file is too hot, then maybe need check the related applications above.
- if a directory is too hot, then maybe split it to smaller with less number
of files underneath.
OUTPUT ANNOTATION:
- The output is refreshed every 5 seconds, and sorted by the sum of
DLM EX(exclusive) and PR(protected read) lock average wait time
- One row, one inode (including the system meta files if with '-d' argument)
- Columns:
"TYPE" is DLM lock types,
'M' -> Meta data lock for the inode
'W' -> Write lock for the inode
'O' -> Open lock for the inode
"INO" is the inode number of the file
"EX NUM" is the number of EX lock acquisitions
"EX TIME" is the maximal wait time to get EX lock
"EX AVG" is the average wait time to get EX lock
"PR NUM" is the number of PR(read) lock acquisitions
"PR TIME" is the maximal wait time to get PR lock
"PR AVG" is the average wait time to get PR lock
SHORTCUTS:
- Type "d" to display DLM lock statistics for each node
- Type "Ctrl+C" or "q" to exit o2locktop process
PREREQUISITES:
o2locktop reads OCFS2_FS_STATS statistics from /sys/kernel/debug/. That says,
for all cluster nodes, the kernel option must be set(enabled). Check it out:
grep OCFS2_FS_STATS < /boot/config-`uname -r`
o2locktop uses the passwordless SSH to OCFS2 nodes as root. Set it up if not:
ssh-keygen; ssh-copy-id root@node1
EXAMPLES:
- At any machine within or outside of the cluster:
o2locktop -n node1 -n node2 -n node3 /mnt/shared
To find the absolute path of the inode file:
find <MOUNT_POINT> -inum <INO>
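The OUTPUT ANNOTATION section states that rows are sorted by the sum of the EX and PR average wait times. The sketch below illustrates that ordering on made-up rows; it is not o2locktop's actual code. The LockRow type, the function names, and the descending order (hottest inode first, as in top) are assumptions.

```python
from collections import namedtuple

# One row per inode and lock type, mirroring the documented columns.
LockRow = namedtuple(
    "LockRow",
    "lock_type ino ex_num ex_time ex_avg pr_num pr_time pr_avg",
)


def average_wait(total_wait, acquisitions):
    """Average wait per acquisition; 0.0 when nothing was acquired."""
    return float(total_wait) / acquisitions if acquisitions else 0.0


def rank_rows(rows, display_length=None):
    """Sort rows by EX AVG + PR AVG, largest first (assumed order).

    display_length, if given, keeps only that many leading rows,
    in the spirit of the -l option.
    """
    ranked = sorted(rows, key=lambda row: row.ex_avg + row.pr_avg,
                    reverse=True)
    return ranked[:display_length] if display_length else ranked
```

With this ordering, an inode with few acquisitions but long waits ranks above one with many fast acquisitions, which matches the tool's stated focus on wait time rather than lock count.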