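Need to monitor Linux server performance? Try these built-in commands and a few add-on tools. Most Linux distributions come with plenty of monitoring tools. These tools provide metrics that quantify system activity, and you can use them to find the possible causes of a performance problem. The commands discussed below are some of the most basic ones for system analysis and for debugging server issues such as:
- Finding out bottlenecks.
- Disk (storage) bottlenecks.
- CPU and memory bottlenecks.
- Network bottlenecks.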
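#1: top - Process Activity Command
The top command provides a dynamic, real-time view of a running system, i.e. the actual process activity. By default, it displays the most CPU-intensive tasks running on the server and refreshes the list every few seconds.
Commonly Used Hot Keys
The top command provides several useful hot keys: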
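Hot Key | Usage |
---|---|
t | Toggles the summary information display on and off. |
m | Toggles the memory information display on and off. |
A | Sorts the display by top consumers of various system resources. Useful for quickly identifying performance-hungry tasks on a system. |
f | Enters an interactive configuration screen for top. Helpful for setting up top for a specific task. |
o | Enables you to interactively select the ordering within top. |
r | Issues the renice command. |
k | Issues the kill command. |
z | Toggles color/mono display. |
=> Related: How do I Find Out Linux CPU Utilization?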
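#2: vmstat - System Activity, Hardware and System Information
The vmstat command reports information about processes, memory, paging, block I/O, traps, and CPU activity.
# vmstat 3
Sample Outputs: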
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd    free   buff   cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 2540988 522188 5130400    0    0     2    32    4    2  4  1 96  0  0
 1  0      0 2540988 522188 5130400    0    0     0   720 1199  665  1  0 99  0  0
 0  0      0 2540956 522188 5130400    0    0     0     0 1151 1569  4  1 95  0  0
 0  0      0 2540956 522188 5130500    0    0     0     6 1117  439  1  0 99  0  0
 0  0      0 2540940 522188 5130512    0    0     0   536 1189  932  1  0 98  0  0
 0  0      0 2538444 522188 5130588    0    0     0     0 1187 1417  4  1 96  0  0
 0  0      0 2490060 522188 5130640    0    0     0    18 1253 1123  5  1 94  0  0
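To make output like the sample above easier to scan, here is a minimal sketch of a filter that reports only the samples where one column exceeds a limit. The function name vmstat_over and the limit of 500 blocks are assumptions chosen for illustration and are not part of vmstat itself. The demo input is copied from the first rows of the sample output.

```shell
# vmstat_over COLUMN LIMIT: read vmstat output on stdin and print the
# samples whose COLUMN value is greater than LIMIT.
# (vmstat_over is a hypothetical helper, not a vmstat option.)
vmstat_over() {
    awk -v name="$1" -v limit="$2" '
        # The header line starting with "r b" names the columns.
        $1 == "r" && $2 == "b" {
            for (i = 1; i <= NF; i++) if ($i == name) col = i
            next
        }
        # Numeric data rows: count them and report those above the limit.
        col && $1 ~ /^[0-9]+$/ {
            n++
            if ($col + 0 > limit + 0) print "sample " n ": " name "=" $col
        }
    '
}

# Demo: which samples wrote more than 500 blocks (bo)?
vmstat_over bo 500 <<'EOF'
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd    free   buff   cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 2540988 522188 5130400    0    0     2    32    4    2  4  1 96  0  0
 1  0      0 2540988 522188 5130400    0    0     0   720 1199  665  1  0 99  0  0
 0  0      0 2540956 522188 5130400    0    0     0     0 1151 1569  4  1 95  0  0
 0  0      0 2540956 522188 5130500    0    0     0     6 1117  439  1  0 99  0  0
 0  0      0 2540940 522188 5130512    0    0     0   536 1189  932  1  0 98  0  0
EOF
# prints:
#   sample 2: bo=720
#   sample 5: bo=536
```

On a live system the same filter can sit behind a running vmstat, for example: vmstat 3 | vmstat_over bo 500. The column is looked up by name because the exact column set can differ between vmstat versions.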
Display Memory Utilization Slabinfo
# vmstat -m
Get Information About Active / Inactive Memory Pages
# vmstat -a
=> Related: How do I find out Linux Resource utilization to detect system bottlenecks?
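#3: w - Find Out Who Is Logged on And What They Are Doing
The w command displays information about the users currently logged on to the machine and their processes.
# w username
# w vivek
Sample Outputs: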
 17:58:47 up 5 days, 20:28,  2 users,  load average: 0.36, 0.26, 0.24
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU WHAT
root     pts/0    10.1.3.145       14:55    5.00s  0.04s  0.02s vim /etc/resolv.conf
root     pts/1    10.1.3.145       17:43    0.00s  0.03s  0.00s w
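#4: uptime - Tell How Long The System Has Been Running
The uptime command can be used to see how long the server has been running. It shows the current time, how long the system has been running, how many users are currently logged on, and the system load averages for the past 1, 5, and 15 minutes.
# uptime
Output: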
18:02:41 up 41 days, 23:42, 1 user, load average: 0.00, 0.00, 0.00
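A load of 1 can be considered optimal for a single-CPU system, because the load average counts the tasks that are running or waiting to run. What is acceptable varies from system to system. As a rough rule of thumb, a load of 1 - 3 on a single-CPU system and 6 - 10 on an SMP system might still be acceptable.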
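Because the acceptable load depends on the number of CPUs, it can help to look at the load per core rather than the raw figure. The following is a minimal sketch. The helper name load_per_core and the example numbers are assumptions for illustration only.

```shell
# load_per_core LOAD CORES: divide a load average by the number of CPU
# cores and print the result with two decimals.
# (load_per_core is a hypothetical helper name.)
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Example with made-up numbers: a 1-minute load of 6.00 on an 8-core server.
load_per_core 6.00 8      # prints 0.75

# On a live Linux system the inputs could come from /proc/loadavg and nproc:
#   load_per_core "$(cut -d ' ' -f 1 /proc/loadavg)" "$(nproc)"
```

A result close to or above 1.00 suggests that, on average, every core had at least one task running or waiting.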
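#5: ps - Displays The Processes
The ps command reports a snapshot of the current processes. To select all processes, use the -A or -e option:
# ps -A
Sample Outputs: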
  PID TTY          TIME CMD
    1 ?        00:00:02 init
    2 ?        00:00:02 migration/0
    3 ?        00:00:01 ksoftirqd/0
    4 ?        00:00:00 watchdog/0
    5 ?        00:00:00 migration/1
    6 ?        00:00:15 ksoftirqd/1
....
.....
 4881 ?        00:53:28 java
 4885 tty1     00:00:00 mingetty
 4886 tty2     00:00:00 mingetty
 4887 tty3     00:00:00 mingetty
 4888 tty4     00:00:00 mingetty
 4891 tty5     00:00:00 mingetty
 4892 tty6     00:00:00 mingetty
 4893 ttyS1    00:00:00 agetty
12853 ?        00:00:00 cifsoplockd
12854 ?        00:00:00 cifsdnotifyd
14231 ?        00:10:34 lighttpd
14232 ?        00:00:00 php-cgi
54981 pts/0    00:00:00 vim
55465 ?        00:00:00 php-cgi
55546 ?        00:00:00 bind9-snmp-stat
55704 pts/1    00:00:00 ps
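ps is similar to top but provides more information.
Show Long Format Output
# ps -Al
To turn on extra full mode (it shows the command line arguments passed to each process):
# ps -AlF
To See Threads ( LWP and NLWP)
# ps -AlFH
To See Threads After Processes
# ps -AlLm
Print All Processes On The Server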
# ps ax
# ps axu
Print A Process Tree
# ps -ejH
# ps axjf
# pstree
Print Security Information
# ps -eo euser,ruser,suser,fuser,f,comm,label
# ps axZ
# ps -eM
See Every Process Running As User Vivek
# ps -U vivek -u vivek u
Set Output In a User-Defined Format
# ps -eo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:14,comm
# ps axo stat,euid,ruid,tty,tpgid,sess,pgrp,ppid,pid,pcpu,comm
# ps -eopid,tt,user,fname,tmout,f,wchan
Display Only The Process IDs of Lighttpd
# ps -C lighttpd -o pid=
Or
# pgrep lighttpd
Or
# pgrep -u vivek php-cgi
Display The Name of PID 55977
# ps -p 55977 -o comm=
Find Out The Top 10 Memory Consuming Processes
# ps auxf | sort -nr -k 4 | head -10
Find Out The Top 10 CPU Consuming Processes
# ps auxf | sort -nr -k 3 | head -10
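The two pipelines above usually drop the column header, because sort treats it as one more row and ranks it near the bottom. Below is a minimal sketch of one way to keep the header on top. The helper name top_n is hypothetical, and the three process rows in the demo are made-up sample data, not output from a real server.

```shell
# top_n COLUMN COUNT: read ps-style text on stdin, print the header line
# unchanged, then the COUNT rows with the largest numeric value in COLUMN.
# (top_n is a hypothetical helper name.)
top_n() {
    col=$1
    n=$2
    {
        IFS= read -r header           # first line is the column header
        printf '%s\n' "$header"
        sort -nr -k "$col,$col" | head -n "$n"
    }
}

# Demo with made-up rows: column 4 is %MEM, keep the top 2.
top_n 4 2 <<'EOF'
USER       PID %CPU %MEM COMMAND
root         1  0.0  0.1 init
www      14231  2.5  7.9 lighttpd
www       4881  9.1 33.0 java
EOF
# prints the header, then the java row, then the lighttpd row
```

On a live system this could be used as ps aux | top_n 4 10 for memory or ps aux | top_n 3 10 for CPU. The procps version of ps can also sort by itself, for example ps aux --sort=-%mem | head.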
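#6: free - Memory Usage
The free command displays the total amount of free and used physical and swap memory in the system, as well as the buffers used by the kernel.
# free
Sample Output: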
             total       used       free     shared    buffers     cached
Mem:      12302896    9739664    2563232          0     523124    5154740
-/+ buffers/cache:    4061800    8241096
Swap:      1052248          0    1052248
=> Related:
- Linux Find Out Virtual Memory PAGESIZE
- Linux Limit CPU Usage Per Process
- How much RAM does my Ubuntu / Fedora Linux desktop PC have?
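The -/+ buffers/cache line in the free sample output can be reproduced by hand: memory used by applications is used minus buffers minus cached, and memory available to them is free plus buffers plus cached. The sketch below does this arithmetic on the sample figures. The helper name mem_without_cache is an assumption, and it expects the older column layout shown in the sample (newer versions of free print a different set of columns).

```shell
# mem_without_cache: read old-style `free` output on stdin
# (columns: total used free shared buffers cached) and print two values in kB:
#   1. memory used by applications  = used - buffers - cached
#   2. memory available to them     = free + buffers + cached
# (mem_without_cache is a hypothetical helper name.)
mem_without_cache() {
    awk '$1 == "Mem:" { print $3 - $6 - $7, $4 + $6 + $7 }'
}

# Demo: the figures from the sample output.
mem_without_cache <<'EOF'
             total       used       free     shared    buffers     cached
Mem:      12302896    9739664    2563232          0     523124    5154740
-/+ buffers/cache:    4061800    8241096
Swap:      1052248          0    1052248
EOF
# prints: 4061800 8241096
```

The two numbers match the -/+ buffers/cache line, which shows that only about 4 GB of the 9.7 GB reported as used is held by applications. The rest is cache the kernel can reclaim.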
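#7: iostat - Average CPU Load, Disk Activity
The iostat command reports Central Processing Unit (CPU) statistics and input/output statistics for devices, partitions, and network filesystems (NFS).
# iostat
Sample Outputs: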
Linux 2.6.18-128.1.14.el5 (www03.nixcraft.in)   06/26/2009
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.50    0.09    0.51    0.03    0.00   95.86
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              22.04        31.88       512.03   16193351  260102868
sda1              0.00         0.00         0.00       2166        180
sda2             22.04        31.87       512.03   16189010  260102688
sda3              0.00         0.00         0.00       1615          0
=> Related: Linux Track NFS Directory / Disk I/O Stats
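#8: sar - Collect and Report System Activity
The sar command is used to collect, report, and save system activity information. To see the network counters, enter:
# sar -n DEV | more
To display the network counters from the 24th:
# sar -n DEV -f /var/log/sa/sa24 | more
You can also display real-time usage with sar:
# sar 4 5
Sample Outputs: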
Linux 2.6.18-128.1.14.el5 (www03.nixcraft.in)   06/26/2009
06:45:12 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
06:45:16 PM       all      2.00      0.00      0.22      0.00      0.00     97.78
06:45:20 PM       all      2.07      0.00      0.38      0.03      0.00     97.52
06:45:24 PM       all      0.94      0.00      0.28      0.00      0.00     98.78
06:45:28 PM       all      1.56      0.00      0.22      0.00      0.00     98.22
06:45:32 PM       all      3.53      0.00      0.25      0.03      0.00     96.19
Average:          all      2.02      0.00      0.27      0.01      0.00     97.70
=> Related: How to collect Linux system utilization data into a file
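#9: mpstat - Multiprocessor Usage
The mpstat command displays the activity of each available processor, processor 0 being the first one. Use mpstat -P ALL to display the average CPU utilization per processor:
# mpstat -P ALL
Sample Output: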
Linux 2.6.18-128.1.14.el5 (www03.nixcraft.in)   06/26/2009
06:48:11 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
06:48:11 PM  all    3.50    0.09    0.34    0.03    0.01    0.17    0.00   95.86   1218.04
06:48:11 PM    0    3.44    0.08    0.31    0.02    0.00    0.12    0.00   96.04   1000.31
06:48:11 PM    1    3.10    0.08    0.32    0.09    0.02    0.11    0.00   96.28     34.93
06:48:11 PM    2    4.16    0.11    0.36    0.02    0.00    0.11    0.00   95.25      0.00
06:48:11 PM    3    3.77    0.11    0.38    0.03    0.01    0.24    0.00   95.46     44.80
06:48:11 PM    4    2.96    0.07    0.29    0.04    0.02    0.10    0.00   96.52     25.91
06:48:11 PM    5    3.26    0.08    0.28    0.03    0.01    0.10    0.00   96.23     14.98
06:48:11 PM    6    4.00    0.10    0.34    0.01    0.00    0.13    0.00   95.42      3.75
06:48:11 PM    7    3.30    0.11    0.39    0.03    0.01    0.46    0.00   95.69     76.89
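With many processors it is easy to miss the one that is busiest. The following minimal sketch picks the processor with the lowest %idle from mpstat -P ALL output. The helper name busiest_cpu is hypothetical. It finds the CPU and %idle columns by their header names, on the assumption that the column layout can differ between mpstat versions. The demo input is taken from the first rows of the sample output.

```shell
# busiest_cpu: read `mpstat -P ALL` output on stdin and print the CPU
# number with the lowest %idle, followed by that %idle value.
# (busiest_cpu is a hypothetical helper name.)
busiest_cpu() {
    awk '
        # Header line: remember which fields hold the CPU id and %idle.
        /%idle/ {
            for (i = 1; i <= NF; i++) {
                if ($i == "CPU")   cpu_col = i
                if ($i == "%idle") idle_col = i
            }
            next
        }
        # Per-CPU rows only (the "all" summary row is skipped).
        cpu_col && $cpu_col ~ /^[0-9]+$/ {
            if (min == "" || $idle_col + 0 < min) {
                min = $idle_col + 0
                cpu = $cpu_col
            }
        }
        END { if (cpu != "") print cpu, min }
    '
}

busiest_cpu <<'EOF'
06:48:11 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
06:48:11 PM  all    3.50    0.09    0.34    0.03    0.01    0.17    0.00   95.86   1218.04
06:48:11 PM    0    3.44    0.08    0.31    0.02    0.00    0.12    0.00   96.04   1000.31
06:48:11 PM    1    3.10    0.08    0.32    0.09    0.02    0.11    0.00   96.28     34.93
06:48:11 PM    2    4.16    0.11    0.36    0.02    0.00    0.11    0.00   95.25      0.00
06:48:11 PM    3    3.77    0.11    0.38    0.03    0.01    0.24    0.00   95.46     44.80
EOF
# prints: 2 95.25
```

On a live system: mpstat -P ALL | busiest_cpu.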
#10: pmap - Process Memory Usage
The pmap command reports the memory map of a process. Use this command to find out the causes of memory bottlenecks.
# pmap -d PID
To display process memory information for PID 47394, enter:
# pmap -d 47394
Sample Outputs:
47394:   /usr/bin/php-cgi
Address           Kbytes Mode  Offset           Device    Mapping
0000000000400000    2584 r-x-- 0000000000000000 008:00002 php-cgi
0000000000886000     140 rw--- 0000000000286000 008:00002 php-cgi
00000000008a9000      52 rw--- 00000000008a9000 000:00000   [ anon ]
0000000000aa8000      76 rw--- 00000000002a8000 008:00002 php-cgi
000000000f678000    1980 rw--- 000000000f678000 000:00000   [ anon ]
000000314a600000     112 r-x-- 0000000000000000 008:00002 ld-2.5.so
000000314a81b000       4 r---- 000000000001b000 008:00002 ld-2.5.so
000000314a81c000       4 rw--- 000000000001c000 008:00002 ld-2.5.so
000000314aa00000    1328 r-x-- 0000000000000000 008:00002 libc-2.5.so
000000314ab4c000    2048 ----- 000000000014c000 008:00002 libc-2.5.so
.....
......
..
00002af8d48fd000       4 rw--- 0000000000006000 008:00002 xsl.so
00002af8d490c000      40 r-x-- 0000000000000000 008:00002 libnss_files-2.5.so
00002af8d4916000    2044 ----- 000000000000a000 008:00002 libnss_files-2.5.so
00002af8d4b15000       4 r---- 0000000000009000 008:00002 libnss_files-2.5.so
00002af8d4b16000       4 rw--- 000000000000a000 008:00002 libnss_files-2.5.so
00002af8d4b17000  768000 rw-s- 0000000000000000 000:00009 zero (deleted)
00007fffc95fe000      84 rw--- 00007ffffffea000 000:00000   [ stack ]
ffffffffff600000    8192 ----- 0000000000000000 000:00000   [ anon ]
mapped: 933712K    writeable/private: 4304K    shared: 768000K
The last line is very important:
- mapped: 933712K total amount of memory mapped to files
- writeable/private: 4304K the amount of private address space
- shared: 768000K the amount of address space this process is sharing with others
=> Related: Linux find the memory used by a program / process using pmap command
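Since the summary line is the part that matters most, it can be convenient to extract it as plain numbers, for example to log the writeable/private figure over time. Here is a minimal sketch. The helper name pmap_summary is an assumption, and it expects the summary line format shown in the sample output above.

```shell
# pmap_summary: read `pmap -d PID` output on stdin and print each field of
# the final summary line as "name value-in-KB", one per line.
# (pmap_summary is a hypothetical helper name.)
pmap_summary() {
    awk '$1 == "mapped:" {
        for (i = 1; i < NF; i += 2) {
            key = $i;       sub(/:$/, "", key)   # drop the trailing colon
            val = $(i + 1); sub(/K$/, "", val)   # drop the K suffix
            print key, val
        }
    }'
}

# Demo: the summary line from the sample output.
printf '%s\n' 'mapped: 933712K    writeable/private: 4304K    shared: 768000K' | pmap_summary
# prints:
#   mapped 933712
#   writeable/private 4304
#   shared 768000
```

On a live system: pmap -d 47394 | pmap_summary.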
#11 and #12: netstat and ss - Network Statistics
The netstat command displays network connections, routing tables, interface statistics, masquerade connections, and multicast memberships. The ss command is used to dump socket statistics and can show information similar to netstat. See the following resources about the ss and netstat commands:
- ss: Display Linux TCP / UDP Network and Socket Information
- Get Detailed Information About Particular IP address Connections Using netstat Command
#13: iptraf - Real-time Network Statistics
The iptraf command is an interactive, colorful IP LAN monitor. It is an ncurses-based tool that generates various network statistics including TCP info, UDP counts, ICMP and OSPF information, Ethernet load info, node stats, IP checksum errors, and others. It can provide the following information in an easy-to-read format:
- Network traffic statistics by TCP connection
- IP traffic statistics by network interface
- Network traffic statistics by protocol
- Network traffic statistics by TCP/UDP port and by packet size
- Network traffic statistics by Layer2 address
#14: tcpdump - Detailed Network Traffic Analysis
tcpdump is a simple command that dumps traffic on a network. However, you need a good understanding of the TCP/IP protocols to use this tool. For example, to display traffic info about DNS, enter:
# tcpdump -i eth1 'udp port 53'
To display all IPv4 HTTP packets to and from port 80 that contain data (that is, not SYN, FIN, or ACK-only packets), enter:
# tcpdump 'tcp port 80 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)'
To display all FTP sessions to 202.54.1.5, enter:
# tcpdump -i eth1 'dst 202.54.1.5 and (port 21 or 20)'
To display all HTTP sessions to 192.168.1.5, enter:
# tcpdump -ni eth0 'dst 192.168.1.5 and tcp and port http'
To capture packets to a file that can be examined in detail with wireshark, enter:
# tcpdump -n -i eth1 -s 0 -w output.txt src or dst port 80
#15: strace - System Calls
strace traces system calls and signals. This is useful for debugging web server and other server problems. See how to use strace to trace a process and see what it is doing.
#16: /proc File System - Various Kernel Statistics
The /proc file system provides detailed information about various hardware devices and other Linux kernel information. See the Linux kernel /proc documentation for further details. Common /proc examples:
# cat /proc/cpuinfo
# cat /proc/meminfo
# cat /proc/zoneinfo
# cat /proc/mounts
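Files under /proc are plain text, so single values are easy to pull out in scripts. The sketch below reads one field from /proc/meminfo. The helper name meminfo_kb and its optional file argument are assumptions added for illustration.

```shell
# meminfo_kb FIELD [FILE]: print the value (in kB) of one field from a file
# in /proc/meminfo format. FILE defaults to /proc/meminfo.
# (meminfo_kb is a hypothetical helper name.)
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Demo: only runs where /proc/meminfo is readable (i.e. on Linux).
if [ -r /proc/meminfo ]; then
    echo "MemTotal: $(meminfo_kb MemTotal) kB"
    echo "SwapFree: $(meminfo_kb SwapFree) kB"
fi
```

The same pattern works for other key: value files under /proc, as long as the field name is matched exactly.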
#17: Nagios - Server And Network Monitoring
Nagios is a popular open source computer system and network monitoring application. You can easily monitor all your hosts, network equipment, and services. It can send an alert when things go wrong and again when they get better. FAN is "Fully Automated Nagios". FAN's goal is to provide a Nagios installation that includes most tools provided by the Nagios community. FAN provides a CD-ROM image in the standard ISO format, making it easy to install a Nagios server. In addition, a wide range of tools is included in the distribution to improve the user experience around Nagios.
#18: Cacti - Web-based Monitoring Tool
Cacti is a complete network graphing solution designed to harness the power of RRDTool's data storage and graphing functionality. Cacti provides a fast poller, advanced graph templating, multiple data acquisition methods, and user management features out of the box. All of this is wrapped in an intuitive, easy to use interface that makes sense for LAN-sized installations up to complex networks with hundreds of devices. It can provide data about network, CPU, memory, logged in users, Apache, DNS servers and much more. See how to install and configure Cacti network graphing tool under CentOS / RHEL.
#19: KDE System Guard - Real-time Systems Reporting and Graphing
KSysguard is a network-enabled task and system monitor application for the KDE desktop. This tool can be run over an ssh session. It provides lots of features, such as a client/server architecture that enables monitoring of local and remote hosts. The graphical front end uses so-called sensors to retrieve the information it displays. A sensor can return simple values or more complex information like tables. For each type of information, one or more displays are provided. Displays are organized in worksheets that can be saved and loaded independently from each other. So, KSysguard is not only a simple task manager but also a very powerful tool to control large server farms.
See the KSysguard handbook for detailed usage.
#20: Gnome System Monitor - Real-time Systems Reporting and Graphing
The System Monitor application enables you to display basic system information and monitor system processes, usage of system resources, and file systems. You can also use System Monitor to modify the behavior of your system. Although not as powerful as the KDE System Guard, it provides the basic information which may be useful for new users:
- Displays various basic information about the computer's hardware and software.
- Linux Kernel version
- GNOME version
- Hardware
- Installed memory
- Processors and speeds
- System Status
- Currently available disk space
- Processes
- Memory and swap space
- Network usage
- File Systems
- Lists all mounted filesystems along with basic information about each.