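See the links below for reference. Agree with the developers on how the JMX port will be exposed.
https://www.jianshu.com/p/8a5e681b18ce or http://www.mamicode.com/info-detail-2323750.html
The two links above describe different ways of exposing JMX, but in practice they are much the same. Coordinate with the developers so that the change does not stop the program from starting.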
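One common way to expose JMX is through the JVM's standard `com.sun.management.jmxremote` system properties. The following is a minimal sketch, assuming an unauthenticated, non-SSL port on a trusted network. The helper name `jmx_remote_opts` and the port 1234 are illustrative and are not taken from the linked articles.

```shell
#!/bin/bash
# Build the JVM flags that open a remote JMX port (hypothetical helper).
# Authentication and SSL are disabled here for brevity; that is an assumption
# and only reasonable on a trusted network.
jmx_remote_opts() {
    local port=$1
    echo "-Dcom.sun.management.jmxremote" \
         "-Dcom.sun.management.jmxremote.port=${port}" \
         "-Dcom.sun.management.jmxremote.authenticate=false" \
         "-Dcom.sun.management.jmxremote.ssl=false"
}

# Example: append the flags to the JVM options used by a start script.
JAVA_OPTS="-server -Xmx1G -Xms1G $(jmx_remote_opts 1234)"
echo "$JAVA_OPTS"
```

Whatever mechanism is chosen, the port has to match the one the exporter config points at.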
Download the jmx_exporter jar.
Create a config file namenode.yaml (datanode.yaml) in any location. Its content defines the metrics you want.
Reference configuration:
---
startDelaySeconds: 0
hostPort: master:1234  # master is this machine's IP (localhost usually works); 1234 is the JMX port you want (any unused port)
#jmxUrl: service:jmx:rmi:///jndi/rmi://127.0.0.1:1234/jmxrmi
ssl: false
lowercaseOutputName: false
lowercaseOutputLabelNames: false
Other parameters, for reference:
Name | Description |
---|---|
startDelaySeconds | Start delay before serving requests. Any requests within the delay period will result in an empty metrics set. |
hostPort | The host and port to connect to via remote JMX. If neither this nor jmxUrl is specified, will talk to the local JVM. |
username | The username to be used in remote JMX password authentication. |
password | The password to be used in remote JMX password authentication. |
jmxUrl | A full JMX URL to connect to. Should not be specified if hostPort is. |
ssl | Whether JMX connection should be done over SSL. To configure certificates you have to set the following system properties: -Djavax.net.ssl.keyStore=/home/user/.keystore -Djavax.net.ssl.keyStorePassword=changeit -Djavax.net.ssl.trustStore=/home/user/.truststore -Djavax.net.ssl.trustStorePassword=changeit |
lowercaseOutputName | Lowercase the output metric name. Applies to default format and name. Defaults to false. |
lowercaseOutputLabelNames | Lowercase the output metric label names. Applies to default format and labels. Defaults to false. |
whitelistObjectNames | A list of ObjectNames to query. Defaults to all mBeans. |
blacklistObjectNames | A list of ObjectNames to not query. Takes precedence over whitelistObjectNames. Defaults to none. |
rules | A list of rules to apply in order, processing stops at the first matching rule. Attributes that aren't matched aren't collected. If not specified, defaults to collecting everything in the default format. |
pattern | Regex pattern to match against each bean attribute. The pattern is not anchored. Capture groups can be used in other options. Defaults to matching everything. |
attrNameSnakeCase | Converts the attribute name to snake case. This is seen in the names matched by the pattern and the default format. For example, anAttrName to an_attr_name. Defaults to false. |
name | The metric name to set. Capture groups from the pattern can be used. If not specified, the default format will be used. If it evaluates to empty, processing of this attribute stops with no output. |
value | Value for the metric. Static values and capture groups from the pattern can be used. If not specified the scraped mBean value will be used. |
valueFactor | Optional number that value (or the scraped mBean value if value is not specified) is multiplied by, mainly used to convert mBean values from milliseconds to seconds. |
labels | A map of label name to label value pairs. Capture groups from pattern can be used in each. name must be set to use this. Empty names and values are ignored. If not specified and the default format is not being used, no labels are set. |
help | Help text for the metric. Capture groups from pattern can be used. name must be set to use this. Defaults to the mBean attribute description and the full name of the attribute. |
type | The type of the metric, can be GAUGE, COUNTER or UNTYPED. name must be set to use this. Defaults to UNTYPED. |
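To show how `whitelistObjectNames`, `rules`, `pattern`, `name` and `type` fit together, here is a small sketch that writes an example config. It assumes the JVM's standard `java.lang:type=Memory` MBean. The helper name `write_example_config`, the metric name `jvm_heap_used_bytes` and the port are illustrative choices, not values from this setup.

```shell
#!/bin/bash
# Write a small example exporter config to the given path (hypothetical helper).
write_example_config() {
    local target=$1
    cat > "$target" <<'EOF'
---
startDelaySeconds: 0
hostPort: localhost:1234
ssl: false
lowercaseOutputName: true
whitelistObjectNames: ["java.lang:type=Memory"]
rules:
  # Export only the "used" field of the HeapMemoryUsage composite attribute.
  - pattern: 'java.lang<type=Memory><HeapMemoryUsage>used'
    name: jvm_heap_used_bytes
    type: GAUGE
    help: Heap memory currently in use
EOF
}

# Example: write the config to a temporary file and print it.
conf=$(mktemp)
write_example_config "$conf"
cat "$conf"
```

Because processing stops at the first matching rule and unmatched attributes are not collected, a config like this exports a single metric. Add one rule per metric the developers ask for.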
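Prerequisites:
1. Directory structure of the Java program (the environment is offline, so the tree command could not be used; screenshots are used instead)
As the screenshot above shows, each server runs one Java program made up of several subprograms: manager (the management side), server, client, PRM and Utilities. The agent and the server fetch their parameters from the manager, and the server then sends packets to the agent. At that point the program is up and running.
Here we mainly monitor server and client.
The screenshot below shows the directory structure of manager, server and one of the clients.
Each conf directory contains a namenode.conf with the following content: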
startDelaySeconds: 10
jmxUrl: service:jmx:rmi:///jndi/rmi://127.0.0.1:${JMX_PORT}/${APP_NAME}
ssl: false
lowercaseOutputName: false
lowercaseOutputLabelNames: false
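The `${JMX_PORT}` and `${APP_NAME}` placeholders have to be replaced with real values before the exporter reads the file. The part of the start script that does this is not shown here, so the following is only a sketch of one way to do it with `sed`. The helper name `render_conf` is hypothetical.

```shell
#!/bin/bash
# Replace ${JMX_PORT} and ${APP_NAME} placeholders in a template read from stdin.
# Hypothetical helper: the original script's substitution step is not shown.
render_conf() {
    local jmx_port=$1 app_name=$2
    sed -e 's|\${JMX_PORT}|'"${jmx_port}"'|g' \
        -e 's|\${APP_NAME}|'"${app_name}"'|g'
}

# Example with the values used by npds-manager.sh.
echo 'jmxUrl: service:jmx:rmi:///jndi/rmi://127.0.0.1:${JMX_PORT}/${APP_NAME}' \
    | render_conf 15914 NPDS-Manager
```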
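Each conf directory also contains a namenode.yaml with the following content (this one is from the manager's conf directory):
The bin directory of each subprogram holds that program's start scripts. Run script 1 first to start the program. Script 2 then makes the program open a Prometheus JMX port so that the jmx exporter probe can find it and collect the Java program's metrics, which is what makes monitoring possible.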
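For orientation, this is roughly how a standalone exporter process is launched: the jar takes a listening port and a config file as arguments. It is a sketch that assumes the standalone HTTP server variant of jmx_exporter. The jar path in `EXPORTER_JAR` and the helper name `exporter_cmd` are assumptions, not taken from the real scripts.

```shell
#!/bin/bash
# Build the command line for a standalone exporter (hypothetical helper).
# EXPORTER_JAR is an assumed path; point it at wherever the downloaded jar lives.
EXPORTER_JAR=${EXPORTER_JAR:-/opt/jmx_exporter/jmx_prometheus_httpserver.jar}

exporter_cmd() {
    local web_port=$1 config=$2
    echo "$JAVA_HOME/bin/java -jar $EXPORTER_JAR $web_port $config"
}

# Print the command instead of running it, so it can be reviewed first.
exporter_cmd 9210 "$APP_HOME/conf/namenode.yaml"
```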
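The screenshot below shows the part of npds-manager.sh that opens the JMX port. The full script is at the bottom of this post.
Once the Java program has been started this way, the manager's metrics are available at the host's IP on the custom port 9210, and the program can be monitored.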
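A quick check from the command line can confirm that the endpoint is serving data. The sketch below filters a Prometheus text response down to its sample lines. The helper name `metric_samples` and the address in the commented `curl` line are assumptions.

```shell
#!/bin/bash
# Keep only sample lines from a Prometheus text exposition read on stdin,
# dropping "# HELP" / "# TYPE" comment lines and blank lines.
metric_samples() {
    grep -v -e '^#' -e '^$'
}

# Usage against a live exporter (the address is an assumption):
#   curl -s http://127.0.0.1:9210/metrics | metric_samples | head
printf '# HELP jvm_threads help text\njvm_threads 42\n' | metric_samples
```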
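Next, ask the developers which parameters they want monitored. In my case the requested parameters are shown in the screenshot below:
Configure them one by one and monitoring is in place.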
vim npds-manager.sh
#!/bin/bash
if [ -z "$JAVA_HOME" ]
then
echo "=================================================================="
echo "ERROR: Please set the JAVA_HOME variable in your environment!"
echo "=================================================================="
exit 20
fi

cd=$(pwd)
if [ ${cd:0-3} != "bin" ]
then
echo "=================================================================="
echo "ERROR: Please execute the script in the bin directory!"
echo "=================================================================="
exit 21
fi

APP_HOME=${cd%/bin}
APP_NAME=NPDS-Manager
APP_MAIN_CLASS=cn.com.greattimes.npds.manager.boot.ManagerBootstrap
JPS_FLAG=ManagerBootstrap
LIBRARY_PATH=$APP_HOME/jni
JAVA_OPTS="-server
-Xmx1G
-Xms1G"

DEBUG=0
RET_CODE=0
FORCE_KILL=0
CHECK_STATUS=1
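CHECK_STATUS_TIMEOUT=5
# Default for the --quiet option, so the -eq tests in print() and info() never see an empty value
QUIET_MODE=0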
check_env() {
local DIR=${APP_HOME:POS+1}

if [ ! $DIR == $APP_NAME ]
then
echo "=================================================================="
exit 22
fi

if [ ! -d "$APP_HOME/logs" ]
then
mkdir $APP_HOME/logs
fi

if [ $DEBUG -eq 1 ]
then
local DBG_PORT=9310
fi
}

check_prometheus() {
if [ -e "$APP_HOME/conf/namenode.conf" ]
then
local JMX_PORT=15914
fi

if [ -e "$APP_HOME/conf/namenode.yaml" ]
then
local WEB_PORT=9210
fi
}

check() {
PID=0
local TMP=$(jps -v | grep $JPS_FLAG | grep $APP_HOME)
TMP=${TMP%% *}

if [ ${#TMP} -ne 0 ]
then
PID=$TMP
fi

if [ $PID -ne 0 ]
then
RUNNING=1
else
RUNNING=0
fi
}

print() {
if [ $QUIET_MODE -eq 1 ]
then
return
fi

echo "$1" "$2"
}

kill_pid() {
if [ ! -n "$1" ]
then
return 2
fi

ps -p $1 > /dev/null
if [ $? -eq 0 ]
then
kill -9 $1
fi
}

start() {
check

if [ $RUNNING -eq 1 ]
then
print "=================================================================="
print "WARN: $APP_NAME(PID=$PID) is running."
print "=================================================================="
return 1
fi

info
if [ $CHECK_STATUS -eq 0 ]
then
print -n "..."
print "------------------------------------------------------------------"
print "WARN: $APP_NAME is started without checking status."
return 1
fi

local n=0
for ((; n<$CHECK_STATUS_TIMEOUT; n++))
do
if [ $n -lt 3 ]
then
sleep 1
print -n "."
else
check
if [ $RUNNING -eq 1 ]
then
print "Done!"
return 0
else
sleep 1
fi
fi
done
print "------------------------------------------------------------------"
print "ERROR: $APP_NAME failed to start for unknown reasons. "
print " Please refer to the log file for details."
return 2
}

stop() {
check

if [ $RUNNING -eq 0 ]
then
print "=================================================================="
print "WARN: $APP_NAME is stopped."
print "=================================================================="
return 1
fi

info
print -n "Stopping $APP_NAME"

if [ $FORCE_KILL -eq 1 ]
then
print -n "..."
kill_pid $PID
print "------------------------------------------------------------------"
print "WARN: $APP_NAME was killed by force."
return 1
fi

if [ $CHECK_STATUS -eq 0 ]
then
print -n "..."
print "------------------------------------------------------------------"
print "WARN: $APP_NAME is stopped without checking status."
return 1
fi

local n=0
for ((; n<$CHECK_STATUS_TIMEOUT; n++))
do
if [ $n -lt 3 ]
then
sleep 1
print -n "."
else
check
if [ $RUNNING -eq 0 ]
then
print "Done!"
return 0
else
sleep 1
fi
fi
done

kill_pid $PID
print "------------------------------------------------------------------"
print "WARN: $APP_NAME was killed for stopping timeout."
return 1
}

restart() {
check

if [ $RUNNING -eq 1 ]
then
if [ $CHECK_STATUS -eq 0 ]
then
FORCE_KILL=1
fi
stop
fi

start
}

status() {
check

if [ $RUNNING -eq 0 ]
then
print "=================================================================="
print "INFO: $APP_NAME is stopped."
print "=================================================================="
return 0
else
print "=================================================================="
print "INFO: $APP_NAME(PID=$PID) is running."
print "=================================================================="
return 1
fi
}

info() {
if [ $QUIET_MODE -eq 1 ]
then
return
fi

echo "=================================================================="
echo "$APP_NAME Information:"
echo "------------------------------------------------------------------"
echo "JAVA_HOME=$JAVA_HOME"
echo "CLASS_PATH=$CLASS_PATH"
echo
echo `$JAVA_HOME/bin/java -version`
echo "APP_HOME=$APP_HOME"
echo "APP_MAIN_CLASS=$APP_MAIN_CLASS"
echo "=================================================================="
}

check_env
if [ $# -gt 1 ]
then
for v in $@
do
if [ "$v" == "$1" ]
then
continue
fi
case "$v" in
'--quiet')
QUIET_MODE=1
;;
'--nocheck')
CHECK_STATUS=0
;;
'--force')
FORCE_KILL=1
;;
*)
;;
esac
done
fi

case "$1" in
'start')
check_prometheus
start
;;
'stop')
check_prometheus
stop
;;
'restart')
check_prometheus
restart
;;
'status')
status
;;
'info')
info
;;
*)
echo "=================================================================="
echo "Usage: $0 <commands> [options]"
echo "------------------------------------------------------------------"
echo "where commands include:"
echo
echo -e " start [options]"
echo -e " --quiet\twithout console output"
echo -e " --nocheck\twithout checking status"
echo
echo -e " stop [options]"
echo -e " --quiet\twithout console output"
echo -e " --nocheck\twithout checking status"
echo -e " --force\tforce-kill"
echo
echo -e " restart [options]"
echo -e " --quiet\twithout console output"
echo -e " --nocheck\twithout checking status"
echo -e " --force\tstart after force-kill"
echo
echo -e " status \t\tdisplay running status"
echo
echo -e " info \t\tdisplay environment information"
echo "=================================================================="
;;
esac

RET_CODE=$?
exit $RET_CODE
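
The `check` function in the script finds the PID by stripping everything after the first space from a line of `jps -v` output. A minimal illustration of that parameter expansion follows. The helper name `pid_from_jps_line` and the sample line are made up for the example.

```shell
#!/bin/bash
# Extract the PID (the first space-separated field) from one line of `jps -v` output.
# ${line%% *} removes the longest suffix that starts at a space.
pid_from_jps_line() {
    local line=$1
    echo "${line%% *}"
}

# Example with a made-up jps line.
pid_from_jps_line "12345 ManagerBootstrap -Xmx1G"
```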