Monitoring Java programs with Prometheus

See the links below for reference. Agree with the developers on how the JMX port will be exposed.

https://www.jianshu.com/p/8a5e681b18ce or http://www.mamicode.com/info-detail-2323750.html

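The two links above describe different ways of exposing JMX, but in practice they are much the same. Coordinate with the developers first, so that the change does not keep the program from starting.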

Download the jmx_exporter JAR:

https://repo1.maven.org/maven2/io/prometheus/jmx/jmx_prometheus_javaagent/0.3.1/jmx_prometheus_javaagent-0.3.1.jar
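
A minimal sketch of fetching that JAR from the command line, assuming curl is available. Only the URL comes from the text above; the function names and the default target directory /opt/jmx_exporter are hypothetical.

```shell
#!/bin/bash
# Version and repository layout are taken from the URL above.
JMX_AGENT_VERSION=0.3.1
MAVEN_BASE=https://repo1.maven.org/maven2/io/prometheus/jmx/jmx_prometheus_javaagent

# Print the download URL for a given agent version.
jmx_agent_url() {
    local version=$1
    echo "$MAVEN_BASE/$version/jmx_prometheus_javaagent-$version.jar"
}

# Download the agent JAR into a target directory
# (hypothetical default: /opt/jmx_exporter).
download_jmx_agent() {
    local dest_dir=${1:-/opt/jmx_exporter}
    mkdir -p "$dest_dir" || return 1
    # -f: fail on HTTP errors, -L: follow redirects, -O: keep the remote file name
    (cd "$dest_dir" && curl -fLO "$(jmx_agent_url "$JMX_AGENT_VERSION")")
}
```

Calling `download_jmx_agent /opt/jmx_exporter` would place the JAR in that directory. On a host without internet access, the same function can be run on another machine and the JAR copied over.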

Create a configuration file, namenode.yaml (or datanode.yaml), in any location. Its contents define the metrics you want to collect.

Reference configuration:

---
startDelaySeconds: 0
hostPort: master:1234 # master is this machine's IP (localhost usually works); 1234 is the JMX port you want to use (any unused port)
#jmxUrl: service:jmx:rmi:///jndi/rmi://127.0.0.1:1234/jmxrmi
ssl: false
lowercaseOutputName: false
lowercaseOutputLabelNames: false

 

Other parameters, for reference:

Name                       Description
startDelaySeconds          Start delay before serving requests. Any requests within the delay period will result in an empty metrics set.
hostPort                   The host and port to connect to via remote JMX. If neither this nor jmxUrl is specified, will talk to the local JVM.
username                   The username to be used in remote JMX password authentication.
password                   The password to be used in remote JMX password authentication.
jmxUrl                     A full JMX URL to connect to. Should not be specified if hostPort is.
ssl                        Whether JMX connection should be done over SSL. To configure certificates you have to set the following system properties:
                           -Djavax.net.ssl.keyStore=/home/user/.keystore
                           -Djavax.net.ssl.keyStorePassword=changeit
                           -Djavax.net.ssl.trustStore=/home/user/.truststore
                           -Djavax.net.ssl.trustStorePassword=changeit
lowercaseOutputName        Lowercase the output metric name. Applies to default format and name. Defaults to false.
lowercaseOutputLabelNames  Lowercase the output metric label names. Applies to default format and labels. Defaults to false.
whitelistObjectNames       A list of ObjectNames to query. Defaults to all mBeans.
blacklistObjectNames       A list of ObjectNames to not query. Takes precedence over whitelistObjectNames. Defaults to none.
rules                      A list of rules to apply in order, processing stops at the first matching rule. Attributes that aren't matched aren't collected. If not specified, defaults to collecting everything in the default format.
pattern                    Regex pattern to match against each bean attribute. The pattern is not anchored. Capture groups can be used in other options. Defaults to matching everything.
attrNameSnakeCase          Converts the attribute name to snake case. This is seen in the names matched by the pattern and the default format. For example, anAttrName to an_attr_name. Defaults to false.
name                       The metric name to set. Capture groups from the pattern can be used. If not specified, the default format will be used. If it evaluates to empty, processing of this attribute stops with no output.
value                      Value for the metric. Static values and capture groups from the pattern can be used. If not specified the scraped mBean value will be used.
valueFactor                Optional number that value (or the scraped mBean value if value is not specified) is multiplied by, mainly used to convert mBean values from milliseconds to seconds.
labels                     A map of label name to label value pairs. Capture groups from pattern can be used in each. name must be set to use this. Empty names and values are ignored. If not specified and the default format is not being used, no labels are set.
help                       Help text for the metric. Capture groups from pattern can be used. name must be set to use this. Defaults to the mBean attribute description and the full name of the attribute.
type                       The type of the metric, can be GAUGE, COUNTER or UNTYPED. name must be set to use this. Defaults to UNTYPED.
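
To make the rules options more concrete, here is a hedged sketch that writes a small example configuration from the shell. The function name, the java.lang Memory pattern, the metric name jvm_heap_memory_bytes and the label name kind are illustrative assumptions; they are not taken from the program monitored in this post.

```shell
#!/bin/bash
# Write an example jmx_exporter config with one whitelist entry and one rule.
# The quoted heredoc delimiter ('EOF') keeps "$1" literal, so it reaches
# jmx_exporter as a reference to the first capture group of the pattern.
write_example_rules() {
    local target=$1
    cat > "$target" <<'EOF'
---
startDelaySeconds: 0
ssl: false
lowercaseOutputName: false
lowercaseOutputLabelNames: false
whitelistObjectNames: ["java.lang:type=Memory"]
rules:
  - pattern: 'java.lang<type=Memory><HeapMemoryUsage>(\w+)'
    name: jvm_heap_memory_bytes
    labels:
      kind: "$1"
    help: "JVM heap memory ($1)"
    type: GAUGE
EOF
}
```

For example, `write_example_rules /tmp/example-rules.yaml` produces a file in which the heap attributes (such as used or max) would be exported as one gauge with the attribute name carried in the kind label. Attributes that match no rule are not collected, so further rules have to be added for every metric that is wanted.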

Prerequisites:

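1. Directory structure of the Java program (the environment is offline, so the tree command could not be used to print it; a screenshot is used instead)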

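As the screenshot shows, each server runs one Java program made up of several subprograms: manager (the management side), server, client, PRM, and Utilities. They relate as follows: the agent and the server fetch their parameters from the manager, and the server then sends packets to the agent. At that point the program is up and running.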

The focus here is on monitoring the server and the client.

The screenshot below shows the directory structure of the manager, the server, and one of the clients.

Each conf directory contains a namenode.conf with the following contents:

startDelaySeconds: 10
jmxUrl: service:jmx:rmi:///jndi/rmi://127.0.0.1:${JMX_PORT}/${APP_NAME}
ssl: false
lowercaseOutputName: false
lowercaseOutputLabelNames: false

Each conf directory also contains a namenode.yaml with the following contents (this one is from the manager's conf directory):

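Each subprogram's bin directory holds its startup scripts. Script 1 is run first to start the program. Script 2 makes the program open a JMX port for Prometheus, so that the jmx_exporter agent can find that port and collect the Java program's metrics.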

The screenshot below shows the part of npds-manager.sh that opens the JMX port; the complete script is at the end of this post.

Once the Java program has been started this way, the manager's metrics can be reached at the host's IP on the custom port 9210.
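
A quick way to check this from the shell is sketched below. It assumes curl is installed and that the agent serves its metrics under the conventional /metrics path; the helper names are hypothetical.

```shell
#!/bin/bash
# Build the URL of the exporter endpoint; the port defaults to 9210,
# the manager's port used in this post.
metrics_url() {
    local host=$1 port=${2:-9210}
    echo "http://$host:$port/metrics"
}

# Read Prometheus exposition text on stdin and count the samples whose
# metric name starts with the given prefix (comment lines are skipped).
count_metrics() {
    local prefix=$1
    grep -v '^#' | grep -c "^$prefix"
}
```

With the manager running, `curl -s "$(metrics_url 127.0.0.1)" | count_metrics jvm_` should print a number greater than zero if the agent is exporting its JVM metrics; an empty or refused response usually means the agent did not load or the port differs.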

 

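Next, ask the developers which parameters they want monitored. In my case, the requested parameters are shown in the screenshots below: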

 

 

 

Configure these one by one, and monitoring is in place.

vim npds-manager.sh

#!/bin/bash

# Abort early when JAVA_HOME is not set
if [ -z "$JAVA_HOME" ]
then
    echo "=================================================================="
    echo "ERROR: Please set the JAVA_HOME variable in your environment!"
    echo "=================================================================="
    exit 20
fi

cd=$(pwd)

if [ ${cd:0-3} != "bin" ]
then
    echo "=================================================================="
    echo "ERROR: Please execute the script in the bin directory!"
    echo "=================================================================="
    exit 21
fi

APP_HOME=${cd%/bin}

APP_NAME=NPDS-Manager

APP_MAIN_CLASS=cn.com.greattimes.npds.manager.boot.ManagerBootstrap

JPS_FLAG=ManagerBootstrap

LIBRARY_PATH=$APP_HOME/jni


JAVA_OPTS="-server
-Xmx1G
-Xms1G"

DEBUG=0

RET_CODE=0

FORCE_KILL=0

CHECK_STATUS=1

CHECK_STATUS_TIMEOUT=5

check_env() {
    # Name of the installation directory (last path component of APP_HOME)
    local DIR=${APP_HOME##*/}

    if [ "$DIR" != "$APP_NAME" ]
    then
        echo "=================================================================="
        exit 22
    fi

    if [ ! -d "$APP_HOME/logs" ]
    then
        mkdir $APP_HOME/logs
    fi

    if [ $DEBUG -eq 1 ]
    then
        local DBG_PORT=9310
    fi
}

check_prometheus() {
    if [ -e "$APP_HOME/conf/namenode.conf" ]
    then
        local JMX_PORT=15914
    fi

    if [ -e "$APP_HOME/conf/namenode.yaml" ]
    then
        local WEB_PORT=9210
    fi
}

check() {
    PID=0
    local TMP=$(jps -v | grep $JPS_FLAG | grep $APP_HOME)
    TMP=${TMP%% *}

    if [ ${#TMP} -ne 0 ]
    then
        PID=$TMP
    fi

    if [ $PID -ne 0 ]
    then
        RUNNING=1
    else
        RUNNING=0
    fi
}

print() {
    if [ "${QUIET_MODE:-0}" -eq 1 ]
    then
        return
    fi

    echo "$1" "$2"
}

# Force-kill the process with the given PID if it still exists
kill_pid() {
    if [ ! -n "$1" ]
    then
        return 2
    fi

    ps -p $1 > /dev/null

    if [ $? -eq 0 ]
    then
        kill -9 $1
    fi
}

start() {
    check

    if [ $RUNNING -eq 1 ]
    then
        print "=================================================================="
        print "WARN: $APP_NAME(PID=$PID) is running."
        print "=================================================================="
        return 1
    fi

    info

    if [ $CHECK_STATUS -eq 0 ]
    then
        print -n "..."
        print
        print "------------------------------------------------------------------"
        print "WARN: $APP_NAME is started without checking status."
        print
        return 1
    fi

    local n=0

    for ((; n<$CHECK_STATUS_TIMEOUT; n++))
    do
        if [ $n -lt 3 ]
        then
            sleep 1
            print -n "."
        else
            check

            if [ $RUNNING -eq 1 ]
            then
                print
                print "Done!"
                print
                return 0
            else
                sleep 1
            fi
        fi
    done

    print
    print "------------------------------------------------------------------"
    print "ERROR: $APP_NAME failed to start for unknown reasons. "
    print "       Please refer to the log file for details."
    print
    return 2
}

stop() {
    check

    if [ $RUNNING -eq 0 ]
    then
        print "=================================================================="
        print "WARN: $APP_NAME is stopped."
        print "=================================================================="
        return 1
    fi

    info
    print -n "Stopping $APP_NAME"

    if [ $FORCE_KILL -eq 1 ]
    then
        print -n "..."
        print

        kill_pid $PID

        print "------------------------------------------------------------------"
        print "WARN: $APP_NAME was killed by force."
        print
        return 1
    fi

    if [ $CHECK_STATUS -eq 0 ]
    then
        print -n "..."
        print
        print "------------------------------------------------------------------"
        print "WARN: $APP_NAME is stopped without checking status."
        print
        return 1
    fi

    local n=0

    for ((; n<$CHECK_STATUS_TIMEOUT; n++))
    do
        if [ $n -lt 3 ]
        then
            sleep 1
            print -n "."
        else
            check

            if [ $RUNNING -eq 0 ]
            then
                print
                print "Done!"
                print
                return 0
            else
                sleep 1
            fi
        fi
    done

    kill_pid $PID

    print
    print "------------------------------------------------------------------"
    print "WARN: $APP_NAME was killed for stopping timeout."
    print
    return 1
}

restart() {
    check

    if [ $RUNNING -eq 1 ]
    then
        if [ $CHECK_STATUS -eq 0 ]
        then
            FORCE_KILL=1
        fi
        stop
    fi

    start
}

status() {
    check

    if [ $RUNNING -eq 0 ]
    then
        print "=================================================================="
        print "INFO: $APP_NAME is stopped."
        print "=================================================================="
        return 0
    else
        print "=================================================================="
        print "INFO: $APP_NAME(PID=$PID) is running."
        print "=================================================================="
        return 1
    fi
}

info() {
    if [ "${QUIET_MODE:-0}" -eq 1 ]
    then
        return
    fi

    echo "=================================================================="
    echo "$APP_NAME Information:"
    echo "------------------------------------------------------------------"
    echo "JAVA_HOME=$JAVA_HOME"
    echo "CLASS_PATH=$CLASS_PATH"
    echo
    # java -version writes to stderr, so redirect it to stdout
    "$JAVA_HOME/bin/java" -version 2>&1
    echo "APP_HOME=$APP_HOME"
    echo "APP_MAIN_CLASS=$APP_MAIN_CLASS"
    echo "=================================================================="
}

check_env

if [ $# -gt 1 ]
then
    for v in $@
    do
        if [ "$v" == "$1" ]
        then
            continue
        fi
        case "$v" in
            '--quiet')
                QUIET_MODE=1
                ;;
            '--nocheck')
                CHECK_STATUS=0
                ;;
            '--force')
                FORCE_KILL=1
                ;;
            *)
                ;;
        esac
    done
fi

case "$1" in
    'start')
        check_prometheus
        start
        ;;
    'stop')
        check_prometheus
        stop
        ;;
    'restart')
        check_prometheus
        restart
        ;;
    'status')
        status
        ;;
    'info')
        info
        ;;
    *)
        echo "=================================================================="
        echo "Usage: $0 <commands> [options]"
        echo "------------------------------------------------------------------"
        echo "where commands include:"
        echo
        echo -e "      start [options]"
        echo -e "            --quiet\twithout console output"
        echo -e "            --nocheck\twithout checking status"
        echo
        echo -e "       stop [options]"
        echo -e "            --quiet\twithout console output"
        echo -e "            --nocheck\twithout checking status"
        echo -e "            --force\tforce-kill"
        echo
        echo -e "    restart [options]"
        echo -e "            --quiet\twithout console output"
        echo -e "            --nocheck\twithout checking status"
        echo -e "            --force\tstart after force-kill"
        echo
        echo -e "     status \t\tdisplay running status"
        echo
        echo -e "       info \t\tdisplay environment information"
        echo "=================================================================="
        ;;
esac

RET_CODE=$?

exit $RET_CODE
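
In the listing above, check_prometheus only sets the two port variables; the lines that hand them to the JVM are not shown. The following is a hedged sketch of what such a function might do, not the original script's code. It assumes the agent JAR sits in a hypothetical $APP_HOME/lib directory (overridable through a JMX_AGENT_JAR variable, also hypothetical), and it appends the standard JVM remote-JMX flags and the jmx_exporter -javaagent option to JAVA_OPTS.

```shell
#!/bin/bash
check_prometheus() {
    # Hypothetical location of the agent JAR; adjust to where it was downloaded.
    local agent_jar=${JMX_AGENT_JAR:-$APP_HOME/lib/jmx_prometheus_javaagent-0.3.1.jar}

    if [ -e "$APP_HOME/conf/namenode.conf" ]
    then
        local JMX_PORT=15914
        # Standard JVM flags that open a remote JMX port.
        # Assumption: no authentication and no SSL, as in the sample configs.
        JAVA_OPTS="$JAVA_OPTS
-Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote.port=$JMX_PORT
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.ssl=false"
    fi

    if [ -e "$APP_HOME/conf/namenode.yaml" ]
    then
        local WEB_PORT=9210
        # jmx_exporter agent syntax: -javaagent:<jar>=<http port>:<config file>
        JAVA_OPTS="$JAVA_OPTS
-javaagent:$agent_jar=$WEB_PORT:$APP_HOME/conf/namenode.yaml"
    fi
}
```

For these options to take effect, the start function would have to pass $JAVA_OPTS to the java command that launches $APP_MAIN_CLASS; that command is likewise not part of the listing. An unauthenticated JMX port should only be opened on a trusted network.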
 

 
