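Azkaban Source Code Walkthrough: The Web Server

I used to think of source code as something distant: hard, complicated, long-winded, and tedious to read. From today on I want to take reading source code seriously, because only by understanding good source code can you write something as good yourself.

I recently needed to build a page on top of the Azkaban service, so I had to read the Azkaban source code and its API documentation: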

https://azkaban.readthedocs.io/en/latest/ajaxApi.html#authenticate

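The source code is on GitHub; just search for it there.

Azkaban is built as a plain Java web application, that is, directly on servlets, without SSM (Spring, Spring MVC, MyBatis) or Spring Boot. That may look a little dated, but the authors still built an excellent scheduling framework with it.

Start in Azkaban's bin directory.

A freshly unpacked Azkaban web directory does not contain these two folders; they are created the first time start.sh is run.

The start script points to /internal/internal-start-web.sh.

That script in turn launches the AzkabanWebServer class, so go straight to its main method: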

 public static void main(final String[] args) throws Exception {
    // Redirect all std out and err messages into log4j
    StdOutErrRedirect.redirectOutAndErrToLog();

    logger.info("Starting Jetty Azkaban Web Server..."); // the familiar startup log line
    final Props props = AzkabanServer.loadProps(args); // load the properties files

    if (props == null) {
      logger.error("Azkaban Properties not loaded. Exiting..");
      System.exit(1);
    }

    /* Initialize Guice Injector */
    final Injector injector = Guice.createInjector(
        new AzkabanCommonModule(props),
        new AzkabanWebServerModule(props)
    );
    SERVICE_PROVIDER.setInjector(injector);

    launch(injector.getInstance(AzkabanWebServer.class)); // the important call
  }

Now look at the launch method:

 public static void launch(final AzkabanWebServer webServer) throws Exception {
    /* This creates the Web Server instance */
    app = webServer;

    webServer.executorManagerAdapter.start();

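    webServer.executionLogsCleaner.start(); // apparently log cleanup, a thread that keeps running

    // TODO refactor code into ServerProvider
    webServer.prepareAndStartServer(); // the important call

    Runtime.getRuntime().addShutdownHook(new Thread() {
      @Override
      public void run() {
        // ... logging done at shutdown, elided here because it is not important
      }
    });
  }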

Next, the prepareAndStartServer method:

private void prepareAndStartServer()
      throws Exception {
    validateDatabaseVersion();
    createThreadPool();
    configureRoutes(); // the important call

    if (this.props.getBoolean(Constants.ConfigurationKeys.IS_METRICS_ENABLED, false)) {
      startWebMetrics();
    }

    if (this.props.getBoolean(ConfigurationKeys.ENABLE_QUARTZ, false)) {
      // flowTriggerService needs to be started first before scheduler starts to schedule
      // existing flow triggers
      logger.info("starting flow trigger service");
      this.flowTriggerService.start(); // start trigger-based scheduling
      logger.info("starting flow trigger scheduler");
      this.scheduler.start(); // start time-based scheduling
    }

    try {
      this.server.start();
      logger.info("Server started");
    } catch (final Exception e) {
      logger.warn(e);
      Utils.croak(e.getMessage(), 1);
    }
  }

The configureRoutes method is long, so here is an excerpt:

private void configureRoutes() throws TriggerManagerException {
    final String staticDir =
        this.props.getString("web.resource.dir", DEFAULT_STATIC_DIR);
    logger.info("Setting up web resource dir " + staticDir);
    final Context root = new Context(this.server, "/", Context.SESSIONS);
    root.setMaxFormContentSize(MAX_FORM_CONTENT_SIZE);

    final String defaultServletPath =
        this.props.getString("azkaban.default.servlet.path", "/index");
    root.setResourceBase(staticDir);
    final ServletHolder indexRedirect =
        new ServletHolder(new IndexRedirectServlet(defaultServletPath));
    root.addServlet(indexRedirect, "/");
    final ServletHolder index = new ServletHolder(new ProjectServlet());
    root.addServlet(index, "/index");
    root.addServlet(new ServletHolder(new ProjectManagerServlet()), "/manager");
    root.addServlet(new ServletHolder(new ExecutorServlet()), "/executor");
    root.addServlet(new ServletHolder(new HistoryServlet()), "/history");
    root.addServlet(new ServletHolder(new ScheduleServlet()), "/schedule");

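A note here: root plays the role that web.xml played in classic Java web applications. Its main job is to map URL paths to servlets, i.e. to decide which xxxServlet handles a request such as ip:port/manager. The excerpt spells the mapping out; /index, for example, is handled by the ProjectServlet class.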
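To make the mapping concrete, the routes from the excerpt can be written out as a plain lookup table. This is only an illustrative sketch: RouteTableSketch and servletFor are hypothetical names, and falling back to IndexRedirectServlet for unknown paths is a simplification of how the "/" mapping behaves, not Azkaban code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RouteTableSketch {

    // Path-to-servlet pairs copied from the configureRoutes excerpt.
    private static final Map<String, String> ROUTES = new LinkedHashMap<>();

    static {
        ROUTES.put("/index", "ProjectServlet");
        ROUTES.put("/manager", "ProjectManagerServlet");
        ROUTES.put("/executor", "ExecutorServlet");
        ROUTES.put("/history", "HistoryServlet");
        ROUTES.put("/schedule", "ScheduleServlet");
    }

    /** Returns the servlet class name registered for the path (simplified fallback for "/"). */
    static String servletFor(String path) {
        return ROUTES.getOrDefault(path, "IndexRedirectServlet");
    }

    public static void main(String[] args) {
        for (String path : new String[] {"/index", "/manager", "/executor", "/"}) {
            System.out.println(path + " -> " + servletFor(path));
        }
    }
}
```

Knowing which servlet owns a path tells you which class to open when an API call behaves unexpectedly.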

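After logging in to Azkaban you land on /index, so the project list shown on that page for the azkaban user (the user name appears in the top-right corner), here file_to_hbase and test, must be fetched in the ProjectServlet class.

The request goes to ip:port/index as a GET with no extra parameters, so it takes the following path: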

  @Override
  protected void handleGet(final HttpServletRequest req, final HttpServletResponse resp,
      final Session session) throws ServletException, IOException {

    final ProjectManager manager =
        ((AzkabanWebServer) getApplication()).getProjectManager();

    if (hasParam(req, "ajax")) {
      handleAjaxAction(req, resp, session, manager);
    } else if (hasParam(req, "doaction")) {
      handleDoAction(req, resp, session);
    } else {
      handlePageRender(req, resp, session, manager); // no parameters: this branch runs
    }
  }

The handlePageRender method:

  private void handlePageRender(final HttpServletRequest req,
      final HttpServletResponse resp, final Session session, final ProjectManager manager) {
    final User user = session.getUser();

    final Page page =
        newPage(req, resp, session, "azkaban/webapp/servlet/velocity/index.vm");

    if (this.lockdownCreateProjects &&
        !UserUtils.hasPermissionforAction(this.userManager, user, Permission.Type.CREATEPROJECTS)) {
      page.add("hideCreateProject", true);
    }

    if (hasParam(req, "all")) {
      final List<Project> projects = manager.getProjects(); // all projects
      page.add("viewProjects", "all");
      page.add("projects", projects);
    } else if (hasParam(req, "group")) {
      final List<Project> projects = manager.getGroupProjects(user); // projects of the user's groups
      page.add("viewProjects", "group");
      page.add("projects", projects);
    } else {
      final List<Project> projects = manager.getUserProjects(user); // the user's own projects
      page.add("viewProjects", "personal");
      page.add("projects", projects);
    }

    page.render(); // render the page
  }
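
The two methods above boil down to two small decisions: which handler a GET on /index reaches, and which project list the page gets. The sketch below models just those decisions. It is not Azkaban code: IndexRequestSketch is a hypothetical class, and a plain Map stands in for the request parameters that hasParam inspects.

```java
import java.util.Collections;
import java.util.Map;

final class IndexRequestSketch {

    /** Mirrors the branch order in handleGet. */
    static String handlerFor(Map<String, String> params) {
        if (params.containsKey("ajax")) {
            return "handleAjaxAction";
        }
        if (params.containsKey("doaction")) {
            return "handleDoAction";
        }
        return "handlePageRender";
    }

    /** Mirrors the branch order in handlePageRender; the result is the "viewProjects" value. */
    static String projectViewFor(Map<String, String> params) {
        if (params.containsKey("all")) {
            return "all";
        }
        if (params.containsKey("group")) {
            return "group";
        }
        return "personal";
    }

    public static void main(String[] args) {
        Map<String, String> none = Collections.emptyMap();
        Map<String, String> all = Collections.singletonMap("all", "true");
        System.out.println(handlerFor(none) + " / " + projectViewFor(none));
        System.out.println(handlerFor(all) + " / " + projectViewFor(all));
    }
}
```

One consequence worth noting: a request to /index without the all parameter only reaches getUserProjects, so it lists the current user's projects rather than every project.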

That is how the web page we see gets rendered.

——————————————————————————————————————————————————————

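For my development on top of Azkaban, I currently need the following APIs.

1. Get all projects

This one is tricky: judging by the page, the endpoint simply returns an HTML document.

There are two ways to handle this: (1) modify the source and add a servlet that returns the projects as a JSON string, or (2) extract the project names you need from the returned HTML. The code below takes the second approach.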

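import com.alibaba.fastjson.JSON;
import com.alibaba.fastjson.JSONObject;
import org.springframework.http.HttpEntity;
import org.springframework.http.HttpHeaders;
import org.springframework.util.LinkedMultiValueMap;
import org.springframework.util.MultiValueMap;
import org.springframework.web.client.RestTemplate;

    // Fields used by the methods below; the URL is assumed to match the curl examples later on.
    private static final String AZKABAN_URL = "http://localhost:8081";
    private static final RestTemplate restTemplate = new RestTemplate();
    private static String SESSION_ID;

    /**
     * Login test: log in to the scheduler and remember the session id.
     */
    public static void loginTest() throws Exception {
        HttpHeaders hs = new HttpHeaders();
        // Login parameters are sent as a form-encoded POST body.
        LinkedMultiValueMap<String, String> linkedMultiValueMap = new LinkedMultiValueMap<String, String>();
        linkedMultiValueMap.add("action", "login");
        linkedMultiValueMap.add("username", "azkaban");
        linkedMultiValueMap.add("password", "azkaban");

        HttpEntity<MultiValueMap<String, String>> httpEntity = new HttpEntity<>(linkedMultiValueMap, hs);
        String result = restTemplate.postForObject(AZKABAN_URL, httpEntity, String.class);
        JSONObject jsonObject = JSON.parseObject(result);
        SESSION_ID = (String) jsonObject.get("session.id");
        System.out.println(SESSION_ID);
    }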

    /**
     * Show the names of all projects.
     */
    public static void showProject() {
        String result = restTemplate.getForObject(AZKABAN_URL + "/index?session.id="+SESSION_ID,  String.class);
        System.out.println(result);
        regex(result);
    }

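Azkaban tracks the login through the session.id key, so once you have it you can call the other URLs.

In the returned HTML, look for what the lines containing project names have in common: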
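
Since every call carries session.id as an ordinary request parameter, a small helper that assembles the query string keeps the calls uniform. The sketch below is a hypothetical helper (AzkabanUrlSketch, ajaxUrl and enc are made-up names), using only the JDK; the base URL and parameter names in main are taken from the curl examples in this post.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class AzkabanUrlSketch {

    /** URL-encodes one query component as UTF-8. */
    static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /** Builds baseUrl + servletPath + "?session.id=...&key=value..." from key/value pairs. */
    static String ajaxUrl(String baseUrl, String servletPath, String sessionId, String... keyValues) {
        if (keyValues.length % 2 != 0) {
            throw new IllegalArgumentException("keyValues must come in key/value pairs");
        }
        StringBuilder url = new StringBuilder(baseUrl)
                .append(servletPath)
                .append("?session.id=")
                .append(enc(sessionId));
        for (int i = 0; i < keyValues.length; i += 2) {
            url.append('&').append(enc(keyValues[i])).append('=').append(enc(keyValues[i + 1]));
        }
        return url.toString();
    }

    public static void main(String[] args) {
        System.out.println(ajaxUrl("http://localhost:8081", "/manager", "my-session-id",
                "ajax", "fetchprojectflows", "project", "test"));
    }
}
```

Encoding each component matters for values such as cron expressions, which contain spaces and a # character.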

public static void regex(String str) {
        System.out.println("start regex -------------------");
        // [^\"]* stops at the closing quote, so two links on the same line are not merged into one match
        String regex = "(manager\\?project=)([^\"]*)(\">)";
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(str);
        while (m.find()) {
            System.out.println("group0=" + m.group(0)); // the whole matched link fragment
            System.out.println("group2=" + m.group(2)); // the project name
            System.out.println("-------------------------");
        }
    }
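
A greedy `(.*)` can run past the first closing quote when two project links end up on the same line, so a character class that stops at the quote is a safer way to capture the name. Here is a self-contained sketch of that idea. ProjectNameExtractor is a hypothetical class, and the sample HTML in main is an assumed shape of the index page, not copied from a real response.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ProjectNameExtractor {

    // Captures everything after "manager?project=" up to the next quote or ampersand.
    private static final Pattern PROJECT_LINK = Pattern.compile("manager\\?project=([^\"&]+)\"");

    /** Returns the distinct project names in the order they first appear. */
    static List<String> extract(String html) {
        List<String> names = new ArrayList<>();
        Matcher matcher = PROJECT_LINK.matcher(html);
        while (matcher.find()) {
            String name = matcher.group(1);
            if (!names.contains(name)) {
                names.add(name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Assumed markup: two project links on a single line.
        String html = "<a href=\"/manager?project=file_to_hbase\">file_to_hbase</a>"
                + "<a href=\"/manager?project=test\">test</a>";
        System.out.println(extract(html));
    }
}
```

Scraping HTML stays brittle either way; if the page template changes, the pattern has to change with it.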

Running it prints each matched link fragment together with the project name captured in group 2.

——————————————————————————————————————————————————————

APIs under ProjectManagerServlet

2. Get the flows of a given project

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchprojectflows&project=test&flow=ls&start=0&length=10" http://localhost:8081/manager                   
{
  "flows" : [ {
    "flowId" : "ls"
  } ],
  "project" : "test",
  "projectId" : 2
}

3. Get the executions of a given flow

 [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchFlowExecutions&project=test&flow=ls&start=0&length=10" http://localhost:8081/manager             
{
  "total" : 4,
  "executions" : [ {
    "submitTime" : 1589165163297,
    "submitUser" : "azkaban",
    "startTime" : 1589165163794,
    "endTime" : 1589165163847,
    "flowId" : "ls",
    "projectId" : 2,
    "execId" : 17,
    "status" : "SUCCEEDED"
  }, {
    "submitTime" : 1588907022185,
    "submitUser" : "azkaban",
    "startTime" : 1588907022650,
    "endTime" : 1588907024754,
    "flowId" : "ls",
    "projectId" : 2,
    "execId" : 16,
    "status" : "SUCCEEDED"
  }, {
    "submitTime" : 1588906462960,
    "submitUser" : "azkaban",
    "startTime" : 1588906463407,
    "endTime" : 1588906465524,
    "flowId" : "ls",
    "projectId" : 2,
    "execId" : 15,
    "status" : "SUCCEEDED"
  }, {
    "submitTime" : 1588902582067,
    "submitUser" : "azkaban",
    "startTime" : -1,
    "endTime" : 1588902583225,
    "flowId" : "ls",
    "projectId" : 2,
    "execId" : 14,
    "status" : "FAILED"
  } ],
  "length" : 10,
  "project" : "test",
  "from" : 0,
  "projectId" : 2,
  "flow" : "ls"
}

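The submitTime, startTime and endTime fields are 13-digit numbers, which reads as milliseconds since the Unix epoch, and the FAILED execution above carries a startTime of -1. The helper below is a small sketch for turning these into something a page can display. ExecutionTimes is a hypothetical class, and treating -1 as "no timestamp recorded" is an assumption drawn from that sample, not from Azkaban's documentation.

```java
import java.time.Instant;

final class ExecutionTimes {

    /** Returns the elapsed milliseconds, or -1 when either timestamp is the negative placeholder. */
    static long durationMillis(long startTime, long endTime) {
        if (startTime < 0 || endTime < 0) {
            return -1; // assumption: a negative value means the timestamp was never set
        }
        return endTime - startTime;
    }

    /** Formats epoch milliseconds as an ISO-8601 instant in UTC. */
    static String toUtc(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).toString();
    }

    public static void main(String[] args) {
        // Values taken from execId 17 in the response above.
        System.out.println(toUtc(1589165163794L));
        System.out.println(durationMillis(1589165163794L, 1589165163847L) + " ms");
    }
}
```
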
Get the last successful execution of a flow

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]#  curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchLastSuccessfulFlowExecution&project=test&flow=ls" http://localhost:8081/manager
{
  "success" : "true",
  "project" : "test",
  "message" : "",
  "projectId" : 2,
  "execId" : 17
}

Get a flow's details (job types)

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]#  curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchflowdetails&project=test&flow=ls" http://localhost:8081/manager 
{
  "project" : "test",
  "projectId" : 2,
  "jobTypes" : [ "command" ]
}

Get a flow's graph

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]#  curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchflowgraph&project=test&flow=ls" http://localhost:8081/manager
{
  "nodes" : [ {
    "id" : "ls",
    "type" : "command"
  } ],
  "project" : "test",
  "projectId" : 2,
  "flow" : "ls"
}

Get the data of a node in a flow, for example one of its jobs

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]#  curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchflownodedata&project=test&flow=ls&node=ls" http://localhost:8081/manager
{
  "project" : "test",
  "id" : "ls",
  "type" : "command",
  "projectId" : 2,
  "flow" : "ls",
  "props" : {
    "type" : "command",
    "command" : "hdfs dfs -ls /"
  }
}

Get a project's history log (file uploads and updates)

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]#  curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchProjectLogs&project=test" http://localhost:8081/manager
{
  "columns" : [ "user", "time", "type", "message" ],
  "logData" : [ [ "azkaban", 1588902571595, "UPLOADED", "Uploaded project files zip ls.zip" ], [ "azkaban", 1588848408904, "UPLOADED", "Uploaded project files zip file.zip" ], [ "azkaban", 1588847715497, "UPLOADED", "Uploaded project files zip echo.zip" ], [ "azkaban", 1588845120564, "UPLOADED", "Uploaded project files zip echo.zip" ], [ "azkaban", 1588845108561, "CREATED", null ] ],
  "project" : "test",
  "projectId" : 2
}

Get the jobs of a flow

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]#  curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchflowjobs&project=test&flow=ls" http://localhost:8081/manager    
{
  "nodes" : [ {
    "level" : 0,
    "dependents" : [ ],
    "id" : "ls",
    "dependencies" : [ ]
  } ],
  "isLocked" : false,
  "project" : "test",
  "projectId" : 2,
  "flowId" : "ls"
}

Get job details

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]#  curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchJobInfo&project=test&flowName=ls&jobName=ls" http://localhost:8081/manager
{
  "jobName" : "ls",
  "generalParams" : {
    "type" : "command",
    "command" : "hdfs dfs -ls /"
  },
  "project" : "test",
  "jobType" : "command",
  "projectId" : 2,
  "overrideParams" : {
    "type" : "command",
    "command" : "hdfs dfs -ls /"
  }
}

APIs under ExecutorServlet

Get the running executions of a flow

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=getRunning&project=test&flow=ls" http://localhost:8081/executor
{
  "execIds" : [ 21 ]
}
// This execution was paused on purpose.

Start a flow

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=executeFlow&project=test&flow=ls" http://localhost:8081/executor
{
  "project" : "test",
  "message" : "Execution queued successfully with exec id 18",
  "flow" : "ls",
  "execid" : 18
}

Pause a flow

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=executeFlow&project=test&flow=ls" http://localhost:8081/executor                         
{
  "project" : "test",
  "message" : "Execution queued successfully with exec id 20",
  "flow" : "ls",
  "execid" : 20
}
(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=pauseFlow&execid=20" http://localhost:8081/executor
{
}
If the given execution is not running:
(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=pauseFlow&execid=21" http://localhost:8081/executor                          
{
  "error" : "Cannot find execution '21'"
}
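
Judging from the two responses above, pauseFlow answers with an empty JSON object on success and with an "error" field on failure. A caller can therefore treat the presence of that key as the failure signal. The sketch below assumes the response has already been parsed into a Map (fastjson's JSONObject implements Map<String, Object>, so the result of JSON.parseObject can be passed in); AjaxResultSketch is a hypothetical class.

```java
import java.util.Collections;
import java.util.Map;

final class AjaxResultSketch {

    /** Returns the error message, or null when the parsed response has no "error" key. */
    static String errorOf(Map<String, ?> response) {
        Object error = response.get("error");
        return error == null ? null : error.toString();
    }

    /** An empty object, or any object without "error", counts as success. */
    static boolean succeeded(Map<String, ?> response) {
        return errorOf(response) == null;
    }

    public static void main(String[] args) {
        Map<String, Object> ok = Collections.emptyMap();
        Map<String, Object> failed =
                Collections.<String, Object>singletonMap("error", "Cannot find execution '21'");
        System.out.println(succeeded(ok));
        System.out.println(errorOf(failed));
    }
}
```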

Resume a flow

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=resumeFlow&execid=20" http://localhost:8081/executor                              
{
}

Cancel a flow

curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=cancelFlow&execid=21" http://localhost:8081/executor

Get a flow execution's options, including its failure-notification settings

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=flowInfo&execid=20" http://localhost:8081/executor                         
{
  "flowParam" : {
  },
  "failureAction" : "finishCurrent",
  "notifyFailureFirst" : true,
  "pipelineExecution" : null,
  "queueLevel" : 0,
  "nodeStatus" : {
    "ls" : "SUCCEEDED"
  },
  "pipelineLevel" : null,
  "successEmailsOverride" : false,
  "notifyFailureLast" : false,
  "failureEmails" : [ ],
  "disabled" : [ ],
  "concurrentOptions" : "skip",
  "successEmails" : [ ],
  "failureEmailsOverride" : false
}

Get the detailed status of a flow execution that has run

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchexecflow&execid=18" http://localhost:8081/executor
{
  "project" : "test",
  "updateTime" : 1589273555374,
  "type" : null,
  "attempt" : 0,
  "execid" : 18,
  "submitTime" : 1589273552887,
  "nodes" : [ {
    "nestedId" : "ls",
    "startTime" : 1589273553378,
    "updateTime" : 1589273555334,
    "id" : "ls",
    "endTime" : 1589273555321,
    "type" : "command",
    "attempt" : 0,
    "status" : "SUCCEEDED"
  } ],
  "nestedId" : "ls",
  "submitUser" : "azkaban",
  "startTime" : 1589273553362,
  "id" : "ls",
  "endTime" : 1589273555360,
  "projectId" : 2,
  "flowId" : "ls",
  "flow" : "ls",
  "status" : "SUCCEEDED"
}

APIs in ScheduleServlet

Schedule a flow with a cron expression

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=scheduleCronFlow&projectName=test&flow=ls" --data-urlencode cronExpression="0 23/30 5,7-10 ? * 6#3" http://localhost:8081/schedule
{
  "message" : "test.ls scheduled.",
  "scheduleId" : 1,
  "status" : "success"
}
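
The cronExpression follows the Quartz format, whose fields are seconds, minutes, hours, day of month, month, day of week and an optional year. Read that way, "0 23/30 5,7-10 ? * 6#3" fires at minutes 23 and 53 of hours 5 and 7 through 10, on the third Friday of every month (Quartz numbers Sunday as 1, so 6 is Friday). The sketch below only labels the fields; QuartzCronFields is a hypothetical helper and does no real validation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class QuartzCronFields {

    private static final String[] NAMES = {
        "seconds", "minutes", "hours", "dayOfMonth", "month", "dayOfWeek", "year"
    };

    /** Splits a Quartz cron expression on whitespace and labels each field in order. */
    static Map<String, String> split(String expression) {
        String[] parts = expression.trim().split("\\s+");
        if (parts.length < 6 || parts.length > 7) {
            throw new IllegalArgumentException("expected 6 or 7 fields, got " + parts.length);
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < parts.length; i++) {
            fields.put(NAMES[i], parts[i]);
        }
        return fields;
    }

    public static void main(String[] args) {
        split("0 23/30 5,7-10 ? * 6#3")
                .forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

This also explains why the curl call passes the expression through --data-urlencode: it contains spaces and a # that would otherwise break the query string.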

Get the details of a scheduled flow

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --get --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&ajax=fetchSchedule&projectId=2&flowId=ls" --data-urlencode cronExpression="0 23/30 5,7-10 ? * 6#3" http://localhost:8081/schedule   
{
  "schedule" : {
    "cronExpression" : "0 23/30 5,7-10 ? * 6#3",
    "nextExecTime" : "2020-05-15 05:23:00",
    "period" : "null",
    "submitUser" : "azkaban",
    "executionOptions" : {
      "notifyOnFirstFailure" : true,
      "notifyOnLastFailure" : false,
      "failureEmails" : [ ],
      "successEmails" : [ ],
      "pipelineLevel" : null,
      "queueLevel" : 0,
      "concurrentOption" : "skip",
      "mailCreator" : "default",
      "memoryCheck" : true,
      "flowParameters" : {
      },
      "failureAction" : "FINISH_CURRENTLY_RUNNING",
      "slaOptions" : [ ],
      "disabledJobs" : [ ],
      "pipelineExecutionId" : null,
      "failureEmailsOverridden" : false,
      "successEmailsOverridden" : false
    },
    "scheduleId" : "1",
    "firstSchedTime" : "2020-05-12 02:18:47"
  }
}
One gripe: here projectId is the real numeric ID, while flowId is still the flow's name.

Remove a flow's schedule; note that this one is a POST

(base) [root@VM_13_25_centos ~/azkaban/azkaban-web-server-3.81.4/bin]# curl -k --data "session.id=6d7da831-2c0f-4f5b-85ec-2e928bf1ece6&action=removeSched&scheduleId=1" http://localhost:8081/schedule       
{
  "message" : "flow ls removed from Schedules.",
  "status" : "success"
}
