stackstorm 30. Source code analysis: the key StackStorm scenario "run action"


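Goal:
Understand how run action works.
Contents:
0 Preparation for debugging st2api
1 Analysis of the st2api service
2 Analysis of the key method publish_request
3 Preparation for debugging st2actionrunner
4 Analysis of the st2actionrunner service
5 Summary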

0 Preparation for debugging st2api
Change the startup command of the st2api container:

      containers:
      - command:
        - bash
        - -c
        - exec /opt/stackstorm/st2/bin/gunicorn st2api.wsgi:application -k eventlet
          -b 0.0.0.0:9101 --workers 1 --threads 1 --graceful-timeout 10 --timeout
          30
Change it to:
      containers:
      - command:
        - sleep
        - 3d
After the st2api pod has started, edit /etc/st2/st2.conf. In the following sections:
[auth]
host = 127.0.0.1
[api]
host = 127.0.0.1
change the value of host to 0.0.0.0.

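Reason:
A service bound to 127.0.0.1 only accepts connections from the local machine, whereas
binding to 0.0.0.0 makes it listen on all interfaces, so requests sent from other nodes can reach it as well.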
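
The config change above can also be scripted. The following is a minimal Python 3 sketch using only the standard library; the helper name set_listen_host and the sample text are assumptions made for illustration, and they are not part of StackStorm. Note that configparser drops comments when it rewrites a file, so editing by hand is safer on a real st2.conf.

```python
import configparser
import io

# Sample fragment mirroring the two sections shown above.
SAMPLE_CONF = """\
[auth]
host = 127.0.0.1
[api]
host = 127.0.0.1
"""


def set_listen_host(conf_text, host="0.0.0.0", sections=("auth", "api")):
    """Return conf_text with `host` rewritten in the given sections.

    Hypothetical helper for illustration only.
    """
    # interpolation=None so that literal '%' characters in values are left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    for section in sections:
        if parser.has_section(section):
            parser.set(section, "host", host)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    print(set_listen_host(SAMPLE_CONF))
```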


Then start st2api manually inside the pod with:
sudo /opt/stackstorm/st2/bin/st2api --config-file /etc/st2/st2.conf

Enter any st2 pod, run st2 login, and then execute the following command:
st2 run core.local cmd=ls

1 Analysis of the st2api service
Code entry point:
the post() method in /opt/stackstorm/st2/lib/python2.7/site-packages/st2api/controllers/v1/actionexecutions.py
The code is as follows:
class ActionExecutionsController(BaseResourceIsolationControllerMixin,
                                 ActionExecutionsControllerMixin, ResourceController):

    def post(self, liveaction_api, requester_user, context_string=None, show_secrets=False):
        return self._handle_schedule_execution(liveaction_api=liveaction_api,
                                               requester_user=requester_user,
                                               context_string=context_string,
                                               show_secrets=show_secrets)

Analysis:
1.1) Executing the run action command enters
the st2api code above.
Sample input parameters:
(Pdb) p liveaction_api
<st2common.router.Body object at 0x4090e50>
(Pdb) p liveaction_api.__dict__
{u'action': u'core.local', u'user': None, u'parameters': {u'cmd': u'ls'}}
(Pdb) p requester_user
<UserDB: UserDB(id=5ed81d6197474e0001587c06, is_service=False, name="[email protected]", nicknames={})>
(Pdb) p requester_user.__dict__
{'_cls': 'UserDB'}
(Pdb) p context_string
None
(Pdb) p show_secrets
None
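
To make the link between the command line and the pdb output above explicit, here is a small Python 3 sketch that turns `core.local cmd=ls` into a body with the same three keys seen in liveaction_api.__dict__. The function build_execution_body is hypothetical and is not the st2 client's actual code; it only illustrates the shape of the data.

```python
import json


def build_execution_body(action_ref, cli_args):
    """Turn ['cmd=ls'] style arguments into an execution request body.

    Hypothetical helper: the real st2 CLI does its own, richer parsing.
    """
    parameters = {}
    for arg in cli_args:
        key, sep, value = arg.partition("=")
        if not sep or not key:
            raise ValueError("expected key=value, got %r" % arg)
        parameters[key] = value
    # Same keys as the liveaction_api object shown in the pdb session.
    return {"action": action_ref, "user": None, "parameters": parameters}


body = build_execution_body("core.local", ["cmd=ls"])
print(json.dumps(body, sort_keys=True))
```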

1.2) Next, step into
class ActionExecutionsControllerMixin(BaseRestControllerMixin):

    def _handle_schedule_execution(self, liveaction_api, requester_user, context_string=None,
                                   show_secrets=False):
        """
        :param liveaction: LiveActionAPI object.
        :type liveaction: :class:`LiveActionAPI`
        """

        if not requester_user:
            requester_user = UserDB(cfg.CONF.system_user.user)

        # Assert action ref is valid
        action_ref = liveaction_api.action
        action_db = action_utils.get_action_by_ref(action_ref)

        if not action_db:
            message = 'Action "%s" cannot be found.' % action_ref
            LOG.warning(message)
            abort(http_client.BAD_REQUEST, message)

        # Assert the permissions
        assert_user_has_resource_db_permission(user_db=requester_user, resource_db=action_db,
                                               permission_type=PermissionType.ACTION_EXECUTE)

        # Validate that the authenticated user is admin if user query param is provided
        user = liveaction_api.user or requester_user.name
        assert_user_is_admin_if_user_query_param_is_provided(user_db=requester_user,
                                                             user=user)

        try:
            return self._schedule_execution(liveaction=liveaction_api,
                                            requester_user=requester_user,
                                            user=user,
                                            context_string=context_string,
                                            show_secrets=show_secrets,
                                            pack=action_db.pack)
        except ValueError as e:
            LOG.exception('Unable to execute action.')
            ......
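Analysis:
The main logic above is:
look up action_db by action_ref, check that the requesting user is allowed to execute the action, then call _schedule_execution and return its result.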


1.3) Analysis of the _schedule_execution method
class ActionExecutionsControllerMixin(BaseRestControllerMixin):

    def _schedule_execution(self,
                            liveaction,
                            requester_user,
                            user=None,
                            context_string=None,
                            show_secrets=False,
                            pack=None):
        # Initialize execution context if it does not exist.
        if not hasattr(liveaction, 'context'):
            liveaction.context = dict()

        liveaction.context['user'] = user
        liveaction.context['pack'] = pack
        LOG.debug('User is: %s' % liveaction.context['user'])

        # Retrieve other st2 context from request header.
        if context_string:
            context = try_loads(context_string)
            if not isinstance(context, dict):
                raise ValueError('Unable to convert st2-context from the headers into JSON.')
            liveaction.context.update(context)

        # Include RBAC context (if RBAC is available and enabled)
        if cfg.CONF.rbac.enable:
            user_db = UserDB(name=user)
            role_dbs = rbac_service.get_roles_for_user(user_db=user_db, include_remote=True)
            roles = [role_db.name for role_db in role_dbs]
            liveaction.context['rbac'] = {
                'user': user,
                'roles': roles
            }

        # Schedule the action execution.
        liveaction_db = LiveActionAPI.to_model(liveaction)
        action_db = action_utils.get_action_by_ref(liveaction_db.action)
        runnertype_db = action_utils.get_runnertype_by_name(action_db.runner_type['name'])

        try:
            liveaction_db.parameters = param_utils.render_live_params(
                runnertype_db.runner_parameters, action_db.parameters, liveaction_db.parameters,
                liveaction_db.context)
        except param_exc.ParamException:

            # We still need to create a request, so liveaction_db is assigned an ID
            liveaction_db, actionexecution_db = action_service.create_request(liveaction_db)

            # By this point the execution is already in the DB therefore need to mark it failed.
            _, e, tb = sys.exc_info()
            action_service.update_status(
                liveaction=liveaction_db,
                new_status=action_constants.LIVEACTION_STATUS_FAILED,
                result={'error': str(e), 'traceback': ''.join(traceback.format_tb(tb, 20))})
            # Might be a good idea to return the actual ActionExecution rather than bubble up
            # the exception.
            raise validation_exc.ValueValidationException(str(e))

        # The request should be created after the above call to render_live_params
        # so any templates in live parameters have a chance to render.
        liveaction_db, actionexecution_db = action_service.create_request(liveaction_db)
        liveaction_db = LiveAction.add_or_update(liveaction_db, publish=False)

        _, actionexecution_db = action_service.publish_request(liveaction_db, actionexecution_db)
        mask_secrets = self._get_mask_secrets(requester_user, show_secrets=show_secrets)
        execution_api = ActionExecutionAPI.from_model(actionexecution_db, mask_secrets=mask_secrets)

        return Response(json=execution_api, status=http_client.CREATED)


Analysis:
1) The _schedule_execution method
    1 Takes input parameters such as
        (Pdb) p liveaction_api
        <st2common.router.Body object at 0x4090e50>
        (Pdb) p liveaction_api.__dict__
        {u'action': u'core.local', u'user': None, u'parameters': {u'cmd': u'ls'}}
        (Pdb) p requester_user
        <UserDB: UserDB(id=5ed81d6197474e0001587c06, is_service=False, name="[email protected]", nicknames={})>
        (Pdb) p requester_user.__dict__
        {'_cls': 'UserDB'}
        (Pdb) p context_string
        None
        (Pdb) p show_secrets
        None
        pack : core
    2 Look up action_db by action_ref, and look up runnertype_db by the runner_type name (for example: 'local-shell-cmd')
    3 Render the parameters of the action instance (for example: liveaction_db.parameters is {u'cmd': u'ls'})
    4 Call create_request(liveaction): create an execution of the action and return (liveaction, execution). Specifically:
        add or update the liveaction in the live_action_d_b collection
        create the execution
    5 Call publish_request(liveaction, execution). Specifically:
        publish the liveaction message to the 'st2.liveaction' exchange with routing_key 'create'
        publish the liveaction message to the 'st2.liveaction.status' exchange with routing_key 'requested'
        publish the actionexecution message to the 'st2.execution' exchange with routing_key 'create'
        return: liveaction, execution
    6 Return the execution

The most important step is 5, the call to publish_request(liveaction, execution), which
publishes the liveaction message to the 'st2.liveaction.status' exchange with routing_key 'requested'.
See section 2 for details.
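
Steps 4 and 5 can be sketched with an in-memory stand-in for the database and the message bus. This is a simplified Python 3 model, assuming plain dicts instead of LiveActionDB / ActionExecutionDB objects; RecordingPublisher and the id format are invented for the example, while the exchange names and routing keys are the ones listed above.

```python
import itertools

_ids = itertools.count(1)


class RecordingPublisher(object):
    """Stand-in for the message bus: it only records what would be sent."""

    def __init__(self):
        self.sent = []

    def publish(self, exchange, routing_key, payload):
        self.sent.append((exchange, routing_key, payload["id"]))


def create_request(liveaction):
    """Model of step 4: store the liveaction (here: assign an id) and create its execution."""
    liveaction = dict(liveaction, id="liveaction-%d" % next(_ids), status="requested")
    execution = {
        "id": "execution-%d" % next(_ids),
        "liveaction": liveaction["id"],
        "status": liveaction["status"],
    }
    return liveaction, execution


def publish_request(publisher, liveaction, execution):
    """Model of step 5: the three publishes described in the analysis."""
    publisher.publish("st2.liveaction", "create", liveaction)
    publisher.publish("st2.liveaction.status", liveaction["status"], liveaction)
    publisher.publish("st2.execution", "create", execution)
    return liveaction, execution


publisher = RecordingPublisher()
la, ex = create_request({"action": "core.local", "parameters": {"cmd": "ls"}})
publish_request(publisher, la, ex)
for message in publisher.sent:
    print(message)
```

The second publish is the one that matters for the rest of the flow: its routing key is the liveaction status, which is 'requested' at this point.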

2 Analysis of the key method publish_request
_, actionexecution_db = action_service.publish_request(liveaction_db, actionexecution_db)
This call publishes the liveaction status, which causes actionrunner to receive the message and process it.
Below is the actionrunner log:
2020-06-05T08:36:02.974507088Z 2020-06-05 16:36:02,925 AUDIT [-] The status of action execution is changed from requested to scheduled. <LiveAction.id=5eda017034b397007b090d30, ActionExecution.id=5eda01db34b397007b090d33> (liveaction_db={'status': 'scheduled', 'runner_info': {},

According to the analysis in my earlier article:
https://blog.csdn.net/qingyuanluofeng/java/article/details/105398730

The queue that the ActionExecutionScheduler class listens on is bound as follows:
'st2.liveaction.status'--->'requested'--->'st2.actionrunner.req'

The message sent at this point (see the log line above) uses
exchange 'st2.liveaction.status' and routing_key 'scheduled'.

The queues that the ActionExecutionDispatcher class listens on are bound as follows:
'st2.liveaction.status'--->'scheduled'--->'st2.actionrunner.work',
'st2.liveaction.status'--->'canceling'--->'st2.actionrunner.cancel', 
'st2.liveaction.status'--->'pausing'--->'st2.actionrunner.pause',
'st2.liveaction.status'--->'resuming'--->'st2.actionrunner.resume'

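So what ActionExecutionScheduler actually does is change the status of the liveaction from requested to scheduled
and publish the liveaction as the payload to exchange 'st2.liveaction.status' with routing_key 'scheduled'.
That message is received by the queue that the ActionExecutionDispatcher class listens on:
'st2.liveaction.status'--->'scheduled'--->'st2.actionrunner.work',
and once it is received, it eventually triggers the execution of the action.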
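
The requested -> scheduled hand-off described above can be modelled with a toy router. This Python 3 sketch uses plain deques instead of RabbitMQ; the functions publish, scheduler_step and dispatcher_step are invented for the illustration, and only the routing keys and queue names come from the bindings listed above.

```python
from collections import defaultdict, deque

# routing_key -> queue, as bound on the 'st2.liveaction.status' exchange above.
BINDINGS = {
    "requested": "st2.actionrunner.req",
    "scheduled": "st2.actionrunner.work",
    "canceling": "st2.actionrunner.cancel",
    "pausing": "st2.actionrunner.pause",
    "resuming": "st2.actionrunner.resume",
}


def publish(queues, routing_key, liveaction):
    """Deliver a copy of the liveaction to the queue bound to routing_key, if any."""
    queue_name = BINDINGS.get(routing_key)
    if queue_name is not None:
        queues[queue_name].append(dict(liveaction))


def scheduler_step(queues):
    """Role of the scheduler: take a requested liveaction and republish it as scheduled."""
    liveaction = queues["st2.actionrunner.req"].popleft()
    liveaction["status"] = "scheduled"
    publish(queues, liveaction["status"], liveaction)


def dispatcher_step(queues):
    """Role of the dispatcher: pick up the scheduled liveaction for execution."""
    return queues["st2.actionrunner.work"].popleft()


queues = defaultdict(deque)
publish(queues, "requested", {"id": "5eda017034b397007b090d30", "status": "requested"})
scheduler_step(queues)
print(dispatcher_step(queues))
```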


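3 Preparation for debugging st2actionrunner
We can skip the analysis of the ActionExecutionScheduler class for now and go straight to
the ActionExecutionDispatcher class, because that is where the action is actually executed.
Next we need to debug the st2actionrunner service.
First change the startup command of the st2actionrunner container: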
      containers:
      - command:
        - bash
        - -c
        - exec /opt/stackstorm/st2/bin/st2actionrunner --config-file /etc/st2/st2.conf
Change it to:


      containers:
      - command:
        - sleep
        - 3d
Then, after the pod has started, edit /etc/st2/st2.conf. In the following sections:
[auth]
host = 127.0.0.1
[api]
host = 127.0.0.1
change the value of host to 0.0.0.0.

Enter the st2actionrunner pod and add a breakpoint in the following file:
cd /opt/stackstorm/st2/lib/python2.7/site-packages/st2actions
vi worker.py

Debugging:
Then run the following command to start the st2actionrunner service manually:
sudo /opt/stackstorm/st2/bin/st2actionrunner --config-file /etc/st2/st2.conf
Enter any other st2 pod and run:
st2 run core.local cmd=ls

4 Analysis of the st2actionrunner service
Code entry point:
class ActionExecutionDispatcher(MessageHandler):

    def process(self, liveaction):
        """Dispatches the LiveAction to appropriate action runner.

        LiveAction in statuses other than "scheduled" and "canceling" are ignored. If
        LiveAction is already canceled and result is empty, the LiveAction
        is updated with a generic exception message.

        :param liveaction: Action execution request.
        :type liveaction: ``st2common.models.db.liveaction.LiveActionDB``

        :rtype: ``dict``
        """

        if liveaction.status == action_constants.LIVEACTION_STATUS_CANCELED:
            LOG.info('%s is not executing %s (id=%s) with "%s" status.',
                     self.__class__.__name__, type(liveaction), liveaction.id, liveaction.status)
            if not liveaction.result:
                updated_liveaction = action_utils.update_liveaction_status(
                    status=liveaction.status,
                    result={'message': 'Action execution canceled by user.'},
                    liveaction_id=liveaction.id)
                executions.update_execution(updated_liveaction)
            return

        if liveaction.status not in ACTIONRUNNER_DISPATCHABLE_STATES:
            LOG.info('%s is not dispatching %s (id=%s) with "%s" status.',
                     self.__class__.__name__, type(liveaction), liveaction.id, liveaction.status)
            return

        try:
            liveaction_db = action_utils.get_liveaction_by_id(liveaction.id)
        except StackStormDBObjectNotFoundError:
            LOG.exception('Failed to find liveaction %s in the database.', liveaction.id)
            raise

        if liveaction.status != liveaction_db.status:
            LOG.warning(
                'The status of liveaction %s has changed from %s to %s '
                'while in the queue waiting for processing.',
                liveaction.id,
                liveaction.status,
                liveaction_db.status
            )

        dispatchers = {
            action_constants.LIVEACTION_STATUS_SCHEDULED: self._run_action,
            action_constants.LIVEACTION_STATUS_CANCELING: self._cancel_action,
            action_constants.LIVEACTION_STATUS_PAUSING: self._pause_action,
            action_constants.LIVEACTION_STATUS_RESUMING: self._resume_action
        }

        return dispatchers[liveaction.status](liveaction)

Analysis:
4.1) Step into
class ActionExecutionDispatcher(MessageHandler):

    def _run_action(self, liveaction_db):
        # stamp liveaction with process_info
        runner_info = system_info.get_process_info()

        # Update liveaction status to "running"
        liveaction_db = action_utils.update_liveaction_status(
            status=action_constants.LIVEACTION_STATUS_RUNNING,
            runner_info=runner_info,
            liveaction_id=liveaction_db.id)

        self._running_liveactions.add(liveaction_db.id)

        action_execution_db = executions.update_execution(liveaction_db)

        # Launch action
        extra = {'action_execution_db': action_execution_db, 'liveaction_db': liveaction_db}
        LOG.audit('Launching action execution.', extra=extra)

        # the extra field will not be shown in non-audit logs so temporarily log at info.
        LOG.info('Dispatched {~}action_execution: %s / {~}live_action: %s with "%s" status.',
                 action_execution_db.id, liveaction_db.id, liveaction_db.status)

        extra = {'liveaction_db': liveaction_db}
        try:
            result = self.container.dispatch(liveaction_db)
            LOG.debug('Runner dispatch produced result: %s', result)
            if not result:
                raise ActionRunnerException('Failed to execute action.')
        except:
            _, ex, tb = sys.exc_info()
            extra['error'] = str(ex)
            LOG.info('Action "%s" failed: %s' % (liveaction_db.action, str(ex)), extra=extra)

            liveaction_db = action_utils.update_liveaction_status(
                status=action_constants.LIVEACTION_STATUS_FAILED,
                liveaction_id=liveaction_db.id,
                result={'error': str(ex), 'traceback': ''.join(traceback.format_tb(tb, 20))})
            executions.update_execution(liveaction_db)
            raise
        finally:
            # In the case of worker shutdown, the items are removed from _running_liveactions.
            # As the subprocesses for action executions are terminated, this finally block
            # will be executed. Set remove will result in KeyError if item no longer exists.
            # Use set discard to not raise the KeyError.
            self._running_liveactions.discard(liveaction_db.id)

        return result
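
The error handling in _run_action above follows a pattern that is easy to isolate: track the id in a set, mark the action failed and re-raise on error, and always untrack the id in finally with discard. Below is a Python 3 sketch under the assumption that statuses live in a dict; FakeWorker and its methods are invented for the example and are not StackStorm classes.

```python
class FakeWorker(object):
    """Illustrative stand-in for the dispatcher's bookkeeping."""

    def __init__(self, dispatch_func):
        self._dispatch = dispatch_func
        self._running = set()
        self.statuses = {}  # liveaction id -> last status recorded by this worker

    def shutdown(self):
        # On shutdown the tracked ids are dropped, as the source comment describes.
        self._running.clear()

    def run_action(self, liveaction_id):
        self.statuses[liveaction_id] = "running"
        self._running.add(liveaction_id)
        try:
            result = self._dispatch(liveaction_id)
            if not result:
                raise RuntimeError("Failed to execute action.")
        except Exception:
            self.statuses[liveaction_id] = "failed"
            raise
        finally:
            # discard() is a no-op if shutdown() already emptied the set;
            # remove() would raise KeyError in that case.
            self._running.discard(liveaction_id)
        return result


worker = FakeWorker(lambda liveaction_id: {"stdout": "ok"})
print(worker.run_action("5edde21d3e2ff7000d50e09d"))
```

In this sketch a successful run leaves the recorded status at "running"; in the code being analysed the final status is written further down the call chain, inside the container.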

Analysis:
4.1.1) Variable analysis
(Pdb) p liveaction_db
<LiveActionDB: LiveActionDB(action="core.local", action_is_workflow=False, callback={}, context={u'rbac': {u'user': u'admin', u'roles': [u'admin']}, u'user': u'admin', u'pack': u'core'}, end_timestamp=None, id=5edde21d3e2ff7000d50e09d, notify=None, parameters={u'cmd': u'ls'}, result={}, runner_info={}, start_timestamp="2020-06-08 07:00:45.250893+00:00", status="scheduled")>
(Pdb) p liveaction_db.__dict__
{'_fields_ordered': ('id', 'status', 'start_timestamp', 'end_timestamp', 'action', 'action_is_workflow', 'parameters', 'result', 'context', 'callback', 'runner_info', 'notify')}

(Pdb) p runner_info
{'hostname': 'dozer-st2actionrunner-0', 'pid': 39}
(Pdb) p liveaction_db.id
ObjectId('5edde21d3e2ff7000d50e09d')
db.live_action_d_b.find({'_id': ObjectId('5edde21d3e2ff7000d50e09d')}).pretty();
{
    "_id" : ObjectId("5edde21d3e2ff7000d50e09d"),
    "status" : "scheduled",
    "start_timestamp" : NumberLong("1591599645250893"),
    "action" : "core.local",
    "action_is_workflow" : false,
    "parameters" : {
        "cmd" : "ls"
    },
    "result" : {
        
    },
    "context" : {
        "rbac" : {
            "user" : "admin",
            "roles" : [
                "admin"
            ]
        },
        "user" : "admin",
        "pack" : "core"
    },
    "callback" : {
        
    },
    "runner_info" : {
        
    }
}

4.1.2) Processing flow analysis
Update the status of the liveaction to running,
then call RunnerContainer.dispatch(self, liveaction_db) to execute the action.
Step into:
st2/st2actions/st2actions/container/base.py

class RunnerContainer(object):

    def dispatch(self, liveaction_db):
        action_db = get_action_by_ref(liveaction_db.action)
        if not action_db:
            raise Exception('Action %s not found in DB.' % (liveaction_db.action))

        liveaction_db.context['pack'] = action_db.pack

        runnertype_db = get_runnertype_by_name(action_db.runner_type['name'])

        extra = {'liveaction_db': liveaction_db, 'runnertype_db': runnertype_db}
        LOG.info('Dispatching Action to a runner', extra=extra)

        # Get runner instance.
        runner = self._get_runner(runnertype_db, action_db, liveaction_db)

        LOG.debug('Runner instance for RunnerType "%s" is: %s', runnertype_db.name, runner)

        # Process the request.
        funcs = {
            action_constants.LIVEACTION_STATUS_REQUESTED: self._do_run,
            action_constants.LIVEACTION_STATUS_SCHEDULED: self._do_run,
            action_constants.LIVEACTION_STATUS_RUNNING: self._do_run,
            action_constants.LIVEACTION_STATUS_CANCELING: self._do_cancel,
            action_constants.LIVEACTION_STATUS_PAUSING: self._do_pause,
            action_constants.LIVEACTION_STATUS_RESUMING: self._do_resume
        }

        if liveaction_db.status not in funcs:
            raise actionrunner.ActionRunnerDispatchError(
                'Action runner is unable to dispatch the liveaction because it is '
                'in an unsupported status of "%s".' % liveaction_db.status
            )

        liveaction_db = funcs[liveaction_db.status](
            runner=runner,
            runnertype_db=runnertype_db,
            action_db=action_db,
            liveaction_db=liveaction_db
        )

        return liveaction_db.result

Analysis:
1) Variable analysis
(Pdb) p liveaction_db
<LiveActionDB: LiveActionDB(action="core.local", action_is_workflow=False, callback={}, context={u'rbac': {u'user': u'admin', u'roles': [u'admin']}, u'user': u'admin', u'pack': u'core'}, end_timestamp=None, id=5edde21d3e2ff7000d50e09d, notify=None, parameters={u'cmd': u'ls'}, result={}, runner_info={'hostname': 'dozer-st2actionrunner-0', 'pid': 39}, start_timestamp="2020-06-08 07:00:45.250893+00:00", status="running")>
(Pdb) p action_db
<ActionDB: ActionDB(description="Action that executes an arbitrary Linux command on the localhost.", enabled=True, entry_point="", id=5ed9ffb3ad1847001bf0857f, name="local", notify=NotifySchema@91974864(on_complete="None", on_success="None", on_failure="None"), pack="core", parameters={u'cmd': {u'required': True, u'type': u'string', u'description': u'Arbitrary Linux command to be executed on the local host.'}, u'sudo': {u'immutable': True}}, ref="core.local", runner_type={u'name': u'local-shell-cmd'}, tags=[], uid="action:core:local")>
(Pdb) p action_db.pack
u'core'
(Pdb) p runnertype_db
<RunnerTypeDB: RunnerTypeDB(description="A runner to execute local actions as a fixed user.", enabled=True, id=5ed9ffb2ad1847001bf08568, name="local-shell-cmd", query_module=None, runner_module="local_runner", runner_parameters={u'kwarg_op': {u'default': u'--', u'type': u'string', u'description': u'Operator to use in front of keyword args i.e. "--" or "-".'}, u'sudo': {u'default': False, u'type': u'boolean', u'description': u'The command will be executed with sudo.'}, u'sudo_password': {u'default': None, u'secret': True, u'required': False, u'type': u'string', u'description': u'Sudo password. To be used when paswordless sudo is not allowed.'}, u'env': {u'type': u'object', u'description': u'Environment variables which will be available to the command(e.g. key1=val1,key2=val2)'}, u'timeout': {u'default': 60, u'type': u'integer', u'description': u"Action timeout in seconds. Action will get killed if it doesn't finish in timeout seconds."}, u'cmd': {u'type': u'string', u'description': u'Arbitrary Linux command to be executed on the host.'}, u'cwd': {u'type': u'string', u'description': u'Working directory where the command will be executed in'}}, uid="runner_type:local-shell-cmd")>
(Pdb) p runner
<local_runner.LocalShellRunner object at 0x57c6250>

2) Logic analysis
The liveaction status is now running, so the _do_run method is called.
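
The status-to-handler lookup in dispatch can be reduced to a plain dictionary. The following Python 3 sketch keeps the same six statuses as the funcs mapping in the listing; the handler bodies and UnsupportedStatusError are placeholders invented for the illustration.

```python
class UnsupportedStatusError(Exception):
    """Placeholder for the dispatch error raised on an unknown status."""


def _do_run(liveaction):
    return "run %s" % liveaction["id"]


def _do_cancel(liveaction):
    return "cancel %s" % liveaction["id"]


def _do_pause(liveaction):
    return "pause %s" % liveaction["id"]


def _do_resume(liveaction):
    return "resume %s" % liveaction["id"]


# Same keys as the funcs dict in RunnerContainer.dispatch.
HANDLERS = {
    "requested": _do_run,
    "scheduled": _do_run,
    "running": _do_run,
    "canceling": _do_cancel,
    "pausing": _do_pause,
    "resuming": _do_resume,
}


def dispatch(liveaction):
    status = liveaction["status"]
    if status not in HANDLERS:
        raise UnsupportedStatusError('unsupported status "%s"' % status)
    return HANDLERS[status](liveaction)


# In the pdb session the liveaction arrives with status="running".
print(dispatch({"id": "5edde21d3e2ff7000d50e09d", "status": "running"}))
```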

4.1.3) The _do_run method
    def _do_run(self, runner, runnertype_db, action_db, liveaction_db):
        # Create a temporary auth token which will be available
        # for the duration of the action execution.
        runner.auth_token = self._create_auth_token(context=runner.context, action_db=action_db,
                                                    liveaction_db=liveaction_db)

        try:
            # Finalized parameters are resolved and then rendered. This process could
            # fail. Handle the exception and report the error correctly.
            try:
                runner_params, action_params = param_utils.render_final_params(
                    runnertype_db.runner_parameters, action_db.parameters, liveaction_db.parameters,
                    liveaction_db.context)
                runner.runner_parameters = runner_params
            except ParamException as e:
                raise actionrunner.ActionRunnerException(str(e))

            LOG.debug('Performing pre-run for runner: %s', runner.runner_id)
            runner.pre_run()

            # Mask secret parameters in the log context
            resolved_action_params = ResolvedActionParameters(action_db=action_db,
                                                              runner_type_db=runnertype_db,
                                                              runner_parameters=runner_params,
                                                              action_parameters=action_params)
            extra = {'runner': runner, 'parameters': resolved_action_params}
            LOG.debug('Performing run for runner: %s' % (runner.runner_id), extra=extra)
            (status, result, context) = runner.run(action_params)

            try:
                result = json.loads(result)
            except:
                pass

            action_completed = status in action_constants.LIVEACTION_COMPLETED_STATES
            if isinstance(runner, AsyncActionRunner) and not action_completed:
                self._setup_async_query(liveaction_db.id, runnertype_db, context)
        except:
            LOG.exception('Failed to run action.')
            _, ex, tb = sys.exc_info()
            # mark execution as failed.
            status = action_constants.LIVEACTION_STATUS_FAILED
            # include the error message and traceback to try and provide some hints.
            result = {'error': str(ex), 'traceback': ''.join(traceback.format_tb(tb, 20))}
            context = None
        finally:
            # Log action completion
            extra = {'result': result, 'status': status}
            LOG.debug('Action "%s" completed.' % (action_db.name), extra=extra)

            # Update the final status of liveaction and corresponding action execution.
            liveaction_db = self._update_status(liveaction_db.id, status, result, context)

            # Always clean-up the auth_token
            # This method should be called in the finally block to ensure post_run is not impacted.
            self._clean_up_auth_token(runner=runner, status=status)

        LOG.debug('Performing post_run for runner: %s', runner.runner_id)
        runner.post_run(status=status, result=result)

        LOG.debug('Runner do_run result', extra={'result': liveaction_db.result})
        LOG.audit('Liveaction completed', extra={'liveaction_db': liveaction_db})

        return liveaction_db
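
The shape of _do_run above is: run the action, try to parse the result as JSON, turn any exception into a failed status with a traceback, and record the final status in finally no matter what happened. Here is a minimal Python 3 sketch of that pattern; run_and_record and its callback arguments are hypothetical names, and the auth token and async handling of the real method are left out.

```python
import json
import traceback


def run_and_record(run_func, record_func):
    """Call run_func() -> (status, result) and always report via record_func."""
    status, result = "failed", None  # defaults in case run_func never returns
    try:
        status, result = run_func()
        try:
            # Runners may return a JSON string; keep anything else unchanged.
            result = json.loads(result)
        except (TypeError, ValueError):
            pass
    except Exception as ex:
        status = "failed"
        result = {"error": str(ex), "traceback": traceback.format_exc()}
    finally:
        # Runs on success and on failure, like the _update_status call above.
        record_func(status, result)
    return status, result


recorded = []
print(run_and_record(lambda: ("succeeded", '{"stdout": "ok"}'),
                     lambda status, result: recorded.append((status, result))))
```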

Analysis:
1) Variable analysis
(Pdb) p liveaction_db
<LiveActionDB: LiveActionDB(action="core.local", action_is_workflow=False, callback={}, context={u'rbac': {u'user': u'admin', u'roles': [u'admin']}, u'user': u'admin', u'pack': u'core'}, end_timestamp="2020-06-08 08:05:56.057139+00:00", id=5eddef7a3e2ff7000d50e0a4, notify=None, parameters={u'cmd': u'ls'}, result={'failed': False, 'stderr': '', 'return_code': 0, 'succeeded': True, 'stdout': 'bootstrap\ncmd\nconfig.py\nconfig.pyc\ncontainer\n__init__.py\n__init__.pyc\nnotifier\npolicies\nresultstracker\nrunners\nscheduler.py\nscheduler.pyc\nworker.py\nworker.py_bak\nworker.pyc'}, runner_info={u'hostname': u'dozer-st2actionrunner-0', u'pid': 39}, start_timestamp="2020-06-08 07:57:46.975401+00:00", status="succeeded")>

2) Logic analysis
The key step is the call
(status, result, context) = runner.run(action_params)
which actually executes the action.
Step into:
the run() method in /opt/stackstorm/runners/local_runner/local_runner/local_runner.py


4.1.4) The code is as follows
class LocalShellRunner(ActionRunner, ShellRunnerMixin):

    def run(self, action_parameters):
        env_vars = self._env

        if not self.entry_point:
            script_action = False
            command = self.runner_parameters.get(RUNNER_COMMAND, None)
            action = ShellCommandAction(name=self.action_name,
                                        action_exec_id=str(self.liveaction_id),
                                        command=command,
                                        user=self._user,
                                        env_vars=env_vars,
                                        sudo=self._sudo,
                                        timeout=self._timeout,
                                        sudo_password=self._sudo_password)
        else:
            script_action = True
            script_local_path_abs = self.entry_point
            positional_args, named_args = self._get_script_args(action_parameters)
            named_args = self._transform_named_args(named_args)

            action = ShellScriptAction(name=self.action_name,
                                       action_exec_id=str(self.liveaction_id),
                                       script_local_path_abs=script_local_path_abs,
                                       named_args=named_args,
                                       positional_args=positional_args,
                                       user=self._user,
                                       env_vars=env_vars,
                                       sudo=self._sudo,
                                       timeout=self._timeout,
                                       cwd=self._cwd,
                                       sudo_password=self._sudo_password)

        args = action.get_full_command_string()
        sanitized_args = action.get_sanitized_full_command_string()

        # For consistency with the old Fabric based runner, make sure the file is executable
        if script_action:
            args = 'chmod +x %s ; %s' % (script_local_path_abs, args)
            sanitized_args = 'chmod +x %s ; %s' % (script_local_path_abs, sanitized_args)

        env = os.environ.copy()

        # Include user provided env vars (if any)
        env.update(env_vars)

        # Include common st2 env vars
        st2_env_vars = self._get_common_action_env_variables()
        env.update(st2_env_vars)

        LOG.info('Executing action via LocalRunner: %s', self.runner_id)
        LOG.info('[Action info] name: %s, Id: %s, command: %s, user: %s, sudo: %s' %
                 (action.name, action.action_exec_id, sanitized_args, action.user, action.sudo))

        stdout = StringIO()
        stderr = StringIO()

        store_execution_stdout_line = functools.partial(store_execution_output_data,
                                                        output_type='stdout')
        store_execution_stderr_line = functools.partial(store_execution_output_data,
                                                        output_type='stderr')

        read_and_store_stdout = make_read_and_store_stream_func(execution_db=self.execution,
            action_db=self.action, store_data_func=store_execution_stdout_line)
        read_and_store_stderr = make_read_and_store_stream_func(execution_db=self.execution,
            action_db=self.action, store_data_func=store_execution_stderr_line)

        # If sudo password is provided, pass it to the subprocess via stdin>
        # Note: We don't need to explicitly escape the argument because we pass command as a list
        # to subprocess.Popen and all the arguments are escaped by the function.
        if self._sudo_password:
            LOG.debug('Supplying sudo password via stdin')
            echo_process = subprocess.Popen(['echo', self._sudo_password + '\n'],
                                            stdout=subprocess.PIPE)
            stdin = echo_process.stdout
        else:
            stdin = None

        # Make sure os.setsid is called on each spawned process so that all processes
        # are in the same group.

        # Process is started as sudo -u {{system_user}} -- bash -c {{command}}. Introduction of the
        # bash means that multiple independent processes are spawned without them being
        # children of the process we have access to and this requires use of pkill.
        # Ideally os.killpg should have done the trick but for some reason that failed.
        # Note: pkill will set the returncode to 143 so we don't need to explicitly set
        # it to some non-zero value.
        exit_code, stdout, stderr, timed_out = shell.run_command(cmd=args,
                                                                 stdin=stdin,
                                                                 stdout=subprocess.PIPE,
                                                                 stderr=subprocess.PIPE,
                                                                 shell=True,
                                                                 cwd=self._cwd,
                                                                 env=env,
                                                                 timeout=self._timeout,
                                                                 preexec_func=os.setsid,
                                                                 kill_func=kill_process,
                                                           read_stdout_func=read_and_store_stdout,
                                                           read_stderr_func=read_and_store_stderr,
                                                           read_stdout_buffer=stdout,
                                                           read_stderr_buffer=stderr)

        error = None

        if timed_out:
            error = 'Action failed to complete in %s seconds' % (self._timeout)
            exit_code = -1 * exit_code_constants.SIGKILL_EXIT_CODE

        # Detect if user provided an invalid sudo password or sudo is not configured for that user
        if self._sudo_password:
            if re.search('sudo: \d+ incorrect password attempts', stderr):
                match = re.search('\[sudo\] password for (.+?)\:', stderr)

                if match:
                    username = match.groups()[0]
                else:
                    username = 'unknown'

                error = ('Invalid sudo password provided or sudo is not configured for this user '
                        '(%s)' % (username))
                exit_code = -1

        succeeded = (exit_code == exit_code_constants.SUCCESS_EXIT_CODE)

        result = {
            'failed': not succeeded,
            'succeeded': succeeded,
            'return_code': exit_code,
            'stdout': strip_shell_chars(stdout),
            'stderr': strip_shell_chars(stderr)
        }

        if error:
            result['error'] = error

        status = PROC_EXIT_CODE_TO_LIVEACTION_STATUS_MAP.get(
            str(exit_code),
            action_constants.LIVEACTION_STATUS_FAILED
        )

        return (status, jsonify.json_loads(result, LocalShellRunner.KEYS_TO_TRANSFORM), None)
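
The tail of run() above turns the raw (exit_code, stdout, stderr, timed_out) tuple into a status plus a result dict. A minimal sketch of that post-processing follows. It assumes the constant values shown (exit code 0 for success, 9 for SIGKILL, and the status strings), which are placeholders and may differ from StackStorm's own constants. build_result is a hypothetical helper name, and the strip_shell_chars and jsonify steps are left out.

```python
import re

# Assumed values standing in for exit_code_constants / action_constants;
# the real constants live in StackStorm's own modules and may differ.
SUCCESS_EXIT_CODE = 0
SIGKILL_EXIT_CODE = 9
STATUS_SUCCEEDED = 'succeeded'
STATUS_FAILED = 'failed'
STATUS_TIMED_OUT = 'timeout'

# Keys are strings because the runner looks the status up with str(exit_code).
EXIT_CODE_TO_STATUS = {
    str(SUCCESS_EXIT_CODE): STATUS_SUCCEEDED,
    str(-1 * SIGKILL_EXIT_CODE): STATUS_TIMED_OUT,
}


def build_result(exit_code, stdout, stderr, timed_out, timeout, sudo_password=None):
    """Hypothetical helper condensing the post-processing at the end of run()."""
    error = None

    if timed_out:
        error = 'Action failed to complete in %s seconds' % timeout
        exit_code = -1 * SIGKILL_EXIT_CODE

    # A rejected sudo password is only visible in stderr, so it is found by pattern.
    if sudo_password and re.search(r'sudo: \d+ incorrect password attempts', stderr):
        match = re.search(r'\[sudo\] password for (.+?):', stderr)
        username = match.group(1) if match else 'unknown'
        error = ('Invalid sudo password provided or sudo is not configured '
                 'for this user (%s)' % username)
        exit_code = -1

    succeeded = (exit_code == SUCCESS_EXIT_CODE)
    result = {
        'failed': not succeeded,
        'succeeded': succeeded,
        'return_code': exit_code,
        'stdout': stdout,
        'stderr': stderr,
    }
    if error:
        result['error'] = error

    # Any exit code without an explicit mapping is treated as a failure.
    status = EXIT_CODE_TO_STATUS.get(str(exit_code), STATUS_FAILED)
    return status, result
```

Any exit code that is missing from the map falls back to the failed status, which is why a non-zero exit from the command needs no entry of its own.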

Analysis:
1) Variable analysis
(Pdb) p action_parameters
{}
(Pdb) p self.entry_point
None
(Pdb) p command
u'ls'
(Pdb) p self.action_name
u'local'
(Pdb) p self.liveaction_id
'5edde21d3e2ff7000d50e09d'
(Pdb) p self._timeout
60
(Pdb) p self._sudo_password
None
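2) Logic analysis
The runner checks whether entry_point is set to decide between running a command directly and running a shell script,
and initializes a ShellCommandAction or a ShellScriptAction object accordingly.
It then obtains the complete command to execute, for example:
u'sudo -E -H -u [email protected] -- bash -c ls'
Next it updates the environment variables and builds the following function:
read_and_store_stdout = make_read_and_store_stream_func(execution_db=self.execution,
            action_db=self.action, store_data_func=store_execution_stdout_line)
Finally it calls shell.run_command, passing read_and_store_stdout in as the function that reads stdout.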
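
The factory call above can be pictured as a closure that returns a reader function. The sketch below is a simplified, hypothetical stand-in and not StackStorm's implementation. It takes a single store_line_func argument instead of execution_db, action_db and store_data_func, and it uses io.StringIO in place of a real process pipe.

```python
import io


def make_read_and_store_stream_func(store_line_func):
    """Return a reader that persists each line and also accumulates it in a buffer.

    Simplified, hypothetical signature: the real factory also receives
    execution_db and action_db, which are omitted here.
    """
    def read_and_store_stream(stream, buff):
        # readline() returns '' at EOF, which ends the iteration.
        for line in iter(stream.readline, ''):
            store_line_func(line)   # e.g. persist the line somewhere
            buff.write(line)        # keep the full output for the final result
    return read_and_store_stream


stored_lines = []
reader = make_read_and_store_stream_func(stored_lines.append)

fake_stdout = io.StringIO('bootstrap\ncmd\n')   # stands in for process.stdout
output_buffer = io.StringIO()
reader(fake_stdout, output_buffer)
```

The returned function takes (stream, buffer), which matches how run_command later invokes read_stdout_func(process.stdout, read_stdout_buffer).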

The key piece to analyze is therefore shell.run_command.

4.1.5) Code
The run_command() function in st2/st2common/util/green/shell.py is as follows:

def run_command(cmd, stdin=None, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=False,
                cwd=None, env=None, timeout=60, preexec_func=None, kill_func=None,
                read_stdout_func=None, read_stderr_func=None,
                read_stdout_buffer=None, read_stderr_buffer=None):
    """
    Run the provided command in a subprocess and wait until it completes.

    :param cmd: Command to run.
    :type cmd: ``str`` or ``list``

    :param stdin: Process stdin.
    :type stdin: ``object``

    :param stdout: Process stdout.
    :type stdout: ``object``

    :param stderr: Process stderr.
    :type stderr: ``object``

    :param shell: True to use a shell.
    :type shell: ``boolean``

    :param cwd: Optional working directory.
    :type cwd: ``str``

    :param env: Optional environment to use with the command. If not provided,
                environment from the current process is inherited.
    :type env: ``dict``

    :param timeout: How long to wait before timing out.
    :type timeout: ``float``

    :param preexec_func: Optional pre-exec function.
    :type preexec_func: ``callable``

    :param kill_func: Optional function which will be called on timeout to kill the process.
                      If not provided, it defaults to `process.kill`
    :type kill_func: ``callable``

    :param read_stdout_func: Function which is responsible for reading process stdout when
                                 using live read mode.
    :type read_stdout_func: ``func``

    :param read_stderr_func: Function which is responsible for reading process stderr when
                                 using live read mode.
    :type read_stderr_func: ``func``


    :rtype: ``tuple`` (exit_code, stdout, stderr, timed_out)
    """
    LOG.debug('Entering st2common.util.green.run_command.')

    assert isinstance(cmd, (list, tuple) + six.string_types)

    if (read_stdout_func and not read_stderr_func) or (read_stderr_func and not read_stdout_func):
        raise ValueError('Both read_stdout_func and read_stderr_func arguments need '
                         'to be provided.')

    if read_stdout_func and not (read_stdout_buffer or read_stderr_buffer):
        raise ValueError('read_stdout_buffer and read_stderr_buffer arguments need to be provided '
                         'when read_stdout_func is provided')

    if not env:
        LOG.debug('env argument not provided. using process env (os.environ).')
        env = os.environ.copy()

    # Note: We are using eventlet friendly implementation of subprocess
    # which uses GreenPipe so it doesn't block
    LOG.debug('Creating subprocess.')
    process = subprocess.Popen(args=cmd, stdin=stdin, stdout=stdout, stderr=stderr,
                               env=env, cwd=cwd, shell=shell, preexec_fn=preexec_func)

    if read_stdout_func:
        LOG.debug('Spawning read_stdout_func function')
        read_stdout_thread = eventlet.spawn(read_stdout_func, process.stdout, read_stdout_buffer)

    if read_stderr_func:
        LOG.debug('Spawning read_stderr_func function')
        read_stderr_thread = eventlet.spawn(read_stderr_func, process.stderr, read_stderr_buffer)

    def on_timeout_expired(timeout):
        global timed_out

        try:
            LOG.debug('Starting process wait inside timeout handler.')
            process.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            # Command has timed out, kill the process and propagate the error.
            # Note: We explicitly set the returncode to indicate the timeout.
            LOG.debug('Command execution timeout reached.')
            process.returncode = TIMEOUT_EXIT_CODE

            if kill_func:
                LOG.debug('Calling kill_func.')
                kill_func(process=process)
            else:
                LOG.debug('Killing process.')
                process.kill()

            if read_stdout_func and read_stderr_func:
                LOG.debug('Killing read_stdout_thread and read_stderr_thread')
                read_stdout_thread.kill()
                read_stderr_thread.kill()

    LOG.debug('Spawning timeout handler thread.')
    timeout_thread = eventlet.spawn(on_timeout_expired, timeout)
    LOG.debug('Attaching to process.')

    if read_stdout_func and read_stderr_func:
        LOG.debug('Using real-time stdout and stderr read mode, calling process.wait()')
        process.wait()
    else:
        LOG.debug('Using delayed stdout and stderr read mode, calling process.communicate()')
        stdout, stderr = process.communicate()

    timeout_thread.cancel()
    exit_code = process.returncode

    if read_stdout_func and read_stderr_func:
        # Wait on those green threads to finish reading from stdout and stderr before continuing
        read_stdout_thread.wait()
        read_stderr_thread.wait()

        stdout = read_stdout_buffer.getvalue()
        stderr = read_stderr_buffer.getvalue()

    if exit_code == TIMEOUT_EXIT_CODE:
        LOG.debug('Timeout.')
        timed_out = True
    else:
        LOG.debug('No timeout.')
        timed_out = False

    LOG.debug('Returning.')
    return (exit_code, stdout, stderr, timed_out)

Analysis:
1) Variable analysis
(Pdb) p cmd
u'sudo -E -H -u [email protected] -- bash -c ls'
(Pdb) p stdin
None
(Pdb) p stdout
-1
(Pdb) p stderr
-1
(Pdb) p shell
True
(Pdb) p cwd
None
(Pdb) p env
{'USERNAME': 'root', 'SUDO_COMMAND': '/opt/stackstorm/st2/bin/st2actionrunner --config-file /etc/st2/st2.conf', 'TERM': 'xterm', 'SHELL': '/bin/bash', 'ST2_ACTION_PACK_NAME': u'core', 'ST2_ACTION_EXECUTION_ID': '5edde21d3e2ff7000d50e09e', 'ST2_ACTION_AUTH_TOKEN': u'd8beda985f27486ea54d9d3471415731', 'HOSTNAME': 'dozer-st2actionrunner-0', 'SUDO_UID': '0', 'SUDO_GID': '0', 'LS_COLORS': 'rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=01;05;37;41:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=01;36:*.au=01;36:*.flac=01;36:*.mid=01;36:*.midi=01;36:*.mka=01;36:*.mp3=01;36:*.mpc=01;36:*.ogg=01;36:*.ra=01;36:*.wav=01;36:*.axa=01;36:*.oga=01;36:*.spx=01;36:*.xspf=01;36:', 'LOGNAME': 'root', 'USER': 'root', 'PATH': '/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin', 'ST2_ACTION_API_URL': 'http://st2api:9101/v1', 'MAIL': '/var/mail/root', 
'SUDO_USER': 'root', 'PS1': '\\[\x1b[1m\\]()\\[\x1b(B\x1b[m\\][\\u@\\h \\W]\\$ ', 'HOME': '/root', 'LC_ALL': 'en_US.utf8'}
(Pdb) p timeout
60
(Pdb) p preexec_func
<built-in function setsid>
(Pdb) p kill_func
<function kill_process at 0x49a3668>
(Pdb) p read_stdout_func
<function read_and_store_stream at 0x5828c08>
(Pdb) p read_stderr_func
<function read_and_store_stream at 0x5828b90>
(Pdb) p read_stdout_buffer
<StringIO.StringIO instance at 0x582a560>
(Pdb) p read_stderr_buffer
<StringIO.StringIO instance at 0x582a680>

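2) Logic analysis
To execute a shell-type action, run_command first starts the command with subprocess.Popen:
process = subprocess.Popen(args=cmd, stdin=stdin, stdout=stdout, stderr=stderr,
                               env=env, cwd=cwd, shell=shell, preexec_fn=preexec_func)
Whenever read_stdout_func is set, a green thread (coroutine) is spawned to consume the stdout produced by the subprocess.Popen child (stderr is handled the same way via read_stderr_func):
read_stdout_thread = eventlet.spawn(read_stdout_func, process.stdout, read_stdout_buffer)
In addition, to handle timeouts, the following green thread is spawned; it kills the process once the timeout expires:
timeout_thread = eventlet.spawn(on_timeout_expired, timeout)
Finally, in this real-time read mode,
process.wait()
is called to wait for the action to finish, and the 4-tuple (exit_code, stdout, stderr, timed_out) is returned.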
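
The same pattern (start the child, read both pipes concurrently, kill on timeout, return a 4-tuple) can be sketched with the standard library alone. Note the swapped-in technique: this sketch uses threading.Thread instead of eventlet green threads, and it waits with a timeout in the calling thread instead of spawning a separate timeout handler. It is an illustration of the idea under those assumptions, not the StackStorm code. The run_command and read_stream names here are local to the sketch.

```python
import io
import subprocess
import sys
import threading


def read_stream(stream, buff):
    """Copy a text stream into buff line by line until EOF."""
    for line in iter(stream.readline, ''):
        buff.write(line)


def run_command(cmd, timeout=60):
    """Run cmd, read stdout/stderr concurrently, and kill the child on timeout."""
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                               universal_newlines=True)
    stdout_buffer, stderr_buffer = io.StringIO(), io.StringIO()
    readers = [
        threading.Thread(target=read_stream, args=(process.stdout, stdout_buffer)),
        threading.Thread(target=read_stream, args=(process.stderr, stderr_buffer)),
    ]
    for reader in readers:
        reader.start()

    timed_out = False
    try:
        process.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        timed_out = True
        process.kill()
        process.wait()  # reap the killed child so that returncode is set

    for reader in readers:
        reader.join()  # the pipes reach EOF once the child has exited

    return (process.returncode, stdout_buffer.getvalue(),
            stderr_buffer.getvalue(), timed_out)


# Usage: run a tiny child process through the current Python interpreter.
exit_code, stdout, stderr, timed_out = run_command(
    [sys.executable, '-c', 'print("hello")'], timeout=10)
```

Reading the pipes in separate threads while the main thread waits serves the same purpose as the green threads in the original: a child that writes a lot of output cannot block on a full pipe while the parent is waiting for it to exit.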

3) A sample result:
(Pdb) p exit_code
0
(Pdb) p stdout
'bootstrap\ncmd\nconfig.py\nconfig.pyc\ncontainer\n__init__.py\n__init__.pyc\nnotifier\npolicies\nresultstracker\nrunners\nscheduler.py\nscheduler.pyc\nworker.py\nworker.py_bak\nworker.pyc\n'
(Pdb) p stderr
''
(Pdb) p timeout
60

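5 Summary
1) "st2 run" is one of the most common StackStorm scenarios. In essence it works as follows:
The command line sends a request to the st2api service
--> after several layers of processing, st2api publishes a liveaction message to the 'st2.liveaction.status' exchange with routing_key 'requested'
--> the message is handled by the ActionExecutionScheduler class in the st2actionrunner service, because the queue that class listens on is bound as
'st2.liveaction.status'--->'requested'--->'st2.actionrunner.req'; after processing the message, ActionExecutionScheduler publishes another message
to the 'st2.liveaction.status' exchange, this time with routing_key 'scheduled'
--> that message is handled by the ActionExecutionDispatcher class in the st2actionrunner service, because the queue that class listens on is bound as
'st2.liveaction.status'--->'scheduled'--->'st2.actionrunner.work'
--> ActionExecutionDispatcher calls RunnerContainer.dispatch(self, liveaction_db) to execute the action,
which invokes the run method of the runner that matches the action's runner type.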
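
The message flow in point 1 can be modelled with a toy in-memory router. This is only an illustration of the exchange, routing key and queue bindings named above. It does not use RabbitMQ or kombu, and the publish, scheduler_step and dispatcher_step functions, as well as the 'running' status string, are hypothetical stand-ins for what the real classes do.

```python
from collections import deque

# Bindings named in the summary: (exchange, routing_key) -> queue
BINDINGS = {
    ('st2.liveaction.status', 'requested'): 'st2.actionrunner.req',
    ('st2.liveaction.status', 'scheduled'): 'st2.actionrunner.work',
}

queues = {name: deque() for name in BINDINGS.values()}
events = []


def publish(exchange, routing_key, message):
    """Deliver message to the queue bound to (exchange, routing_key), if any."""
    queue_name = BINDINGS.get((exchange, routing_key))
    if queue_name is not None:
        queues[queue_name].append(message)


def scheduler_step():
    """Stand-in for ActionExecutionScheduler: requested -> scheduled."""
    liveaction = queues['st2.actionrunner.req'].popleft()
    liveaction['status'] = 'scheduled'
    events.append('scheduler handled %s' % liveaction['action'])
    publish('st2.liveaction.status', 'scheduled', liveaction)


def dispatcher_step():
    """Stand-in for ActionExecutionDispatcher: hand the liveaction to a runner."""
    liveaction = queues['st2.actionrunner.work'].popleft()
    liveaction['status'] = 'running'
    events.append('dispatcher ran %s' % liveaction['action'])
    return liveaction


# st2api side: publish the request, then let each consumer take one message.
publish('st2.liveaction.status', 'requested',
        {'action': 'core.local', 'parameters': {'cmd': 'ls'}, 'status': 'requested'})
scheduler_step()
final = dispatcher_step()
```

Both consumers share one exchange and are separated only by routing key, so the status of a liveaction decides which component picks it up next.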
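2) Taking LocalShellRunner as an example, its run method works as follows:
To execute a shell-type action, it first starts the command with subprocess.Popen:
process = subprocess.Popen(args=cmd, stdin=stdin, stdout=stdout, stderr=stderr,
                               env=env, cwd=cwd, shell=shell, preexec_fn=preexec_func)
Whenever read_stdout_func is set, a green thread (coroutine) is spawned to consume the stdout produced by the subprocess.Popen child:
read_stdout_thread = eventlet.spawn(read_stdout_func, process.stdout, read_stdout_buffer)
In addition, to handle timeouts, the following green thread is spawned; it kills the process once the timeout expires:
timeout_thread = eventlet.spawn(on_timeout_expired, timeout)
Finally,
process.wait()
is called to wait for the action to finish, and the 4-tuple (exit_code, stdout, stderr, timed_out) is returned.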

Reference:
StackStorm 2.6 source code
 
