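Goal:
Understand the concurrency capacity of st2actionrunner.
1 Analyzing the actionrunner logic flow
The st2actionrunner service contains two consumer services, ActionExecutionScheduler and ActionExecutionDispatcher.
ActionExecutionScheduler is both a consumer and a producer: it forwards messages to ActionExecutionDispatcher for processing.
When ActionExecutionDispatcher receives a message, it calls a runner to execute the action and generate an execution, which completes the processing.
If the mistral runner is used, a workflow and an execution are created in mistral, and st2 links the mistral execution to its own execution
so that the progress of the execution can be traced.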
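The two-stage flow described above can be sketched with two in-memory queues. This is only an illustration: the queue objects, the dictionary layout, the status strings and the `run_action` callable are assumptions made for the sketch, not the st2 message bus or its data model.

```python
import queue

# Hypothetical in-memory stand-ins for the two message queues.
scheduler_queue = queue.Queue()
dispatcher_queue = queue.Queue()


def scheduler_consume():
    """Consume requests and re-publish them for the dispatcher stage."""
    while not scheduler_queue.empty():
        liveaction = scheduler_queue.get()
        liveaction["status"] = "scheduled"
        dispatcher_queue.put(liveaction)


def dispatcher_consume(run_action):
    """Consume scheduled requests and hand each one to a runner callable."""
    results = []
    while not dispatcher_queue.empty():
        liveaction = dispatcher_queue.get()
        liveaction["result"] = run_action(liveaction["action"])
        liveaction["status"] = "succeeded"
        results.append(liveaction)
    return results


scheduler_queue.put({"action": "core.local", "status": "requested"})
scheduler_consume()
executions = dispatcher_consume(lambda action: "ran " + action)
print(executions)
```

The first stage is both a consumer and a producer, while only the second stage ever calls the runner.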
2 ActionExecutionScheduler only forwards messages, so the component that actually executes the runner is
ActionExecutionDispatcher, which calls RunnerContainer.
The python runner is located at:
st2/contrib/runners/python_runner/python_runner/python_runner.py
The run method of the PythonRunner class calls the run_command function,
which is imported with
from st2common.util.green.shell import run_command
This leads to
run_command in st2/st2common/st2common/util/green/shell.py:
def run_command(cmd, stdin=None, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=False,
                cwd=None, env=None, timeout=60, preexec_func=None, kill_func=None,
                read_stdout_func=None, read_stderr_func=None,
                read_stdout_buffer=None, read_stderr_buffer=None):
    """
    Run the provided command in a subprocess and wait until it completes.
    :param cmd: Command to run.
    :type cmd: ``str`` or ``list``
    :param stdin: Process stdin.
    :type stdin: ``object``
    :param stdout: Process stdout.
    :type stdout: ``object``
    :param stderr: Process stderr.
    :type stderr: ``object``
    :param shell: True to use a shell.
    :type shell: ``boolean``
    :param cwd: Optional working directory.
    :type cwd: ``str``
    :param env: Optional environment to use with the command. If not provided,
                environment from the current process is inherited.
    :type env: ``dict``
    :param timeout: How long to wait before timing out.
    :type timeout: ``float``
    :param preexec_func: Optional pre-exec function.
    :type preexec_func: ``callable``
    :param kill_func: Optional function which will be called on timeout to kill the process.
                      If not provided, it defaults to `process.kill`
    :type kill_func: ``callable``
    :param read_stdout_func: Function which is responsible for reading process stdout when
                             using live read mode.
    :type read_stdout_func: ``func``
    :param read_stderr_func: Function which is responsible for reading process stderr when
                             using live read mode.
    :type read_stderr_func: ``func``
    :rtype: ``tuple`` (exit_code, stdout, stderr, timed_out)
    """
    LOG.debug('Entering st2common.util.green.run_command.')
    assert isinstance(cmd, (list, tuple) + six.string_types)
    if (read_stdout_func and not read_stderr_func) or (read_stderr_func and not read_stdout_func):
        raise ValueError('Both read_stdout_func and read_stderr_func arguments need '
                         'to be provided.')
    if read_stdout_func and not (read_stdout_buffer or read_stderr_buffer):
        raise ValueError('read_stdout_buffer and read_stderr_buffer arguments need to be provided '
                         'when read_stdout_func is provided')
    if not env:
        LOG.debug('env argument not provided. using process env (os.environ).')
        env = os.environ.copy()
    # Note: We are using eventlet friendly implementation of subprocess
    # which uses GreenPipe so it doesn't block
    LOG.debug('Creating subprocess.')
    process = subprocess.Popen(args=cmd, stdin=stdin, stdout=stdout, stderr=stderr,
                               env=env, cwd=cwd, shell=shell, preexec_fn=preexec_func)
    if read_stdout_func:
        LOG.debug('Spawning read_stdout_func function')
        read_stdout_thread = eventlet.spawn(read_stdout_func, process.stdout, read_stdout_buffer)
    if read_stderr_func:
        LOG.debug('Spawning read_stderr_func function')
        read_stderr_thread = eventlet.spawn(read_stderr_func, process.stderr, read_stderr_buffer)
    def on_timeout_expired(timeout):
        global timed_out
        try:
            LOG.debug('Starting process wait inside timeout handler.')
            process.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            # Command has timed out, kill the process and propagate the error.
            # Note: We explicitly set the returncode to indicate the timeout.
            LOG.debug('Command execution timeout reached.')
            process.returncode = TIMEOUT_EXIT_CODE
            if kill_func:
                LOG.debug('Calling kill_func.')
                kill_func(process=process)
            else:
                LOG.debug('Killing process.')
                process.kill()
            if read_stdout_func and read_stderr_func:
                LOG.debug('Killing read_stdout_thread and read_stderr_thread')
                read_stdout_thread.kill()
                read_stderr_thread.kill()
    LOG.debug('Spawning timeout handler thread.')
    timeout_thread = eventlet.spawn(on_timeout_expired, timeout)
    LOG.debug('Attaching to process.')
    if read_stdout_func and read_stderr_func:
        LOG.debug('Using real-time stdout and stderr read mode, calling process.wait()')
        process.wait()
    else:
        LOG.debug('Using delayed stdout and stderr read mode, calling process.communicate()')
        stdout, stderr = process.communicate()
    timeout_thread.cancel()
    exit_code = process.returncode
    if read_stdout_func and read_stderr_func:
        # Wait on those green threads to finish reading from stdout and stderr before continuing
        read_stdout_thread.wait()
        read_stderr_thread.wait()
        stdout = read_stdout_buffer.getvalue()
        stderr = read_stderr_buffer.getvalue()
    if exit_code == TIMEOUT_EXIT_CODE:
        LOG.debug('Timeout.')
        timed_out = True
    else:
        LOG.debug('No timeout.')
        timed_out = False
    LOG.debug('Returning.')
    return (exit_code, stdout, stderr, timed_out)
Analysis:
The most important part is:
process = subprocess.Popen(args=cmd, stdin=stdin, stdout=stdout, stderr=stderr,
                           env=env, cwd=cwd, shell=shell, preexec_fn=preexec_func)
if read_stdout_func:
    LOG.debug('Spawning read_stdout_func function')
    read_stdout_thread = eventlet.spawn(read_stdout_func, process.stdout, read_stdout_buffer)
if read_stderr_func:
    LOG.debug('Spawning read_stderr_func function')
    read_stderr_thread = eventlet.spawn(read_stderr_func, process.stderr, read_stderr_buffer)
This uses the subprocess module from eventlet.green, imported as follows:
from eventlet.green import subprocess
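eventlet.spawn starts coroutines (green threads); once execution completes, the result is written to the action execution result.
A coroutine is also used to detect whether the command has timed out.
The final return value is: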
return (exit_code, stdout, stderr, timed_out)
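The shape of that return value can be reproduced with the standard library alone. The sketch below is not the st2 implementation: it uses the blocking `subprocess` module rather than `eventlet.green.subprocess`, the function name `run_with_timeout` is hypothetical, and the value chosen for `TIMEOUT_EXIT_CODE` is an assumption made for the example.

```python
import subprocess
import sys

TIMEOUT_EXIT_CODE = -9  # assumed sentinel value for this sketch


def run_with_timeout(cmd, timeout=60):
    """Run cmd and return (exit_code, stdout, stderr, timed_out)."""
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    try:
        stdout, stderr = process.communicate(timeout=timeout)
        exit_code = process.returncode
        timed_out = False
    except subprocess.TimeoutExpired:
        # Kill the child, collect whatever it wrote, and flag the timeout.
        process.kill()
        stdout, stderr = process.communicate()
        exit_code = TIMEOUT_EXIT_CODE
        timed_out = True
    return (exit_code, stdout, stderr, timed_out)


ok = run_with_timeout([sys.executable, "-c", "print('hello')"], timeout=30)
slow = run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], timeout=1)
print(ok[0], ok[3], slow[0], slow[3])
```

Unlike the green version, this call blocks the calling thread while the child runs. The non-blocking behaviour in st2 comes from eventlet's green subprocess and green threads.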
No eventlet-specific parameters are set here, so the defaults apply for:
eventlet.green.subprocess
Reference documentation:
https://kite.com/python/docs/eventlet.green.subprocess
Along the way, the following function is also called:
st2/st2common/st2common/services/action.py
def store_execution_output_data(execution_db, action_db, data, output_type='output',
                                timestamp=None):
    """
    Store output from an execution as a new document in the collection.
    """
    execution_id = str(execution_db.id)
    action_ref = action_db.ref
    runner_ref = getattr(action_db, 'runner_type', {}).get('name', 'unknown')
    timestamp = timestamp or date_utils.get_datetime_utc_now()
    output_db = ActionExecutionOutputDB(execution_id=execution_id,
                                        action_ref=action_ref,
                                        runner_ref=runner_ref,
                                        timestamp=timestamp,
                                        output_type=output_type,
                                        data=data)
    output_db = ActionExecutionOutput.add_or_update(output_db, publish=True,
                                                    dispatch_trigger=False)
    return output_db
Reference:
https://blog.csdn.net/qingyuanluofeng/java/article/details/105398730
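Summary:
1) The service that schedules action execution is a message consumer running as 3 replicas. For scheduled tasks, this consumer calls
the python runner to execute the task's python script, and the script is run using
coroutines, specifically: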
from eventlet.green import subprocess
process = subprocess.Popen(args=cmd, stdin=stdin, stdout=stdout, stderr=stderr,
                           env=env, cwd=cwd, shell=shell, preexec_fn=preexec_func)
read_stdout_thread = eventlet.spawn(read_stdout_func, process.stdout, read_stdout_buffer)
read_stderr_thread = eventlet.spawn(read_stderr_func, process.stderr, read_stderr_buffer)
3 The consumer itself
3.1 Analyzing ActionExecutionDispatcher
The corresponding code is in st2/st2actions/st2actions/worker.py:
class ActionExecutionDispatcher(MessageHandler):
    message_type = LiveActionDB
    def __init__(self, connection, queues):
        super(ActionExecutionDispatcher, self).__init__(connection, queues)
        self.container = RunnerContainer()
        self._running_liveactions = set()
    def get_queue_consumer(self, connection, queues):
        # We want to use a special ActionsQueueConsumer which uses 2 dispatcher pools
        return ActionsQueueConsumer(connection=connection, queues=queues, handler=self)
    def process(self, liveaction):
        ......
Analysis:
The key is the consumer ActionsQueueConsumer(connection=connection, queues=queues, handler=self);
the question is whether it supports concurrent processing.
3.2 Analyzing ActionsQueueConsumer
st2/st2common/st2common/transport/consumers.py
The code is as follows:
class ActionsQueueConsumer(QueueConsumer):
    """
    Special Queue Consumer for action runner which uses multiple BufferedDispatcher pools:
    1. For regular (non-workflow) actions
    2. One for workflow actions
    This way we can ensure workflow actions never block non-workflow actions.
    """
    def __init__(self, connection, queues, handler):
        self.connection = connection
        self._queues = queues
        self._handler = handler
        workflows_pool_size = cfg.CONF.actionrunner.workflows_pool_size
        actions_pool_size = cfg.CONF.actionrunner.actions_pool_size
        self._workflows_dispatcher = BufferedDispatcher(dispatch_pool_size=workflows_pool_size,
                                                        name='workflows-dispatcher')
        self._actions_dispatcher = BufferedDispatcher(dispatch_pool_size=actions_pool_size,
                                                      name='actions-dispatcher')
    def process(self, body, message):
        try:
            if not isinstance(body, self._handler.message_type):
                raise TypeError('Received an unexpected type "%s" for payload.' % type(body))
            action_is_workflow = getattr(body, 'action_is_workflow', False)
            if action_is_workflow:
                # Use workflow dispatcher queue
                dispatcher = self._workflows_dispatcher
            else:
                # Use queue for regular or workflow actions
                dispatcher = self._actions_dispatcher
            LOG.debug('Using BufferedDispatcher pool: "%s"', str(dispatcher))
            dispatcher.dispatch(self._process_message, body)
        except:
            LOG.exception('%s failed to process message: %s', self.__class__.__name__, body)
        finally:
            # At this point we will always ack a message.
            message.ack()
    def shutdown(self):
        self._workflows_dispatcher.shutdown()
        self._actions_dispatcher.shutdown()
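The routing decision made in `process` can be isolated into a few lines. In this sketch the helper name `pick_dispatcher` and the use of `SimpleNamespace` objects and plain strings as stand-ins for messages and dispatchers are assumptions for illustration only.

```python
from types import SimpleNamespace


def pick_dispatcher(body, workflows_dispatcher, actions_dispatcher):
    """Return the pool that should handle this message."""
    if getattr(body, "action_is_workflow", False):
        return workflows_dispatcher
    return actions_dispatcher


workflow_msg = SimpleNamespace(action_is_workflow=True)
regular_msg = SimpleNamespace()  # attribute missing, treated as non-workflow

chosen_for_workflow = pick_dispatcher(workflow_msg, "workflows-dispatcher", "actions-dispatcher")
chosen_for_regular = pick_dispatcher(regular_msg, "workflows-dispatcher", "actions-dispatcher")
print(chosen_for_workflow, chosen_for_regular)
```

Because the two kinds of message go to separate pools, a burst of long-running workflow actions cannot use up the slots reserved for regular actions.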
Analysis:
self._actions_dispatcher = BufferedDispatcher(dispatch_pool_size=actions_pool_size, name='actions-dispatcher')
The coroutine pool sizes come from:
workflows_pool_size = cfg.CONF.actionrunner.workflows_pool_size
actions_pool_size = cfg.CONF.actionrunner.actions_pool_size
These options are defined in:
st2/st2common/st2common/config.py
dispatcher_pool_opts = [
    cfg.IntOpt('workflows_pool_size', default=40,
               help='Internal pool size for dispatcher used by workflow actions.'),
    cfg.IntOpt('actions_pool_size', default=60,
               help='Internal pool size for dispatcher used by regular actions.')
]
do_register_opts(dispatcher_pool_opts, group='actionrunner')
So the coroutine pool for workflows defaults to 40 and the coroutine pool for actions defaults to 60.
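The fallback to these defaults can be illustrated with the standard library's `configparser`. Note that st2 reads its configuration through oslo.config, not `configparser`; the helper name `effective_pool_sizes` and the sample config strings are assumptions made for this sketch.

```python
import configparser

# Defaults copied from the dispatcher_pool_opts definition.
DEFAULTS = {"workflows_pool_size": 40, "actions_pool_size": 60}

# A trimmed-down config whose [actionrunner] section sets neither option.
SAMPLE_CONF = """
[actionrunner]
logging = /etc/st2/logging.actionrunner.conf
"""


def effective_pool_sizes(conf_text):
    """Return the pool sizes in effect, falling back to the defaults."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return {
        name: parser.getint("actionrunner", name, fallback=default)
        for name, default in DEFAULTS.items()
    }


sizes = effective_pool_sizes(SAMPLE_CONF)
overridden = effective_pool_sizes("[actionrunner]\nactions_pool_size = 100\n")
print(sizes, overridden)
```

Setting either option under `[actionrunner]` would change the corresponding pool size; leaving it out keeps the default.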
Contents of /etc/st2/st2.conf (the [actionrunner] section does not override either pool size, so the defaults apply):
[actionrunner]
logging = /etc/st2/logging.actionrunner.conf
[api]
# allow_origin is required for handling CORS in st2 web UI.
allow_origin = *
# Host and port to bind the API server.
host = 127.0.0.1
logging = /etc/st2/logging.api.conf
mask_secrets = true
port = 9101
[auth]
# Base URL to the API endpoint excluding the version (e.g. http://myhost.net:9101/)
api_url = http://st2api:9101
mode = standalone
# Note: Settings below are only used in "standalone" mode
# backend: flat_file
# backend_kwargs: '{"file_path": "/etc/st2/htpasswd"}'
backend = keystone
backend_kwargs = {"keystone_url": "http://keystone-api.openstack.svc.cluster.local:80", "keystone_version": 3, "keystone_mode": "email"}
debug = false
enable = true
host = 127.0.0.1
logging = /etc/st2/logging.auth.conf
port = 9100
use_ssl = false
[content]
packs_base_paths = /opt/stackstorm/packs.dev
[coordination]
url = redis://[email protected]:6379
[database]
host = mongodb.openstack.svc.cluster.local
password = dozer
port = 27017
username = dozer
[exporter]
logging = /etc/st2/logging.exporter.conf
[garbagecollector]
action_executions_output_ttl = 14
action_executions_ttl = 14
logging = /etc/st2/logging.garbagecollector.conf
purge_inquiries = true
trigger_instances_ttl = 14
[keyvalue]
encryption_key_path = /etc/st2/keys/datastore_key.json
[log]
excludes = requests,paramiko
mask_secrets = true
redirect_stderr = false
[messaging]
url = amqp://rabbitmq:[email protected]:5672
[mistral]
api_url = http://st2api:9101
v2_base_url = http://mistral-api:8989/v2
[notifier]
logging = /etc/st2/logging.notifier.conf
[rbac]
enable = false
permission_isolation = true
sync_remote_groups = true
[resultstracker]
logging = /etc/st2/logging.resultstracker.conf
[rulesengine]
logging = /etc/st2/logging.rulesengine.conf
[sensorcontainer]
logging = /etc/st2/logging.sensorcontainer.conf
[ssh_runner]
remote_dir = /tmp
[stream]
logging = /etc/st2/logging.stream.conf
[syslog]
facility = local7
host = 127.0.0.1
port = 514
protocol = udp
[system]
base_path = /opt/stackstorm
[system_user]
ssh_key_file = /home/adminATexample.org/.ssh/admin_rsa
user = [email protected]
The key line is:
self._actions_dispatcher = BufferedDispatcher(dispatch_pool_size=actions_pool_size, name='actions-dispatcher')
3.3 Analyzing BufferedDispatcher
This is implemented by
BufferedDispatcher in st2/st2common/st2common/util/greenpooldispatch.py:
class BufferedDispatcher(object):
    def __init__(self, dispatch_pool_size=50, monitor_thread_empty_q_sleep_time=5,
                 monitor_thread_no_workers_sleep_time=1, name=None):
        self._pool_limit = dispatch_pool_size
        self._dispatcher_pool = eventlet.GreenPool(dispatch_pool_size)
        self._dispatch_monitor_thread = eventlet.greenthread.spawn(self._flush)
        self._monitor_thread_empty_q_sleep_time = monitor_thread_empty_q_sleep_time
        self._monitor_thread_no_workers_sleep_time = monitor_thread_no_workers_sleep_time
        self._name = name
        self._work_buffer = Queue.Queue()
        # Internal attributes we use to track how long the pool is busy without any free workers
        self._pool_last_free_ts = time.time()
    @property
    def name(self):
        return self._name or id(self)
    def dispatch(self, handler, *args):
        self._work_buffer.put((handler, args), block=True, timeout=1)
        self._flush_now()
    def shutdown(self):
        self._dispatch_monitor_thread.kill()
    def _flush(self):
        while True:
            while self._work_buffer.empty():
                eventlet.greenthread.sleep(self._monitor_thread_empty_q_sleep_time)
            while self._dispatcher_pool.free() <= 0:
                eventlet.greenthread.sleep(self._monitor_thread_no_workers_sleep_time)
            self._flush_now()
    def _flush_now(self):
        if self._dispatcher_pool.free() <= 0:
            now = time.time()
            if (now - self._pool_last_free_ts) >= POOL_BUSY_THRESHOLD_SECONDS:
                LOG.info(POOL_BUSY_LOG_MESSAGE % (self.name, POOL_BUSY_THRESHOLD_SECONDS))
            return
        # Update the time of when there were free threads available
        self._pool_last_free_ts = time.time()
        while not self._work_buffer.empty() and self._dispatcher_pool.free() > 0:
            (handler, args) = self._work_buffer.get_nowait()
            self._dispatcher_pool.spawn(handler, *args)
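Analysis:
1) self._dispatcher_pool = eventlet.GreenPool(dispatch_pool_size)
2) def dispatch(self, handler, *args):
pushes the handler to execute and the arguments it needs onto the work buffer,
then calls _flush_now to start processing.
3) _flush_now(self):
while the work buffer is not empty and the coroutine pool still has free slots, it takes the next pending task and its arguments from the buffer and spawns it in the pool.
4) The most important part is that messages are processed in the coroutine pool: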
self._dispatcher_pool.spawn(handler, *args)
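The buffer-plus-bounded-pool pattern can be tried out without eventlet. The sketch below swaps green threads for OS threads from `concurrent.futures`, so it illustrates the pattern only and is not the st2 class; `BufferedPool`, `fake_action` and the `stats` dictionary are hypothetical names.

```python
import queue
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class BufferedPool:
    """Unbounded work buffer in front of a fixed-size worker pool."""

    def __init__(self, pool_size):
        self._buffer = queue.Queue()
        self._pool = ThreadPoolExecutor(max_workers=pool_size)
        self._free = pool_size
        self._lock = threading.Lock()

    def dispatch(self, handler, *args):
        self._buffer.put((handler, args))
        self._flush_now()

    def _flush_now(self):
        # Start buffered work only while a worker slot is free.
        with self._lock:
            while self._free > 0 and not self._buffer.empty():
                handler, args = self._buffer.get_nowait()
                self._free -= 1
                self._pool.submit(self._run, handler, args)

    def _run(self, handler, args):
        try:
            handler(*args)
        finally:
            with self._lock:
                self._free += 1
            self._flush_now()  # a slot was just freed, pull the next item
            self._buffer.task_done()

    def wait(self):
        self._buffer.join()  # returns once every dispatched item has run
        self._pool.shutdown(wait=True)


stats = {"active": 0, "peak": 0, "done": []}
stats_lock = threading.Lock()


def fake_action(n):
    with stats_lock:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
    time.sleep(0.05)
    with stats_lock:
        stats["active"] -= 1
        stats["done"].append(n)


pool = BufferedPool(pool_size=3)
for i in range(10):
    pool.dispatch(fake_action, i)
pool.wait()
print(sorted(stats["done"]), stats["peak"])
```

All ten tasks complete, but never more than three run at the same time. The pool size is therefore the ceiling on concurrency, and the buffer absorbs any excess.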
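4 Summary
1) By default, st2actionrunner's coroutine pool for workflows has size 40 and its coroutine pool for actions has size 60.
With 3 actionrunner services, up to 120 workflows and 180 actions can be processed concurrently.
2) ActionExecutionDispatcher, the main entry class of st2actionrunner, uses the consumer
ActionsQueueConsumer, which instantiates two BufferedDispatcher objects up front: one for workflow actions
and one for regular (non-workflow) actions, such as those using the python runner. Each incoming message is routed to the matching BufferedDispatcher according to its type.
Internally, a BufferedDispatcher is a queue plus a coroutine pool.
The pending message and its handler are put on the BufferedDispatcher's queue, then taken off the queue and processed in the coroutine pool.
So st2actionrunner is concurrent by design.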
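The capacity figures in point 1) are simply replicas multiplied by pool size. A small helper (the name `max_concurrency` is hypothetical) makes the arithmetic explicit and easy to redo for other replica counts or pool sizes:

```python
def max_concurrency(replicas, workflows_pool_size=40, actions_pool_size=60):
    """Upper bound on concurrently processed messages across all replicas."""
    return {
        "workflows": replicas * workflows_pool_size,
        "actions": replicas * actions_pool_size,
    }


capacity = max_concurrency(replicas=3)
print(capacity)  # {'workflows': 120, 'actions': 180}
```

These numbers are upper bounds on the number of coroutines, not a measured throughput.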
Reference:
StackStorm 2.6 source code