Because Celery executes tasks concurrently, log lines written by different worker processes end up interleaved in the same file, and tracing a single problem back through the logs becomes very painful.
If every log line carried the ID of the current task, this would be much easier.
The core code:

from celery._state import get_current_task

task = get_current_task()
if task and task.request:
    task_id = task.request.id
    task_name = task.name
一、Prepare the environment
It is recommended to work as the root user, to avoid running into permission problems while installing, configuring, and running.
- OS: a Debian-family distribution is recommended (or any other Linux; Celery dropped Windows support as of 4.0)
- Redis server: 4.0.8 (sudo apt-get install redis-server, then sudo service redis-server start)
- Python: 3.6.5
- celery: 4.2.0 (install with pip3 install https://github.com/celery/celery/tarball/v4.2.0-136-gc1d0bfe; the package installed by a plain pip install celery contains an async module whose name conflicts with Python's async keyword)
- redis (Python client): 3.0.1 (pip install redis)
二、Write the code
import logging

from celery._state import get_current_task


class Formatter(logging.Formatter):
    """Formatter for tasks, adding the task name and id."""

    def format(self, record):
        task = get_current_task()
        if task and task.request:
            record.__dict__.update(task_id='%s ' % task.request.id,
                                   task_name='%s ' % task.name)
        else:
            record.__dict__.setdefault('task_name', '')
            record.__dict__.setdefault('task_id', '')
        return logging.Formatter.format(self, record)


root_logger = logging.getLogger()  # returns logging.root
root_logger.setLevel(logging.DEBUG)

# Write logs to a file.
# Note: do not use TimedRotatingFileHandler here -- every celery worker
# process would rotate the file independently, losing log lines.
fh = logging.FileHandler('celery_worker.log')
formatter = Formatter('[%(task_name)s%(task_id)s%(process)s %(thread)s %(asctime)s %(pathname)s:%(lineno)s] %(levelname)s: %(message)s',
                      datefmt='%Y-%m-%d %H:%M:%S')
fh.setFormatter(formatter)
fh.setLevel(logging.DEBUG)
root_logger.addHandler(fh)

# Also write logs to the console.
sh = logging.StreamHandler()
sh.setFormatter(formatter)
sh.setLevel(logging.INFO)
root_logger.addHandler(sh)
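The else branch of the Formatter matters because log calls can also happen outside of any task (e.g. at worker startup), where get_current_task() returns None. A minimal standalone sketch of the same pattern, with get_current_task stubbed out to simulate running outside a worker so that it does not require celery:

```python
import logging


def get_current_task():
    # Stub standing in for celery._state.get_current_task:
    # outside a running task it returns None.
    return None


class TaskFormatter(logging.Formatter):
    """Adds task_name/task_id fields; both are empty when no task is active."""

    def format(self, record):
        task = get_current_task()
        if task and task.request:
            record.__dict__.update(task_id='%s ' % task.request.id,
                                   task_name='%s ' % task.name)
        else:
            record.__dict__.setdefault('task_name', '')
            record.__dict__.setdefault('task_id', '')
        return logging.Formatter.format(self, record)


fmt = TaskFormatter('[%(task_name)s%(task_id)s%(levelname)s] %(message)s')
record = logging.LogRecord('root', logging.INFO, __file__, 1,
                           'hello', None, None)
print(fmt.format(record))  # task fields are empty -> "[INFO] hello"
```

Inside a worker the real get_current_task() returns the executing task, and the two extra fields fill in the placeholders in the format string above.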
class CeleryConfig(object):
    BROKER_URL = 'redis://localhost:6379/0'
    CELERY_RESULT_BACKEND = 'redis://localhost:6379/0'
    CELERY_TASK_SERIALIZER = 'pickle'  # json is the default since 4.0; earlier versions defaulted to pickle (which can carry binary objects)
    CELERY_RESULT_SERIALIZER = 'pickle'
    CELERY_ACCEPT_CONTENT = ['json', 'pickle']
    CELERY_ENABLE_UTC = True  # enable UTC
    CELERY_TIMEZONE = 'Asia/Shanghai'  # Shanghai timezone
    CELERYD_HIJACK_ROOT_LOGGER = False  # stop celery from hijacking the root logger configuration
    CELERYD_MAX_TASKS_PER_CHILD = 1  # recycle a worker process after each task (a fresh process handles the next one, which works around memory leaks)
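Since 4.0 Celery also accepts lowercase new-style setting names; the uppercase names above still work in 4.x, but the same configuration could be written as the following sketch (mapping each old name to what I believe is its new-style equivalent):

```python
class CeleryConfig(object):
    broker_url = 'redis://localhost:6379/0'
    result_backend = 'redis://localhost:6379/0'
    task_serializer = 'pickle'
    result_serializer = 'pickle'
    accept_content = ['json', 'pickle']
    enable_utc = True
    timezone = 'Asia/Shanghai'
    worker_hijack_root_logger = False
    worker_max_tasks_per_child = 1
```

Either style is passed to app.config_from_object() the same way; mixing the two styles in one config object is what Celery warns about, so pick one.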
import logging

from celery import Celery, platforms

import celeryconfig
import detail

platforms.C_FORCE_ROOT = True  # the config enables pickle serialization, so explicitly allow the worker to run as root

app = Celery(__name__)
app.config_from_object(celeryconfig.CeleryConfig)


@app.task(bind=True)
def heavy_task(self, seconds=1):
    logging.info("I'm heavy_task")  # uses logging.root by default
    return detail.process_heavy_task(seconds)
import logging
import time


def process_heavy_task(seconds=1):
    logging.info("I'm process_heavy_task")  # uses logging.root by default
    time.sleep(seconds)
    return True
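Note that detail.py needs no celery-specific wiring: because it logs through the module-level logging.info (i.e., the root logger), its records automatically flow through the handlers and the task-aware Formatter installed earlier. A stdlib-only sketch of that behavior, capturing root-logger output from a helper function into a buffer:

```python
import io
import logging


def helper():
    # Stands in for detail.process_heavy_task: logs via the root logger.
    logging.info("I'm the helper")
    return True


buf = io.StringIO()
handler = logging.StreamHandler(buf)
handler.setFormatter(logging.Formatter('%(levelname)s: %(message)s'))

root = logging.getLogger()
root.setLevel(logging.DEBUG)
root.addHandler(handler)

helper()
print(buf.getvalue().strip())  # INFO: I'm the helper
```

This is also why CELERYD_HIJACK_ROOT_LOGGER = False matters: with hijacking enabled, celery would reconfigure the root logger and discard the handlers set up here.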
三、Start and test
- Open a new shell window and start the celery worker:
# ls
celeryconfig.py  detail.py  tasks.py
# celery worker -A tasks -l info
- Open a new shell window and watch the log file:
# ls
celeryconfig.py  celery_worker.log  detail.py  tasks.py
# tail -f celery_worker.log
- Open a new shell window and call the celery task:
# ls
celeryconfig.py  celery_worker.log  detail.py  tasks.py
# python3
>>> import tasks
>>> t = tasks.heavy_task.delay(3)
>>> t.result
True
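Once every line carries the task ID, the trail of a single task can be pulled out of the interleaved log with a plain grep. A sketch against a hand-made sample file (the IDs and log lines are made up for illustration):

```shell
# Two tasks whose output is interleaved in the shared log file.
printf '%s\n' \
  '[heavy_task aaa-111 INFO] step 1' \
  '[heavy_task bbb-222 INFO] step 1' \
  '[heavy_task aaa-111 INFO] step 2' > /tmp/celery_worker.log

# Recover the timeline of one task by its ID.
grep 'aaa-111' /tmp/celery_worker.log
```

The same filter works on the real celery_worker.log: copy the task ID printed when the task is queued (or from t.id) and grep for it.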
四、Result screenshots