Deploying Tongyi Qianwen (Qwen) in a Container

Prepare the Server

  • Alibaba Cloud ECS server
  • Instance type: lightweight GPU instance ecs.vgn6i-m4-vws.xlarge (4 vCPUs, 23 GiB memory)
  • Disk space: 50 GB
  • Operating system: Ubuntu 22.04

Install Docker

apt update
apt install -y docker.io
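
Once the package is installed, a quick check should confirm the Docker daemon is up (on Ubuntu, the docker.io package enables and starts the systemd service automatically):

docker --version
systemctl status docker --no-pager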

Install the NVIDIA GRID Driver

acs-plugin-manager --exec --plugin grid_driver_install
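
After the plugin completes (reboot first if the installer asks for it), nvidia-smi should report the vGPU and driver version; if it errors out, the driver did not load:

nvidia-smi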

Install the NVIDIA Container Toolkit

  • Installation commands
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
apt-get update
apt-get install -y nvidia-container-toolkit
  • Configuration commands
nvidia-ctk runtime configure --runtime=docker
systemctl restart docker
  • Verify the installation
docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi

Download the Model Checkpoint

  • Create the download script download-model-checkpoint.py
from modelscope import snapshot_download
from transformers import AutoModelForCausalLM, AutoTokenizer

# Download the model checkpoint to a local directory (returned as model_dir)
model_dir = snapshot_download('qwen/Qwen-7B-Chat')

# Load the checkpoint from the local directory as a smoke test.
# trust_remote_code must stay True because the model code is loaded
# from the local directory rather than shipped with transformers.
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_dir,
    device_map="auto",
    trust_remote_code=True
).eval()
  • Install the script's dependencies
pip install modelscope
pip install transformers
pip install torch
pip install tiktoken
pip install transformers_stream_generator
pip install accelerate
  • Run the script to download the model checkpoints
python3 download-model-checkpoint.py 

Note: the model checkpoint files are downloaded to ~/.cache/modelscope/hub/qwen/Qwen-7B-Chat (this path is the value of the model_dir variable).
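
Before starting the service, it is worth listing that directory to confirm the download completed (the path comes from the note above):

ls -lh ~/.cache/modelscope/hub/qwen/Qwen-7B-Chat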

Start the Container to Serve the Model (OpenAI API-Compatible)

  • Check out the Qwen source code
git clone https://github.com/QwenLM/Qwen.git
  • Start the container with the script below
cd Qwen
IMAGE_NAME=qwenllm/qwen:cu114
PORT=8901
CHECKPOINT_PATH=~/.cache/modelscope/hub/qwen/Qwen-7B-Chat
bash docker/docker_openai_api.sh -i ${IMAGE_NAME} -c ${CHECKPOINT_PATH} --port ${PORT}

Note: the qwenllm/qwen:cu114 image is 9.87 GB.

  • Confirm the container started successfully
# docker ps
CONTAINER ID   IMAGE                COMMAND                  CREATED         STATUS         PORTS                                   NAMES
b2bd3f3417af   qwenllm/qwen:cu114   "/opt/nvidia/nvidia_…"   3 minutes ago   Up 3 minutes   0.0.0.0:8901->80/tcp, :::8901->80/tcp   qwen

The container started successfully!
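
If the container is listed but the API does not respond in the next step, the server log is the first place to look (the container name qwen comes from the docker ps output above):

docker logs -f qwen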

  • Confirm the API responds to requests
# curl localhost:8901/v1/models | jq

Output:

{
  "object": "list",
  "data": [
    {
      "id": "gpt-3.5-turbo",
      "object": "model",
      "created": 1707471911,
      "owned_by": "owner",
      "root": null,
      "parent": null,
      "permission": null
    }
  ]
}

The request succeeded! The service is compatible with the OpenAI API.
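
As a final smoke test, the chat endpoint can be exercised the same way. This is a minimal sketch: the model id gpt-3.5-turbo is the one reported by /v1/models above, and /v1/chat/completions is the standard OpenAI-compatible route:

curl http://localhost:8901/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": "Hello! Please introduce yourself briefly."}]
      }' | jq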
