Deploying Langchain-Chatchat on Windows 11

Original

un8134

2023-11-14 14:10

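LangChain-Chatchat (formerly Langchain-ChatGLM)

An open-source retrieval-augmented generation (RAG) knowledge-base project that can be deployed offline, built on large language models such as ChatGLM and application frameworks such as Langchain.

Project repository: https://github.com/chatchat-space/Langchain-Chatchat

Deployment is fairly simple: just follow the documentation step by step.

First check your Python version; 3.10 is recommended: https://www.python.org/download/releases/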

python --version
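The same check can be done from inside Python, which helps when several interpreters are installed and it is not obvious which one `python` resolves to. This is a minimal sketch: the helper name `is_recommended_python` is made up for illustration, and treating 3.10 as the target simply follows the recommendation above.

```python
import sys


def is_recommended_python(version_info=sys.version_info, wanted=(3, 10)):
    """Return True if the major.minor version matches the recommended one."""
    return tuple(version_info[:2]) == tuple(wanted)


if __name__ == "__main__":
    # Show the full interpreter version, then whether it is the recommended one.
    print(sys.version)
    print("recommended version:", is_recommended_python())
```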

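Then clone the repository and install the dependencies:

# Clone the repository
git clone https://github.com/chatchat-space/Langchain-Chatchat.git

# Enter the directory
cd Langchain-Chatchat

# Install all dependencies
pip install -r requirements.txt

I installed both the web UI and the API; you can install just the parts you need:

# Install the API dependencies
pip install -r requirements_api.txt

# Install the web UI dependencies
pip install -r requirements_webui.txt

Next we need to download the models. First install Git LFS: https://docs.github.com/zh/repositories/working-with-files/managing-large-files/installing-git-large-file-storage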

git lfs install
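Without Git LFS, cloning a model repository may leave small pointer files in place of the real weights, so it is worth confirming that the extension is usable before downloading several gigabytes. A rough sketch, assuming `git` is on PATH; the function name `git_lfs_available` is hypothetical.

```python
import shutil
import subprocess


def git_lfs_available():
    """Return True if `git lfs version` runs successfully."""
    if shutil.which("git") is None:
        return False
    try:
        result = subprocess.run(
            ["git", "lfs", "version"],
            capture_output=True,
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    # git exits with a non-zero code when the lfs subcommand is unknown.
    return result.returncode == 0


if __name__ == "__main__":
    print("Git LFS available:", git_lfs_available())
```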

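Models are usually downloaded from HuggingFace: https://huggingface.co/models

If that site is not reachable from your network, you can download from a mirror instead: https://hf-mirror.com/models

Download the models: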

git clone https://hf-mirror.com/THUDM/chatglm2-6b
git clone https://hf-mirror.com/moka-ai/m3e-base
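The mirror uses the same repository paths as HuggingFace, as the two clone commands above show, so switching between them only means swapping the host name. A small sketch of that rewrite; `to_mirror_url` is an illustrative helper, not part of any tool.

```python
from urllib.parse import urlsplit, urlunsplit

MIRROR_HOST = "hf-mirror.com"


def to_mirror_url(url, mirror_host=MIRROR_HOST):
    """Rewrite a huggingface.co URL to the mirror; leave other URLs alone."""
    parts = urlsplit(url)
    if parts.netloc != "huggingface.co":
        return url
    return urlunsplit(parts._replace(netloc=mirror_host))


if __name__ == "__main__":
    for repo in ("THUDM/chatglm2-6b", "moka-ai/m3e-base"):
        print("git clone", to_mirror_url(f"https://huggingface.co/{repo}"))
```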

Next, copy the default configuration files:

python copy_config_example.py

Initialize the knowledge base:

python init_database.py --recreate-vs

Then start everything:

python startup.py -a

If all goes well, the current configuration is printed:

Open the web UI in a browser:

The API is up and running as well:
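If the browser cannot connect, a quick way to tell whether the services are listening at all is to probe their ports. The sketch below uses only the standard library. The port numbers 8501 (web UI) and 7861 (API) are assumptions here; use whatever addresses the startup output actually prints.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumed ports; replace them with the ones shown at startup.
    for name, port in (("web UI", 8501), ("API", 7861)):
        print(name, port, "open" if port_is_open("127.0.0.1", port) else "closed")
```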

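Let's throw a copy of Water Margin (水滸傳) into the knowledge base and see how it does:

It seems to work reasonably well; the main problem is that running on the CPU is painfully slow...

Check whether our PyTorch build can use the GPU: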

python
import torch
torch.__version__
print(torch.cuda.is_available())

False means the installed PyTorch build does not support CUDA.
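The version string printed by `torch.__version__` often hints at the reason: official wheels usually carry a local suffix such as `+cpu` or `+cu121`. The sketch below extracts that suffix. It is only a heuristic, since not every build includes one, so `torch.cuda.is_available()` remains the real test. Both function names are hypothetical.

```python
def build_flavor(version):
    """Return the part after '+' in a version string, or None if absent."""
    _, sep, local = version.partition("+")
    return local if sep else None


def looks_cuda_enabled(version):
    """Heuristic: True if the local suffix starts with 'cu' (e.g. cu121)."""
    flavor = build_flavor(version)
    return flavor is not None and flavor.startswith("cu")


if __name__ == "__main__":
    for version in ("2.1.0+cpu", "2.1.0+cu121", "2.1.0"):
        print(version, build_flavor(version), looks_cuda_enabled(version))
```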

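To use the GPU, install CUDA first and then a CUDA-enabled build of PyTorch.

Download the CUDA Toolkit here: https://developer.nvidia.com/cuda-toolkit-archive

At the moment, the CUDA versions most commonly used with PyTorch are 11.8 and 12.1.

After installation, check the CUDA version: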

nvcc -V

Then go to https://pytorch.org/ and install a CUDA-enabled build of PyTorch:

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
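The `cu121` at the end of the index URL is just the CUDA version with the dot removed, so the URL for another version follows the same pattern. A small sketch of that mapping; `pytorch_index_url` is an illustrative helper, and an index only exists for the CUDA versions that pytorch.org actually offers.

```python
def pytorch_index_url(cuda_version):
    """Build the wheel index URL for a CUDA version such as '12.1'."""
    major, minor = cuda_version.split(".")[:2]
    return f"https://download.pytorch.org/whl/cu{int(major)}{int(minor)}"


if __name__ == "__main__":
    for version in ("11.8", "12.1"):
        print(
            "pip3 install torch torchvision torchaudio --index-url",
            pytorch_index_url(version),
        )
```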

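However, if PyTorch is already installed, this command may not actually replace it, because pip can treat the existing CPU build as already satisfying the requirement...

In that case, go to https://download.pytorch.org/whl/torch_stable.html

Find the build you need and install it from a local file.

For example, the CPU build for Python 3.10 on Windows is this one:

The build with CUDA 12.1 support for Python 3.10 on Windows is this one:

Download it and install it: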

pip install g:/AI/torch-2.1.0+cu121-cp310-cp310-win_amd64.whl
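Picking the right file from that long list comes down to reading the wheel file name: it encodes the package version (including the `+cu121` suffix), the Python tag (`cp310` for CPython 3.10), the ABI tag, and the platform (`win_amd64`). The sketch below splits a name into those fields and checks the Python tag against an interpreter version. It handles only simple names like the one above, not compound tags, and both helpers are hypothetical.

```python
import sys


def parse_wheel_name(filename):
    """Split a wheel file name into its name, version, and tag fields."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # Five fields normally; a sixth appears when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel name: {filename!r}")
    return {
        "name": parts[0],
        "version": parts[1],
        "python": parts[-3],
        "abi": parts[-2],
        "platform": parts[-1],
    }


def matches_interpreter(filename, version_info=sys.version_info):
    """Return True if the wheel's Python tag matches this CPython version."""
    tags = parse_wheel_name(filename)
    return tags["python"] == f"cp{version_info[0]}{version_info[1]}"


if __name__ == "__main__":
    wheel = "torch-2.1.0+cu121-cp310-cp310-win_amd64.whl"
    print(parse_wheel_name(wheel))
    print("matches this interpreter:", matches_interpreter(wheel))
```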

After installation, check again whether CUDA is supported:

python
import torch
torch.__version__
print(torch.cuda.is_available())

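Now when Langchain-Chatchat is started again, it can use the GPU.

However, my 8 GB of VRAM is too small: with chatglm2-6b, startup fails with an out-of-memory error...

One way around this is the quantized chatglm2-6b-int4 model (although a quantized model will of course be a bit dumber...).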
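A back-of-the-envelope estimate shows why the full model does not fit while the int4 one does. The sketch counts only the memory for the weights and assumes roughly 6 billion parameters (taken from the "6b" in the model name) stored at 16 bits each, versus 4 bits each when quantized. Real usage is higher, since activations and other runtime buffers are not counted.

```python
def weight_size_gib(n_params, bits_per_param):
    """Approximate memory needed for the weights alone, in GiB."""
    return n_params * bits_per_param / 8 / 2**30


if __name__ == "__main__":
    n_params = 6e9  # assumption: roughly 6 billion parameters
    for label, bits in (("fp16", 16), ("int4", 4)):
        size = weight_size_gib(n_params, bits)
        print(f"{label}: about {size:.1f} GiB of weights")
```

At 16 bits per parameter the weights alone exceed 8 GB, while at 4 bits they fit with room to spare.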

Download the model first:

git clone https://hf-mirror.com/THUDM/chatglm2-6b-int4

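Then edit model_config.py and change the LLM model name. Note that the llm_model section of MODEL_PATH, further up in the same file, already specifies a path for the chatglm2-6b-int4 model; change that path yourself if needed.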
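To make the relationship between the model name and MODEL_PATH concrete, here is a simplified stand-in for that part of the config. Only `MODEL_PATH` and `llm_model` come from the description above. The variable name `LLM_MODEL`, the path values, and the helper `resolve_llm_path` are assumptions for illustration, so check them against your own copy of model_config.py.

```python
# Illustrative stand-in for the relevant part of model_config.py.
# The real file has many more entries and may use different names.
MODEL_PATH = {
    "llm_model": {
        # Assumed values: a repository ID or a local directory of the clone.
        "chatglm2-6b": "THUDM/chatglm2-6b",
        "chatglm2-6b-int4": "THUDM/chatglm2-6b-int4",
    },
}

# Assumed variable name for the active model; switch it to the int4 variant.
LLM_MODEL = "chatglm2-6b-int4"


def resolve_llm_path(name, model_path=MODEL_PATH):
    """Look up where the named LLM would be loaded from."""
    try:
        return model_path["llm_model"][name]
    except KeyError:
        raise KeyError(f"no llm_model entry for {name!r}") from None


if __name__ == "__main__":
    print(LLM_MODEL, "->", resolve_llm_path(LLM_MODEL))
```

If the model was cloned to a local folder, pointing the path entry at that folder avoids relying on a fresh download.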

Then run:

python startup.py -a

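The output shows that we are now running the CUDA build:

Let's try a few questions...

It is much faster than on the CPU...

So apparently the "Resourceful Star" (智多星) is Li Kui... (In the novel that nickname belongs to Wu Yong, so the model got this one wrong.)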
