ChatGLM-6B: Building Your Own AI Model

蓟捷

2023-12-01

Environment

CentOS 7.6

RAM: 32 GB (needed for CPU deployment)

SSD: >= 40 GB

Python 3.9

pip 23.1
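The list above can be checked on the target machine before installing anything. The following is a minimal sketch, not part of the original setup: the function names are hypothetical, it treats Python 3.9 as a minimum version and the 40 GB as free space (both assumptions), and reading RAM through `os.sysconf` assumes Linux.

```python
import os
import shutil
import sys


def find_problems(py_version, ram_gb, disk_gb):
    """Compare measured values with the minimums listed above."""
    problems = []
    if tuple(py_version[:2]) < (3, 9):
        problems.append("Python 3.9 or newer expected")
    if ram_gb < 32:
        problems.append("about 32 GB of RAM expected for CPU deployment")
    if disk_gb < 40:
        problems.append("at least 40 GB of free disk space expected")
    return problems


def measure(path="."):
    """Read the Python version, total RAM, and free disk space (Linux)."""
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    free_bytes = shutil.disk_usage(path).free
    return sys.version_info, ram_bytes / 1e9, free_bytes / 1e9
```

Calling `find_problems(*measure())` returns an empty list when all three checks pass. The RAM check only matters for CPU deployment, so a warning there can be ignored on a GPU machine.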

Required software and installation

 yum install -y gcc gcc-c++ make
 yum install -y libffi-devel openssl-devel zlib-devel
 yum install -y python-devel python3-devel
 # psutil is a Python package, so it is installed with pip rather than yum
 pip3 install psutil

Clone the model repository

 git clone git@github.com:THUDM/ChatGLM-6B.git
 cd ChatGLM-6B

Install dependencies with pip

 pip3 install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
 pip3 install fastapi uvicorn -i https://pypi.tuna.tsinghua.edu.cn/simple
 pip3 install streamlit -i https://pypi.tuna.tsinghua.edu.cn/simple
 pip3 install streamlit-chat -i https://pypi.tuna.tsinghua.edu.cn/simple

Deployment (GPU)

Loading the model from the cloud

Web UI deployment

Run the script:

 streamlit run web_demo2.py --server.port 7860

Open in a browser:

 http://<server-ip>:7860

Command-line deployment

 python3 cli_demo.py

API deployment

 python3 api.py

By default the service listens on local port 8000 and is called with a POST request:

 curl -X POST "http://127.0.0.1:8000" \
      -H 'Content-Type: application/json' \
      -d '{"prompt": "你好", "history": []}'
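The same call can be made from Python with only the standard library. This is a minimal sketch: `build_request` and `chat` are hypothetical helper names, and it assumes the service replies with a JSON body, which the article does not describe in detail.

```python
import json
import urllib.request


def build_request(prompt, history=None, url="http://127.0.0.1:8000"):
    """Build the same POST request that the curl command sends."""
    payload = {"prompt": prompt, "history": history or []}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, history=None, url="http://127.0.0.1:8000"):
    """Send the request and return the decoded JSON reply as a dict."""
    with urllib.request.urlopen(build_request(prompt, history, url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With `api.py` running, `print(chat("你好"))` prints the whole reply. Passing the history from an earlier reply back in as `history` is presumably how a multi-turn conversation is continued, since the request format carries that field.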

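Loading the model from local files

If the network is poor and the download is too slow, the model files can be downloaded manually from the Tsinghua cloud repository.

Downloading the model from the Hugging Face Hub repository

1. Install Git LFS

 $ git lfs install
 > Git LFS initialized.

2. Download the model implementation (this creates the chatglm-6b folder)

 GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/THUDM/chatglm-6b

3. Download the model parameter files from the Tsinghua repository and copy them into the local chatglm-6b directory, replacing the files of the same name

https://cloud.tsinghua.edu.cn/d/fb9f16d6dc8f482596c2/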
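Because the clone in step 2 skips the LFS download, the large files in chatglm-6b start out as small Git LFS pointer stubs, and any stub that step 3 fails to replace will break model loading. The sketch below lists files that are still stubs. The function name is hypothetical, and it relies on Git LFS pointer files beginning with the line `version https://git-lfs.github.com/spec/v1`.

```python
from pathlib import Path

LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(model_dir):
    """Return names of files that are still Git LFS pointer stubs."""
    stubs = []
    for path in sorted(Path(model_dir).iterdir()):
        if not path.is_file():
            continue
        # Only the first bytes are read, so multi-gigabyte weights are cheap to check.
        with path.open("rb") as fh:
            if fh.read(len(LFS_HEADER)) == LFS_HEADER:
                stubs.append(path.name)
    return stubs
```

Running `find_lfs_pointers("chatglm-6b")` after step 3 should return an empty list. Any file name it reports still needs to be downloaded and replaced.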

Web UI deployment

Edit web_demo2.py and save it:

 # Default
 tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
 model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
 # Change to:
 tokenizer = AutoTokenizer.from_pretrained("chatglm-6b", trust_remote_code=True)
 model = AutoModel.from_pretrained("chatglm-6b", trust_remote_code=True).half().cuda()
 # "THUDM/chatglm-6b" becomes "chatglm-6b", the path of the folder created in step 2 (download the model implementation) above

Run the script:

 streamlit run web_demo2.py --server.port 7860

Open in a browser:

 http://<server-ip>:7860

Command-line deployment

Edit cli_demo.py and save it:

 # Default
 tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
 model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
 # Change to:
 tokenizer = AutoTokenizer.from_pretrained("chatglm-6b", trust_remote_code=True)
 model = AutoModel.from_pretrained("chatglm-6b", trust_remote_code=True).half().cuda()
 # "THUDM/chatglm-6b" becomes "chatglm-6b", the path of the folder created in step 2 (download the model implementation) above

 python3 cli_demo.py

API deployment

Edit api.py and save it:

 # Default
 tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
 model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
 # Change to:
 tokenizer = AutoTokenizer.from_pretrained("chatglm-6b", trust_remote_code=True)
 model = AutoModel.from_pretrained("chatglm-6b", trust_remote_code=True).half().cuda()
 # "THUDM/chatglm-6b" becomes "chatglm-6b", the path of the folder created in step 2 (download the model implementation) above

 python3 api.py

By default the service listens on local port 8000 and is called with a POST request:

 curl -X POST "http://127.0.0.1:8000" \
      -H 'Content-Type: application/json' \
      -d '{"prompt": "你好", "history": []}'

Everything above uses GPU deployment, which needs at least 13 GB of GPU memory. If GPU memory is limited, CPU deployment can be used instead; it needs roughly 32 GB of RAM.
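The two memory figures follow roughly from the size of the weights. The estimate below is a back-of-the-envelope sketch: the parameter count of about 6.2 billion comes from the ChatGLM-6B project description rather than from this article, and the calculation covers the weights only, not activations or other runtime overhead.

```python
PARAMS = 6.2e9  # ChatGLM-6B parameter count (about 6.2 billion; assumption, see lead-in)


def weight_gb(num_params, bytes_per_param):
    """Memory for the weights alone, in GB (10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


fp16_gb = weight_gb(PARAMS, 2)  # .half(): 2 bytes per parameter
fp32_gb = weight_gb(PARAMS, 4)  # .float(): 4 bytes per parameter
print(f"FP16 weights: {fp16_gb:.1f} GB")
print(f"FP32 weights: {fp32_gb:.1f} GB")
```

The half-precision weights come to about 12.4 GB, which is consistent with the 13 GB GPU requirement. The full-precision weights come to about 24.8 GB, so the 32 GB of RAM for CPU deployment leaves some headroom for everything else.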

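Deployment (CPU)

In every file that was edited for GPU deployment, change the model-loading line as follows:

 # Default
 model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
 
 # When loading the model from the cloud, change to:
 model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).float()
 # When loading the model from local files, change to:
 model = AutoModel.from_pretrained("chatglm-6b", trust_remote_code=True).float()

The remaining steps are the same as for GPU deployment.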
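Instead of keeping two edited copies of each script, the choice between the GPU and CPU variants can be made at run time. This is one possible sketch rather than something the demo scripts do themselves: `place_model` is a hypothetical helper, and it only wraps the `.half().cuda()` and `.float()` calls shown above.

```python
def place_model(model, use_gpu):
    """Apply the GPU or CPU variant of the loading line to a loaded model."""
    if use_gpu:
        return model.half().cuda()  # FP16 weights on the GPU
    return model.float()  # FP32 weights on the CPU


# Usage sketch (needs torch and transformers installed):
# import torch
# from transformers import AutoModel
# model = AutoModel.from_pretrained("chatglm-6b", trust_remote_code=True)
# model = place_model(model, torch.cuda.is_available())
```

`torch.cuda.is_available()` only reports that a CUDA device exists, not that it has 13 GB of memory, so on a small GPU it may still be necessary to pass `False` explicitly.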
