How to Install and Use Vicuna

沙岳

2023-12-01

Vicuna is a large conversational language model obtained by fine-tuning LLaMA.

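This article uses the Vicuna-7B model as the example for installing and using Vicuna. To use the Vicuna-13B model instead, simply change 7B to 13B in the command parameters.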

0. Virtual environment

Create a virtual environment with conda; Python 3.8 or later is required: https://blog.csdn.net/Yu_L2/article/details/105186991

The virtual environment also needs the following libraries, among others.

torch: https://pytorch.org/get-started/previous-versions/

transformers: https://huggingface.co/docs/transformers/installation
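
Before moving on, it can help to confirm that the environment meets these requirements. The sketch below is an illustration rather than part of the original setup: the helper names python_is_supported and missing_packages are hypothetical, and the check only tells you whether the packages can be found, not whether their versions are compatible.

```python
import importlib.util
import sys


def python_is_supported(minimum=(3, 8)):
    """Return True if the running interpreter is at least `minimum`."""
    return sys.version_info[:2] >= tuple(minimum)


def missing_packages(names):
    """Return the top-level packages in `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python 3.8+:", python_is_supported())
    print("Missing packages:", missing_packages(["torch", "transformers"]))
```

An empty list of missing packages means both torch and transformers are visible to the active environment.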

Clone the Vicuna repository (FastChat) to your local machine.

git clone https://github.com/lm-sys/FastChat.git

The fastchat commands used later assume that the package has been installed from this clone, for example by running pip install -e . inside the FastChat directory.

LLaMA can be downloaded with the following commands.

pip install pyllama -U
python -m llama.download --model_size 7B

[Tip] The model downloads slowly. You can press Ctrl+C to stop the download, then rerun the command to resume it.

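Download the conversion script (convert_llama_weights_to_hf.py, which is part of the transformers repository) and run it.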

python convert_llama_weights_to_hf.py --input_dir ./llama --model_size 7B --output_dir ./output/

--input_dir is the directory where the LLaMA model was downloaded; --output_dir is the directory where the converted model is written.
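
A quick sanity check on the converted directory can catch a failed conversion early. The sketch below assumes the converter writes a Hugging Face style model directory containing a config.json file; that layout and the helper name has_hf_config are assumptions, not something the steps above state.

```python
import json
from pathlib import Path


def has_hf_config(model_dir):
    """Return True if `model_dir` holds a config.json that parses as a JSON object."""
    config_path = Path(model_dir) / "config.json"
    if not config_path.is_file():
        return False
    try:
        with config_path.open(encoding="utf-8") as f:
            return isinstance(json.load(f), dict)
    except json.JSONDecodeError:
        return False


if __name__ == "__main__":
    # "./output" is the --output_dir used in the conversion command above.
    print(has_hf_config("./output"))
```

A result of False suggests the conversion did not finish or wrote its files somewhere else.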

First, download the delta weights for the Vicuna model.

git clone https://huggingface.co/lmsys/vicuna-7b-delta-v1.1

Then merge the LLaMA weights with the Vicuna delta weights to obtain the Vicuna model weights.

python -m fastchat.model.apply_delta \
	--base-model-path ./output \
	--target-model-path ./vicuna-7b \
	--delta-path ./vicuna-7b-delta-v1.1

--base-model-path is the location of the converted model, --target-model-path is where the final model is written, and --delta-path is the location of the downloaded delta weights.
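
Conceptually, the merge adds each delta tensor to the LLaMA tensor of the same name. The toy sketch below illustrates that idea with plain lists standing in for tensors. It is not the FastChat implementation: the function name, parameter names, and values are made up for illustration, and the real tool may handle details that this omits.

```python
def apply_delta(base, delta):
    """Add each delta tensor to the base tensor with the same name.

    Tensors are represented as flat lists of floats to keep the sketch
    dependency-free; real checkpoints hold multi-dimensional tensors.
    """
    if base.keys() != delta.keys():
        raise ValueError("base and delta must contain the same parameter names")
    merged = {}
    for name, base_values in base.items():
        delta_values = delta[name]
        if len(base_values) != len(delta_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [b + d for b, d in zip(base_values, delta_values)]
    return merged


base = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
delta = {"layer.weight": [0.5, -1.0], "layer.bias": [0.25]}
print(apply_delta(base, delta))
# {'layer.weight': [1.5, 1.0], 'layer.bias': [0.25]}
```

This is why the delta is useless on its own: without the original LLaMA weights as the base, there is nothing to add it to.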

[Tip] If you see errors related to git lfs, run the following commands in the delta directory.

git lfs install
git lfs pull

The model can run on a GPU.

python -m fastchat.serve.cli --model-path ./vicuna-7b

--model-path is the location of the merged model.

You can also specify the number of GPUs with --num-gpus.

python -m fastchat.serve.cli --model-path ./vicuna-7b --num-gpus 2

To run the model on the CPU, use the --device parameter.

python -m fastchat.serve.cli --model-path ./vicuna-7b --device cpu
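
If you switch between these variants often, a small helper can assemble the command for you. The sketch below is only a convenience wrapper: build_cli_command is a hypothetical name, and it emits nothing beyond the flags that appear in this article (--model-path, --num-gpus, --device, and the --load-8bit flag described just below).

```python
import shlex


def build_cli_command(model_path, num_gpus=None, device=None, load_8bit=False):
    """Build the argument list for `python -m fastchat.serve.cli`."""
    command = ["python", "-m", "fastchat.serve.cli", "--model-path", model_path]
    if num_gpus is not None:
        command += ["--num-gpus", str(num_gpus)]
    if device is not None:
        command += ["--device", device]
    if load_8bit:
        command.append("--load-8bit")
    return command


print(shlex.join(build_cli_command("./vicuna-7b", num_gpus=2)))
# python -m fastchat.serve.cli --model-path ./vicuna-7b --num-gpus 2
```

The returned list can be passed directly to subprocess.run, which avoids shell quoting problems when the model path contains spaces.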

Use the --load-8bit parameter to reduce memory usage.

python -m fastchat.serve.cli --model-path ./vicuna-7b --load-8bit
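
To get a feel for how much memory 8-bit loading saves, a back-of-the-envelope estimate is enough. The sketch below rests on assumptions not stated above: that weights normally take 2 bytes per parameter (16-bit) and 1 byte per parameter with 8-bit loading, and that a "7B" model has a nominal 7 billion parameters. It counts weights only, so real usage will be higher.

```python
def weight_memory_gb(num_params, bytes_per_param):
    """Estimate memory for model weights alone, in gigabytes (10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


PARAMS_7B = 7_000_000_000  # nominal parameter count assumed for a "7B" model

print(f"16-bit: {weight_memory_gb(PARAMS_7B, 2):.1f} GB")  # 16-bit: 14.0 GB
print(f"8-bit:  {weight_memory_gb(PARAMS_7B, 1):.1f} GB")  # 8-bit:  7.0 GB
```

Under these assumptions, 8-bit loading roughly halves the memory needed for the weights.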