Meta Llama 3本地部署

ops/2024/12/22 1:55:04/

感谢阅读

环境安装
收尾

环境安装

项目文件
下载完后在根目录进入命令终端（windows下cmd、linux下终端、conda的话activate）
运行

pip install -e .

不要控制台，因为还要下载模型。这里挂着是节省时间

模型申请链接
在这里插入图片描述
复制如图所示的链接
然后在刚才的控制台

bash download.sh

在验证哪里直接输入刚才链接即可
如果报错没有wget，则点我下载wget
然后放到C:\Windows\System32 下

torchrun --nproc_per_node 1 example_chat_completion.py \--ckpt_dir Meta-Llama-3-8B-Instruct/ \--tokenizer_path Meta-Llama-3-8B-Instruct/tokenizer.model \--max_seq_len 512 --max_batch_size 6

收尾

创建chat.py脚本

# Copyright (c) Meta Platforms, Inc. and affiliates.
# This software may be used and distributed in accordance with the terms of the Llama 3 Community License Agreement.from typing import List, Optionalimport firefrom llama import Dialog, Llamadef main(ckpt_dir: str,tokenizer_path: str,temperature: float = 0.6,top_p: float = 0.9,max_seq_len: int = 512,max_batch_size: int = 4,max_gen_len: Optional[int] = None,
):"""Examples to run with the models finetuned for chat. Prompts correspond of chatturns between the user and assistant with the final one always being the user.An optional system prompt at the beginning to control how the model should respondis also supported.The context window of llama3 models is 8192 tokens, so `max_seq_len` needs to be <= 8192.`max_gen_len` is optional because finetuned models are able to stop generations naturally."""generator = Llama.build(ckpt_dir=ckpt_dir,tokenizer_path=tokenizer_path,max_seq_len=max_seq_len,max_batch_size=max_batch_size,)# Modify the dialogs list to only include user inputsdialogs: List[Dialog] = [[{"role": "user", "content": ""}],  # Initialize with an empty user input]# Start the conversation loopwhile True:# Get user inputuser_input = input("You: ")# Exit loop if user inputs 'exit'if user_input.lower() == 'exit':break# Append user input to the dialogs listdialogs[0][0]["content"] = user_input# Use the generator to get model responseresult = generator.chat_completion(dialogs,max_gen_len=max_gen_len,temperature=temperature,top_p=top_p,)[0]# Print model responseprint(f"Model: {result['generation']['content']}")if __name__ == "__main__":fire.Fire(main)

然后运行

torchrun --nproc_per_node 1 chat.py     --ckpt_dir Meta-Llama-3-8B-Instruct/     --tokenizer_path Meta-Llama-3-8B-Instruct/tokenizer.model     --max_seq_len 512 --max_batch_size 6

Meta Llama 3本地部署

感谢阅读

环境安装

收尾

相关文章

若依框架学习-springboot-gateway笔记

【树莓派Linux内核开发】入门实操篇（虚拟机Ubuntu环境搭建+内核源码获取与配置+内核交叉编译+内核镜像挂载）

VNISEdit 制作安装包

基于遗传算法的TSP算法（matlab实现）

C语言什么是指针数组？

tcp服务器端与多个客户端连接

centOS7.9| 无root安装 openssl 1.1.1

一个基于更新频率和卡片等级、浏览量的动态推荐排序算法