Download the model files:
Download every file listed under https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B/tree/main. The download is fairly large: all files together come to nearly 16 GB.
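As an alternative to fetching each file by hand in the browser, the repository can be pulled with the huggingface_hub package. This is a minimal sketch, assuming huggingface_hub is installed and that the files should land in /data/model/DeepSeek-R1-Distill-Qwen-7B (the directory the inference script in this post loads from). The helper names download_model and dir_size_gib are made up for illustration; the size helper is only a quick sanity check against the figure quoted above.

```python
import os

REPO_ID = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"
LOCAL_DIR = "/data/model/DeepSeek-R1-Distill-Qwen-7B"  # assumed target directory


def download_model(repo_id=REPO_ID, local_dir=LOCAL_DIR):
    """Download every file of the repo into local_dir (needs network access)."""
    # Imported lazily so the size helper below works without huggingface_hub.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)


def dir_size_gib(path):
    """Return the total size of all files under path, in GiB."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if os.path.isfile(file_path):
                total += os.path.getsize(file_path)
    return total / 1024 ** 3
```

Calling download_model() and then dir_size_gib(LOCAL_DIR) should report a total in the neighborhood of the size quoted above if the download completed.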
Write the inference code
$ more testDS.py
from transformers import AutoModelForCausalLM, AutoTokenizer
from datetime import datetime

# Load the pretrained model and tokenizer from the local download directory
model_name = '/data/model/DeepSeek-R1-Distill-Qwen-7B'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Write the prompt (left empty in this listing)
prompt = """"""

# Encode the input
input_ids = tokenizer.encode(prompt + tokenizer.eos_token, return_tensors='pt')
# Print the start time (the label reads "Current date and time:")
print("当前日期和时间:", datetime.now())

# Generate the reply
output = model.generate(input_ids, max_length=10000, pad_token_id=tokenizer.eos_token_id, no_repeat_ngram_size=2, top_k=50, top_p=0.95, temperature=0.7)

# Decode the generated text, skipping the prompt tokens at the start of the output
response = tokenizer.decode(output[:, input_ids.shape[-1]:][0], skip_special_tokens=True)
print(response)
# Print the end time
print("当前日期和时间:", datetime.now())
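tokenizer.encode returns only token IDs, so model.generate in the script above receives no attention mask, and because pad_token_id is set to the EOS token it cannot infer one either. One possible variant is sketched below: calling the tokenizer directly returns both input_ids and attention_mask, which can then be passed through. The function names generate_reply and run_with_local_model and the max_new_tokens default of 512 are assumptions for illustration, not part of the original script; the sketch also leaves out the trailing EOS token and the sampling options.

```python
def generate_reply(model, tokenizer, prompt, max_new_tokens=512):
    """Generate a reply for prompt, passing an explicit attention mask."""
    # Calling the tokenizer (instead of tokenizer.encode) returns a mapping
    # with both "input_ids" and "attention_mask".
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=max_new_tokens,  # assumed limit; counts new tokens only
        pad_token_id=tokenizer.eos_token_id,
    )
    # Drop the prompt tokens so only the newly generated text is decoded.
    prompt_len = len(inputs["input_ids"][0])
    return tokenizer.decode(output[0][prompt_len:], skip_special_tokens=True)


def run_with_local_model(model_dir, prompt):
    """Load the model from model_dir and generate one reply."""
    # Imported lazily: transformers and torch are only needed for a real run.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForCausalLM.from_pretrained(model_dir)
    return generate_reply(model, tokenizer, prompt)
```

With the files from this post, run_with_local_model('/data/model/DeepSeek-R1-Distill-Qwen-7B', prompt) would exercise it against the local model.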
Inference results
$ python testDS.py
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:02<00:00, 1.23s/it]
当前日期和时间: 2025-02-07 14:40:02.670872
The attention mask is not set and cannot be inferred from input because pad token is same as eos token.As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
<think>
xxx
</think>
领导人
当前日期和时间: 2025-02-07 14:40:57.471051
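The two timestamps in the log bracket generation and decoding, and the reply carries the model's reasoning between <think> and </think>, followed by the final answer. Below is a small sketch for post-processing such a run. The helper names elapsed_seconds and split_reasoning are hypothetical, and the timestamp format is assumed to be the default string form of datetime.now() as it appears in the log.

```python
from datetime import datetime


def elapsed_seconds(start, end):
    """Return the seconds between two timestamps printed by the script."""
    fmt = "%Y-%m-%d %H:%M:%S.%f"  # assumed format, matching the log lines
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()


def split_reasoning(response):
    """Split a reply into (reasoning, answer) around the </think> tag."""
    head, sep, tail = response.partition("</think>")
    if not sep:
        # No reasoning block present: treat the whole reply as the answer.
        return "", response.strip()
    reasoning = head.replace("<think>", "", 1).strip()
    return reasoning, tail.strip()


# Timestamps copied from the run log
print(elapsed_seconds("2025-02-07 14:40:02.670872",
                      "2025-02-07 14:40:57.471051"))  # 54.800179
```

For the run shown here, that puts generation plus decoding at roughly 54.8 seconds.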