Learning the InternLM (书生·浦语) Large Model from Scratch: Beginner Island

Level 1: Linux Basics

Task 1: Connect Cursor over SSH and run code

The Remote - SSH extension is sufficient for this.

Run the command

python hello_world.py

Port forwarding

ssh -p 46561 root@ssh.intern-ai.org.cn -CNg -L 7860:127.0.0.1:7860 -o StrictHostKeyChecking=no

Note: 46561 is the SSH port number of the server.

Open 127.0.0.1:7860 in a local browser to access the forwarded service.
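Before opening the browser, it can help to confirm that the forwarded port is actually listening. The following is a minimal sketch using only the standard library; the helper name is_port_open is hypothetical and not part of the course material.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 7860 is the local end of the ssh -L tunnel shown above.
    print(is_port_open("127.0.0.1", 7860))
```

If this prints False while the ssh command is running, the service on the server side is probably not listening on 7860 yet.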

Level 2: Python Basics

Task 1: LeetCode problem 383

Code

from collections import Counter


class Solution(object):
    def canConstruct(self, ransomNote, magazine):
        """
        :type ransomNote: str
        :type magazine: str
        :rtype: bool
        """
        ransom_counter = Counter(ransomNote)
        magazine_counter = Counter(magazine)
        for char, count in ransom_counter.items():
            if magazine_counter[char] < count:
                return False
        return True
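The same letter-counting idea can be written more compactly with Counter subtraction, which drops every entry whose count would be zero or negative. This is an alternative sketch, not the submitted solution; the function name can_construct is chosen here for illustration.

```python
from collections import Counter


def can_construct(ransom_note, magazine):
    """Return True if ransom_note can be built from the letters of magazine."""
    # Counter subtraction keeps only positive counts, so any leftover entry
    # is a letter the magazine cannot supply often enough.
    missing = Counter(ransom_note) - Counter(magazine)
    return not missing


if __name__ == "__main__":
    print(can_construct("a", "b"))     # False
    print(can_construct("aa", "ab"))   # False
    print(can_construct("aa", "aab"))  # True
```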

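Task 2: Connect VS Code to InternStudio and debug

Log in to the official InternLM (书生·浦语) API site and obtain an API token.

Paste the API token into the example code and run it.

The code turns out to contain an error.

The error message points to a problem with the JSON data format, so I set a breakpoint at line 30 and inspected the value of res.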

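The value of res is '根据提供的模型介绍文字，以下是提取的关于该模型的信息，以JSON格式返回：\n\n```json\n{\n "模型名字": "书生浦语InternLM2.5",\n "开发机构": "上海人工智能实验室",\n "提供参数版本": ["1.8B", "7B", "20B"],\n "上下文长度": "1M"\n}\n```\n\n这些信息准确地反映了模型名称、开发机构、提供的参数版本以及支持的上下文长度。如果还有其他需要补充或调整的信息，请随时告知。'

The reply wraps the JSON object in explanatory text and a code fence, so json.loads fails on the raw string. The JSON part has to be extracted first.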

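# Requires "import re" and "import json" at the top of the script.
res = internlm_gen(prompt, client)
# Capture the {...} object that follows the "json" tag in the reply.
json_pattern = r'json\s*({[\s\S]*?})\s*'
match = re.search(json_pattern, res)
res = match.group(1)
res_json = json.loads(res)
print(res_json)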
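The pattern above stops at the first closing brace, which is fine for this flat object but would cut a nested one short, and it fails with an AttributeError when the reply has no match. A slightly more defensive sketch follows; extract_json is a hypothetical helper, and the fallback to the outermost braces is an assumption about what replies may look like.

```python
import json
import re

# Three backticks, built this way so the listing contains no literal fence.
FENCE = "`" * 3


def extract_json(text):
    """Return the first JSON object found in an LLM reply, or raise ValueError."""
    # Prefer the contents of a fenced block, optionally tagged "json".
    fenced = re.search(
        FENCE + r"(?:json)?\s*(\{.*?\})\s*" + FENCE, text, re.DOTALL
    )
    if fenced:
        candidate = fenced.group(1)
    else:
        # Fallback: take everything from the first "{" to the last "}".
        start, end = text.find("{"), text.rfind("}")
        if start == -1 or end <= start:
            raise ValueError("no JSON object found in reply")
        candidate = text[start:end + 1]
    return json.loads(candidate)


if __name__ == "__main__":
    reply = "Here it is:\n\n" + FENCE + 'json\n{"name": "InternLM2.5"}\n' + FENCE
    print(extract_json(reply))
```

In the script, res_json = extract_json(res) would then replace the search, group, and loads lines.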

The code now runs successfully.

Level 3: Git Basics

Task 1: Icebreaker: self-introduction

Fork the InternLM/Tutorial project.

Run the commands

git clone https://github.com/Aaronzijingcai/Tutorial.git
cd Tutorial
git checkout -b class origin/class
git checkout -b class_7745
git add .
git commit -m "add git_camp4_7745_introduction"
git push origin class_7745
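The same sequence can be scripted so that the branch name is derived from the camp ID. This is only a sketch: submission_commands and run are hypothetical helpers, git -C is used in place of changing directory, and nothing is executed unless dry_run is set to False.

```python
import shlex
import subprocess


def submission_commands(fork_url, student_id, message):
    """Build the git commands for the self-introduction submission."""
    # Directory that "git clone" creates, e.g. ".../Tutorial.git" -> "Tutorial".
    repo_dir = fork_url.rstrip("/").split("/")[-1]
    if repo_dir.endswith(".git"):
        repo_dir = repo_dir[: -len(".git")]
    branch = f"class_{student_id}"
    in_repo = ["git", "-C", repo_dir]  # run git inside the cloned directory
    return [
        ["git", "clone", fork_url],
        in_repo + ["checkout", "-b", "class", "origin/class"],
        in_repo + ["checkout", "-b", branch],
        in_repo + ["add", "."],
        in_repo + ["commit", "-m", message],
        in_repo + ["push", "origin", branch],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it as well when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run(submission_commands(
        "https://github.com/Aaronzijingcai/Tutorial.git",
        "7745",
        "add git_camp4_7745_introduction",
    ))
```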

Open a pull request

add git_7745_introduction by Aaronzijingcai · Pull Request #2952 · InternLM/Tutorial · GitHub

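Task 2: Hands-on project: build a personal project

The dock408 project

Introduction: Dock408 is a multi-agent project built on the InternLM (书生浦语) large model. The name "Dock" evokes helping candidates reach the far shore of success, that is, passing the postgraduate entrance exam. The project aims to give 408 exam candidates efficient study support and tools for mastering the material. When a student uploads a question, Dock408 provides a detailed solution and also lists the related knowledge points, helping the student find and fill gaps. It additionally generates practice questions related to the uploaded one, reinforcing understanding and application by extending from one example to similar cases. More importantly, Dock408 uses the student's question history and knowledge-point lookups, together with the Ebbinghaus forgetting curve, to generate personalized review reminders so that the student consolidates what was learned and avoids forgetting it.

URL: https://github.com/Aaronzijingcai/dock408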
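The description does not say how Dock408 computes its review reminders. As a rough illustration of the idea only, the sketch below assumes the common exponential simplification of the forgetting curve, R = exp(-t / S), and schedules the next review for the moment retention is predicted to fall to a threshold. The stability S, the threshold, and both function names are hypothetical.

```python
import math
from datetime import datetime, timedelta


def retention(hours_elapsed, stability_hours):
    """Predicted retention R = exp(-t / S) after hours_elapsed hours."""
    return math.exp(-hours_elapsed / stability_hours)


def next_review(last_review, stability_hours, threshold=0.7):
    """Return the time at which predicted retention drops to threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    # Solve exp(-t / S) = threshold for t.
    hours = -stability_hours * math.log(threshold)
    return last_review + timedelta(hours=hours)


if __name__ == "__main__":
    studied_at = datetime(2024, 1, 1, 9, 0)
    # Assumed stability of 24 hours for a freshly learned knowledge point.
    print(next_review(studied_at, stability_hours=24))
```

A real scheduler would presumably raise the stability after each successful review so that the intervals grow over time.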

Level 4: Exploring the HF, ModelScope (魔搭), and Modelers (魔乐) Communities

Task 1: Model download

Install transformers in your own conda environment.

# Install transformers and the packages it needs here
pip install transformers==4.38
pip install sentencepiece==0.1.99
pip install einops==0.8.0
pip install protobuf==5.27.2
pip install accelerate==0.33.0
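After installation, the pinned versions can be checked from Python. A minimal sketch using importlib.metadata (Python 3.8 or later); find_mismatches is a hypothetical helper, and the prefix comparison is deliberately loose so that the pin 4.38 accepts an installed 4.38.0.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions pinned by the pip commands above.
EXPECTED = {
    "transformers": "4.38",
    "sentencepiece": "0.1.99",
    "einops": "0.8.0",
    "protobuf": "5.27.2",
    "accelerate": "0.33.0",
}


def find_mismatches(expected):
    """Map each package whose version differs to (wanted, installed)."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        # Loose prefix check: "4.38" accepts "4.38.0" (and other 4.38.x).
        ok = installed is not None and (
            installed == wanted or installed.startswith(wanted + ".")
        )
        if not ok:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in find_mismatches(EXPECTED).items():
        print(f"{name}: wanted {wanted}, found {installed}")
```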

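Download the internlm2_5-7b-chat model from HF (note that the script below actually loads internlm/internlm2_5-1_8b).

touch hf_download_json.py

The contents of hf_download_json.py are: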

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("internlm/internlm2_5-1_8b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("internlm/internlm2_5-1_8b", torch_dtype=torch.float16, trust_remote_code=True)
model = model.eval()

inputs = tokenizer(["A beautiful flower"], return_tensors="pt")
gen_kwargs = {
    "max_length": 128,
    "top_p": 0.8,
    "temperature": 0.8,
    "do_sample": True,
    "repetition_penalty": 1.0,
}

# The lines below are optional; once uncommented, the model output appears after a short wait.
# output = model.generate(**inputs, **gen_kwargs)
# output = tokenizer.decode(output[0].tolist(), skip_special_tokens=True)
# print(output)
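The temperature and top_p entries in gen_kwargs control how the next token is sampled. The sketch below illustrates the idea in plain Python on a toy list of logits. It is a simplified illustration and not how transformers implements it; all three function names are hypothetical.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; a lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their mass reaches top_p, renormalized."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


def sample_token(logits, temperature=0.8, top_p=0.8, rng=random):
    """Sample one token index using temperature scaling and top-p filtering."""
    dist = top_p_filter(softmax(logits, temperature), top_p)
    indices = list(dist)
    weights = [dist[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]


if __name__ == "__main__":
    print(sample_token([2.0, 1.0, 0.5, -1.0]))
```

With do_sample set to False, generation would instead pick the most likely token at every step, and these two settings would have no effect.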