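Even without a GPU, you can try the free hosted large-model environment that NVIDIA provides.
The demo here calls a lightweight 3.8B-parameter model, microsoft/phi-3-mini-4k-instruct, and uses Flask to build a simple web page for sending requests to it.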
a) Project structure
app.py
templates/index.html
b) Install the dependencies
# pip install Flask openai
c) Prepare the Flask application (app.py)
import os

from flask import Flask, request, render_template
from openai import OpenAI

app = Flask(__name__)

# NVIDIA's hosted endpoint is called through the OpenAI client.
# The key is read from an environment variable instead of being hard-coded;
# the variable name NVIDIA_API_KEY is a choice made here, so set it before running.
client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key=os.environ["NVIDIA_API_KEY"],
)


@app.route("/", methods=["GET", "POST"])
def index():
    result = ""
    if request.method == "POST":
        user_input = request.form["user_input"]
        completion = client.chat.completions.create(
            model="microsoft/phi-3-mini-4k-instruct",
            messages=[{"role": "user", "content": user_input}],
            temperature=0.2,
            top_p=0.7,
            max_tokens=1024,
            stream=True,
        )
        # Concatenate the streamed chunks into one string before rendering.
        for chunk in completion:
            if chunk.choices and chunk.choices[0].delta.content is not None:
                result += chunk.choices[0].delta.content
    return render_template("index.html", result=result)


if __name__ == "__main__":
    app.run(debug=True)
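The streaming loop in app.py can be pulled out into a small helper, which makes it easy to check without calling the API. The sketch below is only an illustration: collect_stream and fake_chunk are hypothetical names, and the fake chunks merely mimic the attribute layout (choices[0].delta.content) that the loop in app.py reads.

```python
from types import SimpleNamespace


def collect_stream(chunks):
    """Join the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        # Skip chunks that carry no choices at all.
        if not chunk.choices:
            continue
        text = chunk.choices[0].delta.content
        # A delta without text (content is None) contributes nothing.
        if text is not None:
            parts.append(text)
    return "".join(parts)


def fake_chunk(text):
    """Hypothetical stand-in with the same attribute layout as a stream chunk."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


# Simulated stream: two text pieces, one empty delta, one chunk without choices.
demo = [fake_chunk("Hel"), fake_chunk(None), fake_chunk("lo"), SimpleNamespace(choices=[])]
print(collect_stream(demo))
```

Inside index(), the loop could then be replaced by result = collect_stream(completion), assuming the helper lives in the same file.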
d) Prepare the web page template (templates/index.html)
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>LLM Request</title>
</head>
<body>
    <h1>Enter your request</h1>
    <form method="POST">
        <textarea name="user_input" rows="4" cols="50" placeholder="Type your prompt here..."></textarea>
        <br>
        <input type="submit" value="Submit">
    </form>
    <h2>Response:</h2>
    <pre>{{ result }}</pre>
</body>
</html>
e) Run the Flask application
# python app.py
 * Serving Flask app 'app'
 * Debug mode: on
WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on http://127.0.0.1:5000
Press CTRL+C to quit
 * Restarting with stat
 * Debugger is active!
 * Debugger PIN: 831-621-819
f) Call the model
Open a browser and visit http://127.0.0.1:5000, enter a prompt in the form, and submit it to see the model's reply.
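The same request can also be sent from a script instead of the browser. The following is a minimal sketch using only the standard library; build_form_request and ask are hypothetical helper names, and it assumes the Flask development server is listening on the address shown above and that the form field is named user_input, as in the template.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

APP_URL = "http://127.0.0.1:5000/"  # address printed by the Flask dev server


def build_form_request(prompt, url=APP_URL):
    """Build the same form-encoded POST that the browser sends on submit."""
    body = urlencode({"user_input": prompt}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def ask(prompt, timeout=120):
    """Send the prompt and return the rendered HTML page as text."""
    with urlopen(build_form_request(prompt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

With the server running, print(ask("Explain what Flask is in one sentence.")) prints the full HTML page, with the model's reply inside the pre element.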
<end>