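Large Language Models in Practice: Secondary Development on Existing APIs

A few practical small AI applications built on top of an existing API platform.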

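API provider: Alibaba Cloud Bailian (阿里云百炼)

Cloud server: Alibaba Cloud (2 cores, 2 GB RAM)

Deployment framework: gradio

Calling framework: openai

Language: Python

(Note: if the deployed websites or API endpoints show abnormal calls, they will be shut down immediately.)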

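1. Building a personal DeepSeek-v3 / R1 website

The deployment code (DeepSeek-v3) is shown below. The official DeepSeek site currently does not accept top-ups, so for now the calls go through the DeepSeek endpoint integrated into Alibaba Cloud Bailian. Once the official service is stable, the official API can be used instead.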

import gradio as gr
from openai import OpenAI
import yaml


def get_apikey(path="apikey.yaml"):
    # Load the API keys from a local YAML file
    with open(path, 'r') as f:
        config = yaml.safe_load(f)
    res = config["apikey"]
    return res


def deepseek_academic_repeat(question):
    client = OpenAI(
        api_key=get_apikey()['dashscope'],
        base_url="https://dashscope.aliyuncs.com/compatible-mode/v1"
    )
    response = client.chat.completions.create(
        model="deepseek-v3",
        messages=[
            {"role": "user", "content": f"{question}"},
        ],
        max_tokens=4096,
        temperature=1.0,
        stream=False
    )
    # print(response.choices[0].message.content)
    return response.choices[0].message.content


interface = gr.Interface(
    fn=deepseek_academic_repeat,  # handler function
    inputs=[gr.Textbox(label="Input", placeholder="Enter your question", lines=20)],  # input box
    outputs=gr.Textbox(label="DeepSeek-v3 reply", lines=20),  # output box
    title="Temporary DeepSeek",  # app title
    description="Uses DeepSeek-v3 through the DeepSeek-v3 endpoint integrated into Alibaba Cloud Dashscope"  # app description
)

if __name__ == "__main__":
    interface.launch(server_name="0.0.0.0", server_port=5566, share=True)  # set the port
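
The script reads the key from apikey.yaml, and the lookup get_apikey()['dashscope'] implies a top-level apikey mapping with a dashscope entry. As a minimal sketch of an alternative that keeps the key out of the project directory, the helper below checks an environment variable first. The function name resolve_api_key and the variable name DASHSCOPE_API_KEY are assumptions chosen for illustration; the original setup does not use them.

```python
import os


def resolve_api_key(env_var="DASHSCOPE_API_KEY", fallback=None):
    """Return the API key from an environment variable, else the fallback.

    env_var is a hypothetical name picked for this sketch; fallback could be
    the value that get_apikey()['dashscope'] reads from apikey.yaml.
    """
    key = os.environ.get(env_var) or fallback
    if not key:
        raise RuntimeError(f"No API key found in ${env_var} and no fallback given")
    return key
```

With this in place, the client could be created with api_key=resolve_api_key(fallback=get_apikey()['dashscope']), so the YAML file is only used when the variable is unset.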

The result is shown below:

Publicly accessible at:

DeepSeek-v3 (Temporary DeepSeek): http://47.94.104.2:5566/
DeepSeek-r1 (Temporary DeepSeek-R1): http://47.94.104.2:3344/
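
Only the DeepSeek-v3 script is listed, while the R1 site runs on its own port. One way to serve both from the same code is to parameterize the request. The sketch below builds the keyword arguments for client.chat.completions.create. The helper name build_chat_request is hypothetical, and the model id "deepseek-r1" is an assumption about how the R1 model is named on the platform.

```python
def build_chat_request(question, model="deepseek-v3", system_prompt=None,
                       max_tokens=4096, temperature=1.0):
    """Build keyword arguments for client.chat.completions.create(**kwargs)."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": str(question)})
    return {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "stream": False,
    }


v3_request = build_chat_request("Hello")
r1_request = build_chat_request("Hello", model="deepseek-r1")  # assumed model id
```

Each dictionary can then be unpacked into the call, for example client.chat.completions.create(**r1_request).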

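2. Building an academic-style polishing website on a text-generation model

An academic-style text polishing website built on a large language model, using DeepSeek-v3.

The code is as follows: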

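import gradio as gr
from openai import OpenAI
import yaml


def get_apikey(path="apikey.yaml"):
    # Load the API keys from a local YAML file
    with open(path, 'r') as f:
        config = yaml.safe_load(f)
    res = config["apikey"]
    return res


def deepseek_academic_repeat(question):
    client = OpenAI(
        api_key=get_apikey()['dashscope'],
        base_url="https://dashscope.aliyuncs.com/compatible-mode/v1"
    )
    # System prompt: role, task, constraints, and output format for the polishing model
    academic_prompt = """# Role #
You are an expert in revising academic and scientific papers. You need to polish or revise the paper text provided by the user so that it conforms to academic writing conventions.
# Task #
Polish and revise the text uploaded by the user, mainly from the following angles:
1. Use formal academic language and avoid colloquial expressions; ensure the logic is clear.
2. Make the argumentation rigorous and more persuasive.
3. Keep the narrative perspective consistent and the same as the author's.
4. Improve sentence structure so that it is more concise and standard.
5. Strengthen the use of terminology so that domain concepts and viewpoints are expressed accurately.
6. Check spelling, grammar, and punctuation errors so that the language is accurate.
# Constraints #
Do not fabricate content. The output language must strictly match the input language.
# Output #
Output only the polished result; do not output any other unrelated content."""
    response = client.chat.completions.create(
        model="deepseek-v3",
        messages=[
            {"role": "system", "content": academic_prompt},
            {"role": "user", "content": f"{question}"},
        ],
        max_tokens=2024,
        temperature=0.15,  # low temperature keeps the rewrite close to the source
        stream=False
    )
    # print(response.choices[0].message.content)
    return response.choices[0].message.content


interface = gr.Interface(
    fn=deepseek_academic_repeat,  # polishing function
    inputs=[gr.Textbox(label="Input text", placeholder="Enter the text to polish", lines=20)],  # input box
    outputs=gr.Textbox(label="Polished text", lines=20),  # output box
    title="Text Polishing Tool",  # app title
    description="Enter the text to polish on the left; the polished text appears on the right."  # app description
)

if __name__ == "__main__":
    interface.launch(server_name="0.0.0.0", server_port=8443, share=True)  # set the port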
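
Because max_tokens caps the length of each reply, a long manuscript may be cut off if it is submitted in one piece. A minimal sketch, assuming that splitting on blank lines is acceptable for the text at hand, groups paragraphs into chunks below a character budget so that each chunk can be polished in its own request. The name split_into_chunks and the 1500-character budget are illustrative choices, not part of the original tool.

```python
def split_into_chunks(text, max_chars=1500):
    """Group blank-line-separated paragraphs into chunks of at most max_chars.

    A single paragraph longer than max_chars is kept whole in its own chunk.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            # Adding this paragraph would exceed the budget: close the chunk
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be passed to deepseek_academic_repeat in turn and the replies joined with blank lines.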

Publicly accessible at:

Text Polishing Tool: http://47.94.104.2:8443/

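3. Polishing a full PDF with OCR and saving the output locally

Step 1: develop the functionality on the cloud server and deploy it as an API endpoint.

The required functionality is image text extraction (OCR), polishing (large model), and output. OCR is handled by Qwen's vision model (Qwen-VL), and polishing by Qwen-Max.

The implementation is as follows: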

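from openai import OpenAI
import time
import json
import yaml
from fastapi import FastAPI
from pydantic import BaseModel
import uvicorn

app = FastAPI()


def get_apikey(path="/root/MyProj/apikey.yaml"):
    # Load the API keys from a YAML file on the server
    with open(path, 'r') as f:
        config = yaml.safe_load(f)
    res = config["apikey"]
    return res


def qwen_ocr(base64_image_code, addr_type):
    # Get today's date (YYYY-MM-DD)
    today = time.strftime("%Y-%m-%d", time.localtime())
    # print(today)
    if today == "2025-05-03":
        print("free test is ending!")
        return {"res": "free test is ending!"}
    client = OpenAI(
        api_key=get_apikey()['dashscope'],
        base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    )
    completion = client.chat.completions.create(
        model="qwen-vl-ocr",
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "image_url",
                        "image_url": f"data:image/{addr_type};base64,{base64_image_code}",
                        "min_pixels": 28 * 28 * 4,
                        "max_pixels": 1280 * 784
                    },
                    # To guarantee recognition quality, the model currently uses
                    # "Read all the text in the image." as the text value internally;
                    # user-supplied text has no effect.
                    {"type": "text", "text": "Read all the text in the image."},
                ]
            }
        ]
    )
    res_dict = json.loads(completion.model_dump_json())
    res_text = res_dict['choices'][0]['message']['content']
    return {"ocr_res": res_text}


def qwen_max_repeat(content):
    client = OpenAI(
        api_key=get_apikey()['dashscope'],
        base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    )
    completion = client.chat.completions.create(
        model="qwen-max-2025-01-25",
        messages=[
            {
                "role": "system",
                "content": """# Role #
You are an expert in revising academic and scientific papers. You need to polish or revise the paper text provided by the user so that it conforms to academic writing conventions.
# Task #
Polish and revise the text uploaded by the user, mainly from the following angles:
1. Use formal academic language and avoid colloquial expressions; ensure the logic is clear.
2. Make the argumentation rigorous and more persuasive.
3. Keep the narrative perspective consistent and the same as the author's.
4. Improve sentence structure so that it is more concise and standard; remove unnecessary repetition and overly long wording.
5. Strengthen the use of terminology so that domain concepts and viewpoints are expressed accurately.
6. Check spelling, typos, grammar, and punctuation errors so that the language is accurate.
# Constraints #
Do not fabricate content. The output language must strictly match the input language.
# Output #
Output only the polished result; do not output any other unrelated content. Also tidy the layout and formatting of the output, including indentation and line breaks."""
            },
            {
                "role": "user",
                "content": f"""{content}"""
            }
        ],
        max_tokens=4096,
        temperature=0.12,
        stream=False
    )
    res = json.loads(completion.model_dump_json())
    return {"response": res["choices"][0]["message"]["content"]}


class paper_revision_input(BaseModel):
    base64_image_code: str
    addr_type: str


@app.post("/paper_revision")
def main_process(func_input: paper_revision_input):
    ocr_out = qwen_ocr(func_input.base64_image_code, func_input.addr_type)
    if "ocr_res" not in ocr_out:
        # Free-test cutoff: return the notice instead of sending it to the model
        return ocr_out["res"]
    # Pass only the recognized text (not the wrapping dict) to the polishing model
    out = qwen_max_repeat(ocr_out["ocr_res"])
    return out["response"]


if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=99)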
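
The OCR request embeds the page image as a data URL of the form data:image/<type>;base64,<payload>. The sketch below builds such a URL from raw bytes with the standard library. The helper name to_data_url is hypothetical, and lowercasing the type (so that 'PNG' becomes image/png) is a defensive assumption rather than a documented requirement of the service.

```python
import base64


def to_data_url(image_bytes, image_type="png"):
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:image/{image_type.lower()};base64,{encoded}"
```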

The endpoint is: http://47.94.104.2:99/paper_revision

Step 2: call the service locally

The workflow is: read the PDF -> split into page images -> OCR -> polish -> output
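
The endpoint accepts a JSON body with the two fields declared in paper_revision_input and, because main_process returns a plain string, replies with a JSON-encoded string. A minimal sketch of both sides of that exchange, using only the standard library, follows. The names build_payload and decode_reply are hypothetical helpers introduced for illustration.

```python
import base64
import json


def build_payload(image_bytes, image_type="PNG"):
    """Return the JSON request body expected by /paper_revision."""
    return json.dumps({
        "base64_image_code": base64.b64encode(image_bytes).decode("ascii"),
        "addr_type": image_type,
    })


def decode_reply(raw_body):
    """Decode the JSON-encoded string returned by the endpoint."""
    return json.loads(raw_body)
```

Decoding with json.loads restores escaped newlines, quotes, and backslashes in one step, which is less fragile than stripping quotes and replacing escape sequences by hand.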

import io
from PIL import Image
import base64
import requests
from tqdm import tqdm

# ===== pdf2img optional Method 1 ===== #
# from pdf2image import convert_from_path     # Requires poppler: convenient on Linux, more cumbersome on Windows, where Method 2 is recommended.
# def read_pdf2ImgLs(pdf_path) -> list:
#     images_ls = convert_from_path(pdf_path,dpi=300)
#     return images_ls

# ===== pdf2img optional Method 2 ===== #
import fitz  # pip install pymupdf


def read_pdf2ImgLs(pdf_path) -> list:
    # Render every PDF page to a PIL image at 2x zoom
    pdf = fitz.open(pdf_path)
    images_ls = []
    zoom_x = 2.0
    zoom_y = 2.0
    for i, pg in enumerate(pdf):
        mat = fitz.Matrix(zoom_x, zoom_y)
        pix = pg.get_pixmap(matrix=mat)
        img = Image.frombytes("RGB", (pix.width, pix.height), pix.samples)
        images_ls.append(img)
    return images_ls


def PILimage2base64(image):
    # Serialize the image as PNG and return (base64 string, image type)
    buffered = io.BytesIO()
    image_type = 'PNG'
    image.save(buffered, format=image_type)
    return base64.b64encode(buffered.getvalue()).decode(), image_type


def paper_revision(pdf_path):
    # Path of the output txt file
    output_txt = 'output.txt'
    image_ls = read_pdf2ImgLs(pdf_path)
    for page, image in enumerate(tqdm(image_ls, desc='Processing pages')):
        base64code, addr_type = PILimage2base64(image)
        input_data = {
            "base64_image_code": base64code,
            "addr_type": addr_type,
        }
        repeat_response = requests.post(
            'http://47.94.104.2:99/paper_revision',
            json=input_data,
        )
        assert repeat_response.status_code == 200
        # The endpoint returns a JSON-encoded string; .json() removes the
        # surrounding quotes and decodes escapes such as \n and \\
        decoded_string = repeat_response.json()
        with open(output_txt, 'a', encoding='utf-8') as f:
            f.write(decoded_string + '\n')
            f.write(f'(page:{page+1})\n')


if __name__ == '__main__':
    paper_revision('test_file.pdf')
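
The page loop stops at the first failed request because of the assert. For multi-page PDFs, a small retry wrapper with exponential backoff may be more forgiving. The sketch below is one possible shape: call_with_retry, the attempt count, and the delays are illustrative assumptions, and the wrapped callable stands in for the requests.post call.

```python
import time


def call_with_retry(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func() until it succeeds or the attempts run out.

    Waits base_delay, 2 * base_delay, ... between tries and re-raises the
    last exception if every attempt fails.
    """
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

Inside the loop, the request could be wrapped in a small function that posts the page and raises on a non-200 status, then passed to call_with_retry so that a transient network error does not abort the whole document.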

Call result: