xinference docker 部署方式

server/2025/3/6 19:52:08/

文章目录

    • 简绍
    • docker 安装方式
    • 访问地址
    • 对应官网
    • 在 dify 中 添加 xinference 容器
    • 内置大语言模型
    • 嵌入模型
    • 图像模型
    • 音频模型
    • 重排序模型
    • 视频模型

在这里插入图片描述

简绍

Xorbits Inference (Xinference) 是一个开源平台,用于简化各种 AI 模型的运行和集成。借助 Xinference,您可以使用任何开源 LLM、嵌入模型和多模态模型在云端或本地环境中运行推理,并创建强大的 AI 应用。

docker__6">docker 安装方式

docker 下载对应的 xinference

docker pull xprobe/xinference

docker 运行,注意 路径改成自己的,

docker run  -d  --name xinference --gpus all  -v E:/docker/xinference/models:/root/models  -v E:/docker/xinference/.xinference:/root/.xinference -v E:/docker/xinference/.cache/huggingface:/root/.cache/huggingface -e XINFERENCE_HOME=/root/models  -p 9997:9997 xprobe/xinference:latest xinference-local -H 0.0.0.0
  • -d: 让容器在后台运行。
  • --name xinference: 为容器指定一个名称,这里是xinference。
  • --gpus all: 允许容器访问主机上的所有GPU,这对于需要进行大量计算的任务(如机器学习模型的推理)非常有用。
  • -v E:/docker/xinference/models:/root/models, -v E:/docker/xinference/.xinference:/root/.xinference, -v E:/docker/xinference/.cache/huggingface:/root/.cache/huggingface: 这些参数用于将主机的目录挂载到容器内部的特定路径,以便于数据持久化和共享。例如,第一个挂载是将主机的E:/docker/xinference/models目录映射到容器内的/root/models目录。
  • -e XINFERENCE_HOME=/root/models: 设置环境变量XINFERENCE_HOME,其值为/root/models,这可能是在容器内配置某些应用行为的方式。
  • -p 9997:9997: 将主机的9997端口映射到容器的9997端口,允许外部通过主机的该端口访问容器的服务。
  • xprobe/xinference:latest: 指定要使用的镜像和标签,这里使用的是xprobe/xinference镜像的latest版本。
  • xinference-local -H 0.0.0.0: 在容器启动时执行的命令,看起来像是以本地模式运行某个服务,并监听所有网络接口。

访问地址

http://127.0.0.1:9997/
在这里插入图片描述

对应官网

https://inference.readthedocs.io/zh-cn/latest/index.html

在 dify 中 添加 xinference 容器

docker dify 添加 docker 容器内ip 配置

http://host.docker.internal:9997

内置大语言模型

MODEL NAME

ABILITIES

COTNEXT_LENGTH

DESCRIPTION

aquila2

generate

2048

Aquila2 series models are the base language models

aquila2-chat

chat

2048

Aquila2-chat series models are the chat models

aquila2-chat-16k

chat

16384

AquilaChat2-16k series models are the long-text chat models

baichuan-2

generate

4096

Baichuan2 is an open-source Transformer based LLM that is trained on both Chinese and English data.

baichuan-2-chat

chat

4096

Baichuan2-chat is a fine-tuned version of the Baichuan LLM, specializing in chatting.

c4ai-command-r-v01

chat

131072

C4AI Command-R(+) is a research release of a 35 and 104 billion parameter highly performant generative model.

code-llama

generate

100000

Code-Llama is an open-source LLM trained by fine-tuning LLaMA2 for generating and discussing code.

code-llama-instruct

chat

100000

Code-Llama-Instruct is an instruct-tuned version of the Code-Llama LLM.

code-llama-python

generate

100000

Code-Llama-Python is a fine-tuned version of the Code-Llama LLM, specializing in Python.

codegeex4

chat

131072

the open-source version of the latest CodeGeeX4 model series

codeqwen1.5

generate

65536

CodeQwen1.5 is the Code-Specific version of Qwen1.5. It is a transformer-based decoder-only language model pretrained on a large amount of data of codes.

codeqwen1.5-chat

chat

65536

CodeQwen1.5 is the Code-Specific version of Qwen1.5. It is a transformer-based decoder-only language model pretrained on a large amount of data of codes.

codeshell

generate

8194

CodeShell is a multi-language code LLM developed by the Knowledge Computing Lab of Peking University.

codeshell-chat

chat

8194

CodeShell is a multi-language code LLM developed by the Knowledge Computing Lab of Peking University.

codestral-v0.1

generate

32768

Codestrall-22B-v0.1 is trained on a diverse dataset of 80+ programming languages, including the most popular ones, such as Python, Java, C, C++, JavaScript, and Bash

cogagent

chat, vision

4096

The CogAgent-9B-20241220 model is based on GLM-4V-9B, a bilingual open-source VLM base model. Through data collection and optimization, multi-stage training, and strategy improvements, CogAgent-9B-20241220 achieves significant advancements in GUI perception, inference prediction accuracy, action space completeness, and task generalizability.

cogvlm2

chat, vision

8192

CogVLM2 have achieved good results in many lists compared to the previous generation of CogVLM open source models. Its excellent performance can compete with some non-open source models.

cogvlm2-video-llama3-chat

chat, vision

8192

CogVLM2-Video achieves state-of-the-art performance on multiple video question answering tasks.

csg-wukong-chat-v0.1

chat

32768

csg-wukong-1B is a 1 billion-parameter small language model(SLM) pretrained on 1T tokens.

deepseek

generate

4096

DeepSeek LLM, trained from scratch on a vast dataset of 2 trillion tokens in both English and Chinese.

deepseek-chat

chat

4096

DeepSeek LLM is an advanced language model comprising 67 billion parameters. It has been trained from scratch on a vast dataset of 2 trillion tokens in both English and Chinese.

deepseek-coder

generate

16384

Deepseek Coder is composed of a series of code language models, each trained from scratch on 2T tokens, with a composition of 87% code and 13% natural language in both English and Chinese.

deepseek-coder-instruct

chat

16384

deepseek-coder-instruct is a model initialized from deepseek-coder-base and fine-tuned on 2B tokens of instruction data.

deepseek-r1

chat

163840

DeepSeek-R1, which incorporates cold-start data before RL. DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks.

deepseek-r1-distill-llama

chat

131072

deepseek-r1-distill-llama is distilled from DeepSeek-R1 based on Llama

deepseek-r1-distill-qwen

chat

131072

deepseek-r1-distill-qwen is distilled from DeepSeek-R1 based on Qwen

deepseek-v2

generate

128000

DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference.

deepseek-v2-chat

chat

128000

DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference.

deepseek-v2-chat-0628

chat

128000

DeepSeek-V2-Chat-0628 is an improved version of DeepSeek-V2-Chat.

deepseek-v2.5

chat

128000

DeepSeek-V2.5 is an upgraded version that combines DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct. The new model integrates the general and coding abilities of the two previous versions.

deepseek-v3

chat

163840

DeepSeek-V3, a strong Mixture-of-Experts (MoE) language model with 671B total parameters with 37B activated for each token.

deepseek-vl-chat

chat, vision

4096

DeepSeek-VL possesses general multimodal understanding capabilities, capable of processing logical diagrams, web pages, formula recognition, scientific literature, natural images, and embodied intelligence in complex scenarios.

gemma-2-it

chat

8192

Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models.

gemma-it

chat

8192

Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models.

glm-4v

chat, vision

8192

GLM4 is the open source version of the latest generation of pre-trained models in the GLM-4 series launched by Zhipu AI.

glm-edge-chat

chat

8192

The GLM-Edge series is our attempt to face the end-side real-life scenarios, which consists of two sizes of large-language dialogue models and multimodal comprehension models (GLM-Edge-1.5B-Chat, GLM-Edge-4B-Chat, GLM-Edge-V-2B, GLM-Edge-V-5B). Among them, the 1.5B / 2B model is mainly for platforms such as mobile phones and cars, and the 4B / 5B model is mainly for platforms such as PCs.

glm-edge-v

chat, vision

8192

The GLM-Edge series is our attempt to face the end-side real-life scenarios, which consists of two sizes of large-language dialogue models and multimodal comprehension models (GLM-Edge-1.5B-Chat, GLM-Edge-4B-Chat, GLM-Edge-V-2B, GLM-Edge-V-5B). Among them, the 1.5B / 2B model is mainly for platforms such as mobile phones and cars, and the 4B / 5B model is mainly for platforms such as PCs.

glm4-chat

chat, tools

131072

GLM4 is the open source version of the latest generation of pre-trained models in the GLM-4 series launched by Zhipu AI.

glm4-chat-1m

chat, tools

1048576

GLM4 is the open source version of the latest generation of pre-trained models in the GLM-4 series launched by Zhipu AI.

gorilla-openfunctions-v2

chat

4096

OpenFunctions is designed to extend Large Language Model (LLM) Chat Completion feature to formulate executable APIs call given natural language instructions and API context.

gpt-2

generate

1024

GPT-2 is a Transformer-based LLM that is trained on WebTest, a 40 GB dataset of Reddit posts with 3+ upvotes.

internlm2-chat

chat

32768

The second generation of the InternLM model, InternLM2.

internlm2.5-chat

chat

32768

InternLM2.5 series of the InternLM model.

internlm2.5-chat-1m

chat

262144

InternLM2.5 series of the InternLM model supports 1M long-context

internlm3-instruct

chat, tools

32768

InternLM3 has open-sourced an 8-billion parameter instruction model, InternLM3-8B-Instruct, designed for general-purpose usage and advanced reasoning.

internvl-chat

chat, vision

32768

InternVL 1.5 is an open-source multimodal large language model (MLLM) to bridge the capability gap between open-source and proprietary commercial models in multimodal understanding.

internvl2

chat, vision

32768

InternVL 2 is an open-source multimodal large language model (MLLM) to bridge the capability gap between open-source and proprietary commercial models in multimodal understanding.

llama-2

generate

4096

Llama-2 is the second generation of Llama, open-source and trained on a larger amount of data.

llama-2-chat

chat

4096

Llama-2-Chat is a fine-tuned version of the Llama-2 LLM, specializing in chatting.

llama-3

generate

8192

Llama 3 is an auto-regressive language model that uses an optimized transformer architecture

llama-3-instruct

chat

8192

The Llama 3 instruction tuned models are optimized for dialogue use cases and outperform many of the available open source chat models on common industry benchmarks..

llama-3.1

generate

131072

Llama 3.1 is an auto-regressive language model that uses an optimized transformer architecture

llama-3.1-instruct

chat, tools

131072

The Llama 3.1 instruction tuned models are optimized for dialogue use cases and outperform many of the available open source chat models on common industry benchmarks..

llama-3.2-vision

generate, vision

131072

The Llama 3.2-Vision instruction-tuned models are optimized for visual recognition, image reasoning, captioning, and answering general questions about an image…

llama-3.2-vision-instruct

chat, vision

131072

Llama 3.2-Vision instruction-tuned models are optimized for visual recognition, image reasoning, captioning, and answering general questions about an image…

llama-3.3-instruct

chat, tools

131072

The Llama 3.3 instruction tuned models are optimized for dialogue use cases and outperform many of the available open source chat models on common industry benchmarks..

marco-o1

chat, tools

32768

Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions

minicpm-2b-dpo-bf16

chat

4096

MiniCPM is an End-Size LLM developed by ModelBest Inc. and TsinghuaNLP, with only 2.4B parameters excluding embeddings.

minicpm-2b-dpo-fp16

chat

4096

MiniCPM is an End-Size LLM developed by ModelBest Inc. and TsinghuaNLP, with only 2.4B parameters excluding embeddings.

minicpm-2b-dpo-fp32

chat

4096

MiniCPM is an End-Size LLM developed by ModelBest Inc. and TsinghuaNLP, with only 2.4B parameters excluding embeddings.

minicpm-2b-sft-bf16

chat

4096

MiniCPM is an End-Size LLM developed by ModelBest Inc. and TsinghuaNLP, with only 2.4B parameters excluding embeddings.

minicpm-2b-sft-fp32

chat

4096

MiniCPM is an End-Size LLM developed by ModelBest Inc. and TsinghuaNLP, with only 2.4B parameters excluding embeddings.

minicpm-llama3-v-2_5

chat, vision

8192

MiniCPM-Llama3-V 2.5 is the latest model in the MiniCPM-V series. The model is built on SigLip-400M and Llama3-8B-Instruct with a total of 8B parameters.

minicpm-v-2.6

chat, vision

32768

MiniCPM-V 2.6 is the latest model in the MiniCPM-V series. The model is built on SigLip-400M and Qwen2-7B with a total of 8B parameters.

minicpm3-4b

chat

32768

MiniCPM3-4B is the 3rd generation of MiniCPM series. The overall performance of MiniCPM3-4B surpasses Phi-3.5-mini-Instruct and GPT-3.5-Turbo-0125, being comparable with many recent 7B~9B models.

mistral-instruct-v0.1

chat

8192

Mistral-7B-Instruct is a fine-tuned version of the Mistral-7B LLM on public datasets, specializing in chatting.

mistral-instruct-v0.2

chat

8192

The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an improved instruct fine-tuned version of Mistral-7B-Instruct-v0.1.

mistral-instruct-v0.3

chat

32768

The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an improved instruct fine-tuned version of Mistral-7B-Instruct-v0.1.

mistral-large-instruct

chat

131072

Mistral-Large-Instruct-2407 is an advanced dense Large Language Model (LLM) of 123B parameters with state-of-the-art reasoning, knowledge and coding capabilities.

mistral-nemo-instruct

chat

1024000

The Mistral-Nemo-Instruct-2407 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-Nemo-Base-2407

mistral-v0.1

generate

8192

Mistral-7B is a unmoderated Transformer based LLM claiming to outperform Llama2 on all benchmarks.

mixtral-8x22b-instruct-v0.1

chat

65536

The Mixtral-8x22B-Instruct-v0.1 Large Language Model (LLM) is an instruct fine-tuned version of the Mixtral-8x22B-v0.1, specializing in chatting.

mixtral-instruct-v0.1

chat

32768

Mistral-8x7B-Instruct is a fine-tuned version of the Mistral-8x7B LLM, specializing in chatting.

mixtral-v0.1

generate

32768

The Mixtral-8x7B Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts.

omnilmm

chat, vision

2048

OmniLMM is a family of open-source large multimodal models (LMMs) adept at vision & language modeling.

openhermes-2.5

chat

8192

Openhermes 2.5 is a fine-tuned version of Mistral-7B-v0.1 on primarily GPT-4 generated data.

opt

generate

2048

Opt is an open-source, decoder-only, Transformer based LLM that was designed to replicate GPT-3.

orion-chat

chat

4096

Orion-14B series models are open-source multilingual large language models trained from scratch by OrionStarAI.

orion-chat-rag

chat

4096

Orion-14B series models are open-source multilingual large language models trained from scratch by OrionStarAI.

phi-2

generate

2048

Phi-2 is a 2.7B Transformer based LLM used for research on model safety, trained with data similar to Phi-1.5 but augmented with synthetic texts and curated websites.

phi-3-mini-128k-instruct

chat

128000

The Phi-3-Mini-128K-Instruct is a 3.8 billion-parameter, lightweight, state-of-the-art open model trained using the Phi-3 datasets.

phi-3-mini-4k-instruct

chat

4096

The Phi-3-Mini-4k-Instruct is a 3.8 billion-parameter, lightweight, state-of-the-art open model trained using the Phi-3 datasets.

platypus2-70b-instruct

generate

4096

Platypus-70B-instruct is a merge of garage-bAInd/Platypus2-70B and upstage/Llama-2-70b-instruct-v2.

qvq-72b-preview

chat, vision

32768

QVQ-72B-Preview is an experimental research model developed by the Qwen team, focusing on enhancing visual reasoning capabilities.

qwen-chat

chat

32768

Qwen-chat is a fine-tuned version of the Qwen LLM trained with alignment techniques, specializing in chatting.

qwen-vl-chat

chat, vision

4096

Qwen-VL-Chat supports more flexible interaction, such as multiple image inputs, multi-round question answering, and creative capabilities.

qwen1.5-chat

chat, tools

32768

Qwen1.5 is the beta version of Qwen2, a transformer-based decoder-only language model pretrained on a large amount of data.

qwen1.5-moe-chat

chat, tools

32768

Qwen1.5-MoE is a transformer-based MoE decoder-only language model pretrained on a large amount of data.

qwen2-audio

generate, audio

32768

Qwen2-Audio: A large-scale audio-language model which is capable of accepting various audio signal inputs and performing audio analysis or direct textual responses with regard to speech instructions.

qwen2-audio-instruct

chat, audio

32768

Qwen2-Audio: A large-scale audio-language model which is capable of accepting various audio signal inputs and performing audio analysis or direct textual responses with regard to speech instructions.

qwen2-instruct

chat, tools

32768

Qwen2 is the new series of Qwen large language models

qwen2-moe-instruct

chat, tools

32768

Qwen2 is the new series of Qwen large language models.

qwen2-vl-instruct

chat, vision

32768

Qwen2-VL: To See the World More Clearly.Qwen2-VL is the latest version of the vision language models in the Qwen model familities.

qwen2.5

generate

32768

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters.

qwen2.5-coder

generate

32768

Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen).

qwen2.5-coder-instruct

chat, tools

32768

Qwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen).

qwen2.5-instruct

chat, tools

32768

Qwen2.5 is the latest series of Qwen large language models. For Qwen2.5, we release a number of base language models and instruction-tuned language models ranging from 0.5 to 72 billion parameters.

qwen2.5-vl-instruct

chat, vision

128000

Qwen2.5-VL: Qwen2.5-VL is the latest version of the vision language models in the Qwen model familities.

qwq-32b-preview

chat

32768

QwQ-32B-Preview is an experimental research model developed by the Qwen Team, focused on advancing AI reasoning capabilities.

seallm_v2

generate

8192

We introduce SeaLLM-7B-v2, the state-of-the-art multilingual LLM for Southeast Asian (SEA) languages

seallm_v2.5

generate

8192

We introduce SeaLLM-7B-v2.5, the state-of-the-art multilingual LLM for Southeast Asian (SEA) languages

skywork

generate

4096

Skywork is a series of large models developed by the Kunlun Group · Skywork team.

skywork-math

generate

4096

Skywork is a series of large models developed by the Kunlun Group · Skywork team.

starling-lm

chat

4096

We introduce Starling-7B, an open large language model (LLM) trained by Reinforcement Learning from AI Feedback (RLAIF). The model harnesses the power of our new GPT-4 labeled ranking dataset

telechat

chat

8192

The TeleChat is a large language model developed and trained by China Telecom Artificial Intelligence Technology Co., LTD. The 7B model base is trained with 1.5 trillion Tokens and 3 trillion Tokens and Chinese high-quality corpus.

tiny-llama

generate

2048

The TinyLlama project aims to pretrain a 1.1B Llama model on 3 trillion tokens.

wizardcoder-python-v1.0

chat

100000

wizardmath-v1.0

chat

2048

WizardMath is an open-source LLM trained by fine-tuning Llama2 with Evol-Instruct, specializing in math.

xverse

generate

2048

XVERSE is a multilingual large language model, independently developed by Shenzhen Yuanxiang Technology.

xverse-chat

chat

2048

XVERSEB-Chat is the aligned version of model XVERSE.

yi

generate

4096

The Yi series models are large language models trained from scratch by developers at 01.AI.

yi-1.5

generate

4096

Yi-1.5 is an upgraded version of Yi. It is continuously pre-trained on Yi with a high-quality corpus of 500B tokens and fine-tuned on 3M diverse fine-tuning samples.

yi-1.5-chat

chat

4096

Yi-1.5 is an upgraded version of Yi. It is continuously pre-trained on Yi with a high-quality corpus of 500B tokens and fine-tuned on 3M diverse fine-tuning samples.

yi-1.5-chat-16k

chat

16384

Yi-1.5 is an upgraded version of Yi. It is continuously pre-trained on Yi with a high-quality corpus of 500B tokens and fine-tuned on 3M diverse fine-tuning samples.

yi-200k

generate

262144

The Yi series models are large language models trained from scratch by developers at 01.AI.

yi-chat

chat

4096

The Yi series models are large language models trained from scratch by developers at 01.AI.

yi-coder

generate

131072

Yi-Coder is a series of open-source code language models that delivers state-of-the-art coding performance with fewer than 10 billion parameters.Excelling in long-context understanding with a maximum context length of 128K tokens.Supporting 52 major programming languages, including popular ones such as Java, Python, JavaScript, and C++.

yi-coder-chat

chat

131072

Yi-Coder is a series of open-source code language models that delivers state-of-the-art coding performance with fewer than 10 billion parameters.Excelling in long-context understanding with a maximum context length of 128K tokens.Supporting 52 major programming languages, including popular ones such as Java, Python, JavaScript, and C++.

yi-vl-chat

chat, vision

4096

Yi Vision Language (Yi-VL) model is the open-source, multimodal version of the Yi Large Language Model (LLM) series, enabling content comprehension, recognition, and multi-round conversations about images.

嵌入模型

  • bce-embedding-base_v1
  • bge-base-en
  • bge-base-en-v1.5
  • bge-base-zh
  • bge-base-zh-v1.5
  • bge-large-en
  • bge-large-en-v1.5
  • bge-large-zh
  • bge-large-zh-noinstruct
  • bge-large-zh-v1.5
  • bge-m3
  • bge-small-en-v1.5
  • bge-small-zh
  • bge-small-zh-v1.5
  • e5-large-v2
  • gte-base
  • gte-large
  • gte-Qwen2
  • jina-clip-v2
  • jina-embeddings-v2-base-en
  • jina-embeddings-v2-base-zh
  • jina-embeddings-v2-small-en
  • jina-embeddings-v3
  • m3e-base
  • m3e-large
  • m3e-small
  • multilingual-e5-large
  • text2vec-base-chinese
  • text2vec-base-chinese-paraphrase
  • text2vec-base-chinese-sentence
  • text2vec-base-multilingual
  • text2vec-large-chinese
  • FLUX.1-dev
  • FLUX.1-schnell
  • GOT-OCR2_0
  • HunyuanDiT-v1.2
  • HunyuanDiT-v1.2-Distilled
  • kolors
  • sd-turbo
  • sd3-medium
  • sd3.5-large
  • sd3.5-large-turbo
  • sd3.5-medium
  • sdxl-turbo
  • stable-diffusion-2-inpainting
  • stable-diffusion-inpainting
  • stable-diffusion-v1.5
  • stable-diffusion-xl-base-1.0
  • stable-diffusion-xl-inpainting

图像模型

  • FLUX.1-dev
  • FLUX.1-schnell
  • GOT-OCR2_0
  • HunyuanDiT-v1.2
  • HunyuanDiT-v1.2-Distilled
  • kolors
  • sd-turbo
  • sd3-medium
  • sd3.5-large
  • sd3.5-large-turbo
  • sd3.5-medium
  • sdxl-turbo
  • stable-diffusion-2-inpainting
  • stable-diffusion-inpainting
  • stable-diffusion-v1.5
  • stable-diffusion-xl-base-1.0
  • stable-diffusion-xl-inpainting

音频模型

以下是 Xinference 中内置的音频模型列表:

  • Belle-distilwhisper-large-v2-zh
  • Belle-whisper-large-v2-zh
  • Belle-whisper-large-v3-zh
  • ChatTTS
  • CosyVoice-300M
  • CosyVoice-300M-Instruct
  • CosyVoice-300M-SFT
  • CosyVoice2-0.5B
  • F5-TTS
  • F5-TTS-MLX
  • FishSpeech-1.5
  • Kokoro-82M
  • MeloTTS-Chinese
  • MeloTTS-English
  • MeloTTS-English-v2
  • MeloTTS-English-v3
  • MeloTTS-French
  • MeloTTS-Japanese
  • MeloTTS-Korean
  • MeloTTS-Spanish
  • SenseVoiceSmall
  • whisper-base
  • whisper-base-mlx
  • whisper-base.en
  • whisper-base.en-mlx
  • whisper-large-v3
  • whisper-large-v3-mlx
  • whisper-large-v3-turbo
  • whisper-large-v3-turbo-mlx
  • whisper-medium
  • whisper-medium-mlx
  • whisper-medium.en
  • whisper-medium.en-mlx
  • whisper-small
  • whisper-small-mlx
  • whisper-small.en
  • whisper-small.en-mlx
  • whisper-tiny
  • whisper-tiny-mlx
  • whisper-tiny.en
  • whisper-tiny.en-mlx

重排序模型

以下是 Xinference 中内置的重排序模型列表:

  • bce-reranker-base_v1
  • bge-reranker-base
  • bge-reranker-large
  • bge-reranker-v2-gemma
  • bge-reranker-v2-m3
  • bge-reranker-v2-minicpm-layerwise
  • jina-reranker-v2
  • minicpm-reranker

视频模型

以下是 Xinference 中内置的视频模型列表:

  • CogVideoX-2b
  • CogVideoX-5b
  • HunyuanVideo

http://www.ppmy.cn/server/172952.html

相关文章

解决git add . + git commit之后文件状态还是M 问题

在每次 git add . 和 git commit -m "" 之后,都会有很多文件依旧保持M状态。 原因是每次我使用了huskycommitlint,我的pre-commit里面运行了代码格式化工具, #!/usr/bin/env sh . "$(dirname -- "$0")/_/husky.sh…

Trae 是一款由 AI 驱动的 IDE,让编程更加愉悦和高效。国际版集成了 GPT-4 和 Claude 3.5,国内版集成了DeepSeek-r1

Trae 是一款由 AI 驱动的 IDE,让编程更加愉悦和高效。国际版集成了 GPT-4 和 Claude 3.5,国内版继承了DeepSeek-r1,支持实时代码建议和无缝 GitHub 集成。 当前国内和国际版的AI都是免费的。 安装 国际版安装 国际版下载:下载…

每日一题——接雨水

接雨水问题详解 问题描述 给定一个非负整数数组 height,表示每个宽度为 1 的柱子的高度图。计算按此排列的柱子,下雨之后能接多少雨水。 示例 示例 1: 输入:height [0,1,0,2,1,0,1,3,2,1,2,1] 输出:6 解释&#…

Excel 豆知识 - XLOOKUP 为啥会出 #N/A 错误

XLOOKUP有的时候会出 #VALUE! 这个错误。 因为这个XLOOUP有个参数叫 找不到时的返回值,那么为啥还会返回 #VALUE! 呢? 可能还有别的原因,但是主要原因应该就是 检索范围 和 返回范围 不同。 比如这里检索范围在 B列,是 4-21&…

3.6V-30V宽压输入降压同步IC内置MOS,电流4A/5A/6A,可以满足汽车应急电源,BMS电池,电池组USB口输出等储能应用

今天给大家介绍一下这三款产品,分别是CJ92340,输入电压4.5V-30V,输出可调,电流负载能力可达4A,频率350KHZ。CJ92350,输入电压3.6V-30V,输出可调,频率可调,带载能力达5A。CJ92360,输入电压3.6V-3…

深度探索:美团开源DeepSeek R1 INT8量化技术的性能革命

摘要 美团搜索推荐机器学习团队近日发布了一项重要开源成果——DeepSeek R1的INT8无损满血版。该模型部署在A100硬件上,采用INT8量化技术,在保持BF16精度的同时,实现了高达50%的吞吐量提升。这一突破使得老旧显卡无需更换硬件即可获得显著性能…

C++(蓝桥杯常考点)

前言:这个是针对于蓝桥杯竞赛常考的C内容,容器这些等下棋期再讲 C 在DEVC中注释和取消注释的方法:ctrl/ ASCII值(常用的): A-Z:65-90 a-z:97-122 0-9:48-57 换行/n:10科学计数法:eg&#xff1a…

机器学习-决策树详细解释

目录 一、预备知识 1.信息熵: 2.条件熵: 3.信息增益 4.基于信息增益选择分割特征的过程 5. C4.5算法 6.C435算法选择特征的策略 7 基尼不纯度: 二. 决策树的核心概念 ​1.树的结构 ​2.关键算法 三. 决策树的构建过程 1.特征选择 2.递归分割 3.停止条件 四. 决…