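Amazon Bedrock adds the Stable Diffusion 3.5 Large model for high-quality image generation

Introduction

The update pre-announced at AWS re:Invent 2024 is now live. You can now access the Stable Diffusion 3.5 Large model through Amazon Bedrock to generate high-quality images from text descriptions in a wide range of artistic styles. This supports concept art, visual effects, and detailed product imagery for customers in media, gaming, advertising, and retail.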

Stable Diffusion 3.5 Large

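In October 2024, Stability AI released Stable Diffusion 3.5 Large. With 8.1 billion parameters, it is the most powerful model in the Stable Diffusion family, and it was trained on Amazon SageMaker HyperPod. Compared with earlier models, it delivers markedly better image quality and prompt adherence, and it is particularly well suited to the following uses:

Storyboarding: speed up the creation of storyboards and concept art.
Visual effects: rapidly prototype effects.
Efficient production: generate high-quality 1-megapixel images for advertising, social media content, and campaigns.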

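Stable Diffusion 3.5 Large features

Versatile styles: generates images in many visual styles, including 3D, photography, painting, and line art.
Prompt adherence: follows text prompts closely, so the output matches the description.
Diverse outputs: produces images that reflect the diversity of the world without requiring elaborate prompts.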

Stable Image Ultra

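Amazon Bedrock has updated the Stable Image Ultra model to version 1.1, which incorporates Stable Diffusion 3.5 Large technology. The new Stable Image Ultra improves image generation in the following areas:

Excellent typography
Creative composition of complex scenes
Dynamic lighting and vibrant colors
Overall cohesion of artistic style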

Getting started with Amazon Bedrock

1. Enable model access

In the Amazon Bedrock console, enable access to the Stability AI models and select "Stable Diffusion 3.5 Large".
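As a quick cross-check from code, the sketch below lists the Stability AI model IDs that Amazon Bedrock reports in a Region. It is only an illustration: the helper name stability_model_ids is hypothetical, and it assumes that Stability AI model IDs start with "stability.", as the two IDs used in this post do. A model appearing in this list does not by itself confirm that access has been granted in the console.

```python
def stability_model_ids(bedrock_client):
    """Return the sorted IDs of Stability AI models reported by Bedrock.

    bedrock_client is a Boto3 "bedrock" (control plane) client.
    Filtering on the "stability." prefix is an assumption based on the
    model IDs that appear in this post.
    """
    summaries = bedrock_client.list_foundation_models()["modelSummaries"]
    return sorted(
        m["modelId"] for m in summaries if m["modelId"].startswith("stability.")
    )
```

For example, stability_model_ids(boto3.client("bedrock", region_name="us-west-2")) should include stability.sd3-5-large-v1:0 if the model is offered in that Region.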

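To test the Stability AI models in Amazon Bedrock, choose Image under Playgrounds in the left menu pane. Then choose Select model, choose Stability AI as the category, and choose Stable Diffusion 3.5 Large as the model.

You can generate an image from a prompt. Here is a sample prompt: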

High-energy street scene in a neon-lit Tokyo alley at night, where steam rises from food carts, and colorful neon signs illuminate the rain-slicked pavement.

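2. Generate an image with a sample command

The following command shows how to use the AWS CLI to generate a sample image of a neon-lit Tokyo street scene: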

aws bedrock-runtime invoke-model \
    --model-id stability.sd3-5-large-v1:0 \
    --body "{\"text_prompts\":[{\"text\":\"High-energy street scene in a neon-lit Tokyo alley at night, where steam rises from food carts, and colorful neon signs illuminate the rain-slicked pavement.\",\"weight\":1}],\"cfg_scale\":0,\"steps\":10,\"seed\":0,\"width\":1024,\"height\":1024,\"samples\":1}" \
    --cli-binary-format raw-in-base64-out \
    --region us-west-2 \
    /dev/stdout | jq -r '.images[0]' | base64 --decode > img.jpg
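The same call can be sketched with Boto3. This version is an assumption-laden illustration, not a confirmed recipe: it assumes that stability.sd3-5-large-v1:0 accepts the prompt/mode request body that the Stable Image Ultra example in this post uses (rather than the text_prompts body in the CLI command), that seed is an accepted optional field, and that the response carries base64-encoded images under an images key. The helper names and the img.png output path are hypothetical. Check the field names against the Amazon Bedrock model parameter reference before relying on them.

```python
import base64
import json

MODEL_ID = "stability.sd3-5-large-v1:0"  # model ID taken from the CLI command


def build_body(prompt, seed=0):
    """Build the JSON request body (field names are assumptions, see above)."""
    return json.dumps({"prompt": prompt, "mode": "text-to-image", "seed": seed})


def extract_image(response_body):
    """Decode the first base64-encoded image from the raw JSON response."""
    model_response = json.loads(response_body)
    return base64.b64decode(model_response["images"][0])


def generate_image(prompt, path="img.png", region="us-west-2"):
    """Invoke the model and write the decoded image bytes to path."""
    import boto3  # imported lazily so the helpers above work without AWS access

    client = boto3.client("bedrock-runtime", region_name=region)
    response = client.invoke_model(modelId=MODEL_ID, body=build_body(prompt))
    with open(path, "wb") as f:
        f.write(extract_image(response["body"].read()))
    return path
```

Keeping the body construction and response decoding in separate functions makes them easy to check without calling the service.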

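The following example uses the AWS SDK for Python (Boto3) with Stable Image Ultra 1.1, which includes Stable Diffusion 3.5 Large in its underlying architecture.

This simple application interactively asks for a text-to-image prompt and then calls Amazon Bedrock to generate the image, with stability.stable-image-ultra-v1:1 as the model ID.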

import base64
import boto3
import json
import os

MODEL_ID = "stability.stable-image-ultra-v1:1"

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-west-2")

print("Enter a prompt for the text-to-image model:")
prompt = input()

body = {
    "prompt": prompt,
    "mode": "text-to-image"
}
response = bedrock_runtime.invoke_model(modelId=MODEL_ID, body=json.dumps(body))

model_response = json.loads(response["body"].read())

base64_image_data = model_response["images"][0]

# Find the first unused file name of the form img_<number>.png
i, output_dir = 1, "output"
if not os.path.exists(output_dir):
    os.makedirs(output_dir)
while os.path.exists(os.path.join(output_dir, f"img_{i}.png")):
    i += 1

image_data = base64.b64decode(base64_image_data)

image_path = os.path.join(output_dir, f"img_{i}.png")
with open(image_path, "wb") as file:
    file.write(image_data)

print(f"The generated image has been saved to {image_path}")

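The application writes the generated image to an output directory, which it creates if it does not already exist. To avoid overwriting existing files, the code checks which files are present and uses the first available file name of the form img_<number>.png.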
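The file-naming step can also be isolated in a small helper, which makes the behavior easier to see and to reuse. This is a minimal sketch of the same logic; the function name next_image_path and its parameters are hypothetical and not part of the original application.

```python
import os


def next_image_path(output_dir="output", prefix="img_", ext=".png"):
    """Return the first unused path of the form <output_dir>/img_<number>.png.

    Creates output_dir if it does not exist. Numbering starts at 1.
    """
    os.makedirs(output_dir, exist_ok=True)
    i = 1
    while os.path.exists(os.path.join(output_dir, f"{prefix}{i}{ext}")):
        i += 1
    return os.path.join(output_dir, f"{prefix}{i}{ext}")
```

Because the helper only scans upward from 1, it fills the lowest free number, so deleting img_1.png means the next image reuses that name.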