Input

prompt *

The text prompt used for video generation

image_url *

Supported formats: JPEG, PNG, WEBP. Maximum file size: 10MB.

URL of the input image. If the input image does not match the chosen aspect ratio, it is resized and center cropped
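The format and size limits above can be checked locally before uploading. The following is a minimal sketch, not part of the service: the function name `check_image_file`, the mapping from format names to file extensions, and the reading of "10MB" as 10 × 1024 × 1024 bytes are all assumptions.

```python
from pathlib import Path

# Assumption: the documented formats map to these file extensions.
ALLOWED_IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}
# Assumption: "10MB" is read as 10 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 10 * 1024 * 1024


def check_image_file(filename: str, size_bytes: int) -> None:
    """Raise ValueError if the file breaks the documented image limits."""
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_IMAGE_EXTENSIONS:
        raise ValueError(f"unsupported image format: {ext or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError(f"file is {size_bytes} bytes; limit is {MAX_UPLOAD_BYTES}")


if __name__ == "__main__":
    check_image_file("portrait.png", 2_000_000)  # passes silently
```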

audio_url *

Supported formats: MP3, WAV, OGG, M4A, FLAC, AAC, X-MS-WMA, MPEG. Maximum file size: 10MB.

The URL of the audio file

num_frames

Number of frames to generate. Must be between 40 and 120 and a multiple of 4
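As an illustration of the num_frames constraint, the sketch below validates a value and, optionally, snaps an arbitrary request to the nearest valid value at or below it. Both helper names are hypothetical, and treating 40 and 120 as inclusive bounds is an assumption.

```python
def validate_num_frames(num_frames: int) -> int:
    """Return num_frames unchanged if it meets the documented constraints."""
    # Assumption: the bounds 40 and 120 are inclusive.
    if not 40 <= num_frames <= 120:
        raise ValueError("num_frames must be between 40 and 120")
    if num_frames % 4 != 0:
        raise ValueError("num_frames must be a multiple of 4")
    return num_frames


def snap_num_frames(requested: int) -> int:
    """Clamp to [40, 120], then round down to a multiple of 4."""
    clamped = max(40, min(120, requested))
    return clamped - (clamped % 4)


if __name__ == "__main__":
    print(snap_num_frames(81))  # 80
```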

frames_per_second

Frames per second of the generated video. Must be between 4 and 60. When interpolation is used and adjust_fps_for_interpolation is set to true (the default), the final FPS is this value multiplied by the number of interpolated frames plus one. For example, if the generated frames per second is 16 and the number of interpolated frames is 1, the final frames per second will be 32. If adjust_fps_for_interpolation is set to false, this value is used as-is
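The FPS arithmetic described above can be written out as a short sketch. The function name `final_fps` and the argument name `num_interpolated_frames` are assumptions for illustration; this page does not list the parameter that sets the number of interpolated frames.

```python
def final_fps(
    frames_per_second: int,
    num_interpolated_frames: int = 0,  # hypothetical name, not from the parameter list
    adjust_fps_for_interpolation: bool = True,
) -> int:
    """Compute the output frame rate from the documented rule."""
    if not 4 <= frames_per_second <= 60:
        raise ValueError("frames_per_second must be between 4 and 60")
    if num_interpolated_frames < 0:
        raise ValueError("num_interpolated_frames cannot be negative")
    if adjust_fps_for_interpolation:
        # Generated FPS times (interpolated frames + 1).
        return frames_per_second * (num_interpolated_frames + 1)
    return frames_per_second


if __name__ == "__main__":
    print(final_fps(16, 1))         # 32, the example from the text
    print(final_fps(16, 1, False))  # 16, value used as-is
```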

resolution

Resolution of the generated video (480p, 580p, or 720p)

negative_prompt

Negative prompt for video generation

seed

Random seed for reproducibility. If None, a random seed is chosen

num_inference_steps

Number of inference steps for sampling. Higher values give better quality but take longer

guidance_scale

Classifier-free guidance scale. Higher values give better adherence to the prompt but may decrease quality

shift

Shift value for the video. Must be between 1.0 and 10.0

enable_safety_checker

If set to true, input data will be checked for safety before processing

nsfw_checker

A configurable parameter. Defaults to true in the Playground.
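To show how the input fields fit together, here is a minimal sketch that assembles them into a dictionary and checks the ranges stated above. The helper `build_input` is hypothetical. How the dictionary is wrapped and where it is sent are not specified on this page, so the sketch stops at building the fields.

```python
from typing import Any, Dict

ALLOWED_RESOLUTIONS = {"480p", "580p", "720p"}
OPTIONAL_FIELDS = {
    "num_frames", "frames_per_second", "resolution", "negative_prompt",
    "seed", "num_inference_steps", "guidance_scale", "shift",
    "enable_safety_checker", "nsfw_checker",
}


def build_input(prompt: str, image_url: str, audio_url: str, **options: Any) -> Dict[str, Any]:
    """Assemble the documented input fields, checking the stated ranges."""
    required = {"prompt": prompt, "image_url": image_url, "audio_url": audio_url}
    for name, value in required.items():
        if not value:
            raise ValueError(f"{name} is required")

    unknown = set(options) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")

    num_frames = options.get("num_frames")
    if num_frames is not None and (not 40 <= num_frames <= 120 or num_frames % 4 != 0):
        raise ValueError("num_frames must be a multiple of 4 between 40 and 120")

    fps = options.get("frames_per_second")
    if fps is not None and not 4 <= fps <= 60:
        raise ValueError("frames_per_second must be between 4 and 60")

    resolution = options.get("resolution")
    if resolution is not None and resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError("resolution must be 480p, 580p, or 720p")

    shift = options.get("shift")
    if shift is not None and not 1.0 <= shift <= 10.0:
        raise ValueError("shift must be between 1.0 and 10.0")

    # Leave out unset options so the service can apply its own defaults.
    payload: Dict[str, Any] = dict(required)
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload
```

Passing `seed=None` simply omits the field, which matches the documented behavior that a random seed is chosen when none is given.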

Output

output type: video

README

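Wan 2.2 A14B Speech-to-Video API: Turn Audio into Compelling Video

Elevate your digital storytelling with the Wan 2.2 A14B Turbo speech-to-video API. This AI model turns a still image and an audio clip into dynamic, expressive video, and is aimed at creators, marketers, and educators. It is available now on Kie.ai, with straightforward integration and high-quality video generation.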

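What Is the Wan 2.2 A14B Speech-to-Video API?

Wan 2.2 A14B API is an advanced open-source AI model optimized for speech-to-video generation. It has three core strengths:

Audio-driven animation:

Automatically synchronizes audio with the visuals, producing lifelike motion from a single image and one audio clip.

High-definition output:

Supports 480p to 720p output with a clear, professional picture suited to a range of applications.

Mixture-of-Experts architecture:

Built on a 14-billion-parameter Mixture-of-Experts (MoE) architecture for efficient, high-fidelity generation.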

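Key Features of the Wan 2.2 A14B Speech-to-Video API

Innovative:

Audio-to-video performance: The Wan 2.2 A14B speech-to-video API turns an audio clip and a still image into realistic animation with natural, accurate motion and expressions. Its synchronization captures emotional detail and supports immersive storytelling, which suits cinematic content production.

Crystal clear:

High-resolution rendering: The Wan 2.2 API generates sharp 480p to 720p video with smooth 24 fps playback. It delivers professional-grade quality even on ordinary hardware and suits high-definition use in marketing, education, and similar settings.

Fast:

Rapid processing: Wan 2.2 A14B API speeds up video generation through optimized inference; a 720p video can be generated in about 20 to 48 seconds. Its MoE architecture substantially improves efficiency, helping creators iterate quickly under tight deadlines.

Coordinated motion:

Advanced lip sync: Wan 2.2 A14B Turbo API keeps audio and picture in sync, mapping phonemes precisely to natural mouth shapes and facial expressions. It supports a range of accents and emotions for realistic performances across languages.

Customizable:

LoRA integration: The Wan 2.2 API supports personalized fine-tuning with LoRA adapters. Style customization is possible with little VRAM and without retraining the full model, so creators can produce branded or experimental videos.

Efficient:

MoE architecture: The Wan 2.2 A14B speech-to-video API is built on a 14-billion-parameter MoE architecture for efficient generation and supports text-to-video and image-to-video modes. It also maintains frame-to-frame consistency and supports bilingual subtitle overlays, which suits scalable applications.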

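How Do You Use the Wan 2.2 A14B Speech-to-Video API?

Getting started takes only a few steps.

Sign up and get API access:

Register on Kie.ai to obtain an API key for the Wan 2.2 A14B Turbo speech-to-video API.

Prepare your inputs:

Upload a still image and an audio clip, making sure both meet the supported format requirements.

Generate the video:

Submit a request to the API endpoint, specifying the resolution and other parameters.

Download and refine:

Download the output video, and optionally customize results with LoRA.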

Frequently Asked Questions

Find answers to common questions about this service here.

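What is the Wan 2.2 A14B Turbo API?

Which model versions does the Wan 2.2 A14B API offer?

How does Turbo mode improve video generation performance?

Do I need a local GPU to run the Wan 2.2 A14B API?

Can I try the Wan AI API for free?

What resolutions and frame rates does the Wan 2.2 A14B Turbo API support?

What are the differences between Wan 2.2 A14B and Wan 2.1?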