Converts 2D images into 3D models (.glb/.obj) using Hunyuan3D-2.1. Use this skill when the user explicitly asks to generate or create a 3D asset from a picture.
Install
Install via the SkillsCat registry:
npx skillscat add catfishw/i23dagentskill
Image to 3D Agent Skill
This skill allows you (the AI agent) to convert 2D images into 3D meshes with PBR textures. It is powered by Hunyuan3D-2.1.
When to use
Use this skill when the user explicitly asks to generate, create, or build a 3D asset, model, or mesh from a 2D image or picture.
How to Execute
You have two ways to interact with this skill. Since the code is located in the same directory as this file, you can run the scripts directly:
Option 1: CLI Wrapper (Terminal)
You can execute the Node.js CLI script directly:
# Generate a 3D model with background removal and textures (default)
node ./bin/cli.js /path/to/input.png -o /path/to/output.glb
# Disable background removal and textures (faster)
node ./bin/cli.js /path/to/input.png -o /path/to/output.glb --no-bg --no-texture
Note: The CLI connects to the backend API (default: http://localhost:23555/I23D). Generation takes 1-3 minutes.
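If you invoke the CLI from code rather than typing it, it can help to assemble the argument list in one place. The following is a minimal sketch, not part of the skill itself: `buildCliArgs` and its option names are hypothetical, and it uses only the flags shown above (`-o`, `--no-bg`, `--no-texture`).

```javascript
// Hypothetical helper (not shipped with the skill): builds the argument
// array for `node ./bin/cli.js` using only the flags documented above.
function buildCliArgs(inputPath, outputPath, options = {}) {
  // Both features are on by default, matching the CLI's documented default.
  const { removeBackground = true, textures = true } = options;
  const args = ['./bin/cli.js', inputPath, '-o', outputPath];
  if (!removeBackground) args.push('--no-bg');
  if (!textures) args.push('--no-texture');
  return args;
}

// Example: the faster variant with background removal and textures disabled.
const fastArgs = buildCliArgs('/path/to/input.png', '/path/to/output.glb', {
  removeBackground: false,
  textures: false,
});
console.log(['node', ...fastArgs].join(' '));
```

The resulting array can be handed to Node's `child_process.execFile('node', fastArgs, ...)`. Because generation takes 1-3 minutes, any timeout set there should be generous.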
Option 2: Model Context Protocol (MCP) Server
If your agent framework supports MCP, you can connect to the server by executing:
node ./mcp.js
This will expose the following tools to you over stdio:
- check_server_status: Verify the backend is reachable.
- generate_3d_model: Takes an imagePath and an outputPath. Returns success or semantic error messages.
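For orientation, this is roughly what a call to `generate_3d_model` looks like on the wire. MCP uses JSON-RPC 2.0, and a tool is invoked with the `tools/call` method. The sketch assumes the tool's arguments are named exactly `imagePath` and `outputPath`, as described above; the server's full input schema is not shown here, and `buildGenerateRequest` is a hypothetical helper. An MCP client library would normally build this message for you and also perform the required `initialize` handshake first.

```javascript
// Hypothetical helper: builds an MCP `tools/call` request (JSON-RPC 2.0)
// for the generate_3d_model tool. Argument names are taken from the tool
// description above and are an assumption about the server's schema.
function buildGenerateRequest(id, imagePath, outputPath) {
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: {
      name: 'generate_3d_model',
      arguments: { imagePath, outputPath },
    },
  };
}

// The stdio transport sends one JSON message per line.
const request = buildGenerateRequest(1, '/path/to/input.png', '/path/to/output.glb');
process.stdout.write(JSON.stringify(request) + '\n');
```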
API Configuration
By default, this skill connects to a local backend at http://localhost:23555/I23D.
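The backend URL can come from three places in this section: the CLI's `-u` flag, the `I23D_API_URL` environment variable, or the default. The sketch below shows one way to resolve them. It assumes the precedence is flag, then environment variable, then default, which this document does not state explicitly; `resolveApiUrl` is a hypothetical helper, and the trailing-slash normalization is an illustrative choice rather than documented behavior.

```javascript
const DEFAULT_API_URL = 'http://localhost:23555/I23D';

// Hypothetical helper. Assumed precedence (not confirmed by the skill):
// CLI flag > I23D_API_URL environment variable > default.
function resolveApiUrl(cliUrl, env = process.env) {
  const raw = cliUrl || env.I23D_API_URL || DEFAULT_API_URL;
  // new URL() throws a TypeError on malformed input, so typos fail early.
  const url = new URL(raw);
  // Drop trailing slashes so ".../I23D" and ".../I23D/" resolve identically.
  return url.href.replace(/\/+$/, '');
}

console.log(resolveApiUrl(undefined, {}));
```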
To use a remote backend (e.g., https://mc.agaii.org/I23D/), set the environment variable:
export I23D_API_URL=https://mc.agaii.org/I23D
Or pass it directly to the CLI:
node ./bin/cli.js /path/to/input.png -o output.glb -u https://mc.agaii.org/I23D
Important Rules & Constraints
- Backend Required: This skill requires a running Hunyuan3D-2.1 backend API. Start it locally by running the following in the backend folder:
  python api_server_enhanced.py --port 23555
- Patience: 3D generation is computationally heavy. Expect 60 seconds to 3 minutes per model.
- Format: Only .glb (glTF binary) format is officially supported for the output.
- Input Constraints: The image should ideally feature a single, clear subject in the center. While the tool attempts background removal automatically, highly cluttered images might fail.
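Some of these rules can be checked before a request is sent. The following is a minimal sketch under stated assumptions: `checkRequest` and `GENERATION_TIMEOUT_MS` are hypothetical names, and the 5-minute timeout is an illustrative value chosen to sit above the 3-minute upper estimate, not a setting taken from the skill.

```javascript
// Hypothetical pre-flight check: returns a list of problems, empty if none.
function checkRequest(inputPath, outputPath) {
  const problems = [];
  if (!inputPath) {
    problems.push('no input image path given');
  }
  // Only .glb output is officially supported.
  if (!/\.glb$/i.test(outputPath || '')) {
    problems.push('output path should end in .glb');
  }
  return problems;
}

// Illustrative client-side timeout: generation is expected to take
// 60 seconds to 3 minutes, so 5 minutes leaves some headroom.
const GENERATION_TIMEOUT_MS = 5 * 60 * 1000;

console.log(checkRequest('/path/to/input.png', '/path/to/output.obj'));
```

Image content (a single, centered subject) cannot be verified this way; that still needs a look at the picture itself.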