HunyuanImage-3.0-Pixelart-Style-Adapter

Example prompts:

  • Astronaut riding a horse on the moon. In pixelart style.
  • Person in hanfu. In pixelart style.
  • Landscape with mountains, trees, and a river. In pixelart style.
  • A portrait of Bill Gates wearing a black shirt. In pixelart style.
  • A young boy in a forest with trees. In pixelart style.
  • A young girl flying a kite on a windy hill. In pixelart style.

Trigger words

Append the trigger phrase "In pixelart style." to the end of your prompt to apply the pixel-art style.

Training

This adapter was trained using this repo.

Configurations:

  • Hardware: Single NVIDIA A100 (80GB)
  • Dataset: 50 images, downscaled to 256×256 resolution
  • Training time: ~3 hours
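
For reference, the sketch below shows one way such a downscaling step could be done with Pillow. This is an illustrative assumption, not the training repo's actual preprocessing pipeline, and the folder names are hypothetical.

from pathlib import Path
from PIL import Image

# Hypothetical folders; adjust to your own dataset layout.
src_dir = Path("dataset/raw")
dst_dir = Path("dataset/256")
dst_dir.mkdir(parents=True, exist_ok=True)

# Downscale every image to the 256x256 resolution used for training.
for img_path in src_dir.glob("*"):
    if img_path.suffix.lower() not in {".png", ".jpg", ".jpeg"}:
        continue
    img = Image.open(img_path).convert("RGB")
    img = img.resize((256, 256), Image.Resampling.LANCZOS)
    img.save(dst_dir / (img_path.stem + ".png"))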

License

This adapter is a Model Derivative of tencent/HunyuanImage-3.0 and is distributed under the Tencent Hunyuan Community License Agreement.

Use of these weights must comply with the Tencent Hunyuan Community License and its territorial and usage restrictions. See the LICENSE and NOTICE files in this repository for details.

This project is not affiliated with, associated with, sponsored by, or endorsed by Tencent.

Usage and Restrictions

  • This model/adapter is provided for non-commercial research and personal use only.
    Commercial use (including using the model or its outputs in a paid product, service, or large-scale deployment) is not permitted without obtaining appropriate permissions and verifying all relevant rights.

  • Users must comply with the Tencent Hunyuan Community License Agreement, including its territory and acceptable-use limitations.

  • You are solely responsible for ensuring that your use of this adapter and any generated outputs respects third-party IP and applicable laws in your jurisdiction.


Model Details

Model Description

  • Developed by: Pixo
  • Model type: LoRA/adapter for style transfer
  • License: Tencent Hunyuan Community License Agreement
  • Finetuned from model: tencent/HunyuanImage-3.0
  • Language(s): English prompts

Uses

Direct Use

  • Apply a Pixel Art style aesthetic to images generated by HunyuanImage-3.0.
  • Use by loading the base model and applying this adapter.
  • Trigger phrase: "In pixelart style." (must appear at the end of the prompt)

Out-of-Scope Use

  • Harmful, deceptive, or NSFW content.
  • Any use that violates the Tencent Hunyuan Community License Agreement.
  • Commercial use of the model or its outputs.

Bias, Risks, and Limitations

  • Trained on a small number of images (50), so the style may overfit.
  • May distort anatomy or realism on complex subjects.
  • The style is intentionally exaggerated, in an anime-like fashion.

How to Get Started with the Model

1️⃣ Clone the HunyuanImage-3.0 repo and download the base model

git clone https://github.com/Tencent-Hunyuan/HunyuanImage-3.0.git
cd HunyuanImage-3.0/

# Download base model
hf download tencent/HunyuanImage-3.0 --local-dir ./HunyuanImage-3

2️⃣ Download the adapter

# Download from HuggingFace
hf download pixosg/HunyuanImage-3.0-Pixelart-Style-Adapter --local-dir ./hunyuanimage-3-pixelart-style-adapter
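
If you prefer to stay in Python, the same download can be done with huggingface_hub (a sketch; the CLI command above is equivalent):

from huggingface_hub import snapshot_download

# Fetch the adapter repository into the local directory used in step 3.
snapshot_download(
    repo_id="pixosg/HunyuanImage-3.0-Pixelart-Style-Adapter",
    local_dir="./hunyuanimage-3-pixelart-style-adapter",
)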

3️⃣ Load the base model and adapter

from peft import PeftModel
from hunyuan_image_3.hunyuan import HunyuanImage3ForCausalMM
import torch

model_id = "./HunyuanImage-3"
adapter_model_path = "./hunyuanimage-3-pixelart-style-adapter"

kwargs = dict(
    attn_implementation="sdpa", # Use "flash_attention_2" if FlashAttention is installed
    trust_remote_code=True,
    dtype=torch.bfloat16,
    device_map="auto",
    moe_impl="eager",
    moe_drop_tokens=True,
)

model = HunyuanImage3ForCausalMM.from_pretrained(model_id, **kwargs)
model.load_tokenizer(model_id)

# Apply the adapter with ONE of the two options below.

# Option 1: use the model's built-in adapter loader
model.load_adapter(adapter_model_path)

# Option 2: wrap the model with PEFT (expose the input embeddings first,
# since PeftModel expects get/set_input_embeddings on the wrapped model)
model.get_input_embeddings = lambda: model.model.wte
model.set_input_embeddings = lambda value: setattr(model.model, 'wte', value)
model = PeftModel.from_pretrained(model, adapter_model_path, trust_remote_code=True)

# Generate image
prompt = "Astronaut riding a horse on the moon. In pixelart style."
image = model.generate_image(prompt=prompt, stream=True)
image.save("image.png")

4️⃣ Optimized Inference (8-bit Quantization)

While standard BF16 inference typically requires 3x NVIDIA A100 (80GB) GPUs, this 8-bit quantized configuration enables high-quality generation on a single NVIDIA H200 (141GB).

import torch
from transformers import BitsAndBytesConfig
from peft import PeftModel
from hunyuan_image_3.hunyuan import HunyuanImage3ForCausalMM

# ------------------------------------------------------------
# Patch (apply BEFORE model loading)
# Keep attention mask on the same device as the input tensor.
# ------------------------------------------------------------
_orig_prepare = HunyuanImage3ForCausalMM._prepare_attention_mask_for_generation

def _prepare_attention_mask_for_generation_patched(self, inputs_tensor, generation_config, model_kwargs):
    attn_mask = _orig_prepare(self, inputs_tensor, generation_config, model_kwargs)
    if attn_mask is not None and attn_mask.device != inputs_tensor.device:
        attn_mask = attn_mask.to(device=inputs_tensor.device)
    return attn_mask

HunyuanImage3ForCausalMM._prepare_attention_mask_for_generation = _prepare_attention_mask_for_generation_patched
# ------------------------------------------------------------

skip_modules = [
    "vae",
    "vision_model",
    "vision_aligner",
    "patch_embed",
    "timestep_emb",
    "time_embed",
    "time_embed_2",
    "final_layer",
    "lm_head",
]

quant_config = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_threshold=6.0,
    llm_int8_skip_modules=skip_modules,
    llm_int8_enable_fp32_cpu_offload=True,
)

model_id = "./HunyuanImage-3"
adapter_model_path = "./hunyuanimage-3-pixelart-style-adapter"

kwargs = dict(
    attn_implementation="sdpa", # Use "flash_attention_2" if available
    trust_remote_code=True,
    quantization_config=quant_config,
    dtype="auto",
    device_map="auto",
    moe_impl="eager",
    moe_drop_tokens=False,
)

model = HunyuanImage3ForCausalMM.from_pretrained(model_id, **kwargs)
model.load_tokenizer(model_id)

# Apply LoRA adapter
model.get_input_embeddings = lambda: model.model.wte
model.set_input_embeddings = lambda value: setattr(model.model, 'wte', value)
model = PeftModel.from_pretrained(model, adapter_model_path, trust_remote_code=True)

# Generate image
prompt = "Astronaut riding a horse on the moon. In pixelart style."
image = model.generate_image(prompt=prompt, stream=True)
image.save("image.png")