APT: Adaptive Personalized Training for Diffusion Models with Limited Data

  • Jung Woo Chae
  • , Jiyoon Kim
  • , Jae Woong Choi
  • , Kyungyul Kim
  • , Sangheum Hwang

Research output: Contribution to journalConference articlepeer-review

Abstract

Personalizing diffusion models using limited data presents significant challenges, including overfitting, loss of prior knowledge, and degradation of text alignment. Overfitting leads to shifts in the noise prediction distribution, disrupting the denoising trajectory and causing the model to lose semantic coherence. In this paper, we propose Adaptive Personalized Training (APT), a novel framework that mitigates overfitting by employing adaptive training strategies and regularizing the model's internal representations during fine-tuning. APT consists of three key components: (1) Adaptive Training Adjustment, which introduces an overfitting indicator to detect the degree of overfitting at each time step bin and applies adaptive data augmentation and adaptive loss weighting based on this indicator; (2) Representation Stabilization, which regularizes the mean and variance of intermediate feature maps to prevent excessive shifts in noise prediction; and (3) Attention Alignment for Prior Knowledge Preservation, which aligns the cross-attention maps of the fine-tuned model with those of the pre-trained model to maintain prior knowledge and semantic coherence. Through extensive experiments, we demonstrate that APT effectively mitigates overfitting, preserves prior knowledge, and outperforms existing methods in generating high-quality, diverse images with limited reference data.

Original languageEnglish
Pages (from-to)28619-28628
Number of pages10
JournalProceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
DOIs
StatePublished - 2025
Event2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2025 - Nashville, United States
Duration: 11 Jun 202515 Jun 2025

Fingerprint

Dive into the research topics of 'APT: Adaptive Personalized Training for Diffusion Models with Limited Data'. Together they form a unique fingerprint.

Cite this