Framing the Sequence: Genre-Aligned Photo Curation via Shot-Scale Embedding

Youngsup Park, Yangmi Lim, Dongwann Kang

Research output: Contribution to journal › Article › peer-review

Abstract

This paper presents a lightweight, genre-conditioned photo curation framework that restructures user-selected image sequences based on cinematic shot scale patterns. Unlike prior frame-level approaches, our method explicitly models sequential rhythm and genre style. The proposed pipeline integrates (1) a MobileNetV3-based shot scale classifier optimized for on-device efficiency, (2) a conditional variational autoencoder (cVAE) for embedding temporal shot rhythms conditioned on genre, and (3) a similarity-driven adaptation module that adjusts sequences through swap and crop operations guided by latent distance reduction. Deployed as an iOS application, the system processes an 8-image sequence in ~2.02 s with a footprint under 3 MB. Quantitative evaluations show that the classifier achieved 69.9% Top-1 accuracy (F1 = 0.646), and that adaptation reduced latent distance by 22.7% compared to shuffled baselines. On-device tests confirmed practical feasibility. A user study (n = 24) using Likert ratings revealed that the method improved rhythm perception among film/media experts, though effects on genre recognition and preference were less consistent for general users. Overall, this work contributes a novel, style-aware, and mobile-ready sequencing framework that advances beyond prior frame-level methods and supports applications in memory curation, interactive storytelling, and mobile authoring.
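
The swap-based part of the adaptation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `embed` function is a hand-made stand-in for the cVAE encoder, the integer shot-scale coding (0 = close-up, 1 = medium, 2 = long) and the target pattern are hypothetical, and crop operations are omitted. The sketch only shows the general idea of accepting a swap when it reduces the distance to a genre-specific target in embedding space.

```python
from itertools import combinations
import math


def embed(scales):
    """Stand-in embedding (assumption, not the paper's cVAE):
    the shot-scale values followed by their step-to-step changes."""
    diffs = [b - a for a, b in zip(scales, scales[1:])]
    return [float(x) for x in scales] + [float(d) for d in diffs]


def distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def adapt_by_swaps(scales, target, max_swaps=4):
    """Greedily swap pairs of positions while the distance to the
    target embedding keeps decreasing. Returns (sequence, distance)."""
    current = list(scales)
    target_vec = embed(target)
    best = distance(embed(current), target_vec)
    for _ in range(max_swaps):
        best_pair = None
        for i, j in combinations(range(len(current)), 2):
            # Try the swap, measure, then undo it.
            current[i], current[j] = current[j], current[i]
            d = distance(embed(current), target_vec)
            current[i], current[j] = current[j], current[i]
            if d < best - 1e-12:
                best, best_pair = d, (i, j)
        if best_pair is None:
            break  # no single swap improves the distance
        i, j = best_pair
        current[i], current[j] = current[j], current[i]
    return current, best


if __name__ == "__main__":
    photos = [2, 0, 1, 2, 0, 1, 0, 2]   # hypothetical shot scales of 8 photos
    pattern = [2, 2, 1, 1, 0, 0, 1, 2]  # hypothetical genre target pattern
    print(adapt_by_swaps(photos, pattern))
```

A greedy search is used here only because it is short. Whether the published system searches swaps greedily, exhaustively, or otherwise is not stated in the abstract.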

Original language: English
Article number: 3434
Journal: Electronics (Switzerland)
Volume: 14
Issue number: 17
State: Published - Sep 2025

Keywords

  • genre conditioning
  • on-device inference
  • photo curation
  • sequence embedding
  • shot scale
