Abstract
This paper presents a lightweight, genre-conditioned photo curation framework that restructures user-selected image sequences based on cinematic shot scale patterns. Unlike prior frame-level approaches, our method explicitly models sequential rhythm and genre style. The proposed pipeline integrates (1) a MobileNetV3-based shot scale classifier optimized for on-device efficiency, (2) a conditional variational autoencoder (cVAE) for embedding temporal shot rhythms conditioned on genre, and (3) a similarity-driven adaptation module that adjusts sequences through swap and crop operations guided by latent distance reduction. Deployed as an iOS application, the system processes an 8-image sequence in ~2.02 s with a footprint under 3 MB. Quantitative evaluations show that the classifier achieved 69.9% Top-1 accuracy (F1 = 0.646), and that adaptation reduced latent distance by 22.7% compared to shuffled baselines. On-device tests confirmed practical feasibility. A user study (n = 24) using Likert ratings revealed that the method improved rhythm perception among film/media experts, though effects on genre recognition and preference were less consistent for general users. Overall, this work contributes a novel, style-aware, and mobile-ready sequencing framework that advances beyond prior frame-level methods and supports applications in memory curation, interactive storytelling, and mobile authoring.
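The swap part of the adaptation step can be illustrated with a minimal sketch, offered under stated assumptions rather than as the authors' implementation. Everything named here is hypothetical: `embed` is a stand-in for the cVAE encoder (it simply returns the shot-scale indices as a vector), the three scale classes (0 = close, 1 = medium, 2 = long) are illustrative, the target vector plays the role of a genre-conditioned latent, and the greedy search is one plausible way to "reduce latent distance" by swapping. Crop operations are omitted.

```python
from itertools import combinations
from math import dist


def embed(scales):
    # Hypothetical stand-in for the cVAE encoder: the raw shot-scale
    # indices are used directly as the "latent" vector.
    return [float(s) for s in scales]


def adapt_by_swaps(scales, target, max_steps=10):
    """Greedily apply the pairwise swap that most reduces the Euclidean
    distance between the sequence embedding and `target`; stop when no
    swap improves the distance or `max_steps` is reached."""
    seq = list(scales)
    best = dist(embed(seq), target)
    for _ in range(max_steps):
        best_pair = None
        for i, j in combinations(range(len(seq)), 2):
            seq[i], seq[j] = seq[j], seq[i]      # try the swap
            d = dist(embed(seq), target)
            seq[i], seq[j] = seq[j], seq[i]      # undo it
            if d < best:
                best, best_pair = d, (i, j)
        if best_pair is None:
            break                                # no improving swap left
        i, j = best_pair
        seq[i], seq[j] = seq[j], seq[i]          # commit the best swap
    return seq, best


# Illustrative 8-image sequence and an assumed genre target rhythm.
scales = [2, 0, 1, 2, 0, 1, 0, 2]
target = [0.0, 0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 2.0]
adapted, final_distance = adapt_by_swaps(scales, target)
```

Because only swaps are applied, the adapted sequence is always a permutation of the user's original selection; the search changes the order of shot scales, not the set of images.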
| Original language | English |
|---|---|
| Article number | 3434 |
| Journal | Electronics (Switzerland) |
| Volume | 14 |
| Issue number | 17 |
| State | Published - Sep 2025 |
Keywords
- genre conditioning
- on-device inference
- photo curation
- sequence embedding
- shot scale
Title
Framing the Sequence: Genre-Aligned Photo Curation via Shot-Scale Embedding