Abstract
High-quality image datasets are in high demand for various applications. Although many online sources provide manually collected datasets, fully automating the dataset collection process remains a persistent challenge. In this study, we survey the field of automatic image dataset generation by analyzing a collection of existing studies. We also examine closely related areas, such as query expansion, web scraping, and dataset quality. We assess how both noise and regional search engine differences can be addressed using automated search query expansion focused on hypernyms, while still allowing for user-specific manual query expansion. Combining these aspects provides an outline of how a modern web scraping application can produce large-scale image datasets.
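To illustrate the idea of hypernym-focused query expansion with optional manual terms, the following is a minimal sketch and not the method of any surveyed study. The `HYPERNYMS` table, the `expand_query` function, and its parameters are hypothetical names introduced here for illustration; a real system would presumably draw hypernyms from a lexical database rather than a hand-written dictionary.

```python
from typing import Dict, Iterable, List

# Hypothetical hand-written hypernym table (assumption for illustration only).
# Each ambiguous term maps to a broader category that disambiguates it.
HYPERNYMS: Dict[str, str] = {
    "jaguar": "big cat",
    "crane": "bird",
    "python": "snake",
}


def expand_query(term: str, manual_terms: Iterable[str] = ()) -> List[str]:
    """Return search queries for `term`, in order and without duplicates."""
    key = term.strip().lower()
    queries = [key]

    # Automatic expansion: append the hypernym to reduce noisy results.
    hypernym = HYPERNYMS.get(key)
    if hypernym:
        queries.append(f"{key} {hypernym}")

    # Manual expansion: user-supplied terms, e.g. for regional variations.
    for extra in manual_terms:
        extra = extra.strip().lower()
        if extra:
            queries.append(f"{key} {extra}")

    # Remove duplicates while preserving the original order.
    seen = set()
    unique = []
    for query in queries:
        if query not in seen:
            seen.add(query)
            unique.append(query)
    return unique


print(expand_query("Jaguar", ["in the wild"]))
# → ['jaguar', 'jaguar big cat', 'jaguar in the wild']
```

Under these assumptions, each expanded query would be passed to an image search engine, with the hypernym steering results away from unrelated senses of the term (for example, the car brand rather than the animal).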
| Original language | English |
|---|---|
| Pages (from-to) | 602-613 |
| Number of pages | 12 |
| Journal | Journal of Information Processing Systems |
| Volume | 19 |
| Issue number | 4 |
| State | Published - 2023 |
Keywords
- Image Dataset Generation
- Query Expansion
- Web Scraping
Title
A Brief Survey into the Field of Automatic Image Dataset Generation through Web Scraping and Query Expansion