TY - JOUR
T1 - Web Scraping for Hospitality Research
T2 - Overview, Opportunities, and Implications
AU - Han, Saram
AU - Anderson, Christopher K.
N1 - Publisher Copyright: © The Author(s) 2020.
PY - 2021/2
Y1 - 2021/2
AB - As consumers increasingly research and purchase hospitality and travel services online, new research opportunities have become available to hospitality academics. There is a growing interest in understanding the online travel marketplace among hospitality researchers. Although many researchers have attempted to better understand the online travel market through the use of analytical models, experiments, or survey collection, these studies often fail to capture the full complexity of the market. Academics often rely upon survey data or experiments owing to their ease of collection or potentially to the difficulty in assembling online data. In this study, we hope to equip hospitality researchers with the tools and methods to augment their traditional data sources with the readily available data that consumers use to make their travel choices. In this article, we provide a guideline (and Python code) for how to best collect/scrape publicly available online hotel data. We focus on the collection of online data across numerous platforms, including online travel agents, review sites, and hotel brand sites. We outline some exciting possibilities regarding how these data sources might be utilized, as well as discuss some of the caveats that have to be considered when analyzing online data.
KW - data collection
KW - online review
KW - Python
KW - web scraping
UR - http://www.scopus.com/inward/record.url?scp=85096849869&partnerID=8YFLogxK
U2 - 10.1177/1938965520973587
DO - 10.1177/1938965520973587
M3 - Article
AN - SCOPUS:85096849869
SN - 1938-9655
VL - 62
SP - 89
EP - 104
JO - Cornell Hospitality Quarterly
JF - Cornell Hospitality Quarterly
IS - 1
ER -