Extracting Online Data through Web Scraping in Python
Wednesday, March 13, 2024 1pm to 2:30pm
About this Event
Central Campus
https://forms.office.com/Pages/ResponsePage.aspx?id=ZkN-XZsbz0WOebFLJ99G4SUXqdLlghJDhzZ5bd0gLPtUQktOMFYzSVJPNldUT01PTjU0WExRS1lJUC4u
Join this workshop to learn how to pull online data using advanced web scraping techniques in Python. You will learn how to fill out forms, traverse website pages, extract text reviews, and store online numeric data in a dataset for further analysis. The workshop includes an interactive demo that extracts hotel review data (e.g., numeric rating, review text, date of stay) from Tripadvisor and stores it in a Python pandas DataFrame. Using the Selenium library in Python, participants will interact with a webpage to pull the desired information.
Learning Aims:
- Installation instructions for performing web scraping in Python on a personal computer
- Format of HTML data and how to pull desired fields
- Interactive demo: Tripadvisor hotel reviews. Apply filters, traverse pages, extract text, and store it in a dataset for further analysis.
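As a rough preview of the second aim (the format of HTML data and how to pull desired fields), the sketch below parses a small, made-up HTML snippet with Python's standard-library html.parser. The class names (review, rating, stay-date, text), the sample reviews, and the ReviewParser helper are assumptions for illustration only; they are not Tripadvisor's actual markup and not necessarily the approach used in the workshop, which works with Selenium on a live page.

```python
from html.parser import HTMLParser

# Hypothetical markup: these class names and reviews are made up for
# illustration and do not reflect Tripadvisor's real page structure.
SAMPLE_HTML = """
<div class="review">
  <span class="rating">5</span>
  <span class="stay-date">March 2024</span>
  <p class="text">Great location and friendly staff.</p>
</div>
<div class="review">
  <span class="rating">3</span>
  <span class="stay-date">January 2024</span>
  <p class="text">Clean room, but noisy at night.</p>
</div>
"""

# CSS classes of the fields we want to pull out of each review.
FIELDS = {"rating", "stay-date", "text"}


class ReviewParser(HTMLParser):
    """Collect one dict per <div class="review"> block."""

    def __init__(self):
        super().__init__()
        self.rows = []       # one dict per review seen so far
        self._field = None   # name of the field currently being read

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "div" and css_class == "review":
            self.rows.append({})      # a new review starts here
        elif css_class in FIELDS and self.rows:
            self._field = css_class   # next text belongs to this field

    def handle_data(self, data):
        if self._field is not None:
            self.rows[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        self._field = None            # stop reading at any closing tag


parser = ReviewParser()
parser.feed(SAMPLE_HTML)
parser.close()

rows = parser.rows
for row in rows:
    row["rating"] = int(row["rating"])  # keep ratings numeric for analysis

print(rows)
```

If pandas is installed, the resulting list of dictionaries can be passed to pandas.DataFrame(rows) to get one row per review and one column per field.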
Prerequisites:
- Basic Python knowledge: Introduction to Python Part 1, Part 2
- Basic HTML knowledge: tutorial, tutorial2
- Web Scraping experience: Introduction to Web Scraping in Python BeautifulSoup Recording, Intermediate Web Scraping in Python Selenium Recording
Requirements:
Instructor: Jacob Grippin