Sometimes the data you want is spread across many pages of a website. The dataset isn't available as a download; instead, the data is trapped in elements of those pages. If each page has the same structure, building a web scraper can help you create the dataset you want.
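As a rough illustration of why a shared page structure matters, here is a minimal sketch in Python using only the standard library. The page contents, the `value` class name, and the names `CellExtractor` and `extract_values` are all made up for this example; a real site will have its own markup, and you would fetch each page rather than hard-code it.

```python
from html.parser import HTMLParser


class CellExtractor(HTMLParser):
    """Collect the text of every <td class="value"> element on a page.

    The tag and class name are assumptions for this example only.
    """

    def __init__(self):
        super().__init__()
        self.in_target = False
        self.values = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag's attributes.
        if tag == "td" and ("class", "value") in attrs:
            self.in_target = True
            self.values.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_target = False

    def handle_data(self, data):
        # Text can arrive in several chunks, so append to the current value.
        if self.in_target:
            self.values[-1] += data


def extract_values(html):
    """Return the stripped text of each target cell in one page."""
    parser = CellExtractor()
    parser.feed(html)
    parser.close()
    return [value.strip() for value in parser.values]


# Two hypothetical pages that share one structure.
pages = {
    "page1": '<table><tr><td class="label">Name</td>'
             '<td class="value">Alder</td></tr></table>',
    "page2": '<table><tr><td class="label">Name</td>'
             '<td class="value">Birch</td></tr></table>',
}

# Because the structure repeats, one extractor builds a row per page.
rows = [{"page": name, "values": extract_values(html)}
        for name, html in pages.items()]
```

The same extractor works on every page only because the markup repeats; if pages differ in structure, each variant needs its own handling.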
Before you scrape a site, consider whether doing so is legal. The programmatic nature of scraping means that reusing data gathered in this way usually does not qualify for fair use. For subscription databases, review the licensing agreement, which will sometimes exclude scraping as an accepted use (look for language about automated or programmatic downloads in the Terms of Use).
Beyond the copyright and licensing of the data you are after, the hosting site may disallow scraping because of the load that automated requests place on its server. Some content providers and databases offer APIs built explicitly for programmatic calls, and it is always worth looking for one before creating a scraper.
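Many sites state their rules for automated access in a robots.txt file. The sketch below, using Python's standard `urllib.robotparser` module, shows one way to check those rules before requesting a page. The robots.txt text, the `example-scraper` user agent, the example.org URLs, and the helper names `build_policy` and `wait_seconds` are assumptions for illustration; a real scraper would load the file from the site itself, and robots.txt does not replace reading the Terms of Use.

```python
from urllib.robotparser import RobotFileParser


def build_policy(robots_txt):
    """Parse the text of a robots.txt file into a queryable policy."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def wait_seconds(policy, user_agent, default=1.0):
    """Return the site's requested crawl delay, or a default if none is set."""
    delay = policy.crawl_delay(user_agent)
    return default if delay is None else delay


# Hypothetical robots.txt content, hard-coded so the example needs no network.
sample_robots = """User-agent: *
Disallow: /private/
Crawl-delay: 5
"""

policy = build_policy(sample_robots)
agent = "example-scraper"  # made-up user agent name

allowed = policy.can_fetch(agent, "https://example.org/catalog/1")
blocked = policy.can_fetch(agent, "https://example.org/private/1")
pause = wait_seconds(policy, agent)
```

Pausing between requests (for example with `time.sleep(pause)`) keeps the load on the server low, which addresses the burden described above.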
Questions? Reach out to the Data and Digital Scholarship Librarian, Jess Yao.
This work by the Reed College Library is licensed under a Creative Commons CC-BY Attribution 4.0 International License.
Reed College Library | Email: library@reed.edu | Phone: 503-777-7702 | 3203 Southeast Woodstock Boulevard, Portland, Oregon 97202-8199