Simple web scraping with Bash

With one line of Bash code, Pete scrapes the web and builds a desktop notification app to get the daily snow report.

While working on a small project recently, I was amazed by how much web scraping I could do with just one line of Bash. I used the text-based Lynx browser [1] and piped its output to grep. Figure 1 shows the one-line Bash example that scrapes the current snow depth from the Sunshine Village Snow Forecast web page.
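As a rough illustration of this kind of pipeline, here is a minimal sketch that wraps the Lynx-to-grep idea in a small function. The function name scrape_lines is made up for this example, and the URL and search phrase in the usage line are placeholders, not necessarily the ones used in Figure 1.

```shell
#!/usr/bin/env bash
# Sketch only: render a web page as plain text with Lynx, then keep
# the lines that match a search phrase.
# scrape_lines is a hypothetical helper name, not part of Lynx.
# Usage: scrape_lines URL PATTERN
scrape_lines() {
    local url=$1 pattern=$2
    # -dump writes the rendered page to stdout instead of opening the
    # interactive browser; -nolist leaves out the list of links that
    # Lynx would otherwise append to the dump.
    # grep -i makes the match case-insensitive.
    lynx -dump -nolist "$url" | grep -i -- "$pattern"
}
```

It could then be called with something like scrape_lines "https://example.com/snow-report" "snow depth", where both arguments are stand-ins for the real page and the phrase that appears on it. The exact wording to search for depends on how the target page labels its data, so it is worth looking at the full lynx -dump output first.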

In this article, I will introduce some techniques to easily scrape web pages, and then I will create a desktop notification script that provides the daily snow forecast.

The Lynx Text Browser

For my Bash web scraping, I first looked at command-line tools such as curl [2] combined with the html2text [3] utility. This technique works, but I found that the Lynx browser offers a one-step solution with slightly cleaner text output.
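The two approaches can be compared side by side with a sketch like the following. The function names are invented for this example, and it assumes an html2text implementation that reads HTML from standard input when no file is given; several different tools ship under that name, so behavior may vary.

```shell
#!/usr/bin/env bash
# Sketch only: two ways to turn a web page into plain text.
# Both function names are hypothetical helpers for this comparison.

# Two-step approach: curl downloads the raw HTML, html2text converts it.
# -s hides curl's progress meter so only the page reaches the pipe.
page_text_two_step() {
    curl -s "$1" | html2text
}

# One-step approach: Lynx fetches and renders the page itself.
# -dump prints the rendered text to stdout and exits.
page_text_one_step() {
    lynx -dump "$1"
}
```

Running both against the same address, for example page_text_two_step "https://example.com" and page_text_one_step "https://example.com", and comparing the output is a quick way to judge which text is easier to search with grep.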



Posted by Contributor