I wanted to compare zero-sugar syrup prices from several websites. However, typing or copy-pasting the data would take a lot of time, so I asked ChatGPT how to get data from web pages into a Google Sheet. It first suggested several Google import options, which all need structured web pages, and then mentioned scrapers. So I tried out Octoparse.
Octoparse has a free tier that is limited to 10 tasks, which have to be run from your own computer, and limits exports to 10,000 records. That is enough for what I’m looking for.
So I visited the supermarket website and searched for zero-sugar syrups. Then I created a new Octoparse task on my local install, pasting in the URL of the search results page.
In a couple of minutes, I was able to scrape the web page's data, perform some cleaning, and gather the data in a table to my liking. Now that I had the data, I wanted to import it into my Google Sheet. That was possible! It took a lot of steps in Google's web services and APIs, but the help was excellent and I was able to create an export connection. This enabled me to push the gathered data into the second sheet of my Google Sheet. I used a second sheet because I want to review the data before it is added to my main table on sheet 1.
You can perform cleaning actions on any field. For instance, these actions are applied to the Inhoud (content) field:
- Replace “l” with “”, which removes the lowercase letter L (for liter)
- Remove all leading and trailing spaces
For the Prijs (price) field, I only needed to replace the dots with commas to make it a valid value.
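To make the cleaning steps concrete, here is a minimal Python sketch of the same transformations. It is not what Octoparse runs internally; the function names and the sample rows are made up for illustration, and I'm assuming the scraped values arrive as plain strings.

```python
def clean_inhoud(raw: str) -> str:
    """Remove the lowercase 'l' (liter), then trim leading and trailing spaces."""
    return raw.replace("l", "").strip()


def clean_prijs(raw: str) -> str:
    """Replace the decimal dot with a comma."""
    return raw.replace(".", ",")


# Hypothetical sample rows, not real scraped data.
rows = [
    {"Inhoud": " 1 l ", "Prijs": "2.49"},
    {"Inhoud": "0.75 l", "Prijs": "3.15"},
]

cleaned = [
    {"Inhoud": clean_inhoud(row["Inhoud"]), "Prijs": clean_prijs(row["Prijs"])}
    for row in rows
]

for row in cleaned:
    print(row)
```

One thing to watch with a plain replace like this: it removes every lowercase "l", so a value in "ml" would end up as "m". For my liter-only data that was not a problem.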
There are loads more possibilities in Octoparse, for instance looping through the found items and running a subtask on each of them.