Table of Contents
From Netflix to Prime Video to Disney+, we’ve never had more TV shows available to watch. At the same time, picking the best next candidate has become increasingly difficult. We want to get the most out of the time we spend in front of the TV, but we don’t always want to let Netflix choose what we should watch next.
If you’re in a similar situation, you’ve come to the right place. In this post, we’ll show you how to use Notion and Google Sheets to import information from IMDb and create a list of shows to watch for your next binge session.
Choosing a Notion Template
We searched for the best Notion database template to track our shows and decided this free TV Show Tracker template is one of the best.
We will be making a couple of changes to the database to add additional information. We want to show the rating from IMDb and the current popularity of each TV show, so we need to create two additional database properties with the Number type:
Synchronizing the Database with Google Sheets
We plan to get the IMDb rating and popularity from IMDb and update our Notion database with that information.
Doing this manually would be tedious. To save time, we’ll use a feature from Sync2Sheets called synced columns.
Synced columns allow us to send data from Google Sheets to Notion. Think of it as something similar to a formula property in Notion, where the data comes from Google Sheets.
If you don’t have Sync2Sheets installed yet, you can download it from the Google Workspace Marketplace and watch this 3-minute YouTube video to learn how to use it.
Using IMPORTXML to Get the Information We Want from IMDb
If you haven’t heard about the =IMPORTXML
formula before, it allows you to import data from various structured data types including XML, HTML, and CSV. In our case, we’ll use it to extract the rating and popularity from IMDb.
For example, here you can see where to find the IMDb Rating and Popularity data for Prison Break:
IMPORTXML
accepts two parameters. The first is a URL pointing to the HTML page you want to parse, and the second is the XPath to query the HTML document and get the data.
It sounds complicated. But with the help of Google Chrome, we’ll be able to get the path easily. The only thing we’ll have to do is copy and paste.
To start, right-click on the element where the data is (in our case the IMDb rating) and then select Inspect. This will open the Chrome Developer Tools and allow us to see the HTML document.
In the screenshot below, you can see the element containing the 8.3 value we want to extract. To copy the value, right-click the HTML element and choose Copy -> Copy XPath.
The result of this action is the XPath that we’ll use with IMPORTXML
to get the show information:
//*[@id="__next"]/main/div/section[1]/section/div[3]/section/section/div[1]/div[2]/div/div[1]/a/div/div/div[2]/div[1]/span[1]
To make this XPath usable in Google Sheets, we’ll change the double quotes to single quotes. The result should look like this:
//*[@id=’__next’]/main/div/section[1]/section/div[3]/section/section/div[1]/div[2]/div/div[1]/a/div/div/div[2]/div[1]/span[1]
We’ll repeat the process with the Popularity, Episodes, and Seasons. If you are having trouble during this step, here are the four XPaths that we will be using:
IMDb rating
//*[@id=’__next’]/main/div/section[1]/section/div[3]/section/section/div[1]/div[2]/div/div[1]/a/div/div/div[2]/div[1]/span[1]
Popularity
//*[@id=’__next’]/main/div/section[1]/section/div[3]/section/section/div[1]/div[2]/div/div[3]/a/div/div/div[2]/div[1]
Episodes
//*[@id=’__next’]/main/div/section[1]/div/section/div/div[1]/section[1]/div[1]/a/h3/span
Seasons
//*[@id=‘browse-episodes-season’]
Getting the Data Into Google Sheets
We now have our database in sync with Google Sheets, and we know how to get the data we want. The only manual task we will need to perform for each TV Show is getting the IMDb link to extract the information.
IMDb search functionality is great and super fast, so it took me around 2 minutes to get the links for all the US TV shows in our Notion database.
To improve the clarity of our formulas, let’s create a sheet named XPaths and copy and paste each XPath into it so we can reference them in IMPORTXML
easily. We can also use named ranges to give the cell a name, which will make it easier to write formulas that work with the data.
Now comes the easy part: adding the formulas to our columns and letting Google Sheets do the work for us. In this example, G
is the column where we have the URLs to IMDb:
- IMDb rating:
=IMPORTXML(G2, Rating)
- Popularity:
=IMPORTXML(G2, Popularity)
- Episodes:
=IMPORTXML(G2, Episodes)
- Seasons:
=INDEX(IMPORTXML(G2, Seasons), 1, 2)
You can click to open the GIF full screen.
Sending the Data to Notion
Now comes the last and most important step: sending all this information to Notion.
The process is simple. First, we’ll create a synced column for the IMDb rating. Once we get email confirmation that the sync has finished, we’ll remove the synced column. We’ll then repeat this process for the remaining three columns.
If you want to keep one of the columns in sync and updated automatically (such as Popularity, which is probably the one that changes more frequently), create that column last and don’t remove it. Sync2Sheets will update the values in Notion automatically each hour.
You can check out the process of creating the synced columns and how Notion gets updated in the video below:
And with that, we are done 🙌 ! We have our Notion TV Shows Tracker database that shows the number of seasons, episodes, popularity, and IMDb rating.
Conclusion
We learned how to scrape data from a website using the =IMPORTXML
formula from Google Sheets and how to get the XPath of what we want using Chrome’s Inspect feature. We also used named ranges to give some clarity to our formulas and then learned how to send that data back into Notion for free using Sync2Sheets.
This TV show tracker is just one of the many ways Sync2Sheets lets you combine the power of Google Sheets and Notion. Now that you know the basic technique, you can easily adapt this method to scrape and analyze all kinds of structured web data.
Do you want us to write about a specific use case? Let us know on X!