Today we will learn how to scrape HTML tables from a web page with 3 lines of code. We have already published a post on web scraping; please click here to read it. Today we are using pandas for this. We are going to print the first table from this link: https://www.stackscale.com/blog/most-popular-programming-languages/. Let’s dive into the code.
import pandas

# read_html returns a list of DataFrames, one for each table found on the page
dfs = pandas.read_html("https://www.stackscale.com/blog/most-popular-programming-languages/")
print(dfs[0])  # printing the first table only, but all the tables are in this list
We can see that the table on the website and the data we are printing are the same. Since each table is a regular pandas DataFrame, we can now filter, sort, or export this data like any other.
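If a page has several tables, picking the right one by position can be fragile. Here is a small sketch of one way to handle that, using the `match` argument of `read_html` to keep only tables containing some text. The HTML string and the names `SAMPLE_HTML`, `all_tables`, and `language_tables` are made up for this example so it runs without a network connection; they are not from the page above. Note that `read_html` needs an HTML parser such as lxml (or BeautifulSoup with html5lib) to be installed.

```python
from io import StringIO

import pandas

# Hypothetical stand-in for a downloaded page: two small tables.
SAMPLE_HTML = """
<table>
  <thead><tr><th>Language</th><th>Rank</th></tr></thead>
  <tbody>
    <tr><td>Python</td><td>1</td></tr>
    <tr><td>Java</td><td>2</td></tr>
  </tbody>
</table>
<table>
  <thead><tr><th>Editor</th><th>Users</th></tr></thead>
  <tbody>
    <tr><td>Vim</td><td>10</td></tr>
  </tbody>
</table>
"""

# read_html returns a list with one DataFrame per table it finds.
all_tables = pandas.read_html(StringIO(SAMPLE_HTML))

# match keeps only the tables whose text matches the given string or regex.
language_tables = pandas.read_html(StringIO(SAMPLE_HTML), match="Language")

first = language_tables[0]
print(len(all_tables))  # number of tables found in the sample
print(first)

# Once the table is a DataFrame, it can be exported like any other.
csv_text = first.to_csv(index=False)
```

For a real page you would pass the URL instead of the `StringIO` wrapper, as in the code above.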
Happy New Year, everyone!