.csv file. We will be using the following libraries to help us with the scraping and managing the extracted data:
Requests - This library is required to send an HTTP request to the web page. This will give us access to the HTML content of the webpage we want to scrape.
Beautiful Soup - This library gives us functions to help extract data from the HTML content we receive when we send an HTTP request.
Pandas - This library helps us manage the data that has been extracted. In this case we will use it to save our data to a .csv file.
pip install beautifulsoup4
pip install requests
pip install pandas
Start by importing all the libraries.
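The imports for the three libraries installed above might look like the following. The `pd` alias is a common convention, not something this walkthrough requires.

```python
# Imports for the three libraries installed above.
import requests                  # sends the HTTP request
from bs4 import BeautifulSoup    # parses the returned HTML
import pandas as pd              # stores the data and writes the .csv file
```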
Send an HTTP request to the webpage using its URL. Make sure the response status code is 200, which means the request was successful.
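A minimal sketch of this step is shown here. The function name `fetch_html` and its `session` and `timeout` parameters are assumptions made for illustration; the only parts taken from the step itself are the GET request and the check for status code 200.

```python
import requests


def fetch_html(url, session=None, timeout=10):
    """Return the page's HTML as text, or raise if the status code is not 200.

    `session` is a hypothetical hook: any object with a compatible .get()
    method (such as requests.Session) can be passed; by default the
    module-level requests.get is used.
    """
    http = session or requests
    response = http.get(url, timeout=timeout)
    if response.status_code != 200:
        raise RuntimeError(
            f"Request failed with status code {response.status_code}"
        )
    return response.text
```

Calling `fetch_html` with the URL of the page you want to scrape returns the HTML string that the next step parses.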
Use the BeautifulSoup constructor to parse the raw HTML from the response into a tree that can be searched.
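As a rough illustration, parsing could look like this. The `html` string is a made-up stand-in for the text of a real response, and `html.parser` is chosen only because it ships with Python.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for response.text; a real page would be much larger.
html = "<html><body><h1 id='title'>Forecast</h1><p>Sunny</p></body></html>"

# "html.parser" is Python's built-in parser, so no extra install is needed.
soup = BeautifulSoup(html, "html.parser")

# Tags can now be reached as attributes of the parsed tree.
heading_text = soup.h1.get_text()
paragraph_text = soup.p.get_text()
```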
From the parsed HTML, extract the data we need using selectors. The selectors used here are ‘class’ and ‘id’.
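The sketch below shows one way to select by `id` and by `class` with Beautiful Soup. The markup, the `forecast` id, and the `period` and `temp` class names are all invented for illustration; a real page will use its own names, which you can find with the browser's inspector.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the id and class names below are invented.
html = """
<div id="forecast">
  <div class="day">
    <p class="period">Today</p>
    <p class="temp">High: 21 °C</p>
  </div>
  <div class="day">
    <p class="period">Tonight</p>
    <p class="temp">Low: 12 °C</p>
  </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Select a single element by its id attribute.
forecast = soup.find(id="forecast")

# Select every matching element by class.
# The keyword is class_ because "class" is a reserved word in Python.
periods = [tag.get_text(strip=True) for tag in forecast.find_all(class_="period")]
temps = [tag.get_text(strip=True) for tag in forecast.find_all(class_="temp")]
```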
Collect the extracted data in a Python dictionary and load it into a pandas DataFrame.
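A small sketch of this step, assuming the extracted values are already held in two lists. The column names and sample values are hypothetical.

```python
import pandas as pd

# Hypothetical values standing in for the text pulled out of the HTML.
periods = ["Today", "Tonight"]
temps = ["High: 21 °C", "Low: 12 °C"]

# Each dictionary key becomes a column name;
# each list becomes that column's values.
forecast_df = pd.DataFrame({"period": periods, "temperature": temps})
```

The lists must all have the same length, otherwise pandas raises a `ValueError` when building the DataFrame.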
Save the DataFrame to a CSV file. Note: We are using the utf-16BE encoding so that the degree symbol renders properly in the CSV file.
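Writing the file might look like the sketch below. The file name, the temporary directory, and the sample row are assumptions; `utf-16-be` is the hyphenated spelling of the UTF-16BE codec in Python, and `index=False` is an optional choice that leaves the row numbers out of the file.

```python
import os
import tempfile

import pandas as pd

# Hypothetical one-row DataFrame containing a degree symbol.
forecast_df = pd.DataFrame({"period": ["Today"], "temperature": ["High: 21 °C"]})

# Hypothetical output location; a temporary directory avoids
# overwriting any existing file.
csv_path = os.path.join(tempfile.mkdtemp(), "forecast.csv")

# Write the CSV using the UTF-16BE codec, without the index column.
forecast_df.to_csv(csv_path, index=False, encoding="utf-16-be")

# Read the file back with the same encoding to check the degree symbol.
with open(csv_path, encoding="utf-16-be", newline="") as handle:
    written_text = handle.read()
```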