Python Data Skills 3: Fetching API (JSON) Data
Understanding JSON Data, Fetching API Data, HTTP Errors, Timeout Error, JSON Error
We’ve already covered loading an Excel sheet as a pandas DataFrame and the common issues you’ll have to deal with. This time, we’ll focus on another way to load data into Python: fetching it from an API.
Table of Contents:
JSON (API) Data
Template for fetching JSON data
HTTP Errors (Status Code)
Timeout Error
JSON Error
1 - JSON (API) Data
Web APIs are essential for facilitating data communication between different software applications. They often use JSON (JavaScript Object Notation) for data transfer. JSON's simplicity and user-friendliness make it suitable for both humans and machines.
A JSON object maps naturally onto a Python dictionary. When you parse a JSON object, it becomes a Python dictionary (and a JSON array becomes a Python list), so you can handle the parsed data the same way you would handle any dictionary. Keys represent attributes, and values hold the corresponding data. This straightforward structure and its seamless fit with Python have made JSON popular in APIs.
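To make the JSON-to-dictionary mapping concrete, here is a minimal sketch using only the standard library. The sample string and its field names (name, languages, address, and so on) are made up for illustration and don't come from any real API.

```python
import json

# A made-up JSON string, similar in shape to what an API might return.
raw = (
    '{"name": "Ada", "age": 36, "languages": ["Python", "SQL"], '
    '"address": {"city": "London"}, "active": true, "manager": null}'
)

# json.loads turns the JSON text into ordinary Python objects.
record = json.loads(raw)

print(type(record).__name__)       # the JSON object is now a dict
print(record["name"])              # access values by key
print(record["languages"][0])      # JSON arrays become Python lists
print(record["address"]["city"])   # nested objects become nested dicts
print(record["active"], record["manager"])  # true -> True, null -> None
```

Once parsed, there is nothing JSON-specific left: you loop over keys, index into lists, and check for missing values exactly as you would with any other dictionary.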
2 - Template for fetching JSON Data
The most common way to fetch data from an API is to use the requests library to retrieve the contents of a web API endpoint. Then, we use the json library to decode the response text from a string into Python objects (typically a dictionary or a list). From there, we can either convert it to a pandas DataFrame or work with it directly as a dictionary.
Here is a code template you can use to fetch data from an API as JSON:
import requests
import json

def get_api_data():
    # Swap this out for your correct URL.
    response = requests.get('http://www.my_api.com/abcd')
    if response.status_code == 200:
        # Decode the JSON text into Python objects (dict or list).
        return json.loads(response.text)
    else:
        print("Failed to retrieve data, HTTP status code: ", response.status_code)
        return None

data = get_api_data()
print(data)
If you want to turn the JSON into a table, just use the json_normalize function. Code snippet below:
import pandas as pd

my_json = get_api_data()
if my_json is not None:
    df = pd.json_normalize(my_json)  # DataFrame you can use
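To see what json_normalize does with nested data without calling a real API, here is a small sketch. The records below are invented sample data standing in for a parsed API response; the field names are assumptions for illustration only.

```python
import pandas as pd

# Invented sample data, shaped like a parsed JSON response (a list of dicts).
sample_records = [
    {"id": 1, "user": {"name": "Ada", "city": "London"}},
    {"id": 2, "user": {"name": "Grace", "city": "New York"}},
]

# json_normalize flattens nested dictionaries into dotted column names.
df = pd.json_normalize(sample_records)

print(df.columns.tolist())
print(df)
```

Each nested key ends up as its own column (here, user.name and user.city), which is usually what you want before filtering or aggregating in pandas.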
Obviously, in the real world, things don’t go this smoothly. Now, let’s talk about the things that can go wrong, and how to handle them.
*The bug-fixing content below is for paid readers only.*
3 - HTTP Errors (Status Code)
If the HTTP status code of the response is not 200 (or another 2xx success code), the request did not succeed. 4xx
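As a rough illustration of reading a status code, here is a small sketch that sorts a code into a broad category by its range. The helper name describe_status is hypothetical (it is not part of requests or any library); it only uses the standard library's HTTPStatus to look up the official reason phrase.

```python
from http import HTTPStatus

def describe_status(status_code):
    """Hypothetical helper: label an HTTP status code by its range."""
    if 200 <= status_code < 300:
        category = "success"
    elif 300 <= status_code < 400:
        category = "redirect"
    elif 400 <= status_code < 500:
        category = "client error"
    elif 500 <= status_code < 600:
        category = "server error"
    else:
        category = "unknown"

    # Look up the standard reason phrase; fall back if the code is not registered.
    try:
        phrase = HTTPStatus(status_code).phrase
    except ValueError:
        phrase = "Unrecognized status"

    return f"{status_code} {phrase} ({category})"

for code in (200, 404, 503):
    print(describe_status(code))
```

With requests, you would pass response.status_code into a helper like this. Alternatively, calling response.raise_for_status() raises requests.exceptions.HTTPError for 4xx and 5xx responses, which you can catch in a try/except block.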