Superhero data analysis: Making the first request

Marvel comic books, the essence of superhero data.

Having just finished Arjan’s course Next-Level Python: Become a Python Expert, I want to get my hands dirty and write a small Python data analysis tool for the Marvel superheroes. Its main job is to retrieve the list of heroes from the Marvel API and generate various metrics based on their superhero characteristics.

Base project structure and repository

The project repository can be found on GitHub.

I am using Visual Studio Code as my IDE and Poetry for dependency management and packaging. Setting things up with Poetry is pretty straightforward:

cd marvel-heroes-analysis-tool
poetry init

# Create the virtual environment inside the project directory
# (by default Poetry creates it in {cache-dir}/virtualenvs)
poetry config virtualenvs.in-project true
Bash

To run and debug in Visual Studio Code, I created a launch.json configured to run the Poetry environment in debug mode.

{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: Current File",
            "type": "python",
            "request": "launch",
            "cwd": "${workspaceFolder}",
            "module": "poetry",
            "justMyCode": false
        }
    ]
}
JSON

One final touch is a settings.json file for the project-related settings: editor behavior, linting, type checking and formatting.

{
  "editor.formatOnSaveMode": "file",
  "editor.formatOnSave": true,
  "editor.codeActionsOnSave": {
    "source.organizeImports": true
  },
  "python.linting.pylintEnabled": true,
  "python.linting.enabled": true,
  "python.analysis.typeCheckingMode": "strict",
  "python.formatting.provider": "black",
  "vim.smartRelativeLine": true,
  "[python]": {
    "editor.defaultFormatter": "ms-python.black-formatter"
  }
}
JSON

Now on to the exciting part.

Heroes assemble

I have created an http_client package that will hold the GET functions for retrieving data from the API:

http_client
├── __init__.py
└── async_request.py

The async_request module will contain two GET functions, one synchronous and one asynchronous, that make a request to the URL string passed in. For this to work, we need to add the requests package to the project via Poetry:

poetry add requests 
Bash

And the async_request.py module will look something like:

"""HTTP client with synchronous and asynchronous GET functions"""
import asyncio

import requests

JSON = int | str | float | bool | None | dict[str, "JSON"] | list["JSON"]
JSONObject = dict[str, JSON]
JSONList = list[JSON]


def get(url: str) -> JSONObject:
    """Send a synchronous GET request and return the JSON body"""
    response = requests.get(url, timeout=10)
    return response.json()


async def get_async(url: str) -> JSONObject:
    """Send a GET request without blocking the event loop"""
    return await asyncio.to_thread(get, url)
Python

Let’s go through each part:

  • Lines 6-8: Type aliases that give names to complex types and make the code more readable. For further reading about typing, check out this resource.
  • Lines 11-14: The synchronous GET function, which sends the request with the requests module and returns the response body as a JSONObject.
  • Lines 17-19: The asynchronous GET function, which uses asyncio.to_thread to run the synchronous get in a separate thread so that it does not block the event loop.
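
One benefit of wrapping the blocking call with asyncio.to_thread is that several requests can overlap. The following is only a minimal sketch of that idea, not the project's code: slow_get is a hypothetical stand-in for get (it sleeps instead of touching the network, so the example runs offline), fetch_all is a hypothetical helper, and the URLs are placeholders.

```python
import asyncio
import time


def slow_get(url: str) -> dict[str, str]:
    """Hypothetical stand-in for the blocking get(); sleeps instead of using the network."""
    time.sleep(0.2)
    return {"url": url}


async def get_async(url: str) -> dict[str, str]:
    """Run the blocking call in a worker thread so the event loop stays free."""
    return await asyncio.to_thread(slow_get, url)


async def fetch_all(urls: list[str]) -> list[dict[str, str]]:
    """Start all requests at once; gather returns the results in input order."""
    return await asyncio.gather(*(get_async(url) for url in urls))


if __name__ == "__main__":
    # Placeholder URLs, never actually requested
    placeholder_urls = [f"https://example.invalid/{i}" for i in range(3)]
    start = time.perf_counter()
    results = asyncio.run(fetch_all(placeholder_urls))
    print(results)
    print(f"elapsed: {time.perf_counter() - start:.2f}s")
```

Because each call waits in its own thread, the three simulated requests should finish in roughly the time of one rather than three.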

Now let’s actually make the request to the Marvel API to get the characters:

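import asyncio
import hashlib
import os
from datetime import datetime

from dotenv import load_dotenv

# Assumes http_client/__init__.py re-exports get_async from async_request
from http_client import get_async

BASE_URL = "http://gateway.marvel.com/"


async def main() -> None:
    """Request the characters list from the Marvel API and print it"""
    # Load PUBLIC_API_KEY and PRIVATE_API_KEY from the .env file
    load_dotenv()

    # Integer Unix timestamp, so the query string has no spaces or colons
    timestamp = int(datetime.now().timestamp())
    public_api_key = os.getenv("PUBLIC_API_KEY")
    private_api_key = os.getenv("PRIVATE_API_KEY")

    # Hash of timestamp + private key + public key, in that order
    to_be_hashed = f"{timestamp}{private_api_key}{public_api_key}"
    md5 = hashlib.md5()
    md5.update(to_be_hashed.encode("utf-8"))
    md5_hex = md5.hexdigest()

    endpoint = "v1/public/characters"

    response = await get_async(
        f"{BASE_URL}{endpoint}?ts={timestamp}&apikey={public_api_key}&hash={md5_hex}"
    )

    print(response)


if __name__ == "__main__":
    asyncio.run(main())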
Python

  • I set up a .env file in the project root folder, where I place the API keys, and load them using the dotenv module (the python-dotenv package, which also needs to be added via Poetry).
  • I build the URL for the request following the specifications from the API here.
  • Finally, the request is made and we retrieve the list of superheroes.

Next steps

The next step will be to get additional information about the superheroes, from the same API or from other APIs, and to aggregate that data as I continue building the superhero data analysis tool.
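
As a starting point for that aggregation, here is a hedged sketch of pulling the hero names out of one response. It assumes the characters are nested under data.results with a name field on each entry. hero_names is a hypothetical helper and the sample payload is hand-written, not real API output.

```python
from typing import Any


def hero_names(response: dict[str, Any]) -> list[str]:
    """Collect the name of every character in one response page."""
    # Assumed shape: {"data": {"results": [{"name": ...}, ...]}}
    results = response.get("data", {}).get("results", [])
    return [hero["name"] for hero in results if "name" in hero]


if __name__ == "__main__":
    # Hand-written sample payload for illustration
    sample = {
        "data": {"results": [{"id": 1, "name": "Hero A"}, {"id": 2, "name": "Hero B"}]}
    }
    print(hero_names(sample))
```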