Deploy a Machine Learning Model with FastAPI

3 min read · Apr 28, 2021

FastAPI is a web framework commonly used to deploy machine learning models behind RESTful APIs. Users and applications use these APIs to retrieve predictions from models.

For this use case, I have a model developed with Keras that classifies Iris flowers by species, predicting a probability for each class. I want my server to load that model, accept requests to an endpoint, read the inputs passed as query parameters, run the inputs through the model, and pass the output back to the client. At the end, we go over how to pass our input data as a JSON payload instead of as query parameters. We will also take a look at the interactive documentation FastAPI produces for our API.

To get started, install FastAPI and uvicorn (the server) with the command:

    pip install fastapi uvicorn

Create a new directory containing the model and the input scaling parameters, then create a new file in that directory named main.py. This script has the server load the model and the parameters for scaling the input data.

Next, define an instance of the FastAPI class named app. This will be the main building block of our API.

Define a path operation decorator. We want users to be able to retrieve the model predictions by sending a GET request to the /predict path (or endpoint) of our server. The path operation decorator registers the function under it so that it runs each time the server receives a GET request to the /predict path.

Under the path operation decorator, define the path operation function as an async function. This function will be called when our server receives a GET request to the /predict path. Declare the parameters f0, f1, f2, and f3 as floats in the function signature. Because they do not appear in the path itself, FastAPI treats them as query parameters and converts them to float values. Scale the inputs and run them through the model. Return the model output to the client.

Now that the server is ready, run it with the command:

    uvicorn main:app --reload

main refers to the file main.py.
app refers to the variable name of the FastAPI object inside of main.py.
reload tells the server to restart when main.py changes.

Let’s test out the server. Open your browser and go to http://localhost:8000/predict?f0=6.4&f1=2.9&f2=5.2&f3=1.6

The server will return

{"Confidence":{"Virginica":0.6746,"Versicolor":0.3251,"Setosa":0.0003}}

You can also interact with your API by using the requests library in Python.

Visit http://localhost:8000/docs to view interactive documentation of your API generated by Swagger UI. This lets you test out your API from the browser.

Or visit http://localhost:8000/redoc to view documentation generated by ReDoc.

Congrats! You now have an API that serves your machine learning model!

Now, what if we wanted to pass in our data as a JSON payload in the request rather than through query parameters? We use the BaseModel class from pydantic to create a data model to accomplish this. Create a new class named Iris and have it inherit from BaseModel. This is our data model. It has four attributes: four float values representing the features used as input to our model.
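A sketch of such a data model is below. The attribute names f0 through f3 are an assumption, chosen to match the parameter names used earlier.

```python
from pydantic import BaseModel, ValidationError


class Iris(BaseModel):
    # Attribute names are assumed; they mirror the earlier f0..f3 parameters.
    f0: float
    f1: float
    f2: float
    f3: float


sample = Iris(f0=6.4, f1=2.9, f2=5.2, f3=1.6)
print(sample)

# Values that cannot be interpreted as floats are rejected.
try:
    Iris(f0="not a number", f1=2.9, f2=5.2, f3=1.6)
except ValidationError as exc:
    print(f"Rejected: {len(exc.errors())} validation error(s)")
```

FastAPI runs this same validation on incoming request bodies, so the path operation function only ever sees well-formed inputs.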

Add a parameter for the data model to our path operation function and declare its type as Iris. FastAPI then reads that parameter from the JSON body of the request.

Interact with this new version of the API with the following code that passes the input data as a JSON payload.

The code used in this demo is available on GitHub.

Written by Daniel O'Keefe
