Serving and deploying Pytorch models for free with FastAPI and AWS Lambda — Part 1 [FAILED]
Disclaimer: this tutorial ended up failing because of the size of the Pytorch library, but there are still a lot of great things to learn about (Github workflows, FastAPI, AWS S3 and Lambda). In the second part, I will try to find another free solution to serve Pytorch models in production.
Link to the Github repo: https://github.com/AchrafAsh/fast-torch-lambda
The solution is as follows:
- the model is built with Pytorch (machine translation)
- the model is served through a REST API built with FastAPI
- the application is then compressed and stored in an AWS S3 bucket
- finally, the API is deployed as a lambda function on AWS Lambda
Setup Git and Github repository
You know the drill:
- Create a new repository
- Add .gitignore (Python template)
- Add README.md
- Clone on your local machine
- Don’t forget to commit changes after every step! ⚠️
Setup virtual environment
Virtual environments ensure a consistent development environment, so that you run the same packages (and versions) locally, on your remote instance, and with your colleagues if you're in a team. We will use virtualenv:
pip install virtualenv
Create the virtual environment (cd inside the app folder first):
virtualenv venv
# if you get a "command not found" error, try this instead:
python -m virtualenv venv
Activate the virtual environment:
source venv/bin/activate
Install the required dependencies: create a requirements.txt file at the root of the app folder and list the packages the app needs. You can then install them by running:
pip install -r requirements.txt
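As a reference, a minimal requirements.txt for this stack might look like the following (an illustration only: the repo's actual file may list different packages and pin specific versions):

```text
fastapi
uvicorn
mangum
pytest
requests
```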
Build and test your REST API with FastAPI and Pytest
You can find a great FastAPI tutorial on the official website: https://fastapi.tiangolo.com/
Here is the folder structure:
| ├── __init__.py
| └── test_main.py
Write the tests for your API in test_main.py (you can find the tests on Github) and run them with:
python -m pytest
You should see failing tests, since we haven't built the app yet. This is called TDD (Test-Driven Development)! 💡 We write the tests before implementing the functionality.
Let’s build the app until the tests go green 🚦
(find the code in Github)
Recall: don’t forget to commit your changes regularly
Create automation with Github Actions
The two jobs that we need are:
- Continuous Integration (CI): run the tests, package our FastAPI app into a Lambda function, and upload it as a GitHub artifact.
- Continuous Deployment (CD): download the Lambda artifact uploaded during the previous job and deploy it to AWS Lambda via AWS S3. This job only runs if the previous one succeeded.
So it looks like this (find the whole script in the Github repo):
name: CI/CD Pipeline
on:
  push:
    branches: [ main ]
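Fleshing that out, the two jobs could be wired together like this (a sketch only: the action versions and packaging steps are assumptions, and the real script is in the GitHub repo):

```yaml
jobs:
  ci:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Run tests
        run: |
          pip install -r requirements.txt
          python -m pytest
      - name: Package the app
        run: zip -r lambda.zip .
      - name: Upload artifact
        uses: actions/upload-artifact@v2
        with:
          name: lambda
          path: lambda.zip

  cd:
    needs: ci  # only runs if the CI job succeeded
    runs-on: ubuntu-latest
    steps:
      - name: Download artifact
        uses: actions/download-artifact@v2
        with:
          name: lambda
      # ...then upload lambda.zip to S3 and update the Lambda function here
```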
Now push everything to Github and go to the Actions tab. You should see the logs of the Github Action. If you get any errors, read the logs carefully and make the necessary changes (let me know in the comments down below if you struggle to fix them).
Configure AWS Lambda and set up CI/CD
Here are the steps to set up everything:
- Sign up to AWS and create a Lambda function
- Select: Author from scratch
- Select: Runtime: Python 3.7
- Select: Choose or create an execution role
- Select: Create function
- Create an S3 bucket in the AWS Management Console. ⚠️ It should be in the same region as your Lambda function
- Change the runtime settings: the handler (which is lambda_function.lambda_handler by default) should point to the handler your app exposes
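For reference, the handler contract behind that default path looks roughly like this; the bare sketch below only shows the shape Lambda expects (in the real app, a small adapter exposes the FastAPI app through an equivalent handler):

```python
import json

# A bare Lambda handler: the default "lambda_function.lambda_handler"
# path means module lambda_function, function lambda_handler
def lambda_handler(event, context):
    # API Gateway's Lambda proxy integration expects this response shape
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": "Hello from Lambda"}),
    }
```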
- Update your Github Actions config file (main.yml) to deploy your function to Lambda (don't forget to replace YOUR_S3_BUCKET with your own bucket name)
GitHub Secrets are used to store confidential information. In our case, we need to store our AWS credentials.
Add your secrets to your Github repo: go to Settings > Secrets.
To find the values:
- AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY: go to the AWS Management Console, click on your account name (top right), then My Security Credentials, then Create New Access Key
- AWS_DEFAULT_REGION: the AWS Region of your Lambda function (and S3 bucket, as they need to be in the same region). Just use the lowercase region identifier.
Hopefully, everything should work 🤞. Push your changes to Github and go to Actions to look at the logs.
Create an API Gateway
Now we need to create a REST API Gateway to allow the world to send requests to our lambda function.
- Go to API Gateway (in AWS Console)
- Click on Build on the card that says REST API but doesn’t have Private on it
- Click New API and enter a name for this API, then Create API
- Actions > Create Method > ANY > Use Lambda Proxy integration > Save
- Add a proxy: Actions > Create Resource > Configure as proxy > Create Resource
- Actions > Deploy API > [New Staging] > “dev” (or “staging” or whatever you want)
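Once deployed, the stage gets an invoke URL that the whole world can call. The pattern looks like this (the API id, region, and stage below are placeholders; yours will differ):

```python
# Placeholders: substitute your own API id, region, and stage name
api_id = "abc123xyz"
region = "eu-west-3"
stage = "dev"

invoke_url = f"https://{api_id}.execute-api.{region}.amazonaws.com/{stage}/"

# You could then call the API with, e.g.:
#   import requests
#   response = requests.get(invoke_url)
```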
Pytorch Deep Learning Model
Install PyTorch and update requirements.txt:
pip install torch torchtext spacy
pip freeze > requirements.txt
Build, train, and save the model
You can find all the code in the Github repo to build, train and save a deep learning model for machine translation.
Serve the model
- create a new POST endpoint
- create utilities: process data (string to encoded tensor), load pre-trained model, feedforward, process output (encoded tensor to string)
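These utilities can be sketched as a small pipeline. Everything below is illustrative: a toy vocabulary and an identity "model" stand in for the real PyTorch model, its trained weights, and its tensors:

```python
# Toy vocabulary; the real one is built from the training data
VOCAB = {"<unk>": 0, "hello": 1, "world": 2}
INV_VOCAB = {i: w for w, i in VOCAB.items()}

def encode(text: str) -> list[int]:
    # string -> token ids (the real version builds an encoded tensor)
    return [VOCAB.get(tok, VOCAB["<unk>"]) for tok in text.lower().split()]

def feedforward(token_ids: list[int]) -> list[int]:
    # Stand-in for the trained translation model's forward pass
    return token_ids

def decode(token_ids: list[int]) -> str:
    # token ids -> string (the real version decodes the output tensor)
    return " ".join(INV_VOCAB.get(i, "<unk>") for i in token_ids)

def translate(text: str) -> str:
    # The POST endpoint would call this and return the result as JSON
    return decode(feedforward(encode(text)))
```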
Finally, push everything and let your CI/CD do the rest for you 😎
… and it failed … 🤦
Your Lambda function's deployment package must stay under ~250MB (unzipped), but ours is around 2GB… The reason is that Pytorch alone takes more than 1GB, and our model has around one million parameters!
I thought it would be okay because AWS Lambda offers gigabytes of storage. But that storage is meant to be used while processing requests, not to store the Lambda function itself.
So we could, for example, install Pytorch while processing each API request, but that would add a lot of latency (and thus cost!).
To be continued…
In part 2 we will try another way using AWS ECR and Docker. But let me figure it out first…