Monitoring FastAPI Applications with RED Metrics
TL;DR
RED metrics are a set of metrics for monitoring the performance of applications and services from the user's point of view. They are not meant for measuring resource usage such as CPU, memory, or disk utilization.
Introduction
I often build PoC (Proof of Concept) AI applications with FastAPI without thinking much about performance. At that stage, we just hope it works. Then we release it to some users, all sorts of problems pop up, and we struggle to figure out what is going wrong. Is it a resource bottleneck? How many requests can the endpoint take before it hits that bottleneck? Or is something wrong with our code? Without knowing what the problem is, the application is hard to keep under control.
One way to control the performance of an application is to measure it. As my teacher said, "If we can't measure it, we can't manage it."
There are many ways to measure the performance of our application. One of them is RED metrics.
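To make the idea concrete before going further, here is a minimal sketch of what measuring RED could look like. RED is commonly expanded as Rate, Errors, and Duration. The sketch uses only the standard library and is not tied to FastAPI. The `RedRecorder` class and its method names are hypothetical, and treating only 5xx responses as errors is an assumption you may want to change.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class RedRecorder:
    """Hypothetical in-memory recorder for one endpoint's RED metrics."""

    started_at: float = field(default_factory=time.monotonic)
    durations: List[float] = field(default_factory=list)
    errors: int = 0

    def observe(self, duration_s: float, status_code: int) -> None:
        # Every request contributes to Rate and Duration.
        self.durations.append(duration_s)
        # Assumption: only server-side failures (5xx) count as Errors.
        if status_code >= 500:
            self.errors += 1

    def summary(self, window_s: Optional[float] = None) -> Dict[str, float]:
        # Use the given window, or the time elapsed since the recorder was created.
        elapsed = window_s if window_s is not None else time.monotonic() - self.started_at
        total = len(self.durations)
        return {
            "rate_per_s": total / elapsed if elapsed > 0 else 0.0,
            "error_ratio": self.errors / total if total else 0.0,
            "avg_duration_s": sum(self.durations) / total if total else 0.0,
        }


# Made-up sample data: (duration in seconds, HTTP status code).
recorder = RedRecorder()
for duration_s, status in [(0.12, 200), (0.30, 200), (0.95, 500), (0.08, 404)]:
    recorder.observe(duration_s, status)

stats = recorder.summary(window_s=2.0)
print(stats)
```

In a real service you would probably rely on a metrics library rather than an in-memory list, and look at duration percentiles instead of a plain average, but the three numbers stay the same.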