
Computer Vision: Image Classification (End-to-End)

Just an Example Post

This is just an example post. Don't take it too seriously!

This documentation provides a complete guide for building an image classification system using modern computer vision techniques, from data preparation to deployment.


Introduction

This guide focuses on creating an end-to-end image classification pipeline, leveraging popular frameworks like PyTorch, TensorFlow, and ONNX. The steps include loading datasets, training models, evaluating performance, and deploying the system.


Workflow Overview

  1. Data Preparation: Dataset curation, augmentation, and preprocessing.
  2. Model Training: Training a neural network on the dataset.
  3. Evaluation: Measuring the model's performance with metrics such as accuracy and F1 score.
  4. Deployment: Hosting the trained model for inference via APIs or web services.

Setup

System Requirements

  • Python: >= 3.8
  • Dependencies: PyTorch, torchvision, TensorFlow, OpenCV, FastAPI, ONNX Runtime
  • Hardware: GPU (NVIDIA CUDA) for training, CPU/GPU for inference

Installation

  1. Clone the repository:

    git clone https://github.com/your-repo/image-classification.git
    cd image-classification
    

  2. Install required libraries:

    pip install -r requirements.txt
    

  3. Verify installation:

    python -m scripts.verify_setup
    


Step-by-Step Guide

1. Data Preparation

  • Organize your dataset:
    dataset/
    ├── train/
    │   ├── class1/
    │   └── class2/
    ├── val/
    │   ├── class1/
    │   └── class2/
    
  • Preprocess images using OpenCV or torchvision transforms:
    from torchvision import transforms
    
    transform = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    ])
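
With the folder layout above, torchvision's ImageFolder can build the datasets directly. The sketch below (batch size and worker count are illustrative choices, not requirements) also defines the dataloader, val_dataloader, num_classes, and class_names names used in the later steps:

from torch.utils.data import DataLoader
from torchvision import datasets

train_dataset = datasets.ImageFolder("dataset/train", transform=transform)
val_dataset = datasets.ImageFolder("dataset/val", transform=transform)

dataloader = DataLoader(train_dataset, batch_size=32, shuffle=True, num_workers=4)
val_dataloader = DataLoader(val_dataset, batch_size=32, shuffle=False, num_workers=4)

class_names = train_dataset.classes   # inferred from the class subfolders
num_classes = len(class_names)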
    

2. Model Training

Train a model using PyTorch:

import torch
import torchvision.models as models

model = models.resnet50(pretrained=True)  # on newer torchvision: weights=models.ResNet50_Weights.DEFAULT
model.fc = torch.nn.Linear(model.fc.in_features, num_classes)  # replace the final layer (in_features is 2048 for ResNet-50)

# Training loop
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
criterion = torch.nn.CrossEntropyLoss()

model.train()
for epoch in range(num_epochs):
    for images, labels in dataloader:
        optimizer.zero_grad()            # clear gradients accumulated from the previous step
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
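
After training, it helps to persist the learned weights so the evaluation and deployment steps can reload them later; a minimal sketch (the filename is just an example):

torch.save(model.state_dict(), "resnet50_classifier.pt")
# To reload later: model.load_state_dict(torch.load("resnet50_classifier.pt"))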

3. Evaluation

Evaluate the model:

from sklearn.metrics import classification_report

# Get predictions
predictions = []
ground_truth = []

model.eval()
with torch.no_grad():
    for images, labels in val_dataloader:
        outputs = model(images)
        _, preds = torch.max(outputs, 1)
        predictions.extend(preds.cpu().numpy())
        ground_truth.extend(labels.cpu().numpy())

print(classification_report(ground_truth, predictions, target_names=class_names))
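
To see which classes get confused with one another, a confusion matrix is a natural companion to the report above; this reuses the predictions and ground_truth lists already collected:

from sklearn.metrics import confusion_matrix

# Rows are true classes, columns are predicted classes (same index order as class_names)
print(confusion_matrix(ground_truth, predictions))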

4. Deployment

Export the model to ONNX for optimized inference:

dummy_input = torch.randn(1, 3, 224, 224)  # example input size
model.eval()  # switch to inference mode before exporting
torch.onnx.export(model, dummy_input, "model.onnx", opset_version=11,
                  input_names=["input"], output_names=["output"])  # name the input so the API below can reference it
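
Before wiring the exported file into a service, it is worth a quick sanity check that ONNX Runtime reproduces the PyTorch outputs for the same dummy input:

import numpy as np
import onnxruntime as ort

check_session = ort.InferenceSession("model.onnx")
onnx_out = check_session.run(None, {"input": dummy_input.numpy()})[0]
with torch.no_grad():
    torch_out = model(dummy_input).numpy()

# The two outputs should match to within a small numerical tolerance
print("max abs difference:", np.abs(onnx_out - torch_out).max())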

Deploy using FastAPI:

from fastapi import FastAPI, File, UploadFile
import onnxruntime as ort
import numpy as np
from PIL import Image

app = FastAPI()
session = ort.InferenceSession("model.onnx")

# Use the same normalization statistics as during training
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

@app.post("/predict/")
async def predict(file: UploadFile = File(...)):
    image = Image.open(file.file).convert("RGB").resize((224, 224))
    array = np.asarray(image, dtype=np.float32) / 255.0             # HWC, values in [0, 1]
    array = (array - MEAN) / STD                                    # normalize as in training
    input_array = np.expand_dims(array.transpose(2, 0, 1), axis=0)  # NCHW, batch of 1
    result = session.run(None, {"input": input_array})
    return {"prediction": int(np.argmax(result[0]))}


APIs and Inference

Starting the API Server

Run the FastAPI app:

uvicorn app:app --reload

Access the API documentation at http://127.0.0.1:8000/docs.

Test Inference

Send an image to the endpoint:

curl -X POST "http://127.0.0.1:8000/predict/" -F "file=@path/to/image.jpg"
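
The same request can also be sent from Python, which is handy for scripted tests:

import requests

with open("path/to/image.jpg", "rb") as f:
    response = requests.post("http://127.0.0.1:8000/predict/", files={"file": f})
print(response.json())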


