CNN: An Interactive Handwritten Digit Classifier with PyTorch, Flask, and Vue
Explore how the @tiesen243/cnn project combines a convolutional neural network, a Python backend, and a Vue.js frontend to predict handwritten digits in real time.
28/10/2024
Introduction
@tiesen243/cnn is a full-stack interactive application that brings handwritten digit recognition to your browser. It combines a custom convolutional neural network (CNN) built with PyTorch, a Python Flask backend, and a Vue.js frontend: users draw a digit on a canvas and immediately get a prediction from the trained network.
How It Works
Convolutional Neural Network (CNN) in PyTorch
At the heart of the project is a CNN defined in src/cnn.py, built using PyTorch. The model consists of:
- Multiple convolutional layers and ReLU activations
- Max pooling for spatial downsampling
- Flattening and fully connected layers
- Softmax output for digit classification (0–9)
The model is trained on the MNIST dataset, a standard benchmark for digit recognition, and training progress is tracked through loss and accuracy metrics.
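The layer list above can be sketched roughly as follows. This is a minimal illustration, not the contents of src/cnn.py: the class name DigitCNN, the helper predict_probabilities, the number of layers, and the channel and hidden sizes are all assumptions chosen to fit 28x28 MNIST images.

```python
import torch
from torch import nn


class DigitCNN(nn.Module):
    """Hypothetical stand-in for the project's model (layer sizes are assumed)."""

    def __init__(self, num_classes: int = 10) -> None:
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),   # 1x28x28 -> 16x28x28
            nn.ReLU(),
            nn.MaxPool2d(2),                              # -> 16x14x14
            nn.Conv2d(16, 32, kernel_size=3, padding=1),  # -> 32x14x14
            nn.ReLU(),
            nn.MaxPool2d(2),                              # -> 32x7x7
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),                                 # -> 1568 features
            nn.Linear(32 * 7 * 7, 128),
            nn.ReLU(),
            nn.Linear(128, num_classes),                  # raw scores (logits)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))


def predict_probabilities(model: nn.Module, images: torch.Tensor) -> torch.Tensor:
    """Apply softmax to the logits so each row sums to 1 across the 10 digits."""
    model.eval()
    with torch.no_grad():
        return torch.softmax(model(images), dim=1)
```

In this sketch the softmax lives outside the module because nn.CrossEntropyLoss expects raw logits during training. The project may instead apply softmax inside the model.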
Training Pipeline
The training process is handled in a Jupyter Notebook (src/train.ipynb), which:
- Loads and preprocesses the MNIST data
- Defines the CNN architecture
- Trains the model and evaluates accuracy
- Saves the trained model weights for later inference
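A training loop along these lines might look like the sketch below. It is an assumption-laden illustration rather than the notebook's code: random tensors stand in for MNIST so the example runs without a download (real data could come from something like torchvision.datasets.MNIST), the model is a deliberately tiny placeholder, and the optimizer, learning rate, and file name model.pth are hypothetical.

```python
import os
import tempfile

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def train_one_epoch(model, loader, optimizer, loss_fn):
    """Run one pass over the data and return (mean loss, accuracy)."""
    model.train()
    total_loss, correct, seen = 0.0, 0, 0
    for images, labels in loader:
        optimizer.zero_grad()
        logits = model(images)
        loss = loss_fn(logits, labels)
        loss.backward()
        optimizer.step()
        total_loss += loss.item() * labels.size(0)
        correct += (logits.argmax(dim=1) == labels).sum().item()
        seen += labels.size(0)
    return total_loss / seen, correct / seen


# Placeholder data: random tensors shaped like MNIST (1x28x28 images, labels 0-9).
torch.manual_seed(0)
images = torch.rand(64, 1, 28, 28)
labels = torch.randint(0, 10, (64,))
loader = DataLoader(TensorDataset(images, labels), batch_size=16, shuffle=True)

# Deliberately tiny placeholder model; the project's CNN would go here instead.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

mean_loss, accuracy = train_one_epoch(model, loader, optimizer, loss_fn)

# Save only the weights (state_dict), then reload them as an inference script would.
weights_path = os.path.join(tempfile.mkdtemp(), "model.pth")  # hypothetical file name
torch.save(model.state_dict(), weights_path)
restored = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
restored.load_state_dict(torch.load(weights_path))
```

Saving the state_dict rather than the whole module keeps the weights file independent of how the model class is imported, provided the same architecture is rebuilt before loading.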
Flask API Backend
A simple Flask application (src/app.py) serves as the backend:
- Loads the trained CNN model
- Exposes a /predict endpoint that accepts POST requests with an image
- Preprocesses the image (resizing, normalization, tensor conversion)
- Returns the predicted digit as a JSON response
Interactive Vue.js Frontend
The frontend is a single-page app built with Vue 3 (src/static/main.js + src/templates/index.html):
- Users can draw digits on an HTML <canvas>
- The app provides "Predict" and "Clear" buttons
- Upon prediction, the canvas image is sent to the Flask backend, and the predicted digit is displayed in real time
- Uses Tailwind CSS for a modern, responsive design
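For testing the API without a browser, the request the frontend sends can be approximated from Python. This sketch uses only the standard library and rests on assumptions: the helper names are hypothetical, the payload is assumed to be JSON with an "image" field containing a base64 PNG data URL, and the server is assumed to listen on Flask's default development address.

```python
import base64
import json
import urllib.request


def build_predict_request(png_bytes: bytes, base_url: str = "http://127.0.0.1:5000"):
    """Build the POST request a browser client might send to /predict."""
    data_url = "data:image/png;base64," + base64.b64encode(png_bytes).decode("ascii")
    body = json.dumps({"image": data_url}).encode("utf-8")  # "image" is an assumed field name
    return urllib.request.Request(
        f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def request_prediction(png_bytes: bytes) -> dict:
    """Send the request and parse the JSON reply (needs the backend to be running)."""
    with urllib.request.urlopen(build_predict_request(png_bytes), timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling request_prediction with the bytes of a PNG file would return the parsed JSON reply, whatever keys the real backend uses.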
Key Features
- End-to-End ML Demo: From model definition and training to real-time inference in the browser.
- Live Drawing Canvas: Engaging user experience for digit input.
- RESTful API: Easily extensible for other ML tasks.
- Clean, Modular Code: Separation of concerns between frontend, backend, and model logic.
- GPU Support: The PyTorch model runs on CUDA when a GPU is available and falls back to the CPU otherwise.
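The GPU fallback mentioned in the list is usually a short device check in PyTorch. The snippet below is a generic sketch of that pattern with a placeholder module, not code taken from the project.

```python
import torch
from torch import nn

# Pick CUDA when a GPU is visible to PyTorch, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 2).to(device)         # placeholder module moved to the device
inputs = torch.zeros(1, 4, device=device)  # inputs must live on the same device

with torch.no_grad():
    outputs = model(inputs)
```

The model and its inputs have to sit on the same device. A mismatch raises a runtime error in PyTorch.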
Example Usage
- Draw a digit (0–9) on the canvas.
- Click "Predict" to send your drawing to the backend.
- View the prediction instantly, powered by a deep neural network.
Why This Project Is Cool
- Educational Value: Great example for learning about neural networks, model serving, and frontend-backend integration.
- Full Stack: Combines Python, JavaScript, machine learning, and modern web development.
- Open Source: Easily adaptable for your own image classification projects.
Conclusion
@tiesen243/cnn is a fantastic showcase of bringing AI models into interactive web apps. Whether you're learning about machine learning, web development, or looking for a template to deploy your own models, this project has you covered.