kenil-patel-183/mnist-cnn-digit-classifier

@@ -31,4 +31,102 @@ This model classifies handwritten digits (0-9) from 28x28 grayscale images using
 - **Layers**: 4 Convolutional layers with BatchNorm and ReLU activation
 - **Pooling**: MaxPool2d after first conv layer
 - **Final Layer**: Linear layer (3136 → 10)
-- **Parameters**: ~50K trainable parameters

+- **Parameters**: ~56K trainable parameters
+## Usage
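+**Security Note:** Loading this model requires `trust_remote_code=True` because it uses custom model and processor classes. This setting executes Python code from the model repository, so review that code before running it.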
+### Using transformers pipeline
+```python
+from transformers import pipeline
+clf = pipeline(
+    "image-classification",
+    model="kenil-patel-183/mnist-cnn-digit-classifier",
+    trust_remote_code=True,   # required due to custom classes
+)
+preds = clf("path/to/digit.png", top_k=1)
+print(preds)  # [{'label': '7', 'score': 0.998...}]
+```
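The pipeline returns a list of dictionaries, one per candidate label. The following is a minimal sketch of turning that output into an integer digit, assuming the labels are digit strings as in the sample output above. `top_digit` is a hypothetical helper, not part of the repository, and the sample scores are illustrative.

```python
def top_digit(preds):
    """Return (digit, score) for the highest-scoring prediction.

    Assumes `preds` is a list of {"label": str, "score": float} dicts
    whose labels are digit strings, as in the sample output above.
    """
    if not preds:
        raise ValueError("pipeline returned no predictions")
    best = max(preds, key=lambda p: p["score"])
    return int(best["label"]), best["score"]


# Hard-coded sample in the pipeline's output format (illustrative values)
sample = [{"label": "7", "score": 0.998}, {"label": "1", "score": 0.001}]
digit, score = top_digit(sample)
print(digit, score)
```

Passing a larger `top_k` to the pipeline call would give this helper more candidates to choose from; with `top_k=1` it simply unwraps the single result.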
+### Using manual loading
+```python
+import torch
+from PIL import Image
+from transformers import AutoConfig, AutoImageProcessor, AutoModel
+model_id = "kenil-patel-183/mnist-cnn-digit-classifier"
+# The custom classes live in the model repo, so each loader needs trust_remote_code=True
+config = AutoConfig.from_pretrained(model_id, trust_remote_code=True)
+model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
+processor = AutoImageProcessor.from_pretrained(model_id, trust_remote_code=True)
+image = Image.open("digit.png")
+inputs = processor(images=image, return_tensors="pt")
+with torch.no_grad():  # inference only, no gradients needed
+    outputs = model(**inputs)
+logits = outputs.logits
+pred = logits.argmax(-1).item()  # index of the highest-scoring class
+print(pred)
+```
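The logits above are unnormalized scores. If per-class probabilities are needed, a softmax can be applied. The sketch below uses plain Python on a list of ten scores, such as the one `logits[0].tolist()` would give in the snippet above. The logit values here are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Convert raw scores to probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative logits for one image (hypothetical values, one per digit 0-9)
example_logits = [-2.1, 0.3, -1.0, 0.5, -0.7, -1.5, -3.0, 6.2, -0.2, 1.1]
probs = softmax(example_logits)
pred = max(range(len(probs)), key=probs.__getitem__)  # same result as argmax
print(pred, round(probs[pred], 3))
```

When working with tensors directly, `torch.softmax(logits, dim=-1)` gives the same result without leaving PyTorch.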
+## Model Architecture
+```
+MNISTCNN(
+  (flatten): Flatten(start_dim=1, end_dim=-1)
+  (lin): Linear(in_features=3136, out_features=10, bias=True)
+  (network): Sequential(
+    (0): Conv2d(1, 8, kernel_size=(3, 3), stride=(1, 1))
+    (1): BatchNorm2d(8, eps=1e-05, momentum=0.1)
+    (2): ReLU()
+    (3): MaxPool2d(kernel_size=(2, 2), stride=2)
+    (4): Conv2d(8, 16, kernel_size=(3, 3), stride=(1, 1))
+    (5): BatchNorm2d(16, eps=1e-05, momentum=0.1)
+    (6): ReLU()
+    (7): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1))
+    (8): BatchNorm2d(32, eps=1e-05, momentum=0.1)
+    (9): ReLU()
+    (10): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1))
+    (11): BatchNorm2d(64, eps=1e-05, momentum=0.1)
+    (12): ReLU()
+  )
+)
+```
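The `in_features=3136` of the final linear layer follows from the layer list above. The sketch below traces the spatial size through the network and tallies trainable parameters, assuming a 28x28 input, unpadded convolutions with bias (as printed), and BatchNorm layers with the default learnable scale and shift. The total can be compared against the approximate parameter count quoted in the summary.

```python
# Layer specs read off the printed architecture: (in_channels, out_channels)
CONVS = [(1, 8), (8, 16), (16, 32), (32, 64)]
KERNEL = 3          # 3x3 convolutions, stride 1, no padding
POOL_AFTER = {0}    # MaxPool2d(2, stride=2) follows the first conv block
NUM_CLASSES = 10

size = 28           # MNIST input is 28x28
params = 0
for i, (c_in, c_out) in enumerate(CONVS):
    size = size - KERNEL + 1                          # unpadded 3x3 conv trims 2 pixels per axis
    params += c_in * c_out * KERNEL * KERNEL + c_out  # conv weights + bias
    params += 2 * c_out                               # BatchNorm scale and shift (assumes affine=True)
    if i in POOL_AFTER:
        size //= 2                                    # 2x2 max pooling halves the size

flat_features = CONVS[-1][1] * size * size            # 64 channels * 7 * 7
params += flat_features * NUM_CLASSES + NUM_CLASSES   # final linear layer weights + bias

print(size, flat_features, params)  # 7 3136 55994
```

The spatial size goes 28 → 26 → 13 → 11 → 9 → 7, and more than half of the parameters (31,370) sit in the final linear layer.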
+## Training Data
+- **Dataset**: MNIST Handwritten Digits
+- **Training samples**: 60,000
+- **Test samples**: 10,000
+- **Image size**: 28x28 grayscale
+- **Classes**: 10 (digits 0-9)
+## Image Preprocessing Requirements
+For best results, input images should be preprocessed as follows:
+1. **Convert to grayscale** if not already
+2. **Resize to 28x28 pixels**
+3. **Convert to tensor** (values between 0 and 1)
+4. **Normalize** with mean=0.1307, std=0.3081
+```python
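+from torchvision import transforms
+transform = transforms.Compose([
+    transforms.Grayscale(),                      # collapse to a single channel
+    transforms.Resize((28, 28)),                 # match the MNIST input size
+    transforms.ToTensor(),                       # scale pixel values to [0, 1]
+    transforms.Normalize((0.1307,), (0.3081,)),  # MNIST mean and std
+])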
+```
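To make the effect of steps 3 and 4 concrete, the sketch below applies the same arithmetic to single pixel values in plain Python. `normalize_pixel` is a hypothetical helper written for illustration; it mirrors what `ToTensor` followed by `Normalize` does to one 8-bit grayscale value.

```python
MEAN, STD = 0.1307, 0.3081  # MNIST statistics quoted above


def normalize_pixel(value):
    """Map an 8-bit grayscale value (0-255) to the normalized model input."""
    scaled = value / 255.0        # ToTensor: scale to [0, 1]
    return (scaled - MEAN) / STD  # Normalize: subtract mean, divide by std


low, high = normalize_pixel(0), normalize_pixel(255)
print(round(low, 4), round(high, 4))  # -0.4242 2.8215
```

A black pixel therefore maps to about -0.42 and a white pixel to about 2.82, which is the value range the network sees at its input.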
+## Performance
+The model achieves 99.25% accuracy on the MNIST test set (10,000 images).
+## Limitations
+- **Input format**: Expects 28x28 grayscale input; other images must be converted and resized first
+- **Domain**: Optimized for handwritten digits; may not work well on printed text
+- **Background**: Works best with dark digits on light background
+- **Noise**: Performance may degrade with noisy or heavily distorted images