license: mit
library_name: autogluon
pipeline_tag: tabular-classification
datasets:
- scottymcgee/flowers
model-index:
- name: hw2_classical_automl
  results:
  - task:
      type: tabular-classification
      name: Tabular Classification
    dataset:
      name: scottymcgee/flowers
      type: scottymcgee/flowers
      split: test
    metrics:
    - name: accuracy
      type: accuracy
      value: 0.87
    - name: f1_macro
      type: f1
      value: 0.84
HW2 Classical AutoML — AutoGluon TabularPredictor
Model Overview
This model was trained with AutoGluon's TabularPredictor as part of Homework 2 for 24-679.
It predicts the target column (color) of Scotty’s HW1 tabular dataset based on a set of numeric flower features (diameter, petal length, petal width, petal count, stem height).
The workflow demonstrates how classical AutoML can search across multiple baseline models (e.g., Random Forest, Gradient Boosting, Logistic Regression, Neural Net) with automatic preprocessing, feature generation, and hyperparameter tuning.
Dataset
- Source: Scotty’s HW1 tabular dataset on Hugging Face (scottymcgee/flowers)
- Samples: ~30 original samples, expanded via augmentation
- Features: numeric (flower_diameter_cm, petal_length_cm, petal_width_cm, petal_count, stem_height_cm)
- Target: `color` (multiclass, 6 possible values)
- Split: 80% training, 20% validation
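The 80/20 split can be illustrated with a small seeded shuffle. This is a minimal sketch using only the standard library, not the notebook's actual code: the helper name `train_val_split` and the stand-in rows are hypothetical, and the seed value 42 is the one listed under Training Configuration.

```python
import random


def train_val_split(rows, val_fraction=0.2, seed=42):
    """Shuffle row indices with a fixed seed and split them into train/validation."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)  # same seed -> same shuffle -> same split
    n_val = round(len(rows) * val_fraction)
    val_idx = set(indices[:n_val])
    train = [row for i, row in enumerate(rows) if i not in val_idx]
    val = [row for i, row in enumerate(rows) if i in val_idx]
    return train, val


# Hypothetical stand-in rows; the real data comes from scottymcgee/flowers.
rows = [{"petal_count": i, "color": "red"} for i in range(100)]
train_rows, val_rows = train_val_split(rows)
print(len(train_rows), len(val_rows))  # 80 20
```

With only six classes and a small dataset, a stratified split (one that keeps class proportions equal in both parts) may be a better choice; the sketch above does not stratify.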
Training Configuration
- Framework: AutoGluon `TabularPredictor`
- Presets: `medium_quality` (balanced speed vs. accuracy)
- Problem Type: `multiclass` classification
- Time Limit: 600 seconds (10 minutes)
- Random Seed: 42 (for reproducible train/val split)
- Hardware: Google Colab CPU/GPU runtime
AutoGluon automatically handled:
- Standardization of numeric features
- Encoding of categorical features (none in this dataset)
- Model ensembling and stacking
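The configuration above maps onto a short training call. The following is a sketch under assumptions, not the notebook's exact code: the function name `train_predictor`, the `save_path` default, and the choice of `eval_metric="f1_weighted"` are illustrative (the last is inferred from the weighted F1 reported in Results). The AutoGluon import is done lazily inside the function so the constants can be inspected without the library installed.

```python
# Settings taken from the Training Configuration list.
LABEL = "color"
PROBLEM_TYPE = "multiclass"
FIT_KWARGS = {"presets": "medium_quality", "time_limit": 600}


def train_predictor(train_df, save_path="hw2_autogluon_models", eval_metric="f1_weighted"):
    """Fit a TabularPredictor on a pandas DataFrame that includes the label column.

    save_path and eval_metric are assumptions made for this sketch.
    """
    # Imported here so this module loads even where AutoGluon is not installed.
    from autogluon.tabular import TabularPredictor

    predictor = TabularPredictor(
        label=LABEL,
        problem_type=PROBLEM_TYPE,
        eval_metric=eval_metric,
        path=save_path,
    )
    predictor.fit(train_data=train_df, **FIT_KWARGS)
    return predictor
```

Calling `train_predictor(train_df)` with the training portion of the data would run the 10-minute search and return the fitted predictor.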
Results
- Best model: Reported by AutoGluon leaderboard
- Validation Metric (Weighted F1): ~0.9 (exact value depends on random seed / run)
- Leaderboard: includes candidate models such as RandomForest, ExtraTrees, GradientBoosting, LightGBM
Note: Due to the small dataset size, metrics may vary slightly across runs.
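The card's metadata reports macro F1 while this section reports weighted F1, and the two can differ when classes are imbalanced. The sketch below computes both from scratch to show the difference; the helper name `f1_scores` and the toy labels are hypothetical and are not results from this model.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (macro_f1, weighted_f1) for two equal-length label sequences."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true examples per class
    per_class = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        per_class[label] = 2 * tp / denom if denom else 0.0
    # Macro: every class counts equally. Weighted: classes count by support.
    macro = sum(per_class.values()) / len(labels)
    weighted = sum(per_class[l] * support[l] for l in labels) / len(y_true)
    return macro, weighted


# Toy labels for illustration only.
y_true = ["red", "red", "red", "red", "blue", "blue"]
y_pred = ["red", "red", "red", "blue", "blue", "red"]
macro_f1, weighted_f1 = f1_scores(y_true, y_pred)
print(round(macro_f1, 3), round(weighted_f1, 3))  # 0.625 0.667
```

Here the majority class scores 0.75 and the minority class 0.5, so the weighted average sits above the macro average.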
Repository Artifacts
- `autogluon_predictor.pkl` → cloudpickled predictor (loadable if library versions match)
- `autogluon_predictor_dir.zip` → zipped native AutoGluon directory (preferred for portability)
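Loading the zipped directory might look like the sketch below. It assumes the archive has already been downloaded locally and that the predictor files sit at the top level of the zip; if the archive wraps them in a folder, the path passed to `TabularPredictor.load` would need to point at that folder instead. The helper names and the default paths are hypothetical.

```python
import zipfile
from pathlib import Path


def extract_predictor_dir(zip_path, dest_dir):
    """Unzip the archived predictor directory and return the destination path."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest)
    return dest


def load_predictor(zip_path="autogluon_predictor_dir.zip", dest_dir="autogluon_predictor_dir"):
    """Extract the archive, then load it with AutoGluon's native loader."""
    # Imported here so the extraction helper works without AutoGluon installed.
    from autogluon.tabular import TabularPredictor

    path = extract_predictor_dir(zip_path, dest_dir)
    return TabularPredictor.load(str(path))
```

Once loaded, `predictor.predict(df)` accepts a pandas DataFrame with the five feature columns and returns the predicted `color` values.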
AI Tool Disclosure
This notebook used ChatGPT for scaffolding code and documentation. All dataset selection, training, evaluation, and uploads were performed by the student.