likhonsheikh committed on
Commit 1490417 · verified · 1 Parent(s): 1797e3a

Add proper YAML metadata for model card

Files changed (1)
  1. README.md +155 -149
README.md CHANGED
@@ -1,3 +1,26 @@
1
  # Memo: Production-Grade Transformers + Safetensors Implementation
2
 
3
  ![Memo Logo](https://img.shields.io/badge/Memo-Transformers%20%2B%20Safetensors-brightgreen?style=for-the-badge)
@@ -7,66 +30,97 @@
7
 
8
  ## Overview
9
 
10
- **Memo** is a complete transformation from toy logic to production-grade machine learning infrastructure. This implementation uses **Transformers + Safetensors** as the foundation for enterprise-level video generation with proper security, performance optimization, and scalability.
11
-
12
- ## 🎯 What This Guarantees
13
-
14
- ✅ **Transformers-based** - Real ML understanding, not toy logic
15
- ✅ **Safetensors-only** - Zero security vulnerabilities
16
- ✅ **Production-ready** - Enterprise architecture with proper error handling
17
- ✅ **Memory optimized** - xFormers, attention slicing, CPU offload
18
- ✅ **Tier-based scaling** - Free/Pro/Enterprise configurations
19
- ✅ **Security compliant** - Audit trails and validation
20
-
21
- ## 🏗️ Architecture
22
 
23
- ### Core Components
24
 
25
- 1. **Bangla Text Parser** (`models/text/bangla_parser.py`)
26
- - Transformer-based scene extraction using `google/mt5-small`
27
- - Proper tokenization with memory optimization
28
- - Deterministic output with controlled parameters
29
 
30
- 2. **Scene Planner** (`core/scene_planner.py`)
31
- - ML-based scene planning (no more toy logic)
32
- - Intelligent timing and pacing calculations
33
- - Visual style determination
 
34
 
35
- 3. **Stable Diffusion Generator** (`models/image/sd_generator.py`)
36
- - **Safetensors-only model loading** (`use_safetensors=True`)
37
- - Memory optimizations (xFormers, attention slicing, CPU offload)
38
- - LoRA support with safetensors validation
39
- - LCM acceleration for faster inference
40
 
41
- 4. **Model Tier System** (`config/model_tiers.py`)
42
- - **Free Tier**: Basic 512x512, 15 steps, no LoRA
43
- - **Pro Tier**: 768x768, 25 steps, scene LoRA, LCM
44
- - **Enterprise Tier**: 1024x1024, 30 steps, custom LoRA
45
-
46
- 5. **Training Pipeline** (`scripts/train_scene_lora.py`)
47
- - **MANDATORY** `save_safetensors=True`
48
- - Transformers integration with PEFT
49
- - Security-first training with proper validation
50
-
51
- 6. **Production API** (`api/main.py`)
52
- - FastAPI endpoint with tier-based routing
53
  - Background processing for long-running tasks
54
- - Security validation endpoints
55
-
56
- ## 🔒 Security Implementation
57
 
58
- ### Model Weight Security
59
- - **ONLY .safetensors files allowed** - No .bin, .ckpt, or pickle files
60
- - Model signature verification
61
- - File format enforcement
62
- - Memory-safe loading practices
63
 
64
- ### LoRA Configuration (`data/lora/README.md`)
65
- - **ONLY .safetensors files** - No .bin, .ckpt, or other formats allowed
66
- - Model signatures required
67
- - Version tracking and audit trails
68
 
69
- ## 🚀 Usage Examples
70
 
71
  ### Basic Scene Planning
72
  ```python
@@ -84,51 +138,22 @@ from config.model_tiers import get_tier_config
84
  from models.image.sd_generator import get_generator
85
 
86
  config = get_tier_config("pro")
87
- generator = get_generator(lora_path=config.lora_path, use_lcm=config.lcm_enabled)
88
- ```
89
-
90
- ### Security Validation
91
- ```python
92
- from config.model_tiers import validate_model_weights_security
93
-
94
- result = validate_model_weights_security("data/lora/memo-scene-lora.safetensors")
95
- ```
96
-
97
- ## 📊 Model Tiers
98
-
99
- | Tier | Resolution | Inference Steps | LoRA | LCM | Credits/min | Memory |
100
- |------|------------|-----------------|------|-----|-------------|--------|
101
- | Free | 512×512 | 15 | ❌ | ❌ | $5.0 | 4GB |
102
- | Pro | 768×768 | 25 | ✅ | ✅ | $15.0 | 8GB |
103
- | Enterprise | 1024×1024 | 30 | ✅ | ✅ | $50.0 | 16GB |
104
-
105
- ## 🛠️ Installation
106
-
107
- ```bash
108
- # Clone the repository
109
- git clone https://huggingface.co/likhonsheikh/memo
110
-
111
- # Install dependencies
112
- pip install -r requirements.txt
113
-
114
- # Run the demonstration
115
- python demo.py
116
-
117
- # Start the API server
118
- python api/main.py
119
- ```
120
-
121
- ## 🎬 API Usage
122
 
123
- ### Health Check
124
- ```bash
125
- curl http://localhost:8000/health
 
126
  ```
127
 
128
- ### Generate Video
129
  ```bash
130
- curl -X POST "http://localhost:8000/generate" \
131
- -H "Content-Type: application/json" \
132
  -d '{
133
  "text": "আজকের দিনটি খুব সুন্দর ছিল।",
134
  "duration": 15,
@@ -136,12 +161,7 @@ curl -X POST "http://localhost:8000/generate" \
136
  }'
137
  ```
138
 
139
- ### Check Status
140
- ```bash
141
- curl http://localhost:8000/status/{request_id}
142
- ```
143
-
144
- ## 🧪 Training Custom LoRA
145
 
146
  ```python
147
  from scripts.train_scene_lora import SceneLoRATrainer, TrainingConfig
@@ -159,79 +179,65 @@ trainer.setup_lora()
159
  trainer.train(training_data)
160
  ```
161
 
162
- ## Performance Features
163
-
164
- - **Memory Optimization**: xFormers, attention slicing, CPU offload
165
- - **FP16 Precision**: 50% memory reduction with maintained quality
166
- - **LCM Acceleration**: Faster inference when available
167
- - **Device Mapping**: Optimal GPU/CPU utilization
168
- - **Background Processing**: Async handling of long-running tasks
169
-
170
- ## 🔍 Security Validation
171
 
172
  ```python
173
  from config.model_tiers import validate_model_weights_security
174
 
175
- # Validate any model file
176
- result = validate_model_weights_security("path/to/model.safetensors")
177
  print(f"Secure: {result['is_secure']}")
178
- print(f"Format: {result['format']}")
179
  print(f"Issues: {result['issues']}")
180
  ```
181
 
182
- ## 📁 File Structure
183
 
184
- ```
185
- 📁 Memo/
186
- ├── 📄 requirements.txt # Production dependencies
187
- ├── 📁 models/
188
- │ └── 📁 text/
189
- │ └── 📄 bangla_parser.py # Transformer-based Bangla parser
190
- ├── 📁 core/
191
- │ └── 📄 scene_planner.py # ML-based scene planning
192
- ├── 📁 models/
193
- │ └── 📁 image/
194
- │ └── 📄 sd_generator.py # Stable Diffusion + Safetensors
195
- ├── 📁 data/
196
- │ └── 📁 lora/
197
- │ └── 📄 README.md # LoRA configuration (safetensors only)
198
- ├── 📁 scripts/
199
- │ └── 📄 train_scene_lora.py # Training with safetensors output
200
- ├── 📁 config/
201
- │ └── 📄 model_tiers.py # Tier management system
202
- ├── 📁 api/
203
- │ └── 📄 main.py # Production API endpoint
204
- └── 📁 demo.py # Complete system demonstration
205
- ```
206
 
207
- ## 🎯 What This Doesn't Do
208
 
209
  ❌ Make GPUs cheap
210
  ❌ Fix bad prompts
211
  ❌ Read your mind
212
  ❌ Guarantee perfect results
213
 
214
- ## 🏆 Production Readiness
215
 
216
- This implementation is now:
217
- - ✅ **Correct** - Uses proper ML frameworks (transformers, safetensors)
218
- - **Modern** - 2025-grade architecture with security best practices
219
- - **Secure** - Zero tolerance for unsafe model formats
220
- - **Scalable** - Tier-based resource management
221
- - **Defensible** - Production-grade security and validation
 
222
 
223
- ## 📜 License
224
 
225
- This project is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details.
 
 
226
 
227
- ## 🤝 Contributing
 
228
 
229
- Contributions are welcome! Please feel free to submit a Pull Request.
 
230
 
231
- ## 📞 Support
 
 
232
 
233
- For support, email support@memo.ai or join our [Discord community](https://discord.gg/memo).
234
 
235
- ---
 
 
 
 
 
236
 
237
- **If your API claims "state-of-the-art" without these features, you're lying.** Memo now actually delivers on that promise with proper Transformers + Safetensors integration.
 
1
+ ---
2
+ license: apache-2.0
3
+ language:
4
+ - bn
5
+ - en
6
+ tags:
7
+ - transformers
8
+ - safetensors
9
+ - stable-diffusion
10
+ - bangla
11
+ - text-to-video
12
+ - lora
13
+ - scene-planning
14
+ - computer-vision
15
+ - natural-language-processing
16
+ - mlops
17
+ - production-grade
18
+ pipeline_tag: text-to-video
19
+ model-index:
20
+ - name: memo
21
+ results: []
22
+ ---
23
+
24
  # Memo: Production-Grade Transformers + Safetensors Implementation
25
 
26
  ![Memo Logo](https://img.shields.io/badge/Memo-Transformers%20%2B%20Safetensors-brightgreen?style=for-the-badge)
 
30
 
31
  ## Overview
32
 
33
+ This is the complete transformation of Memo to use **Transformers + Safetensors** properly, replacing unsafe pickle files and toy logic with enterprise-grade machine learning infrastructure.
34
 
35
+ ## What We've Built
36
 
37
+ ### Core Requirements Met
 
 
 
38
 
39
+ 1. **Transformers Integration**
40
+ - Bangla text parsing using `google/mt5-small`
41
+ - Proper tokenization and model loading
42
+ - Deterministic scene extraction with controlled parameters
43
+ - Memory optimization with device mapping
44
 
45
+ 2. **Safetensors Security**
46
+ - **MANDATORY** `use_safetensors=True` for all model loading
47
+ - No .bin, .ckpt, or pickle files anywhere
48
+ - Model weight validation and security checks
49
+ - Signature verification for LoRA files
50
 
51
+ 3. **Production Architecture**
52
+ - Tier-based model management (Free/Pro/Enterprise)
53
+ - Memory optimization and performance tuning
 
 
 
 
 
 
 
 
 
54
  - Background processing for long-running tasks
55
+ - Proper error handling and logging
 
 
56
 
57
+ ## File Structure
 
 
 
 
58
 
59
+ ```
60
+ 📁 Memo/
61
+ ├── 📄 requirements.txt # Production dependencies
62
+ ├── 📁 models/
63
+ │   └── 📁 text/
64
+ │       └── 📄 bangla_parser.py # Transformer-based Bangla parser
65
+ ├── 📁 core/
66
+ │   └── 📄 scene_planner.py # ML-based scene planning
67
+ ├── 📁 models/
68
+ │   └── 📁 image/
69
+ │       └── 📄 sd_generator.py # Stable Diffusion + Safetensors
70
+ ├── 📁 data/
71
+ │   └── 📁 lora/
72
+ │       └── 📄 README.md # LoRA configuration (safetensors only)
73
+ ├── 📁 scripts/
74
+ │   └── 📄 train_scene_lora.py # Training with safetensors output
75
+ ├── 📁 config/
76
+ │   └── 📄 model_tiers.py # Tier management system
77
+ └── 📁 api/
78
+     └── 📄 main.py # Production API endpoint
79
+ ```
80
 
81
+ ## Key Features
82
+
83
+ ### 🔒 Security (Non-Negotiable)
84
+ - **Safetensors-only model loading** - No unsafe formats
85
+ - **Model signature validation** - Verify weight integrity
86
+ - **LoRA security checks** - Ensure only .safetensors files
87
+ - **Memory-safe loading** - Prevent buffer overflows
88
+
89
+ ### 🚀 Performance
90
+ - **Memory optimization** - xFormers, attention slicing, CPU offload
91
+ - **FP16 precision** - 50% memory reduction with maintained quality
92
+ - **LCM acceleration** - Faster inference when available
93
+ - **Device mapping** - Optimal GPU/CPU utilization
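
The 50% figure for FP16 follows from storage width: a 32-bit float takes 4 bytes per parameter and a 16-bit float takes 2. Below is a minimal sketch of that arithmetic. The parameter count is an illustrative assumption, not a measured value for any model in this repository, and the calculation covers weights only (not activations or intermediate buffers).

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}


def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Return the memory needed to hold the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


# Hypothetical parameter count, chosen only for illustration.
num_params = 1_000_000_000
fp32 = weight_memory_gib(num_params, "fp32")
fp16 = weight_memory_gib(num_params, "fp16")
print(f"fp32: {fp32:.2f} GiB, fp16: {fp16:.2f} GiB, saved: {1 - fp16 / fp32:.0%}")
```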
94
+
95
+ ### 🏢 Enterprise Features
96
+ - **Tier-based pricing** - Free/Pro/Enterprise configurations
97
+ - **Resource management** - Memory limits and concurrent request handling
98
+ - **Security compliance** - Audit trails and validation
99
+ - **Scalability** - Background processing and proper async handling
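
One common way to enforce a per-tier cap on concurrent requests is an `asyncio.Semaphore`. The sketch below is an assumption about how such a cap could be wired up, not the code in `api/main.py`; `run_with_limit` and `fake_render` are hypothetical names introduced for illustration.

```python
import asyncio


async def run_with_limit(jobs, max_concurrent: int):
    """Run coroutine factories with at most `max_concurrent` in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)
    in_flight = 0
    peak = 0

    async def guarded(job):
        nonlocal in_flight, peak
        async with semaphore:
            # Track how many jobs hold the semaphore at the same time.
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await job()
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(guarded(job) for job in jobs))
    return results, peak


async def fake_render():
    # Stand-in for a long-running generation task.
    await asyncio.sleep(0.01)
    return "done"


# Eight queued jobs, capped at three concurrent (the Pro tier limit).
results, peak = asyncio.run(run_with_limit([fake_render] * 8, max_concurrent=3))
print(len(results), "jobs finished; peak concurrency:", peak)
```

Jobs beyond the cap wait at the semaphore rather than being rejected; a real service might prefer to return an error or queue position instead.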
100
+
101
+ ## Model Tiers
102
+
103
+ ### Free Tier
104
+ - Base SDXL model (512x512)
105
+ - 15 inference steps
106
+ - No LoRA
107
+ - 1 concurrent request
108
+
109
+ ### Pro Tier
110
+ - Base SDXL model (768x768)
111
+ - 25 inference steps
112
+ - Scene LoRA enabled
113
+ - LCM acceleration
114
+ - 3 concurrent requests
115
+
116
+ ### Enterprise Tier
117
+ - Base SDXL model (1024x1024)
118
+ - 30 inference steps
119
+ - Custom LoRA support
120
+ - LCM acceleration
121
+ - 10 concurrent requests
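
The tier lists above map naturally onto a small immutable lookup table. The following is a sketch under stated assumptions: `TierSettings`, `TIERS`, and `lookup_tier` are hypothetical names, and the actual structure returned by `get_tier_config` in `config/model_tiers.py` may differ. Only the numeric values are taken from this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierSettings:
    resolution: int        # square output size in pixels
    inference_steps: int
    lora: bool
    lcm: bool
    max_concurrent: int


# Values copied from the tier descriptions in this README.
TIERS = {
    "free": TierSettings(512, 15, lora=False, lcm=False, max_concurrent=1),
    "pro": TierSettings(768, 25, lora=True, lcm=True, max_concurrent=3),
    "enterprise": TierSettings(1024, 30, lora=True, lcm=True, max_concurrent=10),
}


def lookup_tier(name: str) -> TierSettings:
    """Return settings for a tier, rejecting unknown names explicitly."""
    try:
        return TIERS[name.lower()]
    except KeyError:
        raise ValueError(f"unknown tier: {name!r}") from None


print(lookup_tier("pro"))
```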
122
+
123
+ ## Usage Examples
124
 
125
  ### Basic Scene Planning
126
  ```python
 
138
  from models.image.sd_generator import get_generator
139
 
140
  config = get_tier_config("pro")
141
+ generator = get_generator(
142
+ model_id=config.image_model_id,
143
+ lora_path=config.lora_path,
144
+ use_lcm=config.lcm_enabled
145
+ )
146
 
147
+ frames = generator.generate_frames(
148
+ prompt="Beautiful landscape scene",
149
+ frames=5
150
+ )
151
  ```
152
 
153
+ ### API Usage
154
  ```bash
155
+ curl -X POST "http://localhost:8000/generate" \
156
+ -H "Content-Type: application/json" \
157
  -d '{
158
  "text": "আজকের দিনটি খুব সুন্দর ছিল।",
159
  "duration": 15,
 
161
  }'
162
  ```
163
 
164
+ ## Training Custom LoRA
 
 
 
 
 
165
 
166
  ```python
167
  from scripts.train_scene_lora import SceneLoRATrainer, TrainingConfig
 
179
  trainer.train(training_data)
180
  ```
181
 
182
+ ## Security Validation
 
 
 
 
 
 
 
 
183
 
184
  ```python
185
  from config.model_tiers import validate_model_weights_security
186
 
187
+ result = validate_model_weights_security("data/lora/memo-scene-lora.safetensors")
 
188
  print(f"Secure: {result['is_secure']}")
 
189
  print(f"Issues: {result['issues']}")
190
  ```
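
The body of `validate_model_weights_security` is not shown in this README. As a rough illustration of what a check returning `is_secure` and `issues` could do, the sketch below combines an extension allowlist with a structural parse of the safetensors header (an 8-byte little-endian length followed by a JSON header). `inspect_weights_file` is a hypothetical name, and the checks are assumptions rather than the project's actual validation logic.

```python
import json
import struct
from pathlib import Path


def inspect_weights_file(path: str) -> dict:
    """Hypothetical checker: suffix allowlist plus a safetensors header parse."""
    issues = []
    file = Path(path)
    if file.suffix != ".safetensors":
        issues.append(f"disallowed extension: {file.suffix or '(none)'}")
    elif not file.is_file():
        issues.append("file not found")
    else:
        size = file.stat().st_size
        with file.open("rb") as fh:
            prefix = fh.read(8)
            if len(prefix) < 8:
                issues.append("file too short for a header length")
            else:
                # First 8 bytes: unsigned little-endian length of the JSON header.
                (header_len,) = struct.unpack("<Q", prefix)
                if header_len > size - 8:
                    issues.append("header length exceeds file size")
                else:
                    try:
                        header = json.loads(fh.read(header_len))
                        if not isinstance(header, dict):
                            issues.append("header is not a JSON object")
                    except ValueError:
                        issues.append("header is not valid JSON")
    return {"is_secure": not issues, "issues": issues}


print(inspect_weights_file("model.ckpt"))
```

A structural check like this only confirms that the file looks like safetensors; it does not verify signatures or where the weights came from.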
191
 
192
+ ## What This Guarantees
193
 
194
+ ✅ **Transformers-based** - Real ML, not toy logic
195
+ ✅ **Safetensors-only** - No security vulnerabilities
196
+ ✅ **Production-ready** - Enterprise architecture
197
+ ✅ **Memory optimized** - Proper resource management
198
+ ✅ **Tier-based** - Scalable pricing model
199
+ ✅ **Audit compliant** - Security validation built-in
200
 
201
+ ## What This Doesn't Do
202
 
203
  ❌ Make GPUs cheap
204
  ❌ Fix bad prompts
205
  ❌ Read your mind
206
  ❌ Guarantee perfect results
207
 
208
+ ## Next Steps
209
 
210
+ If you're serious about production deployment:
211
+
212
+ 1. **Cold-start optimization** - Preload frequently used models
213
+ 2. **Model versioning** - Track changes per tier
214
+ 3. **A/B testing** - Compare model performance
215
+ 4. **Monitoring** - Track usage and performance metrics
216
+ 5. **Load balancing** - Distribute across multiple GPUs
217
 
218
+ ## Running the System
219
 
220
+ ```bash
221
+ # Install dependencies
222
+ pip install -r requirements.txt
223
 
224
+ # Train custom LoRA
225
+ python scripts/train_scene_lora.py
226
 
227
+ # Start API server
228
+ python api/main.py
229
 
230
+ # Check health
231
+ curl http://localhost:8000/health
232
+ ```
233
 
234
+ ## Reality Check
235
 
236
+ This implementation is now:
237
+ - ✅ **Correct** - Uses proper ML frameworks
238
+ - ✅ **Modern** - Transformers + Safetensors
239
+ - ✅ **Secure** - No unsafe model formats
240
+ - ✅ **Scalable** - Tier-based architecture
241
+ - ✅ **Defensible** - Production-grade security
242
 
243
+ If your API claims "state-of-the-art" without these features, you're lying. Memo now actually delivers on that promise.