---
title: MobileCLIP Image Classifier
emoji: 📸
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit
---

# 📸 MobileCLIP-B Image Classifier

Zero-shot image classification powered by Apple's MobileCLIP-B model, served through an interactive Gradio web interface. This application enables real-time image classification against a dynamic set of text labels, with support for admin-managed label updates and optional Hugging Face Hub persistence.

## 🎯 Key Features

### Core Capabilities
- **🖼️ Zero-Shot Classification**: Upload any image for instant classification without model retraining
- **🏷️ Dynamic Label Management**: Add, remove, and update classification labels on the fly
- **📊 Interactive Results**: Visual confidence scores with sortable data tables
- **⚡ Optimized Performance**: Sub-30ms inference on GPU with re-parameterized MobileOne blocks
- **🔒 Secure Admin Panel**: Token-protected label management interface
- **☁️ Hub Persistence**: Optional versioned label storage on Hugging Face Hub

### API Access
- **REST API**: Fully accessible via Gradio's automatic API endpoints
- **Base64 Support**: Direct base64 image input for backend integration
- **Batch Processing**: Efficient handling of multiple classification requests

## 🏗️ Architecture

### Components
- **`app.py`**: Main Gradio interface with public/admin tabs and API endpoints
- **`handler.py`**: Core model management, inference logic, and label operations
- **`reparam.py`**: MobileOne re-parameterization for optimized inference
- **`items.json`**: Default label catalog with metadata

### Model Details
- **Architecture**: MobileCLIP-B with re-parameterized MobileOne image encoder
- **Text Encoder**: Optimized CLIP text transformer
- **Embedding Cache**: Pre-computed text embeddings for fast inference
- **Device Support**: Automatic GPU/CPU detection with float16 optimization

## 🚀 Quick Start

### Environment Variables

Configure in your Space Settings → Variables and secrets:

| Variable | Description | Required |
|----------|-------------|----------|
| `ADMIN_TOKEN` | Secret token for admin operations | Yes (for admin) |
| `HF_LABEL_REPO` | Hub dataset for label storage (e.g., `user/labels`) | No |
| `HF_WRITE_TOKEN` | Token with write permissions to dataset repo | No |
| `HF_READ_TOKEN` | Token with read permissions (defaults to write token) | No |
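
As a rough sketch of how these variables might be read at startup, the snippet below applies the fallback from the table (the read token defaults to the write token). The `SpaceConfig` dataclass, the `load_config` helper, and the `hub_enabled` rule are hypothetical names and assumptions for illustration, not code taken from `app.py` or `handler.py`.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class SpaceConfig:
    """Hypothetical container for the settings listed in the table."""
    admin_token: Optional[str]
    label_repo: Optional[str]
    write_token: Optional[str]
    read_token: Optional[str]

    @property
    def hub_enabled(self) -> bool:
        # ASSUMPTION: Hub loading needs a repo plus a token able to read it.
        return bool(self.label_repo and self.read_token)


def load_config(env: Mapping[str, str] = os.environ) -> SpaceConfig:
    """Read settings from the environment; empty strings count as unset."""
    def get(name: str) -> Optional[str]:
        value = env.get(name, "").strip()
        return value or None

    write_token = get("HF_WRITE_TOKEN")
    return SpaceConfig(
        admin_token=get("ADMIN_TOKEN"),
        label_repo=get("HF_LABEL_REPO"),
        write_token=write_token,
        # HF_READ_TOKEN defaults to the write token when it is not set.
        read_token=get("HF_READ_TOKEN") or write_token,
    )
```

Passing a plain dictionary instead of `os.environ` makes the fallback easy to check in isolation.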

### Usage Examples

#### Web Interface
1. Navigate to the Space URL
2. Upload an image in the Classification tab
3. Adjust top-k results (default: 10)
4. View ranked predictions with confidence scores

#### API Usage

**Standard Classification:**
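```python
import requests

# Replace YOUR_SPACE_URL with the base URL of your Space.
with open("photo.jpg", "rb") as image_file:  # Close the file once the upload finishes
    response = requests.post(
        "YOUR_SPACE_URL/api/classify_image",
        files={"image": image_file},
        data={"top_k": 5},
    )
response.raise_for_status()  # Surface HTTP errors instead of parsing an error page
results = response.json()
```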

**Base64 Input:**
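```python
import base64
import requests

# Encode the raw image bytes as an ASCII base64 string.
with open("photo.jpg", "rb") as f:
    img_base64 = base64.b64encode(f.read()).decode("ascii")

response = requests.post(
    "YOUR_SPACE_URL/api/classify_base64",
    json={
        "image": img_base64,
        "top_k": 10
    }
)
response.raise_for_status()  # Surface HTTP errors instead of parsing an error page
results = response.json()
```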

## 🔧 Admin Operations

### Label Management

Authenticated admins can perform the following operations:

#### Add Labels
```json
{
  "op": "upsert_labels",
  "token": "YOUR_ADMIN_TOKEN",
  "items": [
    {"id": 100, "name": "bicycle", "prompt": "a photo of a bicycle"},
    {"id": 101, "name": "airplane", "prompt": "a photo of an airplane"}
  ]
}
```

#### Reload Specific Version
```json
{
  "op": "reload_labels",
  "token": "YOUR_ADMIN_TOKEN",
  "version": 5
}
```

#### Remove Labels
```json
{
  "op": "remove_labels",
  "token": "YOUR_ADMIN_TOKEN",
  "ids": [100, 101]
}
```
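
To illustrate how payloads like these could be processed, here is a minimal sketch of a dispatcher that checks the token and applies `upsert_labels` and `remove_labels` to an in-memory table. The function name `handle_admin_request`, the response shape, and the dictionary-based label store are assumptions made for this example; the actual logic in `handler.py` may differ.

```python
import hmac
from typing import Any, Dict


def handle_admin_request(
    payload: Dict[str, Any],
    labels: Dict[int, Dict[str, Any]],
    admin_token: str,
) -> Dict[str, Any]:
    """Apply an admin payload to an in-memory label table keyed by ID."""
    supplied = str(payload.get("token", ""))
    # Constant-time comparison avoids leaking token contents through timing.
    if not admin_token or not hmac.compare_digest(
        supplied.encode(), admin_token.encode()
    ):
        return {"ok": False, "error": "unauthorized"}

    op = payload.get("op")
    if op == "upsert_labels":
        for item in payload.get("items", []):
            # Insert a new label or overwrite the one with the same ID.
            labels[int(item["id"])] = {
                "name": item["name"],
                "prompt": item["prompt"],
            }
    elif op == "remove_labels":
        for label_id in payload.get("ids", []):
            labels.pop(int(label_id), None)  # Unknown IDs are ignored
    else:
        # reload_labels needs Hub access, so it is left out of this sketch.
        return {"ok": False, "error": f"unsupported op: {op}"}
    return {"ok": True, "count": len(labels)}
```

In a real deployment the text embeddings would also need to be recomputed after each successful change so the cache matches the label set.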

### Label Deduplication
- Automatic case-insensitive name deduplication
- Prevents duplicate entries (e.g., "cat", "Cat", and "CAT" are treated as the same label)
- ID-based deduplication for consistent label management
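
The deduplication rules above can be sketched as a single pass over the label items. The helper name `dedupe_labels` is hypothetical, and the "first occurrence wins" policy is an assumption; the source does not say which duplicate is kept.

```python
from typing import Any, Dict, Iterable, List


def dedupe_labels(items: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep the first item for each ID and each case-insensitive name."""
    seen_ids = set()
    seen_names = set()
    unique = []
    for item in items:
        # casefold() gives a case-insensitive key; strip() ignores stray spaces.
        key = item["name"].strip().casefold()
        if item["id"] in seen_ids or key in seen_names:
            continue  # Duplicate by ID or by name: skip it
        seen_ids.add(item["id"])
        seen_names.add(key)
        unique.append(item)
    return unique
```

For example, given items named "cat", "Cat", and "CAT" with different IDs, only the first one survives.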

## 📦 Hub Integration

When configured with `HF_LABEL_REPO` and tokens, the system automatically:

1. **Saves Snapshots**: Each label update creates versioned snapshots
   - `snapshots/v{N}/embeddings.safetensors`: Pre-computed text embeddings
   - `snapshots/v{N}/meta.json`: Label metadata and model info
   - `snapshots/latest.json`: Points to current version

2. **Loads on Startup**: Fetches the latest snapshot or a specified version
3. **Falls Back**: Uses the local `items.json` if the Hub is unavailable
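
The snapshot layout and the startup decision can be sketched without any Hub calls. The helper names `snapshot_paths` and `resolve_version` are hypothetical, and the `{"version": N}` shape assumed for `latest.json` is a guess, since the source only says that the file points to the current version.

```python
import json
from typing import Dict, Optional

LATEST_POINTER = "snapshots/latest.json"


def snapshot_paths(version: int) -> Dict[str, str]:
    """Return repo-relative file paths for one snapshot version."""
    base = f"snapshots/v{version}"
    return {
        "embeddings": f"{base}/embeddings.safetensors",
        "meta": f"{base}/meta.json",
    }


def resolve_version(
    latest_json: Optional[str], requested: Optional[int] = None
) -> Optional[int]:
    """Pick the version to load, or None to fall back to items.json."""
    if requested is not None:
        return requested  # An explicitly requested version takes priority
    if latest_json is None:
        return None  # Hub unavailable or no snapshot saved yet
    try:
        # ASSUMPTION: latest.json looks like {"version": 5}.
        return int(json.loads(latest_json)["version"])
    except (ValueError, KeyError, TypeError):
        return None  # Unreadable pointer: use the local catalog instead
```

A caller would download `LATEST_POINTER` from the dataset repo, pass its text to `resolve_version`, and load the local `items.json` whenever the result is `None`.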

## 🎨 Default Label Catalog

The bundled `items.json` includes 50+ kid-friendly objects with:
- Unique IDs and display names
- CLIP-optimized prompts
- Category metadata
- Fun facts and rarity ratings

Categories include animals, toys, food, vehicles, nature, and everyday objects.

## ⚡ Performance Optimization

- **GPU Acceleration**: Automatic CUDA detection with float16 inference
- **CPU Fallback**: Graceful degradation with float32 precision
- **Embedding Cache**: Pre-computed text embeddings updated on label changes
- **Re-parameterization**: MobileOne blocks optimized for inference speed
- **Batch Processing**: Efficient matrix operations for multi-label scoring
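
To make the embedding cache and multi-label scoring concrete, the sketch below scores one image embedding against cached text embeddings using cosine similarity followed by a softmax, then returns the top-k labels. It uses plain Python lists so it runs without dependencies; the real application would presumably do this as a single matrix product on tensors. The function names and the `logit_scale` value of 100 are assumptions, not values read from the model.

```python
import math
from typing import List, Sequence, Tuple


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def classify(
    image_embedding: Sequence[float],
    text_embeddings: Sequence[Sequence[float]],  # cached, normalized, one row per label
    names: Sequence[str],
    top_k: int = 10,
    logit_scale: float = 100.0,  # ASSUMPTION: a typical CLIP temperature
) -> List[Tuple[str, float]]:
    """Return the top_k (label, probability) pairs, highest first."""
    if not text_embeddings:
        return []
    image = l2_normalize(image_embedding)
    # Dot products of unit vectors are cosine similarities.
    logits = [
        logit_scale * sum(a * b for a, b in zip(image, row))
        for row in text_embeddings
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    scored = sorted(
        zip(names, (e / total for e in exps)),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return scored[:top_k]
```

Because the text embeddings are computed once per label change, each request only pays for one image encoding plus this scoring step.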

## 🔐 Security Considerations

- **Token Protection**: Admin operations require `ADMIN_TOKEN`
- **Private Datasets**: Keep label repos private for sensitive applications
- **Input Validation**: Automatic sanitization of uploaded images
- **Memory Management**: Images processed and discarded after inference

## 📄 License

- **Model Weights**: Apple Sample Code License (ASCL)
- **Interface Code**: MIT License

## 🤝 Contributing

Contributions welcome! Areas for improvement:
- Additional label management features
- Performance optimizations
- Extended API capabilities
- Multi-language support

## 📚 Resources

- [MobileCLIP Paper](https://arxiv.org/abs/2311.17049)
- [OpenCLIP Library](https://github.com/mlfoundations/open_clip)
- [Gradio Documentation](https://gradio.app/docs)
- [Hugging Face Spaces](https://huggingface.co/spaces)