🔍 DeepSeek-OCR-WebUI

Intelligent OCR System · Vue 3 Modern UI · Batch Processing · Multi-Mode Support

🎉 v4.1 Update: UI Improvements & Model Version Display

Header shows OCR-2 model badge · Footer displays v4.1 · OCR-2

🏷️ OCR-2 Model Badge — Header now shows a prominent OCR-2 badge so users instantly know the model version
🎨 Table Rendering Fix — OCR-detected tables now display with white backgrounds, dark text, and zebra striping for clear readability (previously appeared as dark/unreadable blocks)
📡 Health API model_version — /health endpoint now returns "model_version": "DeepSeek-OCR-2" for programmatic version detection
🔖 Footer Version — Updated to v4.1 · OCR-2
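
The programmatic version check mentioned above could look roughly like the following. This is a minimal sketch, assuming the service listens on http://localhost:8001 (the port used in Quick Start) and that /health returns a JSON object; the helper names are hypothetical, and no fields other than model_version are assumed.

```python
import json
from urllib.request import urlopen


def extract_model_version(payload: dict) -> str:
    """Return the model_version field, or "unknown" when the server omits it."""
    return str(payload.get("model_version", "unknown"))


def fetch_model_version(base_url: str = "http://localhost:8001",
                        timeout: float = 5.0) -> str:
    """GET /health and read the model version from the JSON body."""
    with urlopen(f"{base_url}/health", timeout=timeout) as resp:
        return extract_model_version(json.load(resp))
```

Calling fetch_model_version() against a running v4.1 instance should return the string documented above; older images that predate the field would yield "unknown" under this sketch.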

🎉 v4.0 Update: DeepSeek-OCR-2 Model Upgrade!

🚀 Major model upgrade to DeepSeek-OCR-2 (Visual Causal Flow) — better accuracy, higher resolution!

✨ What's New in v4.0

🧠 DeepSeek-OCR-2 Model - Upgraded to the latest DeepSeek-OCR-2 with Visual Causal Flow architecture
🔬 Higher Resolution - Dynamic resolution up to (0-6)×768×768 + 1×1024×1024 (was 640×640)
⚡ Flash Attention 2 - Native flash_attention_2 support on CUDA for optimal inference speed
🎯 Improved Accuracy - Better document understanding, chart parsing, and text recognition
🔄 Full Backward Compatibility - All 7 recognition modes, REST API, and frontend unchanged
🐳 Docker v4.0 - New all-in-one image with pre-downloaded OCR-2 model (Dockerfile.v4.0)
📦 Unified Tokenizer - Switched from AutoProcessor to AutoTokenizer (aligned with official OCR-2 API)

🔧 Technical Changes

Component	v3.6 (OCR v1)	v4.0 (OCR-2)
Model	`deepseek-ai/DeepSeek-OCR`	`deepseek-ai/DeepSeek-OCR-2`
`image_size`	640	768
Attention	`eager`	`flash_attention_2` (CUDA)
Tokenizer	`AutoProcessor`	`AutoTokenizer`
Resolution	Fixed crops	Dynamic (0-6)×768 + 1×1024
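
To make the Resolution row concrete, the sketch below works out the input pixel budget. It reads "(0-6)×768 + 1×1024" as zero to six 768×768 local crops plus one 1024×1024 global view; that reading and the names used here are assumptions, not something taken from the model code.

```python
LOCAL_TILE = 768     # side length of each local crop, from the table above
GLOBAL_VIEW = 1024   # side length of the single global view
MAX_LOCAL_TILES = 6  # upper end of the "(0-6)" range


def pixel_budget(local_tiles: int) -> int:
    """Total input pixels for local_tiles crops plus one global view."""
    if not 0 <= local_tiles <= MAX_LOCAL_TILES:
        raise ValueError(f"local_tiles must be in 0..{MAX_LOCAL_TILES}")
    return local_tiles * LOCAL_TILE ** 2 + GLOBAL_VIEW ** 2
```

Under this reading the budget ranges from 1,048,576 pixels (global view only) to 4,587,520 pixels (six crops plus the global view), compared with 409,600 pixels for a single 640×640 input.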

💡 All existing features from v3.6 (concurrency, rate limiting, queue management, Vue 3 frontend) are fully preserved.

🎉 v3.6 Update: Backend Concurrency & Rate Limiting!

🚀 Performance optimization with smart queue management and rate limiting!

✨ What's New in v3.6

⚡ Backend Concurrency Optimization - Non-blocking inference with ThreadPoolExecutor
🔒 Rate Limiting - Per-client and per-IP request limits (X-Client-ID header support)
📊 Queue Management - Real-time queue status with position tracking
🏥 Enhanced Health API - Queue depth, status (healthy/busy/full), and rate limit info
🌐 New Languages - Added Traditional Chinese (zh-TW) and Japanese (ja-JP)
🎯 429 Error Handling - Graceful handling when queue is full or rate limited
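
On the client side, the X-Client-ID header and 429 responses might be handled along these lines. This is a sketch only: the retry count and exponential backoff are illustrative choices rather than documented server behavior, and post_with_retry and the send callable are hypothetical names.

```python
import time
from typing import Callable, Tuple

# A sender performs one HTTP request with the given headers
# and returns (status_code, parsed_body).
Sender = Callable[[dict], Tuple[int, dict]]


def post_with_retry(send: Sender, client_id: str, max_attempts: int = 4,
                    base_delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> Tuple[int, dict]:
    """Send a request, backing off and retrying while the server answers 429."""
    headers = {"X-Client-ID": client_id}
    status, body = 0, {}
    for attempt in range(max_attempts):
        status, body = send(headers)
        if status != 429:
            break
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...
    return status, body
```

In practice send would wrap a requests.post call to /ocr that reopens the file on each attempt and passes headers=headers; keeping it injectable makes the retry logic easy to test without a running server.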

🙏 Contributors: @cloudman6 (PR #41)

🎉 v3.5 Major Update: Brand New Vue 3 Frontend!

🚀 Complete UI Overhaul with Modern Vue 3 + TypeScript Architecture!

✨ What's New in v3.5

🎨 Brand New Vue 3 UI - Modern, responsive design with Naive UI components
⚡ TypeScript Support - Full type safety and better developer experience
📦 Dexie.js Database - Local IndexedDB for offline page management
🔄 Real-time Processing Queue - Visual OCR progress with queue management
🏥 Health Check System - Backend status monitoring with visual indicators
📄 Enhanced PDF Support - Smooth PDF rendering with page-by-page processing
🌐 i18n Ready - Built-in internationalization (EN/CN/TW/JP)
🧪 E2E Testing - Comprehensive Playwright test coverage

👥 Contributors

🌟 Special Thanks to Our Amazing Contributors! 🌟

This project is the result of an outstanding collaboration. The Vue 3 frontend was contributed and merged through PR #34.

Contributor	Role	Contributions
CloudMan	🏆 Vue 3 Frontend Lead Developer	164 commits · Complete UI Rewrite
neosun100	🎯 Project Maintainer	Backend · Docker · Integration

💡 About the Vue 3 Frontend: @cloudman6 contributed an exceptional Vue 3 + TypeScript frontend with 164 commits, including comprehensive E2E tests, modern UI components, and production-ready architecture. This collaboration transformed DeepSeek-OCR-WebUI into a professional-grade application!

📖 Introduction

DeepSeek-OCR-WebUI is an intelligent document recognition web application powered by the DeepSeek-OCR model family (DeepSeek-OCR-2 since v4.0). It provides a modern, intuitive interface for converting images and PDFs to structured text with high accuracy.

✨ Core Highlights

Feature	Description
🎯 7 Recognition Modes	Doc to Markdown, General OCR, Plain Text, Chart Parser, Image Description, Find & Locate, Custom Prompt
🖼️ Bounding Box Visualization	Find mode with automatic position annotation
📦 Batch Processing	Process multiple images/pages sequentially
📄 PDF Support	Upload PDFs, auto-convert to images
🎨 Modern Vue 3 UI	Responsive design with Naive UI
🌐 Multilingual	EN, 简体中文, 繁體中文, 日本語
🍎 Apple Silicon	Native MPS acceleration for M1/M2/M3/M4
🐳 Docker Ready	One-command deployment
⚡ GPU Acceleration	NVIDIA CUDA support

🚀 Features

7 Recognition Modes

Mode	Icon	Description	Use Cases
Doc to Markdown	📄	Preserve format and layout	Contracts, papers, reports
General OCR	📝	Extract all visible text	Image text extraction
Plain Text	📋	Pure text without format	Simple text recognition
Chart Parser	📊	Recognize charts and formulas	Data charts, math formulas
Image Description	🖼️	Generate detailed descriptions	Image understanding
Find & Locate	🔍	Find and annotate positions	Invoice field locating
Custom Prompt	✨	Customize recognition needs	Flexible tasks

🆕 Vue 3 Frontend Features

┌─────────────────────────────────────────────────────────────┐
│  📁 Page Sidebar          │  📄 Document Viewer             │
│  ├─ Thumbnail List        │  ├─ High-res Image Display      │
│  ├─ Drag & Drop Reorder   │  ├─ OCR Overlay Toggle          │
│  ├─ Batch Selection       │  ├─ Zoom Controls               │
│  └─ Quick Actions         │  └─ Status Indicators           │
├─────────────────────────────────────────────────────────────┤
│  🔄 Processing Queue      │  📝 Result Panel                │
│  ├─ Real-time Progress    │  ├─ Markdown Preview            │
│  ├─ Cancel/Retry          │  ├─ Word/PDF Export             │
│  └─ Health Monitoring     │  └─ Copy to Clipboard           │
└─────────────────────────────────────────────────────────────┘

🖼️ Screenshots

Home Page

Clean, modern landing page with quick access to all features

Processing Interface

Full-featured document processing with sidebar, viewer, and results panel

Quick Start Guide

Step-by-step guide: Import files → Select pages → Choose OCR mode → Get results

📦 Quick Start

🐳 Docker (Recommended)

# Pull and run
docker pull neosun/deepseek-ocr:v4.1
docker run -d \
  --name deepseek-ocr \
  --gpus all \
  -p 8001:8001 \
  --shm-size=8g \
  neosun/deepseek-ocr:v4.1

# Access: http://localhost:8001
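
Right after docker run the container may need some time before it answers requests. A small readiness check such as the one below can gate scripts on startup. It is a sketch under assumptions: the attempt count and interval are arbitrary, an HTTP 200 from /health is treated as "ready", and the function names are hypothetical.

```python
import time
from typing import Callable
from urllib.error import URLError
from urllib.request import urlopen


def health_probe(url: str = "http://localhost:8001/health") -> bool:
    """Return True if /health answers with HTTP 200, False on any failure."""
    try:
        with urlopen(url, timeout=3) as resp:
            return resp.status == 200
    except (URLError, OSError):
        return False


def wait_until_ready(probe: Callable[[], bool] = health_probe,
                     attempts: int = 30, interval: float = 2.0,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll the probe until it succeeds or the attempts run out."""
    for i in range(attempts):
        if probe():
            return True
        if i < attempts - 1:
            sleep(interval)
    return False
```

A deployment script could call wait_until_ready() and abort with a clear message if it returns False.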

Available Docker Tags

Tag	Description
`latest`	Latest stable (= v4.1)
`v4.1`	UI improvements & model version display
`v4.0`	DeepSeek-OCR-2 model upgrade
`v3.6`	Backend concurrency & rate limiting
`v3.5`	Vue 3 frontend version
`v3.3.1-fix-bfloat16`	BFloat16 compatibility fix

🍎 Mac (Apple Silicon)

# Clone and setup
git clone https://github.com/neosun100/DeepSeek-OCR-WebUI.git
cd DeepSeek-OCR-WebUI

# Create conda environment
conda create -n deepseek-ocr python=3.11
conda activate deepseek-ocr

# Install dependencies
pip install -r requirements-mac.txt

# Start service
./start.sh
# Access: http://localhost:8001

🐧 Linux (Native)

# With NVIDIA GPU
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
./start.sh

🔌 API & Integration

REST API

import requests

# Single image OCR
with open("image.png", "rb") as f:
    response = requests.post(
        "http://localhost:8001/ocr",
        files={"file": f},
        data={"prompt_type": "ocr"},
    )
# Raise on HTTP errors such as 429 (queue full or rate limited)
response.raise_for_status()
print(response.json()["text"])

# PDF OCR (all pages)
with open("document.pdf", "rb") as f:
    response = requests.post(
        "http://localhost:8001/ocr-pdf",
        files={"file": f},
        data={"prompt_type": "document"},
    )
response.raise_for_status()
print(response.json()["merged_text"])

Endpoints:

GET /health - Health check
POST /ocr - Single image OCR
POST /ocr-pdf - PDF OCR (all pages)
POST /pdf-to-images - Convert PDF to images
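
For sequential batch processing over the POST /ocr endpoint, a thin wrapper like the following may be enough. It is a sketch: the suffix filter, function names, and timeout are assumptions, while the request shape and the "text" response key follow the single-image example above.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, List

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}  # assumed filter, adjust as needed


def collect_images(folder: Path) -> List[Path]:
    """Return image files in folder, sorted for a stable processing order."""
    return sorted(p for p in folder.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES)


def ocr_via_http(path: Path, base_url: str = "http://localhost:8001") -> str:
    """OCR one image through POST /ocr, mirroring the example above."""
    import requests  # imported lazily so the helpers work without it installed

    with open(path, "rb") as f:
        response = requests.post(f"{base_url}/ocr", files={"file": f},
                                 data={"prompt_type": "ocr"}, timeout=300)
    response.raise_for_status()
    return response.json()["text"]


def ocr_batch(paths: Iterable[Path],
              ocr_one: Callable[[Path], str] = ocr_via_http) -> Dict[str, str]:
    """Process files one after another and map each file name to its text."""
    return {path.name: ocr_one(path) for path in paths}
```

For example, ocr_batch(collect_images(Path("scans"))) would send every image in a hypothetical scans folder to the local service, one request at a time.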

📖 Full API Documentation: API.md

MCP (Model Context Protocol)

Enable AI assistants like Claude Desktop to use OCR:

{
  "mcpServers": {
    "deepseek-ocr": {
      "command": "python",
      "args": ["/path/to/mcp_server.py"]
    }
  }
}
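
If the client already has other MCP servers configured, the entry above needs to be merged rather than pasted over the file. The helper below sketches that merge on an in-memory dict; add_mcp_server is a hypothetical name, and locating and writing the client's configuration file is left to MCP_SETUP.md.

```python
import copy


def add_mcp_server(config: dict, name: str, script_path: str,
                   command: str = "python") -> dict:
    """Return a copy of config with one more entry under "mcpServers"."""
    merged = copy.deepcopy(config)  # leave the caller's dict untouched
    servers = merged.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": [script_path]}
    return merged
```

Dumping the result with json.dumps(merged, indent=2) produces the same shape as the snippet above, with any existing servers preserved.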

📖 MCP Setup Guide: MCP_SETUP.md

🌐 Multilingual Support

Language	Code	Status
🇺🇸 English	en-US	✅ Default
🇨🇳 简体中文	zh-CN	✅
🇹🇼 繁體中文	zh-TW	✅
🇯🇵 日本語	ja-JP	✅

Switch language via the selector in the top-right corner.

📊 Version History

v4.1 (2026-02-20) - UI Improvements & Model Version Display

🏷️ UI & API Enhancements:

✅ OCR-2 model badge in header for instant version recognition
✅ Table rendering fix: white background, dark text, zebra striping
✅ Health API returns model_version: "DeepSeek-OCR-2"
✅ Footer updated to v4.1 · OCR-2

v4.0 (2026-02-20) - DeepSeek-OCR-2 Model Upgrade

🧠 Major Model Upgrade:

✅ Upgraded to DeepSeek-OCR-2 (Visual Causal Flow)
✅ Dynamic resolution: (0-6)×768×768 + 1×1024×1024
✅ Flash Attention 2 on CUDA for optimal inference speed
✅ Switched from AutoProcessor to AutoTokenizer
✅ image_size upgraded from 640 to 768
✅ New Dockerfile.v4.0 with pre-downloaded OCR-2 model
✅ Full backward compatibility with all v3.6 features

v3.6 (2026-01-20) - Backend Concurrency & Rate Limiting

⚡ Performance Optimization:

✅ Non-blocking inference with ThreadPoolExecutor
✅ Concurrency control with asyncio.Semaphore (OCR: 1, PDF: 2)
✅ Queue system with MAX_OCR_QUEUE_SIZE and dynamic status
✅ Per-IP and per-Client-ID rate limiting (X-Client-ID header)
✅ 429 error handling (queue full, client limit, IP limit)
✅ Health indicator with 3 status colors (green/yellow/red)
✅ OCR queue popover with real-time position display
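
The pattern behind these items (a semaphore limiting concurrent inference, a thread pool keeping the event loop responsive, and a bounded queue that rejects overflow) can be sketched as follows. This is not the project's actual code: OcrGate and QueueFull are hypothetical names, and the MAX_OCR_QUEUE_SIZE value is illustrative because the real default is not stated here.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

MAX_OCR_QUEUE_SIZE = 8  # illustrative value only


class QueueFull(Exception):
    """Raised when the queue is full; an HTTP layer would map this to 429."""


class OcrGate:
    def __init__(self, max_queue: int = MAX_OCR_QUEUE_SIZE, concurrency: int = 1):
        self._semaphore = asyncio.Semaphore(concurrency)  # OCR: 1 in the list above
        self._executor = ThreadPoolExecutor(max_workers=concurrency)
        self._max_queue = max_queue
        self.depth = 0  # requests currently waiting or running

    async def submit(self, blocking_fn, *args):
        """Run blocking_fn in a worker thread, or reject if the queue is full."""
        if self.depth >= self._max_queue:
            raise QueueFull("OCR queue is full")
        self.depth += 1
        try:
            async with self._semaphore:
                loop = asyncio.get_running_loop()
                return await loop.run_in_executor(self._executor, blocking_fn, *args)
        finally:
            self.depth -= 1
```

Because the blocking model call runs in the executor, the event loop stays free to answer health checks and report queue positions while inference is in progress. Creating the gate inside a running event loop avoids loop-binding issues on older Python versions.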

🙏 Contributors: @cloudman6 (PR #41)

v3.5 (2026-01-17) - Vue 3 Frontend

🎨 Complete UI Overhaul:

✅ Vue 3 + TypeScript + Naive UI
✅ Dexie.js local database
✅ Real-time processing queue
✅ Health check monitoring
✅ E2E test coverage (Playwright)
✅ GitHub links in header

🙏 Contributors: @cloudman6 (164 commits)

v3.3.1 (2025-12-16) - BFloat16 Fix

✅ Fixed GPU compatibility for RTX 20xx, GTX 10xx
✅ Auto-detect compute capability
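
The idea behind the compute-capability check can be illustrated as below. Native bfloat16 arithmetic arrived with compute capability 8.0 (Ampere), while RTX 20xx cards report 7.5 and GTX 10xx cards report 6.x. Falling back to float16 is an assumption of this sketch; the project may choose a different fallback, and pick_dtype is a hypothetical name.

```python
def pick_dtype(major: int, minor: int = 0) -> str:
    """Choose a dtype name from a CUDA compute capability (major, minor)."""
    # bfloat16 is natively supported from compute capability 8.0 upward.
    if (major, minor) >= (8, 0):
        return "bfloat16"
    return "float16"  # assumed fallback for older GPUs
```

With PyTorch installed, the capability tuple can be obtained from torch.cuda.get_device_capability() and passed in as pick_dtype(*capability).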

v3.3 (2025-11-05) - Apple Silicon

✅ Native MPS backend for Mac M1/M2/M3/M4
✅ Multi-platform architecture

v3.2 (2025-11-04) - PDF Support

✅ PDF upload and conversion
✅ ModelScope auto-fallback

📖 Documentation

Document	Description
API.md	REST API reference
MCP_SETUP.md	MCP integration guide
DOCKER_HUB.md	Docker deployment
CHANGELOG.md	Version history

📈 Star History

⭐ If this project helps you, please give it a Star! ⭐

🤝 Contributing

Contributions welcome! Please:

1. Fork this repository
2. Create a feature branch (git checkout -b feature/AmazingFeature)
3. Commit your changes (git commit -m 'Add AmazingFeature')
4. Push to the branch (git push origin feature/AmazingFeature)
5. Open a Pull Request

📄 License

This project is licensed under the MIT License.

🙏 Acknowledgments

DeepSeek-AI - DeepSeek-OCR model
@cloudman6 - Vue 3 frontend development
All contributors and users

Made with ❤️ by neosun100 & cloudman6
