# paddleocr-vl-cpu
**Repository Path**: cjy13/paddleocr-vl-cpu
## Basic Information
- **Project Name**: paddleocr-vl-cpu
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-12-18
- **Last Updated**: 2025-12-18
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
PaddleOCR-VL CPU API Server
**🎯 Run PaddleOCR-VL on CPU with OpenAI-compatible API**
**在 CPU 上运行 PaddleOCR-VL,提供 OpenAI 兼容 API**
[English](#english) | [简体中文](#chinese)
---
## 🌟 Why This Project?
Break the GPU barrier! This project enables you to run PaddleOCR-VL (PaddlePaddle's excellent OCR model) on CPU with an OpenAI-compatible API interface.
| Feature | Official PaddleOCR | This Project |
|---------|-------------------|--------------|
| **CPU Support** | ❌ GPU Required | ✅ Full CPU Support |
| **API Format** | Custom | OpenAI-compatible |
| **Framework** | PaddlePaddle | Pure PyTorch |
| **Deployment** | Manual setup | One-line script |
| **Service Management** | Manual | tmux automation |
### ✨ Key Features
- 🖥️ **CPU-Friendly** - Optimized for CPU inference with automatic memory management
- 🔌 **OpenAI-Compatible** - Drop-in replacement for OpenAI's vision API
- 🚀 **Easy Deployment** - One command to install and start
- ⚡ **Streaming Support** - Real-time response streaming
- 💾 **Auto Memory Cleanup** - Prevents memory leaks for long-running services
- 🖼️ **Multiple Formats** - Supports JPEG, PNG, BMP, Base64, and URLs
---
## 🚀 Quick Start
### Prerequisites
- Python 3.12.3
- 16GB+ RAM (recommended for CPU inference)
- Ubuntu/Linux (Windows users please use WSL)
### Installation
```bash
# Clone repository
git clone https://github.com/Think-Core/paddleocr-vl-cpu.git
cd paddleocr-vl-cpu
# One-click setup (downloads model automatically ~2GB)
chmod +x quick_init_setup.sh
./quick_init_setup.sh
```
The script will:
1. ✅ Create Python virtual environment
2. ✅ Install all dependencies
3. ✅ Download PaddleOCR-VL-0.9B model (~2GB)
4. ✅ Verify installation
### Start Service
```bash
# Start server
./quick_start.sh start
# Check status
./quick_start.sh status
# View logs
./quick_start.sh attach # Press Ctrl+B then D to detach
# Stop server
./quick_start.sh stop
```
Server runs on `http://localhost:7777`
---
## 📖 Usage Examples
### Basic OCR
```python
import requests
import base64
# Read and encode image
with open("document.jpg", "rb") as f:
image_b64 = base64.b64encode(f.read()).decode()
# OCR request
response = requests.post(
"http://localhost:7777/v1/chat/completions",
json={
"model": "paddleocr-vl",
"messages": [{
"role": "user",
"content": [
{"type": "text", "text": "Extract all text from this image"},
{
"type": "image_url",
"image_url": {
"url": f"data:image/jpeg;base64,{image_b64}"
}
}
]
}],
"max_tokens": 2048
}
)
print(response.json()["choices"][0]["message"]["content"])
```
### Streaming Response
```python
import json
response = requests.post(
"http://localhost:7777/v1/chat/completions",
json={
"model": "paddleocr-vl",
"messages": [...], # same as above
"stream": True
},
stream=True
)
for line in response.iter_lines():
if line:
line = line.decode('utf-8')
if line.startswith('data: '):
data = line[6:]
if data != '[DONE]':
chunk = json.loads(data)
content = chunk["choices"][0]["delta"].get("content", "")
print(content, end='', flush=True)
```
### cURL Example
```bash
curl -X POST http://localhost:7777/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "paddleocr-vl",
"messages": [{
"role": "user",
"content": [
{"type": "text", "text": "Read this document"},
{
"type": "image_url",
"image_url": {"url": "data:image/jpeg;base64,{image_b64}"}
}
]
}]
}'
```
---
## 🎯 Use Cases
- 📄 **Document Digitization** - Convert scanned documents to text
- 🧾 **Invoice Processing** - Extract invoice information automatically
- 📋 **Form Recognition** - Parse structured forms and tables
- 🔍 **ID Card Reading** - Extract text from ID cards and certificates
- 📚 **Book Scanning** - Digitize printed books and articles
- 📝 **Handwriting OCR** - Recognize handwritten text (limited accuracy)
---
## ⚠️ Important Notes
### Security Warning
**🔴 This server has NO authentication by default!**
**Do NOT expose directly to the internet!**
For production deployment:
- ✅ Use reverse proxy (Nginx/Caddy) with HTTPS
- ✅ Add API key authentication
- ✅ Configure firewall rules
- ✅ Enable rate limiting
- ✅ Set up monitoring
### Limitations
- ❌ CPU-only (no GPU acceleration in this version)
- ❌ Single request processing (no parallel requests)
- ❌ No built-in authentication
- ❌ Model files (~2GB) not included in repository
---
## 🛠️ Troubleshooting
### Model Download Failed
```bash
# Manual download
mkdir -p models && cd models
git lfs install
git clone https://www.modelscope.cn/PaddlePaddle/PaddleOCR-VL.git PaddleOCR-VL-0.9B
```
### Out of Memory
- Reduce image size
- Lower `max_tokens`
- Restart service: `./quick_start.sh restart`
### Service Won't Start
```bash
# Check port usage
lsof -i :7777
# View logs
./quick_start.sh attach
```
---
## 📁 Project Structure
```
paddleocr-vl-cpu/
├── paddleocr_api.py # Main API server
├── requirements_frozen.txt # Locked dependencies
├── quick_init_setup.sh # Setup script
├── quick_start.sh # Service management
├── python_version.txt # Python version lock
└── models/ # Model files (gitignored)
```
---
## 🙏 Acknowledgments
- **PaddleOCR Team** - For the excellent [PaddleOCR-VL model](https://www.modelscope.cn/PaddlePaddle/PaddleOCR-VL)
- **@hjdhnx** - For inspiration from [this discussion](https://linux.do/t/topic/1054681)
- **FastAPI** - Modern web framework
- **Transformers** - Hugging Face's ML library
---
## 📄 License
MIT License - see [LICENSE](LICENSE) file for details.
**Note:** PaddleOCR-VL model has its own license. Check the [model repository](https://www.modelscope.cn/PaddlePaddle/PaddleOCR-VL) for details.
---
## 📧 Support
- 🐛 **Bug Reports**: [GitHub Issues](https://github.com/Think-Core/paddleocr-vl-cpu/issues)
- 💬 **Discussions**: [GitHub Discussions](https://github.com/Think-Core/paddleocr-vl-cpu/discussions)
- ⭐ **Star this project** if you find it helpful!
---
# 中文文档
## 🌟 为什么选择这个项目?
打破 GPU 限制!本项目让你能在 CPU 上运行 PaddleOCR-VL(飞桨团队优秀的 OCR 模型),并提供 OpenAI 兼容的 API 接口。
| 特性 | 官方 PaddleOCR | 本项目 |
|-----|---------------|--------|
| **CPU 支持** | ❌ 需要 GPU | ✅ 完整 CPU 支持 |
| **API 格式** | 自定义 | OpenAI 兼容 |
| **框架依赖** | PaddlePaddle | 纯 PyTorch |
| **流式输出** | ❌ | ✅ 实时流式 |
| **部署方式** | 手动配置 | 一行命令 |
| **服务管理** | 手动管理 | tmux 自动化 |
### ✨ 核心特点
- 🖥️ **CPU 友好** - 针对 CPU 推理优化,自动内存管理
- 🔌 **OpenAI 兼容** - 可直接替换 OpenAI 视觉 API
- 🚀 **快速部署** - 一条命令完成安装和启动
- ⚡ **流式支持** - 实时流式响应
- 💾 **自动清理** - 防止长时间运行的内存泄漏
- 🖼️ **多格式支持** - 支持 JPEG、PNG、BMP、Base64、URL
---
## 🚀 快速开始
### 环境要求
- Python 3.12.3
- 16GB+ 内存(CPU 推理推荐配置)
- Ubuntu/Linux(Windows 用户请使用 WSL)
### 安装
```bash
# 克隆仓库
git clone https://github.com/Think-Core/paddleocr-vl-cpu.git
cd paddleocr-vl-cpu
# 一键安装(自动下载模型 约2GB)
chmod +x quick_init_setup.sh
./quick_init_setup.sh
```
脚本会自动完成:
1. ✅ 创建 Python 虚拟环境
2. ✅ 安装所有依赖
3. ✅ 下载 PaddleOCR-VL-0.9B 模型(约 2GB)
4. ✅ 验证安装
### 启动服务
```bash
# 启动服务器
./quick_start.sh start
# 检查状态
./quick_start.sh status
# 查看日志
./quick_start.sh attach # 按 Ctrl+B 然后 D 退出
# 停止服务
./quick_start.sh stop
```
服务运行在 `http://localhost:7777`
---
## 📖 使用示例
### 基础 OCR
```python
import requests
import base64
# 读取并编码图片
with open("document.jpg", "rb") as f:
image_b64 = base64.b64encode(f.read()).decode()
# OCR 请求
response = requests.post(
"http://localhost:7777/v1/chat/completions",
json={
"model": "paddleocr-vl",
"messages": [{
"role": "user",
"content": [
{"type": "text", "text": "识别这张图片中的所有文字"},
{
"type": "image_url",
"image_url": {
"url": f"data:image/jpeg;base64,{image_b64}"
}
}
]
}],
"max_tokens": 2048
}
)
print(response.json()["choices"][0]["message"]["content"])
```
### 流式响应
```python
import json
response = requests.post(
"http://localhost:7777/v1/chat/completions",
json={
"model": "paddleocr-vl",
"messages": [...], # 同上
"stream": True
},
stream=True
)
for line in response.iter_lines():
if line:
line = line.decode('utf-8')
if line.startswith('data: '):
data = line[6:]
if data != '[DONE]':
chunk = json.loads(data)
content = chunk["choices"][0]["delta"].get("content", "")
print(content, end='', flush=True)
```
### cURL 示例
```bash
curl -X POST http://localhost:7777/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "paddleocr-vl",
"messages": [{
"role": "user",
"content": [
{"type": "text", "text": "识别文字"},
{
"type": "image_url",
"image_url": {"url": "data:image/jpeg;base64,{image_b64}"}
}
]
}]
}'
```
---
## 🎯 应用场景
- 📄 **文档数字化** - 将扫描文档转换为文本
- 🧾 **发票处理** - 自动提取发票信息
- 📋 **表单识别** - 解析结构化表单和表格
- 🔍 **证件识别** - 提取身份证、证书等文字
- 📚 **书籍扫描** - 数字化纸质书籍和文章
- 📝 **手写识别** - 识别手写文字(准确度有限)
---
## ⚠️ 重要提示
### 安全警告
**🔴 服务器默认无身份验证!**
**切勿直接暴露到公网!**
生产部署建议:
- ✅ 使用反向代理(Nginx/Caddy)配置 HTTPS
- ✅ 添加 API 密钥验证
- ✅ 配置防火墙规则
- ✅ 启用速率限制
- ✅ 设置监控告警
### 限制说明
- ❌ 仅支持 CPU(本版本不含 GPU 加速)
- ❌ 单次请求处理(不支持并发请求)
- ❌ 无内置身份验证
- ❌ 模型文件(约 2GB)不包含在仓库中
---
## 🛠️ 故障排除
### 模型下载失败
```bash
# 手动下载
mkdir -p models && cd models
git lfs install
git clone https://www.modelscope.cn/PaddlePaddle/PaddleOCR-VL.git PaddleOCR-VL-0.9B
```
### 内存不足
- 压缩图片尺寸
- 降低 `max_tokens`
- 重启服务:`./quick_start.sh restart`
### 服务无法启动
```bash
# 检查端口占用
lsof -i :7777
# 查看日志
./quick_start.sh attach
```
---
## 📁 项目结构
```
paddleocr-vl-cpu/
├── paddleocr_api.py # API 服务器主程序
├── requirements_frozen.txt # 锁定的依赖版本
├── quick_init_setup.sh # 安装脚本
├── quick_start.sh # 服务管理脚本
├── python_version.txt # Python 版本锁定
└── models/ # 模型文件(已忽略)
```
---
## 🙏 致谢
- **PaddleOCR 团队** - 提供出色的 [PaddleOCR-VL 模型](https://www.modelscope.cn/PaddlePaddle/PaddleOCR-VL)
- **@hjdhnx** - 来自 [这个讨论](https://linux.do/t/topic/1054681) 的启发
- **FastAPI** - 现代化 Web 框架
- **Transformers** - Hugging Face 的 ML 库
---
## 📄 许可证
MIT 许可证 - 详见 [LICENSE](LICENSE) 文件。
**注意:** PaddleOCR-VL 模型有自己的许可证,请查看 [模型仓库](https://www.modelscope.cn/PaddlePaddle/PaddleOCR-VL) 了解详情。
---
## 📧 支持
- 🐛 **错误报告**:[GitHub Issues](https://github.com/Think-Core/paddleocr-vl-cpu/issues)
- 💬 **问题讨论**:[GitHub Discussions](https://github.com/Think-Core/paddleocr-vl-cpu/discussions)
- ⭐ **觉得有用请点星!**
---
**Made with ❤️ for the OCR community**
**用 ❤️ 为 OCR 社区打造**
[⭐ Star](https://github.com/Think-Core/paddleocr-vl-cpu) · [🐛 Report Bug](https://github.com/Think-Core/paddleocr-vl-cpu/issues) · [💡 Request Feature](https://github.com/Think-Core/paddleocr-vl-cpu/issues)