# local-deep-research

**Repository Path**: devdz/local-deep-research

## Basic Information

- **Project Name**: local-deep-research
- **Description**: Local Deep Research achieves ~95% on SimpleQA benchmark (tested with GPT-4.1-mini). Supports local and cloud LLMs (Ollama, Google, Anthropic, ...). Searches 10+ sources - arXiv, PubMed, web, and your private documents. Everything Local & Encrypted.
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2026-03-09
- **Last Updated**: 2026-03-17

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Local Deep Research

<div align="center">

[![GitHub stars](https://img.shields.io/github/stars/LearningCircuit/local-deep-research?style=for-the-badge)](https://github.com/LearningCircuit/local-deep-research/stargazers)
[![Docker Pulls](https://img.shields.io/docker/pulls/localdeepresearch/local-deep-research?style=for-the-badge)](https://hub.docker.com/r/localdeepresearch/local-deep-research)
[![PyPI Downloads](https://img.shields.io/pypi/dm/local-deep-research?style=for-the-badge)](https://pypi.org/project/local-deep-research/)

[![Trendshift](https://trendshift.io/api/badge/repositories/14116)](https://trendshift.io/repositories/14116)

[![Commits](https://img.shields.io/github/commit-activity/m/LearningCircuit/local-deep-research?style=for-the-badge)](https://github.com/LearningCircuit/local-deep-research/commits/main)
[![Last Commit](https://img.shields.io/github/last-commit/LearningCircuit/local-deep-research?style=for-the-badge)](https://github.com/LearningCircuit/local-deep-research/commits/main)

[![SimpleQA Accuracy](https://img.shields.io/badge/SimpleQA-~95%25_Accuracy-gold?style=for-the-badge)](https://github.com/LearningCircuit/local-deep-research/tree/main/community_benchmark_results)
[![SQLCipher](https://img.shields.io/badge/Database-SQLCipher_Encrypted-red?style=for-the-badge&logo=sqlite&logoColor=white)](docs/SQLCIPHER_INSTALL.md)

[![🔒 Security](https://github.com/LearningCircuit/local-deep-research/actions/workflows/security-tests.yml/badge.svg?branch=main&style=for-the-badge)](https://github.com/LearningCircuit/local-deep-research/actions/workflows/security-tests.yml)
[![CodeQL](https://github.com/LearningCircuit/local-deep-research/actions/workflows/codeql.yml/badge.svg?branch=main&style=for-the-badge)](https://github.com/LearningCircuit/local-deep-research/security/code-scanning)
[![🔍 Semgrep](https://github.com/LearningCircuit/local-deep-research/actions/workflows/semgrep.yml/badge.svg?branch=main&style=for-the-badge)](https://github.com/LearningCircuit/local-deep-research/actions/workflows/semgrep.yml)
[![OpenSSF Scorecard](https://api.securityscorecards.dev/projects/github.com/LearningCircuit/local-deep-research/badge)](https://securityscorecards.dev/viewer/?uri=github.com/LearningCircuit/local-deep-research)

[![✅ All Tests](https://github.com/LearningCircuit/local-deep-research/actions/workflows/tests.yml/badge.svg?branch=main&style=for-the-badge)](https://github.com/LearningCircuit/local-deep-research/actions/workflows/tests.yml)
[![🔧 Pre-commit](https://github.com/LearningCircuit/local-deep-research/actions/workflows/pre-commit.yml/badge.svg?branch=main&style=for-the-badge)](https://github.com/LearningCircuit/local-deep-research/actions/workflows/pre-commit.yml)
[![🐳 Container Security](https://github.com/LearningCircuit/local-deep-research/actions/workflows/container-security.yml/badge.svg?branch=main&style=for-the-badge)](https://github.com/LearningCircuit/local-deep-research/actions/workflows/container-security.yml)

[![🐳 Docker Publish](https://github.com/LearningCircuit/local-deep-research/actions/workflows/docker-publish.yml/badge.svg)](https://github.com/LearningCircuit/local-deep-research/actions/workflows/docker-publish.yml)
[![📦 PyPI Publish](https://github.com/LearningCircuit/local-deep-research/actions/workflows/publish.yml/badge.svg)](https://github.com/LearningCircuit/local-deep-research/actions/workflows/publish.yml)

[![Discord](https://img.shields.io/discord/1352043059562680370?style=for-the-badge&logo=discord)](https://discord.gg/ttcqQeFcJ3)
[![Reddit](https://img.shields.io/badge/Reddit-r/LocalDeepResearch-FF4500?style=for-the-badge&logo=reddit)](https://www.reddit.com/r/LocalDeepResearch/)


**AI-powered research assistant for deep, iterative research**

*Performs deep, iterative research using multiple LLMs and search engines with proper citations*
</div>

## 🚀 What is Local Deep Research?

LDR is an AI research assistant that performs systematic research by:

- **Breaking down complex questions** into focused sub-queries
- **Searching multiple sources** in parallel (web, academic papers, local documents)
- **Verifying information** across sources for accuracy
- **Creating comprehensive reports** with proper citations

It aims to help researchers, students, and professionals find accurate information quickly while maintaining transparency about sources.

## 🎯 Why Choose LDR?

- **Privacy-Focused**: Run entirely locally with Ollama + SearXNG
- **Flexible**: Use any LLM, any search engine, any vector store
- **Comprehensive**: Multiple research modes from quick summaries to detailed reports
- **Transparent**: Track costs and performance with built-in analytics
- **Open Source**: MIT licensed with an active community

## 🔒 Security & Privacy

<div align="center">

[![AES-256](https://img.shields.io/badge/Encryption-AES--256-green?style=for-the-badge&logo=lock&logoColor=white)](docs/SQLCIPHER_INSTALL.md)
[![Per-User DB](https://img.shields.io/badge/Security-Per--User_Isolated_DB-purple?style=for-the-badge&logo=shield&logoColor=white)](docs/MIGRATION_GUIDE_v1.md)
[![Zero Knowledge](https://img.shields.io/badge/Privacy-Zero_Knowledge-blue?style=for-the-badge&logo=eye-slash&logoColor=white)](docs/SQLCIPHER_INSTALL.md)

</div>

- **Signal-level encryption**: SQLCipher with AES-256 (same technology used by Signal messenger) protects all user data at rest
- **Per-user isolated databases**: Each user has their own encrypted database - complete data isolation
- **Zero-knowledge architecture**: No password recovery mechanism ensures true privacy
- **Advanced key derivation**: PBKDF2-SHA512 with 256,000 iterations prevents brute-force attacks
- **Data integrity**: HMAC-SHA512 verification prevents tampering

## 📊 Performance

**~95% accuracy on SimpleQA benchmark** (preliminary results)
- Tested with GPT-4.1-mini + SearXNG + focused-iteration strategy
- Comparable to state-of-the-art AI research systems
- Local models can achieve similar performance with proper configuration
- [Join our community benchmarking effort →](https://github.com/LearningCircuit/local-deep-research/tree/main/community_benchmark_results)

## ✨ Key Features

### 🔍 Research Modes
- **Quick Summary** - Get answers in 30 seconds to 3 minutes with citations
- **Detailed Research** - Comprehensive analysis with structured findings
- **Report Generation** - Professional reports with sections and table of contents
- **Document Analysis** - Search your private documents with AI

### 🛠️ Advanced Capabilities
- **[LangChain Integration](docs/LANGCHAIN_RETRIEVER_INTEGRATION.md)** - Use any vector store as a search engine
- **[REST API](docs/api-quickstart.md)** - Authenticated HTTP access with per-user databases
- **[Benchmarking](docs/BENCHMARKING.md)** - Test and optimize your configuration
- **[Analytics Dashboard](docs/analytics-dashboard.md)** - Track costs, performance, and usage metrics
- **Real-time Updates** - WebSocket support for live research progress
- **Export Options** - Download results as PDF or Markdown
- **Research History** - Save, search, and revisit past research
- **Adaptive Rate Limiting** - Intelligent retry system that learns optimal wait times
- **Keyboard Shortcuts** - Navigate efficiently (ESC, Ctrl+Shift+1-5)
- **Per-User Encrypted Databases** - Secure, isolated data storage for each user

### 📰 News & Research Subscriptions
- **Automated Research Digests** - Subscribe to topics and receive AI-powered research summaries
- **Customizable Frequency** - Daily, weekly, or custom schedules for research updates
- **Smart Filtering** - AI filters and summarizes only the most relevant developments
- **Multi-format Delivery** - Get updates as markdown reports or structured summaries
- **Topic & Query Support** - Track specific searches or broad research areas

### 🌐 Search Sources

#### Free Search Engines
- **Academic**: arXiv, PubMed, Semantic Scholar
- **General**: Wikipedia, SearXNG
- **Technical**: GitHub, Elasticsearch
- **Historical**: Wayback Machine
- **News**: The Guardian, Wikinews

#### Premium Search Engines
- **Tavily** - AI-powered search
- **Google** - Via SerpAPI or Programmable Search Engine
- **Brave Search** - Privacy-focused web search

#### Custom Sources
- **Local Documents** - Search your files with AI
- **LangChain Retrievers** - Any vector store or database
- **Meta Search** - Combine multiple engines intelligently

[Full Search Engines Guide →](docs/search-engines.md)

## ⚡ Quick Start

### Option 1: Docker

```bash
# Step 1: Pull and run SearXNG for optimal search results
docker run -d -p 8080:8080 --name searxng searxng/searxng

# Step 2: Pull and run Local Deep Research
docker run -d -p 5000:5000 --network host \
  --name local-deep-research \
  --volume 'deep-research:/data' \
  -e LDR_DATA_DIR=/data \
  localdeepresearch/local-deep-research
```

### Option 2: Docker Compose (Recommended)

LDR uses Docker compose to bundle the web app and all its dependencies so
you can get up and running quickly.

#### Option 2a: Quick Start (One Command)

**Default: CPU-only base (works on all platforms)**

The base configuration works on macOS (M1/M2/M3/M4 and Intel), Windows, and Linux without requiring any GPU hardware.

**Quick Start Command:**

**Note:** `curl -O` will overwrite existing docker-compose.yml files in the current directory.

Linux/macOS:

```bash
curl -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.yml && docker compose up -d
```

Windows (PowerShell required):

```powershell
curl.exe -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.yml
if ($?) { docker compose up -d }
```

**Use with a different model:**

```bash
curl -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.yml && MODEL=gpt-oss:20b docker compose up -d
```

---

##### **Option 2a-GPU: Add NVIDIA GPU Acceleration (Linux only)**

⚠️ **Breaking Change:** If you previously used `docker compose up -d` with GPU support, you now need to use the override file (see commands below).

For users with NVIDIA GPUs who want hardware acceleration.

**Prerequisites:**

Install the NVIDIA Container Toolkit first (Ubuntu/Debian):

```bash
# Install NVIDIA Container Toolkit (for GPU support)
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt-get update
sudo apt-get install nvidia-container-toolkit -y
sudo systemctl restart docker

# Verify installation
nvidia-smi
```

**Verify:** The `nvidia-smi` command should display your GPU information. If it fails, check your NVIDIA driver installation.

**Note:** For RHEL/CentOS/Fedora, Arch, or other Linux distributions, see the [NVIDIA Container Toolkit installation guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html).

**Quick Start Commands:**

**Note:** `curl -O` will overwrite existing files in the current directory.

```bash
curl -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.yml && \
curl -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.gpu.override.yml && \
docker compose -f docker-compose.yml -f docker-compose.gpu.override.yml up -d
```

**Use with a different model:**

```bash
curl -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.yml && \
curl -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.gpu.override.yml && \
MODEL=gpt-oss:20b docker compose -f docker-compose.yml -f docker-compose.gpu.override.yml up -d
```

**Optional: Create an alias for convenience**

```bash
alias docker-compose-gpu='docker compose -f docker-compose.yml -f docker-compose.gpu.override.yml'
# Then simply use: docker-compose-gpu up -d
```

---

Open http://localhost:5000 after ~30 seconds. This starts LDR with SearXNG and all dependencies.

#### Option 2b: DIY docker-compose
See [docker-compose.yml](./docker-compose.yml) for a docker-compose file with reasonable defaults to get up and running with ollama, searxng, and local deep research all running locally.

Things you may want/need to configure:
* Ollama GPU driver
* Ollama context length (depends on available VRAM)
* Ollama keep alive (duration model will stay loaded into VRAM and idle before getting unloaded automatically)
* Deep Research model (depends on available VRAM and preference)

#### Option 2c: Use Cookie Cutter to tailor a docker-compose to your needs:

##### Prerequisites

- [Docker](https://docs.docker.com/engine/install/)
- [Docker Compose](https://docs.docker.com/compose/install/)
- `cookiecutter`: Run `pip install --user cookiecutter`

Clone the repository:

```bash
git clone https://github.com/LearningCircuit/local-deep-research.git
cd local-deep-research
```

### Configuring with Docker Compose

Cookiecutter will interactively guide you through the process of creating a
`docker-compose` configuration that meets your specific needs. This is the
recommended approach if you are not very familiar with Docker.

In the LDR repository, run the following command
to generate the compose file:

```bash
cookiecutter cookiecutter-docker/
docker compose -f docker-compose.default.yml up
```

[Docker Compose Guide →](docs/docker-compose-guide.md)

### Option 3: Unraid

**For Unraid users:**

Local Deep Research is fully compatible with Unraid servers!

#### Quick Install (Template Method)

1. Navigate to **Docker** tab → **Docker Repositories**
2. Add template repository:
   ```
   https://github.com/LearningCircuit/local-deep-research
   ```
3. Click **Add Container** → Select **LocalDeepResearch** from template
4. Configure paths (default: `/mnt/user/appdata/local-deep-research/`)
5. Click **Apply**

#### Docker Compose Manager Plugin

If you prefer using Docker Compose on Unraid:

1. Install "Docker Compose Manager" from Community Applications
2. Create a new stack with the compose file from this repo
3. Update volume paths to Unraid format (`/mnt/user/appdata/...`)

**Features on Unraid:**
- ✅ Pre-configured template with sensible defaults
- ✅ Automatic SearXNG and Ollama integration
- ✅ NVIDIA GPU passthrough support (optional)
- ✅ Integration with Unraid shares for document search
- ✅ Backup integration with CA Appdata Backup plugin

[Complete Unraid Setup Guide →](docs/deployment/unraid.md)

### Option 4: Python Package

```bash
# Step 1: Install the package
pip install local-deep-research

# Step 2: Setup SearXNG for best results
docker pull searxng/searxng
docker run -d -p 8080:8080 --name searxng searxng/searxng

# Step 3: Install Ollama from https://ollama.ai

# Step 4: Download a model
ollama pull gemma3:12b

# Step 5: Start the web interface
python -m local_deep_research.web.app
```

> **Note:** For development from source, see the [Development Guide](docs/developing.md).

#### Optional Dependencies

VLLM support (for running transformer models directly):
```bash
pip install "local-deep-research[vllm]"
```
This installs torch, transformers, and vllm for advanced local model hosting. Most users running Ollama or LlamaCpp don't need this.

[Full Installation Guide →](https://github.com/LearningCircuit/local-deep-research/wiki/Installation)

## 💻 Usage Examples

### Python API
```python
from local_deep_research.api import LDRClient, quick_query

# Option 1: Simplest - one line research
summary = quick_query("username", "password", "What is quantum computing?")
print(summary)

# Option 2: Client for multiple operations
client = LDRClient()
client.login("username", "password")
result = client.quick_research("What are the latest advances in quantum computing?")
print(result["summary"])
```

### HTTP API

*The code example below shows the basic API structure - for working examples, see the link below*

```python
import requests
from bs4 import BeautifulSoup

# Create session and authenticate
session = requests.Session()
login_page = session.get("http://localhost:5000/auth/login")
soup = BeautifulSoup(login_page.text, "html.parser")
login_csrf = soup.find("input", {"name": "csrf_token"}).get("value")

# Login and get API CSRF token
session.post("http://localhost:5000/auth/login",
            data={"username": "user", "password": "pass", "csrf_token": login_csrf})
csrf = session.get("http://localhost:5000/auth/csrf-token").json()["csrf_token"]

# Make API request
response = session.post("http://localhost:5000/api/start_research",
                       json={"query": "Your research question"},
                       headers={"X-CSRF-Token": csrf})
```

🚀 **[Ready-to-use HTTP API Examples → examples/api_usage/http/](examples/api_usage/http/)**
- ✅ **Automatic user creation** - works out of the box
- ✅ **Complete authentication** with CSRF handling
- ✅ **Result retry logic** - waits until research completes
- ✅ **Progress monitoring** and error handling

### Command Line Tools

```bash
# Run benchmarks from CLI
python -m local_deep_research.benchmarks --dataset simpleqa --examples 50

# Manage rate limiting
python -m local_deep_research.web_search_engines.rate_limiting status
python -m local_deep_research.web_search_engines.rate_limiting reset
```

## 🔗 Enterprise Integration

Connect LDR to your existing knowledge base:

```python
from local_deep_research.api import quick_summary

# Use your existing LangChain retriever
result = quick_summary(
    query="What are our deployment procedures?",
    retrievers={"company_kb": your_retriever},
    search_tool="company_kb"
)
```

Works with: FAISS, Chroma, Pinecone, Weaviate, Elasticsearch, and any LangChain-compatible retriever.

[Integration Guide →](docs/LANGCHAIN_RETRIEVER_INTEGRATION.md)

## 📊 Performance & Analytics

### Benchmark Results
Early experiments on small SimpleQA dataset samples:

| Configuration | Accuracy | Notes |
|--------------|----------|--------|
| gpt-4.1-mini + SearXNG + focused_iteration | 90-95% | Limited sample size |
| gpt-4.1-mini + Tavily + focused_iteration | 90-95% | Limited sample size |
| gemini-2.0-flash-001 + SearXNG | 82% | Single test run |

Note: These are preliminary results from initial testing. Performance varies significantly based on query types, model versions, and configurations. [Run your own benchmarks →](docs/BENCHMARKING.md)

### Built-in Analytics Dashboard
Track costs, performance, and usage with detailed metrics. [Learn more →](docs/analytics-dashboard.md)

## 🤖 Supported LLMs

### Local Models (via Ollama)
- Llama 3, Mistral, Gemma, DeepSeek
- LLM processing stays local (search queries still go to web)
- No API costs

### Cloud Models
- OpenAI (GPT-4, GPT-3.5)
- Anthropic (Claude 3)
- Google (Gemini)
- 100+ models via OpenRouter

[Model Setup →](docs/env_configuration.md)

## 📚 Documentation

### Getting Started
- [Installation Guide](https://github.com/LearningCircuit/local-deep-research/wiki/Installation)
- [Frequently Asked Questions](docs/faq.md)
- [API Quickstart](docs/api-quickstart.md)
- [Configuration Guide](docs/env_configuration.md)

### Core Features
- [All Features Guide](docs/features.md)
- [Search Engines Guide](docs/search-engines.md)
- [Analytics Dashboard](docs/analytics-dashboard.md)

### Advanced Features
- [LangChain Integration](docs/LANGCHAIN_RETRIEVER_INTEGRATION.md)
- [Benchmarking System](docs/BENCHMARKING.md)
- [Elasticsearch Setup](docs/elasticsearch_search_engine.md)
- [SearXNG Setup](docs/SearXNG-Setup.md)

### Development
- [Docker Compose Guide](docs/docker-compose-guide.md)
- [Development Guide](docs/developing.md)
- [Security Guide](docs/security/CODEQL_GUIDE.md)
- [Release Guide](docs/RELEASE_GUIDE.md)

### Examples & Tutorials
- [API Examples](examples/api_usage/)
- [Benchmark Examples](examples/benchmarks/)
- [Optimization Examples](examples/optimization/)

## 📰 Featured In

> "Local Deep Research **deserves special mention** for those who prioritize privacy... **tuned to use open-source LLMs** that can run on consumer GPUs or even CPUs. Journalists, researchers, or companies with sensitive topics can investigate information **without queries ever hitting an external server**."
>
> — [Medium: Open-Source Deep Research AI Assistants](https://medium.com/@leucopsis/open-source-deep-research-ai-assistants-157462a59c14)

- [Hacker News](https://news.ycombinator.com/item?id=43330164) - Community discussion
- [Zhihu (知乎)](https://zhuanlan.zhihu.com/p/30886269290) - Chinese tech community coverage

## 🤝 Community & Support

- [Discord](https://discord.gg/ttcqQeFcJ3) - Get help and share research techniques
- [Reddit](https://www.reddit.com/r/LocalDeepResearch/) - Updates and showcases
- [GitHub Issues](https://github.com/LearningCircuit/local-deep-research/issues) - Bug reports

## 🚀 Contributing

We welcome contributions! See our [Contributing Guide](CONTRIBUTING.md) to get started.

## 📄 License

MIT License - see [LICENSE](LICENSE) file.

Built with: [LangChain](https://github.com/hwchase17/langchain), [Ollama](https://ollama.ai), [SearXNG](https://searxng.org/), [FAISS](https://github.com/facebookresearch/faiss)

> **Support Free Knowledge:** Consider donating to [Wikipedia](https://donate.wikimedia.org), [arXiv](https://arxiv.org/about/give), or [PubMed](https://www.nlm.nih.gov/pubs/donations/donations.html).