# bgiserver

**Repository Path**: jinfagang/BigServer

## Basic Information

- **Project Name**: bgiserver
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-10-03
- **Last Updated**: 2025-10-03

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

# BigServer

BigServer is a high-performance LLM serving framework that supports multiple backends (vLLM, SGLang, Transformers) behind an OpenAI-compatible API. It is designed to be easy to use and customize, and supports a variety of models, including Qwen3VL, MiniCPM, and more.

Use cases for BigServer:

- Scenarios that `vllm serve` alone cannot handle, such as user authentication, authorization, and request auditing;
- Scenarios that require precise control over the inference process, such as custom conversation templates or custom stop words;
- Scenarios that require monitoring and analyzing the inference process.

In short, by unifying multiple backends instead of relying on a single inference backend, and by bundling offline usage for each backend, `bigserver` makes it much easier for users to customize and build their own inference backend.

## Features

- OpenAI-compatible API endpoints
- Support for multiple inference backends (vLLM, SGLang, Transformers)
- Easy model switching and management
- Multimodal model support (for models like Qwen3VL)
- FastAPI-based with async/await support
- Highly customizable and extensible

## Installation

```bash
pip install bigserver
```

Or for development:

```bash
git clone <repository-url>
cd BigServer
pip install -e .
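# Optional sanity check: confirm the editable install imports cleanly.
# (The import name "bigserver" is an assumption based on the package name.)
python -c "import bigserver"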
```

To install with specific backends:

```bash
# For vLLM backend
pip install bigserver[vllm]

# For SGLang backend
pip install bigserver[sglang]

# For both
pip install bigserver[vllm,sglang]
```

## Quick Start

Start the server with the default model:

```bash
python api.py
```

Or programmatically:

```python
from bigserver import serve

serve(
    model_name="microsoft/DialoGPT-medium",  # or any Hugging Face model
    backend="transformers",  # vllm, sglang, or transformers
    port=8000
)
```

## Usage

Once the server is running, you can use it with OpenAI-compatible clients:

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="dummy-key",  # BigServer doesn't require a real API key
)

# Chat completion
response = client.chat.completions.create(
    model="microsoft/DialoGPT-medium",
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)
print(response.choices[0].message.content)
```

## Configuration

You can configure the server using environment variables:

```bash
export MODEL_NAME="your-model-name"
export BACKEND="transformers"  # vllm, sglang, transformers
export MODEL_PATH="/path/to/local/model"
export HOST="0.0.0.0"
export PORT=8000
```

Or by passing parameters to the `serve` function:

```python
from bigserver import serve

serve(
    model_name="Qwen/Qwen2-7B",
    backend="transformers",
    model_path="/path/to/local/model",  # optional
    host="0.0.0.0",
    port=8000
)
```

## Supported Models

BigServer supports a wide range of models that work with the supported backends:

- Hugging Face transformers models
- Vision-language models (QwenVL, MiniCPM-V, etc.)
- Text generation models
- Instruction-following models
- Multimodal models (with appropriate backends)

## Backends

BigServer supports three different inference backends:

1. **Transformers**: Standard Hugging Face implementation, good for development and experimentation
2. **vLLM**: High-performance inference with advanced optimization techniques
3. **SGLang**: Efficient serving framework with runtime optimization

## Architecture

```
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   API Client    │───▶│   FastAPI App    │───▶│  Model Manager  │
│ (OpenAI compat) │    │   (Endpoints)    │    │                 │
└─────────────────┘    └──────────────────┘    └─────────────────┘
                                                        │
                                                        ▼
                                              ┌─────────────────────┐
                                              │   Backend Manager   │
                                              │                     │
                                              ├─────────────────────┤
                                              │ ├─ VLLM Backend     │
                                              │ ├─ SGLang Backend   │
                                              │ └─ Transformers Bknd│
                                              └─────────────────────┘
```

## Contributing

We welcome contributions! Please see our [Issues](https://github.com/example/bigserver/issues) page for ways to contribute.

## License

This project is licensed under the MIT License - see the LICENSE file for details.