# RynnEC
**Repository Path**: ramon09/RynnEC
## Basic Information
- **Project Name**: RynnEC
- **Description**: RynnEC: Bringing MLLMs into Embodied World
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-12-29
- **Last Updated**: 2025-12-29
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
If our project helps you, please give us a star ⭐ on GitHub to support us. 🙏🙏
[License](https://github.com/alibaba-damo-academy/RynnEC/blob/main/LICENSE)
[Hugging Face Models](https://huggingface.co/collections/Alibaba-DAMO-Academy/rynnec-6893547fe802ace82cee8884)
[RynnEC-Bench Dataset](https://huggingface.co/datasets/Alibaba-DAMO-Academy/RynnEC-Bench)
[Online Demo](https://huggingface.co/spaces/Alibaba-DAMO-Academy/RynnEC)
[Video](https://www.youtube.com/watch?v=vsMxbzsmrQc)
[ModelScope Collection](https://www.modelscope.cn/collections/RynnEC-969b7cafd2d344)
[ModelScope Bench](https://www.modelscope.cn/models/DAMO_Academy/RynnEC-Bench)
[arXiv Paper](https://arxiv.org/abs/2508.14160)
https://github.com/user-attachments/assets/3c12371e-ce95-4465-bc51-bff0b13749b5
## 📰 News
* **[2025.08.17]** 🤗 The [RynnEC-7B model](https://huggingface.co/Alibaba-DAMO-Academy/RynnEC-7B) checkpoint has been released on Hugging Face.
* **[2025.08.08]** 🔥🔥 Released our [RynnEC-2B model](https://huggingface.co/Alibaba-DAMO-Academy/RynnEC-2B), [RynnEC-Bench](https://huggingface.co/datasets/Alibaba-DAMO-Academy/RynnEC-Bench), and training code.
## 🌟 Introduction
**RynnEC** is a video multimodal large language model (MLLM) designed specifically for embodied cognition tasks.
## 🛠️ Requirements and Installation
Basic Dependencies:
* Python >= 3.10
* PyTorch >= 2.4.0
* CUDA Version >= 11.8
* transformers >= 4.46.3
Install required packages:
```bash
git clone https://github.com/alibaba-damo-academy/RynnEC
cd RynnEC
pip install -e .
pip install flash-attn --no-build-isolation
```
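Before installing, you may want to sanity-check the version requirements listed above. The snippet below is not part of RynnEC; it is a minimal sketch (the `meets_minimum` helper is my own) that compares dotted version strings numerically:

```python
import sys

def meets_minimum(version: str, minimum: str) -> bool:
    """Numeric comparison of dotted version strings, e.g. '2.4.1' >= '2.4.0'.
    Assumes plain numeric components; pre-release tags like '2.4.0a0' are not handled."""
    parse = lambda v: tuple(int(p) for p in v.split("."))
    return parse(version) >= parse(minimum)

print("Python >= 3.10:", sys.version_info >= (3, 10))

try:
    import torch  # only available after installation
    # split("+") drops local build tags such as '+cu121'
    print("PyTorch >= 2.4.0:", meets_minimum(torch.__version__.split("+")[0], "2.4.0"))
except ImportError:
    print("PyTorch not installed yet")
```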
## 🌎 Model Zoo
| Model | Base Model | HF Link |
| -------------------- | ------------ | ------------------------------------------------------------ |
| RynnEC-2B | [Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct) & [VideoLLaMA3-2B-Image](https://huggingface.co/DAMO-NLP-SG/VideoLLaMA3-2B-Image) | [Alibaba-DAMO-Academy/RynnEC-2B](https://huggingface.co/Alibaba-DAMO-Academy/RynnEC-2B) |
| RynnEC-7B | [Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) & [VideoLLaMA3-7B-Image](https://huggingface.co/DAMO-NLP-SG/VideoLLaMA3-7B-Image) | [Alibaba-DAMO-Academy/RynnEC-7B](https://huggingface.co/Alibaba-DAMO-Academy/RynnEC-7B) |
### Cookbook
Check out the [inference notebooks](inference/notebooks/), which demonstrate how to use RynnEC for applications such as basic object understanding, spatial understanding, and video object segmentation in egocentric videos.
| Notebooks | Description |
| :-------------------- | ------------------------------------------------------------------------ |
| [Object Understanding](inference/notebooks/1.object_understanding.ipynb) | Demonstrates how to use RynnEC for **general object recognition and understanding** |
| [Spatial Understanding](inference/notebooks/2.spatial_understanding.ipynb) | Demonstrates how to use RynnEC for **spatial understanding** with 3D awareness |
| [Video Object Segmentation](inference/notebooks/3.object_segmentation.ipynb) | Demonstrates how to use RynnEC for **video object segmentation** with text-based instructions |
## 🤗 Demo
We highly recommend trying our [online demo](https://huggingface.co/spaces/Alibaba-DAMO-Academy/RynnEC) first.
Alternatively, you can launch a Gradio app locally:
```bash
python inference/gradio_demo.py --model-path Alibaba-DAMO-Academy/RynnEC-2B

options:
  --model-path MODEL_PATH, --model_path MODEL_PATH
  --port SERVER_PORT, --server_port SERVER_PORT
        Optional. Port of the model server.
```
## 🕹️ RynnEC-Bench
RynnEC-Bench evaluates models in two key areas, `object cognition` and `spatial cognition`, covering a total of `22` embodied cognitive abilities.
For more details, please refer to [RynnEC-Bench](benchmark).
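A benchmark organized this way is naturally scored per ability and then rolled up per area. The sketch below is illustrative only: the field names (`area`, `ability`, `score`) are my own assumption, not the benchmark's actual schema, which lives in the `benchmark` directory.

```python
from collections import defaultdict

def aggregate(results):
    """Average per-sample scores within each ability, then average
    the ability means within each top-level area."""
    by_ability = defaultdict(list)
    for r in results:
        by_ability[(r["area"], r["ability"])].append(r["score"])

    by_area = defaultdict(list)
    for (area, _ability), scores in by_ability.items():
        by_area[area].append(sum(scores) / len(scores))

    return {area: sum(means) / len(means) for area, means in by_area.items()}

results = [
    {"area": "object cognition", "ability": "color", "score": 1.0},
    {"area": "object cognition", "ability": "color", "score": 0.0},
    {"area": "spatial cognition", "ability": "distance", "score": 0.5},
]
print(aggregate(results))  # {'object cognition': 0.5, 'spatial cognition': 0.5}
```

Averaging per ability before averaging per area keeps abilities with many samples from dominating the area score.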
## 🚀 Training
### Step 1: Prepare training data
To use our training code, please organize the annotation files in the following format:
```json
[
    // image QA
    {
        "image": ["images/xxx.jpg"],
        "conversations": [
            {
                "from": "human",
                "value": "<image>\nWhat are the colors of the bus in the image?"
            },
            {
                "from": "gpt",
                "value": "The bus in the image is white and red."
            },
            ...
        ]
    },
    // Video QA
    {
        "video": ["videos/xxx.mp4"],
        "conversations": [
            {
                "from": "human",
                "value": "