# DiffSHEG
**Repository Path**: sheldongchen/DiffSHEG
## Basic Information
- **Project Name**: DiffSHEG
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: BSD-3-Clause
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-09-10
- **Last Updated**: 2025-09-10
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
### DiffSHEG: A Diffusion-Based Approach for Real-Time Speech-driven Holistic 3D Expression and Gesture Generation (CVPR 2024 Official Repo)

[Junming Chen](https://jeremycjm.github.io)<sup>†1,2</sup>, [Yunfei Liu](http://liuyunfei.net/)<sup>2</sup>, [Jianan Wang](https://scholar.google.com/citations?user=mt5mvZ8AAAAJ&hl=en&inst=1381320739207392350)<sup>2</sup>, [Ailing Zeng](https://ailingzeng.site/)<sup>2</sup>, [Yu Li](https://yu-li.github.io/)<sup>\*2</sup>, [Qifeng Chen](https://cqf.io)<sup>\*1</sup>

<sup>1</sup>HKUST &nbsp; <sup>2</sup>International Digital Economy Academy (IDEA)

\*Corresponding authors &nbsp; †Work done during an internship at IDEA
#### [Project Page](https://jeremycjm.github.io/proj/DiffSHEG/) · [Paper](https://arxiv.org/abs/2401.04747) · [Video](https://www.youtube.com/watch?v=HFaSd5do-zI)

## Environment
The code has been tested on Ubuntu 18.04 and 20.04. Run the setup commands below from the `assets` directory:
```
cd assets
```
- Option 1: conda install
```
conda env create -f environment.yml
conda activate diffsheg
```
- Option 2: pip install
```
conda create -n "diffsheg" python=3.9
conda activate diffsheg
pip install -r requirements.txt
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
```
- Extract `data.tar.gz` (data statistics) and move the resulting `data` directory to the repository root:
```
tar zxvf data.tar.gz
mv data ../
```
## Checkpoints
Download the pretrained checkpoints from [Google Drive](https://drive.google.com/file/d/1JPoMOcGDrvkFt7QbN6sEyYAPOOWkVN0h/view).
## Inference on a Custom Audio
First, set the `--test_audio_path` argument in the bash scripts below to the path of your test audio. Note that the audio must be a `.wav` file.
- Use the model trained on the BEAT dataset:
```
bash inference_custom_audio_beat.sh
```
- Use the model trained on the SHOW dataset:
```
bash inference_custom_audio_talkshow.sh
```
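Since both scripts expect a `.wav` input, a quick format check can save a failed run. Below is a minimal sketch using Python's standard `wave` module; it is not part of the DiffSHEG codebase, and the helper name `is_valid_wav` is our own:

```python
# Sketch: verify a file is a readable PCM .wav before passing it
# via --test_audio_path. Not part of the DiffSHEG repo.
import os
import tempfile
import wave


def is_valid_wav(path):
    """Return (ok, info): info describes channels/rate on success, else the error."""
    try:
        with wave.open(path, "rb") as w:
            return True, f"{w.getnchannels()} ch, {w.getframerate()} Hz"
    except (wave.Error, EOFError, FileNotFoundError) as e:
        return False, str(e)


# Demo: write a 0.1 s silent mono 16 kHz wav and validate it.
tmp = tempfile.NamedTemporaryFile(suffix=".wav", delete=False)
tmp.close()
with wave.open(tmp.name, "wb") as w:
    w.setnchannels(1)          # mono
    w.setsampwidth(2)          # 16-bit PCM
    w.setframerate(16000)      # 16 kHz
    w.writeframes(b"\x00\x00" * 1600)  # 0.1 s of silence
ok, info = is_valid_wav(tmp.name)
print(ok, info)  # → True 1 ch, 16000 Hz
os.unlink(tmp.name)
```

If the check fails, converting the audio to `.wav` first (e.g. with ffmpeg) before running the inference scripts is a reasonable workaround.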
## Training