> [!IMPORTANT]
> Support for MediaPipe legacy solutions ended on March 1, 2023.
> Samples for the legacy solutions have moved to the [_legacy](_legacy) directory.
> MediaPipe remains backward compatible, so the current package can still run the legacy samples.
# mediapipe-python-sample
A collection of sample scripts for the Python package of [google-ai-edge/mediapipe](https://github.com/google-ai-edge/mediapipe).

As of September 1, 2024, samples are provided for the following 15 features that have a Python implementation:
* [Object Detection](https://ai.google.dev/edge/mediapipe/solutions/vision/object_detector?hl=ja)
* [Image Classification](https://ai.google.dev/mediapipe/solutions/vision/image_classifier?hl=ja)
* [Image Segmentation](https://ai.google.dev/mediapipe/solutions/vision/image_segmenter?hl=ja)
* [Interactive Segmentation](https://ai.google.dev/mediapipe/solutions/vision/interactive_segmenter?hl=ja)
* [Hand Landmark Detection](https://ai.google.dev/mediapipe/solutions/vision/hand_landmarker?hl=ja)
* [Gesture Recognition](https://ai.google.dev/mediapipe/solutions/vision/gesture_recognizer?hl=ja)
* [Image Embedding](https://ai.google.dev/mediapipe/solutions/vision/image_embedder?hl=ja)
* [Face Detection](https://ai.google.dev/mediapipe/solutions/vision/face_detector?hl=ja)
* [Face Landmark Detection](https://ai.google.dev/mediapipe/solutions/vision/face_landmarker?hl=ja)
* [Face Stylization](https://ai.google.dev/mediapipe/solutions/vision/face_stylizer?hl=ja)
* [Pose Landmark Detection](https://ai.google.dev/mediapipe/solutions/vision/pose_landmarker?hl=ja)
* [Text Classification](https://ai.google.dev/mediapipe/solutions/text/text_classifier?hl=ja)
* [Text Embedding](https://ai.google.dev/mediapipe/solutions/text/text_embedder?hl=ja)
* [Language Detector](https://ai.google.dev/mediapipe/solutions/text/language_detector?hl=ja)
* [Audio Classification](https://ai.google.dev/mediapipe/solutions/audio/audio_classifier?hl=ja)

# Requirements
* mediapipe 0.10.14 or later
* opencv-python 4.10.0.84 or later
* tqdm 4.66.5 or later (used to download weight files)
* requests 2.32.3 or later (used to download weight files)
* scipy 1.14.1 or later (only needed for the Audio Classification sample)
* numpy 1.26.4 (NumPy 1.x is required)

```bash
pip install -r requirements.txt
```

# Demo
Run each demo as follows.

### Object Detection
```bash
python sample_object_detection.py
```

Command-line options

* --device
Camera device number
Default: 0
* --video
Path to a video file (takes precedence over the camera when specified)
Default: None
* --width
Capture width for the camera
Default: 960
* --height
Capture height for the camera
Default: 540
* --model
Model to use [0, 1, 2, 3, 4, 5, 6, 7] (the weight file is downloaded if it is not already in the model directory)
The weights are trained on the [COCO dataset](https://cocodataset.org/#home); the supported labels are listed in [labelmap.txt](https://storage.googleapis.com/mediapipe-tasks/object_detector/labelmap.txt)
Default: 0
    * 0: EfficientDet-Lite0 (int8)
    * 1: EfficientDet-Lite0 (float 16)
    * 2: EfficientDet-Lite0 (float 32)
    * 3: EfficientDet-Lite2 (int8)
    * 4: EfficientDet-Lite2 (float 16)
    * 5: EfficientDet-Lite2 (float 32)
    * 6: SSDMobileNet-V2 (int8)
    * 7: SSDMobileNet-V2 (float 32)
* --score_threshold
Score threshold
Default: 0.5
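
The options above map naturally onto an `argparse` parser. The following is a minimal sketch rather than the repository's actual code: the function names `get_args` and `select_source` are hypothetical, while the flag names and defaults come from the list above. It also illustrates the documented rule that `--video` takes precedence over `--device`.

```python
import argparse


def get_args(argv=None):
    """Parse the object detection options listed in this README."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--device", type=int, default=0)
    parser.add_argument("--video", type=str, default=None)
    parser.add_argument("--width", type=int, default=960)
    parser.add_argument("--height", type=int, default=540)
    parser.add_argument("--model", type=int, default=0, choices=range(8))
    parser.add_argument("--score_threshold", type=float, default=0.5)
    return parser.parse_args(argv)


def select_source(args):
    """Return the video path if one was given, otherwise the camera number."""
    return args.video if args.video is not None else args.device


# Hypothetical invocation: the video path wins over the default camera.
args = get_args(["--video", "sample.mp4", "--model", "3"])
print(select_source(args))
```

The value returned by `select_source` could be passed to `cv2.VideoCapture`, which accepts either a device index or a file path.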

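### Image Classification
```bash
python sample_image_classification.py
```

Command-line options

* --device
Camera device number
Default: 0
* --video
Path to a video file (takes precedence over the camera when specified)
Default: None
* --width
Capture width for the camera
Default: 960
* --height
Capture height for the camera
Default: 540
* --model
Model to use [0, 1, 2, 3] (the weight file is downloaded if it is not already in the model directory)
The weights are trained on [ImageNet](https://www.image-net.org/); the supported labels are listed in [labels.txt](https://storage.googleapis.com/mediapipe-tasks/image_classifier/labels.txt)
Default: 0
    * 0: EfficientNet-Lite0 (int8)
    * 1: EfficientNet-Lite0 (float 32)
    * 2: EfficientNet-Lite2 (int8)
    * 3: EfficientNet-Lite2 (float 32)
* --max_results
Number of results to output
Default: 5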
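
To make the effect of `--max_results` concrete, here is a small sketch of top-k selection over classification scores. It assumes the results are available as `(label, score)` pairs; the helper name `top_k` and the sample data are made up for illustration. MediaPipe's classifier applies its own `max_results` option internally, so this only illustrates the idea.

```python
def top_k(categories, max_results=5):
    """Return the max_results highest-scoring (label, score) pairs."""
    ranked = sorted(categories, key=lambda item: item[1], reverse=True)
    return ranked[:max_results]


# Made-up scores, for illustration only.
scores = [("tabby", 0.61), ("hedgehog", 0.92), ("shoe", 0.05), ("cactus", 0.33)]
for label, score in top_k(scores, max_results=2):
    print(f"{label}: {score:.2f}")
```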

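### Image Segmentation
```bash
python sample_image_segmentation.py
```

Command-line options

* --device
Camera device number
Default: 0
* --video
Path to a video file (takes precedence over the camera when specified)
Default: None
* --width
Capture width for the camera
Default: 960
* --height
Capture height for the camera
Default: 540
* --model
Model to use [0, 1, 2, 3, 4] (the weight file is downloaded if it is not already in the model directory)
Default: 0
    * 0: SelfieSegmenter (square)
    * 1: SelfieSegmenter (landscape)
    * 2: HairSegmenter
    * 3: SelfieMulticlass (256x256)
    * 4: DeepLab-V3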
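
Every `--model` option notes that the weight file is downloaded only when it is missing from the model directory. The sketch below shows one way such a check could work. It is an assumption, not the repository's code: the samples list `requests` and `tqdm` for downloading, whereas this version uses only the standard library, and the helper name `ensure_model` is hypothetical.

```python
import urllib.request
from pathlib import Path


def ensure_model(url, model_dir="model"):
    """Return the local path of a weight file, downloading it only if absent."""
    model_dir = Path(model_dir)
    model_dir.mkdir(parents=True, exist_ok=True)
    # Use the last component of the URL as the local file name.
    target = model_dir / url.rsplit("/", 1)[-1]
    if not target.exists():
        urllib.request.urlretrieve(url, str(target))
    return target
```

A second call with the same URL returns immediately because the file already exists, which is why later runs of a sample start faster than the first one.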

### Interactive Segmentation
```bash
python sample_interactive_image_segmentation.py
```

Command-line options

* --image
Path to the input image
Default: asset/hedgehog01.jpg
* --model
Model to use [0] (the weight file is downloaded if it is not already in the model directory)
Default: 0
    * 0: MagicTouch

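### Hand Landmark Detection
```bash
python sample_hand_landmarks_detection.py
```

Command-line options

* --device
Camera device number
Default: 0
* --video
Path to a video file (takes precedence over the camera when specified)
Default: None
* --width
Capture width for the camera
Default: 960
* --height
Capture height for the camera
Default: 540
* --unuse_mirror
Disable mirrored display
Default: not set
* --model
Model to use [0] (the weight file is downloaded if it is not already in the model directory)
Default: 0
    * 0: HandLandmarker (full)
* --num_hands
Number of hands to detect
Default: 2
* --use_world_landmark
Display world coordinates
Default: not set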
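
MediaPipe reports image landmarks in normalized coordinates, where `x` and `y` are fractions of the image width and height, so they must be scaled to pixels before drawing. The sketch below illustrates that conversion for a frame at the default 960x540 capture size. The helper name `to_pixel` and the sample points are hypothetical, and the clamping is an added safeguard for points that fall slightly outside the frame.

```python
def to_pixel(landmarks, width, height):
    """Convert normalized (x, y) landmarks to integer pixel coordinates."""
    points = []
    for x, y in landmarks:
        # Scale to pixels, then clamp so the point stays inside the frame.
        px = max(0, min(int(x * width), width - 1))
        py = max(0, min(int(y * height), height - 1))
        points.append((px, py))
    return points


# Made-up normalized landmarks, for illustration only.
print(to_pixel([(0.5, 0.5), (1.0, 0.0)], width=960, height=540))
```

World landmarks, shown with `--use_world_landmark`, are expressed in metric units rather than image fractions, so this scaling does not apply to them.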

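### Gesture Recognition
```bash
python sample_hand_gesture_recognition.py
```

Command-line options

* --device
Camera device number
Default: 0
* --video
Path to a video file (takes precedence over the camera when specified)
Default: None
* --width
Capture width for the camera
Default: 960
* --height
Capture height for the camera
Default: 540
* --unuse_mirror
Disable mirrored display
Default: not set
* --model
Model to use [0] (the weight file is downloaded if it is not already in the model directory)
Recognizable gestures: "Closed fist", "Open palm", "Pointing up", "Thumbs down", "Thumbs up", "Victory", "Love", "Unknown"
Default: 0
    * 0: HandGestureClassifier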

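### Image Embedding
```bash
python sample_image_embedding.py
```

Command-line options

* --image01
Path to image 1
Default: asset/hedgehog01.jpg
* --image02
Path to image 2
Default: asset/hedgehog02.jpg
* --model
Model to use [0, 1] (the weight file is downloaded if it is not already in the model directory)
Default: 0
    * 0: MobileNet-V3 (small)
    * 1: MobileNet-V3 (large)
* --unuse_l2_normalize
Do not L2-normalize the feature vector
Default: not set
* --unuse_quantize
Do not quantize the feature vector to bytes via scalar quantization
Default: not set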
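
The two image paths are compared through their feature vectors, and L2 normalization is what makes a plain dot product behave as cosine similarity. The following is a minimal pure-Python sketch of that relationship. The helper names and the sample vectors are hypothetical; MediaPipe's embedder classes provide their own `cosine_similarity` helper, which a real script would normally use instead.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else list(vector)


def cosine_similarity(a, b):
    """Cosine similarity, computed as the dot product of unit vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


# Made-up vectors: parallel ones score about 1, orthogonal ones about 0.
print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```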

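### Face Detection
```bash
python sample_face_detection.py
```

Command-line options

* --device
Camera device number
Default: 0
* --video
Path to a video file (takes precedence over the camera when specified)
Default: None
* --width
Capture width for the camera
Default: 960
* --height
Capture height for the camera
Default: 540
* --model
Model to use [0] (the weight file is downloaded if it is not already in the model directory)
Default: 0
    * 0: BlazeFace (short-range)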

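### Face Landmark Detection
```bash
python sample_face_landmark_detection.py
```

Command-line options

* --device
Camera device number
Default: 0
* --video
Path to a video file (takes precedence over the camera when specified)
Default: None
* --width
Capture width for the camera
Default: 960
* --height
Capture height for the camera
Default: 540
* --model
Model to use [0] (the weight file is downloaded if it is not already in the model directory)
Default: 0
    * 0: FaceLandmarker
* --num_faces
Number of faces to detect
Default: 1
* --unuse_output_face_blendshapes
Do not output face blendshapes
Default: not set
* --unuse_output_facial_transformation_matrixes
Do not output facial transformation matrices
Default: not set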

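### Face Stylization
```bash
python sample_face_stylization.py
```

Command-line options

* --device
Camera device number
Default: 0
* --video
Path to a video file (takes precedence over the camera when specified)
Default: None
* --width
Capture width for the camera
Default: 960
* --height
Capture height for the camera
Default: 540
* --model
Model to use [0, 1, 2] (the weight file is downloaded if it is not already in the model directory)
Default: 0
    * 0: Color sketch
    * 1: Color ink
    * 2: Oil painting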

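### Pose Landmark Detection
```bash
python sample_pose_landmark_detection.py
```

Command-line options

* --device
Camera device number
Default: 0
* --video
Path to a video file (takes precedence over the camera when specified)
Default: None
* --width
Capture width for the camera
Default: 960
* --height
Capture height for the camera
Default: 540
* --unuse_mirror
Disable mirrored display
Default: not set
* --model
Model to use [0, 1, 2] (the weight file is downloaded if it is not already in the model directory)
Default: 0
    * 0: Pose landmarker (lite)
    * 1: Pose landmarker (full)
    * 2: Pose landmarker (heavy)
* --use_output_segmentation_masks
Output segmentation masks
Default: not set
* --use_world_landmark
Display world coordinates
Default: not set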

### Text Classification
```bash
python sample_text_classification.py
```

Command-line options

* --input_text
Input text
Default: I'm looking forward to what will come next.
* --model
Model to use [0, 1] (the weight file is downloaded if it is not already in the model directory)
Default: 0
    * 0: BERT-classifier
    * 1: Average word embedding

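### Text Embedding
```bash
python sample_text_embedding.py
```

Command-line options

* --input_text01
Input text 1
Default: I'm feeling so good
* --input_text02
Input text 2
Default: I'm okay I guess
* --model
Model to use [0] (the weight file is downloaded if it is not already in the model directory)
Default: 0
    * 0: Universal Sentence Encoder
* --unuse_l2_normalize
Do not L2-normalize the feature vector
Default: not set
* --use_quantize
Quantize the feature vector to bytes via scalar quantization
Default: not set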
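
Scalar quantization, as enabled by `--use_quantize`, stores each component of the embedding in one byte instead of a float. The sketch below shows one possible scheme, assuming the input is already unit-norm so every component lies in [-1.0, 1.0]. The scale factor of 128 and the helper names are assumptions made for illustration; MediaPipe's exact mapping may differ.

```python
def quantize(vector):
    """Map floats in [-1.0, 1.0] to signed byte values in [-128, 127]."""
    return [max(-128, min(127, round(v * 128))) for v in vector]


def dequantize(values):
    """Approximate inverse of quantize(); some precision is lost."""
    return [q / 128 for q in values]


# Made-up embedding components, for illustration only.
quantized = quantize([1.0, -1.0, 0.0, 0.5])
print(quantized)
print(dequantize(quantized))
```

The trade-off is a smaller embedding at the cost of a small rounding error in each component, which slightly perturbs the similarity score.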

### Language Detector
```bash
python sample_text_language_detection.py
```

Command-line options

* --input_text
Input text
Default: 分久必合合久必分
* --model
Model to use [0] (the weight file is downloaded if it is not already in the model directory)
Default: 0
    * 0: Language Detector

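### Audio Classification
```bash
python sample_audio_classification.py
```

Command-line options

* --input_audio
Path to the input audio file
Default: asset/hyakuninisshu_02.wav
* --model
Model to use [0] (the weight file is downloaded if it is not already in the model directory)
Default: 0
    * 0: YamNet
* --max_results
Number of results to output
Default: 5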
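
Audio classifiers generally expect floating-point samples, while WAV files such as the default input usually store 16-bit integers. The sketch below shows that conversion using only the standard library. It is an assumption about the preprocessing rather than the sample's actual code, which lists scipy as its dependency for this step; the helper name `load_wav_as_float` is hypothetical.

```python
import struct
import wave


def load_wav_as_float(path):
    """Read a 16-bit PCM WAV file and return (sample_rate, float samples)."""
    with wave.open(str(path), "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("only 16-bit PCM is handled in this sketch")
        sample_rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    # Little-endian signed 16-bit samples; channels stay interleaved.
    count = len(raw) // 2
    samples = struct.unpack(f"<{count}h", raw)
    # Dividing by the int16 maximum scales samples to roughly [-1.0, 1.0].
    return sample_rate, [s / 32767 for s in samples]
```

The sample rate is returned alongside the samples because a classifier needs it to interpret the waveform correctly.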

# Reference
* [google-ai-edge/mediapipe](https://github.com/google-ai-edge/mediapipe)

# Author
高橋かずひと(https://twitter.com/KzhtTkhs)

# License
mediapipe-python-sample is released under the [Apache-2.0 License](LICENSE).

# License (Images, Video, Audio)
The images and other assets used in the samples come from the following sources:
* [Free stock material site](https://www.pakutaso.com): "Spiky cactus and hedgehog" (https://www.pakutaso.com/20190257050post-19488.html)
* [Free stock material site](https://www.pakutaso.com): "Hedgehog crawling into a person's shoe" (https://www.pakutaso.com/20171041289post-13677.html)
* [Free stock material site](https://www.pakutaso.com): "Hedgehog completely hidden inside a shoe" (https://www.pakutaso.com/20171039289post-13676.html)
* [NHK Creative Library](https://www.nhk.or.jp/archives/creative/): "Cats at a cat café (3)" (https://www2.nhk.or.jp/archives/movies/?id=D0002161325_00000)
* [NHK Creative Library](https://www.nhk.or.jp/archives/creative/): "Close-up of the Torajiro statue" (https://www2.nhk.or.jp/archives/movies/?id=D0002022189_00000)
* [NHK Creative Library](https://www.nhk.or.jp/archives/creative/): "Audio: Hyakunin Isshu, No. 2" (https://www2.nhk.or.jp/archives/movies/?id=D0002110102_00000)