FunASR Deployment

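FunASR aims to build a bridge between academic research and industrial applications of speech recognition.

By releasing training and fine-tuning support for industrial-grade speech recognition models, it lets researchers and developers study these models and put them into production more easily, and helps the speech recognition ecosystem grow. Make speech recognition more fun!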

Reference: FunASR

Installation

pip3 install -U funasr  -i https://pypi.tuna.tsinghua.edu.cn/simple
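
To confirm that the install landed in the interpreter you actually plan to use, a small check like the one below can help. It is only a sketch built on the standard library's `importlib.metadata`; the helper name `installed_version` is mine, not part of FunASR.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("funasr")
    print(f"funasr version: {version}" if version else "funasr is not installed")
```

Run it with the same `python3` that `pip3` belongs to; a mismatch between the two is a common reason for an import error right after a successful install.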

Test

funasr ++model=paraformer-zh ++vad_model="fsmn-vad" ++punc_model="ct-punc" ++input=/home/sjl/.cache/modelscope/hub/iic/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online/example/asr_example.wav
funasr version: 1.1.6.
Check update of funasr, and it would cost few times. You may disable it by set `disable_update=True` in AutoModel
You are using the latest version of funasr-1.1.6
[2024-09-10 22:30:03,671][root][INFO] - download models from model hub: ms
2024-09-10 22:30:04,808 - modelscope - WARNING - Using branch: master as version is unstable, use with caution
[2024-09-10 22:30:06,197][root][INFO] - Loading pretrained params from /home/sjl/.cache/modelscope/hub/iic/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/model.pt
[2024-09-10 22:30:06,200][root][INFO] - ckpt: /home/sjl/.cache/modelscope/hub/iic/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/model.pt
/home/sjl/.conda/envs/torch-gpu/lib/python3.9/site-packages/funasr/train_utils/load_pretrained_model.py:38: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
src_state = torch.load(path, map_location=map_location)
[2024-09-10 22:30:06,476][root][INFO] - scope_map: ['module.', 'None']
[2024-09-10 22:30:06,477][root][INFO] - excludes: None
[2024-09-10 22:30:06,548][root][INFO] - Loading ckpt: /home/sjl/.cache/modelscope/hub/iic/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch/model.pt, status: <All keys matched successfully>
[2024-09-10 22:30:06,859][root][INFO] - Building VAD model.
[2024-09-10 22:30:06,859][root][INFO] - download models from model hub: ms
2024-09-10 22:30:07,208 - modelscope - WARNING - Using branch: master as version is unstable, use with caution
[2024-09-10 22:30:07,480][root][INFO] - Loading pretrained params from /home/sjl/.cache/modelscope/hub/iic/speech_fsmn_vad_zh-cn-16k-common-pytorch/model.pt
[2024-09-10 22:30:07,480][root][INFO] - ckpt: /home/sjl/.cache/modelscope/hub/iic/speech_fsmn_vad_zh-cn-16k-common-pytorch/model.pt
[2024-09-10 22:30:07,482][root][INFO] - scope_map: ['module.', 'None']
[2024-09-10 22:30:07,482][root][INFO] - excludes: None
[2024-09-10 22:30:07,483][root][INFO] - Loading ckpt: /home/sjl/.cache/modelscope/hub/iic/speech_fsmn_vad_zh-cn-16k-common-pytorch/model.pt, status: <All keys matched successfully>
[2024-09-10 22:30:07,486][root][INFO] - Building punc model.
[2024-09-10 22:30:07,487][root][INFO] - download models from model hub: ms
2024-09-10 22:30:07,794 - modelscope - WARNING - Using branch: master as version is unstable, use with caution
Building prefix dict from the default dictionary ...
[2024-09-10 22:30:08,976][jieba][DEBUG] - Building prefix dict from the default dictionary ...
Loading model from cache /tmp/jieba.cache
[2024-09-10 22:30:08,976][jieba][DEBUG] - Loading model from cache /tmp/jieba.cache
Loading model cost 0.281 seconds.
[2024-09-10 22:30:09,257][jieba][DEBUG] - Loading model cost 0.281 seconds.
Prefix dict has been built successfully.
[2024-09-10 22:30:09,257][jieba][DEBUG] - Prefix dict has been built successfully.
[2024-09-10 22:30:19,126][root][INFO] - Loading pretrained params from /home/sjl/.cache/modelscope/hub/iic/punc_ct-transformer_cn-en-common-vocab471067-large/model.pt
[2024-09-10 22:30:19,127][root][INFO] - ckpt: /home/sjl/.cache/modelscope/hub/iic/punc_ct-transformer_cn-en-common-vocab471067-large/model.pt
[2024-09-10 22:30:19,357][root][INFO] - scope_map: ['module.', 'None']
[2024-09-10 22:30:19,357][root][INFO] - excludes: None
[2024-09-10 22:30:19,419][root][INFO] - Loading ckpt: /home/sjl/.cache/modelscope/hub/iic/punc_ct-transformer_cn-en-common-vocab471067-large/model.pt, status: <All keys matched successfully>
rtf_avg: 0.026: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 6.90it/s]
0%| | 0/1 [00:00<?, ?it/s/home/sjl/.conda/envs/torch-gpu/lib/python3.9/site-packages/funasr/models/paraformer/model.py:251: FutureWarning: `torch.cuda.amp.autocast(args...)` is deprecated. Please use `torch.amp.autocast('cuda', args...)` instead.
with autocast(False):
rtf_avg: 0.129: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 1.57it/s]
rtf_avg: -0.015: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 64.17it/s]
rtf_avg: 0.118, time_speech: 5.547, time_escape: 0.655: 100%|███████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 1.52it/s]
[{'key': 'asr_example', 'text': '欢迎大家来体验达摩院推出的语音识别模型。', 'timestamp': [[880, 1120], [1120, 1360], [1380, 1540], [1540, 1780], [1780, 2020], [2020, 2180], [2180, 2420], [2480, 2600], [2600, 2780], [2780, 3020], [3040, 3240], [3240, 3480], [3480, 3700], [3700, 3900], [3900, 4140], [4180, 4420], [4420, 4620], [4620, 4780], [4780, 5195]]}]
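
The last two lines of the output are worth reading closely. Below is a rough sketch of how the numbers might be interpreted, with the values copied from the run above. It rests on three assumptions of mine: `rtf_avg` is `time_escape / time_speech`, the timestamps are in milliseconds, and there is one `[start, end]` pair per non-punctuation character (which fits this all-Chinese sentence, but probably not mixed Chinese/English text). The function names are hypothetical helpers, not FunASR API.

```python
import unicodedata

# Result copied from the run above.
result = {
    "key": "asr_example",
    "text": "欢迎大家来体验达摩院推出的语音识别模型。",
    "timestamp": [
        [880, 1120], [1120, 1360], [1380, 1540], [1540, 1780], [1780, 2020],
        [2020, 2180], [2180, 2420], [2480, 2600], [2600, 2780], [2780, 3020],
        [3040, 3240], [3240, 3480], [3480, 3700], [3700, 3900], [3900, 4140],
        [4180, 4420], [4420, 4620], [4620, 4780], [4780, 5195],
    ],
}


def real_time_factor(time_escape: float, time_speech: float) -> float:
    """Processing time divided by audio duration (assumed meaning of rtf_avg)."""
    return time_escape / time_speech


def align_characters(text, timestamps):
    """Pair each non-punctuation character with its (start, end) time in seconds.

    Assumes millisecond timestamps and exactly one pair per character.
    """
    chars = [
        c for c in text
        if not c.isspace() and not unicodedata.category(c).startswith("P")
    ]
    if len(chars) != len(timestamps):
        raise ValueError(
            f"{len(chars)} characters but {len(timestamps)} timestamps"
        )
    return [(c, start / 1000, end / 1000) for c, (start, end) in zip(chars, timestamps)]


if __name__ == "__main__":
    # time_escape and time_speech are taken from the final progress line.
    print(f"RTF: {real_time_factor(0.655, 5.547):.3f}")
    for char, start, end in align_characters(result["text"], result["timestamp"]):
        print(f"{char}  {start:.2f}s - {end:.2f}s")
```

Under these assumptions the 5.5-second clip was transcribed in about 0.66 seconds, that is, roughly 8 times faster than real time.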

Real-time speech (CPU)

Reference: FunASR Real-time Speech Transcription Service Development Guide

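# Pull the CPU image of the online (streaming) runtime SDK
docker pull registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-online-cpu-0.1.10
# Create a host directory that will hold the downloaded models
mkdir -p ./funasr-runtime-resources/models
# Start the container: host port 10096 maps to container port 10095, and the models directory is mounted at /workspace/models
docker run -p 10096:10095 -it --privileged=true -v $PWD/funasr-runtime-resources/models:/workspace/models registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-online-cpu-0.1.10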

# Inside the container: go to the runtime directory and print the start script
cd FunASR/runtime
cat test.sh
nohup bash run_server_2pass.sh \
--download-model-dir /workspace/models \
--vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
--model-dir damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx \
--online-model-dir damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online-onnx \
--punc-dir damo/punc_ct-transformer_zh-cn-common-vad_realtime-vocab272727-onnx \
--lm-dir damo/speech_ngram_lm_zh-cn-ai-wesp-fst \
--itn-dir thuduj12/fst_itn_zh \
--hotword /workspace/models/hotwords.txt > log.txt 2>&1 &

# Make the script executable and start the server, then inspect the log (run from /workspace/FunASR/runtime)
chmod +x test.sh && ./test.sh
cat log.txt
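
Since the server is started in the background with `nohup`, it is handy to check from the host whether anything is listening yet. Here is a minimal sketch using only the standard library. It assumes the server inside the container listens on port 10095, as the `-p 10096:10095` mapping above suggests, so the host-side port is 10096. `port_is_open` is a helper name I made up.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # 10096 is the host side of the `-p 10096:10095` mapping used above.
    print("server reachable:", port_is_open("127.0.0.1", 10096))
```

An open port only shows that a process has bound it. Whether the models have finished loading is still something to read from `log.txt`.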

Client

Reference: FunASR Real-time Speech Transcription Quick Deployment Tutorial

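The server-side container can also be deployed in one step with a script. In my tests, recognition on CPU was quite fast, with latency under one second, which is much better than what I got from faster-whisper on a Jetson. For Jetson deployment, see Jetson Speech Recognition.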
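
A streaming client generally feeds the server small slices of audio rather than a whole file, which is part of why latency can stay low. The sketch below only illustrates the slicing step for raw PCM. It assumes 16 kHz, 16-bit mono audio (the "16k" in the model names above suggests the sample rate; the sample width is my assumption), and the 60 ms chunk length is an illustrative value rather than something taken from the FunASR protocol. For the actual message format, follow the tutorial referenced above.

```python
from typing import Iterator

SAMPLE_RATE = 16_000   # Hz, matching the "16k" models used above
BYTES_PER_SAMPLE = 2   # assumption: 16-bit mono PCM


def pcm_chunks(pcm: bytes, chunk_ms: int = 60) -> Iterator[bytes]:
    """Yield consecutive slices of `pcm`, each covering `chunk_ms` of audio.

    The last slice may be shorter than the others.
    """
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    chunk_bytes = SAMPLE_RATE * BYTES_PER_SAMPLE * chunk_ms // 1000
    for start in range(0, len(pcm), chunk_bytes):
        yield pcm[start:start + chunk_bytes]


if __name__ == "__main__":
    one_second_of_silence = bytes(SAMPLE_RATE * BYTES_PER_SAMPLE)
    chunks = list(pcm_chunks(one_second_of_silence))
    print(len(chunks), "chunks, first chunk is", len(chunks[0]), "bytes")
```

A real client would send each chunk over its connection as soon as it is captured, instead of collecting them in a list.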