zlzhang1124/voice_activity_detection

Audio Split: speech endpoint detection and speech segmentation based on the dual-threshold method

voice_activity_detection

If you find this even a little useful, please send a heart from afar (or just click "Star").

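Speech endpoint detection and speech segmentation using the dual-threshold method, based on short-time energy and zero-crossing rate.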
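The dual-threshold idea can be sketched roughly as follows. This is a minimal pure-Python illustration, not the code in audio_split.py: the function names, the 400-sample frame with a 160-sample hop, and all threshold values are assumptions chosen for the example. A frame whose energy exceeds the high threshold is taken as speech, the segment is widened while energy stays above the low threshold, and then widened again while the zero-crossing rate stays above its own threshold.

```python
import math


def frame_features(samples, frame_len=400, hop=160):
    """Return per-frame short-time energy and zero-crossing rate lists."""
    energies, zcrs = [], []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        zcrs.append(crossings / (frame_len - 1))
    return energies, zcrs


def dual_threshold_segments(energies, zcrs, high, low, zcr_thresh):
    """Return (start_frame, end_frame) pairs, with end_frame exclusive."""
    n = len(energies)
    segments = []
    i = 0
    while i < n:
        if energies[i] <= high:
            i += 1
            continue
        # Stage 1: a frame above the high energy threshold counts as speech.
        start = end = i
        # Stage 2: widen while energy stays above the low threshold.
        while start > 0 and energies[start - 1] > low:
            start -= 1
        while end + 1 < n and energies[end + 1] > low:
            end += 1
        # Stage 3: widen further while the zero-crossing rate stays high,
        # which tends to pick up low-energy unvoiced sounds at the edges.
        while start > 0 and zcrs[start - 1] > zcr_thresh:
            start -= 1
        while end + 1 < n and zcrs[end + 1] > zcr_thresh:
            end += 1
        # Merge with the previous segment if the two touch or overlap.
        if segments and start <= segments[-1][1]:
            prev_start = segments[-1][0]
            segments[-1] = (min(prev_start, start), end + 1)
        else:
            segments.append((start, end + 1))
        i = end + 1
    return segments


# Hypothetical test signal: 0.5 s silence, 0.5 s of a 200 Hz tone, 0.5 s silence.
sr = 16000
tone = [0.5 * math.sin(2 * math.pi * 200 * k / sr) for k in range(8000)]
signal = [0.0] * 8000 + tone + [0.0] * 8000
energy, zcr = frame_features(signal)
for first, last in dual_threshold_segments(energy, zcr, 10.0, 1.0, 0.1):
    print("speech from %.2f s to %.2f s" % (first * 160 / sr, last * 160 / sr))
```

In practice the thresholds are usually derived from the signal itself (for example from the first few frames, assumed to be background noise) rather than fixed by hand as they are here.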

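Running audio_split.py directly splits every audio file in the ./raw_audio folder:

First, each source file is converted to a 16 kHz, 16-bit PCM, mono .wav file and saved in ./convert2wav;
the converted files are then split a first time and saved in ./detected_split1;
the resulting segments are split a second time and saved in ./detected_split2;
finally, the audio is sped up to meet the duration limit and saved in ./duration_limit.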
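To make the target format of the first step concrete, here is a minimal standard-library sketch of producing a 16 kHz, 16-bit PCM, mono .wav file. It is an illustration under stated assumptions, not the repository's conversion code: the function names are hypothetical, and the linear-interpolation resampler is a simple stand-in with no anti-aliasing filter, so a real pipeline would normally rely on a proper audio library for this step.

```python
import struct
import wave


def to_mono(channels):
    """Average equal-length channel lists into a single mono list."""
    return [sum(frame) / len(frame) for frame in zip(*channels)]


def resample_linear(samples, src_rate, dst_rate=16000):
    """Resample by linear interpolation (assumption: quality is not critical)."""
    if src_rate == dst_rate or len(samples) < 2:
        return list(samples)
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate      # position in the source signal
        left = int(pos)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out


def write_pcm16_wav(path, samples, rate=16000):
    """Write floats in [-1, 1] as a mono, 16-bit PCM WAV file."""
    ints = [int(max(-1.0, min(1.0, x)) * 32767) for x in samples]
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)    # mono
        wav.setsampwidth(2)    # 2 bytes per sample = 16 bit
        wav.setframerate(rate)
        wav.writeframes(struct.pack("<%dh" % len(ints), *ints))
```

A call such as `write_pcm16_wav("out.wav", resample_linear(to_mono([left, right]), 44100))` would chain the three helpers, where `left` and `right` are hypothetical per-channel sample lists.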

Each of the steps above is optional and all parameters can be set freely; the code is commented in detail.

For acoustic_feature.py, see my other repository: AcousticFeatureExtraction (acoustic feature extraction).

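For the two sample files in the ./raw_audio folder, running the program produces two plots:

Figure 1. Endpoint detection for the Mandarin phrase "蓝天白云" ("blue sky and white clouds")

Figure 2. Endpoint detection for several spoken Mandarin digits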

Python Import

Dependencies of this program (Librosa should preferably match the version I used; other versions are untested):

Librosa-0.7.2
Numpy-1.18.1
matplotlib-3.1.3
Scipy-1.4.1
Soundfile-0.9.0
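A quick way to compare an environment against the versions listed above is sketched below. It uses only the standard library's importlib.metadata; the distribution names in `PINNED` are assumptions about how these packages are named on the package index (the Soundfile one in particular may differ), and `version_mismatches` is a hypothetical helper, not part of this repository.

```python
from importlib import metadata

# Versions from the list above; the keys are assumed distribution names.
PINNED = {
    "librosa": "0.7.2",
    "numpy": "1.18.1",
    "matplotlib": "3.1.3",
    "scipy": "1.4.1",
    "soundfile": "0.9.0",
}


def version_mismatches(pinned=PINNED, lookup=metadata.version):
    """Return {name: (wanted, found)} for packages that differ or are missing."""
    problems = {}
    for name, wanted in pinned.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None  # not installed under this name
        if found != wanted:
            problems[name] = (wanted, found)
    return problems


for name, (wanted, found) in version_mismatches().items():
    print("%s: README lists %s, environment has %s" % (name, wanted, found))
```

An empty result means every listed package is installed at exactly the listed version; anything printed is worth checking first if the program behaves differently from the description here.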

License

GPL v3.0 © ZZL

Sponsorship

If you like this program and it has helped you a little, you are welcome to tip me the price of a milk tea.

AcousticFeatureExtraction

Simple acoustic feature extraction using the Librosa audio processing library and the openSMILE toolkit.

WD-detection

Automated Detection of WD Based on Improved MFCC with Signal Decomposition