Implementing Speech Recognition with FreeSWITCH + UniMRCP + Google ASR

全鸿晖

2023-12-01

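Prerequisites:

A Google Cloud account opened with an international credit card, with a project created so that you can use ASR speech recognition
A FreeSWITCH server, installed with the unimrcp.so module compiled
A CentOS 7 server, installed to run UniMRCP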

Getting started:

I. Installing and configuring the UniMRCP server

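Official documentation: http://www.unimrcp.org/manuals/html/GoogleSRRPMInstallationManual.html

Reading the official documentation is strongly recommended; it is rigorous and well organized.

Downloading and installing UniMRCP requires an account on the UniMRCP website. Without one you cannot download the packages, and the free trial license requested later also needs an account. Registration costs nothing, so go ahead and sign up.

Account registration: https://www.unimrcp.org/profile-registration

After the steps above I had my account: michael_cctv, password: michael_password

1. Install with YUM, after first adding the UniMRCP yum repository.

The repository file is /etc/yum.repos.d/unimrcp.repo. Replace username:password in the configuration below with the account and password registered on the website.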

[unimrcp]

name=UniMRCP Packages for Red Hat / Cent OS-$releasever $basearch

baseurl=https://username:password@unimrcp.org/repo/yum/main/rhel$releasever/$basearch/

enabled=1

sslverify=1

gpgcheck=1

gpgkey=https://unimrcp.org/keys/unimrcp-gpg-key.public

[unimrcp-noarch]

name=UniMRCP Packages for Red Hat / Cent OS-$releasever noarch

baseurl=https://username:password@unimrcp.org/repo/yum/main/rhel$releasever/noarch/

enabled=1

sslverify=1

gpgcheck=1

gpgkey=https://unimrcp.org/keys/unimrcp-gpg-key.public
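Because the credentials are embedded in the baseurl, a password containing characters such as @, : or / would break the URL. The following is a minimal sketch, not part of the official instructions, that renders the repository file above with percent-encoded credentials. The function name render_unimrcp_repo is hypothetical, and it is an assumption that yum accepts percent-encoded credentials in baseurl, so check the result with yum repolist.

```python
from urllib.parse import quote

# Same content as the repository file above, with placeholders for the credentials.
REPO_TEMPLATE = """\
[unimrcp]
name=UniMRCP Packages for Red Hat / Cent OS-$releasever $basearch
baseurl=https://{user}:{password}@unimrcp.org/repo/yum/main/rhel$releasever/$basearch/
enabled=1
sslverify=1
gpgcheck=1
gpgkey=https://unimrcp.org/keys/unimrcp-gpg-key.public

[unimrcp-noarch]
name=UniMRCP Packages for Red Hat / Cent OS-$releasever noarch
baseurl=https://{user}:{password}@unimrcp.org/repo/yum/main/rhel$releasever/noarch/
enabled=1
sslverify=1
gpgcheck=1
gpgkey=https://unimrcp.org/keys/unimrcp-gpg-key.public
"""


def render_unimrcp_repo(username: str, password: str) -> str:
    """Return the repo file text with the credentials percent-encoded (hypothetical helper)."""
    return REPO_TEMPLATE.format(
        user=quote(username, safe=""),
        password=quote(password, safe=""),
    )


if __name__ == "__main__":
    # Print the file; redirect the output to /etc/yum.repos.d/unimrcp.repo as root.
    print(render_unimrcp_repo("username", "password"))
```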

Verify that the repositories are reachable:

yum repolist unimrcp

yum repolist unimrcp-noarch

List the available packages:

yum --disablerepo="*" --enablerepo="unimrcp" list available

yum --disablerepo="*" --enablerepo="unimrcp-noarch" list available

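2. Install the Google ASR plugin:

yum install unimrcp-gsr

To install the additional data files for the sample client application umc, use the following command (optional, but worth installing anyway):

yum install umc-addons

(The official site also describes installing the RPM packages manually. It takes many more steps; try it yourself if you are interested.)

3. Apply for a license and install it.

We are applying for a trial license, which allows only two concurrent channels and is valid for one month.

A. The UniMRCP installation directory contains a program that collects server information, /opt/unimrcp/bin/unilicnodegen. Running it once produces a text file named uninode.info.

B. On the trial request page, send uninode.info to UniMRCP as an attachment. The trial license arrives by email the next day at the latest. The one I received, for example, was umsgsr_2469f3d1-f33d-4c58-9d83-f037ed1416dd.lic

C. Copy the license into the designated directory:

cp umsgsr_*.lic /opt/unimrcp/data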

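II. Obtaining Google Cloud credentials

Because you will be using the Google Cloud Speech-to-Text API, and assuming your international credit card and Google Cloud account are already in order, request a credentials file that authorizes the UniMRCP server to use the service.

1. Open the Cloud Platform Console.

https://console.cloud.google.com

2. In the drop-down menu at the top, select the project My First Project created by default, or create a new project.

3. Set up billing for the project (a new account comes with a 300 USD credit, which is plenty for testing).

4. Enable the Speech-to-Text API.

5. Generate the credentials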

5.1. In the Google Cloud Platform Console, navigate to API & Services > Credentials > Create credentials > Service account key

5.2. Under Service account, select New service account.

5.3. Under Service account name, enter a service account name of your choice. For example, accessor.

5.4. Under Role, select Project > Owner.

To better understand the Cloud IAM roles that you can grant to your service account to access Cloud Platform resources, check out the following page.

https://cloud.google.com/iam/docs/understanding-roles

5.5. Under Key type, leave JSON selected.

5.6. Click Create to create a new service account and download the json credentials file.

6. Install the credentials. The Google credentials are a single JSON file; copy it into the UniMRCP data directory:

cp *.json /opt/unimrcp/data
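Before copying the file, it can help to confirm that the download really is a service-account key and not some other JSON file. The sketch below checks a few fields that Google service-account key files normally contain (type, project_id, private_key, client_email). The helper name check_service_account_key is hypothetical, and the check only looks at structure; it does not contact Google or verify that the key is valid.

```python
import json

# Fields normally present in a Google service-account key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(text: str) -> list:
    """Return a list of problems; an empty list means the key looks structurally usable."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not data.get(name)]
    key_type = data.get("type")
    if key_type and key_type != "service_account":
        problems.append(f"unexpected type: {key_type!r}")
    return problems


if __name__ == "__main__":
    import sys

    # Usage: python check_key.py path/to/downloaded-key.json
    with open(sys.argv[1], encoding="utf-8") as handle:
        issues = check_service_account_key(handle.read())
    print("looks OK" if not issues else "\n".join(issues))
```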

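III. Configuring the plugin on the UniMRCP server

To make Google ASR available on the UniMRCP server, edit /opt/unimrcp/conf/unimrcpserver.xml.

Edit the <plugin-factory> element. The other plugins are mostly unnecessary and can be disabled.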

<plugin-factory>

<engine id="GSR-1" name="umsgsr" enable="true"/>

</plugin-factory>
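The same edit can be scripted. The following is a rough sketch that sets enable="true" on the GSR-1 engine and enable="false" on every other engine inside <plugin-factory>. The function name enable_only_engine is hypothetical. Note that Python's ElementTree discards XML comments when parsing, so the rewritten file loses the explanatory comments from the stock configuration; keep a backup of the original.

```python
import xml.etree.ElementTree as ET


def enable_only_engine(xml_text: str, engine_id: str = "GSR-1") -> str:
    """Enable one engine under <plugin-factory> and disable the rest (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    found = False
    # iter() also matches the root element itself, so a bare fragment works too.
    for factory in root.iter("plugin-factory"):
        for engine in factory.findall("engine"):
            wanted = engine.get("id") == engine_id
            engine.set("enable", "true" if wanted else "false")
            found = found or wanted
    if not found:
        raise ValueError(f"no <engine id={engine_id!r}> found under <plugin-factory>")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    fragment = (
        "<plugin-factory>"
        '<engine id="Demo-Synth-1" name="demosynth" enable="true"/>'
        '<engine id="GSR-1" name="umsgsr" enable="false"/>'
        "</plugin-factory>"
    )
    print(enable_only_engine(fragment))
```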

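To enable logging for the plugin, edit /opt/unimrcp/conf/logger.xml.

Add <source name="GSR-PLUGIN" priority="INFO" masking="NONE"/>

The Google ASR plugin itself is configured in /opt/unimrcp/conf/umsgsr.xml. The defaults mostly work, but whether you want single-word or sentence recognition, or need to change the recognition duration or timeout settings, this is the file to edit. The parameters are documented inside the file. Adjust and test until the behavior suits your use case.

IV. Verifying the installation

Start the UniMRCP server: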

cd /opt/unimrcp/bin

./unimrcpserver

Once it is running, you should see this line:

[INFO] Load Plugin [GSR-1] [/opt/unimrcp/plugin/umsgsr.so]

You should also see the license details for the Google ASR plugin:

[NOTICE] UniMRCP GSR License

-product name: umsgsr

-product version: 1.0.0

-license owner: Name

-license type: trial

-issue date: 2017-05-11

-exp date: 2017-06-10

-channel count: 2

-feature set: 0

[NOTICE] Set Google App Credentials /opt/unimrcp/data/My First Project-a78…c15.json
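Since the trial license lasts only a month, it is handy to read the expiry date out of the startup log. The sketch below parses "-key: value" lines in the form printed above and computes the remaining days. The function names are hypothetical, and the line format is assumed to be exactly what the log excerpt shows; a logger configured with timestamps or other prefixes would need a different pattern.

```python
import re
from datetime import date

# Matches lines such as "-exp date: 2017-06-10" from the license block.
FIELD_RE = re.compile(r"^-(?P<key>[a-z ]+):\s*(?P<value>.+?)\s*$")


def parse_license_fields(log_text: str) -> dict:
    """Collect the "-key: value" lines of the license block into a dict."""
    fields = {}
    for line in log_text.splitlines():
        match = FIELD_RE.match(line.strip())
        if match:
            fields[match.group("key")] = match.group("value")
    return fields


def days_until_expiry(fields: dict, today: date) -> int:
    """Days left before the "exp date" field; negative once the license has expired."""
    return (date.fromisoformat(fields["exp date"]) - today).days


if __name__ == "__main__":
    sample = "-license type: trial\n-issue date: 2017-05-11\n-exp date: 2017-06-10\n-channel count: 2\n"
    info = parse_license_fields(sample)
    print(info["license type"], days_until_expiry(info, date.today()))
```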

Run the client to test UniMRCP

A. Start the UniMRCP client

cd /opt/unimrcp/bin

./umc

B. Execute run gsr1 (gsr1 refers to the scenario file stored at /opt/unimrcp/conf/umc-scenarios/gsr1.xml, which is set up to submit the "call Steve" PCM file for recognition)

Console output:

<?xml version="1.0"?>

<result>

<interpretation>

<instance>call Steve</instance>

<input mode="speech">call Steve</input>

</interpretation>

</result>
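An application consuming this result usually only wants the recognized text. The following is a minimal sketch that pulls the instance and input text out of a result document of this shape. The function name extract_transcripts is hypothetical, and the sample assumes a result with <result> and <interpretation> wrapper elements and no XML namespace; a namespaced document would need namespace-aware lookups.

```python
import xml.etree.ElementTree as ET


def _text(element) -> str:
    """Return stripped element text, or an empty string if the element is absent."""
    return (element.text or "").strip() if element is not None else ""


def extract_transcripts(nlsml: str) -> list:
    """Return one dict per <interpretation> with its confidence, instance, and input text."""
    root = ET.fromstring(nlsml)
    results = []
    for interp in root.iter("interpretation"):
        results.append({
            "confidence": interp.get("confidence"),  # None when the attribute is absent
            "instance": _text(interp.find("instance")),
            "input": _text(interp.find("input")),
        })
    return results


if __name__ == "__main__":
    sample = (
        '<?xml version="1.0"?>'
        "<result><interpretation>"
        "<instance>call Steve</instance>"
        '<input mode="speech">call Steve</input>'
        "</interpretation></result>"
    )
    for item in extract_transcripts(sample):
        print(item["input"])
```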

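V. Configuring FreeSWITCH

The FreeSWITCH network interface has the IP 192.168.252.100, a DMZ address, which is also mapped to the LAN address 10.10.3.100. The UniMRCP server's LAN address is 10.10.80.173 (this setup simulates a NAT environment).

/usr/local/freeswitch/conf/autoload_configs/unimrcp.conf.xml is configured as follows.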

<!-- UniMRCP logging level to appear in freeswitch.log. Options are:

</settings>

<X-PRE-PROCESS cmd="include" data="../mrcp_profiles/*.xml"/>

</profiles>

</configuration>

The /usr/local/freeswitch/conf/mrcp_profiles directory contains many profile files, each corresponding to a different MRCP engine; any that are unused can be deleted. I created a file named unimrcpserver-mrcp-v2.xml

<!-- rtcp bye policies (rtcp must be enabled first)

0 - disable rtcp bye

1 - send rtcp bye at the end of session

2 - send rtcp bye also at the end of each talkspurt (input)

-->

</synthparams>

</recogparams>

</profile>

</include>

Using recognition in the dialplan

<!-- <action application="detect_speech" data="unimrcp:uni2 {start-input-timers=false,input-timeout=60000,recognition-timeout=60000}builtin:speech/transcribe uni2"/>

-->

</condition>

</extension>

</condition>

</extension>

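Key point: the call comes in from a conference call and acts as a standalone client that recognizes long-form speech; the recognized speech is then passed through Baidu Translate to produce text in multiple languages.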

</condition>

</extension>
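The data attribute of the commented-out detect_speech action earlier in this section packs several things into one string: the MRCP profile, a brace-delimited list of recognition parameters, the grammar, and a trailing name. As a small illustration only, the sketch below assembles a string with the same layout as that example. The function name build_detect_speech_data is hypothetical, and the layout is copied from the example above, not taken from a formal specification.

```python
def build_detect_speech_data(profile: str, grammar: str, name: str, params: dict) -> str:
    """Assemble a detect_speech data string laid out like the dialplan example (hypothetical helper)."""
    rendered = ",".join(f"{key}={value}" for key, value in params.items())
    # The parameter block is written directly in front of the grammar, with no space between.
    prefix = f"{{{rendered}}}" if rendered else ""
    return f"unimrcp:{profile} {prefix}{grammar} {name}"


if __name__ == "__main__":
    print(build_detect_speech_data(
        "uni2",
        "builtin:speech/transcribe",
        "uni2",
        {"start-input-timers": "false", "input-timeout": 60000, "recognition-timeout": 60000},
    ))
```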
