How to Build a Text-to-Speech App with the Web Speech API

刁星渊

2023-12-01

Introduction

Assuming that you’ve used several apps over the years, there is a very high chance that you have interacted with apps that provide some form of voice experience. It could be an app with text-to-speech functionality, like reading your text messages or notifications aloud. It could also be an app with voice recognition functionality like Siri or Google Assistant.

With the advent of HTML5, the number of APIs available on the web platform has grown very quickly. Over the years, we have come across APIs such as WebSocket, File, Geolocation, Notification, Battery, Vibration, DeviceOrientation, WebRTC, etc. Some of these APIs have gained very high support across various browsers.

The Web Speech API is a pair of APIs developed to make it easy to build various kinds of voice applications and experiences for the web. These APIs are still fairly experimental, although support for most of them is increasing across modern browsers.

Step 1 - Using the Web Speech API

The Web Speech API is broken into two major interfaces:

SpeechSynthesis - For text-to-speech applications. This allows apps to read out their text content using the device’s speech synthesizer. The available voice types are represented by a SpeechSynthesisVoice object, while the text to be uttered is represented by a SpeechSynthesisUtterance object. See the support table for the SpeechSynthesis interface to learn more about browser support.
SpeechRecognition - For applications that require asynchronous voice recognition. This allows apps to recognize voice context from an audio input. A SpeechRecognition object can be created using the constructor. The SpeechGrammar interface exists for representing the set of grammar that the app should recognize. See the support table for the SpeechRecognition interface to learn more about browser support.

In this tutorial, you will use the SpeechSynthesis interface to build a text-to-speech app. Here is a demo screenshot of what the app will look like (without the sound):

Getting a Reference

Getting a reference to a SpeechSynthesis object can be done with a single line of code:

var synthesis = window.speechSynthesis;

It is very useful to check if SpeechSynthesis is supported by the browser before using the functionality it provides. The following code snippet shows how to check for browser support:

if ('speechSynthesis' in window) {
  var synthesis = window.speechSynthesis;

} else {
  console.log('Text-to-speech not supported.');
}

Getting Available Voices

In this step you will build on your already existing code to get the available speech voices. The getVoices() method returns a list of SpeechSynthesisVoice objects representing all the available voices on the device.

Take a look at the following code snippet:

if ('speechSynthesis' in window) {

  var synthesis = window.speechSynthesis;

  // Regex to match all English language tags e.g en, en-US, en-GB
  var langRegex = /^en(-[a-z]{2})?$/i;

  // Get the available voices and filter the list to only have English speakers
  var voices = synthesis.getVoices().filter(voice => langRegex.test(voice.lang));

  // Log the properties of the voices in the list
  voices.forEach(function(voice) {
    console.log({
      name: voice.name,
      lang: voice.lang,
      uri: voice.voiceURI,
      local: voice.localService,
      default: voice.default
    })
  });

} else {
  console.log('Text-to-speech not supported.');
}

In the above snippet, you get the list of available voices on the device and filter it with the langRegex regular expression so that only voices with an English language tag remain. Finally, you loop through the voices in the list and log the properties of each to the console.
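
The filtering step can also be tried outside a browser by running it against plain objects. The following is a minimal sketch under that assumption: the filterByLanguage helper and the sample voice list are hypothetical, and only the name and lang fields mirror real SpeechSynthesisVoice properties.

```javascript
// Hypothetical helper: keeps only the voices whose language tag matches
// the given pattern. Works on any array of objects with a `lang` field.
function filterByLanguage(voices, langRegex) {
  return voices.filter(function (voice) {
    return langRegex.test(voice.lang);
  });
}

// Same pattern as in the snippet above: matches en, en-US, en-GB, ...
var englishRegex = /^en(-[a-z]{2})?$/i;

// Made-up sample data standing in for synthesis.getVoices()
var sampleVoices = [
  { name: 'Voice A', lang: 'en-US' },
  { name: 'Voice B', lang: 'fr-FR' },
  { name: 'Voice C', lang: 'en' },
  { name: 'Voice D', lang: 'en-GB' },
  { name: 'Voice E', lang: 'de-DE' }
];

var englishVoices = filterByLanguage(sampleVoices, englishRegex);

// Logs the names of Voice A, Voice C and Voice D
console.log(englishVoices.map(function (v) { return v.name; }));
```

In a browser, the same helper could be called with synthesis.getVoices() as its first argument.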

Constructing Speech Utterances

In this step you will construct speech utterances by using the SpeechSynthesisUtterance constructor and setting values for the available properties. The following code snippet creates a speech utterance for reading the text "Hello World".

if ('speechSynthesis' in window) {

  var synthesis = window.speechSynthesis;

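  // Get the first English voice in the list (e.g. en, en-US, en-GB).
  // Most voices report a regional tag, so an exact match on 'en' would often find nothing.
  var voice = synthesis.getVoices().filter(function(voice) {
    return /^en(-[a-z]{2})?$/i.test(voice.lang);
  })[0];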

  // Create an utterance object
  var utterance = new SpeechSynthesisUtterance('Hello World');

  // Set utterance properties
  utterance.voice = voice;
  utterance.pitch = 1.5;
  utterance.rate = 1.25;
  utterance.volume = 0.8;

  // Speak the utterance
  synthesis.speak(utterance);

} else {
  console.log('Text-to-speech not supported.');
}

Here, you get the first English (en) voice from the list of available voices. Next, you create a new utterance using the SpeechSynthesisUtterance constructor. You then set some properties on the utterance object, such as voice, pitch, rate, and volume. Finally, you speak the utterance using the speak() method of SpeechSynthesis.
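
The Web Speech API defines valid ranges for these properties: pitch from 0 to 2, rate from 0.1 to 10, and volume from 0 to 1, each defaulting to 1. If the values come from user input, it can help to clamp them first. The sketch below shows one possible way to do that; clampUtteranceSettings and LIMITS are hypothetical names, not part of the API.

```javascript
// Valid ranges defined by the Web Speech API for utterance properties
var LIMITS = {
  pitch: { min: 0, max: 2 },
  rate: { min: 0.1, max: 10 },
  volume: { min: 0, max: 1 }
};

// Hypothetical helper: returns a new object with each known property
// clamped into its valid range. Unknown keys are ignored, and missing or
// non-numeric values fall back to the default of 1.
function clampUtteranceSettings(settings) {
  var result = {};
  Object.keys(LIMITS).forEach(function (key) {
    var value = Number.isFinite(settings[key]) ? settings[key] : 1;
    result[key] = Math.min(LIMITS[key].max, Math.max(LIMITS[key].min, value));
  });
  return result;
}

// pitch is capped at 2, rate is raised to 0.1, volume is left at 0.8
var safe = clampUtteranceSettings({ pitch: 3, rate: 0, volume: 0.8 });
console.log(safe);
```

In a browser, the result could then be copied onto an utterance, for example with Object.assign(utterance, safe).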

Note: There is a limit to the size of text that can be spoken in an utterance. The maximum length of the text that can be spoken in each utterance is 32,767 characters.
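
For longer texts, one option is to split the text into several pieces and queue one utterance per piece. The following is a rough sketch of such a splitter; splitForUtterances is a hypothetical helper, and breaking at spaces is just one simple strategy (sentence boundaries would usually sound better).

```javascript
// Limit quoted in the note above
var MAX_UTTERANCE_LENGTH = 32767;

// Hypothetical helper: splits `text` into pieces of at most `maxLength`
// characters, preferring to break at the last space inside each window
// so that words are not cut in half.
function splitForUtterances(text, maxLength) {
  var limit = maxLength || MAX_UTTERANCE_LENGTH;
  var chunks = [];
  var rest = text.trim();

  while (rest.length > limit) {
    var breakAt = rest.lastIndexOf(' ', limit);
    // No space found in the window: fall back to a hard cut
    if (breakAt <= 0) breakAt = limit;
    chunks.push(rest.slice(0, breakAt).trim());
    rest = rest.slice(breakAt).trim();
  }

  if (rest.length > 0) chunks.push(rest);
  return chunks;
}

// A small limit is used here only to make the splitting visible
console.log(splitForUtterances('one two three four', 9));
```

In a browser, each returned piece could be wrapped in its own SpeechSynthesisUtterance and passed to speak() in order.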

Notice that you passed the text to be uttered to the constructor. You can also set the text by assigning to the text property of the utterance object. This overrides whatever text was passed to the constructor. Here is a simple example:

var synthesis = window.speechSynthesis;
var utterance = new SpeechSynthesisUtterance("Hello World");

// This overrides the text "Hello World" and is uttered instead
utterance.text = "My name is Glad.";

synthesis.speak(utterance);

Speaking an Utterance

In the previous code snippet, we have seen how to speak utterances by calling the speak() method on the SpeechSynthesis instance. We simply pass in the SpeechSynthesisUtterance instance as argument to the speak() method to speak the utterance.

var synthesis = window.speechSynthesis;

var utterance1 = new SpeechSynthesisUtterance("Hello World");
var utterance2 = new SpeechSynthesisUtterance("My name is Glad.");
var utterance3 = new SpeechSynthesisUtterance("I'm a web developer from Nigeria.");

synthesis.speak(utterance1);
synthesis.speak(utterance2);
synthesis.speak(utterance3);

You can do a few other things with the SpeechSynthesis instance, such as pausing, resuming, and canceling utterances. The pause(), resume(), and cancel() methods are available on the SpeechSynthesis instance for this purpose.
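
As an illustration, pause() and resume() can be combined into a single toggle by reading the speaking and paused flags that SpeechSynthesis also exposes. This is only a sketch: togglePause is a hypothetical helper, and fakeSynthesis is a stand-in object used so the snippet can run outside a browser.

```javascript
// Hypothetical helper: toggles between the paused and speaking states.
// `synthesis` is expected to expose the SpeechSynthesis members used here:
// the `speaking` and `paused` flags and the pause() and resume() methods.
// Returns a string describing the action that was taken.
function togglePause(synthesis) {
  if (!synthesis.speaking) return 'idle'; // nothing is being spoken

  if (synthesis.paused) {
    synthesis.resume();
    return 'resumed';
  }

  synthesis.pause();
  return 'paused';
}

// Stand-in object (an assumption, not a real SpeechSynthesis instance)
var fakeSynthesis = {
  speaking: true,
  paused: false,
  pause: function () { this.paused = true; },
  resume: function () { this.paused = false; }
};

console.log(togglePause(fakeSynthesis)); // first call pauses
console.log(togglePause(fakeSynthesis)); // second call resumes
```

In a browser, the helper would be called as togglePause(window.speechSynthesis).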

Step 2 - Building the Text-to-Speech App

We have seen the basic aspects of the SpeechSynthesis interface. We will now start building our text-to-speech application. Before we begin, ensure that you have Node.js and npm installed on your machine.

Run the following commands in your terminal to set up a project for the app and install the dependencies.

Create a new project directory:

mkdir web-speech-app

Move into the newly created project directory:

cd web-speech-app

Initialize the project:

npm init -y

Install dependencies needed for the project:

npm install express cors axios

Modify the "scripts" section of the package.json file to look like the following snippet:

package.json

"scripts": {
  "start": "node server.js"
}

Now that you have initialized a project for the application, you will proceed to set up a server for the app using Express.

Create a new server.js file and add the following content to it:

server.js

const cors = require('cors');
const path = require('path');
const axios = require('axios');
const express = require('express');

const app = express();
const PORT = process.env.PORT || 5000;

app.set('port', PORT);

// Enable CORS(Cross-Origin Resource Sharing)
app.use(cors());

// Serve static files from the /public directory
app.use('/', express.static(path.join(__dirname, 'public')));

// A simple endpoint for fetching a random quote from QuotesOnDesign
app.get('/api/quote', (req, res) => {
  axios.get('http://quotesondesign.com/wp-json/posts?filter[orderby]=rand&filter[posts_per_page]=1')
    .then((response) => {
      const [ post ] = response.data;
      const { title, content } = post || {};

      return (title && content)
        ? res.json({ status: 'success', data: { title, content } })
        : res.status(500).json({ status: 'failed', message: 'Could not fetch quote.' });
    })
    .catch(err => res.status(500).json({ status: 'failed', message: 'Could not fetch quote.' }));
});

app.listen(PORT, () => console.log(`> App server is running on port ${PORT}.`));

Here, you set up a Node server using Express. You enable CORS (Cross-Origin Resource Sharing) using the cors() middleware. You also use the express.static() middleware to serve static files from the /public directory in the project root. This will enable you to serve the index page that you will create soon.

Finally, you set up a GET /api/quote route for fetching a random quote from the QuotesOnDesign API service. You are using axios (a promise-based HTTP client library) to make the HTTP request.

Here is what a sample response from the QuotesOnDesign API looks like:

Output

[
  {
    "ID": 2291,
    "title": "Victor Papanek",
    "content": "<p>Any attempt to separate design, to make it a thing-by-itself, works counter to the inherent value of design as the primary, underlying matrix of life.</p>\n",
    "link": "https://quotesondesign.com/victor-papanek-4/",
    "custom_meta": {
      "Source": "<a href=\"http://www.amazon.com/Design-Real-World-Ecology-Social/dp/0897331532\">book</a>"
    }
  }
]

When you fetch a quote successfully, the quote’s title and content are returned in the data field of the JSON response. Otherwise, a failure JSON response with a 500 HTTP status code would be returned.
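
The success and failure branches of the route can be looked at in isolation by pulling the decision into a small pure function. The sketch below is an illustration only: toQuoteResponse is a hypothetical helper that mirrors the handler's logic, and the sample post is a shortened, made-up stand-in for the API response.

```javascript
// Hypothetical helper mirroring the logic of the /api/quote handler:
// takes the array returned by the quotes service and produces the HTTP
// status code and JSON body that the route would send.
function toQuoteResponse(posts) {
  var post = Array.isArray(posts) ? posts[0] : undefined;
  var title = post && post.title;
  var content = post && post.content;

  if (title && content) {
    return {
      status: 200,
      body: { status: 'success', data: { title: title, content: content } }
    };
  }

  return {
    status: 500,
    body: { status: 'failed', message: 'Could not fetch quote.' }
  };
}

// Shortened stand-in for a post returned by the quotes service
var ok = toQuoteResponse([
  { ID: 1, title: 'Some Author', content: '<p>Some quote.</p>\n', link: '' }
]);

// An empty list has no post to read from, so the failure branch is taken
var failed = toQuoteResponse([]);

console.log(ok);
console.log(failed);
```

Only title and content reach the client; the other fields of the post are dropped, just as in the route handler.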

Next, you will create an index page for the app view. First create a new public folder in the root of your project. Next, create a new index.html file in the newly created public folder and add the following content to it:

public/index.html

<html>

<head>
    <title>Daily Quotes</title>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
    <link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.1.1/css/bootstrap.min.css" integrity="sha384-WskhaSGFgHYWDcbwN70/dfYBj47jz9qbsMId/iRN3ewGhXQFZCSftd1LZCfmhktB" crossorigin="anonymous">
</head>

<body class="position-absolute h-100 w-100">
    <div id="app" class="d-flex flex-wrap align-items-center align-content-center p-5 mx-auto w-50 position-relative"></div>

    <script src="https://unpkg.com/jquery/dist/jquery.min.js"></script>
    <script src="https://cdn.jsdelivr.net/npm/feather-icons/dist/feather.min.js"></script>
    <script src="main.js"></script>
</body>

</html>

This creates a basic index page for the app with just one <div id="app">, which will serve as the mount point for all the dynamic content of the app. You have also added a link to the Bootstrap CDN to get some default Bootstrap 4 styling, and included jQuery for DOM manipulation and Ajax requests and Feather for elegant SVG icons.

Step 3 - Building the Main Script

Now you are down to the last piece that powers the app—the main script. Create a new main.js file in the public directory of your app and add the following content to it:

public/main.js

jQuery(function($) {

  let app = $('#app');

  let SYNTHESIS = null;
  let VOICES = null;

  let QUOTE_TEXT = null;
  let QUOTE_PERSON = null;

  let VOICE_SPEAKING = false;
  let VOICE_PAUSED = false;
  let VOICE_COMPLETE = false;

  let iconProps = {
    'stroke-width': 1,
    'width': 48,
    'height': 48,
    'class': 'text-secondary d-none',
    'style': 'cursor: pointer'
  };

  function iconSVG(icon) {}

  function showControl(control) {}

  function hideControl(control) {}

  function getVoices() {}

  function resetVoice() {}

  function fetchNewQuote() {}

  function renderQuote(quote) {}

  function renderVoiceControls(synthesis, voice) {}

  function updateVoiceControls() {}

  function initialize() {}

  initialize();

});

This code uses jQuery to execute a function when the DOM is loaded. You get a reference to the #app element and initialize some variables. You also declare a couple of empty functions that you will implement in the following sections. Finally, you call the initialize() function to initialize the application.

The iconProps variable contains a couple of properties that will be used for rendering Feather icons as SVG to the DOM.

With that code in place, you are ready to start implementing the functions. Modify the public/main.js file to implement the following functions:

public/main.js

// Gets the SVG markup for a Feather icon
function iconSVG(icon) {
  // Copy iconProps into a new object so the shared defaults are not mutated
  let props = $.extend({}, iconProps, { id: icon });
  return feather.icons[icon].toSvg(props);
}

// Shows an element
function showControl(control) {
  control.addClass('d-inline-block').removeClass('d-none');
}

// Hides an element
function hideControl(control) {
  control.addClass('d-none').removeClass('d-inline-block');
}

// Get the available voices and keep only the English ones
function getVoices() {
  // Regex to match all English language tags e.g en, en-US, en-GB
  let langRegex = /^en(-[a-z]{2})?$/i;

  // Get the available voices and filter the list to only have English speakers
  VOICES = SYNTHESIS.getVoices()
    .filter(function (voice) { return langRegex.test(voice.lang) })
    .map(function (voice) {
      return { voice: voice, name: voice.name, lang: voice.lang.toUpperCase() }
    });
}

// Reset the voice variables to the defaults
function resetVoice() {
  VOICE_SPEAKING = false;
  VOICE_PAUSED = false;
  VOICE_COMPLETE = false;
}

The iconSVG(icon) function takes a Feather icon name string as its argument (e.g. 'play-circle') and returns the SVG markup for the icon. See the Feather website for the complete list of available icons, and the Feather documentation to learn more about the API.

The getVoices() function uses the SYNTHESIS object to fetch the list of all the available voices on the device. Then, it filters the list using a regular expression to get the voices of only English speakers.

Next, you will implement the functions for fetching and rendering quotes on the DOM. Modify the public/main.js file to implement the following functions:

public/main.js

function fetchNewQuote() {
  // Clean up the #app element
  app.html('');

  // Reset the quote variables
  QUOTE_TEXT = null;
  QUOTE_PERSON = null;

  // Reset the voice variables
  resetVoice();

  // Pick a voice at random from the VOICES list
  let voice = (VOICES && VOICES.length > 0)
    ? VOICES[ Math.floor(Math.random() * VOICES.length) ]
    : null;

  // Fetch a quote from the API and render the quote and voice controls
  $.get('/api/quote', function (quote) {
    renderQuote(quote.data);
    SYNTHESIS && renderVoiceControls(SYNTHESIS, voice || null);
  });
}

function renderQuote(quote) {

  // Create some markup for the quote elements
  let quotePerson = $('<h1 id="quote-person" class="mb-2 w-100"></h1>');
  let quoteText = $('<div id="quote-text" class="h3 py-5 mb-4 w-100 font-weight-light text-secondary border-bottom border-gray"></div>');

  // Add the quote data to the markup
  quotePerson.html(quote.title);
  quoteText.html(quote.content);

  // Attach the quote elements to the DOM
  app.append(quotePerson);
  app.append(quoteText);

  // Update the quote variables with the new data
  QUOTE_TEXT = quoteText.text();
  QUOTE_PERSON = quotePerson.text();

}

Here in the fetchNewQuote() method, you first reset the app element and variables. You then pick a voice randomly using Math.random() from the list of voices stored in the VOICES variable. You use $.get() to make an AJAX request to the /api/quote endpoint, to fetch a random quote and render the quote data to the view alongside the voice controls.
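
The random selection can be isolated into a small function, which also makes it easy to check with a fixed random source. This is a sketch rather than part of the app: pickRandom and its optional random parameter are hypothetical.

```javascript
// Hypothetical helper: returns a random element of `list`, or null when
// the list is missing or empty. `random` defaults to Math.random and can
// be replaced with a fixed function to make the result predictable.
function pickRandom(list, random) {
  if (!list || list.length === 0) return null;
  var rand = random || Math.random;
  return list[Math.floor(rand() * list.length)];
}

// Made-up entries shaped like the items stored in VOICES
var sampleVoices = [
  { name: 'Voice A', lang: 'EN-US' },
  { name: 'Voice B', lang: 'EN-GB' },
  { name: 'Voice C', lang: 'EN' }
];

// With a fixed value of 0.5, the index is Math.floor(0.5 * 3) = 1
var middle = pickRandom(sampleVoices, function () { return 0.5; });
console.log(middle.name);

// Without a second argument, Math.random() decides
console.log(pickRandom(sampleVoices).name);
```

Because Math.random() returns a value below 1, the computed index always stays inside the list.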

The renderQuote(quote) method receives a quote object as its argument and adds the contents to the DOM. Finally, it updates the quote variables: QUOTE_TEXT and QUOTE_PERSON.

If you look at the fetchNewQuote() function, you’ll notice that you made a call to the renderVoiceControls() function. This function is responsible for rendering the controls for playing, pausing, and stopping the voice output. It also renders the current voice in use and the language.

Make the following modifications to the public/main.js file to implement the renderVoiceControls() function:

public/main.js

function renderVoiceControls(synthesis, voice) {

  let controlsPane = $('<div id="voice-controls-pane" class="d-flex flex-wrap w-100 align-items-center align-content-center justify-content-between"></div>');

  let voiceControls = $('<div id="voice-controls"></div>');

  // Create the SVG elements for the voice control buttons
  let playButton = $(iconSVG('play-circle'));
  let pauseButton = $(iconSVG('pause-circle'));
  let stopButton = $(iconSVG('stop-circle'));

  // Helper function to enable pause state for the voice output
  let paused = function () {
    VOICE_PAUSED = true;
    updateVoiceControls();
  };

  // Helper function to disable pause state for the voice output
  let resumed = function () {
    VOICE_PAUSED = false;
    updateVoiceControls();
  };

  // Click event handler for the play button
  playButton.on('click', function (evt) {});

  // Click event handler for the pause button
  pauseButton.on('click', function (evt) {});

  // Click event handler for the stop button
  stopButton.on('click', function (evt) {});

  // Add the voice controls to their parent element
  voiceControls.append(playButton);
  voiceControls.append(pauseButton);
  voiceControls.append(stopButton);

  // Add the voice controls parent to the controlsPane element
  controlsPane.append(voiceControls);

  // If voice is available, add the voice info element to the controlsPane
  if (voice) {
    let currentVoice = $('<div class="text-secondary font-weight-normal"><span class="text-dark font-weight-bold">' + voice.name + '</span> (' + voice.lang + ')</div>');

    controlsPane.append(currentVoice);
  }

  // Add the controlsPane to the DOM
  app.append(controlsPane);

  // Show the play button
  showControl(playButton);

}

Here, you create container elements for the voice controls and the controls pane. You use the iconSVG() function created earlier to get the SVG markup for the control buttons and create the button elements as well. You define the paused() and resumed() helper functions, which will be used while setting up the event handlers for the buttons.

Finally, you render the voice control buttons and the voice info to the DOM. It is also configured so only the Play button is shown initially.

Next, you will implement the click event handlers for the voice control buttons you defined in the previous section. Set up the event handlers as shown in the following code snippet:

playButton.on('click', function (evt) {
  evt.preventDefault();

  if (VOICE_SPEAKING) {

    // If voice is paused, it is resumed when the playButton is clicked
    if (VOICE_PAUSED) synthesis.resume();
    return resumed();

  } else {

    // Create utterances for the quote and the person
    let quoteUtterance = new SpeechSynthesisUtterance(QUOTE_TEXT);
    let personUtterance = new SpeechSynthesisUtterance(QUOTE_PERSON);

    // Set the voice for the utterances if available
    if (voice) {
      quoteUtterance.voice = voice.voice;
      personUtterance.voice = voice.voice;
    }

    // Set event listeners for the quote utterance
    quoteUtterance.onpause = paused;
    quoteUtterance.onresume = resumed;
    quoteUtterance.onboundary = updateVoiceControls;

    // Set the listener to activate speaking state when the quote utterance starts
    quoteUtterance.onstart = function (evt) {
      VOICE_COMPLETE = false;
      VOICE_SPEAKING = true;
      updateVoiceControls();
    }

    // Set event listeners for the person utterance
    personUtterance.onpause = paused;
    personUtterance.onresume = resumed;
    personUtterance.onboundary = updateVoiceControls;

    // Refresh the app and fetch a new quote when the person utterance ends
    personUtterance.onend = fetchNewQuote;

    // Speak the utterances
    synthesis.speak(quoteUtterance);
    synthesis.speak(personUtterance);

  }

});

pauseButton.on('click', function (evt) {
  evt.preventDefault();

  // Pause the utterance if it is not in paused state
  if (VOICE_SPEAKING) synthesis.pause();
  return paused();
});

stopButton.on('click', function (evt) {
  evt.preventDefault();

  // Clear the utterances queue
  if (VOICE_SPEAKING) synthesis.cancel();
  resetVoice();

  // Set the complete status of the voice output
  VOICE_COMPLETE = true;
  updateVoiceControls();
});

Here, you set up the click event listeners for the voice control buttons. When the Play button is clicked, it starts speaking the utterances starting with the quoteUtterance and then the personUtterance. However, if the voice output is in a paused state, it resumes it.

You set VOICE_SPEAKING to true in the onstart event handler for the quoteUtterance. The app will also refresh and fetch a new quote when the personUtterance ends.

The Pause button pauses the voice output, while the Stop button ends the voice output and removes all utterances from the queue, using the cancel() method of the SpeechSynthesis interface. The code calls the updateVoiceControls() function each time to display the appropriate buttons.

You have made a couple of calls and references to the updateVoiceControls() function in the previous code snippets. This function is responsible for updating the voice controls to display the appropriate controls based on the voice state variables.

Go ahead and make the following modifications to the public/main.js file to implement the updateVoiceControls() function:

public/main.js

function updateVoiceControls() {

  // Get a reference to each control button
  let playButton = $('#play-circle');
  let pauseButton = $('#pause-circle');
  let stopButton = $('#stop-circle');

  if (VOICE_SPEAKING) {

    // Show the stop button if speaking is in progress
    showControl(stopButton);

    // Toggle the play and pause buttons based on paused state
    if (VOICE_PAUSED) {
      showControl(playButton);
      hideControl(pauseButton);
    } else {
      hideControl(playButton);
      showControl(pauseButton);
    }

  } else {
    // Show only the play button if no speaking is in progress
    showControl(playButton);
    hideControl(pauseButton);
    hideControl(stopButton);
  }

}

In this code you first get a reference to each of the voice control button elements. Then, you specify which voice control buttons should be visible at different states of the voice output.
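
The visibility rules boil down to a mapping from the voice state to a set of buttons. The sketch below expresses that mapping as a pure function; visibleControls and the shape of its state argument are hypothetical and are not used by the app itself.

```javascript
// Hypothetical pure function: given the voice state flags, returns the
// names of the controls that should be visible, following the same rules
// as updateVoiceControls().
function visibleControls(state) {
  // Nothing is being spoken: only the play button is shown
  if (!state.speaking) return ['play'];

  // Speaking: the stop button is always shown, next to either
  // play (when paused) or pause (when not paused)
  return state.paused ? ['play', 'stop'] : ['pause', 'stop'];
}

console.log(visibleControls({ speaking: false, paused: false })); // play only
console.log(visibleControls({ speaking: true, paused: false }));  // pause and stop
console.log(visibleControls({ speaking: true, paused: true }));   // play and stop
```

Keeping the rules in one place like this can make it easier to reason about which buttons appear in each state.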

You are now ready to implement the initialize() function. This function is responsible for initializing the application. Add the following code snippet to the public/main.js file to implement the initialize() function.

public/main.js

function initialize() {
  if ('speechSynthesis' in window) {

    SYNTHESIS = window.speechSynthesis;

    let timer = setInterval(function () {
      let voices = SYNTHESIS.getVoices();

      if (voices.length > 0) {
        getVoices();
        fetchNewQuote();
        clearInterval(timer);
      }
    }, 200);

  } else {

    let message = 'Text-to-speech not supported by your browser.';

    // Create the browser notice element
    let notice = $('<div class="w-100 py-4 bg-danger font-weight-bold text-white position-absolute text-center" style="bottom:0; z-index:10">' + message + '</div>');

    fetchNewQuote();
    console.log(message);

    // Display non-support info on DOM
    $(document.body).append(notice);

  }
}

This code first checks whether speechSynthesis is available on the window global object and, if so, assigns it to the SYNTHESIS variable. Next, you set up an interval for fetching the list of available voices.

You are using an interval here because there is a known asynchronous behavior with SpeechSynthesis.getVoices() that makes it return an empty array at the initial call because the voices have not been loaded yet. The interval ensures that you get a list of voices before fetching a random quote and clearing the interval.
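
The Web Speech API also defines a voiceschanged event on SpeechSynthesis, which fires when the list of voices changes. In browsers that fire it, waiting for the event is an alternative to polling; where it is not fired, the interval approach remains a reasonable fallback. The sketch below illustrates the idea: whenVoicesReady is a hypothetical helper, and fake is a stand-in object so the snippet can run outside a browser.

```javascript
// Hypothetical helper: calls `callback` once with the list of voices as
// soon as it is non-empty. `synthesis` is expected to look like
// window.speechSynthesis (getVoices(), addEventListener(),
// removeEventListener()).
function whenVoicesReady(synthesis, callback) {
  var voices = synthesis.getVoices();

  // Voices already loaded: no need to wait
  if (voices.length > 0) {
    callback(voices);
    return;
  }

  function onVoicesChanged() {
    var loaded = synthesis.getVoices();
    if (loaded.length === 0) return; // still empty, keep waiting
    synthesis.removeEventListener('voiceschanged', onVoicesChanged);
    callback(loaded);
  }

  synthesis.addEventListener('voiceschanged', onVoicesChanged);
}

// Stand-in for window.speechSynthesis (an assumption for this demo)
var fake = {
  voices: [],
  listeners: [],
  getVoices: function () { return this.voices; },
  addEventListener: function (type, fn) { this.listeners.push(fn); },
  removeEventListener: function (type, fn) {
    this.listeners = this.listeners.filter(function (l) { return l !== fn; });
  }
};

var received = null;
whenVoicesReady(fake, function (voices) { received = voices; });

// Simulate the browser finishing loading its voices and firing the event
fake.voices = [{ name: 'Voice A', lang: 'en-US' }];
fake.listeners.slice().forEach(function (fn) { fn(); });

console.log(received.length);
```

In the app, the callback would be the place to call getVoices() and fetchNewQuote().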

You have now successfully completed the text-to-speech app. You can start the app by running npm start in your terminal. The app should start running on port 5000 if it is available. Visit http://localhost:5000 in your browser to see the app.

Once running, your app will look like the following screenshot:

Conclusion

In this tutorial, you used the Web Speech API to build a text-to-speech app for the web. You can learn more about the Web Speech API and also find some helpful resources here.

If you’d like to continue refining your app, there are a couple of interesting features you can still implement and experiment with such as volume controls, voice pitch controls, speed/rate controls, percentage of text uttered, etc.

For the complete source code of this tutorial, check out the web-speech-demo repository on GitHub.

Translated from: https://www.digitalocean.com/community/tutorials/how-to-build-a-text-to-speech-app-with-web-speech-api
