Question:

How to access Google Text-to-Speech beta features (released March 1, 2021)

孙辰阳

2023-03-14

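On March 1, 2021, Google Text-to-Speech released beta features, including support for SSML.

I want to use these beta features, but I can't tell which channel they were released to or how to access them. I haven't found any breadcrumbs in the documentation that lead to them.

I noticed that the demo on the TTS product home page uses v1beta1 but does not appear to support the <voice> tag.

That is, for this SSML: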

<speak>
Blah Blah English Text. <voice name="ko-KR-Wavenet-D"> Blah Blah Korean Text.</voice> <break time="400ms" /> Blah Blah English Text.
</speak>

the demo shows the following JSON request body:

{
  "audioConfig": {
    "audioEncoding": "LINEAR16",
    "pitch": 0,
    "speakingRate": 1
  },
  "input": {
    "ssml": "<speak> Blah Blah English Text. Blah Blah Korean Text. <break time=\"400ms\" /> Blah Blah English Text. </speak>"
  },
  "voice": {
    "languageCode": "en-US",
    "name": "en-US-Wavenet-D"
  }
}
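A quick way to confirm which tags were dropped is to compare the element names in the SSML I entered with those in the request body the demo shows. This is only a minimal sketch: the helper names `tagNames` and `missingTags` are my own and are not part of any Google library, and the regular expression handles simple SSML like the above, not arbitrary XML.

```javascript
// Hypothetical helper: collects the names of opening and self-closing tags.
// Closing tags such as </voice> are skipped because the name must follow "<" directly.
function tagNames(ssml) {
  const names = new Set();
  const re = /<([A-Za-z][\w-]*)\b[^>]*>/g;
  let match;
  while ((match = re.exec(ssml)) !== null) {
    names.add(match[1]);
  }
  return names;
}

// Hypothetical helper: tag names present in `sent` but absent from `echoed`.
function missingTags(sent, echoed) {
  const echoedNames = tagNames(echoed);
  return [...tagNames(sent)].filter((name) => !echoedNames.has(name));
}

// The SSML entered in the demo, and the SSML from the demo's request body.
const sent = '<speak>Blah Blah English Text. <voice name="ko-KR-Wavenet-D"> Blah Blah Korean Text.</voice> <break time="400ms" /> Blah Blah English Text.</speak>';
const echoed = '<speak> Blah Blah English Text. Blah Blah Korean Text. <break time="400ms" /> Blah Blah English Text. </speak>';

console.log(missingTags(sent, echoed)); // [ 'voice' ]
```

Only `voice` is reported as missing; `speak` and `break` survive in the demo's request body.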

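We tried this in our own script, which uses the Google Text-to-Speech API to generate audio from a CSV prompt sheet. Historically we have used the general release. When we switched to v1beta1, the script still worked, but

Our script uses: const textToSpeech = require('@google-cloud/text-to-speech') and the general release client, const client = new textToSpeech.TextToSpeechClient()

We have been trying to access the March 1 beta features with const client = new textToSpeech.v1beta1.TextToSpeechClient()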

楚硕

2023-03-14

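According to the release notes for the Text-to-Speech API

The SSML documentation says

You can refer to the Node.js code below and the output audio files.

tts1.js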

// Imports the Google Cloud client library
const textToSpeech = require('@google-cloud/text-to-speech');
// Import other required libraries
const fs = require('fs');
const util = require('util');
// Creates a client
const client = new textToSpeech.v1beta1.TextToSpeechClient();
async function quickStart() {
 // The SSML to synthesize
 const ssml = '<speak>And then she asked, <voice name="en-IN-Wavenet-D"> where were you yesterday </voice><break time="250ms"/> in her sweet and gentle voice.</speak>';

 // Construct the request
 const request = {
   input: {ssml: ssml},
   // Select the language and SSML voice gender (optional)
   voice: {languageCode: 'en-US', ssmlGender: 'NEUTRAL'},
   // select the type of audio encoding
   audioConfig: {audioEncoding: 'MP3'},
 };

 // Performs the text-to-speech request
 const [response] = await client.synthesizeSpeech(request);
 // Write the binary audio content to a local file
 const writeFile = util.promisify(fs.writeFile);
 await writeFile('output.mp3', response.audioContent, 'binary');
 console.log('Audio content written to file: output.mp3');
}
quickStart();

Output MP3 file: output1 (using v1beta1)

I also tried it in Node.js without the v1beta1 version, and it works fine.

tts2.js:

// Imports the Google Cloud client library
const textToSpeech = require('@google-cloud/text-to-speech');

// Import other required libraries
const fs = require('fs');
const util = require('util');
// Creates a client
const client = new textToSpeech.TextToSpeechClient();
async function quickStart() {
 // The SSML to synthesize
 const ssml = '<speak>And then she asked, <voice name="en-IN-Wavenet-D"> where were you yesterday </voice><break time="250ms"/> in her sweet and gentle voice.</speak>';

 // Construct the request
 const request = {
   input: {ssml: ssml},
   // Select the language and SSML voice gender (optional)
   voice: {languageCode: 'en-US', ssmlGender: 'NEUTRAL'},
   // select the type of audio encoding
   audioConfig: {audioEncoding: 'MP3'},
 };

 // Performs the text-to-speech request
 const [response] = await client.synthesizeSpeech(request);
 // Write the binary audio content to a local file
 const writeFile = util.promisify(fs.writeFile);
 await writeFile('output.mp3', response.audioContent, 'binary');
 console.log('Audio content written to file: output.mp3');
}
quickStart();

Output MP3 file: output (without the v1beta1 version)

In addition, I also tried the Python client library, and it works as expected too.

file1.py

from google.cloud import texttospeech

# Instantiates a client
client = texttospeech.TextToSpeechClient()

# Set the SSML input to be synthesized
synthesis_input = texttospeech.SynthesisInput(
   ssml='<speak>And then she asked, <voice name="en-IN-Wavenet-D"> where were you yesterday</voice><break time="250ms"/> in her sweet and gentle voice.</speak>'
)

# Build the voice request, select the language code ("en-US") and the ssml
# voice gender ("neutral")
voice = texttospeech.VoiceSelectionParams(
   language_code="en-US", ssml_gender=texttospeech.SsmlVoiceGender.NEUTRAL
)

# Select the type of audio file you want returned
audio_config = texttospeech.AudioConfig(
   audio_encoding=texttospeech.AudioEncoding.MP3
)

# Perform the text-to-speech request on the text input with the selected
# voice parameters and audio file type
response = client.synthesize_speech(
   input=synthesis_input, voice=voice, audio_config=audio_config
)

# The response's audio_content is binary.
with open("output.mp3", "wb") as out:
   # Write the response to the output file.
   out.write(response.audio_content)
   print('Audio content written to file "output.mp3"')

Output file: output (using Python)
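If you want to compare the two API versions side by side, tts1.js and tts2.js can be folded into one script. The following is only a sketch under some assumptions: the function names `buildRequest` and `compareVersions`, the `RUN_TTS_COMPARE` environment variable, and the output file names are all made up for this example, and it assumes your Google Cloud credentials are already configured. The reported byte counts are just a rough signal; listening to both files is the real check.

```javascript
const fs = require('fs');

// Builds the same request body used in tts1.js and tts2.js.
function buildRequest(ssml) {
  return {
    input: {ssml},
    voice: {languageCode: 'en-US', ssmlGender: 'NEUTRAL'},
    audioConfig: {audioEncoding: 'MP3'},
  };
}

// Synthesizes the same SSML with the general and the v1beta1 client,
// writing one MP3 file per version.
async function compareVersions(ssml) {
  // Loaded lazily so buildRequest can be used without the library installed.
  const textToSpeech = require('@google-cloud/text-to-speech');
  const clients = {
    v1: new textToSpeech.TextToSpeechClient(),
    v1beta1: new textToSpeech.v1beta1.TextToSpeechClient(),
  };
  for (const [version, client] of Object.entries(clients)) {
    const [response] = await client.synthesizeSpeech(buildRequest(ssml));
    const file = `output-${version}.mp3`; // hypothetical file name
    await fs.promises.writeFile(file, response.audioContent);
    console.log(`${version}: wrote ${response.audioContent.length} bytes to ${file}`);
  }
}

const ssml = '<speak>And then she asked, <voice name="en-IN-Wavenet-D"> where were you yesterday </voice><break time="250ms"/> in her sweet and gentle voice.</speak>';

// Hypothetical opt-in switch so the script only calls the API when asked to.
if (process.env.RUN_TTS_COMPARE === '1') {
  compareVersions(ssml).catch(console.error);
}
```

Run it with RUN_TTS_COMPARE=1 set in the environment, then play output-v1.mp3 and output-v1beta1.mp3 to hear whether the <voice> tag is honored in each version.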
