SoundTouch Audio Processing Library: Source Code Analysis and Algorithm Extraction (1)

邢卓

2023-12-01

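Using the SoundTouch audio processing library is remarkably simple. After building it, set up your build environment. Taking VC (Visual C++) as an example: add the include path under the SoundTouch directory to the include directories, add the lib path under the SoundTouch directory to the library directories, and then include the header and reference the library in your own header file, as shown below. The _DEBUG macro lets us select the library at preprocessing time: a DEBUG build links the debug library, and any other build links the release library. The only difference between the two is whether the file name ends with a "D".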
#include <SoundTouch.h>
#ifdef _DEBUG
#pragma comment(lib, "SoundTouchD.lib")
#else
#pragma comment(lib, "SoundTouch.lib")
#endif
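Of course, you can also add the library directly in the VC project settings, which some people prefer.
Most importantly, you also need a using-directive for the library's namespace. The reason lies in how SoundTouch declares its classes, which is covered in the analysis below.
using namespace soundtouch;
After that you can define an instance of the class in your own code: SoundTouch m_SoundTouch;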
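The SoundTouch class is declared in SoundTouch.h and implemented in SoundTouch.cpp. It derives directly from FIFOProcessor, which in turn derives directly from the base class FIFOSamplePipe. The SoundTouch class is declared inside the namespace soundtouch, which is the main reason we need the using-directive when using the library. That feels a little redundant to me. Apart from the class, the header only defines a few constants, such as the version string and the version ID. Both parent classes are declared in FIFOSamplePipe.h.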
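To make the relationship concrete, here is a minimal sketch of the inheritance chain and of the two ways to reach a name inside a namespace. The namespace soundtouch_sketch, the empty class bodies, and the two helper functions are stand-ins invented for illustration, not the library's real declarations. Only the class names and the derivation order come from the description above.

```cpp
#include <type_traits>

// Stand-in namespace: hypothetical, so it cannot clash with the real library.
namespace soundtouch_sketch {

// Base of the chain (the real class declares the sample-pipe interface).
class FIFOSamplePipe {
public:
    virtual ~FIFOSamplePipe() = default;
};

// Middle layer, derived directly from FIFOSamplePipe.
class FIFOProcessor : public FIFOSamplePipe {};

// Top-level class, derived directly from FIFOProcessor.
class SoundTouch : public FIFOProcessor {};

} // namespace soundtouch_sketch

// Option 1: fully qualified name, no using-directive needed.
inline soundtouch_sketch::SoundTouch makeQualified() {
    return soundtouch_sketch::SoundTouch();
}

// Option 2: a using-directive limited to one function scope.
inline bool derivesFromPipe() {
    using namespace soundtouch_sketch;
    return std::is_base_of<FIFOSamplePipe, SoundTouch>::value;
}
```

Either option works. The qualified form avoids pulling every name of the namespace into your own scope, which may matter in larger projects.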

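Whatever the library, the usual flow is to declare an object and then perform the necessary initialization, and SoundTouch (ST for short from here on) is no exception. Initializing ST is as simple as building it. You can follow the bundled SoundStretch example, or read the declaration of the SoundTouch class in the source. For now we only care about the parts we will use. The private section holds pointers to two other classes: RateTransposer* and TDStretch*.
RateTransposer derives from FIFOProcessor, which derives directly from the base class FIFOSamplePipe, and TDStretch is similar. Judging by the two class names alone (stretching? rate transposing?), it is not hard to guess that the library processes the audio signal by "stretching" it and then "changing its rate". Could this be the legendary tempo change without pitch change? It is indeed. But that is not what we care about right now.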
……
private:
    /// Rate transposer class instance
    class RateTransposer *pRateTransposer;
    /// Time-stretch class instance
    class TDStretch *pTDStretch;
    /// Virtual pitch parameter. Effective rate & tempo are calculated from these parameters.
    float virtualRate;
    /// Virtual pitch parameter. Effective rate & tempo are calculated from these parameters.
    float virtualTempo;
    /// Virtual pitch parameter. Effective rate & tempo are calculated from these parameters.
    float virtualPitch;
    /// Flag: Has sample rate been set?
    BOOL bSrateSet;
    /// Calculates effective rate & tempo values from 'virtualRate', 'virtualTempo' and
    /// 'virtualPitch' parameters.
    void calcEffectiveRateAndTempo();
protected:
    /// Number of channels
    uint channels;
    /// Effective 'rate' value calculated from 'virtualRate', 'virtualTempo' and 'virtualPitch'
    float rate;
    /// Effective 'tempo' value calculated from 'virtualRate', 'virtualTempo' and 'virtualPitch'
    float tempo;
public:
    // (other public members omitted from this excerpt)
    /// Sets new rate control value. Normal rate = 1.0, smaller values
    /// represent slower rate, larger faster rates.
    void setRate(float newRate);
    /// Sets new tempo control value. Normal tempo = 1.0, smaller values
    /// represent slower tempo, larger faster tempo.
    void setTempo(float newTempo);
    /// Sets new rate control value as a difference in percents compared
    /// to the original rate (-50 .. +100 %)
    void setRateChange(float newRate);
    /// Sets new tempo control value as a difference in percents compared
    /// to the original tempo (-50 .. +100 %)
    void setTempoChange(float newTempo);
    /// Sets new pitch control value. Original pitch = 1.0, smaller values
    /// represent lower pitches, larger values higher pitch.
    void setPitch(float newPitch);
    /// Sets pitch change in octaves compared to the original pitch
    /// (-1.00 .. +1.00)
    void setPitchOctaves(float newPitch);
    /// Sets pitch change in semi-tones compared to the original pitch
    /// (-12 .. +12)
    void setPitchSemiTones(int newPitch);
    void setPitchSemiTones(float newPitch);
    /// Sets the number of channels, 1 = mono, 2 = stereo
    void setChannels(uint numChannels);
    /// Sets sample rate.
    void setSampleRate(uint srate);
    /// Changes a setting controlling the processing system behaviour. See the
    /// 'SETTING_...' defines for available setting ID's.
    /// \return 'TRUE' if the setting was successfully changed
    BOOL setSetting(int settingId,   ///< Setting ID number. see SETTING_... defines.
                    int value        ///< New setting value.
                    );
……
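The comments on calcEffectiveRateAndTempo() say that the effective rate and tempo are derived from the three virtual parameters. Below is a sketch of that relationship as I understand it from the library source. Treat the two formulas as an assumption and check them against your copy of SoundTouch.cpp. The idea is that a pitch change is realized by resampling by the pitch factor and then time-stretching by the inverse factor, so the duration is restored. The struct EffectiveParams and the function calcEffective are hypothetical names.

```cpp
// Hypothetical container for the two effective values.
struct EffectiveParams {
    float rate;   // factor handed to the rate transposer (resampler)
    float tempo;  // factor handed to the time stretcher
};

// Assumed relationship: resample by pitch * rate, then stretch by tempo / pitch,
// so that a pure pitch change leaves the overall duration unchanged.
inline EffectiveParams calcEffective(float virtualRate,
                                     float virtualTempo,
                                     float virtualPitch) {
    EffectiveParams p;
    p.rate = virtualPitch * virtualRate;
    p.tempo = virtualTempo / virtualPitch;
    return p;
}
```

With rate = 1, tempo = 1 and pitch = 2 (one octave up), this gives an effective rate of 2 and an effective tempo of 0.5: the resampler halves the duration while raising the pitch, and the stretcher doubles the duration back.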
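Following the SoundStretch example that ships with ST, initialize the SoundTouch object:
m_SoundTouch.setSampleRate(sampleRate);      // set the sample rate of the audio
m_SoundTouch.setChannels(channels);          // set the number of channels
m_SoundTouch.setTempoChange(tempoDelta);     // the legendary tempo change without pitch change (percent)
m_SoundTouch.setPitchSemiTones(pitchDelta);  // set the pitch change (semitones)
m_SoundTouch.setRateChange(rateDelta);       // set the rate change (percent); alters tempo and pitch together
// quick is a bool. Per the header comments, SETTING_USE_QUICKSEEK enables the quick
// seeking algorithm in the tempo changer: lower CPU use, slightly lower sound quality.
m_SoundTouch.setSetting(SETTING_USE_QUICKSEEK, quick);
// noAntiAlias is a bool. Per the header comments, SETTING_USE_AA_FILTER enables or
// disables the anti-alias filter in the pitch transposer.
m_SoundTouch.setSetting(SETTING_USE_AA_FILTER, !(noAntiAlias));
// speech is also a bool. My first guess is that it should be set when the input
// is voice only, without music.
if (speech)
{
    // use settings for speech processing
    m_SoundTouch.setSetting(SETTING_SEQUENCE_MS, 40);
    m_SoundTouch.setSetting(SETTING_SEEKWINDOW_MS, 15);
    m_SoundTouch.setSetting(SETTING_OVERLAP_MS, 8);
    fprintf(stderr, "Tune processing parameters for speech processing.\n");
}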
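The ...Change and ...SemiTones setters take relative values rather than raw factors. Going by the header comments (percent in -50 .. +100, octaves in -1 .. +1, semitones in -12 .. +12), the conversions to plain factors can be sketched as follows. The helper names are hypothetical, and the library's internal arithmetic may differ in detail.

```cpp
#include <cmath>

// Percent change -> factor: +100 % doubles, -50 % halves, 0 % leaves unchanged.
inline float percentToFactor(float percent) {
    return 1.0f + 0.01f * percent;
}

// Octaves -> factor: each octave doubles the frequency.
inline float octavesToFactor(float octaves) {
    return std::pow(2.0f, octaves);
}

// Semitones -> factor: twelve equal-tempered semitones make one octave.
inline float semitonesToFactor(float semitones) {
    return octavesToFactor(semitones / 12.0f);
}
```

Under these assumptions, a tempoDelta of +100 corresponds to a tempo factor of 2, and a pitchDelta of -12 semitones corresponds to a pitch factor of 0.5.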
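With just those few calls we can already get a feel for how powerful ST is. Audio is fed in through this method of the SoundTouch class:
putSamples(sampleBuffer, nSamples);
The first parameter is a pointer to a block of PCM-encoded audio data, and the second is the number of samples to process, which can also be thought of as the number of frames.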
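Note that audio data usually arrives as a byte stream, so the size of one sample depends on the channel count and the bit depth. Suppose sampleBuffer points to a 64-byte PCM buffer holding 16-bit, 2-channel audio. Each sample then occupies (16*2)/8 = 4 bytes, so the buffer holds only 64/4 = 16 samples. This is easy to get wrong.
m_SoundTouch.putSamples(sampleBuffer, nSamples);
The data has gone in, but where do we pick up the processed audio? For that we use the receiveSamples method provided by SoundTouch.
uint receiveSamples(SAMPLETYPE *outBuffer, ///< Buffer where to copy output samples.
                    uint maxSamples        ///< How many samples to receive at max.
                    );
It also takes two parameters: the first is the buffer that receives the data, and the second is the maximum number of samples it may receive.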
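The byte-to-sample arithmetic from the 64-byte example can be wrapped in a small helper, so the sample count passed to putSamples is never confused with a byte count. The function name bytesToFrames is hypothetical and not part of the library. It assumes integer PCM whose bit depth is a multiple of 8.

```cpp
#include <cstddef>

// Number of sample frames in a raw PCM byte buffer.
// One frame holds one sample per channel, so it occupies
// (bitsPerSample / 8) * channels bytes.
inline std::size_t bytesToFrames(std::size_t byteCount,
                                 unsigned bitsPerSample,
                                 unsigned channels) {
    const std::size_t bytesPerFrame = (bitsPerSample / 8u) * channels;
    if (bytesPerFrame == 0) {
        return 0; // invalid format; avoid dividing by zero
    }
    return byteCount / bytesPerFrame; // a trailing partial frame is dropped
}
```

For the buffer discussed above, bytesToFrames(64, 16, 2) yields 16.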
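From the comment quoted below we can gather that receiveSamples does not necessarily return any data right after putSamples. On the other hand, more samples may be ready than fit into the buffer (a single call never returns more than maxSamples), so the call has to sit in a do…while(…) loop that squeezes everything out.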
// Read ready samples from SoundTouch processor & write them output file.
// NOTES:
// - 'receiveSamples' doesn't necessarily return any samples at all
//   during some rounds!
// - On the other hand, during some round 'receiveSamples' may have more
//   ready samples than would fit into 'sampleBuffer', and for this reason
//   the 'receiveSamples' call is iterated for as many times as it
//   outputs samples.
do
{
    nSamples = m_SoundTouch.receiveSamples(sampleBuffer, buffSizeSamples);
    // Write sampleBuffer to a file, or feed it into the sound card buffer to play the audio.
} while (nSamples != 0);
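To see why the loop is needed without linking the library, here is a toy stand-in for the put/receive interface. ToyPipe and drain are hypothetical: the pipe just queues mono short samples and hands them back unchanged, whereas the real processor transforms the audio and may hold data back internally. The point is only the shape of the drain loop.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical stand-in for the sample pipe: stores mono samples in a FIFO.
class ToyPipe {
public:
    void putSamples(const short *samples, unsigned numSamples) {
        fifo_.insert(fifo_.end(), samples, samples + numSamples);
    }

    // Copies at most maxSamples into outBuffer and returns the count copied.
    unsigned receiveSamples(short *outBuffer, unsigned maxSamples) {
        const unsigned n = static_cast<unsigned>(
            std::min<std::size_t>(maxSamples, fifo_.size()));
        std::copy(fifo_.begin(), fifo_.begin() + n, outBuffer);
        fifo_.erase(fifo_.begin(), fifo_.begin() + n);
        return n;
    }

private:
    std::deque<short> fifo_;
};

// Drains everything currently available through a small scratch buffer.
inline std::vector<short> drain(ToyPipe &pipe, unsigned scratchSize) {
    std::vector<short> scratch(scratchSize);
    std::vector<short> out;
    unsigned n = 0;
    do {
        n = pipe.receiveSamples(scratch.data(), scratchSize);
        out.insert(out.end(), scratch.begin(), scratch.begin() + n);
    } while (n != 0);
    return out;
}
```

With 10 queued samples and a 4-sample scratch buffer, the loop body runs four times, receiving 4, 4, 2 and finally 0 samples, and the zero ends the loop.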
That is all for today; I am rather tired.
