pydrobert.speech.scales
Scaling functions
Scaling functions transform a scalar in the frequency domain to some other real domain
(the “scale” domain). The scaling functions should be invertible. Their primary purpose
is to define the bandwidths of filters in pydrobert.speech.filters.
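As a rough illustration of why invertibility matters, the sketch below places band edges at equal steps in a scale domain and maps them back to Hertz. The helper `band_edges` and its arguments are hypothetical names introduced here, and whether the filter banks in pydrobert.speech.filters place their edges exactly this way is an assumption.

```python
import math
from typing import Callable, List


def band_edges(
    to_scale: Callable[[float], float],
    to_hertz: Callable[[float], float],
    low_hz: float,
    high_hz: float,
    num_bands: int,
) -> List[float]:
    """Return num_bands + 1 edge frequencies, equally spaced in the scale domain."""
    low_s, high_s = to_scale(low_hz), to_scale(high_hz)
    step = (high_s - low_s) / num_bands
    # The inverse mapping is what lets each scale point become a frequency again.
    return [to_hertz(low_s + i * step) for i in range(num_bands + 1)]


# Toy invertible scale (an assumption for illustration): log2 and its inverse.
edges = band_edges(math.log2, lambda s: 2.0 ** s, 100.0, 1600.0, 4)
print(edges)
```

With these inputs the edges land one octave apart, at about 100, 200, 400, 800 and 1600 Hz, so each band is twice as wide in Hertz as the one below it.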
- class pydrobert.speech.scales.BarkScaling
Bases: ScalingFunction
Psychoacoustic scaling function
Based on a collection of experiments briefly mentioned in [zwicker1961] involving masking to determine critical bands. The functional approximation to the scale is implemented with the formula from [traunmuller1990] (being honest, from Wikipedia):
\[\begin{split}s = \begin{cases} z + 0.15(2 - z) & \mbox{if }z < 2 \\ z + 0.22(z - 20.1) & \mbox{if }z > 20.1 \\ z & \mbox{otherwise} \end{cases}\end{split}\]
where
\[z = 26.81f/(1960 + f) - 0.53\]
Here, \(s\) is the scale and \(f\) is the frequency in Hertz.
- aliases = {'bark'}
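A minimal sketch of the Bark formula above, together with its inverse. The function names `hertz_to_bark` and `bark_to_hertz` are hypothetical and are not claimed to match the class's methods. The sketch assumes that \(z\) is left uncorrected for \(2 \le z \le 20.1\), and the inverse is derived here by solving each branch for \(z\) and then \(z\) for \(f\).

```python
def hertz_to_bark(hertz: float) -> float:
    """Map a frequency in Hertz to the Bark scale (hypothetical helper)."""
    z = 26.81 * hertz / (1960.0 + hertz) - 0.53
    if z < 2.0:
        return z + 0.15 * (2.0 - z)   # low-end correction
    if z > 20.1:
        return z + 0.22 * (z - 20.1)  # high-end correction
    return z                          # assumed: no correction in between


def bark_to_hertz(bark: float) -> float:
    """Invert hertz_to_bark by undoing the correction, then solving for f."""
    if bark < 2.0:
        z = (bark - 0.3) / 0.85       # from s = 0.85 z + 0.3
    elif bark > 20.1:
        z = (bark + 4.422) / 1.22     # from s = 1.22 z - 4.422
    else:
        z = bark
    # From z + 0.53 = 26.81 f / (1960 + f)
    return 1960.0 * (z + 0.53) / (26.28 - z)


for f in (100.0, 1000.0, 8000.0):
    print(f, hertz_to_bark(f), bark_to_hertz(hertz_to_bark(f)))
```

Both corrections are continuous at their thresholds (the scale equals \(z\) at \(z = 2\) and at \(z = 20.1\)), so the inverse can branch on the scale value directly.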
- class pydrobert.speech.scales.LinearScaling(low_hz, slope_hz=1.0)
Bases: ScalingFunction
Linear scaling between high and low scales/frequencies
- Parameters:
- aliases = {'linear', 'uniform'}
- low_hz
- slope_hz
- class pydrobert.speech.scales.MelScaling
Bases: ScalingFunction
Psychoacoustic scaling function
Based on the experiment in [stevens1937] wherein participants adjusted a second tone until it was half the pitch of the first. The functional approximation to the scale is implemented with the formula from [oshaughnessy1987] (being honest, from Wikipedia):
\[s = 1127 \ln \left(1 + \frac{f}{700} \right)\]
where \(s\) is the scale and \(f\) is the frequency in Hertz.
- aliases = {'mel'}
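The mel formula above and its algebraic inverse can be sketched in a few lines. The names `hertz_to_mel` and `mel_to_hertz` are hypothetical helpers for illustration and are not claimed to be the class's methods.

```python
import math


def hertz_to_mel(hertz: float) -> float:
    """s = 1127 ln(1 + f / 700), with f in Hertz."""
    return 1127.0 * math.log(1.0 + hertz / 700.0)


def mel_to_hertz(mel: float) -> float:
    """Inverse of hertz_to_mel: f = 700 (exp(s / 1127) - 1)."""
    return 700.0 * (math.exp(mel / 1127.0) - 1.0)


print(hertz_to_mel(1000.0))                # close to 1000 mel
print(mel_to_hertz(hertz_to_mel(440.0)))   # round trip back to about 440 Hz
```

The constants are chosen so that 1000 Hz maps to roughly 1000 mel, which makes a convenient sanity check.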
- class pydrobert.speech.scales.OctaveScaling(low_hz)
Bases: ScalingFunction
Uniform scaling in log2 domain from low frequency
- Parameters:
low_hz (float) – The positive frequency (in Hertz) corresponding to scale 0. Frequencies below this value should never be queried.
- aliases = {'octave'}
- low_hz
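One reading of "uniform scaling in log2 domain" with scale 0 at low_hz is \(s = \log_2(f / \text{low\_hz})\), so that each unit of scale is one octave. The sketch below assumes that reading. The function names are hypothetical, and raising an error for frequencies below low_hz is a choice made for this example, not documented behaviour of the class.

```python
import math


def hertz_to_octave(hertz: float, low_hz: float) -> float:
    """Number of octaves that hertz lies above low_hz (assumed formula)."""
    if low_hz <= 0.0:
        raise ValueError("low_hz must be positive")
    if hertz < low_hz:
        # The documentation says such frequencies should never be queried;
        # rejecting them is this sketch's choice.
        raise ValueError("hertz must not be below low_hz")
    return math.log2(hertz / low_hz)


def octave_to_hertz(scale: float, low_hz: float) -> float:
    """Inverse mapping: low_hz doubled once per unit of scale."""
    return low_hz * 2.0 ** scale


print(hertz_to_octave(440.0, 110.0))  # two octaves above 110 Hz
print(octave_to_hertz(2.0, 110.0))
```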