Transformers
Python wrapper around the SoX library. This module requires that SoX is installed.
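Because the module depends on an external SoX installation, it can help to check for the binary before setting up any effects. The following is a minimal sketch, assuming SoX is exposed as an executable named `sox` on the PATH; the helper name `sox_available` is hypothetical and not part of the library.

```python
import shutil


def sox_available(executable="sox"):
    """Return True if an executable with the given name is found on PATH.

    The executable name is an assumption; adjust it if SoX is installed
    under a different name on your system.
    """
    return shutil.which(executable) is not None


if not sox_available():
    print("SoX does not appear to be installed or is not on PATH.")
```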
-
class
sox.transform.
Transformer
(input_filepath, output_filepath) Audio file transformer. Class which allows multiple effects to be chained to create an output file, saved to output_filepath.
Parameters: input_filepath : str
Path to input audio file.
output_filepath : str
Path to desired output file. If a file already exists at the given path, the file will be overwritten.
Attributes
input_filepath (str) Path to input audio file.
output_filepath (str) Path where the output file will be written.
input_format (list of str) Input file format arguments that will be passed to SoX.
output_format (list of str) Output file format arguments that will be passed to SoX.
effects (list of str) Effects arguments that will be passed to SoX.
effects_log (list of str) Ordered sequence of effects applied.
globals (list of str) Global arguments that will be passed to SoX.
Methods
-
allpass
(frequency, width_q=2.0) Apply a two-pole all-pass filter. An all-pass filter changes the audio’s frequency to phase relationship without changing its frequency to amplitude relationship. The filter is described in detail at http://musicdsp.org/files/Audio-EQ-Cookbook.txt
Parameters: frequency : float
The filter’s center frequency in Hz.
width_q : float, default=2.0
The filter’s width as a Q-factor.
-
bandpass
(frequency, width_q=2.0, constant_skirt=False) Apply a two-pole Butterworth band-pass filter with the given central frequency, and (3dB-point) band-width. The filter rolls off at 6dB per octave (20dB per decade) and is described in detail in http://musicdsp.org/files/Audio-EQ-Cookbook.txt
Parameters: frequency : float
The filter’s center frequency in Hz.
width_q : float, default=2.0
The filter’s width as a Q-factor.
constant_skirt : bool, default=False
If True, selects constant skirt gain (peak gain = width_q). If False, selects constant 0dB peak gain.
See also
bandreject
,sinc
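The width_q parameter can be easier to reason about as a bandwidth in Hz. The sketch below uses the common definition Q = center frequency / 3dB bandwidth; whether SoX interprets the value in exactly this way is an assumption here. The helper names are hypothetical, and `tfm` is assumed to be a Transformer instance.

```python
def q_to_bandwidth_hz(frequency, width_q):
    """Convert a center frequency (Hz) and Q-factor to a bandwidth in Hz."""
    if frequency <= 0 or width_q <= 0:
        raise ValueError("frequency and width_q must be positive")
    return frequency / width_q


def apply_bandpass_hz(tfm, frequency, bandwidth_hz):
    """Call tfm.bandpass with a width given in Hz instead of as a Q-factor."""
    if frequency <= 0 or bandwidth_hz <= 0:
        raise ValueError("frequency and bandwidth_hz must be positive")
    width_q = frequency / bandwidth_hz
    tfm.bandpass(frequency, width_q=width_q)
    return width_q


# With the default width_q of 2.0, a 1 kHz filter spans about 500 Hz.
print(q_to_bandwidth_hz(1000.0, 2.0))  # 500.0
```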
-
bandreject
(frequency, width_q=2.0) Apply a two-pole Butterworth band-reject filter with the given central frequency, and (3dB-point) band-width. The filter rolls off at 6dB per octave (20dB per decade) and is described in detail in http://musicdsp.org/files/Audio-EQ-Cookbook.txt
Parameters: frequency : float
The filter’s center frequency in Hz.
width_q : float, default=2.0
The filter’s width as a Q-factor.
See also
bandpass
,sinc
-
bass
(gain_db, frequency=100.0, slope=0.5) Boost or cut the bass (lower) frequencies of the audio using a two-pole shelving filter with a response similar to that of a standard hi-fi’s tone-controls. This is also known as shelving equalisation.
The filters are described in detail in http://musicdsp.org/files/Audio-EQ-Cookbook.txt
Parameters: gain_db : float
The gain at 0 Hz. For a large cut use -20, for a large boost use 20.
frequency : float, default=100.0
The filter’s cutoff frequency in Hz.
slope : float, default=0.5
The steepness of the filter’s shelf transition. For a gentle slope use 0.3, and use 1.0 for a steep slope.
-
compand
(attack_time=0.3, decay_time=0.8, soft_knee_db=6.0, tf_points=[(-70, -70), (-60, -20), (0, 0)]) Compand (compress or expand) the dynamic range of the audio.
Parameters: attack_time : float, default=0.3
The time in seconds over which the instantaneous level of the input signal is averaged to determine increases in volume.
decay_time : float, default=0.8
The time in seconds over which the instantaneous level of the input signal is averaged to determine decreases in volume.
soft_knee_db : float or None, default=6.0
The amount (in dB) by which the points where adjacent line segments of the transfer function meet will be rounded. If None, no soft_knee is applied.
tf_points : list of tuples
Transfer function points as a list of tuples corresponding to points in (dB, dB) defining the compander’s transfer function.
See also
mcompand
,contrast
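To get a feel for what tf_points describes, the static input/output mapping can be approximated by interpolating linearly between the points. This is only a sketch: it ignores the soft knee and the attack and decay averaging, it assumes at least two points with distinct input levels, and it rejects levels outside the listed range rather than guessing how SoX extends the curve. The function name is hypothetical.

```python
def static_transfer(level_db, tf_points):
    """Linearly interpolate the output level (dB) for an input level (dB)."""
    points = sorted(tf_points)
    if not points[0][0] <= level_db <= points[-1][0]:
        raise ValueError("level_db lies outside the range covered by tf_points")
    # Walk over adjacent pairs of points and interpolate within the
    # first segment that contains the requested input level.
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= level_db <= x1:
            return y0 + (y1 - y0) * (level_db - x0) / (x1 - x0)


DEFAULT_TF_POINTS = [(-70, -70), (-60, -20), (0, 0)]

# With the default points, a -60 dB input maps to about -20 dB, and
# levels between -60 dB and 0 dB are compressed toward 0 dB.
print(static_transfer(-60, DEFAULT_TF_POINTS))
print(static_transfer(-30, DEFAULT_TF_POINTS))
```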
-
convert
(samplerate=None, channels=None, bitdepth=None) Converts output audio to the specified format.
Parameters: samplerate : float, default=None
Desired samplerate. If None, defaults to the same as input.
channels : int, default=None
Desired channels. If None, defaults to the same as input.
bitdepth : int, default=None
Desired bitdepth. If None, defaults to the same as input.
See also
rate
-
equalizer
(frequency, width_q, gain_db) Apply a two-pole peaking equalisation (EQ) filter to boost or reduce around a given frequency. This effect can be applied multiple times to produce complex EQ curves.
Parameters: frequency : float
The filter’s central frequency in Hz.
width_q : float
The filter’s width as a Q-factor.
gain_db : float
The filter’s gain in dB.
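Since the effect can be applied repeatedly, an EQ curve can be described as a list of bands and applied in a loop. The sketch below assumes `tfm` is a Transformer instance; the helper name and the example band values are hypothetical and chosen only for illustration.

```python
# Each band is (frequency in Hz, width as a Q-factor, gain in dB).
EXAMPLE_BANDS = [
    (120.0, 1.0, -3.0),   # gentle cut in the low end
    (3000.0, 2.0, 2.5),   # small boost around 3 kHz
    (9000.0, 1.5, -1.0),  # slight cut in the highs
]


def apply_eq_curve(tfm, bands):
    """Apply one peaking EQ filter per band, in the order given."""
    for frequency, width_q, gain_db in bands:
        tfm.equalizer(frequency, width_q, gain_db)
    return len(bands)
```

Called as `apply_eq_curve(tfm, EXAMPLE_BANDS)`, this issues three `equalizer` calls on the transformer.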
-
fade
(fade_in_len=0.0, fade_out_len=0.0, fade_shape='q') Add a fade in and/or fade out to an audio file. Default fade shape is 1/4 sine wave.
Parameters: fade_in_len : float, default=0.0
Length of fade-in (seconds). If fade_in_len = 0, no fade in is applied.
fade_out_len : float, default=0.0
Length of fade-out (seconds). If fade_out_len = 0, no fade out is applied.
fade_shape : str, default=’q’
Shape of fade. Must be one of:
- ‘q’ for quarter sine (default),
- ‘h’ for half sine,
- ‘t’ for linear,
- ‘l’ for logarithmic,
- ‘p’ for inverted parabola.
See also
splice
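The fade options above can be checked before they reach the transformer. This is a minimal sketch, assuming `tfm` is a Transformer instance; the wrapper name and the rejection of negative lengths are assumptions made for illustration, not documented library behavior.

```python
FADE_SHAPES = {
    "q": "quarter sine",
    "h": "half sine",
    "t": "linear",
    "l": "logarithmic",
    "p": "inverted parabola",
}


def apply_fade(tfm, fade_in_len=0.0, fade_out_len=0.0, fade_shape="q"):
    """Validate the arguments locally, then call tfm.fade."""
    if fade_shape not in FADE_SHAPES:
        raise ValueError("fade_shape must be one of {}".format(sorted(FADE_SHAPES)))
    if fade_in_len < 0 or fade_out_len < 0:
        raise ValueError("fade lengths must be non-negative")
    tfm.fade(
        fade_in_len=fade_in_len,
        fade_out_len=fade_out_len,
        fade_shape=fade_shape,
    )
```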
-
gain
(gain_db=0.0, normalize=True, limiter=False, balance=None) Apply amplification or attenuation to the audio signal.
Parameters: gain_db : float, default=0.0
Target gain in decibels (dB).
normalize : bool, default=True
If True, audio is normalized to gain_db relative to full scale. If False, simply adjusts the audio power level by gain_db.
limiter : bool, default=False
If True, a simple limiter is invoked to prevent clipping.
balance : str or None, default=None
Balance gain across channels. Can be one of:
- None applies no balancing (default)
- ‘e’ applies gain to all channels other than that with the highest peak level, such that all channels attain the same peak level
- ‘B’ applies gain to all channels other than that with the highest RMS level, such that all channels attain the same RMS level
- ‘b’ applies gain with clipping protection to all channels other than that with the highest RMS level, such that all channels attain the same RMS level
If normalize=True, ‘B’ and ‘b’ are equivalent.
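Gain values throughout this module are given in decibels. For reference, the standard conversion between a decibel value and a linear amplitude ratio is sketched below; the helper names are hypothetical and are not part of the library.

```python
import math


def db_to_amplitude(gain_db):
    """Convert a gain in dB to a linear amplitude ratio."""
    return 10.0 ** (gain_db / 20.0)


def amplitude_to_db(ratio):
    """Convert a positive linear amplitude ratio to a gain in dB."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return 20.0 * math.log10(ratio)


# A gain of 20 dB corresponds to a tenfold increase in amplitude, and
# halving the amplitude corresponds to roughly -6 dB.
print(db_to_amplitude(20.0))
print(amplitude_to_db(0.5))
```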
-
highpass
(frequency, width_q=0.707, n_poles=2) Apply a high-pass filter with 3dB point frequency. The filter can be either single-pole or double-pole. The filters roll off at 6dB per pole per octave (20dB per pole per decade).
Parameters: frequency : float
The filter’s cutoff frequency in Hz.
width_q : float, default=0.707
The filter’s width as a Q-factor. Applies only when n_poles=2. The default gives a Butterworth response.
n_poles : int, default=2
The number of poles in the filter. Must be either 1 or 2
-
loudness
(gain_db=-10.0, reference_level=65.0) Loudness control. Similar to the gain effect, but provides equalisation for the human auditory system.
The gain is adjusted by gain_db and the signal equalised according to ISO 226 w.r.t. reference_level.
Parameters: gain_db : float, default=-10.0
Output loudness (in dB)
reference_level : float, default=65.0
Reference level (in dB) according to which the signal is equalized. Must be between 50 and 75 (dB)
-
lowpass
(frequency, width_q=0.707, n_poles=2) Apply a low-pass filter with 3dB point frequency. The filter can be either single-pole or double-pole. The filters roll off at 6dB per pole per octave (20dB per pole per decade).
Parameters: frequency : float
The filter’s cutoff frequency in Hz.
width_q : float, default=0.707
The filter’s width as a Q-factor. Applies only when n_poles=2. The default gives a Butterworth response.
n_poles : int, default=2
The number of poles in the filter. Must be either 1 or 2
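The stated roll-off of 6dB per pole per octave gives a quick estimate of how strongly highpass and lowpass attenuate a frequency far from the cutoff. The sketch below is only the asymptotic straight-line approximation: it ignores the 3dB attenuation at the cutoff itself and the shape of the transition set by width_q. The function name is hypothetical.

```python
import math


def asymptotic_attenuation_db(cutoff_hz, frequency_hz, n_poles=2, kind="lowpass"):
    """Estimate stop-band attenuation (dB) from the 6 dB/pole/octave rule."""
    if n_poles not in (1, 2):
        raise ValueError("n_poles must be either 1 or 2")
    if kind not in ("lowpass", "highpass"):
        raise ValueError("kind must be 'lowpass' or 'highpass'")
    # Number of octaves between the cutoff and the frequency of interest,
    # counted in the direction in which the filter attenuates.
    octaves = math.log2(frequency_hz / cutoff_hz)
    if kind == "highpass":
        octaves = -octaves
    # Frequencies in the pass band are treated as unattenuated.
    return 6.0 * n_poles * max(octaves, 0.0)


# Two octaves above a 1 kHz low-pass cutoff, a two-pole filter gives
# roughly 24 dB of attenuation under this approximation.
print(asymptotic_attenuation_db(1000.0, 4000.0, n_poles=2, kind="lowpass"))
```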
-
norm
(db_level=-3.0) Normalize an audio file to a particular dB level. This behaves identically to the gain effect with normalize=True.
Parameters: db_level : float, default=-3.0
Output volume (dB)
-
overdrive
(gain_db=20.0, colour=20.0) Apply non-linear distortion.
Parameters: gain_db : float, default=20
Controls the amount of distortion (dB).
colour : float, default=20
Controls the amount of even harmonic content in the output (dB).
-
pad
(start_duration=0.0, end_duration=0.0) Add silence to the beginning or end of a file. Calling this with the default arguments has no effect.
Parameters: start_duration : float
Number of seconds of silence to add to beginning.
end_duration : float
Number of seconds of silence to add to end.
See also
delay
-
pitch
(n_semitones, quick=False) Pitch shift the audio without changing the tempo.
This effect uses the WSOLA algorithm. The audio is chopped up into segments which are then shifted in the time domain and overlapped (cross-faded) at points where their waveforms are most similar as determined by measurement of least squares.
Parameters: n_semitones : float
The number of semitones to shift. Can be positive or negative.
quick : bool, default=False
If True, this effect will run faster but with lower sound quality.
See also
bend
,speed
,tempo
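In equal temperament, a shift of n semitones corresponds to a frequency ratio of 2^(n/12). The helpers below sketch that conversion in both directions, which can be useful when a target frequency is known but n_semitones is not; the names are hypothetical and not part of the library.

```python
import math


def semitones_to_ratio(n_semitones):
    """Frequency ratio produced by a shift of n_semitones."""
    return 2.0 ** (n_semitones / 12.0)


def ratio_to_semitones(ratio):
    """Number of semitones corresponding to a positive frequency ratio."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return 12.0 * math.log2(ratio)


# Shifting up by 12 semitones doubles every frequency (one octave).
print(semitones_to_ratio(12))  # 2.0
```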
-
rate
(samplerate, quality='h') Change the audio sampling rate (i.e. resample the audio) to any given samplerate. Higher resampling quality comes at the cost of a slower runtime.
Parameters: samplerate : float
Desired sample rate.
quality : str
Resampling quality. One of:
- q : Quick - very low quality,
- l : Low,
- m : Medium,
- h : High (default),
- v : Very high
See also
upsample
,downsample
,convert
-
reverb
(reverberance=50, high_freq_damping=50, room_scale=100, stereo_depth=100, pre_delay=0, wet_gain=0, wet_only=False) Add reverberation to the audio using the ‘freeverb’ algorithm. A reverberation effect is sometimes desirable for concert halls that are too small or contain so many people that the hall’s natural reverberance is diminished. Applying a small amount of stereo reverb to a (dry) mono signal will usually make it sound more natural.
Parameters: reverberance : float, default=50
Percentage of reverberance
high_freq_damping : float, default=50
Percentage of high-frequency damping.
room_scale : float, default=100
Scale of the room as a percentage.
stereo_depth : float, default=100
Stereo depth as a percentage.
pre_delay : float, default=0
Pre-delay in milliseconds.
wet_gain : float, default=0
Amount of wet gain in dB
wet_only : bool, default=False
If True, only outputs the wet signal.
See also
echo
-
set_globals
(dither=False, guard=False, multithread=False, replay_gain=False, verbosity=2) Sets SoX’s global arguments. Overwrites any previously set global arguments. If this function is not explicitly called, globals are set to this function’s defaults.
Parameters: dither : bool, default=False
If True, dithering is applied for files with low bit rates.
guard : bool, default=False
If True, invokes the gain effect to guard against clipping.
multithread : bool, default=False
If True, each channel is processed in parallel.
replay_gain : bool, default=False
If True, applies replay-gain adjustment to input-files.
verbosity : int, default=2
SoX’s verbosity level. One of:
- 0 : No messages are shown at all.
- 1 : Only error messages are shown. These are generated if SoX cannot complete the requested commands.
- 2 : Warning messages are also shown. These are generated if SoX can complete the requested commands, but not exactly according to the requested command parameters, or if clipping occurs.
- 3 : Descriptions of SoX’s processing phases are also shown. Useful for seeing exactly how SoX is processing your audio.
- 4, >4 : Messages to help with debugging SoX are also shown.
-
silence
(location=0, silence_threshold=0.1, min_silence_duration=0.1, buffer_around_silence=False) Removes silent regions from an audio file.
Parameters: location : int, default=0
Where to remove silence. One of:
- 0 to remove silence throughout the file (default),
- 1 to remove silence from the beginning,
- -1 to remove silence from the end.
silence_threshold : float, default=0.1
Silence threshold as percentage of maximum sample amplitude. Must be between 0 and 100.
min_silence_duration : float, default=0.1
The minimum amount of time in seconds required for a region to be considered non-silent.
buffer_around_silence : bool, default=False
If True, leaves a buffer of min_silence_duration around removed silent regions.
See also
vad
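Because silence_threshold is a percentage of the maximum sample amplitude, it can be helpful to see what a given value means in decibels relative to full scale. The conversion below is a sketch of that arithmetic only; the helper name is hypothetical, and a threshold of exactly 0 is rejected here because it has no finite decibel equivalent.

```python
import math


def threshold_percent_to_dbfs(silence_threshold):
    """Convert a threshold in percent of full scale to dB relative to full scale."""
    if not 0 < silence_threshold <= 100:
        raise ValueError("silence_threshold must be in the range (0, 100]")
    return 20.0 * math.log10(silence_threshold / 100.0)


# The default threshold of 0.1 percent is about 60 dB below full scale.
print(threshold_percent_to_dbfs(0.1))
```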
-
tempo
(factor, audio_type=None, quick=False) Time stretch audio without changing pitch.
This effect uses the WSOLA algorithm. The audio is chopped up into segments which are then shifted in the time domain and overlapped (cross-faded) at points where their waveforms are most similar as determined by measurement of least squares.
Parameters: factor : float
The ratio of the new tempo to the old tempo. For example, 1.1 speeds up the tempo by 10%; 0.9 slows it down by 10%.
audio_type : str or None, default=None
Type of audio, which optimizes algorithm parameters. One of:
- m : Music,
- s : Speech,
- l : Linear (useful when factor is close to 1).
quick : bool, default=False
If True, this effect will run faster but with lower sound quality.
See also
stretch
,speed
,pitch
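Since factor is the ratio of the new tempo to the old one, the length of the output follows directly from it. A small sketch of that arithmetic, with hypothetical helper names and ignoring any rounding to whole samples:

```python
def stretched_duration(duration, factor):
    """Expected duration in seconds after changing the tempo by factor."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return duration / factor


def factor_for_duration(duration, target_duration):
    """Tempo factor needed to turn duration into target_duration."""
    if duration <= 0 or target_duration <= 0:
        raise ValueError("durations must be positive")
    return duration / target_duration


# Doubling the tempo halves the duration.
print(stretched_duration(10.0, 2.0))  # 5.0
```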
-
treble
(gain_db, frequency=3000.0, slope=0.5) Boost or cut the treble (upper) frequencies of the audio using a two-pole shelving filter with a response similar to that of a standard hi-fi’s tone-controls. This is also known as shelving equalisation.
The filters are described in detail in http://musicdsp.org/files/Audio-EQ-Cookbook.txt
Parameters: gain_db : float
The gain at the Nyquist frequency. For a large cut use -20, for a large boost use 20.
frequency : float, default=3000.0
The filter’s cutoff frequency in Hz.
slope : float, default=0.5
The steepness of the filter’s shelf transition. For a gentle slope use 0.3, and use 1.0 for a steep slope.
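The bass and treble effects together act like a pair of tone controls. The sketch below applies both to one transformer and skips a control whose gain is exactly 0 dB; skipping is a design choice of this example, not library behavior. `tfm` is assumed to be a Transformer instance, and the helper name is hypothetical.

```python
def apply_tone_controls(tfm, bass_db=0.0, treble_db=0.0):
    """Apply bass and treble shelving filters with their default settings.

    Returns the names of the effects that were actually applied.
    """
    applied = []
    if bass_db != 0.0:
        tfm.bass(bass_db)
        applied.append("bass")
    if treble_db != 0.0:
        tfm.treble(treble_db)
        applied.append("treble")
    return applied
```

For example, `apply_tone_controls(tfm, bass_db=3.0, treble_db=-2.0)` would boost the lows slightly and cut the highs.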
-
Combiners
Python wrapper around the SoX library. This module requires that SoX is installed.
-
class
sox.combine.
Combiner
(input_filepath_list, output_filepath, combine_type, input_volumes=None) Audio file combiner. Class which allows multiple files to be combined to create an output file, saved to output_filepath.
Inherits all methods from the Transformer class, thus any effects can be applied after combining.
Parameters: input_filepath_list : list of str
List of paths to input audio files.
output_filepath : str
Path to desired output file. If a file already exists at the given path, the file will be overwritten.
combine_type : str
Input file combining method. One of the following values:
- concatenate : combine input files by concatenating in the order given.
- merge : combine input files by stacking each input file into a new channel of the output file.
- mix : combine input files by summing samples in corresponding channels.
- mix-power : combine input files with volume adjustments such that the output volume is roughly equivalent to one of the input signals.
- multiply : combine input files by multiplying corresponding samples.
input_volumes : list of float, default=None
List of volumes to be applied upon combining input files. Volumes are applied to the input files in order. If None, input files will be combined at their original volumes.
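The constructor arguments can be sanity-checked before a Combiner is created. In the sketch below the helper names are hypothetical, and the requirement that input_volumes has one entry per input file is an assumption based on the volumes being applied to the files in order. The import is deferred so that the checks can be used without SoX installed.

```python
COMBINE_TYPES = ("concatenate", "merge", "mix", "mix-power", "multiply")


def check_combiner_args(input_filepath_list, combine_type, input_volumes=None):
    """Raise ValueError if the arguments look inconsistent."""
    if combine_type not in COMBINE_TYPES:
        raise ValueError("combine_type must be one of {}".format(COMBINE_TYPES))
    if input_volumes is not None and len(input_volumes) != len(input_filepath_list):
        # Assumption: one volume per input file.
        raise ValueError("input_volumes must have one entry per input file")
    return True


def make_combiner(input_filepath_list, output_filepath, combine_type,
                  input_volumes=None):
    """Check the arguments, then construct a Combiner."""
    check_combiner_args(input_filepath_list, combine_type, input_volumes)
    # Imported lazily: requires the sox package and a SoX installation.
    from sox.combine import Combiner
    return Combiner(input_filepath_list, output_filepath, combine_type,
                    input_volumes=input_volumes)
```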
Methods
File info
Audio file info computed by soxi.
-
sox.file_info.
bitrate
(input_filepath) Number of bits per sample (0 if not applicable).
Parameters: input_filepath : str
Path to audio file.
Returns: bitrate : int
Number of bits per sample. Returns 0 if not applicable.
-
sox.file_info.
channels
(input_filepath) Show number of channels.
Parameters: input_filepath : str
Path to audio file.
Returns: channels : int
number of channels
-
sox.file_info.
comments
(input_filepath) Show file comments (annotations) if available.
Parameters: input_filepath : str
Path to audio file.
Returns: comments : str
File comments from header. If no comments are present, returns an empty string.
-
sox.file_info.
duration
(input_filepath) Show duration in seconds (0 if unavailable).
Parameters: input_filepath : str
Path to audio file.
Returns: duration : float
Duration of audio file in seconds. If unavailable or empty, returns 0.
-
sox.file_info.
encoding
(input_filepath) Show the name of the audio encoding.
Parameters: input_filepath : str
Path to audio file.
Returns: encoding : str
audio encoding type
-
sox.file_info.
file_extension
(filepath) Get the extension of a filepath.
Parameters: filepath : str
File path.
-
sox.file_info.
file_type
(input_filepath) Show detected file-type.
Parameters: input_filepath : str
Path to audio file.
Returns: file_type : str
file format type (ex. ‘wav’)
-
sox.file_info.
num_samples
(input_filepath) Show number of samples (0 if unavailable).
Parameters: input_filepath : str
Path to audio file.
Returns: n_samples : int
total number of samples in audio file. Returns 0 if empty or unavailable
-
sox.file_info.
sample_rate
(input_filepath) Show sample-rate.
Parameters: input_filepath : str
Path to audio file.
Returns: samplerate : float
number of samples/second
-
sox.file_info.
validate_input_file
(input_filepath) Input file validation function. Checks that file exists and can be processed by SoX.
Parameters: input_filepath : str
The input filepath.
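The functions in this module can be combined into a single summary of a file. The sketch below collects a few of them into a dictionary. The helper name is hypothetical, and the optional `file_info` argument exists only so that a stand-in object can be supplied for testing; by default the real `sox.file_info` module is imported, which requires SoX to be installed.

```python
def summarize_audio_file(input_filepath, file_info=None):
    """Collect basic properties of an audio file into a dictionary."""
    if file_info is None:
        # Imported lazily: requires the sox package and a SoX installation.
        import sox.file_info as file_info
    # Check that the file exists and can be processed before querying it.
    file_info.validate_input_file(input_filepath)
    return {
        "file_type": file_info.file_type(input_filepath),
        "encoding": file_info.encoding(input_filepath),
        "channels": file_info.channels(input_filepath),
        "sample_rate": file_info.sample_rate(input_filepath),
        "num_samples": file_info.num_samples(input_filepath),
        "duration": file_info.duration(input_filepath),
    }
```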
Core functionality
Base module for calling SoX
-
exception
sox.core.
SoxError
(*args, **kwargs) Exception to be raised when SoX exits with non-zero status.
-
exception
sox.core.
SoxiError
(*args, **kwargs) Exception to be raised when SoXi exits with non-zero status.
-
sox.core.
all_equal
(list_of_things) Check if a list contains identical elements.
Parameters: list_of_things : list
list of objects
-
sox.core.
enquote_filepath
(fpath) Wrap a filepath in double-quotes to protect difficult characters.
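To illustrate what the two helpers above are described as doing, here are pure-Python stand-ins. These are not the library's implementations: the names carry a `_sketch` suffix to make that clear, and details such as the result for an empty list are assumptions.

```python
def all_equal_sketch(list_of_things):
    """Return True if every element equals the first one.

    An empty list yields True here; how the library treats that case
    is not stated in the description above.
    """
    return all(item == list_of_things[0] for item in list_of_things)


def enquote_filepath_sketch(fpath):
    """Wrap a path in double quotes, e.g. to protect embedded spaces."""
    return '"{}"'.format(fpath)


print(all_equal_sketch([44100, 44100, 44100]))     # True
print(enquote_filepath_sketch("my recording.wav"))  # "my recording.wav"
```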
-
sox.core.
play
(args) Pass an argument list to play.
Parameters: args : iterable
Argument list for play. The first item can, but does not need to, be ‘play’.