aeneas.cmfcc

aeneas.cmfcc is a Python C extension for computing Mel-frequency cepstral coefficients (MFCCs) from a mono WAVE file.

cmfcc.compute_from_data(data, sample_rate, filter_bank_size, mfcc_size, fft_order, lower_frequency, upper_frequency, emphasis_factor, window_length, window_shift)

Compute MFCCs for a given WAVE mono file, passed as a NumPy 1D array of float64 values in [-1.0, 1.0].

The returned tuple (mfcc, length, sr) contains the MFCCs as a NumPy 2D matrix of shape (n, mfcc_size), where n is the number of MFCC frames, followed by the number of samples (length) and the sample rate (sr) of the audio data.

The last two elements, length and sr, are returned to keep the signature of this function consistent with that of cmfcc.compute_from_file().

Parameters:
  • data (numpy.ndarray (1D)) – the audio data
  • sample_rate (int) – the audio sample rate
  • filter_bank_size (int) – the number of Mel filters
  • mfcc_size (int) – the number of MFCC coefficients
  • fft_order (int) – the order of the FFT
  • lower_frequency (float) – the lower frequency to cut, in Hz
  • upper_frequency (float) – the upper frequency to cut, in Hz
  • emphasis_factor (float) – the pre-emphasis factor
  • window_length (float) – the length of the MFCC window, in seconds
  • window_shift (float) – the shift of the MFCC window, in seconds
Return type: tuple
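The following is a minimal sketch of preparing input for cmfcc.compute_from_data(), not an official example. It assumes the extension is importable as aeneas.cmfcc.cmfcc, that arguments are passed positionally in the documented order, and that the parameter values shown are merely illustrative. The helper pcm16_to_float64 is hypothetical and exists only to show how 16-bit PCM samples can be scaled into the required float64 range.

```python
import numpy as np


def pcm16_to_float64(samples):
    """Scale 16-bit PCM samples to float64 values in [-1.0, 1.0).

    Hypothetical helper, not part of aeneas.
    """
    return np.asarray(samples, dtype=np.int16).astype(np.float64) / 32768.0


# One second of a 440 Hz tone at 16 kHz, as 16-bit PCM (synthetic test data).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
pcm = (0.5 * 32767 * np.sin(2 * np.pi * 440.0 * t)).astype(np.int16)

# 1D float64 array, as required by compute_from_data().
data = pcm16_to_float64(pcm)

try:
    # Assumed import path; skipped if the C extension is not installed.
    from aeneas.cmfcc import cmfcc
except ImportError:
    cmfcc = None

if cmfcc is not None:
    # Illustrative parameter values, passed in the documented order.
    mfcc, length, sr = cmfcc.compute_from_data(
        data,         # data
        sample_rate,  # sample_rate
        40,           # filter_bank_size
        13,           # mfcc_size
        512,          # fft_order
        133.3333,     # lower_frequency, in Hz
        6855.4976,    # upper_frequency, in Hz
        0.97,         # emphasis_factor
        0.025,        # window_length, in seconds
        0.010,        # window_shift, in seconds
    )
    print(mfcc.shape, length, sr)
```

If the extension is available, mfcc.shape[1] should equal the mfcc_size argument, and length and sr should match the input data.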

cmfcc.compute_from_file(audio_file_path, filter_bank_size, mfcc_size, fft_order, lower_frequency, upper_frequency, emphasis_factor, window_length, window_shift)

Compute MFCCs for a given WAVE mono file, passed as a file path on disk.

The returned tuple (mfcc, length, sr) contains the MFCCs as a NumPy 2D matrix of shape (n, mfcc_size), where n is the number of MFCC frames, followed by the number of samples (length) and the sample rate (sr) of the WAVE file.

Parameters:
  • audio_file_path (string) – the path of the WAVE file to be read, UTF-8 encoded
  • filter_bank_size (int) – the number of Mel filters
  • mfcc_size (int) – the number of MFCC coefficients
  • fft_order (int) – the order of the FFT
  • lower_frequency (float) – the lower frequency to cut, in Hz
  • upper_frequency (float) – the upper frequency to cut, in Hz
  • emphasis_factor (float) – the pre-emphasis factor
  • window_length (float) – the length of the MFCC window, in seconds
  • window_shift (float) – the shift of the MFCC window, in seconds
Return type: tuple
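As a rough sketch of the file-based variant, the snippet below writes a small mono WAVE file with the standard library and then passes its path to cmfcc.compute_from_file(). The helper write_mono_wave is hypothetical, the import path aeneas.cmfcc.cmfcc and the use of a plain Python string for the path are assumptions, and the parameter values are illustrative only.

```python
import math
import os
import struct
import tempfile
import wave


def write_mono_wave(path, sample_rate=16000, seconds=1.0, frequency=440.0):
    """Write a 16-bit mono WAVE file containing a sine tone.

    Hypothetical helper, not part of aeneas. Returns the number of samples.
    """
    n = int(sample_rate * seconds)
    frames = b"".join(
        struct.pack(
            "<h",
            int(0.5 * 32767 * math.sin(2 * math.pi * frequency * i / sample_rate)),
        )
        for i in range(n)
    )
    with wave.open(path, "wb") as w:
        w.setnchannels(1)   # mono, as required by the extension
        w.setsampwidth(2)   # 16-bit samples
        w.setframerate(sample_rate)
        w.writeframes(frames)
    return n


path = os.path.join(tempfile.mkdtemp(), "tone.wav")
n_samples = write_mono_wave(path)

try:
    # Assumed import path; skipped if the C extension is not installed.
    from aeneas.cmfcc import cmfcc
except ImportError:
    cmfcc = None

if cmfcc is not None:
    # Illustrative parameter values, passed in the documented order.
    mfcc, length, sr = cmfcc.compute_from_file(
        path,       # audio_file_path
        40,         # filter_bank_size
        13,         # mfcc_size
        512,        # fft_order
        133.3333,   # lower_frequency, in Hz
        6855.4976,  # upper_frequency, in Hz
        0.97,       # emphasis_factor
        0.025,      # window_length, in seconds
        0.010,      # window_shift, in seconds
    )
    print(mfcc.shape, length, sr)
```

Because both functions return the same (mfcc, length, sr) tuple, calling code can unpack the result in the same way regardless of whether the audio came from memory or from disk.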