extraire des coéficient mfcc sous matlab Fermé

Question

Bonjour, j'ai voudré faire l'extraction des coéficient mfcc d'un signal vocal j'ai cherché et j'ai trouvé ce code function [cepstra,aspectrum,pspectrum] = melfcc(samples, sr, varargin) %[cepstra,aspectrum,pspectrum] = melfcc(samples, sr[, opts ...]) % Calculate Mel-frequency cepstral coefficients by: % - take the absolute value of the STFT % - warp to a Mel frequency scale % - take the DCT of the log-Mel-spectrum % - return the first components % This version allows a lot of options to be controlled, as optional % 'name', value pairs from the 3rd argument on: (defaults in parens) % 'wintime' (0.025): window length in sec % 'hoptime' (0.010): step between successive windows in sec % 'numcep' (13): number of cepstra to return % 'lifterexp' (0.6): exponent for liftering; 0 = none; < 0 = HTK sin lifter % 'sumpower' (1): 1 = sum abs(fft)^2; 0 = sum abs(fft) % 'preemph' (0.97): apply pre-emphasis filter [1 -preemph] (0 = none) % 'dither' (0): 1 = add offset to spectrum as if dither noise % 'minfreq' (0): lowest band edge of mel filters (Hz) % 'maxfreq' (4000): highest band edge of mel filters (Hz) % 'nbands' (40): number of warped spectral bands to use % 'bwidth' (1.0): width of aud spec filters relative to default % 'dcttype' (2): type of DCT used - 1 or 2 (or 3 for HTK or 4 for feac) % 'fbtype' ('mel'): frequency warp: 'mel','bark','htkmel','fcmel' % 'usecmp' (0): apply equal-loudness weighting and cube-root compr. % 'modelorder' (0): if > 0, fit a PLP model of this order % The following non-default values nearly duplicate Malcolm Slaney's mfcc % (i.e. melfcc(d,16000,opts...) =~= log(10)*2*mfcc(d*(2^17),16000) ) % 'wintime': 0.016 % 'lifterexp': 0 % 'minfreq': 133.33 % 'maxfreq': 6855.6 % 'sumpower': 0 % The following non-default values nearly duplicate HTK's MFCC % (i.e. melfcc(d,16000,opts...) =~= 2*htkmelfcc(:,[13,[1:12]])' % where HTK config has PREEMCOEF = 0.97, NUMCHANS = 20, CEPLIFTER = 22, % NUMCEPS = 12, WINDOWSIZE = 250000.0, USEHAMMING = T, TARGETKIND = MFCC_0) % 'lifterexp': -22 % 'nbands': 20 % 'maxfreq': 8000 % 'sumpower': 0 % 'fbtype': 'htkmel' % 'dcttype': 3 % For more detail on reproducing other programs' outputs, see % https://www.ee.columbia.edu/~dpwe/resources/matlab/rastamat/mfccs.html % % 2005-04-19 dpwe@ee.columbia.edu after rastaplp.m. % Uses Mark Paskin's process_options.m from KPMtools if nargin < 2; sr = 16000; end % Parse out the optional arguments [wintime, hoptime, numcep, lifterexp, sumpower, preemph, dither, ... minfreq, maxfreq, nbands, bwidth, dcttype, fbtype, usecmp, modelorder] = ... process_options(varargin, 'wintime', 0.025, 'hoptime', 0.010, ... 'numcep', 13, 'lifterexp', 0.6, 'sumpower', 1, 'preemph', 0.97, ... 'dither', 0, 'minfreq', 0, 'maxfreq', 4000, ... 'nbands', 40, 'bwidth', 1.0, 'dcttype', 2, ... 'fbtype', 'mel', 'usecmp', 0, 'modelorder', 0); if preemph ~= 0 samples = filter([1 -preemph], 1, samples); end % Compute FFT power spectrum pspectrum = powspec(samples, sr, wintime, hoptime, dither); aspectrum = audspec(pspectrum, sr, nbands, fbtype, minfreq, maxfreq, sumpower, bwidth); if (usecmp) % PLP-like weighting/compression aspectrum = postaud(aspectrum, maxfreq, fbtype); end if modelorder > 0 if (dcttype ~= 1) disp(['warning: plp cepstra are implicitly dcttype 1 (not ', num2str(dcttype), ')']); end % LPC analysis lpcas = dolpc(aspectrum, modelorder); % convert lpc to cepstra cepstra = lpc2cep(lpcas, numcep); % Return the auditory spectrum corresponding to the cepstra? % aspectrum = lpc2spec(lpcas, nbands); % else return the aspectrum that the cepstra are based on, prior to PLP else % Convert to cepstra via DCT cepstra = spec2cep(aspectrum, numcep, dcttype); end cepstra = lifter(cepstra, lifterexp); mais je ne l'ai pas bien compris je veut savoir qu'est ce que représente les argument (samples, sr, varargin) merci de m'avoire aider

zinebsoo · Answer

bonjour 
samples représente les échantillons du signal vocal après l'utilisation de la fonction wavread, donc tu fais ca :[samples,sr] = wavread('nom_fichier vocal.wav');
et sr correspond au sampling rate donc c'est la fréquence d'échantillonnage.
et nargin: correspond aux nombres des entrées d'une fonction ,par exemple si tes entrés sont inferieure a 2 donc nargin=2 veux dire que ta pas exiger la fréquence d'échantillonnage (sr) donc la valeur prête par défaut sera 16KHz d'après ton programme .
bon courage

Extraire des coéficient mfcc sous matlab

1 réponse

Discussions similaires