AS Fundamental Analysis sdif/txt parameters

What do the different data columns refer to?

For example, I got a list of 5 columns:

0.275022675736961 333.095275878906250 0.8444882035255 0.818155527114868 0.055422376841307

0.279172335600907 332.260437011718750 0.817870974540710 0.675731658935547 0.062232207506895

0.283344671201814 328.637573242187500 0.78286832571029 0.63606327772140 0.047893941402435

(…)

Column 1 is obviously the time and the 2nd is the frequency F0, but for the amplitude
it is hard to guess between the 3rd and the 4th. And what does the 5th column mean? I cannot find any documentation about that.

By the way, using the data I got pretty good and accurate results.

Hi,

Both the SDIF and the text files should order the data this way:

Column 1: time
Column 2: frequency
Column 3: confidence
Column 4: score
Column 5: real amplitude

A description of the standard SDIF frame types is available here:

http://recherche.ircam.fr/equipes/analyse-synthese/sdif/standard/types-main.html

And for the Fundamental Frequency Estimate:

http://recherche.ircam.fr/equipes/analyse-synthese/sdif/standard/types-doc.html#Matrix_1FQ0

The BPF and/or text export of the F0 should also follow this ordering.
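
For anyone reading the text export in a script, here is a minimal Python sketch that maps each line onto the five columns listed above. The names `F0Frame` and `parse_f0_text` are made up for illustration and are not part of AudioSculpt or any SDIF library; the sketch only assumes whitespace-separated values, five per line, as in the sample from the first post.

```python
from typing import List, NamedTuple


class F0Frame(NamedTuple):
    """One line of the F0 text export, in the column order given above."""
    time: float
    frequency: float
    confidence: float
    score: float
    amplitude: float


def parse_f0_text(text: str) -> List[F0Frame]:
    """Parse whitespace-separated F0 lines; skip lines without 5 fields."""
    frames = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # blank line or anything that is not a 5-column row
        frames.append(F0Frame(*(float(f) for f in fields)))
    return frames


# Two rows copied from the example in the first post.
sample = """\
0.275022675736961 333.095275878906250 0.8444882035255 0.818155527114868 0.055422376841307

0.279172335600907 332.260437011718750 0.817870974540710 0.675731658935547 0.062232207506895
"""

frames = parse_f0_text(sample)
for frame in frames:
    print(frame.time, frame.frequency, frame.confidence, frame.amplitude)
```

Using named fields rather than column indices makes it harder to mix up the confidence, score and amplitude columns later on.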

HTH,
Best regards
Charles

Thanks Charles, OK, that makes sense… then the amplitude must be the amplitude of the F0 only, am I
right?

Is it possible to synthesize the F0 directly? I only see “synthesize from partials”.

Best regards,
Vincent

Sorry,
I checked the amplitude data and it is the amplitude of the whole sound, good…
I thought the sound file was normalized…

Best,
Vincent

Hello

then the amplitude must be the amplitude of the F0 only - am I right ?

No, in fact the amplitude is the sum of the squares of the amplitudes of all spectral peaks considered to be part of the harmonic series of the F0. It is some sort of voiced energy attached to the F0. It only counts the peaks that are used for the F0 analysis (so none above the max frequency) and it does not try to distinguish sinusoidal peaks from noise peaks. It is really very basic.
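
To make the description concrete, here is a rough Python sketch of such a sum of squares. It is not the actual analysis code: the function name `harmonic_energy`, the `(frequency, amplitude)` peak format and especially the `tolerance` rule for deciding whether a peak belongs to the harmonic series are assumptions made for illustration only.

```python
from typing import Iterable, Tuple


def harmonic_energy(
    peaks: Iterable[Tuple[float, float]],
    f0: float,
    max_freq: float,
    tolerance: float = 0.03,
) -> float:
    """Sum of squared amplitudes of peaks lying close to a multiple of f0.

    peaks     -- (frequency in Hz, linear amplitude) pairs (assumed format)
    max_freq  -- peaks above this frequency are ignored
    tolerance -- allowed deviation from a harmonic, as a fraction of f0
                 (hypothetical matching rule, not taken from the analysis)
    """
    if f0 <= 0:
        return 0.0
    energy = 0.0
    for freq, amp in peaks:
        if freq > max_freq:
            continue  # outside the range used for the F0 analysis
        harmonic = round(freq / f0)
        if harmonic < 1:
            continue  # below the fundamental
        if abs(freq / f0 - harmonic) <= tolerance:
            # No attempt to separate sinusoidal peaks from noise peaks.
            energy += amp * amp
    return energy


# Made-up peaks: two harmonics of 330 Hz, one inharmonic peak,
# and one harmonic peak above max_freq.
example_peaks = [(330.0, 0.2), (660.0, 0.1), (500.0, 0.3), (3300.0, 0.5)]
energy = harmonic_energy(example_peaks, f0=330.0, max_freq=2000.0)
print(energy)
```

Only the peaks at 330 Hz and 660 Hz contribute here; the 500 Hz peak is not near a harmonic and the 3300 Hz peak lies above `max_freq`.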

Best
Axel