
Fundamental frequency estimate (OM-Pm2 vs. OM-SuperVP)

Hi,

I am comparing the results of OM-SuperVP’s f0-estimate and OM-Pm2’s pm2-f0. Although their descriptions in the doc are similar, their outputs are quite different. Apart from a couple of parameters that differ, it looks as if SuperVP returns the most prominent pitch, whereas Pm2 estimates a possible fundamental underlying what is heard? (Example tested: for a trumpet sound starting around G4, SuperVP returns G4, but Pm2 returns a value around C3, as if G4 were considered a partial.)

Thanks for any insight!
Jimmie

Hi Jimmie,

First I’d like to say that Pm2 and SuperVP use the same F0 estimation library. This library contains two F0 estimation algorithms, so you will get different results depending on which one is selected. Moreover, the parameterisation of the algorithms is more rudimentary in Pm2, so I think that in general it is better to use SuperVP for this task.

Not knowing OM-SuperVP or OM-PM2 I cannot tell you anything about the differences - can you access the command lines that are generated by these two OM modules? In that case I could see what version of the F0 algorithm they use and how they parameterise the algorithms.

Best
Axel

hi,

here is a bit of Lisp code from OM's f0-estimate:

;;;================================================================================================================
;;; FUNDAMENTAL FREQUENCY ESTIMATE
;;;================================================================================================================

(defmethod! f0-estimate ((infile string) &key begin-time end-time
                         (fund-minfreq 50.0) (fund-maxfreq 1000.0) (spectrum-maxfreq 4000.0)
                         (noise-threshold 50.0) (smooth-order 3)
                         (windowsize 4096) (fftsize 4096) (step 256)
                         (windowtype "hanning") (out "f0.sdif"))
  :initvals '(nil nil nil 50.0 1000.0 4000.0 50.0 3 4096 4096 256 "hanning" "f0.sdif")
  :menuins (list (list 11 '(("Blackman" "blackman") ("Hanning" "hanning") ("Hamming" "hamming"))))
  :icon 952
  :doc "Calculates the fundamental frequency using SuperVP.
The results of the analysis are stored in an SDIF file whose pathname is returned.

- <infile> : pathname or SOUND object to be analysed
- <begin-time> : the begin time of the analysis (s)
- <end-time> : the end time of the analysis (s)
- <fund-minfreq> : Fundamental minimal frequency (Hz)
- <fund-maxfreq> : Fundamental maximal frequency (Hz)
- <spectrum-maxfreq> : Maximal frequency in spectrum (Hz)
- <noise-threshold> : Noise threshold (dB)
- <smooth-order> : Smooth Order (int)
- <windowsize> : number of samples of the analysis window
- <fftsize> : number of points of fft
- <step> : number of samples between two successive analysis windows
- <windowtype> : shape of the analysis window
- <out> : output file pathname
"
  (if (and *SVP-PATH* (probe-file *SVP-PATH*))
      (let ((outname (if out
                         (handle-new-file-exists (if (pathnamep out) out (outfile out)))
                         (om-choose-new-file-dialog :prompt "Choose a new SDIF F0 file"
                                                    :directory (def-save-directory)))))
        (when outname
          (setf *last-saved-dir* (make-pathname :directory (pathname-directory outname)))
          ;; Build each group of command-line options as a string, then assemble the full command.
          (let* ((unix-outname (om-path2cmdpath outname))
                 (beginstr (if begin-time (format nil "-B~D " begin-time) ""))
                 (endstr (if end-time (format nil "-E~D " end-time) ""))
                 (fftstr (format nil "-N~D -M~D -W~D -I~D " fftsize windowsize windowtype step))
                 (f0params (format nil "fm~D fM~D F~D sn~D smooth~D "
                                   fund-minfreq fund-maxfreq spectrum-maxfreq noise-threshold smooth-order))
                 (cmd (format nil "~s -v -t -ns -U -S~s -Af0 ~s ~A ~A ~A -OS0 ~s"
                              (om-path2cmdpath *SVP-PATH*)
                              (om-path2cmdpath infile)
                              f0params
                              fftstr beginstr endstr unix-outname)))
            (om-print "===========================")
            (om-print "SUPERVP F0 ANALYSIS")
            (om-print "===========================")
            (om-print cmd)
            (om-cmd-line cmd *sys-console*)
            ;; Return the output pathname only if the file was actually written.
            (and outname (probe-file outname)))))
    (om-beep-msg "!!! SuperVP not found !!!")))

I’m interested in the svp command lines topic and the way these commands are customized through other means to create routines (OM, Python, Shell…)

hope it helps

N.

Hi there – rather than Lisp code: you can see the command line that is sent to pm2 (or SuperVP) in the OM Listener.
Here with the default one for pm2 (if you do not set any parameter)

pm2 -Af0 --f0min=100 --f0max=300 --f0ana=3000 --f0use -S"/[…]/in.aiff" -M4096 -I256 -N4096 -m40 -Whanning "/[…]/f0.sdif"

=> note that I have corrected an error here – in OM-pm2 1.3 you might get “-N4096-m40” (with no space before -m)

The default command with OM-SuperVP is:

supervp -v -t -ns -U -S"/[…]/in.aiff" -Af0 "fm50.0 fM1000.0 F4000.0 sn50.0 smooth3 " -N4096 -M4096 -Whanning -I256 -OS0 "/[…]/f0.sdif"
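
For anyone who wants to drive these tools from a script, the two default commands above could be assembled roughly as follows. This is only a sketch: the function names `supervp_f0_args` and `pm2_f0_args`, the bare executable names and the file names `in.aiff` / `f0.sdif` are placeholders of mine, and the only options used are the ones visible in the two command lines above. Building an argument list instead of one long string also sidesteps the missing-space problem ("-N4096-m40"), since every option is its own item.

```python
import shlex

def supervp_f0_args(infile, outfile, f0_min=50.0, f0_max=1000.0, ana_max=4000.0,
                    noise_db=50.0, smooth=3, fft=4096, win=4096, step=256,
                    window="hanning", exe="supervp"):
    """Argument list mirroring the default OM-SuperVP f0 command shown above."""
    # The f0 parameters travel as ONE argument following -Af0 (trailing space kept as in the thread).
    f0params = f"fm{f0_min} fM{f0_max} F{ana_max} sn{noise_db} smooth{smooth} "
    return [exe, "-v", "-t", "-ns", "-U", f"-S{infile}", "-Af0", f0params,
            f"-N{fft}", f"-M{win}", f"-W{window}", f"-I{step}", "-OS0", outfile]

def pm2_f0_args(infile, outfile, f0_min=100, f0_max=300, ana_max=3000,
                peak_db=40, fft=4096, win=4096, step=256,
                window="hanning", exe="pm2"):
    """Argument list mirroring the default OM-pm2 f0 command shown above."""
    return [exe, "-Af0", f"--f0min={f0_min}", f"--f0max={f0_max}",
            f"--f0ana={ana_max}", "--f0use", f"-S{infile}",
            f"-M{win}", f"-I{step}", f"-N{fft}", f"-m{peak_db}",
            f"-W{window}", outfile]

# Print shell-quoted versions for inspection; nothing is executed here.
print(shlex.join(supervp_f0_args("in.aiff", "f0.sdif")))
print(shlex.join(pm2_f0_args("in.aiff", "f0.sdif")))
```

The lists can be handed to `subprocess.run(args)` once the executable paths are known; changing `f0_min` / `f0_max` in one place then keeps both tools on the same search range.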

Jean

hello Jean,

thanks for these command lines. I am in fact not very proficient in lisp.

There are some quite fundamental differences in these defaults, and if the results described in the initial question were produced with the default parameters, then I think everything can be explained by them:

(example tested: a trumpet sound starting around G4 returns the same with SuperVP, but around C3 with Pm2, as if G4 was considered a partial)…

The most critical parameter of the fundamental frequency estimation is the fundamental frequency search range. If the search range does not cover the fundamental of the sound, the results obviously cannot be correct.

Here the default search range for SuperVP is given by fm50.0 fM1000.0, which means search for the F0 between 50 Hz and 1000 Hz, while for pm2 the default search range is given by --f0min=100 --f0max=300, which means search for the F0 between 100 Hz and 300 Hz. Given that the trumpet note is G4 ~= 392 Hz, it is quite obvious that PM2 will not be able to find the correct note. It does the second best thing and outputs a subharmonic (C3 ~= 130 Hz = 390/3) that lies in the F0 search range and is compatible with the observed sound. I am pretty sure that if PM2 is given the correct F0 search range, it will also find the correct result.
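
To make the arithmetic concrete, here is a small sketch that lists which of the true F0 and its integer subharmonics fall inside a given search range. The helper name `candidates_in_range` and the `max_divisor` limit are illustrative assumptions; this is not how the estimation library works internally, it only shows which frequencies the range allows at all.

```python
def candidates_in_range(true_f0, f0_min, f0_max, max_divisor=8):
    """Return (divisor, frequency) pairs for true_f0 / k that lie in [f0_min, f0_max]."""
    return [(k, true_f0 / k)
            for k in range(1, max_divisor + 1)
            if f0_min <= true_f0 / k <= f0_max]

g4 = 392.0  # approximate frequency of G4 in Hz

# pm2 defaults from the thread: 100-300 Hz. G4 itself is excluded.
print(candidates_in_range(g4, 100.0, 300.0))

# SuperVP defaults from the thread: 50-1000 Hz. G4 (divisor 1) is included.
print(candidates_in_range(g4, 50.0, 1000.0))
```

With the pm2 defaults only the divisors 2 (196 Hz) and 3 (about 130.7 Hz, close to C3) remain, so a value near C3 is one of the few answers the range permits. Which of the remaining candidates an estimator reports depends on its scoring; the thread only tells us that pm2 landed near C3.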

Note besides this that --f0use in the PM2 command line selects the feature-scoring F0 algorithm, while the SuperVP command line uses the much older histogram-based algorithm. Normally the feature-scoring algorithm performs better, but obviously this also depends on the type of sound and on the other parameter choices (see above).

=> note that I have corrected an error here – in OM-pm2 1.3 you might get “-N4096-m40” (with no space before -m)
Well, if this is only a display problem then it does not matter. If the command line is actually sent that way, then -m40 will simply be ignored. This parameter controls the peak filter, which removes all peaks that are more than 40 dB below the maximum peak within each spectral frame. Note that the corresponding parameter in the SuperVP command line is sn50, so here again the default values differ, which will certainly produce different results for at least some sounds.
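
As an illustration of what such a relative peak filter does, here is a minimal sketch. It follows the description above and is not the actual pm2 or SuperVP code; the function name `filter_peaks` and the example frame values are made up.

```python
def filter_peaks(peaks, threshold_db=40.0):
    """Keep the peaks of one spectral frame that lie within threshold_db of the strongest peak.

    peaks: list of (frequency_hz, amplitude_db) tuples for a single frame.
    """
    if not peaks:
        return []
    max_amp = max(amp for _, amp in peaks)
    return [(freq, amp) for freq, amp in peaks if amp >= max_amp - threshold_db]

# Hypothetical frame: harmonics of a 392 Hz tone with decreasing level.
frame = [(392.0, -10.0), (784.0, -25.0), (1176.0, -55.0), (1568.0, -52.0)]

print(filter_peaks(frame, 40.0))  # threshold as in -m40
print(filter_peaks(frame, 50.0))  # threshold as in sn50
```

In this made-up frame the 40 dB threshold drops the two weakest partials while the 50 dB threshold keeps all four, so the F0 estimator sees a different set of peaks in the two cases.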

Finally, the analysis range (the part of the spectrum that is used to analyse the sound) is also different. For SuperVP the maximum frequency taken into account is F4000 (which means 4000 Hz), while for pm2 --f0ana=3000 means 3000 Hz. In general it is best to limit the observed spectrum to 2-3 times F0max. This is sufficient to discover the harmonic grid, and it avoids the potential impact of noise in higher frequency regions, which generally perturbs the algorithms.
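
The rule of thumb can be written down as a tiny helper. This is only a sketch: `analysis_band` is a hypothetical name, and the factors 2 and 3 are simply the ones mentioned above.

```python
def analysis_band(f0_max, low_factor=2.0, high_factor=3.0):
    """Suggested range (Hz) for the analysis maximum frequency, as 2-3 times F0max."""
    return (low_factor * f0_max, high_factor * f0_max)

print(analysis_band(1000.0))  # SuperVP default fM1000.0
print(analysis_band(300.0))   # pm2 default --f0max=300
```

By this rule the SuperVP default F4000 is a little above the suggested 2000-3000 Hz band, and the pm2 default --f0ana=3000 is well above the 600-900 Hz band that its default --f0max=300 would suggest.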

Best
Axel

Dear all,

Thank you very much for sorting that out, I didn’t think at first that the default settings would be so different… Indeed, when we modify the default parameters for the f0 minfreq/maxfreq search range, we get two rather similar analyses. It even turns out that for the same trumpet tune, using an f0 range of 400-950 Hz and a max freq of 3000 Hz (which we can set on SVP, but not on Pm2), with all other settings equal, Pm2 outputs a cleaner result (which may support the idea that the feature-scoring algorithm performs better, at least with this type of sound).

Finally, I just want to report that Pm2’s ‘spectrum-maxfreq’ inlet seems to have a bug, since when shift-clicked, it pops up the ‘harmonic/inharmonic’ menu… The same parameter works properly on SVP, though.

All best,
Jimmie

Hello Jimmie,

I am very happy that the feedback has helped you understand and resolve the problem.

Best
Axel