Segmentation / Frequency Estimation

In the case of this sound file, markers are very precisely placed using MEL Spectral Flux segmentation. However, the FrequencyMean estimate within these markers is not useful. Is there a way to improve the frequency calculation, for example by discarding silences or values beyond a certain delta / threshold? Many thanks

Update: I’ve come up with a bit of a workaround for pitch estimation. Once markers have been assigned (using MEL segmentation), the yin~ data for each segment is scanned and filtered: first with an Energy threshold (removing silences), second with a Periodicity threshold (removing noisier frequency estimates). A median pitch estimate is taken, and all data that passes the thresholds is averaged to get a “pitch estimate” (in float MIDI values), which is then added to the MuBu track as a constant. From this point, markers can be selected using this “pitch estimate” as a target, and the results give a far more consistent pitch impression than FrequencyMean can provide. Short demo (one buffer):
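For anyone who wants to try the idea outside Max, here is a rough Python sketch of the per-segment filtering described in the update. It is not the patch from the demo, just one reading of the steps. The function names, the frame layout (frequency in Hz, energy, periodicity), the threshold values and the `max_dev` tolerance around the median are all assumptions made for illustration. In particular, using the median to reject outliers before averaging is only one interpretation of how the median and the average combine.

```python
import math
from statistics import median


def hz_to_midi(freq_hz):
    """Convert a frequency in Hz to a float MIDI note number (A4 = 440 Hz = 69)."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)


def segment_pitch_estimate(frames, energy_min=0.01, periodicity_min=0.8, max_dev=1.0):
    """Return one float MIDI pitch for a segment, or None if nothing survives.

    frames: iterable of (freq_hz, energy, periodicity) tuples for one segment
            (assumed layout, not necessarily the column order of yin~ output).
    energy_min, periodicity_min: hypothetical thresholds, to be tuned by ear.
    max_dev: assumed tolerance in semitones around the median.
    """
    # Step 1 and 2: drop silent frames (energy) and noisy frames (periodicity).
    voiced = [
        hz_to_midi(f)
        for f, e, p in frames
        if f > 0 and e >= energy_min and p >= periodicity_min
    ]
    if not voiced:
        return None

    # Step 3: take the median as a robust centre for the segment.
    centre = median(voiced)

    # Step 4: average the frames that lie close to the median.
    kept = [m for m in voiced if abs(m - centre) <= max_dev]
    if not kept:
        # Can happen when two distant values straddle the median.
        return centre
    return sum(kept) / len(kept)
```

The returned value plays the role of the per-segment constant: it would be stored once per marker, and markers would then be picked by how close that constant is to a target pitch.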