Segmentation / Frequency Estimation

ctrapani · May 17, 2023, 2:35pm

In the case of this sound file, markers are very precisely placed using MEL Spectral Flux segmentation. However, the Frequency Mean estimation within these markers is not useful. Is there a way to improve the frequency calculation, for example by discarding silences or values beyond a certain delta / threshold? Many thanks

ctrapani · May 24, 2023, 12:07pm

Update: I’ve come up with a workaround for pitch estimation. Once markers have been assigned (using MEL segmentation), the yin~ data for each segment is scanned and filtered: first with an Energy threshold (removing silences), then with a Periodicity threshold to remove the noisier frequency estimates. A median pitch estimate is taken, and all data that passes the threshold is averaged to get a “pitch estimate” (in float MIDI values), which is then added to the MuBu track as a constant. From this point, markers can be selected using this “pitch estimate” as a target, and the results give a far more consistent pitch impression than FrequencyMean can provide. Short demo (one buffer):
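Separately from the demo, for anyone who wants to try the same filtering idea outside Max, here is a minimal Python sketch of the steps described above. It is not the patch itself. It assumes each analysis frame is a (frequency in Hz, energy, periodicity) triple like the ones yin~ reports. The function names, the threshold defaults and the optional `max_dev` parameter (which drops frames too far from the median before averaging) are my own illustrative assumptions.

```python
import math
from statistics import mean, median_low


def hz_to_midi(freq_hz):
    """Convert a frequency in Hz to a float MIDI note number (A4 = 440 Hz = 69)."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)


def segment_pitch_estimate(frames, energy_min=0.01, periodicity_min=0.8, max_dev=None):
    """Estimate one pitch (float MIDI) for a segment.

    frames: iterable of (freq_hz, energy, periodicity) tuples for one segment.
    energy_min, periodicity_min: hypothetical thresholds; tune them per sound.
    max_dev: optional tolerance in semitones around the median (an assumption,
             not necessarily what the original patch does).
    Returns None when no frame survives the thresholds.
    """
    # Step 1 and 2: drop silent frames, then drop frames with low periodicity.
    kept = [
        hz_to_midi(f)
        for f, energy, periodicity in frames
        if f > 0 and energy >= energy_min and periodicity >= periodicity_min
    ]
    if not kept:
        return None

    # Optional step: keep only frames close to the median pitch.
    # median_low returns an actual data point, so the list never becomes empty.
    if max_dev is not None:
        centre = median_low(kept)
        kept = [m for m in kept if abs(m - centre) <= max_dev]

    # Step 3: average whatever is left.
    return mean(kept)
```

Calling `segment_pitch_estimate(frames)` once per segment gives the constant value that would be stored alongside each marker. Setting `max_dev` to something like 1.0 makes the average less sensitive to octave errors.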

beller · June 19, 2023, 9:14am

Hi Chris !

Following !!!

Sounds cool

ctrapani · May 2, 2024, 8:16am

Hello Segmenters — What I have been working on lately is an implementation of the same algorithmic approach I was using with MuBu, but now in the bach / ears environment. I am visualizing pitch estimates, using various algorithms via ears.essentia~, for every frame in a bach.roll and aiming to use that data to refine both pitch-based marker placement and the pitch estimate for each given segment — which can be calculated according to the FFT data or by slicing the buffer at marker points and re-analyzing for new estimates. Finally, I have made it possible to write both pitch estimates and the certainty estimation into the marker labels and embed them in a sound file, so that they can be recalled and read elsewhere.
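To illustrate the last step in a text form, here is a small Python sketch of how a pitch estimate and a certainty value could be packed into a marker label string and read back later. The `pitch=...;conf=...` layout and both function names are hypothetical. They are not the format written by the attached patch, only one possible convention.

```python
def encode_label(pitch_midi, certainty):
    """Pack a float MIDI pitch and a certainty value into one label string.

    The 'pitch=..;conf=..' layout is a made-up convention for illustration.
    """
    return f"pitch={pitch_midi:.2f};conf={certainty:.2f}"


def decode_label(label):
    """Recover (pitch_midi, certainty) from a label written by encode_label."""
    # Split "key=value" pairs separated by semicolons into a dictionary.
    fields = dict(item.split("=", 1) for item in label.split(";"))
    return float(fields["pitch"]), float(fields["conf"])
```

Keeping the label as plain `key=value` text means it stays readable in any editor that displays marker names, at the cost of rounding the values to two decimals.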

I find the process fulfilling and the results sometimes good, sometimes less convincing — so I would love to have some interaction and feedback, any clues about how to get closer to the “holy grail” pitch estimator…

Thanks!

pitch-estimate-ears.maxpat (1.9 MB)