In this sound file, markers are placed very precisely using MEL Spectral Flux segmentation. However, the Frequency Mean estimate within these markers is not useful. Is there some way to improve the frequency calculations, for example by discarding silences or values beyond a certain delta / threshold? Many thanks
Update: I’ve come up with a bit of a workaround for pitch estimation. Once markers have been assigned (using MEL segmentation), the yin~ data for each segment is scanned and filtered: first with an Energy threshold (removing silences), then with a Periodicity threshold to remove the noisier frequency estimates. A median pitch estimate is taken, and all data that passes the thresholds is averaged to get a “pitch estimate” (in float MIDI values), which is then added to the MuBu track as a constant. From this point, markers can be selected using this “pitch estimate” as a target, and the results give a far more consistent pitch impression than FrequencyMean can provide. Short demo (one buffer):
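For anyone who wants to try the idea outside Max, here is a rough Python sketch of the per-segment filtering described above. It is not the patch itself: the frame layout `(freq_hz, energy, periodicity)`, the function names, and the two threshold values are assumptions made for illustration, and it simply returns both the median and the mean of the frames that survive both thresholds.

```python
import math
from statistics import mean, median


def hz_to_midi(freq_hz):
    """Convert a frequency in Hz to a float MIDI note number (A4 = 440 Hz = 69)."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)


def segment_pitch_estimate(frames, energy_min=0.01, periodicity_min=0.8):
    """Estimate one pitch for a segment from per-frame analysis data.

    frames: iterable of (freq_hz, energy, periodicity) tuples (assumed layout).
    energy_min, periodicity_min: hypothetical thresholds, to be tuned by ear.
    Returns None if no frame passes both thresholds.
    """
    kept = [
        hz_to_midi(freq)
        for freq, energy, periodicity in frames
        if freq > 0                          # guard against log2 of zero
        and energy >= energy_min             # drop silent frames
        and periodicity >= periodicity_min   # drop noisy estimates
    ]
    if not kept:
        return None
    return {"median": median(kept), "mean": mean(kept), "count": len(kept)}


# One segment: three usable frames, one silent frame, one noisy frame.
segment = [
    (440.0, 0.50, 0.95),
    (440.0, 0.40, 0.90),
    (880.0, 0.50, 0.90),
    (100.0, 0.001, 0.90),  # rejected: energy too low
    (3000.0, 0.50, 0.10),  # rejected: periodicity too low
]
result = segment_pitch_estimate(segment)
print(result)
```

In this toy segment the median ignores the octave jump while the mean is pulled towards it, which is one reason to compare the two before storing a single value per marker.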
Hi Chris !
Following !!!
Sounds cool
Hello Segmenters, what I have been working on lately is an implementation of the same algorithmic approach I was using with MuBu, but now in the bach / ears environment. I am visualizing pitch estimates for every frame in a bach.roll, using various algorithms via ears.essentia~, and aiming to use that data to refine both the pitch-based marker placement and the pitch estimate for each segment. That estimate can be calculated either from the FFT data or by slicing the buffer at the marker points and re-analyzing each slice. Finally, I have made it possible to write both the pitch estimate and the certainty estimate into the marker labels and embed them in a sound file, so that they can be recalled and read elsewhere.
I find the process fulfilling and the results sometimes good, sometimes less convincing — so I would love to have some interaction and feedback, any clues about how to get closer to the “holy grail” pitch estimator…
Thanks!
pitch-estimate-ears.maxpat (1.9 MB)