
Easiest/Cheapest practices for classifying recorded sounds according to a few features

Dears,

starting to go a bit deeper with mubu here.

I’d need to know a bit more about how I could classify recordings.

Recording audio in a mubu track, ok.
Then, I’d like to trigger some analysis to know if the recording is

  • speech,
  • a sound which is not speech,
  • music, why not

Obviously, these are not disjoint sets…
Music could contain speech, etc.

Without trying to do that in a VERY HIGHLY ACCURATE manner, how could I do that with pipo?

The main question here is: what features/values should I analyze to do that?

Any ideas/advice would be much appreciated :slight_smile:

This is still an open research question, but as a first step, mubu.gmm can help here with the classification.
Descriptor statistics (e.g. stddev) from yin and the other descriptors should be taken over longer windows with pipo.chop. Maybe also try mfcc and delta values.
And beware of rap music :wink:
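
To make the "statistics over longer windows" idea concrete outside of Max, here is a minimal Python sketch. It is not MuBu or PiPo code: `chop_stats` and `deltas` are hypothetical helper names, and the input is assumed to be a plain list of per-frame values for a single descriptor (for example one yin output or one mfcc coefficient). It only illustrates the kind of per-window mean/stddev summary that one would then feed to a classifier.

```python
import statistics


def deltas(frames):
    """Frame-to-frame differences of a descriptor track (hypothetical helper)."""
    return [b - a for a, b in zip(frames, frames[1:])]


def chop_stats(frames, window, hop):
    """Summarise a per-frame descriptor over longer windows.

    Returns one (mean, stddev) pair per window of `window` frames,
    advancing by `hop` frames. Incomplete trailing windows are dropped.
    This only mimics the idea of chopping a stream into segments;
    it is an assumption, not the actual pipo.chop implementation.
    """
    stats = []
    for start in range(0, len(frames) - window + 1, hop):
        chunk = frames[start:start + window]
        # Population stddev, since each window is treated as the whole segment.
        stats.append((statistics.fmean(chunk), statistics.pstdev(chunk)))
    return stats


if __name__ == "__main__":
    # Made-up descriptor track: a steady stretch followed by a fluctuating one.
    track = [1.0, 1.0, 1.0, 1.0, 0.0, 2.0, 0.0, 2.0]
    for mean, std in chop_stats(track, window=4, hop=4):
        print(f"mean={mean:.2f} stddev={std:.2f}")
```

Each window then becomes one feature vector (mean and stddev per descriptor, optionally also for the deltas), and a model such as a GMM is trained on those vectors per class rather than on the raw frames.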