
Speech to notation

I am trying to create a quick and dirty patch to approximate the rhythm and pitches of short speech samples.

Everything works fine except that omquantify continues to give me an error (see below) when trying to create a tree out of the durations derived from the SDIF file by SDIF->CHORD-SEQ.

I am assuming it has something to do with the really short durations included in the duration list. I tried to get around this problem by rounding off the durations. But I still get the same error.

Any ideas?

Best,

Federico

error.png

Dear Federico,

This is a known BIG BUG in the output of this function: it outputs zero durations.
It has been reported.
If you send the patch, I might be able to help by sending you another one with an approximate quantification.

Best
K

Dear Karim,

Thank you for your reply. I got around the problem by flattening the list of durations coming out of the chord-seq before rounding it. The problem I have now is that there is a significant difference between the result shown in the voice object and what is displayed (and what can be heard) in the chord-seq object. I find that if I save the chord-seq contents into a MIDI file and then open it in Finale, I get a much more accurate representation of the original. Actually pretty awesome for such a simple procedure! But I am wondering if there isn't a better way to do this directly in OM.
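For readers without the patch, the flatten-then-round workaround can be sketched outside OM in a few lines of Python. This is only an illustration under assumptions: the durations are taken to be in milliseconds and nested one list per chord, and the names `flatten`, `clean_durations`, `grid_ms` and `min_ms` are hypothetical, not OM functions. Clamping to a minimum value is an extra step added here so that rounding cannot produce the zero durations that trip up quantification.

```python
def flatten(items):
    """Recursively flatten nested lists, e.g. [[120], [80, 40]] -> [120, 80, 40]."""
    flat = []
    for item in items:
        if isinstance(item, (list, tuple)):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat


def clean_durations(nested_durs, grid_ms=10, min_ms=10):
    """Flatten, round each duration to a grid, and clamp to a minimum.

    grid_ms and min_ms are illustrative defaults, not values from the patch.
    """
    cleaned = []
    for dur in flatten(nested_durs):
        rounded = int(round(dur / grid_ms)) * grid_ms
        # A very short duration rounds to 0; clamp it so no zero survives.
        cleaned.append(max(rounded, min_ms))
    return cleaned
```

For example, `clean_durations([[123], [4, 87]])` turns the 4 ms duration into 10 ms instead of 0.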

I am attaching the “improved” patch.

Best,

Federico

speech-to-notation1.omp (41.6 KB)

Dear Federico,

Great, OK, here it is:
First, your quantification won’t work using ldurs from the chord-seq. Why? Because if you open the chord-seq and put it in durations view, you’ll see that most of the durations overlap, so your rhythm will be completely wrong.
Next:
There seems to be a slight problem with the function sdif->chord-seq: if you check the lonset output, you’ll notice that the onsets are not sorted, meaning that most events are out of order.
So, to fix that first, I used your midifile -> chord-seq (cf. patch). There the events are sorted correctly.
Then, to strip the overlapping durations, I used a hidden function, “normalize-chord-seq”.
Finally, I used the onsets with x->dx, which gives me the set of durations between all onsets. (This is OK except when we have rests!)
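The three steps above (sort the events, strip overlaps, take the differences between onsets) can be sketched in Python as follows. This is a rough illustration, not the code inside the patch: events are assumed to be `(onset_ms, dur_ms, midicents)` tuples with distinct onsets, `strip_overlaps` is a hypothetical stand-in and does not claim to reproduce what normalize-chord-seq actually does, and `x_to_dx` simply mirrors the successive-difference behaviour of x->dx.

```python
def sort_by_onset(events):
    """Return the events ordered by onset time (first tuple element)."""
    return sorted(events, key=lambda event: event[0])


def strip_overlaps(events):
    """Shorten each duration so it ends no later than the next onset.

    Assumes distinct onsets; two events on the same onset would get duration 0.
    """
    ordered = sort_by_onset(events)
    result = []
    for i, (onset, dur, pitch) in enumerate(ordered):
        if i + 1 < len(ordered):
            gap = ordered[i + 1][0] - onset
            dur = min(dur, gap)
        result.append((onset, dur, pitch))
    return result


def x_to_dx(xs):
    """Successive differences: [0, 500, 800] -> [500, 300]."""
    return [b - a for a, b in zip(xs, xs[1:])]


events = [(500, 400, 6200), (0, 700, 6000), (800, 100, 6400)]
clean = strip_overlaps(events)
inter_onset_durs = x_to_dx([onset for onset, _, _ in clean])
```

Note that the inter-onset list has one element fewer than the event list, and that a silence between two notes is absorbed into the preceding duration, which is why rests are lost.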

Then I quantified in a very particular way to avoid grace notes. Unfortunately, OM does not support them for the moment, so you will get some false pitches because of the missing grace notes.
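One simple way to picture why avoiding grace notes costs pitches is the sketch below. It is an assumption-laden illustration in Python, not what omquantify or the patch does: onsets are snapped to a fixed grid (`grid_ms` is a hypothetical parameter), and any onset that lands on an already occupied grid point, which would otherwise become a grace note, is dropped.

```python
def quantize_without_grace_notes(onsets_ms, grid_ms=125):
    """Snap onsets to a grid and drop any onset that collides with the previous one.

    A collision is where a grace note would appear; dropping it loses that note.
    """
    snapped = []
    for onset in sorted(onsets_ms):
        q = int(round(onset / grid_ms)) * grid_ms
        # Keep the onset only if it falls on a new grid point.
        if not snapped or q > snapped[-1]:
            snapped.append(q)
    return snapped
```

With a 125 ms grid, onsets at 130 ms and 140 ms both snap to 125 ms, so only one of them survives and the other note's pitch disappears from the result.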

Best
K

NOTE: I am currently working on quantification code that allows all this with a more realistic rendering. I know the voice we have in the example above is extremely complicated and unrealistic…

speech-to-notation1b.omp (96.6 KB)

Thank you. This is great!

Federico

Dear Karim,

I am having some problems with the patch you helped me with.

First of all I now get an error (see attachment) each time I evaluate it.

Second, the MIDI file that is generated is only a few seconds long, considerably shorter than the original.

Any ideas?

Thank you for your help!

Federico

error.png

Dear Federico,

I don’t see a problem here. Maybe if you send me the audio file, I can cross-check with the original sound file.

Best
K