Questions on Douglas Eck's Magenta presentation

beller · March 19, 2021, 9:58am

Matthias Jung to Everyone (10:38 AM)
How do you think about modelling social creativity as mentioned in your opening?

Kei Nakano to Everyone (10:39 AM)
Can you create new music from MAGENTA?

Francesco Ferrulli to Everyone (10:42 AM)
Speaking about the learning process, would it make sense to feed the algorithm a series of music pieces of more or less increasing difficulty, as would happen for a real human

antoinepetit to Everyone (10:42 AM)
Is there a way to access the structures and structural relationships that Music Transformer picks up in the data? To understand how the model understands the data.

Graham Hadfield to Everyone (10:42 AM)
The training stage seems to be the least accessible part for even semi-technical users. Given that the networks are only going to be able to offer options based on the material they've been directed at (am I correct?), is there much work going on to make the training stage more accessible for users/creators?

Kei Nakano to Everyone (10:42 AM)
Speaking of AI and ML, there is strong AI and weak AI. Considering Magenta,

shangyun wu to Everyone (10:43 AM)
Thank you for your amazing talk. I tried to play with Magenta for a while. To be honest, I got stuck on the module downloading for a while. Therefore, I am wondering whether you will develop Magenta into a programming language, like the p5js project, in the future?

Kei Nakano to Everyone (10:43 AM)
which one right now?

Shahan Nercessian to Everyone (10:44 AM)
Creators are used to using synths/plugins with pristine fidelity and a low CPU footprint. That said, what are the gaps that still need to be bridged for wider adoption of new ML-based audio processors, and what types of approaches is Magenta considering to address them?

Andrew Robertson to Everyone (10:45 AM)
What is the biggest difficulty in traversing the different levels, from the low-level audio creative tools, to higher level notes and musical phrases? What are the limits to the way these are modelled, such as the piano transformer? In particular, how does the high level structural model influence lower levels such as timing and nuance?

Robert Lisek to Everyone (10:45 AM)
How can adaptation be implemented, not by building new interfaces, but by implementing:
Multi-task learning
Few-shot learning
Self-supervised learning
Universal computation
Meta-learning

Jason Palamara to Everyone (10:46 AM)
Are any of these tools usable for real-time music creation? Is that something Magenta is working on for the future?

douglaseck · March 19, 2021, 10:20am

Yes in the simplest sense… you can create something technically “new” using a random number generator. There is a deeper issue, which is that of musical novelty: can we create something new and good? I think so, though this is inherently hard to evaluate.

douglaseck · March 19, 2021, 10:22am

Yes! This is an active area of research called “curriculum learning.” Here’s a link to an early paper on the topic: https://ronan.collobert.com/pub/matos/2009_curriculum_icml.pdf. It has proved to be hard to make work well in practice, but I think it’s still very promising.
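As a rough illustration of the idea (not the method from the linked paper, and not anything Magenta ships), a curriculum can be sketched as sorting pieces by some difficulty score and widening the training pool stage by stage. The `difficulty` score below is a made-up proxy, and pieces are assumed to be plain lists of MIDI pitch numbers:

```python
import random


def difficulty(piece):
    # Hypothetical difficulty proxy (an assumption for this sketch):
    # longer pieces that use more distinct pitches count as harder.
    return len(piece) * len(set(piece))


def curriculum_pools(pieces, num_stages, seed=0):
    """Yield one training pool per stage, growing from the easiest
    pieces to the full set."""
    ordered = sorted(pieces, key=difficulty)
    rng = random.Random(seed)
    for stage in range(1, num_stages + 1):
        cutoff = max(1, len(ordered) * stage // num_stages)
        pool = ordered[:cutoff]  # slicing copies, so `ordered` is untouched
        rng.shuffle(pool)        # shuffle within the stage before training
        yield pool


pieces = [
    [60, 62, 64, 65, 67, 69, 71, 72],  # full scale
    [60, 60, 60],                      # one repeated note
    [60, 64, 67, 60],                  # simple arpeggio
]
for stage, pool in enumerate(curriculum_pools(pieces, num_stages=3), start=1):
    print(stage, len(pool))
```

In a real setup each pool would feed a training loop; choosing a difficulty measure that actually helps is the hard part in practice.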

douglaseck · March 19, 2021, 10:26am

To a certain extent. Transformers work by “attending” to different parts of the input. You can visualize that attention in different ways. This is an active area of research. For the general problem, and an approach to understanding large language models (similar to Music Transformer), see the Google AI Blog post “The Language Interpretability Tool (LIT): Interactive Exploration and Analysis of NLP Models”. Perhaps more relevant for our discussion, you might look at the video in our blog in the section “Visualizing Self-Reference”. The blog post is here: http://g.co/magenta/music-transformer
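To make “attending” a bit more concrete, here is a minimal sketch of the attention weights one could inspect. It uses plain scaled dot-product attention with a causal mask; Music Transformer itself uses relative self-attention, which is not reproduced here, and the toy vectors are invented for illustration:

```python
import math


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(queries, keys):
    """Row i holds how strongly position i attends to positions 0..i.
    Later positions are masked out and get weight 0."""
    d = len(keys[0])
    rows = []
    for i, q in enumerate(queries):
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
            for k in keys[: i + 1]
        ]
        rows.append(softmax(scores) + [0.0] * (len(keys) - i - 1))
    return rows


# Toy "event embeddings": positions 0 and 2 are identical, so position 2
# attends to position 0 as strongly as to itself, and less to position 1.
events = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
weights = attention_weights(events, events)
for row in weights:
    print(" ".join(f"{w:.2f}" for w in row))
```

Plotting such a matrix as a heat map, one row per generated event, is one simple way to see which earlier events a model is referring back to.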

douglaseck · March 19, 2021, 10:40am

Sorry for your bad experience. We need to do a much better job of supporting our developer community! We do have magenta.js, which can be used alongside p5js. Have a look at our Getting Started page.

douglaseck · March 19, 2021, 10:49am

We have to offer something genuinely different and useful. Otherwise why bother switching? The real power of ML is (arguably) its ability to learn by example. That learning can be thought of as warping data. See “Neural Networks, Manifolds, and Topology” on colah's blog. A model like DDSP in some way warps audio data in novel ways. That has great potential for creative purposes. What we need is to use a model like DDSP in new ways. Maybe even in ways that “break” the model. This ties into Philippe’s excellent ongoing talk!

douglaseck · March 19, 2021, 10:52am

I think the hardest part is designing the hierarchy in the first place, because if you get it wrong there’s really no way to “learn your way around it”. In other words, if we posit that musical notes (having start time, end time, loudness and pitch) matter so much that we build a musical note layer into our representation, we’ll be forced to push all of our music through this layer. This opens some doors but closes others.
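A small sketch of what “pushing all of our music through this layer” can mean. The `Note` type and the `to_note` helper are hypothetical, written only for this illustration: once a performance is reduced to notes, a steady tone and a tone with vibrato become indistinguishable.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Note:
    pitch: int      # MIDI pitch number
    start: float    # seconds
    end: float      # seconds
    velocity: int   # loudness as MIDI velocity


def to_note(pitch_curve_hz, start, end, velocity):
    """Collapse a continuous pitch curve (in Hz) into a single note.
    Everything that happens inside the note is discarded."""
    mean_hz = sum(pitch_curve_hz) / len(pitch_curve_hz)
    midi_pitch = round(69 + 12 * math.log2(mean_hz / 440.0))
    return Note(midi_pitch, start, end, velocity)


steady = [440.0, 440.0, 440.0, 440.0]
vibrato = [430.0, 450.0, 435.0, 445.0]  # same average, audibly different

a = to_note(steady, 0.0, 1.0, 80)
b = to_note(vibrato, 0.0, 1.0, 80)
print(a == b)  # the note layer cannot tell the two performances apart
```

The note layer makes symbolic modelling straightforward (that is the door it opens), but anything a model learns downstream of it never sees the vibrato.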

douglaseck · March 19, 2021, 10:55am

Music Transformer is a bit computationally expensive, so we’ve yet to develop a real-time plugin. DDSP does work in real time. There is the IRCAM Pure Data version I referred to in my talk (and Philippe is mentioning now). We also have JavaScript and VST versions under active development, as well as g.co/magenta/studio, which works in Ableton, though not in real time yet.