Speaking in Hamburg to a terrific group of assembled locals from a variety of design backgrounds. And yes, this is the other part of my life behind me. I just seem to generally skip the years 1700-1985. Go figure.

The history of music and the history of music notation are closely intertwined. Now, digital languages for communicating musical ideas between devices, users, and software, and storing and reproducing those ideas, take on the role notation alone once did. Notation has always been more than just a way of telling musicians what to do. (Any composer will quickly tell you as much.) Notation is a model by which we think about music, one so ingrained that even people who can’t read music are impacted by the way scores shape musical practice.

All of this creates a special challenge. Musical notational systems traditionally evolved over centuries; now we face the daunting question of how to build such a language overnight.

This question has been a topic I’ve visited in a couple of talks, first here in New York at in/out fest last December, then most recently for a more general audience at RSVP, a new conversation series in Hamburg, Germany hosted by the multi-disciplinary design studio Precious Forever. (See photo at top, by which we can prove that the event happened. Check out more on the event and how the Precious gang hope this will inspire new interchange of ideas in Hamburg – something perhaps to bring to your town.)

What I’ve learned in talking to people at those events is that music notation matters. It’s more relevant to broad audiences than even those audiences might instinctively think. The most common lingua franca we have for digital music storage, MIDI, is woefully inadequate.

But perhaps most importantly: replacing MIDI’s primitive note message is far from easy. The more you try to “fix” MIDI, the more you appreciate its relative simplicity. And engineering new solutions could take re-examining assumptions Western music notation has made for centuries.

Musical notation and culture

A recent PSP version of the standard Harmonix/GuitarFreaks interface, Rock Band Unplugged. Photo courtesy Harmonix.

Explaining the importance of notation to expert musicians is easy. But to convey its importance to lay people, you need look no further than the game interface developed by Harmonix for the hit titles Guitar Hero and Rock Band (and in turn descended from a similar interface paradigm used in the Japan-only Konami GuitarFreaks). These games demonstrate that, even among non-musician gamers, certain received wisdoms from Western notation endure. (In fairness, many of the designers of music games have a fair bit of musical experience, but the fact that their work is received by audiences in the way it is nonetheless speaks volumes.)

The Guitar Hero interface actually is a Western musical score, rotated 90 degrees to make it easier to see how the events on-screen are matched to game play input. (For visual effect, the “track” is also rotated away from the screen, so that events further in the future recede into the background – a bit of visual flair that helped differentiate Harmonix from flatter-looking Japanese games.)

Whatever the rotation, the assumptions of the game screen itself are rooted in notation. Pitch is displayed along lines and spaces, just as on a score. Rhythm is displayed along a metrical grid, which reads as a linear track. Not coincidentally, I believe, when Harmonix has deviated from this formula, their titles have tended to be less successful. More sophisticated interactions in titles like Amplitude and Frequency (and the iPod game Phase) were big hits among gamers, but less so among the general public, perhaps in part because they require a more abstract relationship to the music.


Musical notes as represented on the score are embedded in our consciousness – even if you can’t read a note. Photo (CC-BY-SA) Quinn Dombrowski.

Games are just one example, of course. Musical scores reflect basic cultural expectations, and in turn shape the music that people in that culture produce. As with most Western languages, text flows from left to right and top to bottom. Ask people to describe pitch in any culture that uses this notational system, and they’ll use the notions of “up” and “down,” “higher” and “lower” – even though these metaphors are meaningless in terms of sound. (Indonesian culture, for instance, gets it more physically correct, by describing what we call higher pitches as “smaller” and deeper pitches as “larger,” as they are in gongs.) And music in Western cultures is also deeply rooted in a grid, in 4/4 time and equal subdivisions. It wasn’t always so: even in the West, prior to the advent of notation of these meters, metrical structures flowed more freely.

It’s little surprise, then, that some of the biggest successes in electronic musical instruments have adopted the same conventions. From the Moog sequencer to the Page R editor on the Fairlight CMI sampler to the array of buttons on Roland’s grooveboxes, rhythmic sequencers that follow the grids devised in Western music notation are often the most popular. Even if the paradigm of the interface is one degree removed from the notation, the assumptions of how rhythms are divided – and thus the kinds of patterns you produce – remain.

Nowhere is this more true than in MIDI. MIDI is itself a kind of notational system, on which nearly all interfaces in software and hardware have been based in the two and a half decades since its introduction.

Yes, even the step buttons on machines like the Roland TR-808 map to Western notational divisions. Even a 13th-century monk would find them somewhat familiar. Here, translating from Reason’s ReDrum step sequencer to notation. Photo (CC-BY-SA) Warren B, taken at Agnes Y. Humphrey School (PS 27) in Brooklyn, NY.

MIDI, keyboards, and piano rolls: An incomplete “standard”

The first thing to understand about MIDI is that it began life as a keyboard technology. A complete history of MIDI should wait for another day, but even as its early history is told by the MIDI Manufacturers Association, it’s a technology for connecting keyboard-based synthesizers, not a solution to the broader question of how to represent music in general.


The first synth to acquire MIDI was the Sequential Circuits Prophet-600, thanks to father of MIDI Dave Smith. And as a result, MIDI fits the 600 and other instruments like it pretty well. That doesn’t mean it’s the right tool for every job. Photo (CC-BY-SA) Brandon Daniel.

Many of the tradeoffs in MIDI, though, were made long before the 1980s or the invention of digital technology. When the 19th-century creators of the player piano needed not only standardization but reproducibility – before the advent of recording, the power to recreate entire musical performances – they turned to the piano as a way of modeling musical events. Indeed, the first player pianos quite literally reproduced the process of playing a piano, using wooden, mechanical fingers to strike notes on the keys just as a human would, before that mechanism was replaced with the internal players familiar to us today. What these inventors found in the piano was an instrument that, in the name of accessibility, aligned pitch to a simple grid.

The piano is a beautiful instrument, but its great innovation – the grid of its black and white keys – is also its greatest shortcoming. That grid is an imperfect model even of Western musical pitches, let alone other cultural systems. The 12-tone equal-tempered tuning used on modern pianos makes tuning multiple keys easier, but only by way of compromises. Even a modern violinist or singer may differentiate between the inflection of a G flat and an F sharp, based on context, but to the piano, these pitches are the same. And tuning is only the beginning. A piano note begins by being “switched” on and ends by being “switched” off – no bending or other events within that pitch, as there would be on most other instruments.
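
To put a number on that compromise – a small worked example of my own, not from the original text – in Pythagorean tuning the F sharp you reach by stacking fifths upward and the G flat you reach by stacking fifths downward differ by the Pythagorean comma, roughly 23.5 cents. A string player can act on that difference; a 12-key octave simply cannot represent it.

```python
import math

# Worked example (my own illustration, assuming Pythagorean tuning):
# enharmonic notes that a piano merges into a single key are genuinely
# different pitches when built from pure fifths.
def cents(ratio: float) -> float:
    return 1200 * math.log2(ratio)

f_sharp = (3 / 2) ** 6 / 2 ** 3   # six pure fifths up from C, folded into one octave
g_flat = 2 ** 4 / (3 / 2) ** 6    # six pure fifths down from C, folded back up
print(cents(f_sharp / g_flat))    # ~23.46 cents: the Pythagorean comma
```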


Open question – is it possible (and I’m speaking as a trained pianist here) to deconstruct the keyboard? Photo (CC-BY-SA) Hoder Slanger.

It’s little wonder, given MIDI’s origins as a protocol for communicating amongst keyboards, that the editing view most common in music software is the piano roll, labeled as such. The piano roll is the perfect paradigm for sequencing events played on a keyboard, but that doesn’t mean it’s the best language for describing all music. And the obligation of a digital protocol is actually greater than that of musical notation, because there’s no human being at the other end to fill in missing expression and context.

Consider what’s missing in MIDI:

  • Pitch reference: By convention, MIDI note 60 is C4. However, musical practice internationally lacks a consistent standard for what the tuning of C4 is, and any number of variables can interfere, from independent tuning tables to the use of the pitch bend to the activation of an octave transpose key.
  • Pitch meaning: MIDI note values use an arbitrary pitch range from 0 to 127, a hypothetical 128-key piano, which itself makes no sense. 4? 8? 15? 16? 23? 42? The numbers themselves don’t mean anything.
  • Pitch resolution: Because of the 0-127 resolution constraints, to get notes in between the pitches, you need a series of separate messages like pitch bend, giving you two values with only an incidental relationship to one another. Since pitch bend range is kept in yet another message, the results are confusing and un-musical, far more complex than they need to be. (Why couldn’t 60.5 simply be a quarter-tone above 60? See the sketch after this list.)
  • Real expression: Events between note on and note off are represented independently as control change values. But that causes problems, because it means there’s no standard way to represent something as simple as a musical glissando. On a synth, making an expression (like twisting a knob or turning a wheel) separate from a note (pressing a key) makes sense. But that doesn’t make musical sense, and it doesn’t match most non-keyboard instruments. Only aftertouch is currently available, and that again assumes a keyboard and doesn’t expose pitch relationships created by adding the data.
  • Musical representations of tuning and mode: The MIDI Tuning extensions require that you dump tuning information in fairly unstructured System Exclusive binary dumps. The standard itself is in some flux, and at best, its reliance on byte messages means that it’s not something a human being can read. And it still must be aligned with 128 otherwise arbitrary values. It’ll work, but it only makes sense on keyboards, and even then, it’s not terribly musical. Looking at number 42 in your sequencer, you’d have no idea of the tuning behind it, or the position in a mode – something any rational musical notational system would make clear.
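
To illustrate the pitch-resolution point above, here is a rough sketch of my own – the function names are hypothetical, not part of any spec. A fractional note number would carry its own meaning, whereas MIDI needs a note number, a 14-bit pitch bend value, and a bend range stored in a separate RPN message before the result means anything.

```python
# Sketch only, assuming A4 = 440 Hz and 12-tone equal temperament as the reference.
A4_NOTE = 69
A4_HZ = 440.0

def fractional_note_to_hz(note: float) -> float:
    """One value, one meaning: 60.5 is simply a quarter-tone above note 60."""
    return A4_HZ * 2 ** ((note - A4_NOTE) / 12)

def midi_note_and_bend_to_hz(note: int, bend: int, bend_range_semitones: float = 2.0) -> float:
    """What MIDI requires instead: a 0-127 note, a 14-bit bend value
    (0-16383, centre 8192), and a bend range that lives in yet another message."""
    offset = (bend - 8192) / 8192 * bend_range_semitones
    return fractional_note_to_hz(note + offset)

print(fractional_note_to_hz(60.5))           # ~269.3 Hz
print(midi_note_and_bend_to_hz(60, 10240))   # the same pitch, assembled from three values
```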

Ironically, it was this very set of constraints that early innovators on the Buchla and Moog synthesizers hoped to escape. They were fully aware that the very genius of the keyboard was restricting musical invention. Analog control voltage, the basic means of interconnecting equipment prior to digital tech, was more open-ended than MIDI, which replaced it. But that’s not to say it was better. Standardization is an aid in communication, as is the ability to describe messages. The question is, how can you do both? How can you be open-ended and descriptive at the same time?


We see notation everywhere we look, but that could be a good thing. Photo (CC-BY-SA) Ben XU / Hongbin XU.

How do you build a new system?

Deconstructing is easy; constructing is hard. We certainly have the ability to send more open-ended messages and higher-resolution data; that’s not a problem. (Even by the early 80s when MIDI was introduced, its tiny messages and slow transmission speeds were conservative.) We also have Open Sound Control (OSC), which has some traction and popularity, including near-viral use on mobile devices and universal support in live visual applications. It’s telling that OSC is itself not really an independent protocol in the sense that MIDI is, but is built on existing standards like UDP and TCP/IP. 2010 is, after all, not 1984.

The hold-up, I think, is simply the lack of a solid proposal for how to handle musical notes. And there are plenty of distractions. It’s tempting to throw out the simplicity of MIDI’s note on and note off schema, but it’s partly necessary: with a live input, you won’t know the duration of a pressed key until that key is released. It’s equally tempting to cling to Western musical pitches, even though those pitches themselves lack solid standardization and don’t encompass musical practices in the rest of the world. (12-tone equal temperament is a recent invention even in the Western world, and one that doesn’t encompass all of our musical practice. World tunings should best be described not by majority, but plurality, anyway – have a look at the current demographics of Planet Earth.)

One solution is simply to express musical events by frequency. That’s not a bad lowest common denominator, or a way to set the frequency of an oscillator. As a musical representation, though, it’s inadequate. It’s simply not how we think musically. The numbers are also unpleasant, because we perceive pitch roughly logarithmically. Pop quiz:

Can you do logarithms in your head? Yes or no?

Can you count?

MIDI gets it half right by using numbers, but then it’s hard to see octave equivalence, another essential concept for perceiving pitch. MIDI note 72 is probably equivalent to MIDI note 60… assuming 12 steps per octave. Or it might not be.
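
A tiny sketch of my own makes the point concrete: octave equivalence is a log2 relationship, invisible in raw frequency values, and MIDI’s integers only reveal it if you already assume 12 steps per octave.

```python
import math

def octaves_apart(f1_hz: float, f2_hz: float) -> float:
    # Perceived pitch distance is roughly logarithmic in frequency
    return math.log2(f2_hz / f1_hz)

print(octaves_apart(440.0, 880.0))    # 1.0 -> same pitch class, one octave up
print(octaves_apart(440.0, 466.16))   # ~0.083 -> musically opaque as a raw number

print((72 - 60) % 12 == 0)            # "octave equivalent" only if 12 steps per octave holds
```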

If you need a common denominator that covers a variety of musical traditions, mode (or more loosely, pitch collection) and register aren’t a bad place to start. I don’t think a system needs to be terribly complex. It could simply be more descriptive than MIDI is – while learning from the things MIDI does effectively.

Consider a new kind of musical object, described over any protocol you choose; a rough sketch in code follows this list. It would ideally contain:

  • Mode/pitch collection: As with MIDI and the MIDI tuning tables, tuning would need to be defined independently, but it can be done in a musical, human-readable way. It then becomes possible even to define modes that have different inflections based on context, as with pitches that are slightly different in ascending and descending gestures (common in many musical systems).
  • Relative degree: a notation like “1 1 2 3 5 6” can work in any musical language. You simply need to know the active mode or pitch collection.
  • Register: Instead of conflating register and scale degree, you could simply define an octave register and starting frequency. This retains modal identities and octave equivalence, and makes relative transposition easy to understand. (A “transposition” message could be defined as an actual message, which is more musically meaningful.)
  • Standardized inflections, connected to pitch: Pitch bends and glissandi should be relative to a specific note, because notes can have pitches that bend around their relative scale degree. (Think of a singer bending just below a note and into the actual pitch. These aren’t independent events.) A trombonist would never have invented MIDI notes. They would likely have immediately turned to the question of how to universally describe bending between notes.
  • Yes, frequency: There will be times when directly referring to frequency makes sense, and that should be possible, as well.
  • Relative duration: Musical notation, regardless of musical culture, uses some kind of relative indication of duration. Only machines use raw clock values. The result is that it’s possible to make musically meaningful changes in tempo and have durations respond accordingly. And whereas note on and note off events make sense on input, a musical event would not logically separate these events; there’s some notion of an event with a beginning, middle, and end. If you sing an ‘A,’ that’s one event, with a duration, not an independent beginning of the note and end of the note.
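
Here is a rough sketch of what such an object might look like, written as plain data structures rather than in any particular protocol. Every field name is hypothetical – this is a thought experiment, not a proposed standard.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass
class Mode:
    """A pitch collection, defined once and referred to by events."""
    name: str
    degrees_cents: Sequence[float]   # offset of each degree from the mode's reference pitch

@dataclass
class MusicalEvent:
    mode: Mode
    degree: int                      # relative degree within the mode ("1 1 2 3 5 6")
    register: int                    # octave register above a declared reference frequency
    start_beats: float               # position in musical time, not clock time
    duration_beats: float            # relative duration, so tempo changes stay meaningful
    inflection_cents: float = 0.0    # bend relative to *this* note, not a global wheel
    frequency_hz: Optional[float] = None  # escape hatch when raw frequency is what you mean

major = Mode("major", [0, 200, 400, 500, 700, 900, 1100])
event = MusicalEvent(mode=major, degree=5, register=4,
                     start_beats=0.0, duration_beats=1.5, inflection_cents=-20.0)
```

Whatever the exact fields, the point is that the event refers to a mode and register rather than to an absolute integer, so transposition, tuning, and transcription to paper all have something musical to work from.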

Far from replacing existing standards for music notation, this kind of standard could interchange more gracefully with printed notation. If you import a standard MIDI file into notation software, you get results that are typically full of errors, because the SMF lacks musical information about the events it contains. With more of that information stored, and stored in standard ways, translating to paper would become vastly more effective.

I’m sure attempts to model this in OSC have been made before, but it’s worth compiling those ideas and resurrecting the discussion.


Input could mean … anything. And that’s the point (and nothing new). Reactable at Creators Series, photo (CC-BY) Alex Barth.

What about input?

Ah, you say, but then, let’s go back to the keyboard. None of these events makes sense on a keyboard. When a note is pressed, you don’t know how long it’ll last. You don’t know the modal degree of a particular, arbitrarily played note.

I was stuck on the same problem, until I realized what I had been taking for granted: MIDI conflates two very separate processes. It makes input and output the same. Musical notational systems have never done that. When you look at a score, it’s a set of musical ideas, given meaning and context. If you record a series of events from an input, those events don’t immediately have meaning or context. It’s confusing the mechanical with the musical. It’s the reason MIDI is not just like a player piano – it is a digital player piano.

Separate out the issue of recording mechanical input events, and you can have a system that’s more flexible. That system should fit whatever the input is. An organ, a shakuhachi, a didgeridoo, and an electric guitar aren’t the same thing. Why would they be represented with the same set of input events? That’s pretty daft.

Look at it this way: imagine if instead of being invented by synthesizer people, Aeolian Harp players had invented MIDI. (It’s not so far-fetched: the Aeolian Harp has a millennia-long history and was once quite popular.) An Aeolian Harp sequencer would feature elaborate, high-resolution data recording for wind pressure relative to different strings. It might measure, even, wind direction. In fact, it’d look a lot more like meteorological data than musical data per se. It certainly wouldn’t involve integers from 0 to 127.

This should lead to a simple conclusion with profound consequences:

Physical input and musical output should not be the same thing.

One of the advantages of a protocol like OSC (or any open, networked, self-described protocol) is that it can be open-ended and descriptive, meeting our earlier challenge. For instance, using a hierarchy of meta-data attached to the message, you could describe a set of variables relevant to wind input. If you wanted to transcribe the results in musical terms, you could then use a musical notation, as above – one that used musical identity attached to the resulting frequencies, as in relative modal pitch and rhythmic duration. But the input would be a separate problem. That’s a far piece from MIDI, which is adequate neither as a complete description of the input device, nor of any kind of resulting musical system.
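
As a sketch of what that separation could look like on the wire – the addresses and units below are invented purely for illustration, not an existing namespace – the physical input and its musical transcription travel as different messages over the same transport. This assumes the python-osc library, simply as a convenient way to send OSC from Python.

```python
from pythonosc.udp_client import SimpleUDPClient

client = SimpleUDPClient("127.0.0.1", 9000)

# The physical input, described on its own terms (hypothetical Aeolian Harp sensor data):
client.send_message("/aeolian/string/3/wind_pressure_pa", 12.7)
client.send_message("/aeolian/wind/direction_deg", 212.0)

# A separate, musical transcription of the result: mode, degree, register, duration in beats
client.send_message("/score/event", ["dorian", 5, 4, 1.5])
```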

But wait a minute – how is there a standard? How do you standardize something that could include an Aeolian Harp, a vuvuzela, and a bagpipe? Welcome to the problem of music. Music is by its very nature resistant to standardization, because the possibilities of the physical world are so broad. This also suggests how input protocols (and output protocols) can go beyond musically-exclusive data. Again, we can turn back to MIDI as a model. MIDI was intended with specific applications in mind, with messages that referred to MIDI notes and filter cutoff. But that didn’t stop it from being warped to accommodate tasks well outside the standard, ranging from triggering videos to controlling amusement park robotic characters (literally). This suggests to me that what defines a standard protocol of this kind is not what is most strictly standardized, but what is most flexible.

The real challenge with something like OSC, then, is to come up with standardized ways of defining non-standardized events, and using some kind of reflection or remote invocation to allow devices or software that have never communicated before to handle unexpected messages intelligently. At the very least, they should give users clear, understandable options about the data they send and receive. This independent question has been one the OSC community has raised for some time. To me, all that remains is to make some compelling implementations and let the most effective solution evolve and win out. Recent reading on the topic (though this absolutely deserves a separate post, which I’ll get to soon):
Best Practices for Open Sound Control
Minuit : Propositions for a query system over OSC
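
To give a flavor of what those proposals are after – this is a hypothetical exchange of my own, not the syntax of either document above – a controller asks a device to describe part of its namespace, and the device answers with type, range, and units rather than an opaque controller number.

```python
from pythonosc.udp_client import SimpleUDPClient

device = SimpleUDPClient("192.168.1.20", 13579)   # hypothetical address of a synth

# The controller asks the device to describe one node of its namespace...
device.send_message("/query/namespace", ["/synth/filter/cutoff"])

# ...and the device might answer on the controller's own listening port with something like:
#   /reply/namespace  ["/synth/filter/cutoff", "float", 20.0, 18000.0, "Hz"]
# which a host could turn into a labelled, correctly scaled control without a manual.
```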

That’s a separate problem from how to make events musically meaningful. But that to me is the central revelation, and something MIDI completely misses: these are two separate problems, not one problem. Handle input events as input. If it makes sense in a sequencer to record them as musical events (like scale degree pitches), do that. If it makes sense to record them as a series of time-stamped, physical events, do that – but with actual information relative to what was recorded, so that the wind across an Aeolian Harp is recorded in a way that makes sense for that input. And when describing musical events, describe them in musical ways.

This isn’t relevant only to music communities, either: it’s relevant to anyone recording events in time. It’s part of the reason the “sound” needs to be dropped from OSC. MIDI is as specific as it is partly because the specification has messages too small to contain information describing what the events mean. We now have standard network protocols that do that, so they can include information about other kinds of events. There’s no reason someone monitoring water levels in their herb garden and someone recording a sousaphone solo couldn’t use some of the same underlying protocols. There’s also every reason they’d record different kinds of data content.


What’s possible? Everything. Music predates notation, meaning musical ideas can always come first – particularly with the open-ended, abstract world of software. If you have an idea, try it. Photo (CC-BY-ND) Kate Farquharson.

Promising venues and a call to action

There’s really no need to try to “replace” or “fix” MIDI – if MIDI has endured for a specific application, maybe it actually is well-suited to that application. I think it’s time, instead, to think about how new systems can encompass more musical meaning from our own traditions and traditions around the world, and how we can standardize broad ranges of events instead of trying to fit everything into narrow, rigid boxes.

There is every reason to believe new things can happen now, too. Whereas hardware standardization was once a slow process requiring the involvement of major manufacturers, we now carry programmable computers in our pockets as “phones” and learn to write embedded code in freshman college classes using $30 Arduino boards. If you want new hardware standards, you can literally make them yourself. We have the ability to share musical notation directly in a Web browser using standard descriptions, as covered here recently. And because browsers increasingly host distributed, networked applications, communicating in standard ways – as Web APIs do naturally – is becoming imperative.

But there’s one thing that makes me especially optimistic: you. Via the Web, we have instant access to your collective knowledge and experience. That means it’s a sure thing that all of us, collectively, know more about previous research in this area, previous ideas, and what has and hasn’t worked. We also have the opportunity to communicate with each other, to make ideas evolve, at least experimentally. That doesn’t remove the need for eventual standardization, but good standards follow practice, not the other way around – something has to work in one place before it can be a shared standard. We also have mechanisms for self-standardization that didn’t exist before. Spoken languages evolve because people collectively work to share common means of communication. You might argue that this leads to a tower of Babel, but then, I’m writing this in English and you’re reading it in the same language and (hopefully) understanding. The same is true of Mandarin, Portuguese, German, Arabic, Hindi, and so on. It’s also true of volunteer adoption on the Internet of HTML, XML, JSON, and RSS.

Music is not the result of notation or standards. It’s the other way around. Musical practice long predated any attempt to write it down. And mathematics and written language each have the ability to describe music, along with many other media.

To me, two questions remain:
1. What would an implementation of structured messages for pitch and duration look like, perhaps via OSC? What prior work exists in this area, and what do you need?
2. How can smarter implementations of a protocol like OSC allow software and hardware to better handle unfamiliar input – as musicians, as they have done since the dawn of time, invent novel physical interfaces?

I look forward to kicking off this discussion and hearing what you think.

  • Steve Welburn

    I find complaints about MIDI interesting, as it's a fairly open standard with an important part originally being how to physically get messages between devices. With sysex and (N)RPN you can do *anything* a digital protocol can achieve (okay, speed issues can affect this). MIDI is a protocol. The standard messages are just that – a set of standard messages, not a restrictive definition of allowed capabilities. If the Aeolian harp community wanted to adopt a standard set of NRPNs, they could.

    I'm also intrigued by the amount of love OSC gets. I can see how OSC is useful in hinting at functionality – which is especially useful to let developers keep track of what they're working on – but in order for it to be useful to *users* it needs more. You describe it as "self described" but more accurately, it's "labelled". You know which parameters you're editing, but you don't necessarily know the relevant transforms that affect the data or even the units (is that a frequency, a MIDI pitch, or a pitch in cents?). In order for OSC to be transparent, the ability to query the relationship between values and output needs to be available. Standardisation of this would make OSC truly useful – if you need to look at a manual to know how to control a device, what's the difference if it's just a data type or both an NRPN number and a data type. Additionally, OSC is described as "not a protocol" and could easily be transported over MIDI in sysex messages! It would be nice to see if (Max/MSP?) OSC wrappers for (hardware) MIDI synths would work to make people explore synth features more.

    In most contexts, users aren't expected to understand the underlying protocol – things just "work". Currently, neither MIDI nor OSC break through this – the only recent innovation in this area I can think of is Novation's Automap. The ability to get type information from OSC could really change things.

  • Raffael

    Great Article.

    As a composer I have become quite frustrated with MIDI and notation on the computer in general. That's the reason why I hand-write all my scores and then, if necessary, "translate" them into my computer.

    My idea for a solution to the MIDI problem would be to generalize the data an instrument or sequencer sends, to eliminate the problem that MIDI data comes only through certain channels (by which I don't mean the MIDI channels from 1 to 16), like velocity or note, which can only send numbers from 0-127.

    For the undefined numbers the device sends, there has to be a key which gives them meaning. For example, a flute-like controller could send data like: number 1 is note, number 2 is air pressure, etc.

    This key then would have to be installed on the computer and the device so they can communicate correctly and the computer can process the data.

    Then again I'm not sure if any of this would work.

  • http://nezoomie.wordpress.com/ nezoomie

    I strongly believe OSC is wonderful, flexible and readable, but its implementation needs to be more transparent to the user than before. Zeroconf is a first attempt at that. I dream of self-configurable wireless devices, with centralized automatic tuning and other goodies, but I'm not quite sure to what extent this is possible. I need to study OSC a little bit more.

  • http://www.bassling.com bassling

    Wow, that's a great discussion. Sometimes I think musical frameworks are limited but – hell yeah – MIDI is even more limited.

    Could an alternative way of working be some sort of soundwaves? Like those squiggly lines that we see representing sounds on computers.

    Guess we'd be using styluses to draw the size of frequency cycles or their attack and decay. Or maybe fingers for making polyphonic sounds.

    Hey, for what it's worth, I've worked with a large-scale aeolian harp (see http://www.youtube.com/bassling) and can see a squiggle would lack the harmonic information that's much of what is heard.

    Back to the drawing board for me. Or maybe just the MIDI again.

  • http://www.bassling.com bassling

    Sorry — operator error!

    The large-scaled aeolian harp is at

    http://www.youtube.com/watch?v=j8Al-tiWPwc

  • http://hendersonsix.com henderson

    Fantastic to be reading about this topic again (I really enjoyed reading the recent article on this site about colour/notation also.)

    Having coincidentally just completed a degree project/thesis on this exact subject, I think the key is a quote by Jack Burnham:

    "We are now in transition from an object-oriented to a systems-oriented culture." (Source: Systems Esthetics, Artforum, 1968)

    Contemporary music is increasingly being "generated", or born out of rules and procedures – take Steve Reich's phasing, Cornelius Cardew's The Great Learning, or Brian Eno's Generative 1 for example. Computers allow for systems-oriented music, so therefore perhaps a new language that is not focused on objects per se but on systems and processes is more relevant? Think graphic scores, flow charts, non-linearity, analogs.

    The issue is the development of a new vocabulary of symbols that can be at once machine- and human-readable. A computer can calculate a complex algorithm and produce a frequency detuned to numerous decimal places, whereas such a stream of numbers to a human is of course impractical. Currently, clever GUI elements controlling OSC packets are the closest we've come to a seamless new language for developing music. But the question really depends on what type of electronic music you want to create. The invention of a language is also the invention of an interface, and an interface mediates what type of music we are going to produce.

    For anyone interested in my degree project (Radius Music), it can be seen here: http://hendersonsix.com/#212455/Radius-Music

  • http://www.createdigitalmusic.com Peter Kirn

    @Steve:

    Well, the questions I hope to ask here are more than just about the underlying transport. That's my whole point; nothing – OSC included – implements a standard means of describing events musically. Yes, it's possible you could use SysEx and NRPNs over MIDI, but that gets dodgy in real-time operation and I don't think it's in the spirit of MIDI that you would entirely supplant MIDI note events with these schemes. By the time you're doing that, you might as well use any protocol and transport, including those that have greater data bandwidth.

    MIDI has been routinely wrapped in OSC, and OSC translated to MIDI. But yes, what you're describing in terms of the ability to query the data you're looking at – that's exactly what people are suggesting. (The papers I link here even point to that issue; one suggests a method for querying that they've tried implementing.)

    I think "just working" invisibly isn't good enough for users, and we can actually thank MIDI. MIDI's note and CC events are fairly readable, and so users manipulate them to do what they want. That's why users have even greater expectations for new protocols, even if OSC doesn't satisfy all of those expectations yet.

    This isn't an OSC vs. MIDI discussion, though. If you really wanted to adapt MIDI in a new way, you could. What I'm saying is that the default means by which notes are stored and described, which has been built on the original MIDI spec, isn't a complete way to look at musical events.

    @henderson: Some great ideas; now I have to go spend more time with your thesis! I'm sure there's other research from the reader community, too…

  • http://avanturb.com Primus Luta

    Okay I'm going to go left field and say perhaps standardization shouldn't be the goal with OSC. Okay not that far left field. Of course I don't mean that there shouldn't be standards, but rather there shouldn't be musical message standards.

    Using the example of pitch, the mere statement of "how can we standardize pitch messages" draws a box around what pitches can be. While it may seem more open than line and space declarations on a staff, how it is represented does have some bearing on how it is implemented. If one always thinks about pitches as note representations, their compositions are limited to what can be done with those note representations.

    A parallel example would be that pitch in traditional notation is seen as a primary element, and by standardising pitch messages a similar weight is put on their relevance. However, we are at a point in compositional possibilities where pitch is the incidental, and the actual performance is derived from elsewhere.

    Rather than standardize, I'd like to see more artists taking their own creative approach to documenting their compositions and performances. While it may not result in being able to pass another artist your work and having them play it from sight, or even passing it to another computer and having it replay it, I think the value in inspiration, derived from the variety of ways the digital age has us each thinking about music and representing it, will be well worth it.

    Okay, that's all well and good, but practically I think OSC should be closer to XML. Something akin to a DTD file defining messages from a specific system, and then, other than protocol specifications, have fun with it. It'd be nice to see that DTD file working in tandem with a system for sequencing OSC messages as well. Beyond that I say leave it open.

  • http://www.createdigitalmusic.com Peter Kirn

    @Primus Luta: right, but hearing pitch *relationships* – as you would with certain modes or pitch collections – is pretty significant. Surely you'd want some way of describing a pitch system that you use.

  • http://www.jordancolburn.com Jordan Colburn

    I think you would have to standardize pitch relationships to the frequency they produce. The hardware controllers and software editors can adjust which assumptions the user is making about scales and modes, but to have the whole standard be based around assumptions, and then have those assumptions changed (by switching software packages or something), would result in too much variance in results, which would be frustrating to use.

    I think the part of the article that got me the most excited was the ability to assign different variables for different uses. Say for a digital sax type controller, you can assign different variables based on how many sensors your hardware implements, then use the software to determine how that maps out. This can kind of be done in MIDI by carefully mapping out CC messages, but you mainly run into the resolution issue and having to map it all out by hand.

  • http://hendersonsix.com henderson

    @Primus Luta

    "While it may not result in being able to pass another artist your work and having them play it from site, or even passking it to another computer and having it replay it, I think the value in inspiration, derived from the variety of ways the digital ways has us each thinking about music and representing it, will be well worth it."

    I agree – I think the interpretability of the graphic score should be celebrated! To create a musical language of rules that allows for deviation on a performance-to-performance basis is incredibly interesting and challenges a wide range of notions ordinarily taken for granted – in a way it is more copyleft than it is copyright. On a visit to the Sonic Arts Research Center in Belfast a few months ago, I was curious to find out how a composer wrote music for a 48-channel surround system. What is the symbol for "PAN CHANNEL FROM MONITOR 7 TO 36"? Perhaps unsurprisingly, there is a large element of improvisation – and rehearsal – in performing such works. So with each performance being sufficiently different from the last, how does this interfere (if at all) with the composer's agency for creating the work in the first place? Also, if 10 years down the line the work is to be replayed, to what extent would the work depend on an anachronistic technology/language?

  • http://music-interface.com mat

    Great post!

    My personal focus is more on rhythmics than on pitches as I do sequencers, so I refer to one of the early paragraphs of this article.

    My sequencer has independent step ranges on each track. So one could be 16/16, the next 15/16 or whatever. That allows polyrhythmic patterns.

    The funny part is that most people say "uh, that's a nice random track" if I do so. It is not. It has a polyrhythmic structure. Even worse, some people think my sequencers are not tight. (That's why I use a rather boring 4/4 in most demo videos.) The reason is that this polyrhythmic structure doesn't fit into their frame of reference. They are adapted to 4/4, and everything beyond this just sounds wrong. (OK, I think those of you who read through the whole article will see it differently… but this is not common sense.)

    Why is this important to the discussion anyway?

    Standardisations and new protocols need a wide spread to be successful, and I do not think that such a detailed protocol will be. It is not only the adoption by most people, it is also the easiness of MIDI that makes it a standard over the years. In German: "Perlen für die Säu!" ;) meaning: useless effort for most musicians (especially when it comes to electronic music). And as they build the critical mass for the success of new standards, it will be hard to put it through.

    As said… readers and writers here (especially if it comes to this article) are different, but they are just a few….worldwide.

    Anyway, nice post… and I hope I am wrong ;)

  • http://avanturb.com Primus Luta

    @Peter Kirn: Absolutely, and perhaps between composers there would be some commonalities in representing that. What I don't want is to get into the MIDI trap of confining significance to note on and note off determinations. Even something like a scale or a mode is a preference, not a requirement, of composition, especially in the digital age.

  • http://www.createdigitalmusic.com Peter Kirn

    @Primus Luta: No, I agree. So you'd want to be able to refer to a pitch collection when it makes sense, and not if it doesn't. (Sometimes a frequency is just a frequency.) Actually, you can still normalize frequency range to a non-logarithmic scale… interesting question, probably in the implementation rather than the protocol.

  • http://ardour.org/ Paul Davis

    Peter – an awfully great post! Really, one of your best ever.

    Primus Luta: <cite>Using the example of pitch, the mere statement of how can we standardize pitch messages, draws a box around what pitches can be.</cite>

    I don't think it's so hard to come up with a fairly expansive definition of what a "pitch" is, as a concept. Given this definition (and Peter did a pretty good job within the article on making an attempt at this), we can then define a message format that encapsulates all elements of the definition. If things are left out, they will be left out because the definition didn't include them either. And yes, there will always be some music that will use sound in a way that steps outside a given definition, because there will always be composers who deliberately attempt to do so, "just because". That doesn't mean we can't come up with something much better than MIDI offers us now.

  • http://www.noahadler.com/ Noah Adler

    David Heinemeier Hansson accidentally wrote a good defense of MIDI when he described opinionated software. It's an opinionated protocol, and it has worked well for a lot of things. That could explain its enduring proliferation. So sure, there are plenty of rough edges around it. I've spent way too many hours mucking with MIDI guitar and guitar-like controllers to believe it is fully adequate for all musical endeavors, but it can be interesting and fun for a wide variety of things. The myth that you can play some instrument and have it 'sound like anything' is alluring but absurd. I would never dream of performing mechanical manipulations on a güiro to try to coax out the sounds of a saxophone. What would that even mean? The expectation is ridiculous, as no distinct physical instruments are directly isomorphic.

    You seem to want to conflate formats and protocols in this desire too. Sure, MIDI defines some formats, but generally speaking, it's a protocol intended for synthesizer control, not a score format. I think endeavoring for a universal score format would be useless. How would you represent all the varieties of aleatoric and experimental score writing? Composers would always find ways to poke holes in it. Not that having a common score format would be bad in and of itself, but perhaps the notion that a particular score should be perfectly reperformable by a machine agent is becoming a bit outdated. Wabi-sabi organicism has infiltrated the pristine bastion of computer land.

    Take a look at the promise of the Semantic Web compared to the reality of Google's cartographic prowess. The future of computational music lies in a similar direction. Musical Information Extraction work being done by groups like MIREX are constantly breaking new ground, but the time before it becomes useful product is still a long way off.

    When it comes to writing scores in the present day, I think our only feasible goal is incremental improvements in the process of score-writing in the established traditions. Sibelius has made things much easier, but I'm still irritated that I can't write on the score directly using a Cintiq. Writing the score one mark at a time reduces the artificial intelligence problem from full OCR to contextual gesture recognition.

    One final thought: given that there are more emerging and experimental musical instruments, which may demand custom scores, the problem could be rewritten as a problem of sync. That is, how can you best sync a traditional score with any number of scores written for novel elements which have no easy mapping in the tradition?

  • http://www.createdigitalmusic.com Peter Kirn

    @Noah: Well, no argument – MIDI has its uses. But I'm not the only one conflating protocol and format; that's long been what has happened with MIDI and SMF themselves. I think we're talking about several things here:

    1. The messaging format by which an input device communicates with other devices and software.

    2. The messaging format by which musical events are stored and described.

    3. "Score" to me means that the end result is printed copy. This has its own set of challenges, but there are some good formats out there, like Lilypond and MusicXML. Lilypond is human-readable and puts the authoring tool and format together. It works pretty well and describes a variety of different musical practices. It doesn't cover everything, but it's specialized to the point of being useful.

    These three ideas can sit atop any number of protocols.

    That said, I'm very interested in looking at what the Web is doing, and the semantic Web, in defining these problems. We've tended in music tech to operate in a bubble, and that can cause us to miss opportunities.

  • http://soundcloud.com/usrsbin Dennis Moser

    I would be very interested to see how this discussion evolves, with regards to performance practice.

    A score is meant to be open to interpretation, and that interpretation is subject to context and understanding of the contexts attached to the performance situation.

  • http://ardour.org/ Paul Davis

    Dennis Moser: <cite>A score is meant to be open to interpretation</cite>

    I'm not sure that this is a given. Some people might appreciate this aspect of traditional scoring systems. Others may not. Or it may vary from piece to piece.

  • http://hannanmusic.com peter hannan

    Very interesting article. Before switching to composing full time about 20 years ago, I had a career in early music. When I studied in Amsterdam in the 1980's there was huge interest from fellow students, and in the music scene in general in either music pre-1750 or post 1950. Somehow this always made sense to me.

    As for notation, there is no doubt that the whole notion of whether learning to read music is valuable enough to devote the time to learn properly is being challenged. The college I teach at in Vancouver is generally where would-be rockers or pop musicians go to school. Many of those students simply don't read fluently, and there is a constant challenge to prove where reading music is needed in (that horrible phrase) today's marketplace. There was a time when students were embarrassed if they couldn't read music. No longer true.

    Your observations about midi and the way midi devices are set up are particularly on the money in this discussion.

    I mostly use Logic for composition. Some years ago Logic might have been seen as the possible one-stop composition solution – all your compositional, audio, recording, and notation needs met in one place! Of course, in the last few years the notation side of Logic has been pretty neglected, I suspect as traditional notation becomes less important to more and more musicians.

    I'm philosophical about the whole thing. Art is always changing. Future musicians will have a different relationship to music notation, and as you point out, will be coming to questions that would have traditionally been dealt with in notation, but now are dealt with in midi. I see this in my own teaching, where for instance any number of my students have needed to learn something about tuning and harmony as they try to make samples co-exist with each other.

    I can't imagine traditional music reading being lost, and I can't imagine not having the access to music that you gain from being a fluent music reader, but I think there is no question that this will become a more and more esoteric skill.

    This will be an article that I will be pointing people to.

    This site continues to be the first place I visit every morning- always relevant and interesting. I also appreciate the amount of work that must go into what you're doing.

  • http://ardour.org/ Paul Davis

    Peter, protocol and format are more different than I think you're giving them credit for.

    Suppose we (simple-mindedly) defined a pitch as an integer value between 0 and 127, with an on-time and an off-time, and one additional 7-bit value to convey some kind of performance information. Now we have a format, if you will, for transmitting pitch information.

    But we could use that "format" in many different ways and still be limited by the nature of the format. We could send it via OSC, HTTP, a radio FM pulse, morse code, etc. Even, gasp, MIDI :)

    The way we send the message is the protocol. The content of the message is the "format". We don't need to care much about protocols. We do need to care about format: what's in the definition of a pitch and thus what's in the message.

  • http://www.createdigitalmusic.com Peter Kirn

    @Paul: Yes, absolutely – I'm not disagreeing with that. I'm talking content, not the transport by which that content is transmitted. But I don't need to tell you that there are limitations to how much data you can squeeze into MIDI *at its default baud rate*.

    A "protocol" traditionally includes how messages are formatted, however, so I don't think my use of the word protocol here is misdirected. And I think I'm pretty clear I'm talking the messages.

  • http://ardour.org/ Paul Davis

    @Peter: I think my point was that if you're thinking about a new, more expansive messaging definition for pitches, worrying about bandwidth is not really relevant. We won't be sending this over any MIDI serial line, it will be moving via newer protocols for which there is almost (?) always going to be plenty of bandwidth.

    And the second part of my point was that the details of how you pack the information that "you should play something corresponding to your idea of note number 73" into some transport protocol is much much much less important than defining what information needs to be conveyed.

  • http://avanturb.com Primus Luta

    @Paul & @Peter: I'm still hesitant. Where I think it all breaks down is with that "one additional 7 bit value to convey some kind of performance". I think what we're seeing today is that that 7 bit is as if not more important than either the pitch integer or on and offtimes, so much so it might require more than 7 bits. Even further, the types of performance information can be quite infinite and so designating a format which works well more or less across the board seems like a futile effort.

    Perhaps the 'protocol' is to refer to pitch in an OSC message via integer values 0-127 (which could still be debatable – e.g., what about microtonal pitches using decimal values?). But to standardize the packet in which that is sent is my slippery slope.

  • http://ardour.org/ Paul Davis

    @Primus: my example was a tongue-in-cheek description of the current MIDI pitch definition. Nothing more or less.

    Nobody in their right mind would suggest that a single 7-bit value is enough to correctly convey performance data for an arbitrary instrument.

  • http://www.createdigitalmusic.com Peter Kirn

    Well, wait a minute – virtually any kind of communication involves a range of shared information, and a range of new information. Take a recipe, for example. You have an agreement on standard measurements — a teaspoon. But then the recipe itself instructs the user how to make it, providing some common information ("sauté") and explaining things that would be unfamiliar. Based on the experience most people have in music, there's something similar. Some things will have a reference, some won't. You just need some language to describe both.

  • http://www.noahadler.com/ Noah Adler

    Thinking about it further, most of what I really intended to comment on regarding format versus protocol was the accessibility nature: random access versus sequential. That gives scores (and I'm speaking in the general sense– csound score files, engraved manuscripts, or deranged scribblings of performance instruction on a napkin) the ability to convey information in multiple contexts… e.g., does this note fit into the key signature of the piece? Well, how does it function in the current chord? Is it a member of a sequence? And to confound it, usually these sorts of groupings are not exclusive, but can overlap in any sort of weird way they want! Music is just too beautiful, isn't it? :-)

    @Paul: Whether or not a score is meant to be open to interpretation is rather moot, because it inherently will be, whether the author desires that property or not. (personally, I choose to be happy about this)

    @Peter: I've been a fan of Lilypond for a few years, and I think it's a great bridge between keyboard and traditional notation. The abcjs project you posted the other day seems really promising as well, but what they have in common is simply making it more affordable to create and exchange a particular type of score. Which is to say, they leverage the commodity stature of keyboards, de facto standards like traditional notation, ASCII or Unicode, and most lately browsers. This is not insignificant! But they are design decisions based primarily on what is cheap and widely available. When I harp about not having a Cintiq input method, well… compare the costs and ubiquity, and it makes sense.

    To get a little back on topic, my personal main gripe with MIDI is the dissociation of notes and their ornamentation/inflections. I don't really care about channels at all. Could channels be tossed? They seem like an optimization relic that now gets in the way, unless I'm missing something.

  • http://ardour.org/ Paul Davis

    the primary thing you need for a decent pitch-based message system is the notion of a voice, rather than note-on/note-off messages. that is, you send a message that says "allocate a voice to play this note and give me some kind of token to refer to the note at various points during its lifetime".

    MIDI's on/off-by-note-number formalism makes certain kind of music expression essentially impossible to represent, and SMF only makes that worse.

    So the actual overall communication flow looks like a bit like this:

    Sender: i want you to start playing a sound, with the following parameters ….

    Receiver: OK, i've started that (or I will in the future); you can refer to that sound as ajeu1132x from now on

    .. time passes …

    Sender: i'd like you to change the pitch of ajeu1132x in the following way …

    Sender: remember the 17th parameter of ajeu1132x that i sent you? change it to 12.19 please.

    Sender: its time to stop making the sound associated with ajeu1132x

  • http://ardour.org/ Paul Davis

    @Noah: <cite>Whether or not a score is meant to be open to interpretation is rather moot, because it inherently will be, whether the author desires that property or not.</cite>

    I'm not sure this can be said, for example, of a CSound score file.

  • http://www.noahadler.com/ Noah Adler

    @Paul: Version inconsistencies? :-) Custom extensions? (though I take your point)

  • salamanderanagram

    for me, "the best way to represent notes" depends on what you're trying to accomplish.

    personally, i'm most comfortable reading sheet music or looking at a piano roll, i never use "weird" tunings, i never find the need to go above midi note 127 or below note 0, and only very rarely do i find a need for greater than 7-bit accuracy (i usually think of a knob as going from 0-100; rarely do i find myself thinking, if only i could make it equal 51.345! 51 will do just fine, thanks).

    given all of this, MIDI works perfectly well for me. changing to a new protocol would mean re-writing a shitload of reaktor scripts, buying new gear, (it would make over $1000 worth of my gear pretty obsolete) and it really would not offer any *tangible* benefits that i actually care about. sure, there are improvements to be made but none that i really care about. faster baud rate would be nice – but are we really tossing around more than 30000 messages per second on a regular basis? i know i'm not, and i use MIDI in some pretty weird ways. the only parameter i ever wish had more than 7-bit accuracy is filter cutoff which you can just use nrpns for anyway.

  • http://ardour.org/ Paul Davis

    @salamanderanagram: with the greatest respect, I think you're missing the point of Peter's post. He is quite explicit about not trying to replace MIDI, and about the fact that MIDI works very well for certain purposes, which apparently include yours. The problem is that MIDI is essentially unusable for some other purposes because of the narrow scope in which it was conceived. It would therefore be useful to have an additional, more powerful, more flexible standard available for use where MIDI falls down (at least).

  • salamanderanagram

    i'm merely explaining my disinterest in such a system, and why it will be a very long time until i'm willing to even consider changing protocols. given that i only know one person besides myself who even knows what OSC is, i'm pretty sure i'm not alone there.

  • http://ardour.org/ Paul Davis

    @salamanderanagram: again, with all due respect, it won't be you that makes the switch (if it ever happens). It will be Reaktor (one of the first proprietary programs to ever support OSC, by the way) :)

  • salamanderanagram

    @paul, OSC has been around for years now. it is the system that is "more powerful, more flexible standard available for use where MIDI falls down…"

    it's not just me, it's almost everybody who isn't adopting it.

  • salamanderanagram

    btw reaktor having OSC is useless when no hosts will pass it along ;)

  • http://ardour.org/ Paul Davis

    @salamanderanagram: one of the reasons for Peter's post is precisely that OSC is not a "more powerful, more flexible standard available for use where MIDI falls down" in its current form. And one of the main reasons for this is the lack of any agreed upon message format for conveying pitch data.

    My point about Reaktor was really just that it's developers (like me) who adopt protocols, not users.

  • salamanderanagram

    "OSC is not a “more powerful, more flexible standard available for use where MIDI falls down” in its current form."

    it most certainly *is* "more powerful" and "more flexible" than MIDI. it's also "available for use." the problem is nobody can agree how to use it.

    and you'll note that many people want to keep it that way. i've seen many discussions where people are trying to figure out such a system – and many people will argue that any sort of standardized way of dealing with it is undesirable… i don't understand this view, but it is certainly out there, and quite popular too.

    let's face it – the reason why MIDI is so popular (and the reason it's going to stay that way) is because it is so easy to understand and it fits the uses of the *vast majority* of users. right now, even having such a standard would be useless because there's no synths or anything else that would be able to understand it. most of these problems (like microtuning) actually do have workarounds in MIDI that would be far easier to implement than upgrading every single piece of software/hardware that i own.

    "My point about Reaktor was really just that its developers (like me) who adopt protocols, not users."

    OSC still has a LONG way to go towards being any sort of a standard… we're still talking about pitch tuning more than 5 years later? you developers need to get it together!

    if you can't see my point that the vast majority of users have no desire to spend time and money upgrading systems that they have no need to upgrade, that's fine.

  • http://ardour.org/ Paul Davis

    @salamanderanagram: <cite>if you can’t see my point that the vast majority of users have no desire to spend time and money upgrading systems that they have no need to upgrade, that’s fine.</cite>

    Are you not following Peter's repeated postings on mobile computing devices and the OSC apps that are available for them? You won't be upgrading equipment, you'll be installing apps here and there.

    Microtuning is not the principal problem with MIDI, as Peter demonstrated in his post above. And the reason why MIDI is so popular is not because it's easy to understand, but because it was the first kid on the block, and was sufficiently smart to keep its place. Peter (and I) acknowledge that MIDI has many good qualities.

    The reason that OSC is neither "more powerful" nor "more flexible" is that it's really not a protocol at the same level as MIDI. It's closer to a transport protocol than something that has actual semantics. Look at the size of the documentation: you can explain everything about OSC in a page or two of technical text. The MIDI manual (even just the basic part) is many tens if not hundreds of pages long. And it's not simple. As an implementor of MIDI support in Ardour, I can point to dozens of weird subtleties in MIDI that are really nightmarishly complex but rarely bother users. The difference in the size of the spec is not because OSC sucks, but because all it really defines is how to send a message from A to B, whereas MIDI defines that in addition to what the messages are and what they mean. And it's precisely the latter that is the key to a useful protocol, which at present OSC really isn't. Hey, we agree! :)
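
    One way to see the "transport vs. semantics" difference: a MIDI note-on is three bytes whose meanings are fixed by the spec, while OSC only says how to ship an address plus typed arguments. The OSC addresses below are made up, and both messages are equally legal OSC:

    <code>
    # MIDI: the spec fixes what each byte means.
    note_on = bytes([0x90, 60, 100])        # note-on, channel 1, note 60, velocity 100

    # OSC: the encoding is standardized, the meaning is not.
    msg_a = ("/synth1/noteon", [60, 100])   # one app's idea of a note
    msg_b = ("/playNote", [261.63, 0.8])    # another app's idea of the same note
    </code>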

  • http://www.createdigitalmusic.com Peter Kirn

    Well, and by the same token, where OSC messages are defined – as with something like TUIO – it is quite useful. I'd say it's even more useful than sending raw network packets when sending undetermined control data, etc.; at least you have some basic expectations about parsing. But if you have information that needs overlaid meaning, you'd better find some level of convention.

  • salamanderanagram

    "Are you not following Peter’s repeated postings on mobile computing devices"

    no i do not care for iphones/pads, sorry. i like buttons, knobs, and other tangible controllers. if i want some sort of OSC controller i will have to buy one. and buying apps is not free either – just having an OSC controller is kinda useless – it's not worth anything until it has something to talk to. which it doesn't, in my setup (except for ableton live i suppose, but i can already send OSC messages to live with my *MIDI* controller and a simple script)

    "And its not simple"

    yeah, i've programmed MIDI handlers, they are super annoying at the core and to be honest i never was able to get rid of all the bugs (i've never been the best programmer when it comes to that stuff). i didn't find handling OSC messages any easier – what with the endless combinations of possible messages that could be sent, with who knows how many attached parameters – it's too open ended!

    when i say MIDI is simple, i mean for the end user: note on, note off, pitchbend, aftertouch – what these mean is obvious and intuitive.

    "And its precisely the latter that is the key to a useful protocol, which at present OSC really isn’t. Hey, we agree! :) "

    fair enough! i find the most compelling argument for MIDI at this point in time is backwards compatibility, which i will admit is kinda sad, but i also will argue that it does a pretty fracking good job and it will be a long time before it is replaced or even has serious competition.

  • http://avanturb.com Primus Luta

    @Paul Davis

    "Nobody in their right mind would suggest that a single 7-bit value is enough to correctly convey performance data for an arbitrary instrument."

    An arbitrary instrument no, an arbitrary pitch though?

  • http://www.createdigitalmusic.com Peter Kirn

    @Primus Luta: the problem is, nearly all pitch systems are relative, not absolute. (12-TET is a rare exception.) So 128 values are enough to convey relative pitch, yes. But at the very least, you need another message for tuning. That's true even in MIDI. And that's before you bend between pitches, which isn't an "experimental" technique – it's common on most instruments.

    Anyway, yes, if you number your notes from 0 to 127, then send a separate message at some point to indicate the tuning, you can get a consistent result. Even then, though, where is the octave? In 12-TET, it should be every 12 integers, but then 128 isn't divisible by 12.

    All of this works in MIDI, to a point; it's just not complete or ideal. OSC doesn't really address it at all. 7 bits is a cramped amount of space in which to come up with a complete or ideal solution, at the very least. Beyond that, though, any protocol or software representation ought to be able to implement something more complete, which could in turn be the basis of a standard.
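
    As a rough sketch of the arithmetic involved – relative note numbers only become frequencies once a tuning reference is supplied (assuming 12-TET and the common convention that note 69 is A4):

    <code>
    # 12-TET only: note numbers are relative; the reference pitch is separate metadata.
    def tet12_to_hz(note, a4_hz=440.0):
        return a4_hz * 2.0 ** ((note - 69) / 12.0)

    print(tet12_to_hz(69))          # 440.0 with the modern reference
    print(tet12_to_hz(69, 415.0))   # 415.0 with a "Baroque" reference: same note number, different sound
    print(128 % 12)                 # 8 -- the 0..127 range doesn't divide evenly into octaves
    </code>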

  • http://avanturb.com Primus Luta

    @Peter Kirn: Say you're doing an RGB-hex-to-pitch implementation: what's the scale? The math works out a little better, it enters into microtonal realms, and you can elaborate to infer further meaning. That's all without saying how the pitch is controlled. Ultimately the control is determinant in the definition, but where the control is no longer universal, neither is that definition of a pitch. Perhaps 7 bits is efficient, as the thought pattern shouldn't be how much we can use to define, but how little is necessary.

  • Damon

    Ok, I give, what's the correct answer?

  • http://copperlan.org Eric Lukac-Kuruc

    Have you heard about CopperLan? It is a complete networking system which, among many other things, includes a protocol that goes beyond MIDI while remaining MIDI compatible.

    CopperLan’s protocol answers many questions raised in the article.

    Pitch

    Regarding the description of pitch in messages, CopperLan offers both pitch and frequency formats. In pitch format, the resolution is beyond the limits of human perception, and since it is self contained, it doesn’t need a separate message such as pitch-bend to inflect the height of a note away from a neutral reference. (Pitch-bend is nevertheless available when needed)

    While pitch tables assume that the target is in charge of defining the actual pitch, CopperLan allows the source (controller) to be in charge, so that pitch tables are unnecessary.

    The pitch range goes from LFO frequencies up to ultrasonic in a continuous way which is independent from any gating/triggering. Think of MIDI with a wider range, fractional values and independent of the note on/off aspect. In CopperLan, 60.5 is indeed the pitch half-way between notes 60 and 61 :-)
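
    (As a back-of-the-envelope illustration only – this is plain 12-TET arithmetic, not CopperLan's actual encoding – a fractional note number maps to a frequency like this:)

    <code>
    # Fractional note numbers under 12-TET (illustration only, not CopperLan's wire format).
    def note_to_hz(note, a4_hz=440.0):
        return a4_hz * 2.0 ** ((note - 69) / 12.0)

    print(note_to_hz(60))     # ~261.63 Hz (middle C)
    print(note_to_hz(60.5))   # ~269.29 Hz, the quarter-tone halfway between notes 60 and 61
    print(note_to_hz(61))     # ~277.18 Hz
    </code>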

    The overall tuning remains a separate control as it avoids having to re-describe an entire musical piece (in terms of messaging) in case it has to be tuned to a different reference.

    Real expression

    CopperLan offers many ways to describe musical expression (gesture, materials, physical interactions …). The messages in charge of such enrichment are optional, but one very important aspect of the protocol allows these to be glued together temporally. This is the notion of a Transaction. For example, hitting a percussive instrument could, among the many possible messages offered, generate the ones needed to describe the hit position and the muffling. The target device in charge of processing the messages and generating the appropriate sound needs to know when the description is over, so that it can create a sound which is the result of the concomitant action of all messages that are part of the Transaction. Without Transactions, there is a risk of audio glitches, as the messages would otherwise be applied sequentially as they are decoded.

    In MIDI, the pitch (note numbers), the velocity and the gating (on or off) are indissociable; this is not the case with CopperLan where each of these aspects can be described independently, but also updated continuously over time. Moreover, the gating is not restricted to on/off since there are additional notions such as triggering and gluing. In this way CopperLan brings back the wild freedom of CV/Gate in a digital protocol.

    Scoring

    The usual mess of printing scores from MIDI files is due to the music flowing freely around the “hard” clocking. In CopperLan this problem is solved by “elastic” clocks that can stick to the actual note placement. Said clocks themselves refer to an underlying hard clocking reference in charge of the overall tempo. (It's in fact richer than that, but I kept it simple for now.)

    CopperLan was developed over many years with the kind feedback and collaboration of companies like Native Instruments, Steinberg, and Yamaha, to name just a few, and academic organizations such as IRCAM and GMEA. Its development is now complete. An SDK for freeware development will be available next week.

    This little introduction is far from covering all the music-oriented features of CopperLan. To learn more about its other aspects (networking, plug-and-play, zero-configuration, discovery, editing, timing, MIDI compatibility …), see CopperLan.org

  • http://Noiseforairports.com Nick

    This is a fantastic meditation on notation and performance, Pete. You might be interested in my master's thesis on the history of the relationship between player pianos and other "re-performing" technologies and sound recording. Some of the issues you raise here about user inputs, completeness, and standardization come up very strongly there: http://cms.mit.edu/research/theses/NickSeaver2010

  • http://ardour.org/ Paul Davis

    @Eric Lukac-Kuruc: copperlan sounds very nice technically but it currently appears unclear whether it will be usable by the open source community. The website says:

    Sensible of the philosophy of the freeware community, CopperLan is willing to support these goals by offering a cost-free model. The freeware license gives access to a freeware-specific development platform, including a wiki documentation set and a dedicated forum.

    Freeware and open source are not the same thing. The open source community doesn't care about the cost; the important thing is the ability to distribute source code. Will CopperLan be compatible with the GPL?

    It just defeats and saddens me that the music technology industry keeps failing to learn the lessons of MIDI's adoption, over and over again. Design the technology. Make it kick-ass. Publish the specs. Allow anyone to implement it. Profit! The alternative has been tried over and over, and has failed every time.

  • http://www.createdigitalmusic.com Peter Kirn

    Well, right. I appreciate the work being done on Copperlan, but there isn't yet a starting place for developers, and "freeware" doesn't equal open standard. It's not just about cost; that leaves wide open questions about governance, implementation, and the future of the spec.

    Of course, I still think it's pretty absurd that technically you can't publish the MIDI spec in its entirety, that you have to pay $50 for a printed copy. OSC at least gets the "open" right. The next step would be to take advantage of that, to set up a proper process by which the community can comment on issues, etc. Aside from the protocol itself, imagine how useful it'd be to be able to open a spec document and immediately see comments from implementers on best practices in real-world usage.

  • http://www.skyron.org SkyRon™

    Great post, and rich discussion, Peter!

    I offer my own two cents in the form of yet another, different, and probably (musically) irrelevant intersection of 'music notation', 'digital media', and 'experimental'. Sorry, it doesn't output to MIDI or OSC, and very sorry it's done in Flash, but, hey, free random string quartet! – http://webprototypes.wordpress.com

  • Brendan

    I don't see why there is any need to replace musical notation; it represents pitch and time as well as anything could. Pitch and time are the two globally shared elements in music, no matter how you subdivide them. I can't think of much else which is common between all musical instruments.

    The thing is, to be a 'standard' such that the notation could be played by *any* instrument means that the notation has to be reduced to the lowest common denominator. No matter what your notation is going to be, the specifics regarding the playing of a certain instrument will have to be a hack on top of the existing notation.

    MIDI has standardized many of these hacks in a big table mapping sysex values to musical terms. I think this is the shared information you're talking about Peter.

    What everybody says that OSC needs is the reflection mechanism that Steve was talking about. He's right, but we still need a common language – the OSC equivalent of the MIDI sysex mappings. It's like "Oh ok, I can ask the instrument questions now. But what are we going to ask?"

    So OSC needs two things:

    1. reflection

    2. a big ass sysex table

    Whether or not we use OSC is actually kind of beside the point…any protocol would work. The thing is, MIDI is slow and OSC is readable.

    I like the "good standards follow practice"; it's very true. In programming that's analogous to design patterns. Let's hear more ideas!

  • http://ardour.org/ Paul Davis

    @Brendan: i don't think that things have to be "hacks on top of the existing notation". Consider CSound scores, for example. (One of) the key elements here is the notion that the parameter lists for a given voice/note depend on the nature of the instrument they are sent to. That is, the number and nature of the parameters is not fixed by the protocol, but reflects the instrument. A second important element is that it's possible to send modifications to any/all of those parameters after the voice/note has started sounding.

    It's hard for me to imagine any instrument that is not covered by this kind of model.

    I'm really unsure of what you are referring to when you say "a big table mapping sysex values to musical terms". Can you clarify?

    OSC reflection/namespace discovery is a red herring in this context. The key is a shared message format & protocol, not discovering that synth1 uses one message and synth2 uses another for the same thing.

  • Brendan

    @Paul

    I thought the parameter lists are exactly what we want to avoid?

    From what I understand you mean:

    you can send 'synth1' the message [1,2,3,4] and it may interpret that as settings for its ADSR amplitude envelope, and you may send 'synth2' the same message, [1,2,3,4], and it can interpret it as its amplitude, cutoff, or what have you.

    But this still has the same problem – the 'meaning' behind the numbers, the mapping of the music to the synth. The MIDI specification defines meaning behind the numbers by 'suggesting' certain control values for parameters like volume, pan, portamento, pitch bend, 'effect control', breath controller etc. I say suggesting because manufacturers didn't adhere to the specification all *that* closely, as far as I know.

    Granted, 'effect control' isn't precisely defining what that control does, but it narrows the scope of the control significantly from being something like pitch.

    These mappings allow us to pipe midi data to different synths and have them sound relatively similar. Pitch and tone are a start, but with all these control conventions we can play all these different synths in the same way.
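
    For reference, a few of the controller numbers the MIDI 1.0 spec does standardize (pitch bend is actually its own message type rather than a controller):

    <code>
    # Some controller numbers standardized by the MIDI 1.0 specification.
    MIDI_CC_NAMES = {
        1:  "Modulation Wheel",
        2:  "Breath Controller",
        5:  "Portamento Time",
        7:  "Channel Volume",
        10: "Pan",
        11: "Expression",
        64: "Sustain Pedal",
    }
    </code>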

    Of course, the synths won't sound *exactly* the same because they may have their own unique controls; like an adjustment for FM operator 11. A discovery service would be perfect for the extra controls, even in MIDI. However, the service would still need a way of looking up the musical 'meaning' behind the control, so it knows how it sounds. Then the notation would be mapped correctly to the device.

    It would be great to have some kind of wiki where people could enter "instrument control number X is similar to 'volume'". It would be like semantically mapping music to synthesizers. People could make suggestions for the different parameters of types of instruments. The highest voted parameter gets selected for the sequence's control value! The song will change over time!

    On another note, how's this for a musical mapping:

    http://www.youtube.com/watch?v=2Xg9D3Me5Zw

    Never heard of these guys up until a few days ago. Somehow seems appropriate for this article…

  • http://ardour.org/ Paul Davis

    @brendan: i'm thinking more of a varargs-style syntax combined with a key:value syntax. suppose that the basic message to turn on a sound looks like this:

    <code>play pitchHz:440 on:timeT attack:2.13</code>

    such a message would work for any instrument, any time. but … a given instrument could accept additional parameters both then and during the course of the sound:

    <code>play pitchHz:440 on:timeT attack:2.13 wiffle:49.87 waffle:0.1</code>

    if you send that message to a piano instrument, the extra parameters will do nothing. if you send it to paul's own bassophilic gazumper, it will … (handwaving)

    and in this context, i guess that discovery is actually relevant (mea culpa), along with a semantic database of meaning.

    note that i am not proposing that the above bits of text would actually be the messages sent, as a bytestream. i'm just discussing the semantics.
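
    A rough Python sketch of how a receiver might treat such key:value messages, ignoring keys it doesn't understand (all names here are invented; as noted above, the real messages would be a bytestream, not text):

    <code>
    # Hypothetical key:value "play" message and an instrument that ignores unknown keys.
    def parse_message(text):
        verb, *pairs = text.split()
        return verb, dict(p.split(":", 1) for p in pairs)

    class Piano:
        KNOWN = {"pitchHz", "on", "attack"}

        def play(self, params):
            used = {k: v for k, v in params.items() if k in self.KNOWN}
            ignored = set(params) - self.KNOWN     # wiffle, waffle: silently dropped
            print("piano plays with", used, "ignoring", ignored)

    verb, params = parse_message("play pitchHz:440 on:timeT attack:2.13 wiffle:49.87 waffle:0.1")
    Piano().play(params)
    </code>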

  • Brendan

    @Paul

    Ah, I see what you mean. That reminds me of Supercollider; the syntax is similar to when you instantiate a synth. If you have the synth's id you can also send it messages after it's created. Trouble is, one still has to read the code to know which parameters to change, and read even deeper in the code to know what units the parameters are in (unless the docs are good, I guess).

    Where can I get a Bassophilic Gazumper? It sounds like a synth Dr. Seuss would make. Hope it's a LADSPA plugin so it works with Ardour! By the way, thanks for Ardour; it's amazing. MIDI support will make it mind-blowing.

  • http://ardour.org/ Paul Davis

    @brendan: yes, i think that SC is a subconscious influence on my own thinking about this. That and the old NeXT MusicKit language (another Music V derivative, but nicer than the rest), which Apple sadly dropped upon acquisition.

    but take heart – you have to know quite a bit about a trombone or a cello to understand what "its parameters" are too :)

  • http://www.h-pi.com Aaron Andrew Hunt

    Thank you for the provocative article, Mr. Kirn! Whatever is used as a transport language will of course reflect whatever paradigm has been assumed, and the Western musical paradigm seems to be central to all the limiting factors you have raised.

    Obviously, MIDI was designed around that paradigm. It was also designed in days of low bandwidth and slow bit rate, and as a result, it is as compact as possible – quite elegant in that regard.

    The most important musical parameter missing in the basic MIDI spec is tuning. A simple tuning message should have been first on the list of MIDI messages. If the Note OFF message were changed to a Note Tune message, the list of messages would begin with the most important piece of data regarding a pitch. The authors of MIDI missed that.

    You suggest investigating new systematic approaches to the transfer of musical information. When doing that, one should first consider alternative paradigms for music in general, as you suggest in your article. At http://www.h-pi.com/theory/huntsystem1.html you'll find an outline of some work in this area: basically an expansion of Western theory, in terms of theory, notation, and instruments, to include all possible pitches and intervals.

    MIDI can be made to do a lot of things it was not really originally designed for; for example, the software and hardware I make uses MIDI.

    I am guessing from your article that you would like to see an open, extensible, verbose universal high level transport language that would be able to control any data stream regardless of bit format. That makes sense. You mentioned MusicXML. Unfortunately its paradigm is the old one. Lilypond also assumes all the old limits. Both languages are extensible, but basic limiting assumptions make them both somewhat dead on arrival.

    My work is obviously concerned with pitch. Other work has been done in the area of rhythm; an article by Benjamin Boretz in PNM on logarithmic rhythmic notation comes to mind.

    At any rate, I do hope these old barriers of Western tradition will not continue to impede the freedom of music makers in the digital realm. Thanks again for the article.

  • http://ardour.org/ Paul Davis

    @aah: the hunt system is an interesting read, but it seems to me that it's still way too rooted in basic western musical concepts. this issue came up during the online design of the GMPI API. the General Music Plugin Interface was a cross-industry attempt to define a new plugin API that would be portable, open and all the other good stuff. It never really came to fruition, but the discussion about pitch specification involved a dogged fight.

    The fight was mostly between people who wanted the pitch notation system to reflect musical concepts at its core, and others who felt that the system should be based purely on physics (or acoustics, if you prefer), and that any musical concepts would have to sit above, not below the pitch specification. to give a trivial example of the simplest but most fundamental kind: one can specify a series of note names as absolute hertz values, and inflections as either percentage or absolute variations above and below the note name value. alternatively one can consider the range of audible sound to be divisible in various ways, and have note names that somehow refer to this division implicitly. the hunt system (like traditional western notation) appears to be an example of the latter concept.

    put in even more stark terms: there is the concept of pitch as an absolute value, versus the concept of a pitch as a relation to some other pitch.

    this argument was never really resolved, if i recall correctly, and the final spec that was written (i think) left open the use of both kinds of specifications. my own perspective is that the way that musicians and composers might want to think about pitch relationships is very important, but that it doesn't and shouldn't necessarily dictate the way that you try to convey to a machine what sound you wish it to generate.

  • http://www.createdigitalmusic.com Peter Kirn

    @Aaron/Paul: Yes, Hunt's work is interesting, and in turn based on various experiments to do something like that. To me, it's sort of an outgrowth of the problem – the fixed nature of a piano keyboard or conventional Western staff.

    I think humans working with any kind of pitch generally think in terms of pitch relationships because of the way we hear, whether the system they use is 12-TET or Slendro or a maqam. But note that those don't always have a definite one-to-one frequency-to-pitch relationship. For instance, you could do an early 18th Century-style performance of a piece, followed by one in a 20th Century-style tuning… and it'd sound like the same piece of music.

    So, shouldn't a device be able to respond to tuning and pitch relations independently? Then the question is, does that happen at the transmitter or the receiver? (That is, do you send frequencies explicitly and change tunings at the sender, or interpret them at the receiver?)

    If you're storing musical information, I think you have to store more than just frequencies.

    But I think what Paul's describing here is the basis of something that's very doable.

    Eric's approach with Copperlan does make a lot of sense, too; I think it's high time software makers tried out some ideas and eventually came to a convention that works.

  • http://www.h-pi.com Aaron Andrew Hunt

    Hi Paul; just to be clear, I wasn't suggesting that my H-system be adopted as a syntactical norm; I was only showing that it is possible to have a paradigm which is inclusive of all pitch possibilities as a core principle.

    Of course, pitch as absolute versus relative is something built into Western tradition; the tuning pitch called A had varied as widely as a fourth, from sounding F# up to B, before becoming somewhat standardized in the last century.

    In a verbose language, IMHO any number of valid identifiers should be allowed for tuning, from Hertz to cents offset from 12ET, to note names, standard or extended as in my system, etc. I drafted something called TuningXML a couple of years ago intending this approach. It's incomplete: http://www.h-pi.com/wordpress/?p=24

  • http://www.createdigitalmusic.com Peter Kirn

    @Aaron: ah, ok, makes a lot of sense in that case. And that's the sort of need I was hinting at here.

    Sending tuning information as tables makes some musical sense, too; it's not as though you'd typically retune in the middle of playing something. :)

  • http://ardour.org/ Paul Davis

    i suspect that this is clear, but what i really meant by the absolute/relative distinction is this: you can specify a set of pitches as a set of Hz values, or you can specify them based on some kind of distance or relationship to an existing value.

    if you use the first approach, no meta-data is required to interpret the pitch information.

    if you use the second approach, then you need to provide meta data that is most commonly referred to as "tuning" information.

    as a musician or a composer, it's clear to me that the second approach has many, many benefits. but just as with programming, where programmers generally work in one language for lots of excellent reasons but have to see their work transformed into machine code in order for it to actually run, i think there is a strong case to be made for NOT using anything except Hz values when communicating with a sound generator.

    if somebody could show how the operation of the generator/synth might rely on knowing something about the tuning in use, that would be a useful counter-example.
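
    A tiny illustration of the difference (the message and table formats here are invented, just to show where the tuning metadata has to live):

    <code>
    # Approach 1: absolute pitch -- no metadata needed, the receiver just produces 261.63 Hz.
    msg_absolute = {"pitch_hz": 261.63}

    # Approach 2: relative pitch -- meaningless until a tuning table maps degrees to frequencies.
    tuning_just_c = {0: 261.63, 1: 294.33, 2: 327.04, 3: 392.45, 4: 523.26}   # 1/1, 9/8, 5/4, 3/2, 2/1 on C
    msg_relative = {"degree": 2}
    pitch_hz = tuning_just_c[msg_relative["degree"]]   # 327.04 Hz -- unusable without the table
    </code>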

  • Noah DiNoia

    Informative thread… it has had a result similar to stepping into a time machine, coming out sometime before the MIDI specification was 'decided upon', and reviewing what attempts at creating 'shareable instructions' (i.e. forms of notation) were explored for composition of electronic music prior to settling on the omnipresent piano-roll editor some of us have been 'brought up on'. I recall seeing the liner notes to Brian Eno's "Thursday Afternoon" and thinking "what is all THAT?!", and then having a similar thought process while reading a book of "Interviews with The Composer" which featured a look into the way Karlheinz Stockhausen approached notation of his experiments. Having only just completed reading this article and its comments (I've yet to explore all of the links mentioned in the comments, although CopperLan piqued my curiosity, I'll admit), it seems somewhat serendipitous to have found a mention of a book exploring alternate forms of notation elsewhere.

    Courtesy of Brain Pickings:
    "Notations 21 … the ambitious 320-page volume by Theresa Sauer and Mark Batty Publishers reveals how 165 composers and musicians around the world are experiencing, communicating and reconceiving music visually by reinventing notation." http://www.brainpickings.org/index.php/2011/05/06

    I realize this isn't much of a contribution to what's being discussed here, but perhaps it may serve as a reminder that many artists have in fact, over the course of composing their works, come up with their own symbolic representations of their musical ideas. Perhaps an old/abandoned idea could serve as the foundation for a new one?

    Great thread, anyway. Inspired me to look outside of (self-)imposed limitations of how music "has to be" written and consider how it "can be" written/notated.

  • Anonymous

    This is hardly speaking from a technological standpoint; one of the biggest issues I have with it is that MIDI is expressed as hexadecimal data, and yet you assumed that '60.5' would be the pitch halfway between notes 60 and 61, which is far from the truth, as there would be no way to represent halves within any given integer range. Although this is great for speed, portability, and compactness, it sucks for expressiveness.

    Imo, OSC should be the perfect format for this, although in some instances something like MusicXML would work better (mainly for sheet music, but it retains all data better & more legibly).