Music making with your face? It’s just the latest novel way of manipulating your computer with movement, thanks to a revived interest in camera-based interaction spurred by Microsoft’s Kinect, the hackers who made it work, and other computer vision libraries. One original work is FaceOSC, which uses custom tracking code, a standard computer webcam (no additional hardware required), and free code to send control information to applications like live music performance. Kyle McDonald may have already wowed you with his face-tracking wizardry, but it’s easy to want to know more. Sure, it’s cool, but, um, what is it for? How do you get started? Is the timing quick enough for this to work in music? And what can we expect in the future?

I spoke with Kyle, educator, artist, and coder, about those questions and more. He’s also got some examples of what people are already doing just days after the release of his software – there’s some serious viral quality to open source code.

You’ve done work with both Kinect and now, in FaceOSC, your own camera-tracking software. How is working with Kinect for musical applications, in terms of latency? How have you found latency in your own FaceOSC application?

The only number I’ve heard regarding the Kinect’s latency is from Synthtopia, where they give 100 ms. That seems a little high to me. In my experience, the depth camera seems to have an extra frame of latency compared to the color camera. So I’d put the latency somewhere between 30 ms and 80 ms. In other words: don’t expect it to be a precision tool for live percussion, but for everything else I think there’s just as much to explore as with any other camera.

FaceOSC feels like adding an extra frame or two of latency on top of what you’re getting from your camera. So you shouldn’t expect to beatbox or do percussion with it, but for controlling parameters in a musical context, you should be set.
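
For a rough sense of what "a frame or two" means in musical time, here is a back-of-envelope sketch. The 30 fps frame rate and the 120 BPM tempo are assumptions chosen for illustration, not figures from Kyle, and the function names are made up for this example.

```python
def frame_latency_ms(frames, fps=30.0):
    """Milliseconds taken up by a given number of camera frames.

    The default of 30 fps is an assumption; check your own webcam.
    """
    return frames * 1000.0 / fps


def sixteenth_note_ms(bpm=120.0):
    """Duration of one sixteenth note in milliseconds at a given tempo."""
    # One beat lasts 60000 / bpm ms; a sixteenth note is a quarter of a beat.
    return 60000.0 / bpm / 4.0


if __name__ == "__main__":
    for frames in (1, 2):
        print(f"{frames} frame(s) at 30 fps: {frame_latency_ms(frames):.1f} ms")
    print(f"sixteenth note at 120 BPM: {sixteenth_note_ms():.1f} ms")
```

Under those assumptions, each extra frame costs a noticeable fraction of a sixteenth note, which fits Kyle's advice: fine for sweeping a parameter, less so for tight percussion.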

How do you imagine this being used? I mean, obviously, in some ways this is (very) experimental, if good, clean fun. Is there a practical application? (EyeWriter, the eye-tracking application that improves accessibility, comes to mind as one possibility.)

I imagine FaceOSC being used to prototype ideas surrounding face-based interaction. I created it because Jason Saragih, the researcher behind FaceTracker, uses an open source non-commercial license for his code. He asks that anyone who wants to use the code email him directly, as a way to keep track of the usage. This is great, but I know that one of the fastest ways to get cool stuff happening is to make new tools and research accessible to a wide audience. So I asked him if it would be ok to make a standalone app for people to prototype their ideas — even if they don’t have access to the code. Everyone already “speaks” OSC so I thought this would be the easiest way to get the technology out there. Eventually, if people need to integrate it into a single application, they can contact Jason directly and use my ofxFaceTracker addon to get started:
https://github.com/kylemcdonald/ofxFaceTracker

And if they need to go the commercial route, there’s FaceAPI:
http://www.seeingmachines.com/product/faceapi/
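
Since the whole point of the standalone app is that everything "speaks" OSC, here is a minimal sketch of what the receiving end might look like. It encodes and decodes a basic OSC message by hand (an address string, a type-tag string, then big-endian 32-bit arguments) and scales one value to a MIDI-style 0 to 127 range. The address `/gesture/mouth/height`, the input range of 1.0 to 7.0, and the helper names are all assumptions for illustration; check what your copy of FaceOSC actually sends before relying on them.

```python
import struct


def _pad(raw):
    """Null-terminate bytes and pad to a multiple of 4, as OSC strings require."""
    return raw + b"\x00" * (4 - len(raw) % 4)


def _read_string(data, offset):
    """Read one padded OSC string; return (text, offset of the next field)."""
    end = data.index(b"\x00", offset)
    text = data[offset:end].decode("ascii")
    # Skip the terminator and any padding up to the next 4-byte boundary.
    return text, (end + 4) & ~3


def encode_message(address, *args):
    """Build an OSC message containing only float ('f') and int ('i') arguments."""
    tags, payload = ",", b""
    for arg in args:
        if isinstance(arg, float):
            tags += "f"
            payload += struct.pack(">f", arg)
        elif isinstance(arg, int):
            tags += "i"
            payload += struct.pack(">i", arg)
        else:
            raise TypeError(f"unsupported argument type: {type(arg).__name__}")
    return _pad(address.encode("ascii")) + _pad(tags.encode("ascii")) + payload


def decode_message(data):
    """Parse an OSC message with 'f' and 'i' arguments into (address, args)."""
    address, offset = _read_string(data, 0)
    tags, offset = _read_string(data, offset)
    args = []
    for tag in tags[1:]:  # tags[0] is the leading comma
        if tag == "f":
            (value,) = struct.unpack_from(">f", data, offset)
        elif tag == "i":
            (value,) = struct.unpack_from(">i", data, offset)
        else:
            raise ValueError(f"unsupported type tag: {tag!r}")
        args.append(value)
        offset += 4
    return address, args


def to_midi_cc(value, lo, hi):
    """Clamp value to [lo, hi] and scale it to an integer in 0..127."""
    t = min(1.0, max(0.0, (value - lo) / (hi - lo)))
    return int(round(t * 127))


if __name__ == "__main__":
    # Hypothetical address and range; stand-in for bytes received over UDP.
    packet = encode_message("/gesture/mouth/height", 4.0)
    address, args = decode_message(packet)
    print(address, args, "-> CC value", to_midi_cc(args[0], 1.0, 7.0))
```

In practice the bytes would arrive over a UDP socket rather than from `encode_message`, and an existing OSC library for your environment will handle bundles and the other argument types this sketch leaves out.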

As far as practical applications go, I could imagine it augmenting the way the computer understands us. I’ve been thinking a lot about this recently. Your computer has a microphone to listen to you, an accelerometer to know when you drop it, a camera to watch you, an ambient light sensor to know how bright the screen should be. I have to wonder if it makes sense to respond to our pose and facial expressions.

That said, here are the few experiments I’ve seen so far:


http://jeffwinder.blogspot.com/2011/07/face-gestures-faceosc-and-flash.html

Ingredients above:

FaceOSC
+
RoboFab’s Glyph Math
robofab.org
+
Vanilla
code.typesupply.com
+
Ideal Sans
http://www.typography.com/fonts/font_overview.php?productLineID=100042

Want to talk at all about your approach to developing this — particularly as you’ve been teaching others?

I think everyone learns differently, but for me I learn by playing. So I try to make it easy for other people to play by providing interfaces like FaceOSC (or, with 3d scanning, via my structured light work).

Anything else musicians might want to know about your work?

I haven’t spent enough time recently making music, but I’m always thinking about things in musical terms. My older work has a lot of musical interfaces and ideas scattered through it; if you dig through http://kylemcdonald.net you might find some inspiration there.

Thanks, Kyle! If you’re in NYC, as Kyle and I are (or may be, when we’re not traveling to opposite ends of the globe), Kyle has a couple of recommendations. There’s a “no-more-than-monthly” Kinect meetup organized by Sean Kean:
http://www.meetup.com/volumetric/

Also, there’s an amazing “summer school” meetup on July 21. Wish I could be there myself, but I’ll send regards from Berlin. Hope one of the other New Yorkers can report back.
http://eyebeam.org/events/meetup-demo-day

For more on Kinect, we’ve got loads of coverage on Create Digital Motion:
http://createdigitalmotion.com/tag/kinect/

It is, after all, Motion!

Got a creation of your own, or a meetup in your area? Let us know!

  • jonathan d

    haha this is the guy who did the idiotic Apple Store stunt! Oh well… rest of the stuff is pretty cool… really like this method of interacting.

  • http://kylemcdonald.net/ Kyle McDonald

    Just wanted to add that FaceOSC doesn't use the Kinect at all, in case it's unclear. It's just a regular webcam, and it's making some smart guesses about 3d geometry.

    Also, one more amazing demo I just saw: http://vimeo.com/26475997

  • http://www.salon1016.org Bill Cottman

    working on a performance piece called Surface Tensions using Isadora as my performance engine. planning to use gesture detection from a dancer as feedback to redirect the narrative flow. your work with face tracking is very interesting to me. thanks and continued success.

  • Jonah

    I don't have time to more than skim until later, but are the mundane possibilities talked about? I mean effects demos are cool, but the day to day stuff is what matters.

    1) Like look up/down to change mixer tracks, panning, and so on.

    2) Give it the finger to undo!

    3) autosave when you're pinching your temples or furrowing your brow.

    4) I saw mouth control. This could be adapted into (silent) speech recognition!

  • http://www.jeffwinder.nl Jeff Winder

    Another something I did with FaceOSC & Flash, typing with your face :) tinyurl.com/3z88jbm