It’s called “Jelly Bean.” But version 4.1 of Android might also be, at last, a version of Android musicians will find tasty. (The last few versions were a bit more of the disgusting variety from Bertie Bott’s Every Flavor Beans; this one is a bit more Jelly Belly.) Photo (CC-BY-SA) Hermann Kaser.

Android devices may, at last, get the kind of sound performance that makes music and audio apps satisfying to use. We’ve suffered through generations of the OS and hardware that were quite the opposite. But material, measurable changes to software, combined with more rigorous standards for hardware makers, could change all of that soon. And using the free, cross-platform libpd library, you can be ready now to take advantage of what’s coming.

If you’re using an app that involves sound or music, the performance of the underlying OS and hardware will make a big difference. That’s doubly true if you’re using an app that simulates a musical instrument, because you’re more likely to notice how responsive it is in real-time. If the hardware, OS, and apps add too much overhead, you can experience symptoms like clicks and pops – or, in order to prevent those glitches, app developers might have to create big sound buffers that make the apps less responsive. The challenge for the Android platform, then, is to eliminate that overhead and get sound from the app to your ears as directly as possible.
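To put some numbers on that tradeoff, the arithmetic is simple: a buffer’s contribution to latency is just its length in frames divided by the sample rate. Here’s a quick sketch (illustrative only – on a real device, driver, mixer, and hardware stages stack additional latency on top):

```java
// Illustrative only: how audio buffer size translates into per-buffer latency.
// Real-world latency also includes driver, mixer, and hardware stages.
public final class BufferLatency {
    public static void main(String[] args) {
        int sampleRate = 44100;                // frames per second
        int[] bufferSizes = {128, 1024, 4096}; // frames per buffer
        for (int frames : bufferSizes) {
            double ms = 1000.0 * frames / sampleRate;
            System.out.printf("%5d frames -> %5.1f ms per buffer%n", frames, ms);
        }
        // 128 frames is about 2.9 ms; 4096 frames is about 92.9 ms -- which is
        // why padding an app with big buffers makes it feel sluggish.
    }
}
```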

In brief, what’s changing in Android 4.1 “Jelly Bean” and supported devices:

  • Low latency audio playback capability, via a new software mixer and other API improvements.
  • Latency targets below 10 ms. For now, low-latency playback (which would in turn impact round-trip latency) refers only to the Samsung Galaxy Nexus; other devices should get similar performance enhancements, but there’s no official word yet on which, or when.
  • Strict maximum latency requirements for third-party vendors at some time in the future.
  • Enhanced OS features, too: USB audio device support, multichannel audio (including via HDMI), recording features, and more.

But let’s talk a bit about why this is important – in terms that are hopefully understandable whether you’re a developer or not.

It’s tough to argue with the importance of this, if you know anything about what makes a good sound experience in software. Human beings share the same basic powers of perception and hearing. You don’t have to be a “pro musician” to notice when sound and music apps aren’t responsive. Untrained ears will respond immediately – and unforgivingly – to crackles, pops, and delays in sound. (That last issue, described as “latency,” is subtler but no less important – you tap on a screen, and you expect an immediate response. Users often respond negatively to even minute increases in delay, whether consciously or not.)

So, when we talk about “high performance audio,” we really mean something for anyone using sound in apps. It’s easy to understand why it’s important. It’s just hard to do from an engineering standpoint.

Raising the Bar, Lowering the Latency

“Latency” isn’t one metric. Think of speed in a race car. Every little adjustment to the engine, gearing, weight, and aerodynamics has an impact, and the effects are cumulative. You can’t just get one factor right; you have to get all of them right. That’s what made Android’s dismal audio performance complex to describe in the past: it has been a combination of factors, with incomplete functionality in the developer API, a system mixer that added latency, and device-specific issues being three major culprits.

Apple has done an excellent job with this on iOS, which contributes – justifiably so – to its near-complete dominance of mobile music apps. But that should not be taken to mean it’s impossible to achieve low-latency audio performance when working with a variety of hardware vendors. The Windows (or Linux) PC is a great example – both of what works (extremely low latency across devices using an API like ASIO) and what doesn’t (general-purpose mixers that drive latencies past a tenth of a second).

Based on what Android developers are saying, the platform is at last moving in the right direction – in stark contrast to what we currently know about Windows 8 on mobile. The very same issues I raised last week in my criticism of what’s documented in Windows RT are the ones Android developers are at last addressing. Based on current information, Windows RT and the WinRT/Metro library for desktop and mobile not only lack “nice-to-have” features like USB and multichannel audio on a new generation of Windows tablets, but also would seem to set unacceptable latency targets — 100 ms+, or where Android was a year ago. We’re hoping to find out that there’s more we don’t know, but we’re awaiting more information from Microsoft. In review:

Music Developer on Windows 8: A Leap Forward for Desktops; A Leap Backward for Metro, WinRT?

In that article on Windows 8, I talked about the importance of having the ability to route sound through a system mixer without other sounds pre-empting what you’re doing. That’s what WinRT and Windows RT each appear to be lacking (despite their name), and what Android may finally get. (It’s what iOS, Windows, Linux, and Mac OS X all have in some form.)

Back to the car metaphor: imagine you’re in a drag race. Would you want to hit the accelerator to the floor on a nice, deserted street? Or would you like to do it in the middle of an on-ramp to the I-5 in downtown LA during rush hour?

Generic system mixers often look too much like the latter scenario. And this is what Android’s development team explained in a Q&A at Google I/O last week. Calling the project “a work in progress,” they claimed a sub-10 ms “warm playback” target as their goal. (“Warm playback,” I believe, means latency once an audio engine is started and there’s something to play in the sound buffer; someone correct me if they have a different interpretation.)

As one Android developer puts it, regarding system mixers: “Once you get to that level, anything at all that’s in the system that’s pre-empting you becomes problematic.”

Photo (CC-BY) LAI Ryanne.

Android developers are clearly making progress on the software. The tougher challenge is likely to be coordinating with hardware vendors. On the Samsung Galaxy Nexus handset – a device over which Google has more control – they’ve already improved latency from 100 ms in “Ice Cream Sandwich” (4.0) to “about 12 ms” in “Jelly Bean” (4.1), and they want to do even better. 12 ms is usable; sub-10 ms could really attract sound developers to the platform. Note: I’d like to hear more about the Nexus 7 tablet, but for now, there has been no mention of sound performance on that hardware, which is made by Asus, not Samsung.

The “Fast Mixer” will work with the SoundPool, ToneGenerator, and OpenSL APIs. We’re most excited about the OpenSL API, as we’ve already heard from developers getting performance gains on previous OS versions using that tool. (More on how to use it with the free libpd library below.)
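Of the entry points in that list, SoundPool is the simplest for firing short samples from Java. As a rough illustration of the kind of playback path the Fast Mixer is meant to serve, here’s a minimal sketch – the sample path is hypothetical, and this shows the API as it stands today:

```java
import android.media.AudioManager;
import android.media.SoundPool;

// Minimal sketch: firing a short sample through SoundPool, one of the
// playback paths the new Fast Mixer is meant to serve.
class ClickPlayer {
    // Up to 4 overlapping streams, routed through the music stream.
    private final SoundPool pool = new SoundPool(4, AudioManager.STREAM_MUSIC, 0);
    private final int clickId;

    ClickPlayer() {
        // Hypothetical sample path; loading is asynchronous, so a real app
        // should wait for SoundPool.OnLoadCompleteListener before playing.
        clickId = pool.load("/sdcard/click.wav", 1);
    }

    void play() {
        // left/right volume, priority, loop count (0 = no loop), playback rate.
        pool.play(clickId, 1f, 1f, 1, 0, 1f);
    }
}
```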

The key variable here is when you’ll actually see devices reaching these targets in the field. Unfortunately, that may be more of a waiting game. Google says they want to “get to a point” at which they’re mandating maximum latency, but it’s unclear when that will be.

Given the wildly variable experience on current devices, my guess is that developers targeting Android will be pretty tough on minimum requirements. If at least sub-15 ms latencies become the norm on Jelly Bean devices, I could see making the 4.1 version of the OS a prerequisite, thus avoiding complaints from users when your app doesn’t behave as expected.

Jelly Bean in general promises some encouraging improvements: at last, we see a focus on hardware accessories, high-quality audio, and high-quality animation at high frame rates. These are some of the things that make iOS so much fun to use and so satisfying to users and developers alike. Pundits have rightly echoed Apple in saying that software isn’t just about specs. But these aren’t just empty specs: they’re concrete measurements of the qualities people experience on a sensory level when using software.

http://developer.android.com/about/versions/jelly-bean.html

More Audio Goodness

I’m actually equally enthusiastic about visual changes, but on the sound side, 4.1 delivers a lot of much-needed functionality.

Some of this is more consumer-oriented, but here’s the full list:

  • USB Audio support, exposed through the Android Open Accessory Development Kit
  • Multichannel audio over HDMI (Android-based surround sound installation, anyone?)
  • Access to platform hardware and software media codecs (with lots of cool low-level features)
  • Audio record triggering
  • Audio preprocessing
  • Audio chaining, which means (among other things) support for seamless playback
  • Media routing (a bit closer to what Apple does with AirPlay)

Out of the list, of course, what’s interesting to us musicians is really multichannel HDMI and USB audio. I’ll be looking into how USB audio is implemented, whether there’s USB audio class support as on iOS, and – just for kicks – whether USB MIDI support is possible.

All of this is described in the Jelly Bean overview, though developers will want to go ahead and grab the SDK for further detail.

Source: Android OS Developers

Lest you think a bunch of Android fanboys just dreamed this up, here’s the evidence from Google I/O.

Audio latency (and other functionality) is directly mentioned in Google’s presentation of what’s new in 4.1:
http://www.youtube.com/watch?v=Yc8YrVc47TI&feature=player_detailpage#t=1366s

From the developer sessions, the point where a look at API changes shifts to the audio functionality:
http://www.youtube.com/watch?v=Yc8YrVc47TI&feature=player_detailpage#t=3280s

An extended discussion of what’s changed in the “fireside” Q&A with the dev team:
http://www.youtube.com/watch?v=UGJbPPjANKA&feature=player_detailpage#t=3552s

Thanks to everyone who sent this in; in particular, there’s a terrific thread on KVR Audio:
There Are Low Latency Audio Improvements in Android 4.1 (Jelly Bean) [KVR Audio Forums]

OpenSL and libpd

For you developers out there, OpenSL is a big deal. (And users, just … trust us on this. You’ll soon be experiencing better Android apps as a result.)

Peter Brinkmann is the principal developer of libpd. For those of you just joining us, libpd is the free and open source embeddable version of Pure Data that now runs across desktop and mobile OSes. It actually is core, vanilla Pure Data – not a fork – but provides additional support around that core to make it easier to use Pd patches inside other apps. Peter has been working hard on OpenSL support – and has found reason to be excited about what’s coming.

Peter notes that OpenSL is working better than standard Java audio output across all test devices. You can use it with the libpd library, but other Java, C, and Processing for Android developers should benefit, too.
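To give a sense of how little code is involved on the app side, here’s a minimal sketch of bringing up libpd audio on Android, based on the pd-for-android API as I understand it – see Peter’s book and the project docs for the authoritative version:

```java
import android.content.Context;
import java.io.IOException;
import org.puredata.android.io.AudioParameters;
import org.puredata.android.io.PdAudio;

// Minimal sketch: initializing libpd audio, e.g. from an Activity's onCreate.
// Error handling is omitted for brevity.
void startPdAudio(Context context) throws IOException {
    AudioParameters.init(context); // probe what this device supports

    int sampleRate = AudioParameters.suggestSampleRate();
    int inChannels = AudioParameters.suggestInputChannels();
    int outChannels = AudioParameters.suggestOutputChannels();

    // ticksPerBuffer trades latency for stability; one tick is 64 Pd frames.
    PdAudio.initAudio(sampleRate, inChannels, outChannels, 1, true);

    // In the OpenSL branch, the backend (Java audio vs. OpenSL ES) is chosen
    // transparently, per Peter's note below.
    PdAudio.startAudio(context);
}
```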

Oh yeah, and the design of the OpenSL branch should be useful on platforms other than Android, too: Peter notes that he already has “a prototype using PortAudio that took about 15 minutes to implement and runs nicely on my Mac. Porting this to other platforms is just a matter of doing the grunt work of adjusting makefiles and such. In a nutshell, this promises to turn libpd into a viable stand-alone audio solution for Java, without necessarily needing to use something like JavaSound.” (Sorry for quoting your email, Peter, but in the hopes other people might help the two of us Peters on this, I’m going to put this out there for everyone.)

Well worth reading: Peter has a series on libpd and OpenSL ES. It shows a bit of the future of Android audio development, and also the future of cross-platform free software development for sound for Pd (and other platforms, too) well beyond Android.

libpd and OpenSL ES, Part I: Squaring the circle

libpd and OpenSL ES, Part II: Yet another JACK-like API for Android

libpd and OpenSL ES, Part III: Receiving messages

libpd and OpenSL ES, Part IV: Extending the API

And to take advantage of this in libpd on Android, Peter writes:

I just pushed a new branch of pd-for-android that supports either AudioTrack/AudioRecord for FroYo or earlier, or OpenSL ES for Gingerbread or later. I managed to square the circle and make the entire transition more or less transparent to developers. If you limit yourself to the part of the API that I discuss in my book (i.e., everything but the low-level audio processing methods), then your apps won’t need to be adjusted at all.
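For reference, the high-level part of the API he’s describing – loading a patch and sending it messages – looks roughly like this. (A sketch only; the patch file and receiver names are hypothetical.)

```java
import java.io.File;
import java.io.IOException;
import org.puredata.core.PdBase;

// Sketch: code at this level stays the same whether AudioTrack or OpenSL ES
// is doing the actual audio I/O underneath.
void triggerNote(File patchDir) throws IOException {
    File patch = new File(patchDir, "synth.pd");        // hypothetical patch file
    int handle = PdBase.openPatch(patch.getAbsolutePath());

    PdBase.sendFloat("pitch", 60);   // assumes a [receive pitch] in the patch
    PdBase.sendBang("trigger");      // assumes a [receive trigger]

    PdBase.closePatch(handle);
}
```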

As it happens, the ability to switch between audio engines could be relevant in the future when using JACK, Core Audio, ASIO, and the like on other operating systems.

For everything you need for libpd, including Peter’s superb book that will get you started on Android and iOS development, head to our minisite:
http://libpd.cc

And stay tuned here. We’ll keep bringing you audio and music news for users and developers alike, whatever the platform.


  • http://twitter.com/pneuman42 Leigh Dyer

    Thanks a lot for digging into those details — I caught the mention of reducing latency on the Galaxy Nexus to 12ms in the fireside chat today, and I was hoping you’d caught that, too. I upgraded to a Nexus just last week, so now I’m really keen to see how developers run with this, provided that at least this year’s other flagship devices get timely updates.

    • Noel

      It’s encouraging that Jelly Bean is making progress in this area. Thanks for the article.

  • Oootini

    looking forward to seeing these jelly bean improvements on android handsets in 2 or 3 years’ time.

    • http://pkirn.com/ Peter Kirn

      I think you may be missing some of the point here.

      OpenSL is available right now, today, with enhanced audio performance using Peter’s code. And we’ve already gotten indications from developers that it can improve performance on earlier OS revisions.

      Jelly Bean is available now to use on the Nexus. And it appears we’ll get a faster rollout than on other OSes – definitely worth noting after the ICS debacle.

    • http://pkirn.com/ Peter Kirn

      That said, yes, I’m concerned about two variables:

      1. When we do see 4.1 on non-Nexus devices.

      2. If and when the Android team can deliver on max latency requirements with OEMs. (It’s easier said than done, and I’m not clear from their answer whether this even happens with 4.1. This is not the same issue as when 4.1 rolls out.)

      All of this is making the Nexus 7″ tablet look compelling to me, though, if they’re getting similar low-latency performance on that as on the phone.

  • http://nettoyeur.noisepages.com/ Peter Brinkmann

    Quick note for libpd developers:  For the time being, OpenSL support is still in an experimental branch.  In order to get it, you need to clone Pd for Android as usual and then say “git checkout opensl” and “git submodule update”, in that order.

    I’ll merge the OpenSL bits into the master branch once we’ve had a chance to do a little more testing.  Any feedback would be appreciated!

  • gwenhwyfaer

    I was wondering why I might want a Nexus 7. This’d do it.

    • gwenhwyfaer

       (Although latency is less of an issue for things like trackers, or anything else which takes an essentially step-time approach.)

    • http://pkirn.com/ Peter Kirn

      It is, but that’s why it’s smart not to view this as *only* baseline latency measurements.

      If you can get reliable access to the mixer without getting pre-empted, then you’re also likely to avoid clicks and pops and the like. That’s important to everyone in all applications always. 

      I’m still not sure precisely which models these guys mean when they say “Nexus,” though, so I’m researching that very issue.

    • http://pkirn.com/ Peter Kirn

      By the way, I wouldn’t run out and buy a Nexus 7 yet. I confirmed that the language being used here is “Samsung Galaxy Nexus,” not the Asus-made Nexus 7 tablet.

      I sure as heck *hope* that the Nexus 7 makes audio a priority, too.

    • gwenhwyfaer

       Hence the conditional tense :)

  • http://www.facebook.com/qotile Paul Slocum

    Even iOS doesn’t get anywhere close to 10ms in practice, so I’ll believe the numbers when I see it working.  10ms may be the latency of the primary audio buffer, but based on my tests with iOS, actual latency of touch-to-sound or midi-to-sound is 40ms-100ms.  Touch processing has delays, MIDI has delays, and there can be other delays in the system too.

    • http://pkirn.com/ Peter Kirn

      No, that’s right. It’s a useful metric, but it’s not round-trip latency. 

    • Victor

      Well, the shortest buffer size on iOS is 128 frames, which at 44.1kHz gives us 2.9ms. Round trip is double that, 5.8ms. On Android, we are talking about 4096 samples if we’re lucky – a round trip of around 185ms. I’m glad Google is listening. I want to see it first before I believe it.

    • http://www.facebook.com/qotile Paul Slocum

      BTW: To test touch-to-sound latency, set up a mic and the device’s output to record simultaneously, and check the difference in time between your finger tapping on the screen and the sound output from the device in your DAW.  You can do the same with a MIDI device to test MIDI, but you have to know the exact latency of the MIDI device you’re testing with and subtract that out.  I haven’t tested audio in/out latency, but you could use a similar setup.

  • Victor

    As far as I know, no ICS device gives 12ms latency, so I am still doubtful about this < 10ms talk. I am still waiting for an upgrade to ICS for the Galaxy Tab to test this. Currently, round-trip latency is about 200ms. This is what I get with Csound and with a test app, both using OpenSL ES.

    • http://pkirn.com/ Peter Kirn

      No one is saying you’re going to get 12ms latency (playback or otherwise) on ICS. It’s important to read the fine print on this, generally:

      1. We’re talking playback latency, not round-trip latency.
      2. We’re only talking 4.1, with some significant changes, it seems, to how the system mixer works.
      3. For now, Google is making this claim only in regards to Galaxy Nexus, not any other device. Now, they do say in the video above that it’s a “goal” to work on other devices, too. But this is both an OS and a hardware/firmware thing, getting this working right.
      4. Our claims about OpenSL are not only in regards to latency, but more generally about audio performance and – as near as I and other developers can tell – overhead in the Java APIs. (This is not necessarily a function of “Java,” but certainly those Java APIs as engineered, in practice.)

      Your mileage may vary; these are my own, speculative opinions on the matter.

      In other words, I think your impulse is right, but it does sound like there’s some significant progress here.

  • http://www.facebook.com/profile.php?id=669912338 Göran Sandström

    My new app Synchroid uses libpd so it will be interesting to see how the latency is improved when I get to test it on a JB phone.

  • RJ Marsan

    I was the dude in back cheering when they mentioned that during the fireside chat.

    Except until I actually see it, I won’t believe it.

  • Aphelion

    USB playback is all fine and dandy, but what about recording options… tabs like the Acer Iconia series have a full-sized USB 2.0 port as well as a mini-USB data port. Would be cool if a “system compliant” USB audio input standard were available. So far it looks like USB recording is not going to happen in 4.1.

    • Nirmala

      It seems USB playback did not make it into 4.1 after all, but there is a new app called USB Audio Recorder Pro that allows use of external USB DACs and is designed mostly as a pro recording system. The developer says he will also be adding some additional features for playback, which is very basic right now.

  • http://www.wemakesound.co.uk/ wms

    Aaannd so… 3 months on, has there been any movement regarding this, or is development subject to the fragmented OS versions? I mean, we haven’t heard a squeak from audio tech companies supporting this, have we?

    • Conrad

      Precisely. I have a Nexus 7. Audio latency is really not that impressive at all. Not sure how app-dependent it is, but most pianos, drums, guitars, etc. are still unusable. No change in the last few months at all. Something must still be very wrong when a fairly decent piece of modern tech behaves like something from the early nineties…

  • Pablo

    I work on a real-time communication product where one of our major Android-related problems is big delay (Android waveout introduces around 150-200ms of extra delay).
    So, I was tasked with a project and actually evaluated the benefit of OpenSL ES. We have around 100 models of different mobiles, and I tested the most common ones. I hate to disappoint you and the libpd developer: almost all phones performed worse with OpenSL (i.e., the delay was even bigger!). Is there any scientific test that backs up the “greatness” of OpenSL? Not a single Android phone had playback faster than 100ms. The best was a Sony mobile with 105ms delay in OpenSL mode; most other phones (including Samsungs) performed a bit worse with OpenSL (10-20ms slower).

    • Pablo

      Just to clarify, I measured round-trip time, since wavein and waveout both contribute to overall delay in a voice call. The best round trip was 105ms, with OpenSL on a Sony phone. At the same time, OpenSL lacks many features that are present on phones (like built-in echo cancellation; these aren’t accessible from OpenSL, so it’s not OpenSL’s fault), so, overall, we totally scrapped the OpenSL idea. Personally, I hope it doesn’t become commonly used: I’d rather learn the different quirks of the native API on each OS than some super-weird OpenGL-like API for sound.

  • Ashish Yadav

    Hey,

    Is it possible to tackle this issue with separate, dedicated hardware – like an external soundcard?
    I’m a beginner planning to develop an app for Android, something like Positive Grid’s JamUp Pro, but I really don’t have much experience in deciding whether it’s possible or not.

    Thanks