
It’s not that easy to synchronize programmed animation and dynamically generated sound in Flash: to have something happen on screen at exactly the same time a sound starts to play.
I thought about it a bit and came up with two different scenarios where audio and visuals need to be synchronized:
- The first is where a sound plays and something visual on screen happens to accompany that audio event.
- The second is the other way around. Something happens on screen that should be accompanied by a sound.
In the first case the audio is the initiator, in the second case the animation. For the second case I included a Flash example here with the source code available.
1. Animation follows audio
This is the simpler scenario. For example: There’s an online drum machine built in Flash that plays a drum sound on every beat. A kick drum in a techno track for example. On the machine’s interface there’s a small red LED light (a MovieClip) that should blink each time the drum hits.
The drum machine has a steady rhythm set in BPM (Beats Per Minute). So once the machine has started it’s always known when the next drum hit will happen. Here’s a graph of the events:
- (a), (b), (c) … are the moments in time when a SampleDataEvent is received and the buffer is filled with samples.
- (p) is where the drum hit should happen. Because the drum BPM and buffer refresh rate are completely independent the drum hit will almost always start somewhere between two buffer refreshes.
- (q) is where the drum sound is actually heard through the speakers. There is a delay between when the samples are sent out from Flash and when they are played through the speakers. This is the latency: the time the soundcard needs to process the samples and send them as analog audio to its outputs. Latency is the duration from (p) to (q).
To synchronize the two it’s a simple addition: duration (a)(p) plus duration (p)(q).
- (a)(p) is the time from the first sample in the buffer to the first sample of the drum hit (and to convert this duration in samples to a duration in milliseconds it must be divided by 44.1 (sample rate / 1000)).
- (p)(q) is the latency. That’s (SampleDataEvent.position / 44.1) - SoundChannel.position. (See the AS3.0 Language Reference on SampleDataEvent.)
- The result is the total delay in milliseconds. A Timer could be started when the buffer is filled at (a) so that the LED blinks at the frame nearest (q).
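As a sketch of that arithmetic (in TypeScript for illustration; the parameter names mirror the Flash properties mentioned above, and all the numbers are made up):

```typescript
const SAMPLES_PER_MS = 44.1; // 44100 samples per second / 1000

// (a)(p): samples from the start of the buffer to the drum hit,
// converted to milliseconds.
function bufferOffsetMS(hitSampleIndex: number): number {
  return hitSampleIndex / SAMPLES_PER_MS;
}

// (p)(q): the latency, i.e. how far actual playback
// (SoundChannel.position, already in ms) lags behind the samples
// Flash has already sent out (SampleDataEvent.position, in samples).
function latencyMS(sampleDataPosition: number, channelPositionMS: number): number {
  return sampleDataPosition / SAMPLES_PER_MS - channelPositionMS;
}

// Total delay before the LED should blink: (a)(p) + (p)(q).
function blinkDelayMS(
  hitSampleIndex: number,
  sampleDataPosition: number,
  channelPositionMS: number
): number {
  return bufferOffsetMS(hitSampleIndex) + latencyMS(sampleDataPosition, channelPositionMS);
}
```

With made-up numbers: a hit 2205 samples into the buffer is 50 ms in; if Flash has sent out 88200 samples (2000 ms worth) while the channel reports 1930 ms, the latency is 70 ms, so the Timer for the LED should fire 120 ms after (a).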
2. Audio follows animation
Download the source code of this example.
This second scenario is a bit more complicated. Hopefully someone else has a simpler solution, but until then this is the one I use.
Click the ‘Start’ button in the example above: Balls bounce on the floor and each time one of them hits the floor a ‘boing’ sound plays. A simple implementation of Newton’s laws of motion.
A graph of the events:
- (a), (b), (c) … are the times when a buffer is requested and filled. Same as in the previous example.
- (1), (2), (3) … are frames where Flash updates the screen (Event.ENTER_FRAME).
- (p) is where the ball hitting the floor is calculated to happen.
- (q) is where the sound is heard.
- (r) is where the ball hitting the floor is drawn on screen.
What’s going on?
The trick is to delay drawing the animation for a ‘latency’ amount of time.
- First calculate where the ball WOULD be (but do NOT draw it yet).
- Fire a sound if the calculation shows the ball would hit the floor.
- Draw the animation with a delay determined by the latency.
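A minimal sketch of those three steps (TypeScript for illustration; this is not the example’s actual code): positions are recorded as they are calculated, and the renderer later looks up the position from ‘latency’ milliseconds in the past, interpolating between the two nearest records.

```typescript
interface Snapshot { time: number; y: number; }

const history: Snapshot[] = []; // newest entries first, like the article's _history

// Record a calculated position (called when the audio buffer is filled).
function store(time: number, y: number): void {
  history.unshift({ time, y });
  if (history.length > 100) history.length = 100; // keep history bounded
}

// Position at a `time` in the past, linearly interpolated between the
// closest stored records before and after it.
function positionAt(time: number): number | undefined {
  for (let i = 1; i < history.length; i++) {
    if (history[i].time <= time) {
      const before = history[i];     // older record
      const after = history[i - 1];  // newer record
      const alpha = (time - before.time) / (after.time - before.time);
      return before.y + alpha * (after.y - before.y);
    }
  }
  return undefined; // not enough history yet
}
```

On each frame the ball would then be drawn at positionAt(now - latency).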
That is the principle of how it works. The actual implementation is a bit more involved:
- At (a), (b), (c) … (each SampleDataEvent.SAMPLE_DATA) the position of the ball is calculated. The current time and position of the ball are stored in a Vector called _history.
- If the calculation shows the ball will hit the floor during the current buffer (at (p)):
- The time from the start of the buffer till the hit is calculated (a)(p).
- The sound is started at time offset (a)(p) in the buffer.
- Meanwhile at (1), (2), (3) … (at each Event.ENTER_FRAME) the screen is updated.
- To know where to draw the ball the _history of the ball is used.
- First get the current time and subtract the latency. This is the moment in the past that should be drawn on screen.
- Search the ball’s _history for that moment in the past.
- It’s very unlikely, however, that a position was stored at exactly the required millisecond; more likely it falls somewhere between two stored moments. So get the closest positions before and after the requested time and use linear interpolation to draw the ball at the right spot between the ‘before’ and ‘after’ records.
So what does the code look like? I’m afraid this explanation might be more complex than the actual code, so I’ll add some of the relevant ActionScript.
To understand it you should know that there is an AudioDriver class that listens to the SampleDataEvent. On each SampleDataEvent it sends an ENTER_BUFFER notification (this is PureMVC) to notify all Ball instances to store their position. To do this the function enterBuffer is called on each Ball instance:
public function enterBuffer() : void
{
    var hit : Boolean;
    if(_isDragging)
    {
        _vy = 0;
    }
    else
    {
        _vy += _gravity;
        _y += _vy;
        /** If the ball hits the floor. */
        if(_y > _floorY - _radius)
        {
            var distance : Number = _vy;
            var overlap : Number = _y + _radius - _floorY;
            _hitAlpha = (distance - overlap) / distance;
            _y = _floorY - _radius - overlap;
            _vy *= -1;
            hit = true;
            dispatchEvent(new Event(Ball.BOUNCE));
        }
    }
    /** Store as data for later use. */
    _history.splice(0, 0, new BallVO(_x, _y, _vx, _vy, hit));
    /** Keep maximum 100 history items so the maximum latency is buffer * 100. */
    if(_history.length > 100)
    {
        var n : int = _history.length;
        while(--n > 99)
        {
            _history.pop();
        }
    }
}
This function does what was expected: a new position is calculated and stored in a BallVO value object, the BallVO is added to _history, and a bit of maintenance is done to keep _history’s length within bounds.
The only other thing is the _isDragging Boolean. Balls can be dragged to a new position.
The AudioDriver also sends an ENTER_FRAME notification. This is just like a regular Event.ENTER_FRAME from a DisplayObject, except that it has an AudioStreamInfoVO value object as an argument. The AudioStreamInfoVO has a ‘latencyInMS’ property that holds (as you’ll have guessed) the latency in milliseconds by which the animation should be delayed.
public function enterFrame(infoVO : AudioStreamInfoVO) : void
{
    graphics.clear();
    graphics.lineStyle(2, 0xFFFFFF);
    graphics.beginFill(0xBBBBBB, 0.1);
    if(_isDragging)
    {
        graphics.drawCircle(_x, _y, _radius);
    }
    else
    {
        var time : Number = new Date().getTime() - infoVO.latencyInMS;
        var i : int = -1;
        var n : int = _history.length;
        while(++i < n)
        {
            /** Get the data from immediately before and after 'time'. */
            if(_history[i].time < time && i > 0)
            {
                var before : BallVO = _history[i];
                var after : BallVO = _history[i - 1];
                var alpha : Number = (time - before.time) / (after.time - before.time);
                /** If a bounce on the floor happened between before and after. */
                if(before.vy > 0 && after.vy < 0)
                {
                    /** Linear interpolation to correct the vertical position. */
                    var maxY : Number = _floorY - _radius;
                    var partBeforeBounce : Number = maxY - before.y;
                    var partAfterBounce : Number = maxY - after.y;
                    var bounceAlpha : Number = partBeforeBounce / (partBeforeBounce + partAfterBounce);
                    if(alpha < bounceAlpha)
                    {
                        var y : Number = before.y + (before.vy * bounceAlpha);
                    }
                    else
                    {
                        y = after.y + (before.vy * (1 - bounceAlpha));
                    }
                }
                else
                {
                    /** Regular linear interpolation for a ball in mid flight. */
                    y = before.y + (alpha * (after.y - before.y));
                }
                if(before.hit) graphics.beginFill(0xFFFFFF);
                graphics.drawCircle(_x, y, _radius);
                break;
            }
        }
    }
}
So in this function there’s a variable ‘time’ that holds the moment in the past that must be drawn, the stored ‘before.time’ and ‘after.time’ surrounding it, and a resulting ‘alpha’ which is the interpolation value between before and after. The resulting ball position calculation is this:
y = before.y + (alpha * (after.y - before.y));

The new thing here is that if the ball hits the floor somewhere between before and after (if(before.vy > 0 && after.vy < 0)), then the regular linear interpolation doesn’t give the right visual result and some extra code is necessary. But that’s another story.
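A quick numeric illustration of why plain interpolation fails there (TypeScript, with purely hypothetical values; say the ball touches the floor at y = 100):

```typescript
// Hypothetical history records straddling a bounce on the floor at y = 100.
const before = { y: 90, vy: 10 };  // last record before the bounce, moving down
const after = { y: 90, vy: -10 };  // first record after the bounce, moving up
const alpha = 0.5;                 // halfway between the records: the instant of impact

// Plain linear interpolation stays at y = 90, so the ball never visibly
// touches the floor:
const naiveY = before.y + alpha * (after.y - before.y); // 90
```

The bounce-aware branch in enterFrame exists to pin the ball to the floor at that instant instead.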
None of this code is optimized at all; in this example I wanted to keep everything as clear as possible to read and understand.