The general idea behind MFAudioBufferClass is similar to that of System.Collections.Generic.Queue<T> (a FIFO collection) in C#. Instead of items of type T, however, MFAudioBufferClass works with individual audio samples, which are stored in separate audio channels. As a result, MFAudioBufferClass lets you shift audio relative to video frames with an accuracy of one audio sample.
Each instance of the MFAudioBufferClass represents a list of FIFO collections of individual audio samples. The number of these collections corresponds to the number of audio channels in the buffer, while the sample format matches the audio data format. You can add, remove, extract, amplify, attenuate, mix, and mute audio channels within the MFAudioBufferClass as needed.
The main objective of this article is to describe the available methods in more detail, with examples and additional explanations.
The structure of the MFAudioBufferClass is as follows:
Important Note! One of the fundamental characteristics of the MFAudioBufferClass is that the number of audio samples in all audio channels is ALWAYS equal. This means that if you remove audio samples from one channel, the same number of samples will be removed from all remaining channels in the buffer. The same principle applies when adding audio samples: if you add samples to one of the channels, the remaining channels will be filled with silence to ensure that the number of samples is consistent across all channels.
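The rule above can be pictured with a small toy model. This is only a sketch and not the SDK implementation: the ToyChannelBuffer class and its Put, Remove, and Peek members are hypothetical names invented for illustration, and 16-bit samples are assumed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model only: NOT the SDK implementation. It mimics the documented rule
// that every channel always holds the same number of samples.
public sealed class ToyChannelBuffer
{
    private readonly List<Queue<short>> _channels;

    public ToyChannelBuffer(int channelCount)
    {
        _channels = Enumerable.Range(0, channelCount)
                              .Select(_ => new Queue<short>())
                              .ToList();
    }

    // Samples currently stored in each channel (identical for all channels).
    public int SamplesPerChannel => _channels[0].Count;

    // Append samples to one channel; pad every other channel with silence (0).
    public void Put(int channel, short[] samples)
    {
        for (int i = 0; i < _channels.Count; i++)
            foreach (short s in samples)
                _channels[i].Enqueue(i == channel ? s : (short)0);
    }

    // Drop the oldest samples from every channel at once (FIFO order).
    public int Remove(int count)
    {
        int removed = Math.Min(count, SamplesPerChannel);
        foreach (Queue<short> q in _channels)
            for (int i = 0; i < removed; i++)
                q.Dequeue();
        return removed;
    }

    // Copy of the samples currently held in one channel, oldest first.
    public short[] Peek(int channel) => _channels[channel].ToArray();
}

public static class ToyChannelBufferDemo
{
    public static void Main()
    {
        var buffer = new ToyChannelBuffer(2);
        buffer.Put(0, new short[] { 100, 200, 300 });
        Console.WriteLine(buffer.SamplesPerChannel);          // 3
        Console.WriteLine(string.Join(",", buffer.Peek(1)));  // 0,0,0
        buffer.Remove(2);
        Console.WriteLine(string.Join(",", buffer.Peek(0)));  // 300
    }
}
```

In this model, adding three samples to channel 0 also adds three silent samples to channel 1, and removing two samples shortens both channels by two.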
Extracting Frame Audio Data to Buffer
In order to fill the buffer with audio data, it must first be extracted from an audio frame. The process is as follows:
// Clone the audio data of the frame before filling the buffer. This step is optional and adds extra safety.
// Marshal requires: using System.Runtime.InteropServices;
pFrame.MFClone(out MFFrame clonedFrame, eMFrameClone.eMFC_Audio, eMFCC.eMFCC_Default);
if (clonedFrame != null)
{
    // Get the audio format of the frame and the number of audio samples per audio channel:
    clonedFrame.MFAVPropsGet(out M_AV_PROPS m_AV_PROPS, out Int32 audioSamples);
    // Get a pointer to the audio data byte array and the length of that array:
    clonedFrame.MFAudioGetBytes(out Int32 audioBytes, out Int64 audioPointer);
    // Insert the audio data obtained from the frame into the audio buffer:
    m_objAudioBuffer.BufferPutPtr("", ref m_AV_PROPS.audProps, audioSamples, audioPointer, "");
    Marshal.ReleaseComObject(clonedFrame);
}
Remarks:
Int32 audioSamples – the number of audio samples in ONE audio channel.
Int32 audioBytes – the total number of audio bytes (not audio samples) ACROSS ALL audio channels. For example, for 16 channels of 16-bit audio, the number of audio bytes is calculated as:
16 audio channels * audioSamples * (16 bits / 8) = 32 * audioSamples
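The same arithmetic as a small helper, in case it is useful for checking values while debugging. This is a sketch: the AudioByteMath class and the TotalBytes method are hypothetical names, and the 720-sample figure is only an illustrative input.

```csharp
using System;

public static class AudioByteMath
{
    // Total bytes across all channels = channels * samplesPerChannel * bytesPerSample.
    public static int TotalBytes(int channels, int samplesPerChannel, int bitsPerSample)
    {
        return channels * samplesPerChannel * (bitsPerSample / 8);
    }

    public static void Main()
    {
        // 16 channels of 16-bit audio, 720 samples per channel (illustrative sample count).
        Console.WriteLine(TotalBytes(16, 720, 16)); // 23040, i.e. 32 * 720
    }
}
```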
If you need to get audio data from one specific audio channel of the frame and fill the buffer with the obtained audio data, you can use the following approach:
// Clone the audio data of the frame before filling the buffer. This step is optional and adds extra safety:
pFrame.MFClone(out MFFrame clonedFrame, eMFrameClone.eMFC_Audio, eMFCC.eMFCC_Default);
if (clonedFrame != null)
{
    // Get the audio format of the frame and the number of audio samples per audio channel:
    clonedFrame.MFAVPropsGet(out M_AV_PROPS m_AV_PROPS, out int audioSamples);
    // Set the number of audio channels over which the audio bytes will be distributed in the audio buffer:
    m_AV_PROPS.audProps.nChannels = 1;
    // Get a pointer to the audio data byte array of the channel and the length of that array:
    clonedFrame.MFAudioChannelGetBytes(0, 0, out Int32 audioBytes, out Int64 audioPointer);
    // Insert the audio data obtained from the frame into the audio buffer:
    m_objAudioBuffer.BufferPutPtr("", ref m_AV_PROPS.audProps, audioSamples, audioPointer, "");
    Marshal.ReleaseComObject(clonedFrame);
}
Remarks:
If you specify an incorrect number of channels when filling the audio buffer, the bytes will be evenly distributed across all channels in the audio buffer. For example, if there is one audio channel in the frame and each frame contains 720 samples or 1440 audio bytes, and you specify 16 channels for the audio buffer, each audio channel in the buffer will be filled with 720 / 16 = 45 samples or 90 audio bytes.
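One way to catch such a mismatch early is to compare the byte count reported for the audio data with the layout you are about to declare. The following is a minimal sketch of that check using the byte formula from the earlier remarks. The ChannelLayoutCheck class and its methods are hypothetical helpers, not part of the SDK.

```csharp
using System;

public static class ChannelLayoutCheck
{
    // True when audioBytes equals channels * samplesPerChannel * bytesPerSample.
    public static bool MatchesLayout(int audioBytes, int samplesPerChannel, int channels, int bitsPerSample)
    {
        return audioBytes == channels * samplesPerChannel * (bitsPerSample / 8);
    }

    // Samples each buffer channel receives when the samples are spread evenly
    // over the declared channels (the arithmetic from the remark above).
    public static int SamplesPerDeclaredChannel(int samples, int declaredChannels)
    {
        return samples / declaredChannels;
    }

    public static void Main()
    {
        // One 16-bit channel with 720 samples is 1440 bytes.
        Console.WriteLine(MatchesLayout(1440, 720, 1, 16));   // True
        Console.WriteLine(MatchesLayout(1440, 720, 16, 16));  // False

        int perChannel = SamplesPerDeclaredChannel(720, 16);
        Console.WriteLine(perChannel);             // 45
        Console.WriteLine(perChannel * (16 / 8));  // 90
    }
}
```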
Extracting Buffer Audio Data to Frame
To fill a frame with audio data from the buffer, the following approach is recommended:
// Make sure that the audio buffer and the frame are valid:
if (m_objAudioBuffer != null && pFrame != null)
{
    // Get the number of audio samples remaining in the audio buffer:
    m_objAudioBuffer.BufferPropsGet("", out _, out int audioSamples);
    // Get frame information to determine how many audio samples it contains:
    pFrame.MFAllGet(out MF_FRAME_INFO frameInfo);
    // Add audio samples to the frame only if the buffer contains more samples than a single frame requires:
    if (audioSamples > frameInfo.lAudioSamples)
    {
        // Mix audio into the frame. The last 2 string arguments map the audio channels.
        // For example, to put audio channels 1 and 2 of the buffer into channels 2 and 1
        // of the frame, use the arguments "0, 1" and "1, 0".
        // Here the channels are inserted into the frame 1:1, without mixing or remapping:
        m_objAudioBuffer.BufferFrameMix("", pFrame, out int samples, 0.0d, 0.0d,
            // Input Map (audio buffer layout):
            "0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15",
            // Output Map (frame layout):
            "0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15");
        // Remove the inserted samples from the audio buffer:
        m_objAudioBuffer.BufferRemove("", ref samples);
    }
}
Important Note! The Int32 audioSamples value is the total number of audio samples across all audio channels, not the per-channel count.
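To make the two map strings easier to picture, here is a toy router that works on plain arrays. It is a sketch under stated assumptions and not the SDK's implementation: it assumes that entry N of the input map is paired with entry N of the output map, that an input entry of -1 contributes silence (the convention used for muting in the next section), and that sources routed to the same output channel are summed. The ChannelMapSketch class and its methods are hypothetical.

```csharp
using System;
using System.Linq;

public static class ChannelMapSketch
{
    // Parse a map such as "0, 1, -1" into channel indices.
    public static int[] ParseMap(string map)
    {
        return map.Split(',').Select(s => int.Parse(s.Trim())).ToArray();
    }

    // Toy router: entry i of inputMap is routed to entry i of outputMap.
    // An input index of -1 contributes silence. Sources routed to the same
    // output channel are summed (no clipping protection in this sketch).
    public static int[][] Route(int[][] source, string inputMap, string outputMap, int outputChannels)
    {
        int[] input = ParseMap(inputMap);
        int[] output = ParseMap(outputMap);
        if (input.Length != output.Length)
            throw new ArgumentException("Maps must have the same number of entries.");

        int samples = source[0].Length;
        int[][] result = new int[outputChannels][];
        for (int ch = 0; ch < outputChannels; ch++)
            result[ch] = new int[samples];

        for (int i = 0; i < input.Length; i++)
        {
            if (input[i] < 0) continue; // muted entry
            for (int n = 0; n < samples; n++)
                result[output[i]][n] += source[input[i]][n];
        }
        return result;
    }

    public static void Main()
    {
        int[][] source = { new[] { 1, 2 }, new[] { 10, 20 } };

        // Swap the two channels.
        int[][] swapped = Route(source, "0, 1", "1, 0", 2);
        Console.WriteLine(string.Join(",", swapped[0])); // 10,20
        Console.WriteLine(string.Join(",", swapped[1])); // 1,2

        // Mix both channels into output channel 0.
        int[][] mixed = Route(source, "0, 1", "0, 0", 2);
        Console.WriteLine(string.Join(",", mixed[0]));   // 11,22
    }
}
```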
Mixing and Mapping Audio Channels
MFAudioBufferClass allows you to map the audio channels to the desired layout before inserting the audio into the frame. The following example shows how to mix audio channels 2, 5, and 7 into channel 1 (zero-based indices 1, 4, and 6 into index 0 in the maps) and mute all remaining channels.
// Add audio samples to the frame only if the buffer contains more samples than a single frame requires:
if (audioSamples > frameInfo.lAudioSamples)
{
    // Mix audio into the frame. The last 2 string arguments map the audio channels.
    // For example, to put audio channels 1 and 2 of the buffer into channels 2 and 1
    // of the frame, use the arguments "0, 1" and "1, 0".
    // Here the channels are inserted into the frame with a custom mapping:
    m_objAudioBuffer.BufferFrameMix("", pFrame, out int samples, 0.0d, 0.0d,
        // Input Map (audio buffer layout). Use -1 to mute an audio channel:
        "0, 1, -1, -1, 4, -1, 6, -1, -1, -1, -1, -1, -1, -1, -1, -1",
        // Output Map (frame layout). All unmuted channels from the audio buffer are mixed into the 1st audio channel of the frame:
        "0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0");
    // Remove the inserted samples from the audio buffer:
    m_objAudioBuffer.BufferRemove("", ref samples);
}
The following example simply reverses the audio channel layout:
audioBuffer.BufferFrameMix("", pFrame, out int samples, 0.0d, 0.0d,
    // Input Map (audio buffer layout):
    "0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15",
    // Output Map (-1 isn't supported for this argument):
    "15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0"
);
// Delete the mixed samples from the buffer:
audioBuffer.BufferRemove("", ref samples);
Audio Gain
The .BufferFrameMix() method also allows you to adjust the volume of audio channels before adding them to the frame. This is controlled by the 4th and 5th arguments of the method:
audioBuffer.BufferFrameMix("", pFrame, out int samples, _dblGainFromStartDb: 10.0d, _dblGainFromEndDb: 10.0d,
    // Input Map (audio buffer layout):
    "0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15",
    // Output Map (frame layout):
    "0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15"
);
// Delete the mixed samples from the buffer:
audioBuffer.BufferRemove("", ref samples);
Remarks:
double _dblGainFromStartDb – specifies the audio gain level, in decibels (dB), at the beginning of the frame.
double _dblGainFromEndDb – specifies the audio gain level, in decibels (dB), at the end of the frame.
If the values of _dblGainFromStartDb and _dblGainFromEndDb differ, the audio gain is gradually interpolated from the start value to the end value over the duration of the frame, resulting in a smooth gain transition.
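For intuition, a gain ramp of this kind can be sketched on a plain array of samples. The conversion from decibels to a linear amplitude factor, 10^(dB / 20), is standard. Everything else is an assumption made for illustration: the article does not say whether the SDK interpolates in the decibel domain or in the linear domain, and this sketch interpolates in decibels. The GainRampSketch class and its methods are hypothetical.

```csharp
using System;

public static class GainRampSketch
{
    // Convert decibels to a linear amplitude factor: 10^(dB / 20).
    public static double DbToLinear(double db) => Math.Pow(10.0, db / 20.0);

    // Gain in dB for sample n of a frame with `count` samples, moving
    // linearly from startDb (first sample) to endDb (last sample).
    public static double GainDbAt(int n, int count, double startDb, double endDb)
    {
        if (count <= 1) return startDb;
        double t = (double)n / (count - 1);
        return startDb + (endDb - startDb) * t;
    }

    // Return a copy of the samples with the ramped gain applied.
    public static double[] Apply(double[] samples, double startDb, double endDb)
    {
        var result = new double[samples.Length];
        for (int n = 0; n < samples.Length; n++)
            result[n] = samples[n] * DbToLinear(GainDbAt(n, samples.Length, startDb, endDb));
        return result;
    }

    public static void Main()
    {
        // Fade from 0 dB to -20 dB across three samples:
        // the factors are roughly 1, 0.316 and 0.1.
        double[] faded = Apply(new[] { 1.0, 1.0, 1.0 }, 0.0, -20.0);
        Console.WriteLine(string.Join(" ", faded));
    }
}
```

With equal start and end values, as in the call above that passes 10.0 for both, the factor is constant: +10 dB corresponds to an amplitude factor of about 3.16.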
Adding Specific Audio Channels from Buffers to a Frame
Consider the following example. We have a frame containing 4-channel audio and two instances of MFAudioBufferClass, each holding 2 channels of audio data. The BufferFrameChannelsAppend() method allows you to insert audio data from specific channels of an MFAudioBufferClass instance into specific audio channels of a frame.
This means you can selectively extract audio data from particular channels of different audio buffers and append that data into designated channels of your frame.
m_objAudioBuffer_1.BufferFrameChannelsAppend("", pFrame, ref audioSamples, "0, 3");
m_objAudioBuffer_2.BufferFrameChannelsAppend("", pFrame, ref audioSamples, "1, 2");
m_objAudioBuffer_1.BufferRemove("", ref audioSamples);
m_objAudioBuffer_2.BufferRemove("", ref audioSamples);
Important note: The audio formats (sample rate and bit depth) of both the buffers and the frame must match.
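The intended result of the two append calls can be illustrated with plain arrays. This sketch assumes that the channel string lists, in order, the frame channels that receive the buffer's channels 0, 1, and so on. That reading follows the example above but is not confirmed by the SDK reference. The ChannelAppendSketch class and its Append method are hypothetical.

```csharp
using System;
using System.Linq;

public static class ChannelAppendSketch
{
    // Copy buffer channel i into the frame channel listed at position i of `targets`.
    public static void Append(int[][] frame, int[][] bufferChannels, string targets)
    {
        int[] map = targets.Split(',').Select(s => int.Parse(s.Trim())).ToArray();
        for (int i = 0; i < map.Length; i++)
            Array.Copy(bufferChannels[i], frame[map[i]], bufferChannels[i].Length);
    }

    public static void Main()
    {
        // 4-channel frame, 2 samples per channel, initially silent.
        int[][] frame = { new int[2], new int[2], new int[2], new int[2] };
        int[][] buffer1 = { new[] { 1, 1 }, new[] { 2, 2 } };
        int[][] buffer2 = { new[] { 3, 3 }, new[] { 4, 4 } };

        Append(frame, buffer1, "0, 3"); // buffer 1 fills frame channels 0 and 3
        Append(frame, buffer2, "1, 2"); // buffer 2 fills frame channels 1 and 2

        Console.WriteLine(string.Join(" | ", frame.Select(ch => string.Join(",", ch))));
        // 1,1 | 3,3 | 4,4 | 2,2
    }
}
```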