OpenCV is a popular image processing library that you can add to your video processing based on MFormats SDK.
To do this, you get the data of the current frame and pass it to the OpenCV engine for processing. Once the processing is done, you receive a modified frame that you can continue to process with MFormats SDK.
C#
The following example is based on Emgu CV, a .NET wrapper around OpenCV.
In Visual Studio, install the Emgu.CV NuGet package into your solution:
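For example, the package can be added from the Package Manager Console (the package ID Emgu.CV is the one published in the NuGet gallery; depending on the Emgu CV version, a native runtime package such as Emgu.CV.runtime.windows may also be required):
Install-Package Emgu.CV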
Then, you should load an OpenCV cascade classifier:
// Don't forget to copy the corresponding XML file for the cascade classifier
CascadeClassifier cascadeClassifier = new CascadeClassifier(@"haarcascade_frontalface_alt2.xml");
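To fail early when the XML file has not been copied, you can add a simple guard like the following (the check and the message are illustrative, not part of the original sample):
if (!System.IO.File.Exists("haarcascade_frontalface_alt2.xml"))
    throw new System.IO.FileNotFoundException("Copy haarcascade_frontalface_alt2.xml into the application's working directory.");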
After that, get a clone of the source frame before creating an OpenCV object:
m_objLive.SourceFrameGet(-1, out pFrame, "");
MFFrame pFrameDraw = null;
pFrame.MFClone(out pFrameDraw, eMFrameClone.eMFC_Video_ForceCPU, eMFCC.eMFCC_ARGB32);
Use the ForceCPU flag to place the cloned frame in system (not GPU) memory so that you can access the frame data.
Next, create an OpenCV Image object from the cloned MFFrame object:
Image<Rgba, Byte> cvFrame = GetCVImageFromMFFrame(pFrameDraw);
where the GetCVImageFromMFFrame method is defined as follows:
private Image<Rgba, Byte> GetCVImageFromMFFrame(MFFrame _pFrame)
{
    M_AV_PROPS avProps;
    int lAudioSamples;
    _pFrame.MFAVPropsGet(out avProps, out lAudioSamples);

    int nFrameWidth = avProps.vidProps.nWidth;
    int nFrameHeight = Math.Abs(avProps.vidProps.nHeight);
    int nFrameRowBytes = avProps.vidProps.nRowBytes;

    int pcbSize;
    long pbVideo;
    _pFrame.MFVideoGetBytes(out pcbSize, out pbVideo);
    IntPtr pnVideoData = new IntPtr(pbVideo);

    // Pixel format for OpenCV image and MFormats frame must be the same
    Image<Rgba, Byte> cvFrame = new Image<Rgba, Byte>(nFrameWidth, nFrameHeight, nFrameRowBytes, pnVideoData);
    return cvFrame;
}
Once the Image object has been created, you can process it with OpenCV methods.
For instance, let's take face detection. For this, specify the minimum and maximum sizes of the detection rectangles and detect faces in the Image using the cascade classifier you loaded earlier:
// Recognize faces and detect their bounding boxes
int nSizeFactor = Math.Min(cvFrame.Height, cvFrame.Width);
Size szMinSize = new Size((int)(nSizeFactor * 0.2), (int)(nSizeFactor * 0.2));
Size szMaxSize = new Size((int)(nSizeFactor * 0.35), (int)(nSizeFactor * 0.35));
Rectangle[] rcFacesDetected = cascadeClassifier.DetectMultiScale(cvFrame, 1.1, 5, szMinSize, szMaxSize);
If any faces are detected, the rcFacesDetected array is not empty, and you can draw a rectangle around each face with code like this:
// Draw corresponding bounding boxes
foreach (Rectangle rcFace in rcFacesDetected)
    cvFrame.Draw(rcFace, new Rgba(255, 0, 0, 255), 3);
Because OpenCV worked directly on the data of the cloned MFFrame object, the rectangles are already drawn into the frame, and you can continue to use the pFrameDraw object with MFormats methods. For example, to send it to a preview:
m_objPreview.ReceiverFramePut(pFrameDraw, -1, "");
Note that you have two MFFrame objects in this case (the original one and the clone), so you should release both of them to avoid memory issues:
Marshal.ReleaseComObject(pFrame);
Marshal.ReleaseComObject(pFrameDraw);
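Putting the pieces together, one iteration of the processing loop could look like the sketch below. The method name ProcessNextFrame is hypothetical; the m_objLive, m_objPreview and cascadeClassifier objects and the GetCVImageFromMFFrame helper are the ones shown above:
private void ProcessNextFrame()
{
    // Get the next frame from the live source
    MFFrame pFrame;
    m_objLive.SourceFrameGet(-1, out pFrame, "");

    // Clone it into system (CPU) memory so OpenCV can access the pixels
    MFFrame pFrameDraw;
    pFrame.MFClone(out pFrameDraw, eMFrameClone.eMFC_Video_ForceCPU, eMFCC.eMFCC_ARGB32);

    // Wrap the cloned frame data in an Emgu CV image (no copy is made)
    Image<Rgba, Byte> cvFrame = GetCVImageFromMFFrame(pFrameDraw);

    // Detect faces and draw bounding boxes directly into the frame data
    int nSizeFactor = Math.Min(cvFrame.Height, cvFrame.Width);
    Size szMinSize = new Size((int)(nSizeFactor * 0.2), (int)(nSizeFactor * 0.2));
    Size szMaxSize = new Size((int)(nSizeFactor * 0.35), (int)(nSizeFactor * 0.35));
    foreach (Rectangle rcFace in cascadeClassifier.DetectMultiScale(cvFrame, 1.1, 5, szMinSize, szMaxSize))
        cvFrame.Draw(rcFace, new Rgba(255, 0, 0, 255), 3);

    // Send the modified clone to the preview and release both frames
    m_objPreview.ReceiverFramePut(pFrameDraw, -1, "");
    Marshal.ReleaseComObject(pFrame);
    Marshal.ReleaseComObject(pFrameDraw);
}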
The result of face detection depends on the settings of OpenCV objects. Here is an example of a result using the above code:
The attachment to this article contains a sample that implements the logic described above. Extract the archive, open Visual Studio, and load the sample's solution. Emgu.CV is quite heavy, so you need to restore the NuGet packages for the project when you open the sample for the first time.
C++
The following example is based on the built-in MFormats SDK sample - Sample Live Input Console. The sample is located here → "C:\Program Files (x86)\Medialooks\MFormats SDK\Samples\C++\Sample Live Input Console".
We will use the open-source OpenCV library. Download it and place it in a convenient location → https://sourceforge.net/projects/opencvlibrary/
Now open the Sample Live Input Console in Visual Studio. First, we need to specify the paths to the OpenCV libraries. In our case, the libraries are placed in the same folder as the solution:
Now, go to the project properties and set up the following options on the C/C++ → General, Linker → General, and Linker → Input pages:
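For illustration only (the exact values depend on where you unpacked OpenCV, which OpenCV version you downloaded, and your Visual Studio toolset), these settings could look like this:
C/C++ → General → Additional Include Directories: $(SolutionDir)opencv\build\include
Linker → General → Additional Library Directories: $(SolutionDir)opencv\build\x64\vc15\lib
Linker → Input → Additional Dependencies: opencv_world455.lib (use the "d"-suffixed library, e.g. opencv_world455d.lib, for Debug builds)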
When all the necessary libraries are linked, include OpenCV in your code:
#include "opencv2/opencv.hpp"
Now your solution is ready to work with OpenCV. The approach is simple and consists of several steps:
1. Declare and initialize a cv::CascadeClassifier instance:
cv::CascadeClassifier classifier;
BOOL Res = classifier.load("opencv\\sources\\data\\haarcascades\\haarcascade_frontalface_default.xml");
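It is worth checking the result of load() before going further. A minimal check in the same console style as the rest of the sample (the wording of the message is illustrative) could be:
if (!Res)
{
    cout << "Failed to load the cascade classifier XML file" << endl;
    return false;
}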
2. Get a frame from a video source in ARGB32 format:
cpFrame = NULL;
//Get frames one by one
avProps.vidProps.eVideoFormat = eMVF_Custom;
avProps.vidProps.fccType = eMFCC_ARGB32;
hr = cpSource->SourceFrameConvertedGet(&avProps, -1, &cpFrame, CComBSTR(L""));
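As elsewhere in the sample, check hr (and the returned frame) before using it; a simple guard (illustrative, assuming the call is made inside the sample's frame loop) could be:
if (FAILED(hr) || !cpFrame)
{
    cout << "SourceFrameConvertedGet failed" << endl;
    break;
}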
3. Get the frame byte data and its resolution:
M_AV_PROPS Props;
long audioSample;
cpFrame->MFAVPropsGet(&Props, &audioSample);
long size;
LONGLONG videoBuff;
cpFrame->MFVideoGetBytes(&size, &videoBuff);
int height = std::abs(Props.vidProps.nHeight);
int width = Props.vidProps.nWidth;
cv::Mat frameIn(height, width, CV_8UC4, (char*)videoBuff);
cv::Mat frameIn1;
frameIn1 = frameIn.clone();
4. Detect faces with the loaded classifier:
int nSizeFactor = std::min(height, width);
cv::Size szMinSize = cv::Size((int)(nSizeFactor * 0.2), (int)(nSizeFactor * 0.2));
cv::Size szMaxSize = cv::Size((int)(nSizeFactor * 0.35), (int)(nSizeFactor * 0.35));
std::vector<cv::Rect> detected_objects;
//classifier.detectMultiScale(frameIn1, detected_objects); // Slower but more accurate
classifier.detectMultiScale(frameIn1, detected_objects, 1.1, 5, 0, szMinSize, szMaxSize); // Faster but less accurate
5. Draw a rectangle around each detected face:
for (size_t i = 0; i < detected_objects.size(); i++)
{
    rectangle(frameIn1, detected_objects[i], cv::Scalar(0, 0, 255), 3, cv::LineTypes::LINE_8);
}
6. Take the resulting cv::Mat image with the overlaid rectangles as a byte array and create a new CComPtr<IMFFrame> from it:
CComPtr<IMFFrame> cpFrame_1;
hr = cpFactory->MFFrameCreateFromMem(&Props, (LONGLONG)frameIn1.data, 0, 0, &cpFrame_1, NULL);
if (FAILED(hr))
{
    cout << "MFFrameCreateFromMem failed" << endl;
    return false;
}
7. Send the new frame to the preview:
if (cpReceiverPreview)
{
    hr = cpReceiverPreview->ReceiverFramePut(cpFrame_1, -1, CComBSTR(L""));
}
if (FAILED(hr))
{
    break;
}
// Release both frames (assigning NULL to a CComPtr releases the underlying COM object)
cpFrame = NULL;
cpFrame_1 = NULL;
The result of face detection depends on the settings of OpenCV objects. Here is an example of a result using the above code:
Here is a link to download a sample that implements the logic described above → https://drive.medialooks.com/index.php/s/3W3bm4HNXPEXawR. Extract the archive, open Visual Studio, and load the sample's solution.