We are seeing a consistent crash due to an invalid memory access when calling ippiFilter_32f_C1R from multiple threads, with IPP linked as a static library.
Pointers:
- The project links the IPP 7.0 library statically.
- ippiFilter_32f_C1R is used for sharpening images.
- There can be 100-5000 or more images. They are streamed, and this function is called from multiple threads to sharpen each image individually.
- ippStaticInit is called at startup.
- ippGetNumThreads returns 8, as I have an 8-core CPU. The crash also happens on machines with more or fewer cores.
- The ROI provided accounts for the border pixels that the IPP filter function assumes are available.
Observations:
- There is a spike in kmp_launch_worker threads, which sit waiting for instructions.
- Repeated calls (serial or threaded) into ippiFilter_32f_C1R create more and more IPP threads that sit waiting, and at a random point this results in a crash inside IPP.
- Disabling parallelization with ippSetNumThreads(1) fixes the issue.
Is my issue related to the "Avoiding Nested Parallelization" guidance?
http://nf.nci.org.au/facilities/software/intel-ct/12.0.4.191/Documentati...
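To make the question concrete, here is a minimal sketch of the single-level threading model that the ippSetNumThreads(1) workaround amounts to: the application's own threads each take whole images, and the per-image filter runs serially. The names Image, sharpenSerial and sharpenAll are hypothetical, and sharpenSerial is only a placeholder for the real ippiFilter_32f_C1R call; this is not our actual code.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

using Image = std::vector<float>;  // hypothetical image type for this sketch

// Hypothetical stand-in for the per-image sharpening step. In the real
// project this is where ippiFilter_32f_C1R would be called, with
// ippSetNumThreads(1) called once at startup so that the library does not
// start its own worker threads inside each application thread.
void sharpenSerial(Image& img) {
    for (float& px : img) {
        px *= 2.0f;  // placeholder work, not a real sharpening filter
    }
}

// Process all images with a fixed number of application threads. Each image
// is handled by exactly one thread, so there is one level of parallelism.
void sharpenAll(std::vector<Image>& images, unsigned workerCount) {
    assert(workerCount > 0);
    std::atomic<std::size_t> next{0};  // index of the next unprocessed image

    auto worker = [&images, &next]() {
        for (;;) {
            const std::size_t i = next.fetch_add(1);
            if (i >= images.size()) {
                break;  // no images left
            }
            sharpenSerial(images[i]);
        }
    };

    std::vector<std::thread> pool;
    for (unsigned t = 0; t < workerCount; ++t) {
        pool.emplace_back(worker);
    }
    for (std::thread& th : pool) {
        th.join();
    }
}
```

With this layout the thread count is bounded by workerCount, instead of potentially growing to (application threads) x (library threads), which is what the nested case can lead to.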
The images are 16-bit-per-pixel monochrome images, so the call sequence looks like this:
int imageSize = m_Width * m_Height;
Ipp32f *pSrc = new Ipp32f[imageSize];
Ipp32f *pDst = new Ipp32f[imageSize];
16bitData2IppImage(PixelData, pSrc);
const int kernelWidth = m_kernel->Width();
const int kernelHeight = m_kernel->Height();
const int kernelHalfWidth = kernelWidth/2;
const int kernelHalfHeight = kernelHeight/2;
int srcStep = format->m_Width*sizeof(Ipp32f);
int dstStep = format->m_Width*sizeof(Ipp32f);
IppiSize dstRoiSize = {format->m_Width - 2*kernelHalfWidth, format->m_Height - 2*kernelHalfHeight};
IppiSize kernelSize = {kernelWidth, kernelHeight};
IppiPoint anchor = {kernelHalfWidth, kernelHalfHeight};
int firstPixGap = format->m_Width*kernelHalfHeight + kernelHalfWidth;
Ipp32f* pKernel = new Ipp32f[kernelWidth*kernelHeight];
Matrix2IppKernel(m_kernel, pKernel); // function to copy internal matrix values to the IPP float matrix
IppStatus stat = ippiFilter_32f_C1R(pSrc + firstPixGap, srcStep, pDst + firstPixGap, dstStep, dstRoiSize, pKernel, kernelSize, anchor);
ASSERT(stat == ippStsNoErr);
IppImage216bitData(data, pDst);
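The ROI arithmetic above can be checked in isolation, without IPP. The following is a small sketch that reproduces the dstRoiSize and firstPixGap computations from the snippet; InteriorRoi and computeInteriorRoi are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type (not part of IPP): the interior region left after
// trimming a half-kernel border from every side of the image.
struct InteriorRoi {
    int width;                // ROI width in pixels
    int height;               // ROI height in pixels
    std::size_t firstPixGap;  // element offset of the ROI's top-left pixel
};

// Mirrors the dstRoiSize / firstPixGap arithmetic from the snippet above.
InteriorRoi computeInteriorRoi(int imageWidth, int imageHeight,
                               int kernelWidth, int kernelHeight) {
    assert(kernelWidth > 0 && kernelHeight > 0);
    const int halfW = kernelWidth / 2;
    const int halfH = kernelHeight / 2;
    // The image must be larger than the border that gets trimmed away.
    assert(imageWidth > 2 * halfW && imageHeight > 2 * halfH);

    InteriorRoi roi;
    roi.width = imageWidth - 2 * halfW;
    roi.height = imageHeight - 2 * halfH;
    roi.firstPixGap = static_cast<std::size_t>(imageWidth) * halfH + halfW;
    return roi;
}
```

For a 10x8 image and a 3x3 kernel this gives an 8x6 ROI starting at element 11 (row 1, column 1), so every pixel in the ROI has one full pixel of margin on each side.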