Quantcast
Channel: Intel® Software - Intel® Integrated Performance Primitives
Viewing all articles
Browse latest Browse all 1489

ipps/ippi mulpack

$
0
0

Hello,

I was checking some code using FFT and mulpack, and noticed a case where results are a bit off.

both ippi and ipps mulpack versions gave the same results.

Spec: i7-4790, Ipp 8.1 32bit. win 7 pro SP1 64bit.

Using VS2013 and stepping into the disassembly, the call stack shows that ippsh99-8.1dll is loaded.

Here is an example with ipps, based on mulpack.c from the IPP 9.0 update 2 documentation.

The example transforms the vector [0,8,0,0,0,0,0,0], and multiplies the results by itself.

        IppStatus status = ippStsNoErr;
        IppsFFTSpec_R_32f* pSpec = NULL;
        Ipp8u *pMemInit = NULL, *pBuffer = NULL, *pSpecMem = NULL; /* Pointer to the work buffers */
        int sizeSpec = 0, sizeInit = 0, sizeBuf = 0;               /* size of FFT pSpec structure, Init and work buffers */
        status = ippsFFTGetSize_R_32f(3, IPP_FFT_DIV_INV_BY_N, ippAlgHintNone, &sizeSpec, &sizeInit, &sizeBuf);
        /* memory allocation */
        pSpecMem = (Ipp8u*)ippMalloc(sizeSpec);
        pBuffer = (Ipp8u*)ippMalloc(sizeBuf);
        pMemInit = (Ipp8u*)ippMalloc(sizeInit);
        status = ippsFFTInit_R_32f(&pSpec, 3, IPP_FFT_DIV_INV_BY_N, ippAlgHintAccurate, pSpecMem, pMemInit);

        Ipp32f src[8] = { 0.0f, 8.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f };
        Ipp32f fft_dst[8] = { 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f };

        status = ippsFFTFwd_RToPack_32f(src, fft_dst, pSpec, pBuffer);
        ippFree(pMemInit);
        ippFree(pSpec);
        ippFree(pBuffer);

        Ipp32f dst[8] = { 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f };

        status = ippsMulPack_32f(fft_dst, fft_dst, dst, 8);

The results in the fft_dst and dst buffers are in packed format. the first two indices are R0 and R1, the real components of the first two elements.

fft_dst =  [8.00000000       5.65685415      -5.65685415      0.000000000      -8.00000000      -5.65685415      -5.65685415]

dst = [64.0000000  8.12035296e-007      -63.9999962      -64.0000000     -0.000000000  8.12035296e-007       63.9999962       64.0000000]

Here R1 (the second index) should be zero, but instead it is a very small number.

So fft_dst[1] = 5.65685415 + j ( -5.65685415 )

This gives fft_dst[1]*fft_dst[1] = 5.65685415 * 5.65685415 + 2 j* ( -5.65685415) + j*j*( -5.65685415 )*( -5.65685415 )

Rearranging real and imaginary parts:

fft_dst[1]*fft_dst[1] = {5.65685415 * 5.65685415 - ( -5.65685415 )*( -5.65685415 ) } + { 2 j* ( -5.65685415)}

The real part should be zero, but it is R1 = 8.12035296e-007

Old IPL (Intels image processibng library) gave the correct result.


Viewing all articles
Browse latest Browse all 1489

Trending Articles



<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>