Channel: Intel® Software - Intel® Integrated Performance Primitives

64 bit C# wrapper for ipp 6.1

I am still using IPP 6.1 through the C# wrapper library ipp_cs, and I have no need to update the version for now. At present I use the 32-bit version of the C# wrapper, and I need to migrate to the 64-bit version of this library. I couldn't find it on the products page. Could someone please advise where I can download the 64-bit C# wrapper for IPP?

Also, a question for experienced developers: would it save time to simply upgrade to the latest IPP version and write a C# wrapper myself? I read on the forums that, starting with version 8.0, Intel no longer provides a C# wrapper and we need to write our own. I suppose this is because the new wrappers are simpler to develop?

Regards,

Alok


uncore performance-monitoring events

Hello,
I am using a machine with an Intel Xeon(R) X5570 CPU (2.93 GHz) in an IBM System x3650 M2 server.
I have been experimenting with the manual
"Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3B: System Programming Guide, Part 2", chapter 19, performance-monitoring events.

I want to read the uncore events, for example those in this manual's Table 19-14, "Non-Architectural Performance Events In the Processor Uncore for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series (Contd.)".

However, I have failed to obtain this information on several machines.

Process:

1. Check the event:

Event number: 2FH
Umask: 01H
Event mask mnemonic: UNC_QMC_WRITES.FULL.CH0
Description: Counts number of full cache line writes to DRAM

2. Run the experiment on my Linux machine using the Linux perf tool, for example:

perf stat -e r12f sleep (channel 0)
perf stat -e r22f sleep (channel 1)
perf stat -e r42f sleep (channel 2)

3. Result: the count is zero (0).

In addition, I carried out a similar experiment on another CPU, a Xeon 5600 series, using the event below, but got the same result.

Table 19-16. Non-Architectural Performance Events In the Processor Uncore for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.)

Event number: 2CH
Umask: 01H
Event mask mnemonic: UNC_QMC_NORMAL_READS.CH0
Description: Counts the number of Quickpath Memory Controller channel 0 medium and low priority read requests. The QMC channel 0 normal read occupancy divided by this count provides the average QMC channel 0 read latency.

In conclusion, most of the uncore data is not being output.

Please help me.
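The raw event strings used in step 2 can be checked with a small sketch. It assumes the usual x86 layout of a perf raw code (event select in bits 0-7, umask in bits 8-15); the helper names `rawEventCode` and `perfRawEventName` are hypothetical and not part of perf.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Compose the raw event code that `perf stat -e rNNNN` takes on x86:
// event select in bits 0-7, unit mask (umask) in bits 8-15.
inline std::uint32_t rawEventCode(std::uint8_t eventSelect, std::uint8_t umask)
{
    return (static_cast<std::uint32_t>(umask) << 8) | eventSelect;
}

// Format the code the way it is typed on the perf command line, e.g. "r12f".
inline std::string perfRawEventName(std::uint8_t eventSelect, std::uint8_t umask)
{
    char text[16];
    std::snprintf(text, sizeof text, "r%x",
                  static_cast<unsigned>(rawEventCode(eventSelect, umask)));
    return text;
}
```

With event 2FH and umasks 01H, 02H and 04H this reproduces `r12f`, `r22f` and `r42f`, so the encoding in the post looks consistent. As far as I know, plain `r...` events are programmed on the core counters, whereas the uncore has its own counter registers, which may be why the counts stay at zero.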

EigenValuesVectors - two matrices/complex values

Hello,

I really need to implement in C++ the calculation of eigenvalues and eigenvectors, using the same algorithm as the Matlab function:

    [V,D] = eig(A,B) produces a diagonal matrix D of generalized
    eigenvalues and a full matrix V whose columns are the corresponding
    eigenvectors so that A*V = B*V*D.

First of all, when I check the available constructors in the Intel IPP documentation (https://software.intel.com/en-us/node/505270), I can't find any that works with complex numbers (I am interested in Ipp64fc).

Furthermore, all constructors take only one matrix as an argument. Do you have any idea how I can get an effect similar to Matlab's eig(A,B) using Intel IPP?

I am using Intel IPP 7.1.
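To make the definition A*V = B*V*D concrete, here is a minimal sketch for the 2x2 complex case only. It solves det(A - lambda*B) = 0 directly, which is not the algorithm Matlab uses and does not scale to larger matrices; the names `Matrix2` and `generalizedEigenvalues2x2` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <complex>

using Complex = std::complex<double>;
using Matrix2 = std::array<std::array<Complex, 2>, 2>;

// Generalized eigenvalues of a 2x2 pencil (A, B): the roots of
// det(A - lambda*B) = 0, which expands to c2*lambda^2 + c1*lambda + c0 = 0.
inline std::array<Complex, 2> generalizedEigenvalues2x2(const Matrix2& A, const Matrix2& B)
{
    const Complex c2 = B[0][0] * B[1][1] - B[0][1] * B[1][0];   // det(B)
    const Complex c1 = -(A[0][0] * B[1][1] + A[1][1] * B[0][0]
                         - A[0][1] * B[1][0] - A[1][0] * B[0][1]);
    const Complex c0 = A[0][0] * A[1][1] - A[0][1] * A[1][0];   // det(A)

    // Sketch only: a singular B (c2 == 0) is not handled here.
    assert(std::abs(c2) > 0.0);

    const Complex root = std::sqrt(c1 * c1 - 4.0 * c2 * c0);
    return { (-c1 + root) / (2.0 * c2), (-c1 - root) / (2.0 * c2) };
}
```

For general sizes, a QZ-based routine such as LAPACK's `zggev` (available in Intel MKL) handles the complex generalized problem with two input matrices; whether it matches Matlab's result column for column depends on ordering and normalization.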

 

 

Problem while replacing old ippiResizeCenter method with ippiResize method

Hi,

We were using the ippiResizeCenter method in our software, which takes scaleX and scaleY parameters and offsets in the X and Y directions.

Now we want to upgrade the software to 8.2, where this method has been removed entirely.

I saw the resizeCubic method, but it takes neither the scale in the X and Y directions nor an offset. The new method considers only the ROIs of the source and destination.

I really don't understand the concept behind the new resizeCubic method, that is, how we can scale the source raster down or up to fit inside the destination raster.

All I need is to perform scaling and shifting together, as the methods ippiResizeCenter and ippiResizeSqrPixel do.

Unfortunately both of these methods were deprecated, and ippiResizeCenter has been removed entirely.

We have to perform zoom and pan on the source image, which we used to do with ippiResizeCenter, using the new ippiResizeCubic.

We also have source images with different aspect ratios, which we used to handle by setting different scaleX and scaleY values in resizeCenter. I don't understand how to handle these kinds of images and perform zoom and pan using the new ippiResize<interpolationtype> method.

Can you please provide a small code snippet that performs zoom and pan on the source image and displays it in the destination buffer using the ippiResize<interpolationtype> method?

   

Thanks & Regards,

Muralidhar
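One way to think about zoom and pan with an API that only sees source and destination ROIs is to pick the source window that should fill the viewport. The sketch below only computes that window; `SourceWindow` and `zoomPanWindow` are hypothetical names, no IPP calls are made, and it assumes the resize call receives the window as its source size and a source pointer advanced by `y * srcStep + x * bytesPerPixel`.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper type: a rectangle in source-image pixel coordinates.
struct SourceWindow
{
    int x, y, width, height;
};

// Part of the source image that fills a dstWidth x dstHeight viewport when
// zoomed by (xFactor, yFactor) and centred on (centerX, centerY).
inline SourceWindow zoomPanWindow(int srcWidth, int srcHeight,
                                  int dstWidth, int dstHeight,
                                  double xFactor, double yFactor,
                                  double centerX, double centerY)
{
    assert(srcWidth > 0 && srcHeight > 0 && dstWidth > 0 && dstHeight > 0);
    assert(xFactor > 0.0 && yFactor > 0.0);

    // A zoom factor of 2 shows half as many source pixels along that axis.
    int width  = static_cast<int>(dstWidth  / xFactor + 0.5);
    int height = static_cast<int>(dstHeight / yFactor + 0.5);
    width  = std::min(srcWidth,  std::max(1, width));
    height = std::min(srcHeight, std::max(1, height));

    // Centre the window on the pan position, then clamp it inside the image.
    int x = static_cast<int>(centerX - width  / 2.0 + 0.5);
    int y = static_cast<int>(centerY - height / 2.0 + 0.5);
    x = std::min(std::max(x, 0), srcWidth  - width);
    y = std::min(std::max(y, 0), srcHeight - height);

    return { x, y, width, height };
}
```

Different X and Y factors simply produce a window whose aspect ratio differs from the viewport's. Note that clamping at the image border changes the effective scale, so a caller may prefer to shrink the destination ROI instead.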

 

  

 

  

   

Examples of IPP Bi-Quad Filter?

Hi All,

I'm trying to use IPP to run a bandpass filter on some audio data (single channel).

I've written the following class and helper function, but my results seem to be way off the mark. I hope this isn't too much code to dump:

// BiQuad Coefs
// http://www.musicdsp.org/files/Audio-EQ-Cookbook.txt
struct BiQuad
{
float coefs[6];
};

// Return bandpass coefficients
BiQuad getBandPass( float f0, float Fs, float Q )
{
	float omega = IPP_2PI * f0 / Fs;
	float alpha = sinf( omega ) / ( 2.f*Q );

	float b0 = sinf( omega ) / 2.f;
	float b1 = 0.f;
	float b2 = -b0;
	float a0 = 1.f + alpha;
	float a1 = -2.f*cosf( omega );
	float a2 = 1.f - alpha;

	// Divide all by a0, set a0 to 1.f
	return{ {b0 / a0, b1 / a0, b2 / a0, 1.f, a1 / a0, a2 / a0} };
}

class IppBiquad
{
	int pBufSize{ 0 };
	int nBQ{ 0 };
	IppsIIRState_32f * m_State{ nullptr };
	Ipp8u * m_pBuf{ nullptr };
public:
	// Default constructor, takes # of cascaded filters
	IppBiquad( int N = 2 )
		: nBQ( N )
	{
		if ( nBQ > 0 )
		{
			// Size the state for the actual number of cascaded biquads, not a fixed 2
			ippsIIRGetStateSize_BiQuad_32f( nBQ, &pBufSize );
			m_pBuf = ippsMalloc_8u( pBufSize );
		}
	}
	// Set the filter components
	inline void setFilt( float f0, float Fs, float Q )
	{
		vector<BiQuad> taps( nBQ, getBandPass( f0, Fs, Q ) );
		ippsIIRInit_BiQuad_DF1_32f( &m_State, (Ipp32f *) taps.data( ), taps.size( ), 0, m_pBuf );
	}
	// Free work buf (Do I need to free the state?)
	~IppBiquad( )
	{
		if ( m_pBuf != nullptr )
			ippsFree( m_pBuf );
	}
	// Run the filter
	inline IppStatus operator()( float * input, float * output, int size )
	{
		if ( input && output && size > 0 && m_State )
			return ippsIIR_32f( input, output, size, m_State );
		return IppStatus::ippStsNullPtrErr;
	}
};

I use the BiQuad Struct to store my 6 float coefficients (taps, according to the docs), the getBandPass function to return the correct normalized taps for a Bandpass filter centered around f0 given the sample rate Fs and Q value, and I use the class in order to manage the work buffer without actually having to manage that. 

When I need to run the filter, I invoke the () (parentheses) operator, sort of making my class like a function. To test the class I made an audio sample with several 500Hz sine wave "chirps" to see if I could isolate the chirps. However, I see the chirps most at very low frequencies (f0=100Hz), and it seems like the amplitude of my output has been changed somehow.

Am I interpreting the use of the BiQuad functions wrong? None of the examples in the docs actually use a BiQuad, they all use an arbitrary IIR filter (as far as I can tell). 

I apologize in advance if the example is too object oriented; I'm happy to provide some straight C code, I just thought this was a bit clearer. Sorry for the use of std::vector, if anyone is averse to that...

Thanks for your help,

John
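For comparison, here is a plain C++ reference sketch of a single Direct Form I biquad with the Audio EQ Cookbook band-pass coefficients, without IPP. It covers both cookbook variants: `b0 = sin(w0)/2` as in the code above (constant skirt gain, peak gain = Q) and `b0 = alpha` (0 dB peak gain). The names `BiquadTaps`, `bandPass`, `runBiquad` and `gainAtCenter` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Normalized biquad taps in the order b0, b1, b2, a0 (= 1), a1, a2.
struct BiquadTaps
{
    double b0, b1, b2, a0, a1, a2;
};

// Audio EQ Cookbook band-pass. unityPeak = true uses b0 = alpha (0 dB at f0);
// unityPeak = false uses b0 = sin(w0)/2 (constant skirt gain, peak gain = Q).
inline BiquadTaps bandPass(double f0, double Fs, double Q, bool unityPeak)
{
    assert(f0 > 0.0 && f0 < Fs / 2.0 && Q > 0.0);
    const double pi = 3.14159265358979323846;
    const double w0 = 2.0 * pi * f0 / Fs;
    const double alpha = std::sin(w0) / (2.0 * Q);
    const double b0 = unityPeak ? alpha : std::sin(w0) / 2.0;
    const double a0 = 1.0 + alpha;
    return { b0 / a0, 0.0, -b0 / a0, 1.0, -2.0 * std::cos(w0) / a0, (1.0 - alpha) / a0 };
}

// Direct Form I: y[n] = b0 x[n] + b1 x[n-1] + b2 x[n-2] - a1 y[n-1] - a2 y[n-2].
inline std::vector<double> runBiquad(const BiquadTaps& t, const std::vector<double>& x)
{
    std::vector<double> y(x.size());
    double x1 = 0.0, x2 = 0.0, y1 = 0.0, y2 = 0.0;
    for (std::size_t n = 0; n < x.size(); ++n)
    {
        const double out = t.b0 * x[n] + t.b1 * x1 + t.b2 * x2 - t.a1 * y1 - t.a2 * y2;
        x2 = x1; x1 = x[n];
        y2 = y1; y1 = out;
        y[n] = out;
    }
    return y;
}

// Approximate steady-state gain at f0: filter one second of a unit sine
// and take the peak magnitude over the last quarter of the output.
inline double gainAtCenter(double f0, double Fs, double Q, bool unityPeak)
{
    const double pi = 3.14159265358979323846;
    std::vector<double> x(static_cast<std::size_t>(Fs));
    for (std::size_t n = 0; n < x.size(); ++n)
        x[n] = std::sin(2.0 * pi * f0 * static_cast<double>(n) / Fs);
    const std::vector<double> y = runBiquad(bandPass(f0, Fs, Q, unityPeak), x);
    double peak = 0.0;
    for (std::size_t n = x.size() - x.size() / 4; n < y.size(); ++n)
        peak = std::max(peak, std::fabs(y[n]));
    return peak;
}
```

With the skirt-gain variant a tone at f0 comes out scaled by about Q per stage, so two identical stages in cascade would scale it by about Q squared. That could account for a changed output amplitude, though it does not by itself explain a response at the wrong frequency.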

IPP multi-threaded libraries are not installed - static link

Hello,

My error is:

...v110\ImportBefore\Intel.Libs.IPP.v110.targets(92,5): error : IPP multi-threaded libraries are not installed.

I have one computer on which I compiled a project with IPP, and I linked the lib created from this project into another project. On this computer I have Intel Parallel Studio 2015 installed.

My goal is to link the IPP project into the other project without having to install IPP for all the other developers on my team.

The error I'm getting probably means that IPP is not installed on the other computer.

How can I compile an IPP-dependent project into a lib, so that other projects won't need to have IPP installed? I can attach the IPP libs and includes, but I don't want to have all the developers install IPP.

 

Inverse Fourier Transform

Hello,

I am struggling a bit to find a function that would allow me to perform an inverse discrete Fourier transform. I am using Intel IPP 7.1.

I performed the forward FFT with the ippsFFTFwd_CToC_64fc function and the IPP_FFT_NODIV_BY_ANY flag. How can I invert it?
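The scaling side of this question can be illustrated without IPP. The sketch below is a naive O(N^2) DFT that applies no scaling in either direction, which mirrors what a "no division" flag implies; it shows that a forward transform followed by an inverse returns N times the input. The names `dftNoScaling` and `roundTrip` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;

// Naive O(N^2) DFT without any scaling.
// sign = -1 gives the forward transform, sign = +1 the inverse.
inline std::vector<Complex> dftNoScaling(const std::vector<Complex>& in, int sign)
{
    assert(sign == 1 || sign == -1);
    const double pi = 3.14159265358979323846;
    const std::size_t N = in.size();
    std::vector<Complex> out(N);
    for (std::size_t k = 0; k < N; ++k)
    {
        Complex sum(0.0, 0.0);
        for (std::size_t n = 0; n < N; ++n)
        {
            const double angle = sign * 2.0 * pi * static_cast<double>(k * n)
                                 / static_cast<double>(N);
            sum += in[n] * Complex(std::cos(angle), std::sin(angle));
        }
        out[k] = sum;
    }
    return out;
}

// Forward then inverse, both unscaled, yields N times the input,
// so one division by N recovers the original samples.
inline std::vector<Complex> roundTrip(const std::vector<Complex>& in)
{
    std::vector<Complex> out = dftNoScaling(dftNoScaling(in, -1), +1);
    for (Complex& v : out)
        v /= static_cast<double>(in.size());
    return out;
}
```

In IPP the counterpart of the forward call is `ippsFFTInv_CToC_64fc`, used with the same FFT specification structure. Under the assumption that the specification was created with IPP_FFT_NODIV_BY_ANY, the inverse output would likewise need one division by N.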

 

where is the link for IPP JPEG and IPP-UIC ?

Hello

I want to download the IPP-based JPEG encoder/decoder sample application. Can you tell me the link (for IPP-UIC, IPP JPEG, etc.) so that I can download it? The old link below seems broken.

thanks
Frank

 


linear and nearest neighbor interpolation

function output = calculateBlackLevel(blStruct, AG, ET)
    blLut = zeros(length(blStruct), size(blStruct{1}.black_level,2));
    etLut = zeros(length(blStruct), 1);

    for k = 1 : length(blStruct)
        bl = blStruct{k};

        if length(bl.analog_gain) == 1
            blLut(k, :) = bl.black_level;
        elseif AG > max(bl.analog_gain) || AG < min(bl.analog_gain)
            blLut(k, :) = interp1(bl.analog_gain, bl.black_level, AG, 'nearest', 'extrap');
        else
            blLut(k, :) = interp1(bl.analog_gain, bl.black_level, AG);
        end

        etLut(k) = bl.exposure_time;
    end

    if length(etLut) == 1
        output = blLut;
    elseif ET > max(etLut) || ET < min(etLut)
        output = interp1(etLut, blLut, ET, 'nearest', 'extrap');
    else
        output = interp1(etLut, blLut, ET);
    end
end

 

Here is the MATLAB code I'm trying to convert. My question is: does IPP have any sort of interpolation functions?
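Independently of what IPP offers, the two interp1 modes used above are short enough to write by hand. This is a minimal scalar sketch, assuming sorted, strictly increasing sample points; `interpLinear` and `interpNearest` are hypothetical names, and the MATLAB code interpolates whole rows where this handles one value at a time.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Linear interpolation at x over sorted, strictly increasing sample points xs.
// Like the plain interp1(xs, ys, x) call, it expects x inside [xs.front(), xs.back()].
inline double interpLinear(const std::vector<double>& xs,
                           const std::vector<double>& ys, double x)
{
    assert(xs.size() == ys.size() && xs.size() >= 2);
    assert(x >= xs.front() && x <= xs.back());
    // Index of the first sample point >= x, kept at 1 or above so that
    // [hi - 1, hi] is always a valid segment.
    std::size_t hi = static_cast<std::size_t>(
        std::lower_bound(xs.begin(), xs.end(), x) - xs.begin());
    hi = std::max<std::size_t>(hi, 1);
    const std::size_t lo = hi - 1;
    const double t = (x - xs[lo]) / (xs[hi] - xs[lo]);
    return ys[lo] + t * (ys[hi] - ys[lo]);
}

// Nearest-neighbour lookup with extrapolation: outside the range the closest
// end sample is returned. Ties go to the lower index here, which may differ
// from MATLAB's tie-breaking.
inline double interpNearest(const std::vector<double>& xs,
                            const std::vector<double>& ys, double x)
{
    assert(xs.size() == ys.size() && !xs.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < xs.size(); ++i)
        if (std::fabs(xs[i] - x) < std::fabs(xs[best] - x))
            best = i;
    return ys[best];
}
```

A caller would choose between the two the same way the MATLAB function does: `interpNearest` when the query lies outside the table, `interpLinear` otherwise.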

 

Reinstatement of Intel® IPP in-place functions

Some users of older Intel® IPP releases may have noticed deprecation warnings on the in-place functions.

After reviewing the feedback from users, we decided to keep these in-place functions in the Intel® IPP releases.

The deprecation warnings have been removed as of the Intel® IPP 8.1 release. These functions continue to be supported.

Check here to find the new features in Intel® IPP 8.2. Your feedback on the product is welcome.

Video is getting swapped when the image is decompressed using Intel Media SDK

Hi,

Context: I am using the Intel Media SDK to decompress images from different camera inputs (such as 1080p and 720p) at the same time. I use a separate pipeline for each camera, and I have configured my system so that the inputs from 16 different cameras are decompressed. During rendering, one camera's view sometimes shows the images of all the other 15 cameras in sequence. Even if we stop the pipeline for that camera programmatically and reinitialize it, the issue does not disappear.

What could be the reason for this behaviour? Is the Intel graphics card doing any swapping internally?

intel deflate decompression implementation/library

Intel® Integrated Performance Primitives (Intel® IPP) upgrade options

Dear IPP users,

If you are presently using Intel® IPP in your applications and your license is expiring soon, we have exciting news for you regarding your Intel® IPP license extension.

Because our customers were seeing a lot of synergy in using Intel® IPP in combination with the various Intel development tools, Intel® IPP is now delivered along with other performance libraries such as the threading library (Intel® TBB), the Math Kernel Library (Intel® MKL) and the Intel compiler in our various suites (Intel® Parallel Studio XE, Intel® System Studio, or Intel® Integrated Native Developer Experience). The majority of our customers are already realizing the value that this change has brought.

As our existing customer you can either

  • Continue to renew the support maintenance for your existing Intel® IPP license or
  • Upgrade to one of our Intel Studio products based on your specific needs and enjoy a wider access to Intel performance libraries, threading libraries and Intel compilers.

Pick a suite which best fits your software application requirements. 

Product Name: Intel® System Studio
Type of Software Applications: Used in system software and applications for embedded or mobile devices, for example embedded applications in digital surveillance, test and measurement equipment, medical imaging, telecommunication, and multi-function printers.
The product supports Linux*, Android* and Windows* targets.

Product Name: Intel® Parallel Studio XE
Type of Software Applications: Used in enterprise and desktop applications with a focus on parallelization and vectorization optimization. The product supports Windows*, Linux*, and OS X*.

Product Name: Intel® Integrated Native Developer Experience (Intel® INDE)
Type of Software Applications: Used in any C++/Java* applications that have to support cross-OS, cross-architecture development for Windows* on Intel® architecture and Android* on Intel® architecture and ARM*.
Supported host systems: Windows*, OS X*.
Supported target systems: Android*, Windows*, OS X*.

For buying and renewal options for Intel® IPP, please contact us at intel.software.sales@intel.com, or visit https://software.intel.com/en-us/intel-ipp/try-buy

Compiling G729 for PJSIP

Hello Sir/Madam,

I need to integrate the G.729 codec into a PJSIP project. I am developing softphone dialers for Android and iOS. From the PJSIP website I learned that I need to download some IPP samples and compile them. When I clicked on the link provided by PJSIP, I saw three different products:

1. Part of Intel® Parallel Studio XE

2. Part of Intel® System Studio

3. Part of Intel® Integrated Native Developer Experience (INDE)

Out of these three, which one do I need to purchase to compile G.729?

Also, the PJSIP website gives a link for downloading the sample IPP project (http://www.intel.com/software/products/ipp/samples.htm), but when I click this link it shows 'Page not found'. Please help me out. As I am new to this, I don't know where to start. Is there any step-by-step documentation available for G.729 compilation?

Thanks in advance.

-

Shuhaib

h264 developing in network

Hi, I am developing a network video project, but I find that the CPU usage of IPP H.264 decoding and encoding is very high, more than 130 percent. I hope an Intel developer can give me some advice. Thanks!


IPP 7.0

Hi,

We purchased the IPP SDK from Intel about 3 years ago and built a direct show filter to use the SDK to decode H264 video. We use the SDK as a static library. Our decoder filter is based on the w_ipp-samples_p_7.0.5.059 sample from Intel.

We ran into a problem recently when we tried to decode 1080p video streams at 30 fps and 4 Mbps or higher. The decoded video stutters whenever the decoder processes an I-Frame. For example, in our video of moving cars we see the cars pause and resume every time an I-Frame needs to be rendered. The I-Frame size is between 170K and 200K bytes. We found that the pauses might come from the GetFrame() call: the function takes about 100 milliseconds to decode an I-Frame.

Is this a known problem? Is there a newer version of IPP that fixes it? Any suggestions and help would be greatly appreciated.

Thanks

CPU feature recognition not always working

Hey there,

I am facing a problem with ippInit()'s automatic recognition of the available and enabled features. This only seems to happen on a 32-bit WinXP SP3 notebook running an Intel i3-2348M. Sometimes our software crashed with an illegal instruction error, and we were able to identify an AVX instruction that was being executed. Since WinXP is not able to handle AVX at all, it should not be enabled. This holds in about 90-95% of the cases, but in the remaining 5-10% IPP selects the g9 arch, which would be AVX capable. Most of the time the p8 arch is selected, which is totally fine for WinXP and the given CPU.

Now I am not sure how the initialization of the enabled instruction sets works. I suspect that AVX needs to be enabled by the OS kernel, but I am not quite sure about that.

My suspicion is based on this little code example I used for my tests:

#include "stdafx.h"

#include "immintrin.h"
#include <iostream>

#include "ippi.h"
#include "ippcore.h"
#include <Windows.h>

int _tmain(int argc, _TCHAR* argv[])
{
 	__m256* a;
 	__m256 b;

	int i = 0;

	Ipp64u features;
	ippGetCpuFeatures(&features, 0);
 	IppStatus status = ippInit();

	std::cout << ippGetLibVersion()->Version << "" << ippiGetLibVersion()->targetCpu << "; hasAVX: " << (features & ippCPUID_AVX) << "; hasOSAVX: " << (features & ippAVX_ENABLEDBYOS) << std::endl;

	while (i < 100)
	{
		a = new __m256;
		std::cout << "*a = " << ((float*)a)[0] << "" << ((float*)a)[1] << "" << ((float*)a)[2] << "" << ((float*)a)[3] << std::endl;
		std::cout << "a = " << a << std::endl;

		b = _mm256_loadu_ps((float*)a);
		std::cout << "b = " << ((float*)&b)[0] << ((float*)&b)[1] << ((float*)&b)[2] << ((float*)&b)[3] << std::endl;
 		++i;
	}

	return 0;
}

This usually gives us a p8 arch on WinXP, but sometimes we get the mentioned g9. This behavior can sometimes be seen more often after a reboot. The latter part of the code (the while loop) exercises an AVX instruction. This works on some Win8.1 machines but crashes on WinXP, though it sometimes gets through one iteration. I know that this could result from (un)lucky timing with the Windows scheduler. To test the selected arch, I commented out the loop and executed the program 100 times using a batch for loop.

Still, my colleagues and I have no clue why IPP is selecting the wrong arch. Right now we catch that case and initialize with the p8 arch manually. Does anyone have a clue? Thanks in advance.
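For reference, the documented rule for deciding whether AVX may be used combines two CPUID bits with the XCR0 register. The sketch below only encodes that decision on values passed in by the caller, so it stays portable; `avxUsable` is a hypothetical name, and reading CPUID and XCR0 (via XGETBV) is left out.

```cpp
#include <cassert>
#include <cstdint>

// Decide whether AVX may be used, given CPUID leaf 1 ECX and the value of
// XCR0 (obtained with XGETBV, index 0, when OSXSAVE is set).
inline bool avxUsable(std::uint32_t cpuid1Ecx, std::uint64_t xcr0)
{
    const std::uint32_t kOsxsave = 1u << 27;  // OS has enabled XSAVE/XGETBV
    const std::uint32_t kAvx     = 1u << 28;  // CPU implements AVX
    const std::uint64_t kXmmYmm  = 0x6;       // XCR0 bit 1 (SSE state), bit 2 (AVX state)

    // Without OSXSAVE the XCR0 value is meaningless and XGETBV must not be run.
    if ((cpuid1Ecx & kAvx) == 0 || (cpuid1Ecx & kOsxsave) == 0)
        return false;
    return (xcr0 & kXmmYmm) == kXmmYmm;
}
```

On an OS that never enables XSAVE, which I would expect of WinXP, the OSXSAVE bit stays clear and the function returns false whatever the CPU supports. Whether ippInit() follows exactly this sequence is an assumption, not something the post confirms.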

ippiConv

Hi. Can anybody help me with this problem? I have 2 images, src1 and src2, of Mat type.

I want to use ippiConv to convolve them, but I get a memory access problem.

Here is my code:

    Mat src1, src2, dest;

    const IppiSize src1Size = { src1.size().width, src1.size().height };
    const IppiSize src2Size = { src2.size().width, src2.size().height};
    
    
    IppEnum funCfgFull = (IppEnum)(ippAlgAuto | ippiROIFull | ippiNormNone);
    IppEnum funCfgValid = (IppEnum)(ippAlgAuto | ippiROIValid | ippiNormNone);
    Ipp8u *pBuffer;
    int bufSizeFull, bufSizeValid, bufSizeMax;
    IppStatus status;    
    

    status = ippiConvGetBufferSize(src1Size, src2Size, ipp32f, 1, funCfgFull, &bufSizeFull);
    
    status = ippiConvGetBufferSize(src1Size, src2Size, ipp32f, 1, funCfgValid, &bufSizeValid);
    
    bufSizeMax = IPP_MAX(bufSizeFull, bufSizeValid); // get max buffer size
    pBuffer = ippsMalloc_8u(bufSizeMax);

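    // With ippiROIFull the result is (w1 + w2 - 1) x (h1 + h2 - 1), so dest must be that large
    dest.create(src1.rows + src2.rows - 1, src1.cols + src2.cols - 1, CV_32F);
    const IppiSize dstSize = { dest.size().width, dest.size().height };

    ippiSet_32f_C1R(0, (Ipp32f*)dest.data, dest.step, dstSize);

    // Pass dest.data itself, not the address of the pointer member
    ippiConv_32f_C1R((Ipp32f*)src1.data, src1.step, src1Size, (Ipp32f*)src2.data,
        src2.step, src2Size, (Ipp32f*)dest.data, dest.step, funCfgFull, pBuffer);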

    ippsFree(pBuffer);

Thank you in advance
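The destination size is the usual stumbling block with a "full" convolution, so here is a plain C++ reference sketch without IPP or OpenCV. `Image` is a hypothetical stand-in for a single-channel float matrix and `convolveFull` a hypothetical name; the point is only that the output is (w1 + w2 - 1) x (h1 + h2 - 1).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major single-channel float image; a hypothetical stand-in for cv::Mat.
struct Image
{
    int width, height;
    std::vector<float> data;

    Image(int w, int h)
        : width(w), height(h), data(static_cast<std::size_t>(w) * h, 0.0f) {}

    float& at(int x, int y) { return data[static_cast<std::size_t>(y) * width + x]; }
    float at(int x, int y) const { return data[static_cast<std::size_t>(y) * width + x]; }
};

// Full 2-D convolution: every product a(ax, ay) * b(bx, by) is accumulated
// at (ax + bx, ay + by), so the result spans (wa + wb - 1) x (ha + hb - 1).
inline Image convolveFull(const Image& a, const Image& b)
{
    assert(a.width > 0 && a.height > 0 && b.width > 0 && b.height > 0);
    Image out(a.width + b.width - 1, a.height + b.height - 1);
    for (int ay = 0; ay < a.height; ++ay)
        for (int ax = 0; ax < a.width; ++ax)
            for (int by = 0; by < b.height; ++by)
                for (int bx = 0; bx < b.width; ++bx)
                    out.at(ax + bx, ay + by) += a.at(ax, ay) * b.at(bx, by);
    return out;
}
```

A destination allocated with the size of the first source alone is therefore too small for a full-ROI result whenever the second image is larger than 1 x 1.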

 

Single threaded IPP and external parallelization

Hello,

I am implementing an application which uses single threaded IPP and external parallelization via MS OpenMP.

Below you can find a piece of the source code which I used for some tests (the full code is attached to the post).

for (auto t = 1; t <= maxThreads; t++)
{
	auto start = clock();
	#pragma omp parallel default(shared) num_threads(t)
	{
		auto id = omp_get_thread_num();
		auto buffer = buffers[id];
		auto step = steps[id];

		#pragma omp for schedule(dynamic, 1)
		for (auto i = 0; i < count; i++)
			ippiDivC_32f_C1IR(1.0f, buffer, step, roi);
	}
	auto stop = clock();
	cout << "threads=" << t << " time=" << (stop - start) << endl;
}

The application code is very simple. It just measures the execution time of a calculation using IPP as a function of the number of threads used for the processing.

For width=5000, height=5000 and count=100 I've obtained the following results:

Intel Core i7-3770 CPU @ 3.40GHz
version=7.0 build 205.58 name=ippie9_l.lib
threads=1 time=982
threads=2 time=947
threads=3 time=945
threads=4 time=957

Intel Xeon CPU E5-1660 0 @ 3.30GHz
version=7.0 build 205.58 name=ippie9_l.lib
threads=1 time=988
threads=2 time=698
threads=3 time=679
threads=4 time=678
threads=5 time=678
threads=6 time=699

As you can see, it is very difficult to get any significant speedup using multiple threads. My question is: what is the reason for the above behavior? Could you please tell me what the bottleneck of the described solution is?

Thank you in advance for your help.

Krzysztof Piotrowski.

Attachment: IPPOpenMP.zip (4.85 KB)
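A rough way to read the timings above is to convert them into memory traffic. The sketch assumes each in-place pass reads and writes every 32-bit pixel once and that the printed clock() values are milliseconds (as with CLOCKS_PER_SEC = 1000 on MSVC); `bytesMoved` and `effectiveGBps` are hypothetical names.

```cpp
#include <cassert>

// Bytes moved by `passes` in-place passes over a width x height image of
// 32-bit floats: every pass reads and writes each pixel once.
inline double bytesMoved(int width, int height, int passes)
{
    assert(width > 0 && height > 0 && passes > 0);
    return 2.0 * sizeof(float) * width * height * passes;
}

// Effective memory throughput in GB/s (1 GB = 1e9 bytes) for a run that
// took `milliseconds`.
inline double effectiveGBps(double bytes, double milliseconds)
{
    assert(milliseconds > 0.0);
    return bytes / (milliseconds * 1e-3) / 1e9;
}
```

For width=5000, height=5000 and count=100 this gives 20 GB of traffic, so 982 ms corresponds to about 20 GB/s and 678 ms to about 29 GB/s. Those figures are of the order of what the memory subsystem of such machines can sustain, which suggests the loop may be limited by memory traffic rather than by arithmetic; that would explain why extra threads add little.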

About setting FTZ mode using Intel Compiler Option.

I have found two conflicting statements, as below:

1. It can be set

The /Qftz- option can be used to disable FTZ mode.

This is described by the sentence "The -ftz and /Qftz options, when applied to the main program, set the FTZ and the DAZ hardware flags. The -no-ftz and /Qftz- options leave the flags as they are". See the following link: https://software.intel.com/sites/products/documentation/doclib/iss/2013/compiler/cpp-lin/GUID-1659EAE1-583E-44EE-BDEA-7C68C46061C7.htm

2. It can't be set

FTZ mode can't be set using an Intel compiler option.

This is described by the sentence "I've said that IPP is built without this switch and FTZ bit is set to 1 in the IPP dllMailn() function - this behavior can't be changed in your app.". See the following link: https://software.intel.com/en-us/forums/topic/542786

So, can FTZ mode be set using an Intel compiler option or not?

If it can, why does the /Qftz- option have no effect for me? Is the version of IPP the reason?

(I use: Visual Studio 2008 SP1, Intel Parallel Studio 2011)
