Different results when using OpenMP and FFTW

I am trying to parallelize the following loop:

    #pragma omp parallel for private(j,i,mxy) firstprivate(in,out,p)
    for(int j = 0; j < Ny; j++) {
        // #pragma omp parallel for priv...
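
One thing that may be worth checking, assuming `in` and `out` are pointers to FFTW buffers and `p` is an `fftw_plan` (the excerpt does not show their declarations): `firstprivate` on a pointer copies only the pointer value, so every thread would still read and write the same underlying memory. A plan run with `fftw_execute(p)` also always uses the arrays it was created with. Below is a minimal sketch of the usual workaround, giving each thread its own scratch buffers by allocating them inside the parallel region. `transform_row` and `process_rows` are hypothetical names, and `transform_row` is only a stand-in for the FFT so that the sketch builds without FFTW.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-row transform (the FFT in the question).
   It reads nx values from `in` and writes nx values to `out`. */
static void transform_row(const double *in, double *out, int nx)
{
    double sum = 0.0;
    for (int i = 0; i < nx; i++) {
        sum += in[i];
        out[i] = sum;               /* running sum of the row */
    }
}

/* Transform each row of the row-major ny x nx array `data` into `result`.
   Returns 0 on success, -1 if a scratch buffer could not be allocated. */
int process_rows(const double *data, double *result, int ny, int nx)
{
    int failed = 0;
    assert(ny > 0 && nx > 0);

    #pragma omp parallel
    {
        /* Declared inside the parallel region, so each thread gets its own
           buffers rather than a private copy of a shared pointer. */
        double *in  = malloc((size_t)nx * sizeof *in);
        double *out = malloc((size_t)nx * sizeof *out);

        if (!in || !out) {
            #pragma omp atomic write
            failed = 1;
        }

        #pragma omp for
        for (int j = 0; j < ny; j++) {
            if (in && out) {
                /* Loop-local variables such as i are private automatically. */
                for (int i = 0; i < nx; i++)
                    in[i] = data[(size_t)j * nx + i];
                transform_row(in, out, nx);
                for (int i = 0; i < nx; i++)
                    result[(size_t)j * nx + i] = out[i];
            }
        }

        free(in);
        free(out);
    }
    return failed ? -1 : 0;
}
```

With FFTW itself, the analogous step would be to create the plan once before the parallel region (the planner routines are not thread-safe) and call `fftw_execute_dft(p, in, out)` inside the loop with per-thread buffers obtained from `fftw_malloc`, so that they have the same alignment as the arrays used for planning. Whether this explains the differing results depends on the part of the loop that is cut off above.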