display CU_PARAM_TR_DEFAULT in checkdrv and what is it? #51

Open

Roger-luo opened this issue Jun 28, 2016 · 1 comment

Roger-luo commented Jun 28, 2016

When I try this function on a server with a Tesla GPU (device initialization and other setup are done outside the function), cuModuleLoad returns -1 in the console. I couldn't find this value in cuModuleLoad's documentation; all the documented return codes are zero or positive.

WARNING: /home/quaninfo/rogerluo/.julia/v0.4/Quantize/src/utils/cuda/cuMatrix.ptx
ERROR: LoadError: KeyError: -1 not found
 in checkdrv at /home/quaninfo/rogerluo/.julia/v0.4/CUDArt/src/module.jl:14
 in call at /home/quaninfo/rogerluo/.julia/v0.4/CUDArt/src/module.jl:24
 in diagexp at /home/quaninfo/rogerluo/.julia/v0.4/Quantize/src/utils/cuda/cuMatrix.jl:19
 in diagexp at /home/quaninfo/rogerluo/.julia/v0.4/Quantize/src/utils/cuda/cuMatrix.jl:29
 in realtimeop! at /home/quaninfo/rogerluo/.julia/v0.4/Quantize/src/Adiabatic/timeop.jl:5
 in next_timestep! at /home/quaninfo/rogerluo/.julia/v0.4/Quantize/src/Adiabatic/timeop.jl:20
 [inlined code] from util.jl:155
 in adia at /home/quaninfo/rogerluo/cooling-12.jl:8
 in include at ./boot.jl:261
 in include_from_node1 at ./loading.jl:320
 in process_options at ./client.jl:280
 in _start at ./client.jl:378
while loading /home/quaninfo/rogerluo/cooling-12.jl, in expression starting on line 30
function diagexp(A::CudaArray{Complex64})
    # Load the compiled PTX module and fetch the kernel
    md = CuModule("$dir/src/utils/cuda/cuMatrix.ptx", false)
    diagexp = CuFunction(md, "diagexp_cf")
    # Scale the grid size by the number of multiprocessors on the device
    nsm = attribute(device(), rt.cudaDevAttrMultiProcessorCount)
    mul = min(32, ceil(Int, length(A)/(256*nsm)))
    # Allocate the output array and launch mul*nsm blocks of 256 threads
    expH = CudaArray(Complex64, size(A)...)
    launch(diagexp, mul*nsm, 256, (A, expH, length(A)))
    return expH
end

The CUDA documentation describes CU_PARAM_TR_DEFAULT as:

For texture references loaded into the module, use default texunit from texture reference.
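For context, CU_PARAM_TR_DEFAULT is the cuda.h constant defined as -1; it names a sentinel used for texture-reference parameters and is not itself a CUresult error code. So the KeyError above is what a checkdrv-style wrapper produces when it looks up a raw return value of -1 in a table that only contains valid CUresult codes. Below is a minimal sketch of that failure mode; the dictionary is a hypothetical, abbreviated stand-in, not CUDArt's actual error table.

# Hypothetical, abbreviated stand-in for a CUresult-to-name table; -1 is not a
# CUresult, so no such table contains it.
const driver_error_names = Dict{Int, Symbol}(
    0 => :CUDA_SUCCESS,
    1 => :CUDA_ERROR_INVALID_VALUE,
    2 => :CUDA_ERROR_OUT_OF_MEMORY,
)

# checkdrv-style wrapper: return on success, otherwise report the error by name.
function check_driver(code::Integer)
    code == 0 && return nothing
    error("CUDA driver error: ", driver_error_names[code])
end

check_driver(-1)   # throws `KeyError: -1 not found`, matching the trace above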

However, the same code works fine on my own laptop with a GT730M GPU.

As a newcomer to CUDA, I'm not familiar with this error/warning. Does anyone know how to solve it?


Roger-luo commented Jun 28, 2016

Sorry, I need to reopen this, as the error is stopping the program again.

I used the following code to test whether the problem is with the server or with bsub, but the CUDA driver-API C code works fine both under bsub and when run manually:

// host code (driver API); the device code is loaded from diagexp.ptx
#include <cuda.h>
#include <stdio.h>
#include <stdlib.h>   // malloc/free

typedef struct
{
    float x;
    float y;
}complex;


int main()
{
    CUdevice device;
    CUcontext context;
    CUmodule module;
    CUfunction kernel;
    CUdeviceptr dptr[2];

    cuInit(0);

    cuDeviceGet(&device, 0);
    cuCtxCreate(&context,CU_CTX_SCHED_AUTO,device);
    int t = cuModuleLoad(&module,"diagexp.ptx");
    cuModuleGetFunction(&kernel,module,"diagexp_cf");
    printf("%d",t);

    #define n_thread 3
    size_t size = n_thread*sizeof(complex);
    cuMemAlloc(&dptr[0],size);
    cuMemAlloc(&dptr[1],size);

    complex *A = (complex *)malloc(3*sizeof(complex));
    complex *C = (complex *)malloc(3*sizeof(complex));
    for(int i=0;i<3;i++)
    {
        A[i].x = i;
        A[i].y = 0;
        C[i].x = 0;
        C[i].y = 0;
    }

    cuMemcpyHtoD(dptr[0],A,size);
    cuMemcpyHtoD(dptr[1],C,size);

    int len = 3;
    void *params[] = {&dptr[0],&dptr[1],&len};
    cuLaunchKernel(kernel,1,1,1,3,1,1,0,NULL,params,0);
    cuCtxSynchronize();

    cuMemcpyDtoH(C,dptr[1],size);
    for(int i=0;i<3;i++)
        printf("%f\t%f\n",C[i].x,C[i].y);

    cuMemFree(dptr[0]);
    cuMemFree(dptr[1]);
    free(A);
    free(C);
    cuModuleUnload(module);
    cuCtxDestroy(context);
    return 0;
}
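(For reference, a host-side driver-API test like this is typically built by compiling the kernel separately with something like nvcc -ptx diagexp.cu, then compiling the host file with the CUDA include path and linking against the driver library with -lcuda; exact paths depend on the CUDA installation.)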

This works fine both under bsub on the server and on my laptop.

But the -1 error still occurs when I submit the Julia code to the server. My program is here: AdiaComput
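One way to narrow this down further (not from the original thread; the function name, PTX path, and library name below are placeholders) is to call the driver API directly from Julia with ccall inside the same bsub job, and compare the raw CUresult from cuModuleLoad with the -1 that CUDArt reports. If the raw call returns an ordinary CUresult, the -1 is being introduced on the wrapper side; if it really returns -1, the problem is below Julia. A sketch in Julia 0.4-era syntax, assuming libcuda is on the loader path:

# Minimal sketch: query the raw CUresult from cuModuleLoad via ccall,
# bypassing CUDArt's checkdrv. Names and the PTX path are placeholders.
const libcuda = "libcuda"

function raw_cuModuleLoad(ptxpath)
    @assert ccall((:cuInit, libcuda), Cint, (Cuint,), 0) == 0
    dev = Ref{Cint}(0)
    @assert ccall((:cuDeviceGet, libcuda), Cint, (Ptr{Cint}, Cint), dev, 0) == 0
    ctx = Ref{Ptr{Void}}(C_NULL)
    # cuCtxCreate is exported from libcuda as cuCtxCreate_v2
    @assert ccall((:cuCtxCreate_v2, libcuda), Cint,
                  (Ptr{Ptr{Void}}, Cuint, Cint), ctx, 0, dev[]) == 0
    cumod = Ref{Ptr{Void}}(C_NULL)
    ret = ccall((:cuModuleLoad, libcuda), Cint,
                (Ptr{Ptr{Void}}, Ptr{UInt8}), cumod, ptxpath)
    return ret   # 0 is CUDA_SUCCESS; any nonzero value is a genuine CUresult
end

println(raw_cuModuleLoad("diagexp.ptx"))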

Roger-luo reopened this Jun 28, 2016