pedrovalerolara/simple-gemm

 
 

simple-gemm

Collection of simple General Matrix Multiplication - GEMM implementations

C = a * (A x B) + C

If a = 1 and C is all zeros, this reduces to:

C = A x B

A and B are initialized with random numbers; C is initialized with zeros.

Arguments are always 3 matrix dimensions: args = [A_rows, A_cols (= B_rows), B_cols]

e.g. 5 5 5 or 10 10 10
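
As a reference for the formula and the argument convention above, here is a minimal pure-Python sketch. It is not code from this repository: the names `gemm`, `random_matrix`, and `run` are hypothetical, and the naive triple loop is meant only to show what is computed, not to be fast.

```python
import random


def gemm(a, A, B, C):
    """Return a * (A x B) + C for matrices stored as lists of rows."""
    rows, inner, cols = len(A), len(B), len(B[0])
    out = [row[:] for row in C]  # copy so the caller's C is left untouched
    for i in range(rows):
        for k in range(inner):
            aik = a * A[i][k]
            for j in range(cols):
                out[i][j] += aik * B[k][j]
    return out


def random_matrix(rows, cols):
    """Build a rows x cols matrix of uniform random numbers in [0, 1)."""
    return [[random.random() for _ in range(cols)] for _ in range(rows)]


def run(args):
    """args = [A_rows, A_cols (= B_rows), B_cols], as in the commands below."""
    a_rows, a_cols, b_cols = (int(x) for x in args)
    A = random_matrix(a_rows, a_cols)
    B = random_matrix(a_cols, b_cols)
    C = [[0.0] * b_cols for _ in range(a_rows)]  # C starts as zeros
    return gemm(1.0, A, B, C)  # a = 1 and C = zeros, so this is A x B
```

Calling `run([5, 5, 5])` mirrors the `5 5 5` invocation and returns a 5 x 5 result.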

CPU multithreading:

  • GemmDenseThreads: native Julia Threads implementation

    $ cd GemmDenseThreads
    $ julia -t 4 gemm-dense-threads.jl 5 5 5    
    
  • GemmDenseThreads.py: native Python Numba Threads implementation

    $ cd python/GemmDenseThreads
    $ NUMBA_NUM_THREADS=4 python3 GemmDenseThreads.py 5 5 5    
    
  • GemmDenseBlas: uses LinearAlgebra.jl (very fast). If Julia is built with OpenBLAS, set OPENBLAS_NUM_THREADS to choose the number of BLAS threads

    $ cd GemmDenseThreads
    $ OPENBLAS_NUM_THREADS=4 julia gemm-dense-blas.jl 5 5 5    
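
The threaded variants above divide the work among a fixed number of threads. One common way to do that, sketched below with only the Python standard library, is to give each thread a contiguous block of rows of C. This is an illustration under assumptions, not the repository's Julia or Numba code: `gemm_rows` and `gemm_threads` are hypothetical names, and because of CPython's global interpreter lock these pure-Python loops show the partitioning but will not run faster with more threads.

```python
from concurrent.futures import ThreadPoolExecutor


def gemm_rows(A, B, C, row_start, row_stop):
    """Accumulate A x B into C for rows in [row_start, row_stop)."""
    inner, cols = len(B), len(B[0])
    for i in range(row_start, row_stop):
        for k in range(inner):
            aik = A[i][k]
            for j in range(cols):
                C[i][j] += aik * B[k][j]


def gemm_threads(A, B, C, num_threads=4):
    """Split the rows of C into contiguous chunks, one task per chunk."""
    rows = len(A)
    chunk = max(1, -(-rows // num_threads))  # ceiling division
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = [
            pool.submit(gemm_rows, A, B, C, start, min(start + chunk, rows))
            for start in range(0, rows, chunk)
        ]
        for future in futures:
            future.result()  # re-raise any exception from a worker
    return C
```

Each task writes to a disjoint set of rows of C, so no locking is needed.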
    

GPU:

  • GemmDenseCUDA: uses CUDA.jl, which calls the optimized cuBLAS library (very fast) on NVIDIA GPUs

    $ cd GemmDenseCUDA
    $ julia gemm-dense-cuda.jl 5 5 5
    
