# TODO List

The TODO list for this project is broken into coarse- and fine-grained tasks.

## Coarse Task List


- Memory manager for handling host, device, and managed memory
- Tests for the memory manager
- Tensor wrapper around the memory manager for multi-axis data storage
- Tests for the tensor
- Create Makefile and build/install system
- Create docs and Doxygen config
- Compute graph and basic operations for tensors
- Tests for the compute graph and tensor operations
- Link with BLAS/LAPACK (OpenBLAS?) and MAGMA
- Better MKL support
- Basic layer classes (Dense, Activation, Flatten, CNN)
- Model with forward/backward propagation
- Tests for Model/Layer training
- Optimizers
- Tests for optimizers
- Parallel training (multi-GPU)
- Tests for parallel training
- Examples in the Examples/ folder
- Tutorial / presentation slides
- Automatic or numerical gradient computation
- Tests for gradient computation
- I/O methods for tensors
- Tests for tensor I/O
- Batch loaders
- 100% documentation coverage
- Establish/connect with a build pipeline
- Preprocessing methods (PCA, LDA, encoding)
- Tests for preprocessing methods
- Implement RNNs
- Tests for RNNs
- Compute graph optimizers/minimizers
- Hyperparameter optimization tools
- Tests for hyperparameter optimization tools
- Package/install configuration (deb packages, etc.)
- Tune compilation and runtime parameters to the hardware
- Test on different hardware (Intel, AMD, NVIDIA)
- OpenCL support (possibly, perhaps with a different BLAS)
- AMD support (work with Frontier)
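
The memory-manager task at the top of this list could be sketched roughly as below. This is a host-only illustration and the names (`MemoryManager`, `MemoryType`) are assumptions about the eventual API, not the actual implementation; the `DEVICE` and `MANAGED` paths would wrap `cudaMalloc`/`cudaMallocManaged` and are stubbed out here so the sketch compiles without CUDA:

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>

// Memory spaces the manager is meant to handle.
enum class MemoryType { HOST, DEVICE, MANAGED };

// Minimal host-only sketch. A real implementation would dispatch
// DEVICE/MANAGED allocations to cudaMalloc/cudaMallocManaged and
// free them with cudaFree.
class MemoryManager {
public:
    MemoryManager(std::size_t n_floats, MemoryType type)
        : size_(n_floats), type_(type), data_(nullptr) {
        if (type_ != MemoryType::HOST)
            throw std::runtime_error("only HOST memory in this sketch");
        // calloc zero-initializes, matching a "fresh buffer" expectation
        data_ = static_cast<float*>(std::calloc(size_, sizeof(float)));
        if (!data_) throw std::bad_alloc();
    }
    ~MemoryManager() { std::free(data_); }

    // Non-copyable: the manager exclusively owns its buffer.
    MemoryManager(const MemoryManager&) = delete;
    MemoryManager& operator=(const MemoryManager&) = delete;

    float get(std::size_t i) const { return data_[i]; }
    void  set(std::size_t i, float v) { data_[i] = v; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    MemoryType  type_;
    float*      data_;
};
```

The tensor wrapper task would then layer shape/stride logic for multi-axis indexing on top of this flat buffer.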

## Fine Task List


- Ensure CPU and GPU training produce the same results
- Revise the memory system with the compute graph and tensors; check with gdb. Possibly replace raw MemoryManager pointers with reference-counting smart pointers
- Remove unused operation _internal files
- Check and fix the speed of get/set with vector access
- Tensor axis iterators
- CPU-only convolution
- Fast ReduceSum
- Scalar network output bug (cuDNN ReduceSum issue)
- Adam optimizer
- HDF5 (.h5) and/or ONNX model load/save
- Add a NeuralNetwork constructor that takes a custom loss function
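
The reference-counting item above (replacing raw MemoryManager pointers) could look roughly like this sketch; `std::shared_ptr` with a custom deleter does the counting, so the last tensor referencing a buffer frees it exactly once. The factory name here is hypothetical, and a device-side variant would swap `std::free` for `cudaFree` in the deleter:

```cpp
#include <cstddef>
#include <cstdlib>
#include <memory>

// Wrap a raw host buffer in a shared_ptr with a custom deleter.
// Every copy of the returned pointer shares ownership; the buffer
// is freed once, when the last owner is destroyed.
std::shared_ptr<float> make_host_buffer(std::size_t n) {
    float* raw = static_cast<float*>(std::calloc(n, sizeof(float)));
    return std::shared_ptr<float>(raw, [](float* p) { std::free(p); });
}
```

With this scheme, tensors that view the same memory (e.g. slices or compute-graph intermediates) simply copy the `shared_ptr`, avoiding the double-free and dangling-pointer bugs that motivate the gdb check above.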