This repository contains the implementation of 'Prediction of gene expression from local regulatory sequence using transformer-based deep learning'.
This package provides the following resources: Colab notebooks containing the source code of the DNABERT-reg model, a walkthrough of the analysis performed in the paper, pre-trained model parameters, and fine-tuned model parameters (based on a dataset constructed from the gene expression dataset ENCFF910TAZ).
We fine-tuned the pre-trained DNABERT-reg model on a Tesla P100-PCIE-16GB GPU with 26GB of memory. Please adjust the batch size to suit your own hardware.
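As a rough starting point for adjusting the batch size, the sketch below scales a reference batch size by the ratio of available GPU memory to the 16GB of the P100, then rounds down to a power of two. The function name and the reference batch size of 32 are hypothetical and are not taken from the notebooks. Memory use does not grow strictly linearly with batch size, so treat the result as a first guess and lower it if you hit out-of-memory errors.

```python
def scale_batch_size(reference_batch_size, reference_gpu_gb, available_gpu_gb):
    """Estimate a batch size for a GPU with a different amount of memory.

    Assumes memory use grows roughly linearly with batch size, which is
    only an approximation. The result is rounded down to a power of two
    and is never smaller than 1.
    """
    raw = reference_batch_size * available_gpu_gb / reference_gpu_gb
    batch = 1
    while batch * 2 <= raw:
        batch *= 2
    return batch


# Hypothetical reference value: replace 32 with the batch size actually
# used in the notebook you are running.
REFERENCE_BATCH_SIZE = 32
REFERENCE_GPU_GB = 16  # Tesla P100-PCIE-16GB

# Example: estimate a batch size for a GPU with 8GB of memory.
suggested = scale_batch_size(REFERENCE_BATCH_SIZE, REFERENCE_GPU_GB, 8)
print(suggested)
```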
The Colab notebooks can be inspected on GitLab; inputs and intermediate data can be inspected through the Google Drive link below, under inputs.
To run the analysis, attach the shared Google Drive folder below to your own Drive using the "Add shortcut to Drive" option:
https://drive.google.com/drive/folders/1inxu93Et8VA7iyCsTnQeiH2q5TTijLtM?usp=sharing
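Once the shortcut has been added, the shared folder can be reached from a Colab notebook by mounting your Drive. The following is a minimal sketch and not necessarily how the notebooks do it. `SHORTCUT_NAME` is a hypothetical placeholder for whatever name the shortcut has in your Drive, and the helper functions are illustrative only. `drive.mount` from `google.colab` is only available inside Colab, so the import is guarded.

```python
import posixpath  # Colab runs on Linux, so POSIX-style paths are used

MOUNT_POINT = "/content/drive"
# Hypothetical placeholder: use the name the shortcut has in your own Drive.
SHORTCUT_NAME = "DNABERT-reg"


def shared_folder_path(mount_point=MOUNT_POINT, shortcut_name=SHORTCUT_NAME):
    """Return the path where the shortcut appears after mounting.

    Items in "My Drive" show up under the MyDrive directory of the mount.
    """
    return posixpath.join(mount_point, "MyDrive", shortcut_name)


def mount_drive(mount_point=MOUNT_POINT):
    """Mount Google Drive when running in Colab; return False elsewhere."""
    try:
        from google.colab import drive  # only importable inside Colab
    except ImportError:
        return False
    drive.mount(mount_point)  # prompts for authorization in the notebook
    return True


print(shared_folder_path())
```

In a notebook, call `mount_drive()` first, then read the inputs from the path returned by `shared_folder_path()`.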
Siu Pui Chung