Bridging Citation Analysis and Language Models: A Hybrid Recommender System for Computer Science Papers
This repository presents my master's thesis, "Bridging Citation Analysis and Language Models: A Hybrid Recommender System for Computer Science Papers", submitted to the University of Göttingen on September 26th, 2023, as a requirement for the Master of Science in Applied Statistics.
The thesis was supervised by Dr. Corinna Breitinger and Dr. Terry Ruas from the Chair for Scientific Information Analytics at the University of Göttingen.
As part of this thesis, I developed readnext, a Python package that implements the hybrid recommender system. See the readnext documentation for installation and usage instructions.
- Code: readnext
- Documentation: readnext Docs
- Thesis: Bridging Citation Analysis and Language Models: A Hybrid Recommender System for Computer Science Papers
This repository contains the following folders:
- PDF Versions: The final versions of the thesis are located in the thesis/ folder. For reading the thesis as a PDF document, see beck-joel_masters-thesis.pdf. For printing the thesis, see beck-joel_masters-thesis_print.pdf.
- LaTeX Source Code: The main.tex file in the root directory builds the thesis. The individual chapters are located in the chapters/ folder. The macros/ folder contains custom LaTeX commands.
- Python Source Code: The source code to generate the data-based figures in the thesis is located in the code/ folder. The underlying data is not included in this repository, but can be generated by running the Python scripts in the readnext/scripts/evaluation/ directory of the readnext repository.
- Images: The images used in the thesis are located in the diagrams/, plots/, screenshots/ and logos/ folders. The images in the diagrams/ folder were created using draw.io. The university logos were provided by the University of Göttingen.