Data and code of AKBC'22 paper "Open-World Taxonomy and Knowledge Graph Co-Learning". For any suggestion/question, please feel free to create an issue or drop an email @ ([email protected] and [email protected]).
Table of Contents
The six datasets used in paper can be downloaded from https://figshare.com/articles/dataset/Taxo-KG-Bench/16415727.
After downloading and decompressing them under ./data/
directory, the directory looks like:
📁 ./data/CGC-OLP-BENCH
|-- 📁 MSCG-OPIEC
|-- 📁 MSCG-ReVerb
|-- 📁 SEMedical-OPIEC
|-- 📁 SEMedical-ReVerb
|-- 📁 SEmusic-OPIEC
|-- 📁 SEMusic-ReVerb
|-- CHANGELOG
And each subdirectory represents one of the six datasets, for instance:
📁 ./data/CGC-OLP-BENCH/MSCG-OPIEC
|-- oie_triples.train.txt
|-- oie_triples.dev.txt
|-- oie_triples.test.txt
|-- cg_pairs.train.txt
|-- cg_pairs.dev.txt
|-- cg_pairs.test.txt
where oie_triples.*.txt
indicate the OpenKG triples after alginment, and cg_pairs.*.txt
indicate the AutoTAXOnomy pairs after aligment.
python>=3.7.10
conda>=4.10.3
The packages used in this project is listed in ./taxoKG_conda_env.yml
.
You can re-create and/or reactivate it by the following scripts:
# Re-create the environment. This will install the env *taxoKG* under your default conda path
conda env create --file taxoKG_conda_env.yml
# Or if you want to specify the env path
conda env create --file taxoKG_conda_env.yml -p /home/user/anaconda3/envs/env_name
# Reactivate the environment
conda activate taxoKG
The porposed HAKEGCN model checkpoints can be downloaded from https://figshare.com/articles/software/HakeGCN_Checkpoints/17108342.
Please decompress them under ./checkpoints
folder.
Examples of getting the reported numbers for the main experiments (general, medical and music domain TaxoKG completion result tables):
sh run_reproduce.sh
Notes for using run_reproduce.sh
:
L2-L4 just contains the set-ups for SLURM. I add it because I am working on a SLURM controled GPU server.
You don't need to change anything to run it.
#SBATCH --job-name=Reproduce-TaxoKG
#SBATCH --gres=gpu:1
#SBATCH --output=logs/slurm_reproduce_HAKEGCN
If you use our data or code in a scientific publication, please cite the following paper:
@inproceedings{lu22:hakeGCN,
author = {Jiaying Lu and
Carl Yang},
title = {Open-World Taxonomy and Knowledge Graph Co-Learning},
year = {2022},
Booktitle = {4th Conference on Automated Knowledge Base Construction},
Series = {AKBC 2022},
}