RecBole v1.2.0 Release Notes
After a long period of hard work, we have completed the upgrade of RecBole and released a new version: RecBole v1.2.0!
In this release, we have taken users' feedback and requests into full account to improve the user-friendliness of RecBole. First, we include more benchmark models and datasets to meet users' latest needs. Second, we improve the benchmark framework by adding commonly used data processing methods and efficient training and evaluation APIs, and provide more support for analyzing and using results. Third, to improve the user experience, we provide more comprehensive project pages and documentation. Based on existing issues and discussions, we have also fixed a number of bugs and updated the documentation to make it more user-friendly.
In short, RecBole v1.2.0 is more efficient, convenient, and flexible than previous versions. More details are given in the following sections:
- Highlights
- New Features
- Bug Fixes
- Code Refactor
- Docs
Highlights
The RecBole v1.2.0 release includes a number of new features, bug fixes, and code refactoring. A few of the highlights include:
- We add 7 new models and 2 new datasets.
- More flexible data processing. We rebuild the overall data flow on top of PyTorch into a compatible data module and add more task-oriented data processing methods.
- More user-friendly documentation. We update the website and documentation with detailed descriptions, including visualizations of benchmark configurations and more practical examples covering customized training strategies, multi-GPU training cases, and detailed running steps. We also develop a FAQ page based on the existing GitHub issues of RecBole.
New Features
- Add 7 new models:
- Add 2 new datasets: Music4All-Onion (#1668), Amazon-M2 (#1828).
- Add the pretrain method to ConvNCF (#1651).
- Support converting results to LaTeX code (#1645).
- Support different eval dataloaders for the validation and test phases (#1666).
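The per-phase evaluation support above can be expressed in a RecBole YAML config. Below is a minimal sketch, assuming the dict form of `eval_args.mode` that distinguishes the validation and test phases; the exact keys should be verified against the RecBole documentation for your installed version:

```yaml
# Hypothetical sketch: full ranking during validation, sampled ranking at test time.
# The dict form of `mode` is an assumption based on this feature; check the
# RecBole docs for the exact syntax in your version.
eval_args:
  split: {RS: [0.8, 0.1, 0.1]}   # random split into train/valid/test
  order: RO                       # random ordering
  group_by: user
  mode:
    valid: full                   # rank against all items during validation
    test: uni100                  # rank against 100 uniformly sampled negatives at test
```

With a per-phase `mode`, validation can use the cheaper or stricter protocol independently of the final test evaluation, which is what separate eval dataloaders for the two phases enable.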
Bug Fixes
- Model:
- Fix a bug in DIEN: mask the padding value in aux loss and add softmax to attention values (#1485).
- Fix a deepcopy bug in DCNV2, xDeepFM, SpectralCF, FOSSIL, HGN, SHAN, and SINE (#1488).
- Fix a bug in EASE: change the data format (#1497).
- Fix a bug in NeuMF: fix the `load_pretrain` function (#1502).
- Fix a bug in LINE: add the log function when computing loss (#1507).
- Fix the field counts for float-like features in `abstract_recommender.py` (#1603).
- Fix a bug in GCMC: change the last dense layer to `dense_layer_v` for item hidden representations (#1635).
- Fix a bug in KD_DAGFM: use `xavier_normal_initialization` to initialize embeddings (#1641).
- Fix a bug in KSR: add an extra param `kg_embedding_size` (#1647).
- Fix a bug in S3Rec: move `item_seq` from GPU to CPU for indexing (#1651).
- Fix a bug in `AutoEncoderMixin`: move tensors to the correct device (#1749).
- Fix a bug in DGCF: correct the L2 distance computation (#1845).
- Dataset:
- Trainer:
- Util:
- Evaluator:
- Fix `data.count_users` in `collector.py` (#1526).
- Fix
- Config:
- Main:
- Fix bugs when collecting results from `mp.spawn` in multi-GPU training (#1875).
- Typo:
- Fix typos in `dataset_list.json` (#1756).
Code Refactor
- Refactor all autoencoder models: add the `AutoEncoderMixin` class and move the rating matrix to CUDA only when `get_rating_matrix` is called (#1491).
- Refactor BERT4Rec: align the implementation with the original paper (#1522, #1639, #1859).
Docs
- Mask the IP information (#1479).
- Update the docs of the `train_neg_sample_args` parameter (#1513).
- Add hypertune config docs (#1524).
- Add `model_list` and `dataset_list` (#1525).
- Add FiGNN to the `model_list` (#1548).
- Add `numerical_feature` to the docs (#1560).
- Replace `neg_sampling` with `train_neg_sample_args` in the docs (#1569, #1570).
- Add docs for KD_DAGFM (#1642).
- Add significance test (#1644).
- Add the rst files for FiGNN, KD_DAGFM, and RecVAE (#1650).
- Add an update for SIGIR 2023 to `README.md` (#1662).
- Update `requirement.txt` (#1870).