Releases: lenskit/lkpy
LensKit 0.13.0 - critical bugs fixed
We're pleased to release LensKit 0.13!
Major Fixes
This release includes two critical fixes, for which everyone should upgrade:
- The `Bias` model's `transform` and `inverse_transform` methods were incorrect (#265). These bugs did not affect `Bias` when used as a predictor or a recommender, but they did affect any model using `Bias` as a normalization step, namely the biased matrix factorizers (since version 0.11, when this API was added).
- Previous versions of LensKit did not clean up temporary files (or, on Python 3.8 and later, shared memory resources) when running parallel evaluation processes.
It also includes significant performance improvements and code to detect common problems with parallel processing configurations, and is tested on Python 3.9 and on Linux AArch64 (64-bit ARM).
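To illustrate what a bias normalization step is expected to do, here is a minimal sketch in plain pandas. It is not LensKit's implementation: the function names (`fit_bias`, `transform`, `inverse_transform`) and the column names are assumptions made for this example. The property to look for is that the inverse transform exactly undoes the transform, which is what a model using bias as a normalization step relies on.

```python
import pandas as pd


def fit_bias(ratings):
    """Estimate a global mean, per-item offsets, and per-user offsets.

    Hypothetical helper for illustration only; not part of LensKit.
    """
    mean = ratings["rating"].mean()
    resid = ratings["rating"] - mean
    # Item offset: mean residual per item after removing the global mean.
    item_off = resid.groupby(ratings["item"]).mean()
    resid = resid - ratings["item"].map(item_off)
    # User offset: mean residual per user after removing the item offsets.
    user_off = resid.groupby(ratings["user"]).mean()
    return mean, item_off, user_off


def transform(ratings, mean, item_off, user_off):
    """Subtract the global, item, and user terms from each rating."""
    out = ratings.copy()
    out["rating"] = (
        out["rating"]
        - mean
        - out["item"].map(item_off)
        - out["user"].map(user_off)
    )
    return out


def inverse_transform(ratings, mean, item_off, user_off):
    """Add the same three terms back; must mirror transform exactly."""
    out = ratings.copy()
    out["rating"] = (
        out["rating"]
        + mean
        + out["item"].map(item_off)
        + out["user"].map(user_off)
    )
    return out


ratings = pd.DataFrame({
    "user": [1, 1, 2, 2, 3],
    "item": [10, 20, 10, 30, 20],
    "rating": [4.0, 3.0, 5.0, 2.0, 4.5],
})
params = fit_bias(ratings)
normed = transform(ratings, *params)
restored = inverse_transform(normed, *params)
```

A round-trip check like comparing `restored["rating"]` with the original ratings is a cheap way to catch a mismatch between the two directions.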
Future Changes
This release deprecates two sets of APIs that will be removed in LensKit 0.14:
- `MultiEval` (#254) - it doesn't work well for realistic projects, and simple evaluations are easy enough to write in a loop, so we will be removing `MultiEval` to reduce our maintenance burden going forward.
- RNG seed management APIs - these are replaced by seedbank. In 0.13, the APIs are kept as compatibility shims for their SeedBank replacements, but we will remove them in 0.14 in favor of directly calling seedbank.
We haven't yet adopted any formal deprecation policies for LensKit, but my current tentative plan is to use this next-release cadence for nontrivial removals while we're still releasing 0.x versions; once we decide to bump to 4.x, we will use semantic versioning on all public APIs, and thus deprecations will not be enforced until the next major release.
In a future LensKit, I tentatively plan to factor out several of our bridges (TensorFlow, Implicit, HPF) into separate projects. We will keep compatibility imports for at least one 0.x release, and probably until 4.0. This will reduce the development overhead of the LensKit core.
What’s Changed
- Remove fastparquet import (#266) @mdekstrand
- Fix incorrect user bias transformation (#265) @mdekstrand
- Revise dependency specifications (#264) @mdekstrand
- Update deprecation notices (#263) @mdekstrand
- Detect problems with runtime environments (#248) @mdekstrand
- Add use_ratings option to ImplicitMF (#245) @mdekstrand
- Add 'k' support to top-N metrics (#247) @mdekstrand
- Further Top-N optimization updates (#242) @mdekstrand
- Free shared memory in parallel (#243) @mdekstrand
- Optimize top-N analysis (#237) @mdekstrand
- Add PlackettLuce stochastic ranking algorithm (#241) @mdekstrand
- Add PopScore algorithm for popularity-based scoring (#240) @mdekstrand
- Refactor ranking into a 'ranking' module (#239) @mdekstrand
- Enable tests on Python 3.9 (#234) @mdekstrand
- Deprecate MultiEval (#238) @mdekstrand
- Add more logging output to parallelism code (#236) @mdekstrand
- Add convenience prediction accuracy functions (#235) @mdekstrand
Really publish
This actually publishes the 0.12 bump; a tagging error prevented 0.12.2 from going out.
Education Bugfixes
This release contains the sampling function refactor (#230), and documentation improvements.
The Clone SVDs
No Longer Candide
This version of LensKit splits out the CSR routines into a separate CSR package, allowing LensKit to be a pure Python package.
This also makes a major change to TensorFlow BPR, using popularity-weighted negative sampling by default (this can be disabled with `neg_weight=False`), and makes our TF recommenders much faster.
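As a rough illustration of the idea behind popularity-weighted negative sampling, the NumPy sketch below draws negative items for one user with probability proportional to item popularity. It is not LensKit's TensorFlow code: the `sample_negatives` function and its arguments are hypothetical, the `neg_weight` flag only mirrors the name of the option mentioned above, and excluding the user's own items is an assumption of this example.

```python
import numpy as np


def sample_negatives(item_counts, user_items, n, rng, neg_weight=True):
    """Draw n negative item indices for a single user.

    Hypothetical helper for illustration; not the LensKit implementation.
    item_counts: number of interactions per item (the popularity signal).
    user_items: indices of items the user has already interacted with.
    """
    n_items = len(item_counts)
    if neg_weight:
        # Popular items are proposed as negatives more often.
        weights = np.array(item_counts, dtype=float)
    else:
        # Uniform sampling over all items.
        weights = np.ones(n_items)
    # Assumption: never sample an item the user has interacted with.
    weights[np.asarray(list(user_items), dtype=int)] = 0.0
    probs = weights / weights.sum()
    return rng.choice(n_items, size=n, p=probs)


rng = np.random.default_rng(42)
counts = [10, 1, 5, 0]
weighted = sample_negatives(counts, [0], 100, rng)
uniform = sample_negatives(counts, [0], 100, rng, neg_weight=False)
```

With weighting on, an item that nobody has interacted with (count 0) is never drawn as a negative; with weighting off, it is as likely as any other candidate.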
What’s Changed
- Use popularity-weighted sampling in BPR by default (#223) @mdekstrand
- Fix TF performance (#222) @mdekstrand
- Update to CSR 0.2 (#221) @mdekstrand
- use CSR from conda-forge (#220) @mdekstrand
- knn: don't add item means to similarity sums (#217) @mdekstrand
- Use flit to build LKPY (#219) @mdekstrand
- Remove CSR class in favor of separate library (#218) @mdekstrand
- Fix tests on MacOS OpenBLAS (#215) @mdekstrand
- Allow scipy='coo' in sparse_ratings (#214) @reppertj
- Add keystone test depending on others (#212) @mdekstrand
- Test on multiple BLAS versions (#211) @mdekstrand
- Support Numba 0.52 (#210) @mdekstrand
- Add option to drop user features after training ALS models (#209) @carlos10seg
- Add tests for ALS load/save (#207) @mdekstrand
Building It Live
This release is just to fix a build problem in 0.11.0 that prevented automatic package publication.
Let's Do It Live
This release brings a number of functionality and performance improvements. Highlights include:
- Refactoring the `Bias` model and using it consistently instead of re-implementing pieces in matrix factorizers
- Support new ratings from a user in both ALS recommenders
- Fix crash when TensorFlow 1 is installed
The main-channel Conda packages for this release have disabled MKL support on macOS, due to environmental factors causing the build to fail. LensKit will still work fine in MKL environments; it just won't use its MKL-based k-NN acceleration on macOS. Linux and Windows are unchanged. With 0.11, we will also begin publishing packages to `conda-forge`; we expect MKL acceleration to work in that environment.
What’s Changed
- Update @actions/core for CI build (#205) @mdekstrand
- ALS: Refactor common matrix & fix tests (#204) @mdekstrand
- added new ratings for predict method in ImplicitMF (#202) @carlos10seg
- Remove BiasedMFPredictor in favor of Bias (#201) @mdekstrand
- Fix failures with unexpected parallel package installs (#199) @mdekstrand
- [DOC/FIX] Correction in ImplicitMF docstring (#196) @ShwetanshuSingh
- Add known-rating predictor (#182) (#184) @carlos10seg
- Move Bias class into `bias` package (#175) (#183) @carlos10seg
- Bump @actions/core from 1.2.3 to 1.2.6 in /.github/actions/conda-env (#194) @dependabot
- Bump Numba support to 0.51 (#186) @mdekstrand
- Fix ALS run-time training (#114) for empty rating series (#187) @carlos10seg
- Add run-time training to ALS BiasedMF (#114) (#173) @carlos10seg
- Add transform_user and inverse_transform_user methods to bias. (#181) @carlos10seg
Better Processes
This release makes some improvements to multi-process support and item-item kNN resource use.
What’s Changed
- Manage random seeds in subprocesses (#179) @mdekstrand
- Support Numba 0.50 (#178) @mdekstrand
- Use parallel blocks for SciPy-based item-item CF training (#177) @mdekstrand
- Improve MP worker detection and disable item-item parallelism when run under MP (#176) @mdekstrand
- Use Hypothesis for testing and clean up tests (#172) @mdekstrand
- Remove unused math routines (#171) @mdekstrand
Flow
Highlights of this release are significant improvements to parallel processing (we no longer use joblib), shared memory, and our first TensorFlow integrations.
What’s Changed
- Reorganize and improve documentation (#169) @mdekstrand
- Improve RecListAnalysis performance and parallelize (#164) @mdekstrand
- Make persistence configurable & reduce open file count (#165) @mdekstrand
- Fix Python versions and conda environments in CI builds (#163) @mdekstrand
- Add TensorFlow support (#159) @mdekstrand
- Improve parallel configuration and docs (#161) @mdekstrand
- Use setup.cfg for all dev deps, including in Conda (#160) @mdekstrand
- Add fit_transform API to Bias (#158) @mdekstrand
- Remove dead code and add tests (#157) @mdekstrand
- Add scikit-learn SVD (#156) @mdekstrand
- Remove old sharing and file APIs (#155) @mdekstrand
- Use ProcessPoolExecutor instead of joblib (#154) @mdekstrand
- Add 'persist' API to sharing (#153) @mdekstrand
Gone in 60 Seconds
This release has some performance and other improvements, including full Python 3.8, Pandas 1.0, and Numba 0.49 testing.
This is the last release we expect to use JobLib to parallelize batch prediction and recommendation. Any Python scripts that call the batch routines (`batch.predict`, `batch.recommend`, or `MultiEval`) need to be import-protected: their code needs to be in functions, and only invoked with a `__name__` guard:

    if __name__ == '__main__':
        do_stuff()

Unprotected scripts (where the code is just in the script, and runs when the script is imported as a module) will probably still work with LensKit 0.9, but will not work in the next version of LensKit. Jupyter notebooks should be just fine - when they are run, the IPython kernel is actually running, and it is properly protected.
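For a fuller picture of an import-protected script, here is a minimal sketch that uses only the standard library's `ProcessPoolExecutor`. The `score_user` and `main` functions are hypothetical stand-ins for real work such as a batch recommendation call; nothing here is LensKit API.

```python
from concurrent.futures import ProcessPoolExecutor


def score_user(user_id):
    """Stand-in for per-user work; defined at module level so that
    worker processes can import and call it."""
    return user_id, user_id * 2


def main():
    # Worker processes may re-import this module when they start. Because
    # all work is inside functions, that re-import has no side effects.
    with ProcessPoolExecutor(max_workers=2) as pool:
        return dict(pool.map(score_user, range(4)))


if __name__ == '__main__':
    # Runs only when the file is executed as a script, not on import.
    print(main())
```

Without the guard, a worker that re-imports the script would try to run the top-level code again, which is why unprotected scripts fail under process-based parallelism.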
What’s Changed
- Improving testing with minimal dependencies (#151) @mdekstrand
- Skip predictions when no ratings to predict (#149) @mdekstrand
- Use BinPickle for sharing (#148) @mdekstrand
- Support iterating over training iterations (#144) @mdekstrand
- Fix for Numba 0.49 compatibility (#146) @mdekstrand
- Use GitHub Actions for CI (#143) @mdekstrand
- Use declarative configuration for builds (#142) @mdekstrand
- Add model stores for batch multiprocessing (#139) @mdekstrand
- Improve top-N metric performance (#140) @mdekstrand
- Fix Conda Python 3.8 testing (#138) @mdekstrand
- Unify configuration points (#137) @mdekstrand
- Clean up RNG infrastructure (#136) @mdekstrand
- Add configurable RNG infrastructure (#135) @mdekstrand
- Remove deprecated and unused features (#134) @mdekstrand
- Version bumps - Pandas 1.0 and Python 3.8 (#133) @mdekstrand