Commit 652d101: Merge pull request #1605 from microsoft/staging

Staging to master for making release
miguelgfierro authored Jan 12, 2022
2 parents dce8f71 + 7af6edd commit 652d101
Showing 14 changed files with 94 additions and 64 deletions.
4 changes: 3 additions & 1 deletion NEWS.md
@@ -1,5 +1,8 @@
# What's New

## Update January 13, 2022

We have a new release [Recommenders 1.0.0](https://github.com/microsoft/recommenders/releases/tag/1.0.0)! The codebase has now migrated to TensorFlow versions 2.6 / 2.7 and to Spark version 3. In addition, there are a few changes in the dependencies and extras installed by `pip` (see [this guide](recommenders/README.md#optional-dependencies)). We have also made improvements in the code and the CI / CD pipelines.

## Update September 27, 2021

@@ -13,7 +16,6 @@ We have also added new evaluation metrics: _novelty, serendipity, diversity and

Code coverage reports are now generated for every PR, using [Codecov](https://about.codecov.io/).


## Update June 21, 2021

We have a new release [Recommenders 0.6.0](https://github.com/microsoft/recommenders/releases/tag/0.6.0)!
73 changes: 40 additions & 33 deletions README.md
@@ -2,18 +2,15 @@

[![Documentation Status](https://readthedocs.org/projects/microsoft-recommenders/badge/?version=latest)](https://microsoft-recommenders.readthedocs.io/en/latest/?badge=latest)

-## What's New (September 27, 2021)
+## What's New (January 13, 2022)

-We have a new release [Recommenders 0.7.0](https://github.com/microsoft/recommenders/releases/tag/0.7.0)!
+We have a new release [Recommenders 1.0.0](https://github.com/microsoft/recommenders/releases/tag/1.0.0)! The codebase has now migrated to TensorFlow versions 2.6 / 2.7 and to Spark version 3. In addition, there are a few changes in the dependencies and extras installed by `pip` (see [this guide](recommenders/README.md#optional-dependencies)). We have also made improvements in the code and the CI / CD pipelines.

-In this, we have changed the names of the folders which contain the source code, so that they are more informative. This implies that you will need to change any import statements that reference the recommenders package. Specifically, the folder `reco_utils` has been renamed to `recommenders` and its subfolders have been renamed according to [issue 1390](https://github.com/microsoft/recommenders/issues/1390).
-
-The recommenders package now supports three types of environments: [venv](https://docs.python.org/3/library/venv.html), [virtualenv](https://virtualenv.pypa.io/en/latest/index.html#) and [conda](https://docs.conda.io/projects/conda/en/latest/glossary.html?highlight=environment#conda-environment) with Python versions 3.6 and 3.7.
-
-We have also added new evaluation metrics: _novelty, serendipity, diversity and coverage_ (see the [evalution notebooks](examples/03_evaluate/README.md)).
-
-Code coverage reports are now generated for every PR, using [Codecov](https://about.codecov.io/).
+Starting with release 0.6.0, Recommenders has been available on PyPI and can be installed using pip!
+
+Here you can find the PyPi page: https://pypi.org/project/recommenders/

Here you can find the package documentation: https://microsoft-recommenders.readthedocs.io/en/latest/
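
As a quick sanity check of the new release, it can be installed straight from PyPI; a minimal sketch, where the version pin simply targets the release announced above:

```bash
# Install the announced 1.0.0 release with the example dependencies
pip install recommenders[examples]==1.0.0
```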

## Introduction

@@ -40,41 +37,51 @@ and currently does not support version 3.8 and above. It is recommended to insta

To set up on your local machine:

* To install core utilities, CPU-based algorithms, and dependencies:

1. Ensure software required for compilation and Python libraries is installed.

   + On Linux this can be supported by adding:

     ```bash
     sudo apt-get install -y build-essential libpython<version>
     ```

     where `<version>` should be `3.6` or `3.7` as appropriate.

   + On Windows you will need [Microsoft C++ Build Tools](https://visualstudio.microsoft.com/visual-cpp-build-tools/).

2. Create a conda or virtual environment. See the [setup guide](SETUP.md) for more details.

3. Within the created environment, install the package from [PyPI](https://pypi.org):

   ```bash
   pip install --upgrade pip
   pip install --upgrade setuptools
   pip install recommenders[examples]
   ```

4. Register your (conda or virtual) environment with Jupyter:

   ```bash
   python -m ipykernel install --user --name my_environment_name --display-name "Python (reco)"
   ```

5. Start the Jupyter notebook server:

   ```bash
   jupyter notebook
   ```

6. Run the [SAR Python CPU MovieLens](examples/00_quick_start/sar_movielens.ipynb) notebook under the `00_quick_start` folder. Make sure to change the kernel to "Python (reco)".

* For additional options to install the package (support for GPU, Spark etc.) see [this guide](recommenders/README.md).

**NOTE** - The [Alternating Least Squares (ALS)](examples/00_quick_start/als_movielens.ipynb) notebooks require a PySpark environment to run. Please follow the steps in the [setup guide](SETUP.md#dependencies-setup) to run these notebooks in a PySpark environment. For the deep learning algorithms, it is recommended to use a GPU machine and to follow the steps in the [setup guide](SETUP.md#dependencies-setup) to set up Nvidia libraries.
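
For the Spark and GPU scenarios mentioned in the note above, a sketch of the corresponding installs, assuming the `spark` and `gpu` extras described in [this guide](recommenders/README.md) (the exact extra names are not shown in this diff):

```bash
# Illustrative: dependencies for the PySpark-based notebooks such as ALS
pip install recommenders[spark]

# Illustrative: dependencies for the deep learning algorithms
pip install recommenders[gpu]
```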

4 changes: 2 additions & 2 deletions SETUP.md
@@ -6,7 +6,6 @@ This document describes how to setup all the dependencies to run the notebooks i
* [Azure Databricks](https://azure.microsoft.com/en-us/services/databricks/)
* Docker container


## Table of Contents

- [Compute environments](#compute-environments)
@@ -397,7 +396,7 @@ You can then open the Jupyter notebook server at http://localhost:8888

The process of making a new release and publishing it to pypi is as follows:

First make sure that the tag that you want to add, e.g. `0.6.0`, is added in [`recommenders.py/__init__.py`](recommenders.py/__init__.py). Follow the [contribution guideline](CONTRIBUTING.md) to add the change.

1. Make sure that the code in main passes all the tests (unit and nightly tests).
1. Create a tag with the version number: e.g. `git tag -a 0.6.0 -m "Recommenders 0.6.0"`.
@@ -406,4 +405,5 @@ First make sure that the tag that you want to add, e.g. `0.6.0`, is added in [re
generates a wheel and a tar.gz which are uploaded to a [GitHub draft release](https://github.com/microsoft/recommenders/releases).
1. Fill up the draft release with all the recent changes in the code.
1. Download the wheel and tar.gz locally, these files shouldn't have any bug, since they passed all the tests.
1. Install twine: `pip install twine`
1. Publish the wheel and tar.gz to pypi: `twine upload recommenders*`
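
Condensed, the final steps look like this (assuming the wheel and tar.gz downloaded from the draft release sit in the current directory):

```bash
pip install twine
# Optional: validate the built distributions before uploading
twine check recommenders*
# Publish to PyPI
twine upload recommenders*
```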
3 changes: 2 additions & 1 deletion docs/README.md
@@ -6,7 +6,8 @@ To setup the documentation, first you need to install the dependencies of the fu
conda activate reco_full

pip install numpy cython
-pip install --no-binary scikit-surprise .[all,experimental]
+pip install --no-binary scikit-surprise "scikit-surprise@https://github.com/NicolasHug/Surprise/archive/refs/tags/v1.1.1.tar.gz"
+pip install "pymanopt@https://github.com/pymanopt/pymanopt/archive/fb36a272cdeecb21992cfd9271eb82baafeb316d.zip"
pip install sphinx_rtd_theme


4 changes: 2 additions & 2 deletions examples/00_quick_start/tfidf_covid.ipynb
@@ -16,7 +16,7 @@
"# TF-IDF Content-Based Recommendation on the COVID-19 Open Research Dataset\n",
"This demonstrates a simple implementation of Term Frequency Inverse Document Frequency (TF-IDF) content-based recommendation on the [COVID-19 Open Research Dataset](https://azure.microsoft.com/en-us/services/open-datasets/catalog/covid-19-open-research/), hosted through Azure Open Datasets.\n",
"\n",
"In this notebook, we will create a recommender which will return the top k recommended articles similar to any article of interest (query item) in the COVID-19 Open Reserach Dataset."
"In this notebook, we will create a recommender which will return the top k recommended articles similar to any article of interest (query item) in the COVID-19 Open Research Dataset."
]
},
{
@@ -1229,4 +1229,4 @@
},
"nbformat": 4,
"nbformat_minor": 2
}
10 changes: 8 additions & 2 deletions recommenders/README.md
@@ -35,7 +35,7 @@ By default `recommenders` does not install all dependencies used throughout the
- experimental: current experimental dependencies that are being evaluated (e.g. libraries that require advanced build requirements or might conflict with libraries from other options)
- nni: dependencies for NNI tuning framework.

-Note that, currently, xLearn, Surprise and Vowpal Wabbit are in the experimental group.
+Note that, currently, xLearn and Vowpal Wabbit are in the experimental group.

These groups can be installed alone or in combination:
```bash
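# Illustrative combination only; the concrete examples are collapsed in this diff view
pip install recommenders[examples,experimental]
```
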
@@ -64,10 +64,16 @@ When installing with GPU support you will need to point to the PyTorch index to

We are currently evaluating inclusion of the following dependencies:

-- scikit-surprise: due to incompatibilities with `numpy <= 1.19`, proper installation of Surprise requires `pip install numpy cython` and `pip install --no-binary scikit-surprise recommenders[experimental]`
 - vowpalwabbit: current examples show how to use vowpal wabbit after it has been installed on the command line; using the [PyPI package](https://pypi.org/project/vowpalwabbit/) with the scikit-learn interface will facilitate easier integration into python environments
 - xlearn: on some platforms, xLearn requires pre-installation of cmake.

+## Other dependencies
+
+Some dependencies are not available via the recommenders PyPI package, but can be installed in the following ways:
+- scikit-surprise: due to incompatibilities with `numpy <= 1.19`, proper installation of Surprise requires `pip install numpy cython` and `pip install --no-binary scikit-surprise "scikit-surprise@https://github.com/NicolasHug/Surprise/archive/refs/tags/v1.1.1.tar.gz"`
+- pymanopt: this dependency is required for the RLRMC and GeoIMC algorithms; a version of this code compatible with TensorFlow 2 can be installed with `pip install "pymanopt@https://github.com/pymanopt/pymanopt/archive/fb36a272cdeecb21992cfd9271eb82baafeb316d.zip"`.

## NNI dependencies

For NNI a more recent version can be installed but is untested.
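
Assuming the `nni` extra defined in `setup.py` (pinned there to `nni==1.5`, as shown further down), the tested configuration is installed with:

```bash
pip install recommenders[nni]
```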
11 changes: 6 additions & 5 deletions setup.py
@@ -39,8 +39,6 @@
"memory_profiler>=0.54.0,<1",
"nltk>=3.4,<4",
"pydocumentdb>=2.3.3<3", # TODO: replace with azure-cosmos
-# Temporary fix for pymanopt, only this commit works with TF2
-"pymanopt@https://github.com/pymanopt/pymanopt/archive/fb36a272cdeecb21992cfd9271eb82baafeb316d.zip",
"seaborn>=0.8.1,<1",
"transformers>=2.5.0,<5",
"bottleneck>=1.2.1,<2",
@@ -93,9 +91,6 @@
extras_require["experimental"] = [
# xlearn requires cmake to be pre-installed
"xlearn==0.40a1",
-# Surprise needs to be built from source because of the numpy <= 1.19 incompatibility
-# Requires pip to be run with the --no-binary option
-"scikit-surprise@https://github.com/NicolasHug/Surprise/archive/refs/tags/v1.1.1.tar.gz",
# VW C++ binary needs to be installed manually for some code to work
"vowpalwabbit>=8.9.0,<9",
]
@@ -104,6 +99,12 @@
"nni==1.5",
]

+# The following dependencies can be installed as below, however PyPI does not allow direct URLs.
+# Surprise needs to be built from source because of the numpy <= 1.19 incompatibility
+# Requires pip to be run with the --no-binary option
+# "scikit-surprise@https://github.com/NicolasHug/Surprise/archive/refs/tags/v1.1.1.tar.gz",
+# Temporary fix for pymanopt, only this commit works with TF2
+# "pymanopt@https://github.com/pymanopt/pymanopt/archive/fb36a272cdeecb21992cfd9271eb82baafeb316d.zip",

setup(
name="recommenders",
2 changes: 1 addition & 1 deletion tests/ci/azure_pipeline_test/dsvm_nightly_linux_cpu.yml
@@ -33,6 +33,6 @@ extends:
timeout: 180
conda_env: "nightly_linux_cpu"
conda_opts: "python=3.6"
-pip_opts: "[examples,dev,experimental] --no-cache --no-binary scikit-surprise"
+pip_opts: "[examples,dev,experimental] 'scikit-surprise@https://github.com/NicolasHug/Surprise/archive/refs/tags/v1.1.1.tar.gz' 'pymanopt@https://github.com/pymanopt/pymanopt/archive/fb36a272cdeecb21992cfd9271eb82baafeb316d.zip' --no-cache --no-binary scikit-surprise"
pytest_markers: "not spark and not gpu"
pytest_params: "-x"
2 changes: 1 addition & 1 deletion tests/ci/azure_pipeline_test/dsvm_notebook_linux_cpu.yml
@@ -60,5 +60,5 @@ extends:
task_name: "Test - Unit Notebook Linux CPU"
conda_env: "unit_notebook_linux_cpu"
conda_opts: "python=3.6"
-pip_opts: "[examples,dev,experimental] --no-cache --no-binary scikit-surprise"
+pip_opts: "[examples,dev,experimental] 'scikit-surprise@https://github.com/NicolasHug/Surprise/archive/refs/tags/v1.1.1.tar.gz' 'pymanopt@https://github.com/pymanopt/pymanopt/archive/fb36a272cdeecb21992cfd9271eb82baafeb316d.zip' --no-cache --no-binary scikit-surprise"
pytest_markers: "notebooks and not spark and not gpu"
2 changes: 1 addition & 1 deletion tests/ci/azure_pipeline_test/dsvm_unit_linux_cpu.yml
@@ -60,5 +60,5 @@ extends:
task_name: "Test - Unit Linux CPU"
conda_env: "unit_linux_cpu"
conda_opts: "python=3.6"
-pip_opts: "[dev,experimental] --no-cache --no-binary scikit-surprise"
+pip_opts: "[dev,experimental] 'scikit-surprise@https://github.com/NicolasHug/Surprise/archive/refs/tags/v1.1.1.tar.gz' 'pymanopt@https://github.com/pymanopt/pymanopt/archive/fb36a272cdeecb21992cfd9271eb82baafeb316d.zip' --no-cache --no-binary scikit-surprise"
pytest_markers: "not notebooks and not spark and not gpu"
1 change: 1 addition & 0 deletions tests/integration/examples/test_notebooks_python.py
@@ -236,6 +236,7 @@ def test_cornac_bpr_integration(


@pytest.mark.integration
+@pytest.mark.experimental
@pytest.mark.parametrize(
"expected_values",
[({"rmse": 0.4969, "mae": 0.4761})],
1 change: 1 addition & 0 deletions tests/unit/examples/test_notebooks_python.py
@@ -103,6 +103,7 @@ def test_wikidata_runs(notebooks, output_notebook, kernel_name, tmp):
)


+@pytest.mark.experimental
@pytest.mark.notebooks
def test_rlrmc_quickstart_runs(notebooks, output_notebook, kernel_name):
notebook_path = notebooks["rlrmc_quickstart"]
39 changes: 25 additions & 14 deletions tests/unit/recommenders/models/test_geoimc.py
@@ -1,20 +1,23 @@
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License.

-import collections
-import pytest
-import numpy as np
-from scipy.sparse import csr_matrix
-
-from recommenders.models.geoimc.geoimc_data import DataPtr
-from recommenders.models.geoimc.geoimc_predict import Inferer
-from recommenders.models.geoimc.geoimc_algorithm import IMCProblem
-from recommenders.models.geoimc.geoimc_utils import (
-    length_normalize,
-    mean_center,
-    reduce_dims,
-)
-from pymanopt.manifolds import Stiefel, SymmetricPositiveDefinite
+try:
+    import collections
+    import pytest
+    import numpy as np
+    from scipy.sparse import csr_matrix
+
+    from recommenders.models.geoimc.geoimc_data import DataPtr
+    from recommenders.models.geoimc.geoimc_predict import Inferer
+    from recommenders.models.geoimc.geoimc_algorithm import IMCProblem
+    from recommenders.models.geoimc.geoimc_utils import (
+        length_normalize,
+        mean_center,
+        reduce_dims,
+    )
+    from pymanopt.manifolds import Stiefel, SymmetricPositiveDefinite
+except:
+    pass  # skip if pymanopt not installed

_IMC_TEST_DATA = [
(
@@ -35,6 +38,7 @@


# `geoimc_data` tests
+@pytest.mark.experimental
@pytest.mark.parametrize("data, entities", _IMC_TEST_DATA)
def test_dataptr(data, entities):
ptr = DataPtr(data, entities)
@@ -44,6 +48,7 @@ def test_dataptr(data, entities):


# `geoimc_utils` tests
+@pytest.mark.experimental
@pytest.mark.parametrize(
"matrix",
[
@@ -59,6 +64,7 @@ def test_length_normalize(matrix):
)


+@pytest.mark.experimental
@pytest.mark.parametrize(
"matrix",
[
@@ -73,19 +79,22 @@ def test_mean_center(matrix):
)


+@pytest.mark.experimental
def test_reduce_dims():
matrix = np.random.rand(100, 100)
assert reduce_dims(matrix, 50).shape[1] == 50


# `geoimc_algorithm` tests
+@pytest.mark.experimental
@pytest.mark.parametrize(
"dataPtr, rank",
[
(DataPtr(_IMC_TEST_DATA[0][0], _IMC_TEST_DATA[0][1]), 3),
(DataPtr(_IMC_TEST_DATA[1][0], _IMC_TEST_DATA[1][1]), 3),
],
)
+@pytest.mark.experimental
def test_imcproblem(dataPtr, rank):

# Test init
@@ -110,10 +119,12 @@ def test_imcproblem(dataPtr, rank):


# `geoimc_predict` tests
+@pytest.mark.experimental
def test_inferer_init():
assert Inferer(method="dot").method.__name__ == "PlainScalarProduct"


+@pytest.mark.experimental
@pytest.mark.parametrize(
"dataPtr",
[
2 changes: 1 addition & 1 deletion tests/unit/recommenders/models/test_surprise_utils.py
@@ -17,7 +17,7 @@
compute_ranking_predictions,
)
except:
-    pass  # skip if experimental not installed
+    pass  # skip if surprise not installed

TOL = 0.001

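The `@pytest.mark.experimental` markers added across these test files let a run deselect cases whose optional dependencies (pymanopt, scikit-surprise) are missing. A typical invocation, using standard pytest marker selection rather than anything specific to this repo's pipelines:

```bash
# Run the unit tests, skipping experimental-only cases
pytest tests/unit -m "not experimental"
```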
