{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Examples of calculating different metrics with RecTools\n", "\n", "**Table of Contents**\n", "\n", "* Load and preprocess data: Movielens\n", "* Build model\n", "* Calculate metrics\n", " * Metrics initialization\n", " * Single metric calculation\n", " * Per user metric calculation\n", " * Multiple metrics calculation with one function\n", "\n", "#### We provide metrics that measure model performance from different aspects\n", "\n", "- Classification:\n", " - **HitRate**, **Precision** (with an **R-Precision** variant that divides by the minimum of k and the number of user test items), **Recall**, **Accuracy**, **MCC**, **F1Beta**\n", "\n", "- Ranking:\n", " - **MRR**, **MAP** (with an option to divide by k or by the number of user test items), **NDCG** (with an option to select the log base)\n", "\n", "- Advanced AUC based ranking:\n", " - **PartialAUC**, **PAP** (Partial AUC + Precision joint metric)\n", "\n", "- Beyond Accuracy:\n", " - **Serendipity**, **MeanInvUserFreq** (mean inverse user frequency, used to calculate \"novelty\"), **Intra-List Diversity** (based on some meta features of items)\n", "\n", "- Popularity bias:\n", " - **AvgRecPopularity**\n", "\n", "- Recommendations data quality:\n", " - **SufficientReco** (share of filled recommendations in the list), **UnrepeatedReco** (share of unique items recommended for each user), **CoveredUsers** (share of test users that have at least one recommendation)\n", "\n", "- Between-model comparison:\n", " - **Intersection** (share of common user-item pairs between different models' recommendations)" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Users/dmtikhonov/git_project/metrics/RecTools/.venv/lib/python3.10/site-packages/lightfm/_lightfm_fast.py:9: UserWarning: LightFM was compiled without OpenMP support. 
Only a single thread will be used.\n", " warnings.warn(\n" ] } ], "source": [ "import numpy as np\n", "import pandas as pd\n", "\n", "from implicit.nearest_neighbours import TFIDFRecommender\n", "\n", "from rectools import Columns\n", "from rectools.dataset import Dataset\n", "from rectools.metrics import (\n", " Precision,\n", " NDCG,\n", " AvgRecPopularity,\n", " Intersection,\n", " HitRate,\n", " SufficientReco,\n", " DebiasConfig,\n", " IntraListDiversity,\n", " Serendipity,\n", " calc_metrics,\n", ")\n", "from rectools.metrics.distances import PairwiseHammingDistanceCalculator\n", "from rectools.models import ImplicitItemKNNWrapperModel" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load and preprocess data: Movielens" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Archive: ml-1m.zip\n", " inflating: ml-1m/movies.dat \n", " inflating: ml-1m/ratings.dat \n", " inflating: ml-1m/README \n", " inflating: ml-1m/users.dat \n", "CPU times: user 125 ms, sys: 55.1 ms, total: 180 ms\n", "Wall time: 5.83 s\n" ] } ], "source": [ "%%time\n", "!wget -q https://files.grouplens.org/datasets/movielens/ml-1m.zip -O ml-1m.zip\n", "!unzip -o ml-1m.zip\n", "!rm ml-1m.zip" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(1000209, 4)\n", "CPU times: user 2.27 s, sys: 71.3 ms, total: 2.34 s\n", "Wall time: 2.35 s\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ " user_id item_id weight datetime\n", "0 1 1193 5 978300760\n", "1 1 661 3 978302109\n", "2 1 914 3 978301968\n", "3 1 3408 4 978300275\n", "4 1 2355 5 978824291" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%time\n", "ratings = pd.read_csv(\n", " \"ml-1m/ratings.dat\",\n", " sep=\"::\",\n", " engine=\"python\", # Because of 2-chars separators\n", " header=None,\n", " names=[Columns.User, Columns.Item, Columns.Weight, Columns.Datetime],\n", ")\n", "print(ratings.shape)\n", "ratings.head()" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(Timestamp('2000-04-25 23:05:32'), Timestamp('2003-02-28 17:49:50'))" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ratings[\"datetime\"] = pd.to_datetime(ratings[\"datetime\"] * 10 ** 9)\n", "ratings[\"datetime\"].min(), ratings[\"datetime\"].max()" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(3883, 3)\n", "CPU times: user 6.26 ms, sys: 1.53 ms, total: 7.79 ms\n", "Wall time: 8.6 ms\n" ] }, { "data": { "text/html": [ "
" ], "text/plain": [ " item_id title genres\n", "0 1 Toy Story (1995) Animation|Children's|Comedy\n", "1 2 Jumanji (1995) Adventure|Children's|Fantasy\n", "2 3 Grumpier Old Men (1995) Comedy|Romance\n", "3 4 Waiting to Exhale (1995) Comedy|Drama\n", "4 5 Father of the Bride Part II (1995) Comedy" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%time\n", "movies = pd.read_csv(\n", " \"ml-1m/movies.dat\",\n", " sep=\"::\",\n", " engine=\"python\", # Because of 2-chars separators\n", " header=None,\n", " names=[Columns.Item, \"title\", \"genres\"],\n", " encoding_errors=\"ignore\",\n", ")\n", "print(movies.shape)\n", "movies.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Build model" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "# Split once by train and test to demonstrate how different metrics work\n", "split_dt = pd.Timestamp(\"2003-02-01\")\n", "df_train = ratings.loc[ratings[\"datetime\"] < split_dt]\n", "df_test = ratings.loc[ratings[\"datetime\"] >= split_dt]" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CPU times: user 1.02 s, sys: 40.5 ms, total: 1.06 s\n", "Wall time: 1.08 s\n" ] } ], "source": [ "%%time\n", "\n", "# Prepare dataset, fit model and generate recommendations\n", "dataset = Dataset.construct(df_train)\n", "model = ImplicitItemKNNWrapperModel(TFIDFRecommender(K=10))\n", "model.fit(dataset)\n", "recos = model.recommend(\n", " users=ratings[Columns.User].unique(),\n", " dataset=dataset,\n", " k=10,\n", " filter_viewed=True,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Calculate metrics" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Metrics initialization" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To calculate a metric it is necessary to create its object.\n", 
"\n", "Most metrics have `k` parameter - the number of top recommendations that will be used for metric calculation. Some metrics have additional parameters." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Simple metrics\n" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "serendipity = Serendipity(k=10)\n", "precision = Precision(k=10, r_precision=True) # r_precision means division by min(k, n_user_test_items)\n", "ndcg = NDCG(k=10, log_base=3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Metric with complex additional parameter" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To calculate any diversity metric (e.g. `IntraListDivirsity`) you need to measure distance between items.\n", "\n", "For example, you can use Hamming distance.\n", "\n", "As features, let's use movie genres." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " Action Adventure Animation Children's Comedy Crime Documentary \\\n", "item_id \n", "1 0 0 1 1 1 0 0 \n", "2 0 1 0 1 0 0 0 \n", "3 0 0 0 0 1 0 0 \n", "4 0 0 0 0 1 0 0 \n", "5 0 0 0 0 1 0 0 \n", "\n", " Drama Fantasy Film-Noir Horror Musical Mystery Romance Sci-Fi \\\n", "item_id \n", "1 0 0 0 0 0 0 0 0 \n", "2 0 1 0 0 0 0 0 0 \n", "3 0 0 0 0 0 0 1 0 \n", "4 1 0 0 0 0 0 0 0 \n", "5 0 0 0 0 0 0 0 0 \n", "\n", " Thriller War Western \n", "item_id \n", "1 0 0 0 \n", "2 0 0 0 \n", "3 0 0 0 \n", "4 0 0 0 \n", "5 0 0 0 " ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "movies[\"genre\"] = movies[\"genres\"].str.split(\"|\")\n", "genre_exploded = movies[[\"item_id\", \"genre\"]].set_index(\"item_id\").explode(\"genre\")\n", "genre_dummies = pd.get_dummies(genre_exploded, prefix=\"\", prefix_sep=\"\").groupby(\"item_id\").sum()\n", "genre_dummies.head()" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [], "source": [ "distance_calculator = PairwiseHammingDistanceCalculator(genre_dummies)\n", "ild = IntraListDiversity(k=10, distance_calculator=distance_calculator)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Single metric calculation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The easiest way to calculate metric is to use `calc` method.\n", "\n", "Every metric has it, but arguments are different." 
] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "precision: 0.08501683501683503\n" ] } ], "source": [ "precision_value = precision.calc(reco=recos, interactions=df_test)\n", "print(f\"precision: {precision_value}\")" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Serendipity: 2.3436131849908687e-05\n" ] } ], "source": [ "catalog = df_train[Columns.Item].unique()\n", "\n", "serendipity_value = serendipity.calc(\n", " reco=recos,\n", " interactions=df_test,\n", " prev_interactions=df_train,\n", " catalog=catalog\n", ")\n", "print(\"Serendipity: \", serendipity_value)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "NDCG: 0.06808226116073855\n" ] } ], "source": [ "print(\"NDCG: \", ndcg.calc(reco=recos, interactions=df_test))" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "ILD: 3.1908278145695363\n", "CPU times: user 460 ms, sys: 39.5 ms, total: 499 ms\n", "Wall time: 501 ms\n" ] } ], "source": [ "%%time\n", "print(\"ILD: \", ild.calc(reco=recos))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Per user metric calculation\n", "To get the metric value for every user, use the `calc_per_user` method." ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "precision per user:\n" ] }, { "data": { "text/plain": [ "user_id\n", "195 0.3\n", "229 0.0\n", "343 0.0\n", "349 0.0\n", "398 0.5\n", "dtype: float64" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Values are equal? 
True\n" ] } ], "source": [ "precision_per_user = precision.calc_per_user(reco=recos, interactions=df_test)\n", "print(\"\\nprecision per user:\")\n", "display(precision_per_user.head())\n", "\n", "print(\"Values are equal? \", precision_per_user.mean() == precision_value)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Multiple metrics calculation with one function" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Several metrics can be calculated with a single function, `calc_metrics`.\n", "\n", "It includes important performance optimisations: if several metrics share the same calculations, those are performed only once." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "{'hit_rate@10': 0.1717171717171717,\n", " 'hit_rate_debiased@10': 0.16161616161616163,\n", " 'ndcg@10': 0.06808226116073855,\n", " 'pop_bias@10': 0.0017362993523658327,\n", " 'diversity@10': 3.1908278145695363,\n", " 'serendipity@10': 2.3436131849908687e-05,\n", " 'intersection@10_same_model': 1.0,\n", " 'sufficient@10': 1.0,\n", " 'sufficient@20': 0.5}" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Here we provide a debias config for one of the metrics in addition to calculating its regular value\n", "# Check our \"Debiased metrics calculation user guide\" for more info\n", "\n", "metrics = {\n", " \"hit_rate@10\": HitRate(k=10),\n", " \"hit_rate_debiased@10\": HitRate(k=10, debias_config=DebiasConfig(iqr_coef=1.5, random_state=32)),\n", " \"sufficient@10\": SufficientReco(k=10, deep=True),\n", " \"sufficient@20\": SufficientReco(k=20, deep=True),\n", " \"pop_bias@10\": AvgRecPopularity(k=10, normalize=True),\n", " \"ndcg@10\": ndcg,\n", " \"serendipity@10\": serendipity,\n", " \"diversity@10\": ild,\n", " \"intersection@10\": Intersection(k=10)\n", "}\n", "\n", "\n", "# Some arguments can be omitted if they are not needed for metrics 
calculation.\n", "calc_metrics(\n", " metrics,\n", " reco=recos,\n", " interactions=df_test, # needed for all `TruePositive` based metrics\n", " prev_interactions=df_train, # needed for serendipity\n", " catalog=catalog, # needed for serendipity\n", " ref_reco={\"same_model\": recos}, # needed for intersection; usually these should be recos from a different model\n", ")" ] } ], "metadata": { "kernelspec": { "display_name": ".venv", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.13" }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": true, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 1 }