{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Examples of calculating different metrics with RecTools\n",
    "\n",
    "**Table of Contents**\n",
    "\n",
    "* Load and preprocess data: Movielens\n",
    "* Build model\n",
    "* Calculate metrics\n",
    "    * Metrics initialization\n",
    "    * Single metric calculation\n",
    "        * Per user metric calculation\n",
    "    * Multiple metrics calculation with one function\n",
    "\n",
    "#### We provide all types of metrics to measure model performance from different aspects\n",
    "\n",
    "- Classification:\n",
    "    - **HitRate**, **Precision** (with **R-Precision** variant which divides by minimum between k and number of user test items), **Recall**, **Accuracy**, **MCC**, **F1Beta**\n",
    "\n",
    "- Ranking:\n",
    "    - **MRR**, **MAP** (with an option to divide by k ot by number of user test items), **NDCG** (with an option to select log base)\n",
    "\n",
    "- Advanced AUC based ranking:\n",
    "    - **PartialAUC**, **PAP** (Partial AUC + Precision joint metric)\n",
    "\n",
    "- Beyond Accuracy:\n",
    "    - **Serendipity**, **MeanInvUserFreq** (mean inverse user frequency to calculty \"novelty\"), **Intra-List Diversity** (based on some meta features of items)\n",
    "\n",
    "- Popularity bias:\n",
    "    - **AvgRecPopularity**\n",
    "\n",
    "- Recommendations data quality:\n",
    "    - **SufficientReco** (share of filled recommendations in the list), **UnrepeatedReco** (share of unique items recommended for each user), **CoveredUsers** (share of test users that have at least one recommendation)\n",
    "\n",
    "- Between-model comparison:\n",
    "    - **Intersection** (share of common user-item pairs between different models recommendations)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/dmtikhonov/git_project/metrics/RecTools/.venv/lib/python3.10/site-packages/lightfm/_lightfm_fast.py:9: UserWarning: LightFM was compiled without OpenMP support. Only a single thread will be used.\n",
      "  warnings.warn(\n"
     ]
    }
   ],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "from implicit.nearest_neighbours import TFIDFRecommender\n",
    "\n",
    "from rectools import Columns\n",
    "from rectools.dataset import Dataset\n",
    "from rectools.metrics import (\n",
    "    Precision,\n",
    "    NDCG,\n",
    "    AvgRecPopularity,\n",
    "    Intersection,\n",
    "    HitRate,\n",
    "    SufficientReco,\n",
    "    DebiasConfig,\n",
    "    IntraListDiversity,\n",
    "    Serendipity,\n",
    "    calc_metrics,\n",
    ")\n",
    "from rectools.metrics.distances import PairwiseHammingDistanceCalculator\n",
    "from rectools.models import ImplicitItemKNNWrapperModel"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load and preprocess data: Movielens"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Archive:  ml-1m.zip\n",
      "  inflating: ml-1m/movies.dat        \n",
      "  inflating: ml-1m/ratings.dat       \n",
      "  inflating: ml-1m/README            \n",
      "  inflating: ml-1m/users.dat         \n",
      "CPU times: user 125 ms, sys: 55.1 ms, total: 180 ms\n",
      "Wall time: 5.83 s\n"
     ]
    }
   ],
   "source": [
    "%%time\n",
    "!wget -q https://files.grouplens.org/datasets/movielens/ml-1m.zip -O ml-1m.zip\n",
    "!unzip -o ml-1m.zip\n",
    "!rm ml-1m.zip"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(1000209, 4)\n",
      "CPU times: user 2.27 s, sys: 71.3 ms, total: 2.34 s\n",
      "Wall time: 2.35 s\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>user_id</th>\n",
       "      <th>item_id</th>\n",
       "      <th>weight</th>\n",
       "      <th>datetime</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>1193</td>\n",
       "      <td>5</td>\n",
       "      <td>978300760</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1</td>\n",
       "      <td>661</td>\n",
       "      <td>3</td>\n",
       "      <td>978302109</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>1</td>\n",
       "      <td>914</td>\n",
       "      <td>3</td>\n",
       "      <td>978301968</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>1</td>\n",
       "      <td>3408</td>\n",
       "      <td>4</td>\n",
       "      <td>978300275</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>1</td>\n",
       "      <td>2355</td>\n",
       "      <td>5</td>\n",
       "      <td>978824291</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   user_id  item_id  weight   datetime\n",
       "0        1     1193       5  978300760\n",
       "1        1      661       3  978302109\n",
       "2        1      914       3  978301968\n",
       "3        1     3408       4  978300275\n",
       "4        1     2355       5  978824291"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%time\n",
    "ratings = pd.read_csv(\n",
    "    \"ml-1m/ratings.dat\",\n",
    "    sep=\"::\",\n",
    "    engine=\"python\",  # Because of 2-chars separators\n",
    "    header=None,\n",
    "    names=[Columns.User, Columns.Item, Columns.Weight, Columns.Datetime],\n",
    ")\n",
    "print(ratings.shape)\n",
    "ratings.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(Timestamp('2000-04-25 23:05:32'), Timestamp('2003-02-28 17:49:50'))"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ratings[\"datetime\"] = pd.to_datetime(ratings[\"datetime\"] * 10 ** 9)\n",
    "ratings[\"datetime\"].min(), ratings[\"datetime\"].max()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "(3883, 3)\n",
      "CPU times: user 6.26 ms, sys: 1.53 ms, total: 7.79 ms\n",
      "Wall time: 8.6 ms\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>item_id</th>\n",
       "      <th>title</th>\n",
       "      <th>genres</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1</td>\n",
       "      <td>Toy Story (1995)</td>\n",
       "      <td>Animation|Children's|Comedy</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2</td>\n",
       "      <td>Jumanji (1995)</td>\n",
       "      <td>Adventure|Children's|Fantasy</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3</td>\n",
       "      <td>Grumpier Old Men (1995)</td>\n",
       "      <td>Comedy|Romance</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>4</td>\n",
       "      <td>Waiting to Exhale (1995)</td>\n",
       "      <td>Comedy|Drama</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5</td>\n",
       "      <td>Father of the Bride Part II (1995)</td>\n",
       "      <td>Comedy</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   item_id                               title                        genres\n",
       "0        1                    Toy Story (1995)   Animation|Children's|Comedy\n",
       "1        2                      Jumanji (1995)  Adventure|Children's|Fantasy\n",
       "2        3             Grumpier Old Men (1995)                Comedy|Romance\n",
       "3        4            Waiting to Exhale (1995)                  Comedy|Drama\n",
       "4        5  Father of the Bride Part II (1995)                        Comedy"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "%%time\n",
    "movies = pd.read_csv(\n",
    "    \"ml-1m/movies.dat\",\n",
    "    sep=\"::\",\n",
    "    engine=\"python\",  # Because of 2-chars separators\n",
    "    header=None,\n",
    "    names=[Columns.Item, \"title\", \"genres\"],\n",
    "    encoding_errors=\"ignore\",\n",
    ")\n",
    "print(movies.shape)\n",
    "movies.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Build model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Split once by train and test to demonstrate how different metrics work\n",
    "split_dt = pd.Timestamp(\"2003-02-01\")\n",
    "df_train = ratings.loc[ratings[\"datetime\"] < split_dt]\n",
    "df_test = ratings.loc[ratings[\"datetime\"] >= split_dt]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "CPU times: user 1.02 s, sys: 40.5 ms, total: 1.06 s\n",
      "Wall time: 1.08 s\n"
     ]
    }
   ],
   "source": [
    "%%time\n",
    "\n",
    "# Prepare dataset, fit model and generate recommendations\n",
    "dataset = Dataset.construct(df_train)\n",
    "model = ImplicitItemKNNWrapperModel(TFIDFRecommender(K=10))\n",
    "model.fit(dataset)\n",
    "recos = model.recommend(\n",
    "    users=ratings[Columns.User].unique(),\n",
    "    dataset=dataset,\n",
    "    k=10,\n",
    "    filter_viewed=True,\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Calculate metrics"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Metrics initialization"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To calculate a metric it is necessary to create its object.\n",
    "\n",
    "Most metrics have `k` parameter - the number of top recommendations that will be used for metric calculation. Some metrics have additional parameters."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Simple metrics\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "serendipity = Serendipity(k=10)\n",
    "precision = Precision(k=10, r_precision=True)  # r_precision means division by min(k, n_user_test_items)\n",
    "ndcg = NDCG(k=10, log_base=3)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Metric with complex additional parameter"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To calculate any diversity metric (e.g. `IntraListDivirsity`) you need to measure distance between items.\n",
    "\n",
    "For example, you can use Hamming distance.\n",
    "\n",
    "As features, let's use movie genres."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Action</th>\n",
       "      <th>Adventure</th>\n",
       "      <th>Animation</th>\n",
       "      <th>Children's</th>\n",
       "      <th>Comedy</th>\n",
       "      <th>Crime</th>\n",
       "      <th>Documentary</th>\n",
       "      <th>Drama</th>\n",
       "      <th>Fantasy</th>\n",
       "      <th>Film-Noir</th>\n",
       "      <th>Horror</th>\n",
       "      <th>Musical</th>\n",
       "      <th>Mystery</th>\n",
       "      <th>Romance</th>\n",
       "      <th>Sci-Fi</th>\n",
       "      <th>Thriller</th>\n",
       "      <th>War</th>\n",
       "      <th>Western</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>item_id</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "         Action  Adventure  Animation  Children's  Comedy  Crime  Documentary  \\\n",
       "item_id                                                                         \n",
       "1             0          0          1           1       1      0            0   \n",
       "2             0          1          0           1       0      0            0   \n",
       "3             0          0          0           0       1      0            0   \n",
       "4             0          0          0           0       1      0            0   \n",
       "5             0          0          0           0       1      0            0   \n",
       "\n",
       "         Drama  Fantasy  Film-Noir  Horror  Musical  Mystery  Romance  Sci-Fi  \\\n",
       "item_id                                                                         \n",
       "1            0        0          0       0        0        0        0       0   \n",
       "2            0        1          0       0        0        0        0       0   \n",
       "3            0        0          0       0        0        0        1       0   \n",
       "4            1        0          0       0        0        0        0       0   \n",
       "5            0        0          0       0        0        0        0       0   \n",
       "\n",
       "         Thriller  War  Western  \n",
       "item_id                          \n",
       "1               0    0        0  \n",
       "2               0    0        0  \n",
       "3               0    0        0  \n",
       "4               0    0        0  \n",
       "5               0    0        0  "
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "movies[\"genre\"] = movies[\"genres\"].str.split(\"|\")\n",
    "genre_exploded = movies[[\"item_id\", \"genre\"]].set_index(\"item_id\").explode(\"genre\")\n",
    "genre_dummies = pd.get_dummies(genre_exploded, prefix=\"\", prefix_sep=\"\").groupby(\"item_id\").sum()\n",
    "genre_dummies.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "distance_calculator = PairwiseHammingDistanceCalculator(genre_dummies)\n",
    "ild = IntraListDiversity(k=10, distance_calculator=distance_calculator)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Single metric calculation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The easiest way to calculate metric is to use `calc` method.\n",
    "\n",
    "Every metric has it, but arguments are different."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "precision: 0.08501683501683503\n"
     ]
    }
   ],
   "source": [
    "precision_value = precision.calc(reco=recos, interactions=df_test)\n",
    "print(f\"precision: {precision_value}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Serendipity:  2.3436131849908687e-05\n"
     ]
    }
   ],
   "source": [
    "catalog = df_train[Columns.Item].unique()\n",
    "\n",
    "serendipity_value = serendipity.calc(\n",
    "    reco=recos,\n",
    "    interactions=df_test,\n",
    "    prev_interactions=df_train,\n",
    "    catalog=catalog\n",
    ")\n",
    "print(\"Serendipity: \", serendipity_value)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "NDCG:  0.06808226116073855\n"
     ]
    }
   ],
   "source": [
    "print(\"NDCG: \", ndcg.calc(reco=recos, interactions=df_test))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ILD:  3.1908278145695363\n",
      "CPU times: user 460 ms, sys: 39.5 ms, total: 499 ms\n",
      "Wall time: 501 ms\n"
     ]
    }
   ],
   "source": [
    "%%time\n",
    "print(\"ILD: \", ild.calc(reco=recos))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Per user metric calculation\n",
    "If you need to get metric value for every user, use `calc_per_user` method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "precision per user:\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "user_id\n",
       "195    0.3\n",
       "229    0.0\n",
       "343    0.0\n",
       "349    0.0\n",
       "398    0.5\n",
       "dtype: float64"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Values are equal?  True\n"
     ]
    }
   ],
   "source": [
    "precision_per_user = precision.calc_per_user(reco=recos, interactions=df_test)\n",
    "print(\"\\nprecision per user:\")\n",
    "display(precision_per_user.head())\n",
    "\n",
    "print(\"Values are equal? \", precision_per_user.mean() == precision_value)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Multiple metrics calculation with one function"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It is possible to calculate a bunch of metrics using only one function - `calc_metrics`.\n",
    "\n",
    "It contains important optimisations in performance: if several metrics do the same calculations, they will be performed only once."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "{'hit_rate@10': 0.1717171717171717,\n",
       " 'hit_rate_debiased@10': 0.16161616161616163,\n",
       " 'ndcg@10': 0.06808226116073855,\n",
       " 'pop_bias@10': 0.0017362993523658327,\n",
       " 'diversity@10': 3.1908278145695363,\n",
       " 'serendipity@10': 2.3436131849908687e-05,\n",
       " 'intersection@10_same_model': 1.0,\n",
       " 'sufficient@10': 1.0,\n",
       " 'sufficient@20': 0.5}"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Here we provide a debias config for one of the metrics apart from calculating it's regular value\n",
    "# Check our \"Debiased metrics calculation user guied\" for more info\n",
    "\n",
    "metrics = {\n",
    "    \"hit_rate@10\": HitRate(k=10),\n",
    "    \"hit_rate_debiased@10\": HitRate(k=10, debias_config=DebiasConfig(iqr_coef=1.5, random_state=32)),\n",
    "    \"sufficient@10\": SufficientReco(k=10, deep=True),\n",
    "    \"sufficient@20\": SufficientReco(k=20, deep=True),\n",
    "    \"pop_bias@10\": AvgRecPopularity(k=10, normalize=True),\n",
    "    \"ndcg@10\": ndcg,\n",
    "    \"serendipity@10\": serendipity,\n",
    "    \"diversity@10\": ild,\n",
    "    \"intersection@10\": Intersection(k=10)\n",
    "}\n",
    "\n",
    "\n",
    "# Some arguments can be omitted if they are not needed for metrics calculation.\n",
    "calc_metrics(\n",
    "    metrics,\n",
    "    reco=recos,\n",
    "    interactions=df_test,  # needed fo all `TruePositive` based metrics\n",
    "    prev_interactions=df_train,  # needed for serendipity\n",
    "    catalog=catalog,  # needed for serendipity\n",
    "    ref_reco = {\"same_model\": recos}  # needed for intersection. usually this should be recos from a different model\n",
    ")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.13"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": true,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}