Examples of calculating different metrics with RecTools
Initializing different metrics
Calculating the value of a single metric
Calculating metric values per user
Calculating values of several metrics with a single function
[2]:
import numpy as np
import pandas as pd
from implicit.nearest_neighbours import TFIDFRecommender
from rectools import Columns
from rectools.dataset import Dataset
from rectools.metrics import (
Precision,
Accuracy,
NDCG,
IntraListDiversity,
Serendipity,
calc_metrics,
)
from rectools.metrics.distances import PairwiseHammingDistanceCalculator
from rectools.models import ImplicitItemKNNWrapperModel
Load data
[3]:
%%time
!wget -q https://files.grouplens.org/datasets/movielens/ml-1m.zip -O ml-1m.zip
!unzip -o ml-1m.zip
!rm ml-1m.zip
Archive: ml-1m.zip
inflating: ml-1m/movies.dat
inflating: ml-1m/ratings.dat
inflating: ml-1m/README
inflating: ml-1m/users.dat
CPU times: user 39.5 ms, sys: 44.8 ms, total: 84.3 ms
Wall time: 3.22 s
[4]:
%%time
ratings = pd.read_csv(
"ml-1m/ratings.dat",
sep="::",
engine="python",  # the Python engine is required for multi-char separators like "::"
header=None,
names=[Columns.User, Columns.Item, Columns.Weight, Columns.Datetime],
)
print(ratings.shape)
ratings.head()
(1000209, 4)
CPU times: user 3.51 s, sys: 270 ms, total: 3.78 s
Wall time: 3.77 s
[4]:
| | user_id | item_id | weight | datetime |
|---|---|---|---|---|
| 0 | 1 | 1193 | 5 | 978300760 |
| 1 | 1 | 661 | 3 | 978302109 |
| 2 | 1 | 914 | 3 | 978301968 |
| 3 | 1 | 3408 | 4 | 978300275 |
| 4 | 1 | 2355 | 5 | 978824291 |
[5]:
ratings["datetime"] = pd.to_datetime(ratings["datetime"] * 10 ** 9)  # Unix seconds -> nanoseconds
ratings["datetime"].min(), ratings["datetime"].max()
[5]:
(Timestamp('2000-04-25 23:05:32'), Timestamp('2003-02-28 17:49:50'))
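The multiplication by `10 ** 9` converts Unix seconds into the nanosecond timestamps that `pd.to_datetime` expects for plain integers; the same conversion can be written more explicitly with `unit="s"`. A small equivalence check (not part of the notebook):

```python
import pandas as pd

raw = pd.Series([978300760, 978824291])  # Unix timestamps in seconds

# pd.to_datetime interprets bare integers as nanoseconds,
# hence the multiplication by 10**9 in the cell above.
via_ns = pd.to_datetime(raw * 10 ** 9)

# Equivalent and more explicit: tell pandas the input is in seconds.
via_unit = pd.to_datetime(raw, unit="s")

assert (via_ns == via_unit).all()
print(via_unit[0])  # 2000-12-31 22:12:40
```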
[6]:
%%time
movies = pd.read_csv(
"ml-1m/movies.dat",
sep="::",
engine="python",  # the Python engine is required for multi-char separators like "::"
header=None,
names=[Columns.Item, "title", "genres"],
encoding_errors="ignore",
)
print(movies.shape)
movies.head()
(3883, 3)
CPU times: user 9.53 ms, sys: 518 µs, total: 10 ms
Wall time: 9.36 ms
[6]:
| | item_id | title | genres |
|---|---|---|---|
| 0 | 1 | Toy Story (1995) | Animation\|Children's\|Comedy |
| 1 | 2 | Jumanji (1995) | Adventure\|Children's\|Fantasy |
| 2 | 3 | Grumpier Old Men (1995) | Comedy\|Romance |
| 3 | 4 | Waiting to Exhale (1995) | Comedy\|Drama |
| 4 | 5 | Father of the Bride Part II (1995) | Comedy |
Build model
[7]:
# Split the data into train and test once to demonstrate how different metrics work
split_dt = pd.Timestamp("2003-02-01")
df_train = ratings.loc[ratings["datetime"] < split_dt]
df_test = ratings.loc[ratings["datetime"] >= split_dt]
[8]:
%%time
# Prepare dataset, fit model and generate recommendations
dataset = Dataset.construct(df_train)
model = ImplicitItemKNNWrapperModel(TFIDFRecommender(K=10))
model.fit(dataset)
recos = model.recommend(
users=ratings[Columns.User].unique(),
dataset=dataset,
k=10,
filter_viewed=True,
)
CPU times: user 4.77 s, sys: 257 ms, total: 5.02 s
Wall time: 1.31 s
Calculate metrics
Metrics initialization
To calculate a metric, you first create its object.
Most metrics have a k parameter: the number of top recommendations used in the calculation.
Some metrics accept additional parameters.
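For example, Precision@k is the share of the top-k recommended items that the user actually interacted with in the test period. A minimal pure-Python sketch of the idea (a hypothetical helper for illustration, not RecTools code):

```python
def precision_at_k(recommended, relevant, k):
    """Share of the top-k recommended items found in the relevant set."""
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    return hits / k

# 2 of the top-4 recommendations are relevant -> 0.5
print(precision_at_k([10, 20, 30, 40], relevant={20, 40, 70}, k=4))  # 0.5
```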
Simple metrics
[7]:
precision = Precision(k=10)
accuracy_1 = Accuracy(k=1)
accuracy_10 = Accuracy(k=10)
serendipity = Serendipity(k=10)
Metric with simple additional parameter
[8]:
ndcg = NDCG(k=10, log_base=3)
Metric with complex additional parameter
To calculate any diversity metric (e.g. IntraListDiversity) you need a way to measure the distance between items.
For example, you can use the Hamming distance.
As item features, let's use movie genres.
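The Hamming distance between two binary feature vectors is simply the number of positions where they differ; with one-hot genre vectors, it counts the genres that belong to exactly one of the two movies. A toy illustration (the short vectors here are hypothetical slices, not the real feature matrix):

```python
import numpy as np

# One-hot over [Action, Adventure, Animation, Children's, Comedy]
toy_story = np.array([0, 0, 1, 1, 1])  # Animation, Children's, Comedy
jumanji = np.array([0, 1, 0, 1, 0])    # Adventure, Children's

# Positions where the vectors differ: Adventure, Animation, Comedy -> 3
print(int(np.sum(toy_story != jumanji)))  # 3
```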
[9]:
movies["genre"] = movies["genres"].str.split("|")
genre_exploded = movies[["item_id", "genre"]].set_index("item_id").explode("genre")
genre_dummies = pd.get_dummies(genre_exploded, prefix="", prefix_sep="").groupby("item_id").sum()
genre_dummies.head()
[9]:
| item_id | Action | Adventure | Animation | Children's | Comedy | Crime | Documentary | Drama | Fantasy | Film-Noir | Horror | Musical | Mystery | Romance | Sci-Fi | Thriller | War | Western |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| 4 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 5 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
[10]:
distance_calculator = PairwiseHammingDistanceCalculator(genre_dummies)
ild = IntraListDiversity(k=10, distance_calculator=distance_calculator)
Single metric calculation
The easiest way to calculate a metric is to use its calc method.
Every metric has it, but the arguments differ.
If you need a metric value for every user, use the calc_per_user method.
[11]:
precision_value = precision.calc(reco=recos, interactions=df_test)
print(f"precision: {precision_value}")
precision_per_user = precision.calc_per_user(reco=recos, interactions=df_test)
print("\nprecision per user:")
display(precision_per_user.head())
print("Values are equal? ", precision_per_user.mean() == precision_value)
precision: 0.06464646464646465
precision per user:
user_id
195 0.3
229 0.0
343 0.0
349 0.0
398 0.5
dtype: float64
Values are equal? True
[12]:
# The catalog is the set of items we are able to recommend.
# Not all items from the train dataset necessarily appear in the recommendation lists.
catalog = df_train[Columns.Item].unique()
print("Accuracy@1: ", accuracy_1.calc(reco=recos, interactions=df_test, catalog=catalog))
print("Accuracy@10: ", accuracy_10.calc(reco=recos, interactions=df_test, catalog=catalog))
Accuracy@1: 0.9956908534890186
Accuracy@10: 0.9935730756022174
[13]:
serendipity_value = serendipity.calc(
reco=recos,
interactions=df_test,
prev_interactions=df_train,
catalog=catalog
)
print("Serendipity: ", serendipity_value)
Serendipity: 2.3436131849908687e-05
[14]:
print("NDCG: ", ndcg.calc(reco=recos, interactions=df_test))
NDCG: 0.06808226116073855
[15]:
%%time
print("ILD: ", ild.calc(reco=recos))
ILD: 3.1908278145695363
CPU times: user 2.1 s, sys: 556 ms, total: 2.66 s
Wall time: 2.64 s
Multiple metrics calculation
It is possible to calculate several metrics with a single function, calc_metrics.
It also optimises performance: if several metrics share the same intermediate calculations, those are performed only once.
[16]:
metrics = {
"precision": precision,
"accuracy@1": accuracy_1,
"accuracy@10": accuracy_10,
"ndcg": ndcg,
"serendipity": serendipity,
"diversity": ild,
}
# Some arguments can be omitted if they are not needed for the requested metrics
calc_metrics(
metrics,
reco=recos,
interactions=df_test,
prev_interactions=df_train,
catalog=catalog
)
[16]:
{'precision': 0.06464646464646465,
'accuracy@10': 0.9935730756022174,
'accuracy@1': 0.9956908534890186,
'ndcg': 0.06808226116073855,
'diversity': 3.1908278145695363,
'serendipity': 2.3436131849908687e-05}