NDCG
- class rectools.metrics.ranking.NDCG(k: int, debias_config: DebiasConfig = None, log_base: int = 2, divide_by_achievable: bool = False)
Bases: _RankingMetric
Normalized Discounted Cumulative Gain at k (NDCG@k).
Estimates the relevance of recommendations, taking into account their order. “Discounted Gain” means that the original item relevance is discounted based on the item's rank: the closer an item is to the top, the more gain is achieved. “Cumulative” means that the discounted gains of all items in the top k ranks are summed. “Normalized” means that the actual value of DCG is divided by the “Ideal DCG” (IDCG). This is the maximum possible value of DCG@k, used as a normalization coefficient to ensure that NDCG@k values lie in [0, 1].

\[ \begin{align}\begin{aligned}NDCG@k=\frac{1}{|U|}\sum_{u \in U}\frac{DCG_u@k}{IDCG_u@k}\\DCG_u@k = \sum_{i=1}^{k} \frac{rel_u(i)}{\log(i + 1)}\end{aligned}\end{align} \]

- where
\(IDCG_u@k = \sum_{i=1}^{k} \frac{1}{\log(i + 1)}\) when divide_by_achievable is set to False (default).
\(IDCG_u@k = \sum_{i=1}^{\min (|R_u|, k)} \frac{1}{\log(i + 1)}\) when divide_by_achievable is set to True.
\(rel_u(i)\) is the “Gain”. Here it is an indicator function: it equals 1 if the item at rank i is relevant to user u, and 0 otherwise.
\(|R_u|\) is the number of relevant (ground truth) items for user u.
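As a worked illustration of the formulas above, consider a sketch that assumes base-2 logarithms, k = 3, and a hypothetical user with three relevant items, two of which are recommended at ranks 1 and 3:

```latex
\begin{aligned}
DCG_u@3 &= \frac{1}{\log_2 2} + \frac{0}{\log_2 3} + \frac{1}{\log_2 4} = 1 + 0 + 0.5 = 1.5 \\
IDCG_u@3 &= \frac{1}{\log_2 2} + \frac{1}{\log_2 3} + \frac{1}{\log_2 4} \approx 1 + 0.6309 + 0.5 = 2.1309 \\
\frac{DCG_u@3}{IDCG_u@3} &\approx \frac{1.5}{2.1309} \approx 0.7039
\end{aligned}
```

Because \(|R_u| = 3 = k\) here, both settings of divide_by_achievable give the same IDCG. For a user with a single relevant item recommended at rank 1, the default setting gives 1 / 2.1309 ≈ 0.4693, whereas divide_by_achievable=True gives IDCG = 1 and therefore a score of 1.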
- Parameters
k (int) – Number of items at the top of recommendations list that will be used to calculate metric.
log_base (int, default 2) – Base of logarithm used to weight relevant items.
divide_by_achievable (bool, default False) – When set to False (default), IDCG is calculated as one value for all users and equals the maximum gain, achievable when all k positions are relevant. When set to True, IDCG is calculated for each user individually, considering the maximum possible number of the user's test items in the top k positions.
debias_config (DebiasConfig, optional, default None) – Config with debias method parameters (iqr_coef, random_state).
Examples
>>> import pandas as pd
>>> from rectools import Columns
>>> from rectools.metrics import NDCG
>>> reco = pd.DataFrame(
...     {
...         Columns.User: [1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4],
...         Columns.Item: [7, 8, 1, 2, 1, 2, 3, 4, 1, 2, 3],
...         Columns.Rank: [1, 2, 1, 2, 1, 2, 3, 4, 1, 2, 3],
...     }
... )
>>> interactions = pd.DataFrame(
...     {
...         Columns.User: [1, 1, 2, 3, 3, 3, 4, 4, 4],
...         Columns.Item: [1, 2, 1, 1, 3, 4, 1, 2, 3],
...     }
... )
>>> # Here
>>> # - for user ``1`` we return non-relevant recommendations;
>>> # - for user ``2`` we return 2 items and the first one is relevant;
>>> # - for user ``3`` we return 4 items, 1st, 3rd and 4th are relevant;
>>> # - for user ``4`` we return 3 items and all are relevant;
>>> NDCG(k=1).calc_per_user(reco, interactions).values
array([0., 1., 1., 1.])
>>> NDCG(k=3).calc_per_user(reco, interactions).values
array([0.        , 0.46927873, 0.70391809, 1.        ])
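The per-user values in this example can be reproduced by hand from the formulas in the class description. The following is a minimal pure-Python sketch, not the rectools implementation: the helper name `ndcg_at_k` and the dictionaries `reco_by_user` and `relevant_by_user` are hypothetical and exist only for illustration, and the `divide_by_achievable=True` branch simply follows the IDCG formula as stated in this page.

```python
import math


def ndcg_at_k(ranked_items, relevant_items, k, log_base=2, divide_by_achievable=False):
    """NDCG@k for one user with binary relevance (illustrative helper, not part of rectools)."""
    relevant = set(relevant_items)
    # Discount for the 1-based rank i is 1 / log_base(i + 1).
    discounts = [1.0 / math.log(i + 1, log_base) for i in range(1, k + 1)]
    # DCG: sum the discounts of the relevant items among the top-k recommendations.
    dcg = sum(d for item, d in zip(ranked_items[:k], discounts) if item in relevant)
    # IDCG: all k positions relevant, or only as many as the user can actually have.
    n_ideal = min(len(relevant), k) if divide_by_achievable else k
    idcg = sum(discounts[:n_ideal])
    return dcg / idcg if idcg > 0 else 0.0


# Same data as the example above: recommendations ordered by rank, and ground-truth items.
reco_by_user = {1: [7, 8], 2: [1, 2], 3: [1, 2, 3, 4], 4: [1, 2, 3]}
relevant_by_user = {1: {1, 2}, 2: {1}, 3: {1, 3, 4}, 4: {1, 2, 3}}

default = [ndcg_at_k(reco_by_user[u], relevant_by_user[u], k=3) for u in (1, 2, 3, 4)]
achievable = [
    ndcg_at_k(reco_by_user[u], relevant_by_user[u], k=3, divide_by_achievable=True)
    for u in (1, 2, 3, 4)
]
print([round(v, 4) for v in default])     # [0.0, 0.4693, 0.7039, 1.0]
print([round(v, 4) for v in achievable])  # [0.0, 1.0, 0.7039, 1.0]
```

Under these assumptions, user 2 is the only one whose score changes with `divide_by_achievable=True`: with a single relevant item placed at rank 1, the per-user ideal DCG shrinks to 1, so the recommendation list counts as perfect.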
Methods
calc(reco, interactions) – Calculate metric value.
calc_from_merged(merged[, is_debiased]) – Calculate metric value from merged recommendations.
calc_per_user(reco, interactions) – Calculate metric values for all users.
calc_per_user_from_merged(merged[, is_debiased]) – Calculate metric values for all users from merged recommendations.
Attributes
log_base
divide_by_achievable
- calc_from_merged(merged: DataFrame, is_debiased: bool = False) → float
Calculate metric value from merged recommendations.
- Parameters
merged (pd.DataFrame) – Result of merging the recommendations and interactions tables. Can be obtained using the merge_reco function.
is_debiased (bool, default False) – An indicator of whether the debias transformation has been applied before or not.
- Returns
Value of metric (average between users).
- Return type
float
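The “average between users” return value can be related to the per-user values as follows. This is a minimal sketch that assumes the aggregate is a plain unweighted mean of the per-user scores with no debiasing applied. The numbers are the NDCG(k=3) per-user values from the Examples section, and the variable names are hypothetical.

```python
from statistics import fmean

# Per-user NDCG@3 values taken from the Examples section (user id -> score).
per_user = {1: 0.0, 2: 0.46927873, 3: 0.70391809, 4: 1.0}

# Assumption: the aggregate metric is the unweighted mean over users.
overall = fmean(per_user.values())
print(round(overall, 4))  # 0.5433
```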
- calc_per_user(reco: DataFrame, interactions: DataFrame) → Series
Calculate metric values for all users.
- Parameters
reco (pd.DataFrame) – Recommendations table with columns Columns.User, Columns.Item, Columns.Rank.
interactions (pd.DataFrame) – Interactions table with columns Columns.User, Columns.Item.
- Returns
Values of metric (index - user id, values - metric value for every user).
- Return type
pd.Series
- calc_per_user_from_merged(merged: DataFrame, is_debiased: bool = False) → Series
Calculate metric values for all users from merged recommendations.
- Parameters
merged (pd.DataFrame) – Result of merging the recommendations and interactions tables. Can be obtained using the merge_reco function.
is_debiased (bool, default False) – An indicator of whether the debias transformation has been applied before or not.
- Returns
Values of metric (index - user id, values - metric value for every user).
- Return type
pd.Series