NDCG

class rectools.metrics.ranking.NDCG(k: int, debias_config: DebiasConfig = None, log_base: int = 2, divide_by_achievable: bool = False)

Bases: _RankingMetric

Normalized Discounted Cumulative Gain at k (NDCG@k).

Estimates the relevance of recommendations, taking their order into account. “Discounted Gain” means that each item's original relevance is discounted according to its rank: the closer an item is to the top, the more gain it contributes. “Cumulative” means that the discounted gains of the items at the first k ranks are summed. “Normalized” means that the actual DCG value is divided by the “Ideal DCG” (IDCG), the maximum possible value of DCG@k, which serves as a normalization coefficient so that NDCG@k values lie in [0, 1].

\[ \begin{align}\begin{aligned}NDCG@k=\frac{1}{|U|}\sum_{u \in U}\frac{DCG_u@k}{IDCG_u@k}\\DCG_u@k = \sum_{i=1}^{k} \frac{rel_u(i)}{\log(i + 1)}\end{aligned}\end{align} \]
where
  • \(IDCG_u@k = \sum_{i=1}^{k} \frac{1}{\log(i + 1)}\) when divide_by_achievable is set to False (default).

  • \(IDCG_u@k = \sum_{i=1}^{\min (|R_u|, k)} \frac{1}{\log(i + 1)}\) when divide_by_achievable is set to True.

  • \(rel_u(i)\) is the “Gain”. Here it is an indicator function: it equals 1 if the item at rank i is relevant to user u, and 0 otherwise.

  • \(|R_u|\) is the number of relevant (ground truth) items for user u.

  • \(\log\) is the logarithm with base log_base (2 by default).
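
The formulas above can be sketched for a single user in plain Python. This is an illustrative sketch only, not the library's implementation: the function name `ndcg_at_k` and its argument names are hypothetical, and returning 0.0 when IDCG is zero is an assumption made here for the sketch.

```python
import math


def ndcg_at_k(ranked_items, relevant_items, k, log_base=2, divide_by_achievable=False):
    """NDCG@k for one user with binary relevance (hypothetical helper)."""
    relevant = set(relevant_items)

    def discount(rank):
        # Ranks are 1-based, so the top position is discounted by log(2).
        return math.log(rank + 1, log_base)

    # DCG: sum of discounted gains over relevant items among the first k ranks.
    dcg = sum(
        1.0 / discount(rank)
        for rank, item in enumerate(ranked_items[:k], start=1)
        if item in relevant
    )

    # IDCG: all k positions relevant, or only as many as the user can fill.
    n_ideal = min(len(relevant), k) if divide_by_achievable else k
    idcg = sum(1.0 / discount(rank) for rank in range(1, n_ideal + 1))

    # Assumption of this sketch: a user with no achievable gain scores 0.
    return dcg / idcg if idcg > 0 else 0.0


# One relevant item, recommended at rank 1, evaluated at k=3.
default_value = ndcg_at_k([1, 2], [1], k=3)
achievable_value = ndcg_at_k([1, 2], [1], k=3, divide_by_achievable=True)
```

With the default normalization the user is compared against three relevant positions, so `default_value` is below 1; with `divide_by_achievable=True` the ideal list contains only the single relevant item, so the same recommendations reach the maximum score.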

Parameters
  • k (int) – Number of items at the top of the recommendation list that are used to calculate the metric.

  • log_base (int, default 2) – Base of logarithm used to weight relevant items.

  • divide_by_achievable (bool, default False) – When set to False (default), IDCG is a single value shared by all users and equals the maximum gain, achieved when all k positions are relevant. When set to True, IDCG is calculated for each user individually, taking into account the maximum number of that user's test items that can appear in the top k positions.

  • debias_config (DebiasConfig, optional, default None) – Config with debias method parameters (iqr_coef, random_state).

Examples

>>> import pandas as pd
>>> from rectools import Columns
>>> from rectools.metrics import NDCG
>>> reco = pd.DataFrame(
...     {
...         Columns.User: [1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4],
...         Columns.Item: [7, 8, 1, 2, 1, 2, 3, 4, 1, 2, 3],
...         Columns.Rank: [1, 2, 1, 2, 1, 2, 3, 4, 1, 2, 3],
...     }
... )
>>> interactions = pd.DataFrame(
...     {
...         Columns.User: [1, 1, 2, 3, 3, 3, 4, 4, 4],
...         Columns.Item: [1, 2, 1, 1, 3, 4, 1, 2, 3],
...     }
... )
>>> # Here
>>> #    - for user ``1`` we return only non-relevant recommendations;
>>> #    - for user ``2`` we return 2 items and only the first is relevant;
>>> #    - for user ``3`` we return 4 items; the 1st, 3rd and 4th are relevant;
>>> #    - for user ``4`` we return 3 items and all are relevant.
>>> NDCG(k=1).calc_per_user(reco, interactions).values
array([0., 1., 1., 1.])
>>> NDCG(k=3).calc_per_user(reco, interactions).values
array([0. , 0.46927873, 0.70391809, 1. ])
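
The k=3 values above can be checked by hand. The arithmetic below assumes the default settings (log base 2, divide_by_achievable=False) and rounds intermediate values to five decimals:

```latex
\begin{aligned}
IDCG@3 &= \frac{1}{\log_2 2} + \frac{1}{\log_2 3} + \frac{1}{\log_2 4}
        \approx 1 + 0.63093 + 0.5 = 2.13093 \\
\text{user 1:}\quad DCG@3 &= 0, & NDCG@3 &= 0 \\
\text{user 2:}\quad DCG@3 &= \frac{1}{\log_2 2} = 1, & NDCG@3 &\approx \frac{1}{2.13093} \approx 0.46928 \\
\text{user 3:}\quad DCG@3 &= \frac{1}{\log_2 2} + \frac{1}{\log_2 4} = 1.5, & NDCG@3 &\approx \frac{1.5}{2.13093} \approx 0.70392 \\
\text{user 4:}\quad DCG@3 &= IDCG@3, & NDCG@3 &= 1
\end{aligned}
```

For user 3 the relevant item at rank 4 falls outside the top 3 and contributes nothing.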

Methods

calc(reco, interactions)

Calculate metric value.

calc_from_merged(merged[, is_debiased])

Calculate metric value from merged recommendations.

calc_per_user(reco, interactions)

Calculate metric values for all users.

calc_per_user_from_merged(merged[, is_debiased])

Calculate metric values for all users from merged recommendations.

Attributes

log_base

divide_by_achievable

calc_from_merged(merged: DataFrame, is_debiased: bool = False) → float

Calculate metric value from merged recommendations.

Parameters
  • merged (pd.DataFrame) – Result of merging recommendations and interactions tables. Can be obtained using merge_reco function.

  • is_debiased (bool, default False) – An indicator of whether the debias transformation has been applied before or not.

Returns

Value of metric (average over users).

Return type

float
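
To illustrate what computing the metric from a merged table can look like, here is a pandas-only sketch. It does not call merge_reco and makes no claim about how that function or calc_from_merged work internally. The column names `user_id`, `item_id` and `rank`, and the left-join layout of the merged table, are assumptions made for this example.

```python
import numpy as np
import pandas as pd

# Column names are assumptions made for this sketch.
USER, ITEM, RANK = "user_id", "item_id", "rank"

reco = pd.DataFrame({
    USER: [1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4],
    ITEM: [7, 8, 1, 2, 1, 2, 3, 4, 1, 2, 3],
    RANK: [1, 2, 1, 2, 1, 2, 3, 4, 1, 2, 3],
})
interactions = pd.DataFrame({
    USER: [1, 1, 2, 3, 3, 3, 4, 4, 4],
    ITEM: [1, 2, 1, 1, 3, 4, 1, 2, 3],
})

# Assumed layout: one row per ground-truth interaction, with the rank
# of the item in the recommendations, or NaN if it was not recommended.
merged = interactions.drop_duplicates().merge(reco, on=[USER, ITEM], how="left")

k = 3
ranks = merged[RANK]
hit = ranks <= k  # comparisons with NaN are False, so misses drop out
gain = pd.Series(np.where(hit, 1.0 / np.log2(ranks + 1), 0.0), index=merged.index)

# DCG per user, then a shared IDCG (the divide_by_achievable=False case).
dcg_per_user = gain.groupby(merged[USER]).sum()
idcg = (1.0 / np.log2(np.arange(1, k + 1) + 1)).sum()
ndcg_per_user = dcg_per_user / idcg

# The aggregated metric is the average of the per-user values.
ndcg = ndcg_per_user.mean()
```

Because every row of the merged table is a relevant item, the binary gain reduces to checking whether the rank is present and within the top k.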

calc_per_user(reco: DataFrame, interactions: DataFrame) → Series

Calculate metric values for all users.

Parameters
  • reco (pd.DataFrame) – Recommendations table with columns Columns.User, Columns.Item, Columns.Rank.

  • interactions (pd.DataFrame) – Interactions table with columns Columns.User, Columns.Item.

Returns

Values of metric (index - user id, values - metric value for every user).

Return type

pd.Series

calc_per_user_from_merged(merged: DataFrame, is_debiased: bool = False) → Series

Calculate metric values for all users from merged recommendations.

Parameters
  • merged (pd.DataFrame) – Result of merging recommendations and interactions tables. Can be obtained using merge_reco function.

  • is_debiased (bool, default False) – An indicator of whether the debias transformation has been applied before or not.

Returns

Values of metric (index - user id, values - metric value for every user).

Return type

pd.Series