NDCG

class rectools.metrics.ranking.NDCG(k: int, log_base: int = 2)

Bases: _RankingMetric

Normalized Discounted Cumulative Gain at k (NDCG@k).

Estimates the relevance of recommendations, taking their order into account.

\[NDCG@k = DCG@k / IDCG@k\]

where \(DCG@k = \sum_{i=1}^{k} rel(i) / \log(i + 1)\) is the Discounted Cumulative Gain at k, the main part of NDCG@k.

The closer a relevant item is to the top, the more weight it receives. Here:
  • rel(i) is an indicator function: it equals 1 if the item at rank i is relevant and 0 otherwise;

  • log is a logarithm at any given base, usually 2.

and \(IDCG@k = \sum_{i=1}^{k} 1 / \log(i + 1)\) is the Ideal DCG@k, the maximum possible value of DCG@k, used as a normalization coefficient to ensure that NDCG@k values lie in [0, 1].
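
As a worked illustration (not part of the original reference), take k = 3 with base-2 logarithms and a user whose recommendations at ranks 1 and 3 are relevant. The ideal DCG has one term per position in the top 3:

\[IDCG@3 = \frac{1}{\log_2 2} + \frac{1}{\log_2 3} + \frac{1}{\log_2 4} \approx 1 + 0.6309 + 0.5 = 2.1309\]

\[DCG@3 = \frac{1}{\log_2 2} + \frac{0}{\log_2 3} + \frac{1}{\log_2 4} = 1.5, \qquad NDCG@3 = \frac{1.5}{2.1309} \approx 0.7039\]

This agrees with the value for user 3 in the Examples section, whose relevant item at rank 4 falls outside the top 3.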

Parameters
  • k (int) – Number of items at the top of the recommendations list that are used to calculate the metric.

  • log_base (int, default 2) – Base of logarithm used to weight relevant items.
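
A short derivation, assuming the formulas above are applied exactly as written with a binary rel(i); the symbol b is introduced here for the logarithm base. Changing the base rescales every term of DCG@k and IDCG@k by the same factor:

\[\log_b(x) = \frac{\ln x}{\ln b} \;\Rightarrow\; NDCG@k = \frac{\ln b \sum_i rel(i) / \ln(i + 1)}{\ln b \sum_i 1 / \ln(i + 1)} = \frac{\sum_i rel(i) / \ln(i + 1)}{\sum_i 1 / \ln(i + 1)}\]

Under that assumption the ratio NDCG@k does not depend on b, although DCG@k and IDCG@k individually do. Whether the implementation follows the formulas term by term is not stated on this page.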

Examples

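>>> import pandas as pd
>>> from rectools import Columns
>>> from rectools.metrics.ranking import NDCG
>>> reco = pd.DataFrame(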
...     {
...         Columns.User: [1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4],
...         Columns.Item: [7, 8, 1, 2, 1, 2, 3, 4, 1, 2, 3],
...         Columns.Rank: [1, 2, 1, 2, 1, 2, 3, 4, 1, 2, 3],
...     }
... )
>>> interactions = pd.DataFrame(
...     {
...         Columns.User: [1, 1, 2, 3, 3, 3, 4, 4, 4],
...         Columns.Item: [1, 2, 1, 1, 3, 4, 1, 2, 3],
...     }
... )
>>> # Here:
>>> #    - for user ``1`` none of the recommendations are relevant;
>>> #    - for user ``2`` we return 2 items and the first one is relevant;
>>> #    - for user ``3`` we return 4 items; the 1st, 3rd and 4th are relevant;
>>> #    - for user ``4`` we return 3 items and all of them are relevant.
>>> NDCG(k=1).calc_per_user(reco, interactions).values
array([0., 1., 1., 1.])
>>> NDCG(k=3).calc_per_user(reco, interactions).values
array([0.        , 0.46927873, 0.70391809, 1.        ])
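
The k=3 values above can be reproduced by hand. The following is a minimal pure-Python sketch, not the library's implementation: `ndcg_at_k` is a hypothetical helper, the dictionaries restate the example tables, and IDCG is assumed to sum one term for each of the top k positions.

```python
import math


def ndcg_at_k(relevant_ranks, k, log_base=2):
    """NDCG@k for one user, given the 1-based ranks of relevant recommendations."""
    # Only relevant items inside the top k contribute to DCG.
    dcg = sum(1 / math.log(rank + 1, log_base) for rank in relevant_ranks if rank <= k)
    # Assumption: the ideal list has a relevant item at every one of the k positions.
    idcg = sum(1 / math.log(i + 1, log_base) for i in range(1, k + 1))
    return dcg / idcg


# Example data restated: user -> recommended items in rank order.
reco = {1: [7, 8], 2: [1, 2], 3: [1, 2, 3, 4], 4: [1, 2, 3]}
# Example data restated: user -> items the user interacted with.
interactions = {1: {1, 2}, 2: {1}, 3: {1, 3, 4}, 4: {1, 2, 3}}

per_user = {}
for user, items in reco.items():
    relevant_ranks = [
        rank for rank, item in enumerate(items, start=1) if item in interactions[user]
    ]
    per_user[user] = ndcg_at_k(relevant_ranks, k=3)

print({user: round(value, 8) for user, value in per_user.items()})
```

For user 3 the relevant item at rank 4 is ignored because it lies outside the top 3, which is why that user's value stays below 1.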

Methods

calc(reco, interactions)

Calculate metric value.

calc_from_merged(merged)

Calculate metric value from merged recommendations.

calc_per_user(reco, interactions)

Calculate metric values for all users.

calc_per_user_from_merged(merged)

Calculate metric values for all users from merged recommendations.

Attributes

log_base

calc_from_merged(merged: DataFrame) → float

Calculate metric value from merged recommendations.

Parameters

merged (pd.DataFrame) – Result of merging the recommendations and interactions tables. Can be obtained using the merge_reco function.

Returns

Metric value (averaged over users).

Return type

float
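
A minimal sketch of the aggregation, assuming "averaged over users" means the unweighted arithmetic mean of the per-user values (an interpretation of the description, not confirmed on this page). The numbers are the per-user NDCG@3 values from the class example.

```python
from statistics import mean

# Per-user NDCG@3 values copied from the class example (user id -> value).
per_user = {1: 0.0, 2: 0.46927873, 3: 0.70391809, 4: 1.0}

# Assumption: every user contributes equally to the overall metric.
overall = mean(per_user.values())
print(round(overall, 6))
```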

calc_per_user(reco: DataFrame, interactions: DataFrame) → Series

Calculate metric values for all users.

Parameters
  • reco (pd.DataFrame) – Recommendations table with columns Columns.User, Columns.Item, Columns.Rank.

  • interactions (pd.DataFrame) – Interactions table with columns Columns.User, Columns.Item.

Returns

Metric values (index: user id, values: metric value for each user).

Return type

pd.Series

calc_per_user_from_merged(merged: DataFrame) → Series

Calculate metric values for all users from merged recommendations.

Parameters

merged (pd.DataFrame) – Result of merging the recommendations and interactions tables. Can be obtained using the merge_reco function.

Returns

Metric values (index: user id, values: metric value for each user).

Return type

pd.Series