Splitter

class rectools.model_selection.splitter.Splitter(filter_cold_users: bool = True, filter_cold_items: bool = True, filter_already_seen: bool = True)

Bases: object

Base class for constructing data splitters. It cannot be used directly. A new splitter can be defined by subclassing the Splitter class and implementing the _split_without_filter method. See the descriptions of the specific subclasses for more information.
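The subclassing pattern described above can be sketched in plain Python. This is only an illustration under stated assumptions: ToyBaseSplitter and LastRowSplitter are hypothetical stand-ins, not rectools classes; plain lists stand in for the ndarray row numbers; and the exact signature of _split_without_filter in rectools is not shown in this section, so the one used here is assumed.

```python
from typing import Any, Dict, Iterator, List, Tuple

Row = Tuple[str, str]  # (user, item)
Fold = Tuple[List[int], List[int], Dict[str, Any]]


class ToyBaseSplitter:
    """Hypothetical stand-in for the base class; not the rectools code."""

    def split(self, rows: List[Row]) -> Iterator[Fold]:
        # Public entry point: take raw folds from the subclass hook,
        # then pass each fold through the shared filter step.
        for train_idx, test_idx, info in self._split_without_filter(rows):
            yield self.filter(rows, train_idx, test_idx, info)

    def _split_without_filter(self, rows: List[Row]) -> Iterator[Fold]:
        raise NotImplementedError("subclasses must implement this hook")

    def filter(self, rows: List[Row], train_idx: List[int],
               test_idx: List[int], info: Dict[str, Any]) -> Fold:
        # Placeholder: this toy version passes the fold through unchanged.
        return train_idx, test_idx, info


class LastRowSplitter(ToyBaseSplitter):
    """Hypothetical subclass: one fold whose test part is the last row."""

    def _split_without_filter(self, rows: List[Row]) -> Iterator[Fold]:
        last = len(rows) - 1
        yield list(range(last)), [last], {"fold": 0}


rows = [("u1", "i1"), ("u1", "i2"), ("u2", "i1")]
folds = list(LastRowSplitter().split(rows))
```

The subclass only decides which rows go to train and test; the filtering logic stays in the base class and is applied to every fold in the same way.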

Parameters
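  • filter_cold_users (bool, default True) – If True, users that are not present in train are excluded from test.

  • filter_cold_items (bool, default True) – If True, items that are not present in train are excluded from test.

  • filter_already_seen (bool, default True) – If True, (user, item) pairs that are present in train are excluded from test.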

Methods

filter(interactions, collect_fold_stats, ...)

Filter train and test indexes from one fold based on the filter_cold_users, filter_cold_items, and filter_already_seen class fields.

split(interactions[, collect_fold_stats])

Split interactions into folds and apply filtration to the result.

filter(interactions: Interactions, collect_fold_stats: bool, train_idx: ndarray, test_idx: ndarray, split_info: Dict[str, Any]) → Tuple[ndarray, ndarray, Dict[str, Any]]

Filter train and test indexes from one fold based on the filter_cold_users, filter_cold_items, and filter_already_seen class fields. All three are set to True by default.

Parameters
  • interactions (Interactions) – User-item interactions.

  • collect_fold_stats (bool) – If True, add statistics to the split info, such as the sizes of the train and test parts and the numbers of users and items.

  • train_idx (array) – Train part row numbers.

  • test_idx (array) – Test part row numbers.

  • split_info (dict) – Information about the split.

Returns

A tuple with the filtered train part row numbers, the filtered test part row numbers, and the split info.

Return type

Tuple(array, array, dict)
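As a rough illustration of the filtering step, the sketch below mimics the inputs and outputs of filter in plain Python. It assumes each flag drops the test rows its name suggests (users or items absent from train, and pairs already present in train). The function filter_fold, the (user, item) tuple layout, and the stat keys "train" and "test" are hypothetical; the actual method works on an Interactions object and ndarray row numbers.

```python
from typing import Any, Dict, List, Tuple

Row = Tuple[str, str]  # (user, item)


def filter_fold(
    rows: List[Row],
    train_idx: List[int],
    test_idx: List[int],
    split_info: Dict[str, Any],
    collect_fold_stats: bool = False,
    filter_cold_users: bool = True,
    filter_cold_items: bool = True,
    filter_already_seen: bool = True,
) -> Tuple[List[int], List[int], Dict[str, Any]]:
    # Everything the train part "knows" about.
    train_users = {rows[i][0] for i in train_idx}
    train_items = {rows[i][1] for i in train_idx}
    train_pairs = {rows[i] for i in train_idx}

    kept = []
    for i in test_idx:
        user, item = rows[i]
        if filter_cold_users and user not in train_users:
            continue  # user never appears in train
        if filter_cold_items and item not in train_items:
            continue  # item never appears in train
        if filter_already_seen and (user, item) in train_pairs:
            continue  # this exact pair is already in train
        kept.append(i)

    info = dict(split_info)  # copy so the caller's dict is not mutated
    if collect_fold_stats:
        # Hypothetical stat names, chosen for this sketch only.
        info["train"] = len(train_idx)
        info["test"] = len(kept)
    return train_idx, kept, info


rows = [
    ("u1", "i1"), ("u1", "i2"), ("u2", "i1"),  # train rows 0-2
    ("u3", "i1"),  # row 3: cold user
    ("u1", "i3"),  # row 4: cold item
    ("u1", "i1"),  # row 5: pair already seen in train
    ("u2", "i2"),  # row 6: known user, known item, new pair
]
result = filter_fold(rows, [0, 1, 2], [3, 4, 5, 6], {"fold": 0},
                     collect_fold_stats=True)
```

With all three flags enabled only row 6 remains in the test part; with all of them disabled the test part is returned unchanged.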

split(interactions: Interactions, collect_fold_stats: bool = False) → Iterator[Tuple[ndarray, ndarray, Dict[str, Any]]]

Split interactions into folds and apply filtration to the result.

Parameters
  • interactions (Interactions) – User-item interactions.

  • collect_fold_stats (bool, default False) – If True, add statistics to the split info, such as the sizes of the train and test parts and the numbers of users and items.

Returns

Yields tuples with train part row numbers, test part row numbers and split info.

Return type

iterator(array, array, dict)
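Because split yields row numbers rather than data, each fold has to be turned back into rows by the caller. The sketch below shows that consumption pattern with a hypothetical generator, toy_split, standing in for a concrete splitter; the expanding-window fold layout and the "fold" key are assumptions made for the example, not rectools behavior.

```python
from typing import Any, Dict, Iterator, List, Tuple


def toy_split(
    n_rows: int, test_size: int
) -> Iterator[Tuple[List[int], List[int], Dict[str, Any]]]:
    """Hypothetical stand-in for split(): expanding-window folds."""
    fold = 0
    for start in range(test_size, n_rows, test_size):
        train_idx = list(range(start))
        test_idx = list(range(start, min(start + test_size, n_rows)))
        yield train_idx, test_idx, {"fold": fold}
        fold += 1


rows = ["r0", "r1", "r2", "r3", "r4", "r5"]
summary = []
for train_idx, test_idx, info in toy_split(len(rows), test_size=2):
    # Row numbers are positions, so they select rows by index.
    train_rows = [rows[i] for i in train_idx]
    test_rows = [rows[i] for i in test_idx]
    summary.append((info["fold"], len(train_rows), test_rows))
```

Folds are produced lazily, so a model can be fitted and evaluated on one fold before the next one is computed.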