Interactions

class rectools.dataset.interactions.Interactions(df: DataFrame)

Bases: object

Structure to store info about user-item interactions.

It is usually more convenient to create an instance with the from_raw method than to call the constructor directly.

Parameters

df (pd.DataFrame) –

Table in which every row describes one user-item interaction, with the following columns:
  • Columns.User - internal user id (non-negative int values);

  • Columns.Item - internal item id (non-negative int values);

  • Columns.Weight - weight of interaction, float, use 1 if interactions have no weight;

  • Columns.Datetime - timestamp of the interaction; assign an arbitrary value if you do not plan to use it later.

Extra columns can also be present.

Methods

convert_weight_and_datetime_types(df)

Convert weight column to float and datetime column to datetime64[ns] in-place.

from_raw(interactions, user_id_map, item_id_map)

Create Interactions from dataset with external ids and id mappings.

get_user_item_matrix([include_weights, dtype])

Form a user-item CSR matrix based on interactions data.

to_external(user_id_map, item_id_map[, ...])

Convert the interactions to a pd.DataFrame, replacing internal user and item ids with external ones.

Attributes

df

static convert_weight_and_datetime_types(df: DataFrame) → None

Convert weight column to float and datetime column to datetime64[ns] in-place.

This method ensures that the specified weight column contains numeric values and that the datetime column can be converted to pandas’ datetime64[ns] format. The conversion is done in-place, so the original DataFrame will be modified.

Parameters

df (pd.DataFrame) –

Input DataFrame that must contain the following columns:
  • Columns.Weight - interaction weight;

  • Columns.Datetime - interaction timestamp.

Return type

None
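The effect of this conversion can be approximated with plain pandas. The sketch below is not the RecTools implementation: the helper name convert_types_sketch is hypothetical, and the column names "weight" and "datetime" are assumed to correspond to Columns.Weight and Columns.Datetime.

```python
import pandas as pd

# Assumed column names; in RecTools they come from Columns.Weight and Columns.Datetime.
WEIGHT_COL = "weight"
DATETIME_COL = "datetime"


def convert_types_sketch(df: pd.DataFrame) -> None:
    """Hypothetical helper: cast both columns on the passed DataFrame and return nothing."""
    df[WEIGHT_COL] = df[WEIGHT_COL].astype(float)
    df[DATETIME_COL] = pd.to_datetime(df[DATETIME_COL]).astype("datetime64[ns]")


raw = pd.DataFrame(
    {
        WEIGHT_COL: [1, 3, 2],  # integers before conversion
        DATETIME_COL: ["2021-09-01", "2021-09-02", "2021-09-05"],  # strings before conversion
    }
)

result = convert_types_sketch(raw)  # `raw` itself is modified; nothing is returned
```

Because the function returns None, the caller keeps working with the DataFrame that was passed in; make a copy first if the original values are still needed.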

classmethod from_raw(interactions: DataFrame, user_id_map: IdMap, item_id_map: IdMap, keep_extra_cols: bool = False) → Interactions

Create Interactions from dataset with external ids and id mappings.

Parameters
  • interactions (pd.DataFrame) –

    Table in which every row describes one user-item interaction, with the following columns:
    • Columns.User - user id;

    • Columns.Item - item id;

    • Columns.Weight - weight of interaction, float, use 1 if interactions have no weight;

    • Columns.Datetime - timestamp of the interaction; assign an arbitrary value if you do not plan to use it later.

  • user_id_map (IdMap) – User identifiers mapping.

  • item_id_map (IdMap) – Item identifiers mapping.

  • keep_extra_cols (bool, default False) – Whether to keep all columns of interactions in addition to the default ones.

Return type

Interactions
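The id translation that from_raw relies on can be illustrated with plain pandas. In this sketch ordinary dictionaries stand in for the IdMap objects, and the column names "user_id", "item_id", "weight" and "datetime" are assumed to match the Columns constants; it shows the idea of mapping external ids to non-negative internal ones, not the library's own code.

```python
import pandas as pd

# Raw interactions with external ids (strings for users, arbitrary integers for items).
raw = pd.DataFrame(
    {
        "user_id": ["u10", "u10", "u25", "u37"],
        "item_id": [501, 777, 777, 902],
        "weight": [1.0, 3.0, 2.0, 5.0],
        "datetime": pd.to_datetime(
            ["2021-09-01", "2021-09-02", "2021-09-02", "2021-09-05"]
        ),
    }
)

# Stand-ins for IdMap: external id -> consecutive non-negative internal id,
# numbered in order of first appearance.
user_to_internal = {ext: i for i, ext in enumerate(raw["user_id"].unique())}
item_to_internal = {ext: i for i, ext in enumerate(raw["item_id"].unique())}

# Same table, but with internal ids in the user and item columns.
internal = raw.copy()
internal["user_id"] = raw["user_id"].map(user_to_internal)
internal["item_id"] = raw["item_id"].map(item_to_internal)
```

The resulting table has the layout that the Interactions constructor expects: internal ids in the user and item columns, with weights and timestamps unchanged.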

get_user_item_matrix(include_weights: bool = True, dtype: Type = numpy.float32) → csr_matrix

Form a user-item CSR matrix based on interactions data.

Parameters
  • include_weights (bool, default True) – Whether to include interaction weights in the matrix. If False, all values in the returned matrix are equal to 1.

  • dtype (Type, default numpy.float32) – Data type of the values in the returned matrix.

Return type

csr_matrix
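A user-item matrix of this kind can be sketched with SciPy: rows are internal user ids, columns are internal item ids, and the stored values are either the weights or ones. The function user_item_matrix_sketch is a hypothetical stand-in written for illustration, and the column names are assumed to match the Columns constants.

```python
import numpy as np
import pandas as pd
from scipy import sparse

# Interactions that already use internal (non-negative integer) ids.
interactions = pd.DataFrame(
    {
        "user_id": [0, 0, 1, 2],
        "item_id": [0, 1, 1, 2],
        "weight": [1.0, 3.0, 2.0, 5.0],
    }
)


def user_item_matrix_sketch(df, include_weights=True, dtype=np.float32):
    """Hypothetical helper: build a CSR matrix indexed by (user id, item id)."""
    if include_weights:
        values = df["weight"].to_numpy()
    else:
        values = np.ones(len(df))  # every interaction counts as 1
    rows = df["user_id"].to_numpy()
    cols = df["item_id"].to_numpy()
    return sparse.csr_matrix((values.astype(dtype), (rows, cols)))


weighted = user_item_matrix_sketch(interactions)
binary = user_item_matrix_sketch(interactions, include_weights=False)
```

Here both matrices have shape (3, 3), one row per user and one column per item; weighted stores the interaction weights, while binary stores 1 for every observed user-item pair.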

to_external(user_id_map: IdMap, item_id_map: IdMap, include_weight: bool = True, include_datetime: bool = True, include_extra_cols: Union[bool, List[str]] = True) → DataFrame

Convert the interactions to a pd.DataFrame, replacing internal user and item ids with external ones.

Parameters
  • user_id_map (IdMap) – User id map that has to be used for converting internal user ids to external ones.

  • item_id_map (IdMap) – Item id map that has to be used for converting internal item ids to external ones.

  • include_weight (bool, default True) – Whether to include weight column into resulting table or not.

  • include_datetime (bool, default True) – Whether to include datetime column into resulting table or not.

  • include_extra_cols (bool or List[str], default True) – If bool, indicates whether to include all extra columns into resulting table or not. If list of strings, indicates which extra columns to include into resulting table.

Return type

pd.DataFrame
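The way the include_* flags shape the output can be sketched with plain pandas. Dictionaries stand in for the IdMap objects, to_external_sketch is a hypothetical helper, the "source" column is an invented extra column, and the default column names are assumed to match the Columns constants; the column order of the real method may differ.

```python
import pandas as pd

DEFAULT_COLS = ["user_id", "item_id", "weight", "datetime"]  # assumed Columns values

internal = pd.DataFrame(
    {
        "user_id": [0, 0, 1],
        "item_id": [0, 1, 1],
        "weight": [1.0, 3.0, 2.0],
        "datetime": pd.to_datetime(["2021-09-01", "2021-09-02", "2021-09-02"]),
        "source": ["web", "app", "web"],  # hypothetical extra column
    }
)

# Stand-ins for IdMap: internal id -> external id.
user_to_external = {0: "u10", 1: "u25"}
item_to_external = {0: 501, 1: 777}


def to_external_sketch(df, user_map, item_map, include_weight=True,
                       include_datetime=True, include_extra_cols=True):
    """Hypothetical helper: swap in external ids and pick the output columns."""
    out = pd.DataFrame(
        {
            "user_id": df["user_id"].map(user_map),
            "item_id": df["item_id"].map(item_map),
        }
    )
    if include_weight:
        out["weight"] = df["weight"]
    if include_datetime:
        out["datetime"] = df["datetime"]
    extra_cols = [col for col in df.columns if col not in DEFAULT_COLS]
    if isinstance(include_extra_cols, list):
        # A list selects specific extra columns.
        extra_cols = [col for col in extra_cols if col in include_extra_cols]
    elif not include_extra_cols:
        extra_cols = []
    for col in extra_cols:
        out[col] = df[col]
    return out


full = to_external_sketch(internal, user_to_external, item_to_external)
ids_only = to_external_sketch(
    internal, user_to_external, item_to_external,
    include_weight=False, include_datetime=False, include_extra_cols=False,
)
```

With all flags left at their defaults the sketch returns every column; switching all three off leaves only the user and item columns with external ids.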