Interactions
- class rectools.dataset.interactions.Interactions(df: DataFrame)[source]
Bases: object
Structure to store info about user-item interactions.
It is usually more convenient to create an instance with the from_raw method than to call the constructor directly.
- Parameters
df (pd.DataFrame) – Table where every row contains a user-item interaction. The columns are:
Columns.User - internal user id (non-negative int values);
Columns.Item - internal item id (non-negative int values);
Columns.Weight - weight of interaction, float, use 1 if interactions have no weight;
Columns.Datetime - timestamp of interaction, assign an arbitrary value if you are not going to use it later.
Extra columns can also be present.
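As a rough illustration of the table layout described above, the following sketch builds a DataFrame of that shape with pandas only. The literal column names are assumptions standing in for Columns.User, Columns.Item, Columns.Weight and Columns.Datetime, and the "source" column is a made-up extra column.

```python
import pandas as pd

# Assumed literal names standing in for Columns.User, Columns.Item,
# Columns.Weight and Columns.Datetime; check the Columns class for real values.
USER, ITEM, WEIGHT, DATETIME = "user_id", "item_id", "weight", "datetime"

df = pd.DataFrame(
    {
        USER: [0, 0, 1, 2],  # internal user ids: non-negative ints
        ITEM: [0, 1, 1, 2],  # internal item ids: non-negative ints
        WEIGHT: [1.0, 1.0, 1.0, 1.0],  # no real weights, so every row gets 1
        DATETIME: pd.to_datetime(["2024-01-01"] * 4),  # placeholder timestamps
        "source": ["web", "app", "web", "app"],  # hypothetical extra column
    }
)

# Basic sanity checks matching the documented requirements.
ids_are_valid = bool((df[USER] >= 0).all() and (df[ITEM] >= 0).all())
```

A table like this is what the constructor expects as df; when the ids are still external (strings, arbitrary numbers), from_raw is the more natural entry point.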
Methods
convert_weight_and_datetime_types(df) - Convert weight column to float and datetime column to datetime64[ns] in-place.
from_raw(interactions, user_id_map, item_id_map) - Create Interactions from dataset with external ids and id mappings.
get_user_item_matrix([include_weights, dtype]) - Form a user-item CSR matrix based on interactions data.
to_external(user_id_map, item_id_map[, ...]) - Convert to a pd.DataFrame, replacing internal user and item ids with external ones.
Attributes
df
- static convert_weight_and_datetime_types(df: DataFrame) → None[source]
Convert weight column to float and datetime column to datetime64[ns] in-place.
This method ensures that the specified weight column contains numeric values and that the datetime column can be converted to pandas’ datetime64[ns] format. The conversion is done in-place, so the original DataFrame will be modified.
- Parameters
df (pd.DataFrame) – Input DataFrame that must contain the following columns:
Columns.Weight - interaction weight;
Columns.Datetime - interaction timestamp.
- Return type
None
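The in-place conversion described above can be sketched with plain pandas. This is not the library's implementation: convert_types_in_place is a hypothetical helper, and the column names "weight" and "datetime" are assumed stand-ins for Columns.Weight and Columns.Datetime.

```python
import pandas as pd

# Assumed literal names standing in for Columns.Weight and Columns.Datetime.
WEIGHT, DATETIME = "weight", "datetime"


def convert_types_in_place(df: pd.DataFrame) -> None:
    """Hypothetical helper: cast weights to float and timestamps to datetime64[ns]."""
    df[WEIGHT] = df[WEIGHT].astype(float)
    # The explicit astype pins the resolution to nanoseconds.
    df[DATETIME] = pd.to_datetime(df[DATETIME]).astype("datetime64[ns]")


raw = pd.DataFrame(
    {
        WEIGHT: [1, 3, 5],  # integer weights
        DATETIME: ["2024-01-01", "2024-01-02", "2024-01-03"],  # strings
    }
)
convert_types_in_place(raw)  # returns None; `raw` itself is modified
```

Because the function assigns columns on the object it receives, the caller's DataFrame changes; pass a copy if the original dtypes need to be preserved.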
- classmethod from_raw(interactions: DataFrame, user_id_map: IdMap, item_id_map: IdMap, keep_extra_cols: bool = False) → Interactions[source]
Create Interactions from dataset with external ids and id mappings.
- Parameters
interactions (pd.DataFrame) – Table where every row contains a user-item interaction. The columns are:
Columns.User - user id;
Columns.Item - item id;
Columns.Weight - weight of interaction, float, use 1 if interactions have no weight;
Columns.Datetime - timestamp of interaction, assign an arbitrary value if you are not going to use it later.
user_id_map (IdMap) – User identifiers mapping.
item_id_map (IdMap) – Item identifiers mapping.
keep_extra_cols (bool, default False) – Flag to keep all columns from interactions besides the default ones.
- Return type
Interactions
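To make the idea of "external ids and id mappings" concrete, here is a minimal pandas sketch of translating external ids to internal ones. Plain dicts stand in for IdMap objects, and the column names are assumed stand-ins for the Columns constants; none of this is the library's own code.

```python
import pandas as pd

# Assumed literal names standing in for the Columns constants.
USER, ITEM, WEIGHT, DATETIME = "user_id", "item_id", "weight", "datetime"

raw = pd.DataFrame(
    {
        USER: ["u1", "u2", "u1"],  # external user ids
        ITEM: ["i9", "i9", "i7"],  # external item ids
        WEIGHT: [1.0, 2.0, 1.0],
        DATETIME: pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"]),
    }
)

# Plain dicts as hypothetical stand-ins for IdMap: external id -> internal id,
# numbered in order of first appearance.
user_to_internal = {ext: i for i, ext in enumerate(raw[USER].unique())}
item_to_internal = {ext: i for i, ext in enumerate(raw[ITEM].unique())}

internal = raw.copy()
internal[USER] = raw[USER].map(user_to_internal)
internal[ITEM] = raw[ITEM].map(item_to_internal)
```

The resulting table has the non-negative integer ids that the Interactions constructor requires, which is the work from_raw takes off the caller's hands.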
- get_user_item_matrix(include_weights: bool = True, dtype: Type = numpy.float32) → csr_matrix[source]
Form a user-item CSR matrix based on interactions data.
- Parameters
include_weights (bool, default True) – Whether to include interaction weights in the matrix. If False, all values in the returned matrix will be equal to 1.
dtype (Type) –
- Return type
csr_matrix
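The effect of include_weights can be illustrated with a small SciPy sketch. build_user_item_matrix is a hypothetical helper, not the library's implementation; the column names are assumed stand-ins for the Columns constants, and deriving the matrix shape from the largest ids is a choice made for this sketch only.

```python
import numpy as np
import pandas as pd
from scipy.sparse import csr_matrix

# Assumed literal names standing in for the Columns constants.
USER, ITEM, WEIGHT = "user_id", "item_id", "weight"


def build_user_item_matrix(df, include_weights=True, dtype=np.float32):
    """Hypothetical helper: rows are users, columns are items."""
    if include_weights:
        values = df[WEIGHT].to_numpy(dtype=dtype)
    else:
        values = np.ones(len(df), dtype=dtype)  # every interaction counts as 1
    rows = df[USER].to_numpy()
    cols = df[ITEM].to_numpy()
    # Sketch-only assumption: shape is taken from the largest ids present.
    shape = (int(rows.max()) + 1, int(cols.max()) + 1)
    return csr_matrix((values, (rows, cols)), shape=shape, dtype=dtype)


df = pd.DataFrame({USER: [0, 0, 1], ITEM: [0, 2, 1], WEIGHT: [2.0, 0.5, 3.0]})
weighted = build_user_item_matrix(df)
binary = build_user_item_matrix(df, include_weights=False)
```

Here weighted holds the original weights at each (user, item) position, while binary holds 1 at the same positions; both are 2 x 3 float32 matrices.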
- to_external(user_id_map: IdMap, item_id_map: IdMap, include_weight: bool = True, include_datetime: bool = True, include_extra_cols: Union[bool, List[str]] = True) → DataFrame[source]
Convert to a pd.DataFrame, replacing internal user and item ids with external ones.
- Parameters
user_id_map (IdMap) – User id map that has to be used for converting internal user ids to external ones.
item_id_map (IdMap) – Item id map that has to be used for converting internal item ids to external ones.
include_weight (bool, default True) – Whether to include the weight column in the resulting table.
include_datetime (bool, default True) – Whether to include the datetime column in the resulting table.
include_extra_cols (bool or List[str], default True) – If bool, indicates whether to include all extra columns in the resulting table. If a list of strings, indicates which extra columns to include in the resulting table.
- Return type
pd.DataFrame
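The reverse translation and the three include_* flags can be sketched in pandas as follows. to_external_sketch is a hypothetical helper, plain dicts stand in for IdMap objects, the column names are assumed stand-ins for the Columns constants, and silently ignoring unknown names in the extra-column list is a choice of this sketch rather than documented behaviour.

```python
import pandas as pd

# Assumed literal names standing in for the Columns constants.
USER, ITEM, WEIGHT, DATETIME = "user_id", "item_id", "weight", "datetime"
DEFAULT_COLS = (USER, ITEM, WEIGHT, DATETIME)


def to_external_sketch(df, user_lookup, item_lookup, include_weight=True,
                       include_datetime=True, include_extra_cols=True):
    """Hypothetical helper: map internal ids back to external ones."""
    out = pd.DataFrame(
        {USER: df[USER].map(user_lookup), ITEM: df[ITEM].map(item_lookup)}
    )
    if include_weight:
        out[WEIGHT] = df[WEIGHT]
    if include_datetime:
        out[DATETIME] = df[DATETIME]
    extra = [c for c in df.columns if c not in DEFAULT_COLS]
    if include_extra_cols is True:
        keep = extra
    elif include_extra_cols is False:
        keep = []
    else:
        # Sketch-only choice: names that are not extra columns are skipped.
        keep = [c for c in include_extra_cols if c in extra]
    for col in keep:
        out[col] = df[col]
    return out


internal = pd.DataFrame(
    {
        USER: [0, 1, 0],
        ITEM: [0, 0, 1],
        WEIGHT: [1.0, 2.0, 1.0],
        DATETIME: pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"]),
        "source": ["web", "app", "web"],  # hypothetical extra column
    }
)
# Plain dicts as stand-ins for IdMap: internal id -> external id.
users = {0: "u1", 1: "u2"}
items = {0: "i9", 1: "i7"}

full = to_external_sketch(internal, users, items)
slim = to_external_sketch(internal, users, items, include_datetime=False,
                          include_extra_cols=False)
```

With the defaults, full carries every column of the input; slim drops the datetime and extra columns and keeps only the external ids and weights.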