ddf.stainer.Stainer
-
class
ddf.stainer.
Stainer
(name='Unnamed Stainer', row_idx=[], col_idx=[]) Parent Stainer class containing the basic initialisation that all stainers inherit.
Note
This class is not meant to be used on its own; it serves as the superclass for any custom stainer.
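The subclassing pattern described in this note might look like the sketch below. The `Stainer` class here is a local stand-in that only mirrors the documented constructor signature so the example runs on its own; it is not the ddf implementation, and `UppercaseStainer` is a hypothetical custom stainer.

```python
class Stainer:
    """Local stand-in mirroring the documented constructor; not the ddf source."""

    def __init__(self, name="Unnamed Stainer", row_idx=[], col_idx=[]):
        self.name = name
        # Copy the lists so instances never share the mutable default arguments.
        self.row_idx = list(row_idx)
        self.col_idx = list(col_idx)


class UppercaseStainer(Stainer):
    """Hypothetical custom stainer that reuses the parent initialisation."""

    def __init__(self, name="Uppercase", row_idx=[], col_idx=[]):
        super().__init__(name, row_idx, col_idx)


stainer = UppercaseStainer(row_idx=[0, 2], col_idx=[1])
print(stainer.name, stainer.row_idx, stainer.col_idx)
```

With the real library, the subclass would inherit from `ddf.stainer.Stainer` instead of the stand-in.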
-
name
Name of the stainer.
- Type
str
-
row_idx
Row indices that the stainer will operate on.
- Type
int list
-
col_idx
Column indices that the stainer will operate on.
- Type
int list
-
col_type
Column type that the stainer operates on. If the user does not pass in any col_idx, the stainer uses this value to automatically select viable columns. Currently supported values: "all", "category", "cat", "datetime", "date", "time", "numeric", "int", "float".
- Type
str
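As a rough illustration of how col_type could drive automatic column selection when no col_idx is given, consider the sketch below. Everything in it is an assumption made for illustration: the `select_columns` helper, the `ALIASES` table (including the idea that "cat", "date", "time", "int" and "float" fold into broader kinds), and the use of plain strings to describe each column instead of real dataframe dtypes. It does not reproduce how ddf resolves column types.

```python
# Hypothetical mapping from the documented col_type values to broad kinds.
ALIASES = {
    "category": "category", "cat": "category",
    "datetime": "datetime", "date": "datetime", "time": "datetime",
    "numeric": "numeric", "int": "numeric", "float": "numeric",
}


def select_columns(column_kinds, col_type, col_idx=()):
    """Return col_idx if the user gave one, else the columns matching col_type."""
    if col_idx:
        # Explicit indices always win over automatic selection.
        return list(col_idx)
    if col_type == "all":
        return list(range(len(column_kinds)))
    wanted = ALIASES[col_type]
    return [i for i, kind in enumerate(column_kinds) if kind == wanted]


column_kinds = ["numeric", "category", "datetime", "numeric"]
print(select_columns(column_kinds, "numeric"))
print(select_columns(column_kinds, "cat"))
print(select_columns(column_kinds, "numeric", col_idx=[2]))
```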
-
__init__
(name='Unnamed Stainer', row_idx=[], col_idx=[]) The constructor for the Stainer class.
- Parameters
name (str, optional) – Name of the stainer. Default is "Unnamed Stainer".
row_idx (int list, optional) – Row indices that the stainer will operate on. Default is an empty list.
col_idx (int list, optional) – Column indices that the stainer will operate on. Default is an empty list.
-
get_col_type
() Returns the column type that the stainer operates on.
- Returns
Column type that the stainer operates on.
- Return type
str
-
get_history
() Compiles history information for this stainer and returns it.
- Returns
name (str) – Name of stainer.
msg (str) – Message for user.
time (float) – Time taken to execute the self.transform() method.
-
get_indices
() Returns the row indices and column indices.
- Returns
row_idx (int list) – Row indices that the stainer operates on.
col_idx (int list) – Column indices that the stainer operates on.
-
transform
(df, rng, row_idx, col_idx) Applies staining to the given indices of the provided dataframe.
Note
In this base class, the method does not return anything and simply raises an error. Users are expected to implement the transform method in their own custom stainers.
- Parameters
df (pd.DataFrame) – Dataframe to be transformed.
rng (np.random.BitGenerator) – PCG64 pseudo-random number generator.
row_idx (int list) – Row indices that the stainer will operate on. Will take priority over the class attribute row_idx.
col_idx (int list) – Column indices that the stainer will operate on. Will take priority over the class attribute col_idx.
- Returns
new_df (pd.DataFrame) – Modified dataframe.
row_map ({int: int} dictionary) – Row mapping showing the relationship between the original and new row positions.
col_map ({int: int} dictionary) – Column mapping showing the relationship between the original and new column positions.
- Raises
Exception – Raised when the child class does not implement the transform method.
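The contract described for transform can be sketched as follows. To keep the example runnable without third-party packages, a list of rows stands in for the pd.DataFrame and `random.Random` stands in for the NumPy generator. The `Stainer` class is a local stand-in, `BlankCellStainer` is hypothetical, the `None` defaults on its transform are an assumption, and returning identity mappings for a stainer that moves no rows or columns is likewise an assumption.

```python
import random


class Stainer:
    """Local stand-in for the documented base class; not the ddf source."""

    def __init__(self, name="Unnamed Stainer", row_idx=[], col_idx=[]):
        self.name = name
        self.row_idx = list(row_idx)
        self.col_idx = list(col_idx)

    def transform(self, df, rng, row_idx, col_idx):
        # The base class has no staining logic of its own.
        raise Exception("Child class does not implement the transform method.")


class BlankCellStainer(Stainer):
    """Hypothetical stainer that blanks roughly half of the selected cells."""

    def transform(self, df, rng, row_idx=None, col_idx=None):
        # Indices passed to transform take priority over the attributes.
        rows = row_idx if row_idx else self.row_idx
        cols = col_idx if col_idx else self.col_idx
        new_df = [list(row) for row in df]  # copy; the input is left untouched
        for r in rows:
            for c in cols:
                if rng.random() < 0.5:
                    new_df[r][c] = None
        # Assumed convention: nothing moved, so both maps are identity maps.
        row_map = {i: i for i in range(len(new_df))}
        col_map = {j: j for j in range(len(new_df[0]))} if new_df else {}
        return new_df, row_map, col_map


table = [[1, "a"], [2, "b"], [3, "c"]]
stainer = BlankCellStainer(row_idx=[0, 1], col_idx=[1])
# The explicit row_idx/col_idx below override the ones given at construction.
new_table, row_map, col_map = stainer.transform(
    table, random.Random(0), row_idx=[2], col_idx=[0]
)
print(new_table, row_map, col_map)
```

Only the cell at row 2, column 0 is eligible for blanking here, because the indices passed to transform replace the ones stored on the instance.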
-
update_history
(message='', time=0) Used by the transform method to set the attributes required to display history information.
- Parameters
message (str) – Message to be shown to the user about the transformation.
time (float) – Time taken to perform the transform.
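The interplay between update_history and get_history might look like the sketch below. The `Stainer` class is again a local stand-in rather than the ddf source; the attribute names `message` and `time` and the hypothetical `NoOpStainer` are assumptions, as are the identity mappings it returns.

```python
from time import perf_counter


class Stainer:
    """Local stand-in mirroring the documented history methods; not the ddf source."""

    def __init__(self, name="Unnamed Stainer"):
        self.name = name
        self.message = ""  # assumed attribute name
        self.time = 0      # assumed attribute name

    def update_history(self, message="", time=0):
        # Store what get_history will later report.
        self.message = message
        self.time = time

    def get_history(self):
        return self.name, self.message, self.time


class NoOpStainer(Stainer):
    """Hypothetical stainer that changes nothing but records its history."""

    def transform(self, df, rng, row_idx, col_idx):
        start = perf_counter()
        new_df = [list(row) for row in df]  # list of rows stands in for a dataframe
        self.update_history("No cells were changed.", perf_counter() - start)
        row_map = {i: i for i in range(len(new_df))}
        col_map = {j: j for j in range(len(new_df[0]))} if new_df else {}
        return new_df, row_map, col_map


stainer = NoOpStainer(name="No-op")
stainer.transform([[1, 2], [3, 4]], rng=None, row_idx=[], col_idx=[])
name, msg, elapsed = stainer.get_history()
print(name, msg, elapsed)
```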
Methods
__init__
([name, row_idx, col_idx]) The constructor for Stainer class.
get_col_type
() Returns the column type that the stainer operates on.
get_history
() Compiles history information for this stainer and returns it.
get_indices
() Returns the row indices and column indices.
transform
(df, rng, row_idx, col_idx) Applies staining on the given indices in the provided dataframe.
update_history
([message, time]) Used by transform method to set attributes required to display history information.