ddf.stainer.NullifyStainer
-
class ddf.stainer.NullifyStainer
(deg, name='Nullify', row_idx=[], col_idx=[], new_val=None, new_type=False) Stainer that converts selected values to missing data, or to values that represent missing data.
-
__init__
(deg, name='Nullify', row_idx=[], col_idx=[], new_val=None, new_type=False) The constructor for the NullifyStainer class.
- Parameters
deg (float in (0, 1]) – Proportion of the selected data that will be nullified
name (str, optional) – Name of stainer. Default is “Nullify”
row_idx (int list, optional) – Indices of rows which will be considered for transformation (depending on the degree). Default is empty list
col_idx (int list, optional) – Indices of columns which will be considered for transformation (depending on the degree). Default is empty list
new_val (any, optional) – Value that replaces the selected data. Defaults to None
new_type (boolean, optional) – Allows new_val to be of a different type than the current column. Defaults to False (new_val must be the same type as the column being changed)
- Raises
ValueError – Degree provided is not in the range of (0, 1]
TypeError – Raised only when new_type is False, if inserting new_val would change the type of the column.
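The two checks listed under Raises can be sketched in plain Python. This is an illustration only, not the library's implementation: validate_deg and check_new_val are hypothetical helper names, a plain list stands in for a dataframe column, and the sketch assumes that None is always an acceptable new_val.

```python
def validate_deg(deg):
    """Reject degrees outside the half-open interval (0, 1]."""
    if not 0 < deg <= 1:
        raise ValueError("Degree provided is not in the range of (0, 1]")
    return float(deg)


def check_new_val(column_values, new_val, new_type=False):
    """Raise TypeError if new_val would change the column type.

    Assumption: None is always accepted, and the check is skipped
    entirely when new_type is True.
    """
    if new_type or new_val is None:
        return
    # Types already present in the column, ignoring missing entries.
    existing_types = {type(v) for v in column_values if v is not None}
    if existing_types and type(new_val) not in existing_types:
        raise TypeError(
            "new_val of type %s would change the column type"
            % type(new_val).__name__
        )
```

With these helpers, validate_deg(0) raises ValueError, and check_new_val([1, 2, 3], "missing") raises TypeError unless new_type=True is passed.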
-
get_col_type
() Returns the column type that the stainer operates on.
- Returns
Column type that the stainer operates on.
- Return type
string
-
get_history
() Compiles history information for this stainer and returns it.
- Returns
name (str) – Name of stainer.
msg (str) – Message for user.
time (float) – Time taken to execute the self.transform() method.
-
get_indices
() Returns the row indices and column indices.
- Returns
row_idx (int list) – Row indices that the stainer operates on.
col_idx (int list) – Column indices that the stainer operates on.
-
transform
(df, rng, row_idx=None, col_idx=None) Applies staining on the given indices in the provided dataframe.
- Parameters
df (pd.DataFrame) – Dataframe to be transformed.
rng (np.random.BitGenerator) – PCG64 pseudo-random number generator.
row_idx (int list, optional) – Row indices that the stainer will operate on.
col_idx (int list, optional) – Column indices that the stainer will operate on.
- Returns
new_df (pd.DataFrame) – Modified dataframe.
row_map (empty dictionary) – This stainer does not produce any row mappings.
col_map (empty dictionary) – This stainer does not produce any column mappings.
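The general shape of this transformation can be sketched without pandas. The code below is a simplified stand-in, not the library's code: nullify is a hypothetical function, a list of lists replaces the dataframe, random.Random replaces the NumPy generator, and two behaviours are assumed rather than documented (empty index lists select every row or column, and the number of nullified cells is deg times the number of candidate cells, rounded down).

```python
import random


def nullify(rows, rng, deg, row_idx=None, col_idx=None, new_val=None):
    """Return (new_rows, row_map, col_map) with a share of cells replaced."""
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    # Assumption: no indices given means every row / column is a candidate.
    row_idx = list(row_idx) if row_idx else list(range(n_rows))
    col_idx = list(col_idx) if col_idx else list(range(n_cols))

    # Every (row, column) pair that may be nullified.
    candidates = [(r, c) for r in row_idx for c in col_idx]
    # Assumption: round the number of affected cells down.
    n_pick = int(deg * len(candidates))
    picked = rng.sample(candidates, n_pick)

    # Work on a copy so the input is left untouched.
    new_rows = [list(row) for row in rows]
    for r, c in picked:
        new_rows[r][c] = new_val

    # Like the documented stainer, this produces no row or column mappings.
    return new_rows, {}, {}


data = [[1, 2], [3, 4], [5, 6], [7, 8]]
stained, row_map, col_map = nullify(data, random.Random(0), deg=0.5, col_idx=[1])
```

Here half of the four cells in column 1 are replaced with None, column 0 is never touched, and both mappings come back empty. Passing a seeded generator keeps the selection reproducible.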
-
update_history
(message='', time=0) Used by the transform method to set the attributes required to display history information
- Parameters
message (str) – Message to be shown to the user about the transformation
time (float) – Time taken to perform the transform
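How update_history and get_history might fit together around a timed transform can be sketched as follows. HistorySketch is a hypothetical class written for illustration, with attribute names and a message format that are assumptions. Only the method names and the (name, msg, time) return shape come from this page.

```python
from time import perf_counter


class HistorySketch:
    """Minimal stand-in showing how history could be recorded."""

    def __init__(self, name="Nullify"):
        self.name = name
        self._message = ""
        self._time = 0

    def update_history(self, message="", time=0):
        # Store what the last transform did and how long it took.
        self._message = message
        self._time = time

    def get_history(self):
        # Same three fields as documented: name, msg, time.
        return self.name, self._message, self._time

    def transform(self, values):
        start = perf_counter()
        result = [None for _ in values]  # nullify everything, for brevity
        elapsed = perf_counter() - start
        self.update_history("Nullified %d values" % len(values), elapsed)
        return result


stainer = HistorySketch()
stainer.transform([1, 2, 3])
name, msg, elapsed = stainer.get_history()
```

After the call, name is "Nullify", msg is "Nullified 3 values", and elapsed holds the measured duration in seconds.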
Methods
__init__
(deg[, name, row_idx, col_idx, …]) The constructor for the NullifyStainer class.
get_col_type
() Returns the column type that the stainer operates on.
get_history
() Compiles history information for this stainer and returns it.
get_indices
() Returns the row indices and column indices.
transform
(df, rng[, row_idx, col_idx]) Applies staining on the given indices in the provided dataframe.
update_history
([message, time]) Used by the transform method to set the attributes required to display history information
-