ddf.stainer.FTransformStainer

class ddf.stainer.FTransformStainer(deg, name='Function Transform', col_idx=[], trans_lst=[], trans_dict={}, scale=False)

Stainer that applies a transformation function to numerical columns. Selecting a column of any other type raises a TypeError.

__init__(deg, name='Function Transform', col_idx=[], trans_lst=[], trans_dict={}, scale=False)

The constructor for FTransformStainer class.

Parameters
  • deg (float in (0, 1]) – Proportion of the selected data that will be transformed.

  • name (str, optional) – Name of stainer. Default is 'Function Transform'.

  • col_idx (int list, optional) – Column indices that the stainer will operate on. Default is empty list.

  • trans_lst (str list, optional) – Names of transformations in function_dict to include in the pool of possible transformations. Default is empty list.

  • trans_dict ({str : function} dictionary, optional) –

    {Name of transformation: Function} to include in the pool of possible transformations.

    Default is empty dictionary. If neither trans_lst nor trans_dict selects a transformation, all default functions are used instead.

  • scale (boolean, optional) – If True, the transformed data is scaled back to its original range. Default is False.

Raises
  • ValueError – Degree provided is not in the range of (0, 1]

  • Exception – If multiple functions are given the same name

  • KeyError – Name provided in trans_lst is not one of the 7 default transformations

  • TypeError – Invalid column type provided

  • ZeroDivisionError – Transformation would result in division by zero
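
As a rough illustration of how the constructor arguments and the errors listed above might fit together, the following standalone sketch builds a pool of transformations. It does not import ddf and is not the library's implementation: build_pool and DEFAULT_FUNCTIONS are hypothetical names, and the two placeholder entries stand in for the 7 default transformations, whose names are not listed on this page.

```python
import math

# Placeholder defaults (assumption): stand-ins for the library's 7 default
# transformations, which this page does not enumerate.
DEFAULT_FUNCTIONS = {
    "square": lambda x: x ** 2,
    "sqrt": math.sqrt,
}


def build_pool(deg, trans_lst=(), trans_dict=None, defaults=DEFAULT_FUNCTIONS):
    """Hypothetical helper mirroring the documented argument checks."""
    # ValueError: degree must lie in (0, 1].
    if not 0 < deg <= 1:
        raise ValueError("deg must be in the range (0, 1]")

    pool = {}
    # KeyError: every name in trans_lst must be a known default.
    for name in trans_lst:
        if name not in defaults:
            raise KeyError(name)
        if name in pool:
            raise Exception(f"Duplicate transformation name: {name}")
        pool[name] = defaults[name]

    # Exception: user-supplied functions must not reuse an existing name.
    for name, func in (trans_dict or {}).items():
        if name in pool:
            raise Exception(f"Duplicate transformation name: {name}")
        pool[name] = func

    # Nothing selected: fall back to all default functions.
    if not pool:
        pool = dict(defaults)
    return pool
```

For example, build_pool(0.5, trans_lst=["square"], trans_dict={"double": lambda x: 2 * x}) would yield a pool with the two names "square" and "double", while build_pool(1.0) falls back to every placeholder default.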

get_col_type()

Returns the column type that the stainer operates on.

Returns

Column type that the stainer operates on.

Return type

string

get_history()

Compiles history information for this stainer and returns it.

Returns

  • name (str) – Name of stainer.

  • msg (str) – Message for user.

  • time (float) – Time taken to execute the self.transform() method.

get_indices()

Returns the row indices and column indices.

Returns

  • row_idx (int list) – Row indices that the stainer operates on.

  • col_idx (int list) – Column indices that the stainer operates on.

transform(df, rng, row_idx, col_idx)

Applies staining on the given indices in the provided dataframe.

Parameters
  • df (pd.DataFrame) – Dataframe to be transformed.

  • rng (np.random.BitGenerator) – PCG64 pseudo-random number generator.

  • row_idx (int list, optional) – Unused parameter as this stainer does not use row indices. All rows within the selected columns will be transformed.

  • col_idx (int list, optional) – Column indices that the stainer will operate on. Will take priority over the class attribute col_idx.

Returns

  • new_df (pd.DataFrame) – Modified dataframe.

  • row_map (empty dictionary) – This stainer does not produce any row mappings.

  • col_map (empty dictionary) – This stainer does not produce any column mappings.
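
The contract described above (every row of each selected column is transformed, and the result is a triple of the modified dataframe plus two empty mappings) can be sketched without the library. This is a stand-in under stated assumptions, not the ddf implementation: sketch_transform and its pool argument are hypothetical, the random source is a NumPy Generator from np.random.default_rng (which is PCG64-backed), and scale is assumed to mean min-max rescaling to the column's original range.

```python
import numpy as np
import pandas as pd


def sketch_transform(df, rng, col_idx, pool, scale=False):
    """Hypothetical stand-in for the documented transform contract."""
    new_df = df.copy()  # leave the caller's dataframe untouched
    names = sorted(pool)
    for j in col_idx:
        col = new_df.iloc[:, j]
        # Only numerical columns are supported.
        if not pd.api.types.is_numeric_dtype(col):
            raise TypeError(f"Column {j} is not numerical")
        # Pick one transformation from the pool at random.
        name = names[rng.integers(len(names))]
        transformed = col.astype(float).map(pool[name])
        if scale:
            # Assumed meaning of scale: min-max rescale to the original range.
            span = transformed.max() - transformed.min()
            if span == 0:
                raise ZeroDivisionError("Transformed column is constant")
            unit = (transformed - transformed.min()) / span
            transformed = unit * (col.max() - col.min()) + col.min()
        new_df[new_df.columns[j]] = transformed
    # No row or column mappings are produced.
    return new_df, {}, {}


df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})
rng = np.random.default_rng(0)
new_df, row_map, col_map = sketch_transform(
    df, rng, col_idx=[0], pool={"square": lambda x: x ** 2}, scale=True
)
```

With the single "square" function and scale=True, the values 1, 2, 3 become 1, 4, 9 and are then rescaled back into the original range [1, 3], giving 1.0, 1.75, 3.0.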

update_history(message='', time=0)

Used by the transform method to set the attributes required to display history information.

Parameters
  • message (str) – Message to be shown to the user about the transformation.

  • time (float) – Time taken to perform the transform.
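
One way update_history and get_history could interact is sketched below. HistorySketch is a hypothetical class written for illustration only; how the real stainer stores or resets its history is not stated on this page and is assumed here.

```python
from time import perf_counter


class HistorySketch:
    """Hypothetical illustration of the history bookkeeping."""

    def __init__(self, name="Function Transform"):
        self.name = name
        self._message = ""
        self._time = 0

    def update_history(self, message="", time=0):
        # Store what get_history will later report.
        self._message = message
        self._time = time

    def get_history(self):
        # Returns (name, msg, time), matching the documented order.
        return self.name, self._message, self._time

    def transform(self, values):
        # Time the work, then record a message and the elapsed time.
        start = perf_counter()
        result = [v ** 2 for v in values]
        elapsed = perf_counter() - start
        self.update_history(f"Squared {len(values)} values", elapsed)
        return result


stainer = HistorySketch()
squared = stainer.transform([1, 2])
name, msg, elapsed = stainer.get_history()
```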

Methods

__init__(deg[, name, col_idx, trans_lst, …])

The constructor for FTransformStainer class.

get_col_type()

Returns the column type that the stainer operates on.

get_history()

Compiles history information for this stainer and returns it.

get_indices()

Returns the row indices and column indices.

transform(df, rng, row_idx, col_idx)

Applies staining on the given indices in the provided dataframe.

update_history([message, time])

Used by the transform method to set the attributes required to display history information.