ddf.DirtyDF.DirtyDF

class ddf.DirtyDF.DirtyDF(df, seed=None, copy=False)

Dirty DataFrame. Stores information about the dataframe to be stained, previous staining results, and the mapping of the rows and columns.

To be used in conjunction with the Stainer class to add and execute stainers.

__init__(df, seed=None, copy=False)

Constructor for DirtyDF

Parameters
  • df (pd.DataFrame) – Dataframe to be transformed.

  • seed (int, optional) – Controls the randomness of the staining process. For deterministic behaviour, fix seed to an integer. If unspecified, a random seed is chosen.

  • copy (boolean, optional) – Not for use by user. Determines if a copy of DirtyDF is being created. If True, will copy the details from the previous DDF.
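The effect of fixing seed can be illustrated with a minimal standalone sketch. It does not call the ddf library; it only assumes, based on the description of get_rng, that randomness comes from a NumPy generator built on PCG64. The variable names are hypothetical.

```python
import numpy as np

seed = 42  # any fixed integer; the value itself is arbitrary

# Two generators built from the same integer seed (assumed to mirror
# what happens when two DirtyDF objects receive the same seed).
rng_a = np.random.Generator(np.random.PCG64(seed))
rng_b = np.random.Generator(np.random.PCG64(seed))

# Identical seeds give identical random draws, so a random operation
# such as a row shuffle would be reproducible.
perm_a = rng_a.permutation(5)
perm_b = rng_b.permutation(5)
same = bool((perm_a == perm_b).all())
print(same)
```

Leaving the seed unspecified gives up this reproducibility, which is why a fixed integer is recommended when results need to be repeated.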

add_stainers(stain, use_orig_row=True, use_orig_col=True)

Adds a stainer / list of stainers to current list of stainers to be executed.

Parameters
  • stain (Stainer or Stainer list) – stainers to be added to the DDF to be executed in the future

  • use_orig_row (boolean, optional) – Indicates whether row indices in the stainer refer to the initial dataframe or to the dataframe at time of execution. If True, row indices from the initial dataframe are used. Defaults to True

  • use_orig_col (boolean, optional) – Indicates whether column indices in the stainer refer to the initial dataframe or to the dataframe at time of execution. If True, column indices from the initial dataframe are used. Defaults to True

Returns

ddf – Returns new copy of DDF with the stainer added

Return type

DirtyDF
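The difference between the two settings of use_orig_row can be sketched without the library. The helper resolve_rows and the example mapping below are hypothetical and are not part of ddf; the mapping shape follows the {int : int list} dictionaries returned by get_mapping.

```python
def resolve_rows(requested, mapping, use_orig_row=True):
    """Translate requested row indices into positions in the current dataframe.

    Hypothetical helper, not part of ddf. `mapping` has the shape
    {original_index: [current_index, ...]}.
    """
    if not use_orig_row:
        # Indices are already expressed against the current dataframe.
        return list(requested)
    resolved = []
    for row in requested:
        # A row with no entry is treated as no longer present.
        resolved.extend(mapping.get(row, []))
    return resolved


# Assumed state: original row 3 now sits at position 2, row 0 at position 1.
mapping = {0: [1], 3: [2]}
rows_from_original = resolve_rows([3], mapping, use_orig_row=True)
rows_from_current = resolve_rows([3], mapping, use_orig_row=False)
print(rows_from_original, rows_from_current)
```

With use_orig_row=True the request for row 3 follows the row to wherever it has moved; with False the same request targets whatever currently sits at position 3.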

copy()

Creates a copy of the DDF

Returns

ddf – Returns copy of DDF

Return type

DirtyDF

get_df()

Returns the dataframe

Returns

df – Current dataframe in DDF

Return type

pd.DataFrame

get_map_from_history(index, axis=0)

Mapping of rows/cols for the specified stainer transformation that has been executed. A dictionary is returned describing which index each row/col, as positioned immediately before the specified transformation, was moved to by that transformation. For instance, if row 3 was shuffled to row 8 by the first stainer and row 8 was then shuffled to row 2 by the second, index=0 returns {3: [8]} and index=1 returns {8: [2]}

Parameters
  • index (int) – Index of stainer sequence to query mapping. E.g. index=1 will query the mapping performed by the 2nd stainer operation.

  • axis ((0/1), optional) – If 0, returns the row mapping. If 1, returns the col mapping. Defaults to 0

Returns

map – Mapping of row/col indices immediately before the specified transformation to row/col indices immediately after it.

Return type

{int : int list} dictionary

Raises

Exception – If axis provided is not 0/1
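The lookup described above can be pictured as indexing into a list of per-stainer maps. The following is a minimal sketch under that assumption; map_from_history, row_history and col_history are hypothetical names and do not reflect how ddf stores its history internally.

```python
# Hypothetical per-stainer maps, one entry per executed stainer.
row_history = [
    {3: [8]},  # 1st stainer: row 3 moved to row 8
    {8: [2]},  # 2nd stainer: row 8 moved to row 2
]
col_history = [
    {0: [0]},  # 1st stainer left column 0 in place
    {0: [0]},  # 2nd stainer left column 0 in place
]


def map_from_history(index, axis=0):
    """Return the map recorded for the stainer at position `index` (0-based)."""
    if axis not in (0, 1):
        raise Exception("axis must be 0 (rows) or 1 (columns)")
    source = row_history if axis == 0 else col_history
    return source[index]


first = map_from_history(0)
second = map_from_history(1)
print(first, second)
```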

get_mapping(axis=0)

Mapping of rows/cols from the original dataframe to the most recent dataframe. A dictionary is returned giving the index at which each original row/col appears in the newest dataframe. For instance, if row 3 was shuffled to row 8 and row 8 was then shuffled to row 2, the function returns {3: [2]}

Parameters

  • axis ((0/1), optional) – If 0, returns the row mapping. If 1, returns the col mapping. Defaults to 0

Returns

map – Mapping of original row/col indices to current dataframe’s row/col indices.

Return type

{int : int list} dictionary

Raises

Exception – If axis provided is not 0/1
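The overall mapping is the composition of the per-stainer mappings, which the sketch below works through for the example given. The function compose is a hypothetical helper written for illustration, not the library's implementation. The second example assumes that values are lists because one row can correspond to several rows (for instance after duplication) or to none.

```python
def compose(first, second):
    """Chain two {int: [int, ...]} maps: apply `first`, then `second`."""
    combined = {}
    for origin, middles in first.items():
        targets = []
        for middle in middles:
            # Indices absent from `second` are treated as dropped here.
            targets.extend(second.get(middle, []))
        combined[origin] = targets
    return combined


# Row 3 is shuffled to row 8, then row 8 is shuffled to row 2.
step_1 = {3: [8]}
step_2 = {8: [2]}
overall = compose(step_1, step_2)

# Assumed duplication case: row 0 becomes rows 0 and 1, which are then swapped.
duplicated = compose({0: [0, 1]}, {0: [1], 1: [0]})
print(overall, duplicated)
```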

get_previous_map(axis=0)

Mapping of rows/cols for the most recent stainer transformation that has been executed. A dictionary is returned describing which index each row/col, as positioned immediately before that transformation, was moved to by it. For instance, if row 3 was shuffled to row 8 and row 8 was then shuffled to row 2, the function returns {8: [2]}

Parameters

  • axis ((0/1), optional) – If 0, returns the row mapping. If 1, returns the col mapping. Defaults to 0

Returns

map – Mapping of row/col indices immediately before the most recent transformation to the current dataframe’s row/col indices.

Return type

{int : int list} dictionary

Raises

Exception – If axis provided is not 0/1

get_rng()

Returns the random number generator

Returns

rng – PCG64 pseudo-random number generator used for randomisation

Return type

np.random.BitGenerator

get_seed()

Returns seed number

Returns

seed – Integer seed used to create Generator for randomisation

Return type

int

print_history()

Print historical details of the stainers that have been executed

reindex_stainers(new_order)

Reorder stainers in a specified order

Parameters

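new_order (int list) – Indices of the new order of stainers. Judging by the following example, each entry gives the new position of the stainer at that place in the current list: if the original was [A, B, C] and new_order = [1, 2, 0], A moves to position 1, B to position 2 and C to position 0, so the resulting order will be [C, A, B].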

Returns

ddf – Returns new copy of DDF with the stainers rearranged

Return type

DirtyDF
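The ordering convention in the example ([A, B, C] with new_order = [1, 2, 0] giving [C, A, B]) can be reproduced with a short standalone sketch. The function reorder is hypothetical and is written only to match that documented example; it is not the library's code.

```python
def reorder(items, new_order):
    """Place items[i] at position new_order[i] in the result."""
    if sorted(new_order) != list(range(len(items))):
        raise ValueError("new_order must be a permutation of 0..len(items)-1")
    result = [None] * len(items)
    for item, position in zip(items, new_order):
        result[position] = item
    return result


reordered = reorder(["A", "B", "C"], [1, 2, 0])
print(reordered)
```

Note that this differs from the other common convention, in which new_order lists which existing item to pick for each slot; that reading would give [B, C, A] for the same input.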

reset_rng()

Resets Random Generator object

run_all_stainers()

Applies the transformation of all stainers in order

Returns

ddf – Returns new DDF after all the stainers have been executed

Return type

DirtyDF

run_stainer(idx=0)

Applies the transformation of the specified stainer

Parameters

idx (int, optional) – Index of stainer to execute. Defaults to 0 (first stainer added)

Returns

ddf – Returns new DDF after the specified stainer has been executed

Return type

DirtyDF

shuffle_stainers()

Randomly reorder the stainers

Returns

ddf – Returns new copy of DDF with the stainers rearranged

Return type

DirtyDF

summarise_stainers()

Prints names of stainers that have yet to be executed

Methods

__init__(df[, seed, copy])

Constructor for DirtyDF

add_stainers(stain[, use_orig_row, use_orig_col])

Adds a stainer / list of stainers to current list of stainers to be executed.

copy()

Creates a copy of the DDF

get_df()

Returns the dataframe

get_map_from_history(index[, axis])

Mapping of rows/cols for the specified stainer transformation that has been executed.

get_mapping([axis])

Mapping of rows/cols from original dataframe to most recent dataframe.

get_previous_map([axis])

Mapping of rows/cols of the most recent stainer transformation that had been executed.

get_rng()

Returns the random number generator

get_seed()

Returns seed number

print_history()

Print historical details of the stainers that have been executed

reindex_stainers(new_order)

Reorder stainers in a specified order

reset_rng()

Resets Random Generator object

run_all_stainers()

Applies the transformation of all stainers in order

run_stainer([idx])

Applies the transformation of the specified stainer

shuffle_stainers()

Randomly reorder the stainers

summarise_stainers()

Prints names of stainers that have yet to be executed