ddf.stainer.LatlongSplitStainer
-
class ddf.stainer.LatlongSplitStainer(col_idx, name='Latlong Split', prob=1.0)
Stainer that splits each given latlong column into 6 columns, representing degrees, minutes, and seconds for latitude and longitude respectively. If a given column’s name is ‘X’, the generated columns are named ‘X_lat_deg’, ‘X_lat_min’, ‘X_lat_sec’, ‘X_long_deg’, ‘X_long_min’, and ‘X_long_sec’. If a column is split, the original column is dropped.
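The split described above can be sketched in plain Python. This is an illustration only, not the library's implementation: it assumes each coordinate is a signed decimal-degree number, which this page does not state, and the helper names `decimal_to_dms` and `split_column_names` are hypothetical.

```python
def decimal_to_dms(value):
    """Split a signed decimal-degree value into (degrees, minutes, seconds).

    Assumption: the sign is carried on the degrees component only, so a
    value strictly between -1 and 0 loses its sign in this simple sketch.
    """
    sign = -1 if value < 0 else 1
    magnitude = abs(value)
    degrees = int(magnitude)                  # whole degrees
    minutes_full = (magnitude - degrees) * 60
    minutes = int(minutes_full)               # whole minutes
    seconds = (minutes_full - minutes) * 60   # remaining seconds
    return sign * degrees, minutes, seconds


def split_column_names(name):
    """Build the six generated column names for a latlong column `name`."""
    return [
        f"{name}_{axis}_{part}"
        for axis in ("lat", "long")
        for part in ("deg", "min", "sec")
    ]


# Example: a column called 'X' holding the pair (1.5, 103.75).
names = split_column_names("X")
values = decimal_to_dms(1.5) + decimal_to_dms(103.75)
row = dict(zip(names, values))
```

Here `row` pairs each generated column name with the matching component, in the same order as the names listed in the class description.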
-
__init__(col_idx, name='Latlong Split', prob=1.0)
Constructor for LatlongSplitStainer.
- Parameters
col_idx (int list) – Latlong columns to perform latlong splitting on.
name (str, optional) – Name of stainer. Defaults to 'Latlong Split'.
prob (float [0, 1], optional) – Probability that the stainer splits a latlong column. The split decisions for the given latlong columns are independent of one another. Defaults to 1.
- Raises
ValueError – If the probability provided is not in the range [0, 1].
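The `prob` semantics above (range check, then one independent decision per column) can be sketched as follows. This is a minimal sketch under assumptions, not the stainer's actual code: `choose_columns_to_split` is a hypothetical helper, and the standard-library `random.Random` stands in for the NumPy generator that `transform` receives.

```python
import random


def choose_columns_to_split(col_idx, prob, rng):
    """Return the subset of col_idx selected for splitting.

    Hypothetical helper: one independent draw is made per column, so
    with prob=1.0 every column is kept and with prob=0.0 none is.
    """
    if not 0 <= prob <= 1:
        raise ValueError("prob must be in the range [0, 1]")
    # rng.random() returns a float in [0.0, 1.0).
    return [idx for idx in col_idx if rng.random() < prob]


rng = random.Random(42)  # seeded for reproducibility
all_cols = choose_columns_to_split([0, 2, 5], 1.0, rng)
no_cols = choose_columns_to_split([0, 2, 5], 0.0, rng)
```

Because each column gets its own draw, a value such as `prob=0.5` can split some of the given columns and leave others untouched in the same call.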
-
get_col_type()
Returns the column type that the stainer operates on.
- Returns
Column type that the stainer operates on.
- Return type
string
-
get_history()
Compiles history information for this stainer and returns it.
- Returns
name (str) – Name of stainer.
msg (str) – Message for user.
time (float) – Time taken to execute the self.transform() method.
-
get_indices()
Returns the row indices and column indices.
- Returns
row_idx (int list) – Row indices that the stainer operates on.
col_idx (int list) – Column indices that the stainer operates on.
-
transform(df, rng, row_idx=None, col_idx=None)
Applies staining on the given indices in the provided dataframe.
- Parameters
df (pd.DataFrame) – Dataframe to be transformed.
rng (np.random.BitGenerator) – PCG64 pseudo-random number generator.
row_idx (int list, optional) – Unused parameter as this stainer does not use row indices.
col_idx (int list, optional) – Column indices that the stainer will operate on.
- Returns
new_df (pd.DataFrame) – Modified dataframe.
row_map (empty dictionary) – This stainer does not produce any row mappings.
col_map (dictionary) – Column mapping showing the relationship between the original and new column positions.
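To make the idea of a column mapping concrete, the sketch below builds one for a dataframe in which some columns are each replaced by six new ones. The exact structure of `col_map` is not specified on this page; the dictionary of `old position -> list of new positions`, the in-place placement of the new columns, and the helper name `build_col_map` are all assumptions made for illustration.

```python
def build_col_map(n_cols, split_idx, n_new=6):
    """Map each original column position to its new position(s).

    Assumption: the n_new generated columns take the place of the
    original column, shifting every later column to the right.
    """
    split = set(split_idx)
    col_map = {}
    next_pos = 0
    for old in range(n_cols):
        width = n_new if old in split else 1
        col_map[old] = list(range(next_pos, next_pos + width))
        next_pos += width
    return col_map


# Three original columns, with column 1 split into six.
col_map = build_col_map(3, [1])
```

Under these assumptions column 0 stays at position 0, column 1 maps to positions 1 through 6, and column 2 moves to position 7.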
-
update_history(message='', time=0)
Used by the transform method to set the attributes required to display history information.
- Parameters
message (str) – Message to be shown to the user about the transformation.
time (float) – Time taken to perform the transform.
Methods
__init__
(col_idx[, name, prob]) Constructor for LatlongSplitStainer.
get_col_type
() Returns the column type that the stainer operates on.
get_history
() Compiles history information for this stainer and returns it.
get_indices
() Returns the row indices and column indices.
transform
(df, rng[, row_idx, col_idx]) Applies staining on the given indices in the provided dataframe.
update_history
([message, time]) Used by the transform method to set the attributes required to display history information.
-