astronomaly.anomaly_detection package

Submodules

astronomaly.anomaly_detection.human_loop_learning module

class astronomaly.anomaly_detection.human_loop_learning.NeighbourScore(min_score=0.1, max_score=5, alpha=1, **kwargs)

Bases: astronomaly.base.base_pipeline.PipelineStage

anom_func(nearest_neighbour_distance, user_score, anomaly_score)

Simple function that is dominated by the (predicted) user score in regions where we are reasonably sure of our ability to predict that score, and by the anomaly score from the algorithm in regions where we have little labelled data.

Parameters:
  • nearest_neighbour_distance (array) – The distance of each instance to its nearest labelled neighbour.
  • user_score (array) – The predicted user score for each instance.
  • anomaly_score (array) – The actual anomaly score from a machine learning algorithm.
Returns:

The final anomaly score for each instance, penalised by the predicted user score as required.

Return type:

array
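The behaviour described above can be illustrated with a minimal sketch. The blending formula below is an assumption made for illustration and is not the formula Astronomaly implements. The function name blended_score and the distance_scale parameter are hypothetical, and the roles given to min_score, max_score and alpha are guessed from the constructor signature.

```python
import numpy as np


def blended_score(nearest_neighbour_distance, user_score, anomaly_score,
                  min_score=0.1, max_score=5, alpha=1, distance_scale=1.0):
    """Illustrative blend of predicted user score and ML anomaly score.

    ASSUMPTION: this exponential weighting is a stand-in, not the
    library's actual formula. `distance_scale` is a hypothetical
    parameter setting how quickly confidence decays with distance.
    """
    d = np.asarray(nearest_neighbour_distance, dtype=float)
    u = np.asarray(user_score, dtype=float)
    s = np.asarray(anomaly_score, dtype=float)

    # Confidence in the predicted user score: 1 next to a labelled
    # instance, decaying towards 0 far from any label.
    confidence = np.exp(-alpha * d / distance_scale)

    # Turn the user score into a penalty factor in (0, 1]; the floor
    # min_score keeps a low user score from zeroing the result.
    user_factor = np.clip(u, min_score, max_score) / max_score

    # Near labels the user factor dominates; far away the factor tends
    # to 1 and the ML anomaly score passes through unchanged.
    factor = confidence * user_factor + (1.0 - confidence)
    return s * factor
```

With this form, an instance sitting on a labelled point with a user score of 0 is strongly suppressed, while an instance far from every label keeps its original anomaly score.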

combine_data_frames(features, ml_df)

Convenience function to correctly combine dataframes.

compute_nearest_neighbour(features_with_labels)

Calculates the distance of each instance to its nearest labelled neighbour.

Parameters: features_with_labels (pd.DataFrame) – A dataframe where the first columns are the features and the last two columns are ‘human_label’ and ‘score’ (the anomaly score from the ML algorithm).
Returns: Distance of each instance to its nearest labelled neighbour.
Return type: array

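As a rough illustration of the calculation described above, the sketch below computes Euclidean distances by brute force with NumPy. It assumes that unlabelled rows carry ‘human_label’ equal to -1 and that Euclidean distance is the metric; the function name nearest_labelled_distance is hypothetical, and the library may use a different metric or a tree-based search.

```python
import numpy as np
import pandas as pd


def nearest_labelled_distance(features_with_labels):
    """Distance from every row to its closest labelled row (sketch only)."""
    # Everything except the last two bookkeeping columns is a feature.
    features = features_with_labels.drop(
        columns=["human_label", "score"]).to_numpy(dtype=float)

    # ASSUMPTION: -1 marks rows that have no human label.
    labelled = features_with_labels["human_label"].to_numpy() != -1
    if not labelled.any():
        # No labels yet, so every instance is infinitely far from one.
        return np.full(len(features), np.inf)

    # Pairwise differences between all rows and the labelled rows,
    # shape (n_rows, n_labelled, n_features).
    diffs = features[:, None, :] - features[labelled][None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))

    # Labelled rows are at distance 0 from themselves.
    return dists.min(axis=1)
```

The brute-force array grows as the number of rows times the number of labelled rows, so a KD-tree or similar index would be a better fit for large datasets.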
train_regression(features_with_labels)

Trains a regression model on the labelled instances and uses it to predict the user score for all the data. The labels are read from the ‘human_label’ column, which must be -1 where no label exists.

Parameters: features_with_labels (pd.DataFrame) – A dataframe where the first columns are the features and the last two columns are ‘human_label’ and ‘score’ (the anomaly score from the ML algorithm).
Returns: The predicted user score for each instance.
Return type: array

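A minimal sketch of this step follows, assuming a scikit-learn random forest regressor as the model. The documentation above does not name the regressor, so that choice, the function name predict_user_score and the random_state parameter are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor


def predict_user_score(features_with_labels, random_state=0):
    """Fit on labelled rows, then predict a user score for every row."""
    features = features_with_labels.drop(columns=["human_label", "score"])
    labels = features_with_labels["human_label"]

    # Rows with human_label == -1 have no label and are excluded from
    # training, but they still receive a prediction below.
    labelled = labels != -1

    # ASSUMPTION: a random forest is used here purely as an example.
    model = RandomForestRegressor(n_estimators=100,
                                  random_state=random_state)
    model.fit(features[labelled], labels[labelled])
    return model.predict(features)
```

Because a random forest averages training targets, the predictions stay within the range of the labels supplied, which keeps the predicted user score on the same scale as the human labels.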
class astronomaly.anomaly_detection.human_loop_learning.ScoreConverter(lower_is_weirder=True, new_min=0, new_max=5, convert_integer=False, column_name='score', **kwargs)

Bases: astronomaly.base.base_pipeline.PipelineStage
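This class is listed without a description, so the sketch below is only a guess at its purpose, inferred from the parameter names: flip the scores if lower values are more anomalous, rescale them linearly to the range [new_min, new_max], and optionally round to integers. The function name rescale_scores is hypothetical and the actual behaviour of the class may differ.

```python
import numpy as np


def rescale_scores(scores, lower_is_weirder=True, new_min=0, new_max=5,
                   convert_integer=False):
    """Linearly map raw scores onto [new_min, new_max] (sketch only)."""
    scores = np.asarray(scores, dtype=float)

    # ASSUMPTION: when lower raw scores mean "weirder", negate them so
    # that the weirdest instance ends up with the highest new score.
    if lower_is_weirder:
        scores = -scores

    span = scores.max() - scores.min()
    if span == 0:
        # All scores identical: nothing to rank, so use the lower bound.
        rescaled = np.full_like(scores, new_min)
    else:
        rescaled = ((scores - scores.min()) / span
                    * (new_max - new_min) + new_min)

    if convert_integer:
        rescaled = np.round(rescaled).astype(int)
    return rescaled
```

With the defaults, the result lies on a 0 to 5 scale, the same range as the default max_score of NeighbourScore.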

astronomaly.anomaly_detection.isolation_forest module

class astronomaly.anomaly_detection.isolation_forest.IforestAlgorithm(contamination='auto', **kwargs)

Bases: astronomaly.base.base_pipeline.PipelineStage

save_iforest_obj()

Saves the fitted iforest object to the output directory so that it can be rerun quickly on new data.
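The idea of saving a fitted model for reuse can be sketched with scikit-learn's IsolationForest and the standard pickle module. The file name iforest.pkl, the helper names and the use of pickle are assumptions made for illustration; the documentation above does not say how or under what name the object is written.

```python
import os
import pickle

import numpy as np
from sklearn.ensemble import IsolationForest


def fit_and_save_iforest(features, output_dir, contamination="auto"):
    """Fit an isolation forest and write it to output_dir (sketch only)."""
    model = IsolationForest(contamination=contamination, random_state=0)
    model.fit(features)

    # ASSUMPTION: hypothetical file name, chosen for this example.
    path = os.path.join(output_dir, "iforest.pkl")
    with open(path, "wb") as f:
        pickle.dump(model, f)
    return model, path


def load_and_score(path, new_features):
    """Reload a saved forest and score new data without refitting."""
    with open(path, "rb") as f:
        model = pickle.load(f)
    # In scikit-learn, lower decision_function values are more anomalous.
    return model.decision_function(new_features)
```

Reloading the saved object skips the fitting step entirely, which is what makes rerunning on new data quick. Pickled models should only be loaded from trusted locations and with a compatible scikit-learn version.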

astronomaly.anomaly_detection.lof module

class astronomaly.anomaly_detection.lof.LOF_Algorithm(contamination='auto', n_neighbors=20, **kwargs)

Bases: astronomaly.base.base_pipeline.PipelineStage

save_algorithm_obj()

Saves the fitted LOF object to the output directory so that it can be rerun quickly on new data.
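For comparison, here is a minimal sketch of scoring new data with a stored local outlier factor model using scikit-learn. In scikit-learn, LocalOutlierFactor can only score unseen data when it is created with novelty=True; whether this class sets that option is not stated above, so it is an assumption here, as are the helper names.

```python
import pickle

import numpy as np
from sklearn.neighbors import LocalOutlierFactor


def fit_lof(features, contamination="auto", n_neighbors=20):
    """Fit a LOF model that can later score unseen data (sketch only)."""
    # ASSUMPTION: novelty=True is required by scikit-learn before
    # score_samples can be called on data not seen during fit.
    model = LocalOutlierFactor(n_neighbors=n_neighbors,
                               contamination=contamination,
                               novelty=True)
    model.fit(features)
    return model


def roundtrip_and_score(model, new_features):
    """Serialise and restore the model, then score new data."""
    restored = pickle.loads(pickle.dumps(model))
    # Lower values indicate more anomalous instances.
    return restored.score_samples(new_features)
```

The in-memory round trip stands in for writing to and reading from the output directory; the restored model scores new instances without being refitted.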

Module contents