astronomaly.anomaly_detection package
Submodules
astronomaly.anomaly_detection.human_loop_learning module
class astronomaly.anomaly_detection.human_loop_learning.NeighbourScore(min_score=0.1, max_score=5, alpha=1, **kwargs)
Bases: astronomaly.base.base_pipeline.PipelineStage
anom_func(nearest_neighbour_distance, user_score, anomaly_score)
Simple function that is dominated by the (predicted) user score in regions where we are reasonably sure of our ability to predict that score, and by the anomaly score from an algorithm in regions where we have little data.
Parameters:
- nearest_neighbour_distance (array) – The distance of each instance to its nearest labelled neighbour.
- user_score (array) – The predicted user score for each instance.
- anomaly_score (array) – The actual anomaly score from a machine learning algorithm.
Returns: The final anomaly score for each instance, penalised by the predicted user score as required.
Return type: array
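
The behaviour described for anom_func can be illustrated with a small sketch. This is not the formula used by NeighbourScore, which this page does not state; it is a simple exponential blend chosen only to show how a distance-based weight can hand control from the predicted user score (near labelled data) to the algorithm's anomaly score (far from it). The function name `blend_scores` and the weighting scheme are assumptions made for illustration.

```python
import numpy as np


def blend_scores(nearest_neighbour_distance, user_score, anomaly_score, alpha=1.0):
    """Hypothetical blend, NOT the library's formula.

    Instances close to a labelled neighbour follow the predicted user
    score; instances far from any label follow the algorithm's score.
    """
    d = np.asarray(nearest_neighbour_distance, dtype=float)
    user = np.asarray(user_score, dtype=float)
    anom = np.asarray(anomaly_score, dtype=float)

    # Scale distances by their mean so alpha is independent of feature units.
    mean_d = d.mean()
    scaled = d / mean_d if mean_d > 0 else np.zeros_like(d)

    # Weight is 1 at zero distance and decays towards 0 far from labels.
    weight = np.exp(-alpha * scaled)
    return weight * user + (1.0 - weight) * anom


distances = np.array([0.0, 1.0, 50.0])
predicted_user = np.array([0.0, 0.0, 0.0])  # user is predicted to find these boring
ml_scores = np.array([5.0, 5.0, 5.0])       # algorithm finds them all anomalous
final = blend_scores(distances, predicted_user, ml_scores)
```

In this example the first instance sits on a labelled point, so its final score equals the predicted user score, while the distant third instance keeps most of its original anomaly score.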

combine_data_frames(features, ml_df)
Convenience function to correctly combine dataframes.

compute_nearest_neighbour(features_with_labels)
Calculates the distance of each instance to its nearest labelled neighbour.
Parameters: features_with_labels (pd.DataFrame) – A dataframe where the first columns are the features and the last two columns are ‘human_label’ and ‘score’ (the anomaly score from the ML algorithm).
Returns: Distance of each instance to its nearest labelled neighbour.
Return type: array
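
A minimal sketch of a nearest-labelled-neighbour distance, assuming the dataframe layout described above. The Euclidean metric, the brute-force approach, and the function name `nearest_labelled_distance` are assumptions for illustration; the page does not say how the method computes distances. Unlabelled rows are taken to have ‘human_label’ equal to -1, as documented for train_regression.

```python
import numpy as np
import pandas as pd


def nearest_labelled_distance(features_with_labels):
    """Hypothetical helper: Euclidean distance to the nearest labelled row."""
    # All columns except the last two ('human_label', 'score') are features.
    feature_cols = features_with_labels.columns[:-2]
    X = features_with_labels[feature_cols].to_numpy(dtype=float)
    labelled = features_with_labels["human_label"].to_numpy() != -1

    if not labelled.any():
        # No labels yet: every instance is infinitely far from a label.
        return np.full(len(X), np.inf)

    # Brute-force pairwise differences, shape (n_all, n_labelled, n_features).
    diffs = X[:, None, :] - X[labelled][None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    return dists.min(axis=1)


df = pd.DataFrame({
    "f1": [0.0, 3.0, 0.0],
    "f2": [0.0, 4.0, 1.0],
    "human_label": [4, -1, -1],  # only the first row has a human label
    "score": [0.2, 0.9, 0.5],
})
dist = nearest_labelled_distance(df)
```

The brute-force distance matrix is fine for small examples; for large feature sets a tree-based neighbour search would scale better.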

train_regression(features_with_labels)
Uses machine learning to predict the user score for all the data. The labels are provided in the column ‘human_label’, which must be -1 if no label exists.
Parameters: features_with_labels (pd.DataFrame) – A dataframe where the first columns are the features and the last two columns are ‘human_label’ and ‘score’ (the anomaly score from the ML algorithm).
Returns: The predicted user score for each instance.
Return type: array
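
The idea of training on the labelled rows and predicting a user score for every row might be sketched as follows. The choice of scikit-learn's RandomForestRegressor, its settings, and the function name `predict_user_scores` are assumptions; this page does not state which regressor train_regression uses.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor


def predict_user_scores(features_with_labels, random_state=0):
    """Hypothetical helper: fit on labelled rows, predict for all rows."""
    feature_cols = features_with_labels.columns[:-2]
    X = features_with_labels[feature_cols].to_numpy(dtype=float)
    y = features_with_labels["human_label"].to_numpy(dtype=float)

    # Rows with human_label == -1 have no label and are excluded from training.
    labelled = y != -1

    model = RandomForestRegressor(n_estimators=50, random_state=random_state)
    model.fit(X[labelled], y[labelled])
    return model.predict(X)


df = pd.DataFrame({
    "f1": [0.0, 0.1, 5.0, 5.1, 0.05, 5.05],
    "human_label": [0, 0, 5, 5, -1, -1],  # last two rows are unlabelled
    "score": [0.1, 0.2, 0.8, 0.9, 0.3, 0.7],
})
predicted = predict_user_scores(df)
```

Here the unlabelled row near the low-scored examples receives a lower predicted user score than the unlabelled row near the high-scored ones.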

class astronomaly.anomaly_detection.human_loop_learning.ScoreConverter(lower_is_weirder=True, new_min=0, new_max=5, convert_integer=False, column_name='score', **kwargs)

astronomaly.anomaly_detection.isolation_forest module
class astronomaly.anomaly_detection.isolation_forest.IforestAlgorithm(contamination='auto', **kwargs)
Bases: astronomaly.base.base_pipeline.PipelineStage
save_iforest_obj()
Stores the iforest object in the output directory to allow quick rerunning on new data.
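
The save-and-rerun pattern can be sketched with scikit-learn's IsolationForest and the standard pickle module. This shows the general idea only: the file name `iforest.pkl`, the temporary output directory, and the use of pickle are assumptions, and the page does not specify how save_iforest_obj serialises the model.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
features = rng.normal(size=(200, 3))

# Fit once on the original feature set.
forest = IsolationForest(contamination="auto", random_state=0)
forest.fit(features)
scores = forest.decision_function(features)  # lower means more anomalous

# Persist the fitted model (hypothetical location and file name).
output_dir = tempfile.mkdtemp()
model_path = os.path.join(output_dir, "iforest.pkl")
with open(model_path, "wb") as fh:
    pickle.dump(forest, fh)

# Later: reload and score new data without refitting.
with open(model_path, "rb") as fh:
    restored = pickle.load(fh)
new_data = rng.normal(size=(5, 3))
new_scores = restored.decision_function(new_data)
```

Pickled scikit-learn models should only be loaded with a compatible scikit-learn version and from trusted sources.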

astronomaly.anomaly_detection.lof module
class astronomaly.anomaly_detection.lof.LOF_Algorithm(contamination='auto', n_neighbors=20, **kwargs)
Bases: astronomaly.base.base_pipeline.PipelineStage
save_algorithm_obj()
Stores the LOF object in the output directory to allow quick rerunning on new data.
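
A sketch of reusing a fitted local outlier factor model on new data, using scikit-learn's LocalOutlierFactor with the same default parameters as the signature above. In scikit-learn, scoring unseen data requires `novelty=True`; whether LOF_Algorithm sets this flag is an assumption here, as are the pickle round trip and the toy data.

```python
import pickle

import numpy as np
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)
training = rng.normal(size=(100, 2))

# novelty=True lets the fitted model score data it was not trained on.
lof = LocalOutlierFactor(n_neighbors=20, contamination="auto", novelty=True)
lof.fit(training)

# Round-trip through pickle, standing in for saving to an output directory.
restored = pickle.loads(pickle.dumps(lof))

new_points = np.array([[0.0, 0.0], [8.0, 8.0]])
new_scores = restored.score_samples(new_points)  # lower means more anomalous
new_labels = restored.predict(new_points)        # 1 = inlier, -1 = outlier
```

The point at the centre of the training cloud is treated as an inlier, while the distant point gets a much lower score and is flagged as an outlier.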