astronomaly.base package
Submodules
astronomaly.base.base_dataset module

class astronomaly.base.base_dataset.Dataset(*args, **kwargs)
Bases: object

clean_up()
Allows for any clean-up tasks that might be required.

get_display_data(idx)
Returns a single instance of the dataset in a form that is ready to be displayed by the web front end.
Parameters: idx (str) – Index (should be a string to avoid ambiguity)
Raises: NotImplementedError – This function must be implemented when the base class is inherited.

get_sample(idx)
Returns a single instance of the dataset given an index.
Parameters: idx (str) – Index (should be a string to avoid ambiguity)
Raises: NotImplementedError – This function must be implemented when the base class is inherited.
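
The contract above (string indices, two methods that subclasses must override) can be illustrated with a minimal sketch. The `Dataset` class below is a stand-in written for this example, not the library's own class, and `InMemoryDataset` and its `samples` argument are hypothetical names.

```python
class Dataset:
    """Stand-in for the documented base class (assumption: only the three
    documented methods are modelled; the real constructor may do more)."""

    def __init__(self, *args, **kwargs):
        pass

    def clean_up(self):
        pass

    def get_sample(self, idx):
        raise NotImplementedError

    def get_display_data(self, idx):
        raise NotImplementedError


class InMemoryDataset(Dataset):
    """Hypothetical subclass that serves samples from a dict."""

    def __init__(self, samples, **kwargs):
        super().__init__(**kwargs)
        # Keys are stored as strings, matching the documented idx type.
        self.samples = {str(k): v for k, v in samples.items()}

    def get_sample(self, idx):
        return self.samples[str(idx)]

    def get_display_data(self, idx):
        # Return a plain dict as an example of a display-ready form.
        return {"index": str(idx), "values": list(self.samples[str(idx)])}


dataset = InMemoryDataset({"0": [1.0, 2.0], "1": [3.0, 4.0]})
sample = dataset.get_sample("1")
display = dataset.get_display_data("0")
```

Storing keys as strings means `get_sample("1")` and `get_sample(1)` resolve to the same entry, which is one way to avoid the ambiguity the parameter description warns about.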

astronomaly.base.base_pipeline module

class astronomaly.base.base_pipeline.PipelineStage(*args, **kwargs)
Bases: object

hash_data(data)
Returns a checksum on the first few rows of a DataFrame to allow checking if the input changed.
Parameters: data (pd.DataFrame or similar) – The input data on which to compute the checksum
Returns: checksum – The checksum
Return type: str
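
As a rough illustration of checksumming only the first few rows, the sketch below hashes a list of row tuples with the standard library. The helper name `hash_first_rows`, the choice of SHA-256, and the default of five rows are all assumptions made for this example; the documentation does not say which hash or row count the library uses.

```python
import hashlib


def hash_first_rows(rows, n_rows=5):
    """Return a hex checksum of the first n_rows rows (hypothetical helper)."""
    hasher = hashlib.sha256()
    for row in rows[:n_rows]:
        # repr() gives a stable text form for simple numeric or string rows.
        hasher.update(repr(tuple(row)).encode("utf-8"))
    return hasher.hexdigest()


rows = [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]
same = hash_first_rows(rows) == hash_first_rows(list(rows))
changed = hash_first_rows(rows) != hash_first_rows([(9.0, 2.0)] + rows[1:])
```

Hashing a prefix keeps the check cheap on large inputs, at the cost that a change beyond the hashed rows goes unnoticed.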

load(filename, file_format='')
Loads previous output of this pipeline stage.
Parameters:
Returns: output – The previous output of this stage.
Return type: pd.DataFrame

run(data)
This is the external-facing function that should always be called (rather than _execute_function). It automatically checks whether this stage has already been run with the same arguments and on the same data, which gives a much faster user experience by avoiding unnecessary reruns.
Parameters: data (pd.DataFrame) – Input data on which to run this pipeline stage.
Returns: Output
Return type: pd.DataFrame
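
The split between `run` and `_execute_function` can be sketched as a small caching wrapper. `CachedStage`, its `scale` argument, and the in-memory cache are hypothetical and stand in for whatever the library actually records. Plain lists replace DataFrames so that the example needs only the standard library.

```python
class CachedStage:
    """Hypothetical stage illustrating the run/_execute_function split."""

    def __init__(self, scale=1.0):
        self.scale = scale
        self._last_key = None     # (arguments, data checksum) of the last run
        self._last_output = None
        self.executions = 0       # counts real computations, for the demo

    def _execute_function(self, data):
        # The actual work; callers are expected to go through run() instead.
        self.executions += 1
        return [x * self.scale for x in data]

    def run(self, data):
        key = (self.scale, hash(tuple(data)))
        if key == self._last_key:
            return self._last_output  # same arguments and data: skip the work
        self._last_output = self._execute_function(data)
        self._last_key = key
        return self._last_output


stage = CachedStage(scale=2.0)
first = stage.run([1, 2, 3])
second = stage.run([1, 2, 3])   # served from the cache
third = stage.run([4, 5, 6])    # data changed, so the stage reruns
```

Calling `_execute_function` directly would bypass the check, which is presumably why the documentation steers callers to `run`.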

run_on_dataset(dataset=None)
This function should be called for pipeline stages that perform feature extraction and therefore take a Dataset object as input. This is an external-facing function that should always be called (rather than _execute_function). It automatically checks whether this stage has already been run with the same arguments and on the same data, which gives a much faster user experience by avoiding unnecessary reruns.
Parameters: dataset (Dataset) – The Dataset object on which to run this feature extraction function, by default None
Returns: Output
Return type: pd.DataFrame

astronomaly.base.logging_tools module

astronomaly.base.logging_tools.check_if_inputs_same(class_name, local_variables)
Reads the log to check if this function has already been called with the same arguments (this may still result in the function being rerun if the input data has changed).
Parameters:
Returns:
- args_same (bool) – True if the function was last called with the same arguments.
- checksum (int) – The checksum stored in the log file.

astronomaly.base.logging_tools.format_function_call(func_name, *args, **kwargs)
Formats a function call of a PipelineStage or Dataset object to ensure proper recording of the function and its arguments. args and kwargs should be exactly those passed to the function.
Parameters: func_name (str) – Name of the stage
Returns: Formatted function call
Return type: str
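
One plausible way to turn a call into a reproducible log string is sketched below. The output format, the helper name `format_call`, and the stage name `ExampleStage` are assumptions for illustration; the library's actual format is not documented here.

```python
def format_call(func_name, *args, **kwargs):
    """Render a call as text (hypothetical helper, illustrative format)."""
    parts = [repr(a) for a in args]
    # Sort keyword arguments so the same call always yields the same string.
    parts += [f"{k}={v!r}" for k, v in sorted(kwargs.items())]
    return f"{func_name}({', '.join(parts)})"


formatted = format_call("ExampleStage", 10, threshold=0.95, force_rerun=False)
# formatted == "ExampleStage(10, force_rerun=False, threshold=0.95)"
```

Sorting the keyword arguments matters when the string is later compared against a logged call, since two calls that differ only in keyword order should match.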

astronomaly.base.logging_tools.log(msg, level='INFO')
Logs a message, ensuring the logger has been set up first.
Parameters:

astronomaly.base.logging_tools.setup_logger(log_directory='', log_filename='astronomaly.log')
Ensures the system logger is set up correctly. If a FileHandler logger has already been attached to the current logger, nothing new is done.
Parameters:
Returns: The Logger object
Return type: Logger
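
The "nothing new is done" behaviour can be approximated with the standard `logging` module, as in this minimal sketch. The names `setup_file_logger` and `log_message`, the logger name `example`, and the message format are assumptions for illustration and are not taken from the library.

```python
import logging
import os
import tempfile


def setup_file_logger(log_directory, log_filename="example.log", name="example"):
    """Return a logger with exactly one FileHandler (hypothetical helper)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Attach a FileHandler only if none is present, so repeated calls are safe.
    if not any(isinstance(h, logging.FileHandler) for h in logger.handlers):
        handler = logging.FileHandler(os.path.join(log_directory, log_filename))
        handler.setFormatter(
            logging.Formatter("%(asctime)s - %(levelname)s - %(message)s"))
        logger.addHandler(handler)
    return logger


def log_message(logger, msg, level="INFO"):
    # Map the level name to its numeric value; unknown names fall back to INFO.
    logger.log(getattr(logging, level.upper(), logging.INFO), msg)


log_dir = tempfile.mkdtemp()
logger = setup_file_logger(log_dir)
logger_again = setup_file_logger(log_dir)   # no second handler is attached
log_message(logger, "pipeline stage finished")
```

Without the handler check, each call would add another FileHandler and every message would be written to the file once per handler.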