astronomaly.base package

Submodules

astronomaly.base.base_dataset module

class astronomaly.base.base_dataset.Dataset(*args, **kwargs)

Bases: object

clean_up()

Allows for any clean up tasks that might be required.

get_display_data(idx)

Returns a single instance of the dataset in a form that is ready to be displayed by the web front end.

Parameters:idx (str) – Index (should be a string to avoid ambiguity)
Raises:NotImplementedError – This function must be implemented when the base class is inherited.
get_sample(idx)

Returns a single instance of the dataset given an index.

Parameters:idx (str) – Index (should be a string to avoid ambiguity)
Raises:NotImplementedError – This function must be implemented when the base class is inherited.
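The contract described above (subclasses must override get_sample and get_display_data) can be sketched as follows. This is a minimal illustration, not Astronomaly's implementation: the Dataset class below is a local stand-in so the example runs on its own, and InMemoryDataset and its dict-based storage are hypothetical.

```python
class Dataset:
    """Local stand-in for the documented base class (illustrative only)."""

    def __init__(self, *args, **kwargs):
        pass

    def clean_up(self):
        # Nothing to clean up in this stand-in.
        pass

    def get_sample(self, idx):
        raise NotImplementedError

    def get_display_data(self, idx):
        raise NotImplementedError


class InMemoryDataset(Dataset):
    """Hypothetical subclass that keeps samples in a dict keyed by string index."""

    def __init__(self, samples, **kwargs):
        super().__init__(**kwargs)
        # Store keys as strings, following the "index should be a string" advice.
        self.samples = {str(k): v for k, v in samples.items()}

    def get_sample(self, idx):
        return self.samples[str(idx)]

    def get_display_data(self, idx):
        # Return something a front end could serialise, here a plain dict.
        return {"index": str(idx), "values": list(self.samples[str(idx)])}


dataset = InMemoryDataset({0: (1.0, 2.0), 1: (3.0, 4.0)})
sample = dataset.get_sample("1")
display = dataset.get_display_data("0")
```

Converting every key to a string at construction time means "1" and 1 refer to the same sample, which is the ambiguity the parameter description warns about.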

astronomaly.base.base_pipeline module

class astronomaly.base.base_pipeline.PipelineStage(*args, **kwargs)

Bases: object

hash_data(data)

Returns a checksum on the first few rows of a DataFrame to allow checking if the input changed.

Parameters:data (pd.DataFrame or similar) – The input data on which to compute the checksum
Returns:checksum – The checksum
Return type:str
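The idea of checksumming only the first few rows can be sketched with the standard library. This is an assumption-laden illustration: a list of row tuples stands in for the DataFrame, and hash_rows, its n_rows parameter, and the choice of SHA-256 are all hypothetical rather than taken from Astronomaly.

```python
import hashlib


def hash_rows(rows, n_rows=5):
    """Return a hex checksum of the first n_rows rows (hypothetical helper)."""
    head = rows[:n_rows]
    # repr() gives a deterministic text form for tuples of numbers and strings.
    payload = repr(head).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


rows = [(0, 1.5), (1, 2.5), (2, 3.5)]
same = hash_rows(rows) == hash_rows(list(rows))
changed = hash_rows(rows) != hash_rows([(0, 9.9)] + rows[1:])
```

Hashing only the head of the data keeps the check cheap, at the cost that a change confined to later rows would go unnoticed in this sketch.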
load(filename, file_format='')

Loads previous output of this pipeline stage.

Parameters:
  • filename (str) – File name of the output file.
  • file_format (str, optional) – File format can be provided to override the class’s file format
Returns:output – Whatever the output is of this stage.
Return type:pd.DataFrame

run(data)

This is the external-facing function that should always be called (rather than _execute_function). It automatically checks whether this stage has already been run with the same arguments and on the same data, which avoids rerunning functions unnecessarily and gives a much faster user experience.

Parameters:data (pd.DataFrame) – Input data on which to run this pipeline stage.
Returns:Output
Return type:pd.DataFrame
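The run-or-reuse behaviour described for run can be sketched as below. This is a simplified illustration under stated assumptions: CachedStage, its scale argument, and the in-memory cache are hypothetical, and a plain list stands in for the DataFrame. The documented logging tools read the log file for this check, whereas the sketch keeps its state in memory.

```python
import hashlib


class CachedStage:
    """Hypothetical stage that skips work when arguments and data are unchanged."""

    def __init__(self, scale=1.0):
        self.scale = scale
        self.n_executions = 0
        self._last_key = None
        self._last_output = None

    def _execute_function(self, data):
        # The actual work; callers are expected to use run() instead.
        self.n_executions += 1
        return [x * self.scale for x in data]

    def run(self, data):
        # Checksum of the first few items, combined with the stage arguments.
        checksum = hashlib.sha256(repr(data[:5]).encode("utf-8")).hexdigest()
        key = (self.scale, checksum)
        if key == self._last_key:
            return self._last_output
        self._last_output = self._execute_function(data)
        self._last_key = key
        return self._last_output


stage = CachedStage(scale=2.0)
first = stage.run([1, 2, 3])
second = stage.run([1, 2, 3])  # same arguments and data: cached result
third = stage.run([4, 5, 6])   # data changed: executed again
```

Routing every call through run is what makes the cache effective; calling _execute_function directly would bypass the check.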
run_on_dataset(dataset=None)

This function should be called for pipeline stages that perform feature extraction and therefore take a Dataset object as input. Like run, it is an external-facing function that should always be called (rather than _execute_function). It automatically checks whether this stage has already been run with the same arguments and on the same data, which avoids rerunning functions unnecessarily and gives a much faster user experience.

Parameters:dataset (Dataset) – The Dataset object on which to run this feature extraction function, by default None
Returns:Output
Return type:pd.DataFrame
save(output, filename, file_format='')

Saves the output of this pipeline stage.

Parameters:
  • output (pd.DataFrame) – Whatever the output is of this stage.
  • filename (str) – File name of the output file.
  • file_format (str, optional) – File format can be provided to override the class’s file format
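The file_format override shared by save and load can be sketched as a round trip. Everything here is an assumption made for illustration: FileBackedStage, its class-level file_format default, and the csv/json formats are hypothetical, and lists of string rows stand in for a DataFrame. The formats Astronomaly actually supports are not stated in this section.

```python
import csv
import json
import os
import tempfile


class FileBackedStage:
    """Hypothetical stage whose default file format can be overridden per call."""

    file_format = "csv"  # assumed class-level default

    def save(self, output, filename, file_format=""):
        fmt = file_format or self.file_format
        if fmt == "csv":
            with open(filename, "w", newline="") as f:
                csv.writer(f).writerows(output)
        elif fmt == "json":
            with open(filename, "w") as f:
                json.dump(output, f)
        else:
            raise ValueError(f"Unsupported file format: {fmt}")

    def load(self, filename, file_format=""):
        fmt = file_format or self.file_format
        if fmt == "csv":
            with open(filename, newline="") as f:
                return [row for row in csv.reader(f)]
        if fmt == "json":
            with open(filename) as f:
                return json.load(f)
        raise ValueError(f"Unsupported file format: {fmt}")


rows = [["a", "1"], ["b", "2"]]
stage = FileBackedStage()
with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "output.csv")
    json_path = os.path.join(tmp, "output.json")
    stage.save(rows, csv_path)                       # uses the class default
    stage.save(rows, json_path, file_format="json")  # per-call override
    csv_rows = stage.load(csv_path)
    json_rows = stage.load(json_path, file_format="json")
```

An empty string as the default argument lets the method fall back to the class-level format without needing a separate sentinel value.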

astronomaly.base.logging_tools module

astronomaly.base.logging_tools.check_if_inputs_same(class_name, local_variables)

Reads the log to check if this function has already been called with the same arguments (this may still result in the function being rerun if the input data has changed).

Parameters:
  • class_name (str) – Name of PipelineStage
  • local_variables (dict) – Dictionary of all local variables.
Returns:

  • args_same (bool) – True if the function was last called with the same arguments.
  • checksum (int) – The checksum stored in the log file.

astronomaly.base.logging_tools.format_function_call(func_name, *args, **kwargs)

Formats a function of a PipelineStage or Dataset object to ensure proper recording of the function and its arguments. args and kwargs should be exactly those passed to the function.

Parameters:func_name (str) – Name of the stage
Returns:Formatted function call
Return type:str
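How a formatted call string might support the "same arguments as last time" check can be sketched as follows. This is not Astronomaly's log format: format_call, last_call_matches, the "INFO " prefix, and the stage names are hypothetical, chosen only to show the comparison.

```python
def format_call(func_name, *args, **kwargs):
    """Render a call as text, e.g. name(1, key='value') (hypothetical format)."""
    parts = [repr(a) for a in args]
    # Sort keyword arguments so the same call always renders identically.
    parts += [f"{k}={v!r}" for k, v in sorted(kwargs.items())]
    return f"{func_name}({', '.join(parts)})"


def last_call_matches(log_lines, func_name, *args, **kwargs):
    """True if the most recent logged call to func_name has the same arguments."""
    current = format_call(func_name, *args, **kwargs)
    for line in reversed(log_lines):
        if f"{func_name}(" in line:
            return line.endswith(current)
    return False


formatted = format_call("MyStage", 3, window="hann", nbins=10)
log_lines = ["INFO MyStage(3, nbins=5, window='hann')",
             "INFO OtherStage()",
             "INFO " + formatted]
matches = last_call_matches(log_lines, "MyStage", 3, nbins=10, window="hann")
differs = last_call_matches(log_lines, "MyStage", 3, nbins=5, window="hann")
```

Only the most recent entry for the stage is compared, so an older call with matching arguments does not count as "the same as last time".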
astronomaly.base.logging_tools.log(msg, level='INFO')

Logs a message, ensuring the logger has been set up first.

Parameters:
  • msg (str) – Log message
  • level (str, optional) – DEBUG, INFO, WARNING or ERROR, by default ‘INFO’
astronomaly.base.logging_tools.setup_logger(log_directory='', log_filename='astronomaly.log')

Ensures the system logger is set up correctly. If a FileHandler logger has already been attached to the current logger, nothing new is done.

Parameters:
  • log_directory (str, optional) – Location of log file, by default ‘’
  • log_filename (str, optional) – Log file name, by default “astronomaly.log”
Returns:The Logger object
Return type:Logger
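The "attach a FileHandler only once" behaviour, together with a log helper that sets up the logger before writing, can be sketched with the standard logging module. The function names, the logger name "example_pipeline", and the message format are assumptions for illustration, not Astronomaly's own code.

```python
import logging
import os
import tempfile


def setup_file_logger(log_directory="", log_filename="example.log"):
    """Return a logger, attaching a FileHandler only if none is present."""
    logger = logging.getLogger("example_pipeline")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)
    if not any(isinstance(h, logging.FileHandler) for h in logger.handlers):
        handler = logging.FileHandler(os.path.join(log_directory, log_filename))
        handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger


def log_message(msg, level="INFO", log_directory=""):
    """Log msg at the named level, setting up the logger first."""
    logger = setup_file_logger(log_directory)
    # Translate the level name ("DEBUG", "INFO", ...) to its numeric value.
    logger.log(getattr(logging, level), msg)


log_dir = tempfile.mkdtemp()
logger = setup_file_logger(log_dir)
setup_file_logger(log_dir)  # second call finds the handler and adds nothing
log_message("stage finished", level="WARNING", log_directory=log_dir)
n_file_handlers = sum(isinstance(h, logging.FileHandler) for h in logger.handlers)
with open(os.path.join(log_dir, "example.log")) as f:
    log_text = f.read()
```

Checking for an existing FileHandler before adding one prevents each message from being written several times when setup is called repeatedly.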
