yotse.utils package

Submodules

yotse.utils.blueprint_tools module

blueprint_tools.py.

This module provides helper functions for the NL blueprint experiment setup within the QIA project.

yotse.utils.blueprint_tools.create_separate_files_for_job(experiment: Experiment, datapoint_item: List[float], step_number: int, job_number: int) → List[Any]

Create separate parameter and configuration files for a job and prepare for execution.

Parameters:
  • experiment (Experiment) – The experiment object containing information about the experiment.

  • datapoint_item (List[float]) – A single item of data points for the job, represented as a list.

  • step_number (int) – The number of the step in the experiment.

  • job_number (int) – The number of the job within the step.

Returns:

job_cmdline – The command line arguments for running the job.

Return type:

list

Notes

This function creates separate parameter and configuration files for a job based on the provided experiment, datapoint item, step number, and job number. It returns the command line arguments for running the job as a list for use with QCG-Pilotjob.

The created files will be saved in the experiment’s directory, under a subdirectory for the step and job. The parameter file will have a name like “params_stepY_jobX.yaml” and the configuration file will have a name like “config_stepY_jobX.yaml”, where “X” is the job number and “Y” the step number.
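
The naming scheme described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the helper name job_file_paths and the "output/my_experiment" directory are hypothetical, and the stepY/jobX subdirectory layout is assumed from the directory tree shown for setup_optimization_dir.

```python
from pathlib import Path
from typing import Tuple


def job_file_paths(experiment_dir: str, step_number: int, job_number: int) -> Tuple[Path, Path]:
    """Return (param_file, config_file) paths for one job (hypothetical helper)."""
    # Assumed layout: <experiment_dir>/step<Y>/job<X>/
    job_dir = Path(experiment_dir) / f"step{step_number}" / f"job{job_number}"
    suffix = f"step{step_number}_job{job_number}.yaml"
    return job_dir / f"params_{suffix}", job_dir / f"config_{suffix}"


param_file, config_file = job_file_paths("output/my_experiment", step_number=2, job_number=5)
print(param_file.name)   # params_step2_job5.yaml
print(config_file.name)  # config_step2_job5.yaml
```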

yotse.utils.blueprint_tools.replace_include_param_file(configfile_name: str, paramfile_name: str) → None

Replace the INCLUDE keyword in a YAML config file with a reference to a parameter file.

Parameters:
  • configfile_name (str) – The name of the YAML configuration file to modify.

  • paramfile_name (str) – The name of the parameter file to include in the configuration file.

Notes

The function loads the YAML config file, searches it recursively for an INCLUDE keyword, and replaces it with a reference to the specified parameter file. If the INCLUDE keyword is not found, an error is raised.
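
The recursive search can be illustrated on an already-loaded config, represented here as nested dicts and lists. This is only a sketch under assumptions: the helper name replace_include is hypothetical, the YAML loading and dumping steps are omitted, the reference is stored as a plain string, and ValueError stands in for whatever error the actual function raises.

```python
from typing import Any


def replace_include(node: Any, paramfile_name: str, keyword: str = "INCLUDE") -> bool:
    """Recursively replace the value stored under `keyword`; return True if it was found."""
    found = False
    if isinstance(node, dict):
        for key, value in node.items():
            if key == keyword:
                # Overwriting an existing key's value is safe while iterating.
                node[key] = paramfile_name
                found = True
            elif replace_include(value, paramfile_name, keyword):
                found = True
    elif isinstance(node, list):
        for item in node:
            if replace_include(item, paramfile_name, keyword):
                found = True
    return found


# Hypothetical config content, for illustration only.
config = {"network": {"INCLUDE": "old_params.yaml", "nodes": [{"name": "alice"}]}}
if not replace_include(config, "params_step0_job1.yaml"):
    raise ValueError("INCLUDE keyword not found in config")  # assumed error type
print(config["network"]["INCLUDE"])  # params_step0_job1.yaml
```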

yotse.utils.blueprint_tools.represent_scalar_node(dumper: Dumper, data: ScalarNode) → ScalarNode

Represent a ScalarNode object as a scalar value in a YAML file.

Parameters:
  • dumper (yaml.Dumper) – The YAML dumper object being used to write the file.

  • data (yaml.ScalarNode) – The ScalarNode object being represented.

Returns:

scalar – The scalar value of the ScalarNode object.

Return type:

str

yotse.utils.blueprint_tools.setup_optimization_dir(experiment: Experiment, step_number: int, job_number: int) → None

Create the directory structure for an optimization step.

Parameters:
  • experiment (Experiment) – The Experiment object for which the directory structure should be set up.

  • step_number (int) – The number of the current optimization step.

  • job_number (int) – The number of the job within the optimization step.

Notes

This function creates the directory structure for an optimization step within an experiment. The structure includes a src directory containing several files related to the optimization, and an output directory containing directories for each step and job. The function does not return anything but modifies the file system to create the necessary directories.

The directory structure for the optimization step is as follows (for m optimization steps and n jobs):

src/
├── unified_script.py
├── processing_function.py
├── config.yaml
├── baseline_params.yaml
├── qiapt_runscript.py
output/
├── experiment_name_timestamp_str/
│   ├── step0/
│   │   ├── job0/
│   │   │   ├── stdout0.txt
│   │   │   ├── dataframe_holder.pickle
│   │   │   ├── baseline_params_job0.yaml
│   │   │   └── config_job0.yaml
│   │   ...
│   │   └── jobn/
│   ...
│   └── stepm/
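
The output part of this layout can be reproduced with a few lines of pathlib. This is a sketch under assumptions, not the function's actual code: make_job_dir is a hypothetical helper, the directories are created in a temporary location, and the src directory and its files are left out.

```python
import tempfile
from pathlib import Path


def make_job_dir(output_root: Path, experiment_dir_name: str, step_number: int, job_number: int) -> Path:
    """Create output/<experiment>/step<m>/job<n> and return its path (hypothetical helper)."""
    job_dir = output_root / experiment_dir_name / f"step{step_number}" / f"job{job_number}"
    # parents=True creates the step and experiment directories as needed.
    job_dir.mkdir(parents=True, exist_ok=True)
    return job_dir


with tempfile.TemporaryDirectory() as tmp:
    output_root = Path(tmp) / "output"
    created = [make_job_dir(output_root, "experiment_name_timestamp_str", 0, job) for job in range(3)]
    all_exist = all(path.is_dir() for path in created)
    relative = sorted(path.relative_to(output_root).as_posix() for path in created)

print(relative)
```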

yotse.utils.blueprint_tools.update_yaml_params(param_list: List[Tuple[str, Any]], paramfile_name: str) → None

Update parameter values in a YAML file and save the updated file.

Parameters:
  • param_list (List[Tuple[str, Any]]) – A list of tuples containing parameter names and their updated values.

  • paramfile_name (str) – The name of the YAML file containing the parameters to update.
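
The shape of param_list can be illustrated on an in-memory mapping. This is a rough sketch, not the library's code: apply_param_updates is a hypothetical helper, the parameter names are invented for the example, a flat mapping of names to values is assumed, and reading and writing the YAML file is omitted.

```python
from typing import Any, Dict, List, Tuple


def apply_param_updates(params: Dict[str, Any], param_list: List[Tuple[str, Any]]) -> Dict[str, Any]:
    """Return a copy of `params` with each (name, value) pair applied (hypothetical helper)."""
    updated = dict(params)
    for name, value in param_list:
        updated[name] = value
    return updated


# Hypothetical parameter names, for illustration only.
baseline = {"num_repeaters": 1, "link_fidelity": 0.9}
updated = apply_param_updates(baseline, [("link_fidelity", 0.95), ("num_repeaters", 2)])
print(updated)  # {'num_repeaters': 2, 'link_fidelity': 0.95}
```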

yotse.utils.prediction module

predict_module.py.

This module provides a Predict class for learning and predicting using different regression models.

class yotse.utils.prediction.Predict(model_name: str = 'LR')

Bases: object

A class for learning and predicting using Linear Regression (LR), Bayesian Ridge (BR), or SGDRegressor (AR) models.

Parameters:

model_name (str, optional) – Regression model name (“LR” for Linear Regression, “BR” for Bayesian Ridge, “AR” for SGDRegressor), by default “LR”.

model

The regression model.

Type:

sklearn.base.BaseEstimator

learn(x: np.ndarray, y: np.ndarray) → None

Executes the learning process.

predict(x_new: np.ndarray) → np.ndarray

Predicts value(s) using the linear model.

__init__(model_name: str = 'LR')

Default constructor.

Parameters:

model_name (str) – Regression model name.

learn(x: ndarray, y: ndarray) → None

Execute the learning process.

Parameters:
  • x (ndarray) – Training data of shape (n_samples, n_features).

  • y (ndarray) – Target values of shape (n_samples,).

Return type:

None

predict(x_new: ndarray) → ndarray

Predict value(s) using the linear model.

Parameters:

x_new (ndarray) – Samples of shape (n_samples, n_features).

Returns:

Mean of the predictive distribution of the query points.

Return type:

ndarray

yotse.utils.utils module

utils.py.

This module provides functions for file processing, including searching for files with a specific extension in a directory, combining the content of multiple CSV, JSON, or pickle files into a single pandas DataFrame, and converting between NumPy arrays and Python lists.

Functions

get_files_by_extension(directory: str, extension: str) -> List[str]:

Returns a list of files in the given directory with the specified extension.

file_list_to_single_df(files: List[str], extension: str) -> pandas.DataFrame:

Reads CSV, JSON, or pickle files from a list and combines their content into a single pandas DataFrame.

ndarray_to_list(numpy_array: np.ndarray) -> List[Any]:

Convert a NumPy array to a Python list.

list_to_numpy_array(list_data: List[Any]) -> np.ndarray:

Convert a Python list to a NumPy array.

yotse.utils.utils.file_list_to_single_df(files: List[str], extension: str) → DataFrame

Reads CSV, JSON, or pickle files from a list and combines their content into a single pandas DataFrame.

Parameters:
  • files (List[str]) – A list of files to read.

  • extension (str) – File extension of the files in the list.

Returns:

A pandas DataFrame containing the combined contents of all the files.

Return type:

pandas.DataFrame

yotse.utils.utils.get_files_by_extension(directory: str, extension: str) → List[str]

Returns a list of files in the given directory with the specified extension.

Parameters:
  • directory (str) – The directory to search for files in.

  • extension (str) – The file extension to search for.

Returns:

A list of paths to the files in the given directory that have the specified extension.

Return type:

List[str]
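
A comparable lookup can be written with the standard library alone. This is a minimal sketch, not the function's actual code: files_with_extension is a hypothetical helper, and accepting the extension with or without a leading dot is an assumption made for the example.

```python
import os
import tempfile
from typing import List


def files_with_extension(directory: str, extension: str) -> List[str]:
    """Return sorted paths of the files in `directory` ending in `extension` (hypothetical helper)."""
    suffix = "." + extension.lstrip(".")  # accept "csv" as well as ".csv"
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.endswith(suffix) and os.path.isfile(os.path.join(directory, name))
    )


with tempfile.TemporaryDirectory() as tmp:
    # Create a few empty files to search through.
    for name in ("a.csv", "b.csv", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    found = files_with_extension(tmp, "csv")
    found_names = [os.path.basename(path) for path in found]

print(found_names)  # ['a.csv', 'b.csv']
```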

yotse.utils.utils.list_to_numpy_array(list_data: List[Any]) → ndarray

Convert a Python list to a NumPy array.

Parameters:

list_data (List[Any]) – The Python list to be converted.

Returns:

A NumPy array containing the elements of the input Python list.

Return type:

np.ndarray

yotse.utils.utils.ndarray_to_list(numpy_array: ndarray) → List[Any]

Convert a NumPy array to a Python list.

Parameters:

numpy_array (np.ndarray) – The NumPy array to be converted.

Returns:

A Python list containing the elements of the input NumPy array.

Return type:

List[Any]

Module contents