yotse.optimization package
Submodules
yotse.optimization.blackbox_algorithms module
Collection of subclasses of GenericOptimization implementing different optimization algorithms.
- class yotse.optimization.blackbox_algorithms.BayesOpt(blackbox_optimization: bool, pbounds: Dict[Any, Tuple[int, int]], initial_data_points: ndarray | None = None, naive_parallelization: bool = False, grid_size: int = 1, refinement_factors: List[float] | None = None, fitness_func: Callable[[...], float] | None = None, logging_level: int = 1, **bayesopt_kwargs: Any)[source]
Bases:
GenericOptimization
Bayesian optimization.
- Parameters:
blackbox_optimization (bool) – Whether this is used as a blackbox optimization.
pbounds (dict) – Dictionary with parameter names as keys and (minimum, maximum) tuples as values.
initial_data_points (np.ndarray (optional)) – Initial population of data points to start the optimization with. If none are specified, the algorithm suggests a point. Defaults to None.
fitness_func (function (optional)) – Fitness/objective/cost function/function to optimize. Only needed if blackbox_optimization=False. Default is None.
logging_level (int (optional)) – Level of logging: 1 - only essential data; 2 - include plots; 3 - dump everything. Defaults to 1.
bayesopt_kwargs ((optional)) – Optional arguments to be passed to bayes_opt.BayesianOptimization. See the documentation of that class for more info.
- MAXIMUM = 0
- MINIMUM = 1
- __init__(blackbox_optimization: bool, pbounds: Dict[Any, Tuple[int, int]], initial_data_points: ndarray | None = None, naive_parallelization: bool = False, grid_size: int = 1, refinement_factors: List[float] | None = None, fitness_func: Callable[[...], float] | None = None, logging_level: int = 1, **bayesopt_kwargs: Any) None [source]
Initialize Bayesian optimization.
- create_points_around_suggestion(suggested_point: Dict[str, float]) ndarray [source]
BayesOpt only suggests a single point per iteration. In order to parallelize the optimization we use this function to create multiple data points to evaluate per iteration.
Note: The current implementation is by no means optimal. It is the most naive and simple way to create multiple points to evaluate and should be improved in the future.
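A minimal sketch of the naive idea described above, not the library's implementation: starting from one suggested point, build a small grid of neighbours by offsetting each parameter by a fraction of its range and clipping to the bounds. The helper name points_around, the rel_step parameter and the grid layout are assumptions made for illustration.

```python
from itertools import product
from typing import Dict, List, Tuple


def points_around(
    suggested: Dict[str, float],
    pbounds: Dict[str, Tuple[float, float]],
    grid_size: int = 3,
    rel_step: float = 0.1,
) -> List[List[float]]:
    """Return grid_size**n_params points centred on the suggested point.

    Hypothetical helper: offsets are multiples of rel_step * (max - min),
    and every coordinate is clipped to its bounds.
    """
    axes = []
    half = (grid_size - 1) / 2
    for name, centre in suggested.items():
        low, high = pbounds[name]
        step = rel_step * (high - low)
        # Symmetric offsets around the centre, e.g. -1, 0, +1 steps for grid_size=3.
        axis = [min(high, max(low, centre + (i - half) * step)) for i in range(grid_size)]
        axes.append(axis)
    # The Cartesian product of the per-parameter axes gives the grid.
    return [list(p) for p in product(*axes)]


points = points_around({"x": 0.5, "y": 2.0}, {"x": (0.0, 1.0), "y": (0.0, 4.0)})
```

With two parameters and grid_size=3 this yields nine points, one of which is the suggested point itself, so all of them can be evaluated in parallel within one iteration.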
- property current_datapoints: ndarray
Return the current datapoints that will be used if an optimization is started now.
In this case it is the currently suggested point.
- get_best_solution() Tuple[List[float], float, int] [source]
Get the best solution. Should be implemented in every derived class.
- Returns:
Solution, its fitness, and its index in the list of data points.
- Return type:
solution, solution_fitness, solution_idx
- get_function() Callable[[...], float]
Returns the cost function.
- get_new_points() ndarray [source]
Get new points from the BayesianOptimization instance. This is done via the suggest function.
- Returns:
new_points – New points for the next iteration of the optimization.
- Return type:
np.ndarray
- input_params_to_cost_value(solution: List[float], solution_idx: int) Any
Return value of cost function for given set of input parameter values and their index in the set of points.
- Parameters:
solution (list) – Set of input parameter values of shape [param_1, param_2, .., param_n].
solution_idx (int) – Index of the solution within the set of points.
- property max_iterations: int
Return the maximum number of iterations of the Bayesian optimization.
- overwrite_internal_data_points(data_points: ndarray) None [source]
After we generated a new point with get_new_points this function can be used to write that data point (or another point to investigate next) to the class.
- update_internal_cost_data(data: DataFrame) None
Update internal dataframe mapping input parameters to the associated cost from input data.
- Parameters:
data (pandas.DataFrame) – A pandas dataframe containing the collected data in the format cost_value init_param_1 … init_param_n.
- class yotse.optimization.blackbox_algorithms.GAOpt(blackbox_optimization: bool, initial_data_points: ndarray, num_generations: int, num_parents_mating: int, gene_space: ConstraintDict | None = None, refinement_factors: List[float] | None = None, logging_level: int = 1, allow_duplicate_genes: bool = False, fitness_func: Callable[[...], float] | None = None, **pygad_kwargs: Any)[source]
Bases:
GenericOptimization
Genetic algorithm.
- Parameters:
blackbox_optimization (bool) – Whether this is used as a blackbox optimization.
initial_data_points (np.ndarray) – Initial population of data points to start the optimization with.
num_generations (int) – Number of generations in the genetic algorithm.
num_parents_mating (int) – Number of solutions to be selected as parents in the genetic algorithm.
fitness_func (function (optional)) – Fitness/objective/cost function/function to optimize. Only needed if blackbox_optimization=False. Default is None.
gene_space (dict or list (optional)) – Dictionary with constraints. Keys can be ‘low’, ‘high’ and ‘step’. Alternatively list with acceptable values or list of dicts. If only single object is passed it will be applied for all input parameters, otherwise a separate list or dict has to be supplied for each parameter. Defaults to None.
refinement_factors (list (optional)) – Refinement factors for each active parameter in the optimization in range [0.,1.] to be used for manual grid point generation. Defaults to None.
logging_level (int (optional)) – Level of logging: 1 - only essential data; 2 - include plots; 3 - dump everything. Defaults to 1.
allow_duplicate_genes (bool (optional)) – If True, then a solution/chromosome may have duplicate gene values. If False, then each gene will have a unique value in its solution. Defaults to False.
pygad_kwargs ((optional)) – Optional pygad arguments to be passed to pygad.GA. See https://pygad.readthedocs.io/en/latest/README_pygad_ReadTheDocs.html#pygad-ga-class for documentation.
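The gene_space formats described above can be illustrated as follows. The dictionaries and lists follow the layout given in the parameter description; the helper allowed is hypothetical and is not part of yotse or PyGAD. The sketch treats 'high' as exclusive, which is an assumption here.

```python
from typing import Dict, List, Union

GeneSpaceEntry = Union[Dict[str, float], List[float]]

# A single dict, applied to all input parameters.
shared_space: GeneSpaceEntry = {"low": 0, "high": 10, "step": 2}

# One entry per parameter: a range for the first, explicit values for the second.
per_param_space: List[GeneSpaceEntry] = [
    {"low": 0.0, "high": 1.0},
    [1, 2, 4, 8],
]


def allowed(value: float, entry: GeneSpaceEntry) -> bool:
    """Hypothetical check: is `value` admissible under one gene_space entry?"""
    if isinstance(entry, dict):
        # Assumption: 'low' is inclusive and 'high' is exclusive.
        if not entry["low"] <= value < entry["high"]:
            return False
        step = entry.get("step")
        # With a step, only low, low + step, low + 2 * step, ... are admissible.
        return step is None or (value - entry["low"]) % step == 0
    return value in entry
```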
- constraints
Constraints to check for during generation of new points.
- Type:
dict or list
- MAXIMUM = 0
- MINIMUM = 1
- __init__(blackbox_optimization: bool, initial_data_points: ndarray, num_generations: int, num_parents_mating: int, gene_space: ConstraintDict | None = None, refinement_factors: List[float] | None = None, logging_level: int = 1, allow_duplicate_genes: bool = False, fitness_func: Callable[[...], float] | None = None, **pygad_kwargs: Any)[source]
Initialize the GAOpt object.
- _objective_func(ga_instance: pygad.pygad.GA, solution: List[float], solution_idx: int) float [source]
Fitness function to be called from PyGAD.
Wrapper around the actual function to give pygad some more functionality. First, it adds the possibility to choose whether to max-/minimize the fitness. Second, it removes the necessity to pass the ga_instance to the function, thus making the implementation more general.
- Parameters:
ga_instance – Instance of pygad.GA.
solution (List[float]) – List of solutions.
solution_idx (int) – Index of solution.
- Return type:
Fitness value.
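A minimal sketch of the two roles of this wrapper, assuming that minimization is obtained by negating the cost for an engine that maximizes fitness. The names make_fitness_wrapper and sphere are hypothetical; only the three-argument call shape (ga_instance, solution, solution_idx) is taken from the signature above.

```python
from typing import Any, Callable, List

MAXIMUM = 0
MINIMUM = 1


def make_fitness_wrapper(
    cost: Callable[[List[float], int], float], extrema: int
) -> Callable[[Any, List[float], int], float]:
    """Wrap a two-argument cost function into the three-argument form.

    Hypothetical sketch: a maximizing engine is made to minimize by negating
    the cost, and the engine instance passed as first argument is ignored.
    """
    sign = -1.0 if extrema == MINIMUM else 1.0

    def fitness(ga_instance: Any, solution: List[float], solution_idx: int) -> float:
        return sign * cost(solution, solution_idx)

    return fitness


def sphere(solution: List[float], solution_idx: int) -> float:
    """Example cost function that does not need the engine instance."""
    return sum(v * v for v in solution)


fitness = make_fitness_wrapper(sphere, MINIMUM)
value = fitness(None, [1.0, 2.0], 0)
```

Because the cost function never sees ga_instance, the same function can be reused with an engine that has a different callback signature.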
- property current_datapoints: ndarray
Return the current datapoints that will be used if an optimization is started now.
In this case it is the population.
- get_best_solution() Tuple[List[float], None, None] [source]
Get the best solution. We don’t yet know the fitness for the solution (because we have not run the simulation for those values yet), so just return the point.
- Returns:
Solution, its fitness, and its index in the list of cost function solutions.
- Return type:
solution, solution_fitness, solution_idx
- get_function() Callable[[...], float]
Returns the cost function.
- get_new_points() ndarray [source]
Get new points from the GA (aka return the next population).
- Returns:
new_points – New points for the next iteration of the optimization.
- Return type:
np.ndarray
- input_params_to_cost_value(solution: List[float], solution_idx: int) Any
Return value of cost function for given set of input parameter values and their index in the set of points.
- Parameters:
solution (list) – Set of input parameter values of shape [param_1, param_2, .., param_n].
solution_idx (int) – Index of the solution within the set of points.
- property max_iterations: int
Return the maximum number of iterations of the GA, i.e. the number of generations.
- overwrite_internal_data_points(data_points: ndarray) None [source]
Overwrite internal GA population with new datapoints from experiment.
- update_internal_cost_data(data: DataFrame) None
Update internal dataframe mapping input parameters to the associated cost from input data.
- Parameters:
data (pandas.DataFrame) – A pandas dataframe containing the collected data in the format cost_value init_param_1 … init_param_n.
yotse.optimization.fitting module
Module: my_func_fit
This module provides a class, FuncFit, for fitting polynomials to data and calculating errors and chi-square values.
- Classes:
FuncFit: A class for polynomial fitting and error/chi-square calculation.
- Functions:
add_noise: Static method to add noise to an array.
chi_square: Static method to calculate chi-square.
err: Static method to calculate error.
find_poly_fit: Method to find the best-fitting polynomial for given data.
- class yotse.optimization.fitting.FuncFit[source]
Bases:
object
FuncFit class for fitting polynomials, calculating errors, and chi-square values.
- - add_noise
Static method to add noise to an array.
- - chi_square
Static method to calculate chi-square.
- - err
Static method to calculate error.
- - find_poly_fit
Method to find the best-fitting polynomial for given data.
- static add_noise(arr: ndarray, level: float = 0.2) ndarray [source]
Add noise to the given array.
- Parameters:
arr (numpy.ndarray) – Array of values.
level (float, optional) – Noise level. Default is 0.2.
- Returns:
Input array with added noise.
- Return type:
numpy.ndarray
- static chi_square(poly: Polynomial, x: List[float], y: List[float]) Any [source]
Calculate chi-square.
- Parameters:
poly (numpy.polynomial.Polynomial) – Polynomial (numpy.polynomial.Polynomial object).
x (list of float) – X-points of the original dataset.
y (list of float) – Y-points of the original dataset.
- Returns:
Chi-square.
- Return type:
Any
- static err(poly: poly1d, x: List[float], y: List[float]) Any [source]
Calculate error.
- Parameters:
poly (numpy.poly1d) – Polynomial (numpy.poly1d object).
x (list of float) – X-points of the original dataset.
y (list of float) – Y-points of the original dataset.
- Returns:
Error.
- Return type:
Any
- find_poly_fit(x: List[float], y: List[float], max_order: int = 41) List[Tuple[Any, Any]] [source]
Find a polynomial that fits the given data best.
- Parameters:
x (list of float) – X-points.
y (list of float) – Y-points.
max_order (int, optional) – Maximum order of a polynomial. Default is 41.
- Returns:
A list of tuples containing polynomials and their corresponding errors. The list is sorted in ascending order using the error values.
- Return type:
list of tuple
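The ranking described above (fit several polynomial orders, pair each fit with its error, sort ascending) can be sketched as follows. This is an illustration under assumptions, not the FuncFit implementation: the function name rank_poly_fits is hypothetical, and the error is taken to be the sum of squared residuals, which may differ from the library's err and chi_square definitions.

```python
from typing import List, Tuple

import numpy as np
from numpy.polynomial import Polynomial


def rank_poly_fits(
    x: List[float], y: List[float], max_order: int = 5
) -> List[Tuple[Polynomial, float]]:
    """Fit polynomials of degree 1..max_order and sort them by error."""
    xs = np.asarray(x, dtype=float)
    ys = np.asarray(y, dtype=float)
    fits = []
    for degree in range(1, max_order + 1):
        poly = Polynomial.fit(xs, ys, degree)
        # Assumed error measure: sum of squared residuals.
        error = float(np.sum((poly(xs) - ys) ** 2))
        fits.append((poly, error))
    # Ascending by error, so the best fit comes first.
    return sorted(fits, key=lambda item: item[1])


x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 2.0, 5.0, 10.0, 17.0, 26.0]  # y = x**2 + 1
ranked = rank_poly_fits(x, y, max_order=3)
best_poly, best_error = ranked[0]
```

For this exactly quadratic data the linear fit ends up last in the list, while the degree 2 and degree 3 fits have errors close to zero.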
yotse.optimization.generic_optimization module
generic_optimization.py.
This module defines the GenericOptimization class, a base class for optimization algorithms.
Classes
- GenericOptimization:
A base class for optimization algorithms.
- class yotse.optimization.generic_optimization.GenericOptimization(function: Callable[[...], float], opt_instance: ModGA | BayesianOptimization = None, refinement_factors: List[float] | None = None, logging_level: int = 1, extrema: int = 1, evolutionary: bool = False)[source]
Bases:
object
Base class for optimization algorithms.
- Parameters:
function (Callable[..., float]) – Fitness function to be used for optimization. This can either be a discrete mapping between input parameters and associated cost (in the case of blackbox optimization) or a known analytical function (in the case of whitebox optimization).
opt_instance (Union[ModGA, BayesianOptimization, None] (optional)) – Instance of the optimization engine. Defaults to None.
refinement_factors (list (optional)) – Refinement factors for all parameters. If specified must be list of length = #params. Defaults to None.
logging_level (int (optional)) – Level of logging: 1 - only essential data; 2 - include plots; 3 - dump everything. Defaults to 1.
extrema (int (optional)) – Define what type of problem to solve. ‘extrema’ can be equal to either MINIMUM or MAXIMUM. The optimization algorithm will look for minimum and maximum values respectively. Defaults to MINIMUM.
evolutionary (bool (optional)) – Whether the optimization algorithm allows for evolutionary optimization. Defaults to False.
- MAXIMUM = 0
- MINIMUM = 1
- __init__(function: Callable[[...], float], opt_instance: ModGA | BayesianOptimization = None, refinement_factors: List[float] | None = None, logging_level: int = 1, extrema: int = 1, evolutionary: bool = False)[source]
Initialize the GenericOptimization object.
- abstract property current_datapoints: ndarray
Return the current datapoints that will be used if an optimization is started now.
- get_best_solution() Tuple[List[float], float, int] [source]
Get the best solution (aka the best set of input parameters). Should be implemented in every derived class.
- Returns:
Solution (set of params), its fitness (associated cost) and its index in the list of data points.
- Return type:
solution, solution_fitness, solution_idx
- abstract get_new_points() ndarray [source]
Get new points. Should be implemented in every evolutionary algorithm.
- Returns:
new_points – New points for the next iteration of the optimization.
- Return type:
np.ndarray
- input_params_to_cost_value(solution: List[float], solution_idx: int) Any [source]
Return value of cost function for given set of input parameter values and their index in the set of points.
- Parameters:
solution (list) – Set of input parameter values of shape [param_1, param_2, .., param_n].
solution_idx (int) – Index of the solution within the set of points.
- abstract property max_iterations: int
Return the maximum number of iterations of the optimization if applicable.
- abstract overwrite_internal_data_points(data_points: ndarray) None [source]
Overwrite the internal set of data points with one externally generated. E.g. when manually passing new points to an evolutionary optimization algorithm.
- Parameters:
data_points (np.ndarray) – Array containing all new data points that should be passed to the optimization.
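A minimal sketch of how a subclass fills in the abstract members listed above. OptimizationBase is a hypothetical stand-in for GenericOptimization so that the example runs without yotse installed, plain lists stand in for ndarray, and HalvingSearch is a toy algorithm invented for illustration.

```python
from abc import ABC, abstractmethod
from typing import List


class OptimizationBase(ABC):
    """Hypothetical stand-in mirroring the abstract members listed above."""

    @property
    @abstractmethod
    def current_datapoints(self) -> List[List[float]]: ...

    @property
    @abstractmethod
    def max_iterations(self) -> int: ...

    @abstractmethod
    def get_new_points(self) -> List[List[float]]: ...

    @abstractmethod
    def overwrite_internal_data_points(self, data_points: List[List[float]]) -> None: ...


class HalvingSearch(OptimizationBase):
    """Toy algorithm: each new round moves every point halfway to the origin."""

    def __init__(self, points: List[List[float]], rounds: int) -> None:
        self._points = points
        self._rounds = rounds

    @property
    def current_datapoints(self) -> List[List[float]]:
        return self._points

    @property
    def max_iterations(self) -> int:
        return self._rounds

    def get_new_points(self) -> List[List[float]]:
        return [[v / 2 for v in p] for p in self._points]

    def overwrite_internal_data_points(self, data_points: List[List[float]]) -> None:
        self._points = data_points


algo = HalvingSearch([[4.0, 2.0]], rounds=2)
for _ in range(algo.max_iterations):
    # Ask for new points, then write them back as the current data points.
    algo.overwrite_internal_data_points(algo.get_new_points())
```

A class that leaves any of the four abstract members unimplemented cannot be instantiated, which is what makes the interface enforceable.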
yotse.optimization.modded_pygad_ga module
modded_pygad_ga.py.
This module provides a modified version of the Genetic Algorithm (GA) class from the PyGAD library.
Classes
- ModGA:
A modified Genetic Algorithm class that extends the functionality of the GA class in the PyGAD library.
- class yotse.optimization.modded_pygad_ga.ModGA(*args: Any, **kwargs: Any)[source]
Bases:
GA
A modified Genetic Algorithm class that extends the functionality of the GA class in the PyGAD library.
- run() None [source]
Runs the genetic algorithm. This is the main method in which the genetic algorithm is evolved through a number of generations.
- __init__(*args: Any, **kwargs: Any) None
yotse.optimization.optimizer module
optimizer.py.
This module defines the Optimizer class, which serves as a facilitator for running optimization algorithms. It ensures that the provided optimization algorithm is valid and manages the execution state of the optimization process.
Classes
- Optimizer:
A class that wraps around a generic optimization algorithm.
- class yotse.optimization.optimizer.Optimizer(optimization_algorithm: GenericOptimization)[source]
Bases:
object
Optimizer class that wraps around a generic optimization algorithm.
This class serves as a facilitator for running optimization algorithms which are defined as subclasses of GenericOptimization. It ensures that the provided optimization algorithm is valid and manages the execution state of the optimization process.
- optimization_algorithm
The optimization algorithm instance that will be executed.
- Type:
GenericOptimization
- _is_executed
Internal flag to track whether the optimization has been executed.
- Type:
bool
- Raises:
ValueError – If the optimization_algorithm is not an instance of GenericOptimization.
- __init__(optimization_algorithm: GenericOptimization)[source]
Initializes the Optimizer with the given optimization algorithm.
- Parameters:
optimization_algorithm (GenericOptimization) – An instance of GenericOptimization or its subclass that defines the optimization algorithm to be executed.
- Raises:
ValueError – If the given optimization_algorithm is not an instance or subclass of GenericOptimization.
- construct_points(experiment: Experiment, evolutionary: bool, points_per_param: int | None = None) None [source]
Constructs a new set of values around the solution and writes them to the Experiment.
- Parameters:
experiment (Experiment) – Object of Experiment that the points should be constructed for.
evolutionary (bool) – True if the optimization algorithm is evolutionary and generates a new set of points. False if the optimization algorithm generates a best point and points should be constructed around it.
points_per_param (int (optional)) – Number of points to construct for each parameter. If None then for each parameter the initially specified number of points Parameter.number_points will be created. Only used when evolutionary=False. Defaults to None.
- grid_based_point_creation(experiment: Experiment, points_per_param: int | None = None) None [source]
Refines the parameter search space based on the best solution and creates new data points for the next round of optimization.
This method uses the best solution found so far to refine the parameter ranges and generate a new grid of data points for further exploration. New points are created around the best solution using the refinement factors defined in the optimization algorithm.
- Parameters:
experiment (Experiment) – The experiment object containing the parameters and current data points.
points_per_param (Optional[int], optional) – The number of points to generate for each parameter. If None, the number specified in each parameter object is used, by default None.
- Raises:
AssertionError – If the refinement factors are not defined in the optimization algorithm.
ValueError – If the number of refinement factors does not match the number of parameters in the experiment.
Notes
The method first retrieves the best solution from the optimization algorithm and uses the associated fitness value to guide the creation of a refined parameter space. It then updates the ranges of each active parameter based on the refinement factors and generates a new set of data points accordingly. The new data points are then used to overwrite the internal data points of the optimization algorithm.
This method must be called after the optimization algorithm has found an initial best solution. It modifies the experiment object’s parameters in-place, adjusting their ranges and data points without returning any value.
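The refinement step described above can be sketched for a single parameter as follows. The rule used here (a window of width factor * (high - low) centred on the best value and clipped to the original range) is an assumption made for illustration; the source only states that ranges are updated based on the refinement factors. Both helper names are hypothetical.

```python
from typing import List, Tuple


def refine_range(low: float, high: float, best: float, factor: float) -> Tuple[float, float]:
    """Shrink [low, high] to a window of width factor * (high - low) around best.

    Assumed rule for illustration; the window is clipped to the original range.
    """
    half_width = factor * (high - low) / 2
    return max(low, best - half_width), min(high, best + half_width)


def linspace(low: float, high: float, n: int) -> List[float]:
    """Return n evenly spaced values from low to high inclusive (n >= 2)."""
    step = (high - low) / (n - 1)
    return [low + i * step for i in range(n)]


# One parameter with range [0, 10], best value 4 and refinement factor 0.5.
new_low, new_high = refine_range(0.0, 10.0, best=4.0, factor=0.5)
grid = linspace(new_low, new_high, 5)
```

Here the range shrinks from [0, 10] to [1.5, 6.5], and the five new grid points for the next round are 1.5, 2.75, 4.0, 5.25 and 6.5.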
- optimize() None [source]
Executes the optimization algorithm.
This method calls the execute method of the optimization_algorithm and sets the _is_executed flag to True upon successful completion. This method should be invoked to perform the optimization process.
- Return type:
None
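A minimal sketch of the behaviour described for this class: the constructor rejects anything that is not an algorithm instance, and optimize sets the _is_executed flag only after execute returns. AlgorithmBase, CountingAlgorithm and OptimizerSketch are hypothetical stand-ins so that the example runs without yotse installed.

```python
class AlgorithmBase:
    """Hypothetical stand-in for GenericOptimization."""

    def execute(self) -> None:
        raise NotImplementedError


class CountingAlgorithm(AlgorithmBase):
    """Toy algorithm that only records how often it was executed."""

    def __init__(self) -> None:
        self.calls = 0

    def execute(self) -> None:
        self.calls += 1


class OptimizerSketch:
    def __init__(self, optimization_algorithm: AlgorithmBase) -> None:
        if not isinstance(optimization_algorithm, AlgorithmBase):
            raise ValueError("optimization_algorithm must be an AlgorithmBase instance")
        self.optimization_algorithm = optimization_algorithm
        self._is_executed = False

    def optimize(self) -> None:
        self.optimization_algorithm.execute()
        # Only reached if execute() did not raise.
        self._is_executed = True


optimizer = OptimizerSketch(CountingAlgorithm())
optimizer.optimize()
```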
- suggest_best_solution() Tuple[List[float], float, int] [source]
Suggest the best solution found by the optimization algorithm.
This method queries the underlying optimization algorithm for the best solution it has discovered so far, along with the corresponding fitness value and the index of the solution.
- Returns:
A tuple containing the best solution as a list of floats, the fitness value of this solution as a float, and the index of the solution within the population as an integer.
- Return type:
Tuple[List[float], float, int]
Notes
The ‘best_solution’ represents the variables of the optimum result according to the objective function used in the optimization algorithm. The ‘solution_fitness’ is a numerical value representing how ‘good’ the solution is - the higher/lower the better depending on the problem. The ‘solution_index’ indicates the position of the best solution in the population if applicable.
- update_blackbox_cost_data(experiment: Experiment, data: DataFrame) None [source]
Update internal dataframe of the optimization algorithm, mapping input parameters to the associated cost from input data.
Note: This also checks that the ordering of the entries is the same as the data_points of the experiment.
- Parameters:
experiment (Experiment) – The experiment object containing the parameters and current data points.
data (pandas.DataFrame) – A pandas dataframe containing the collected data in the format cost_value init_param_1 … init_param_n.
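The ordering check mentioned in the note above can be sketched as follows. Plain nested lists stand in for the DataFrame and for the experiment's data points so that the example has no dependencies; the helper name check_row_order and the choice of ValueError are assumptions, not the library's behaviour.

```python
from typing import Sequence


def check_row_order(
    data_rows: Sequence[Sequence[float]], data_points: Sequence[Sequence[float]]
) -> None:
    """Raise ValueError unless row i of the data holds the parameters of point i.

    Each data row is laid out as [cost_value, param_1, ..., param_n].
    """
    if len(data_rows) != len(data_points):
        raise ValueError("Number of rows and number of data points differ")
    for i, (row, point) in enumerate(zip(data_rows, data_points)):
        # Skip the leading cost column and compare the parameter values.
        if list(row[1:]) != list(point):
            raise ValueError(f"Row {i} does not match data point {i}")


rows = [[0.9, 1.0, 2.0], [0.4, 1.5, 2.5]]
points = [[1.0, 2.0], [1.5, 2.5]]
check_row_order(rows, points)  # passes silently: same order
```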
yotse.optimization.whitebox_algorithms module
whitebox_algorithms.py.
This module provides classes for performing whitebox optimization, aka an optimization where the function is known.
Classes
- SciPyOpt:
A class for optimization using the scipy.optimize.minimize function.
- class yotse.optimization.whitebox_algorithms.SciPyOpt(fun: Callable[[...], float], x0: Any, args: Tuple[Any] | None = (), method: str = 'BFGS', jac: Callable[[...], Any] | None = None, bounds: Sequence[Tuple[float, float]] | None = None, constraints: dict | Sequence[dict] | None = (), tol: float | None = None, callback: Callable[[...], Any] | None = None, options: Dict[str, Any] | None = None)[source]
Bases:
GenericOptimization
A class for optimization using the scipy.optimize.minimize function.
- Parameters:
fun (callable) – The objective function to be minimized.
x0 (array_like) – Initial guess.
args (tuple, optional) – Extra arguments passed to the objective function and its derivatives (if any).
method (str, optional) – Type of solver. Default is ‘BFGS’.
jac (callable or None, optional) – Jacobian (gradient) of the objective function. If None, it will be computed numerically.
bounds (sequence or None, optional) – Bounds for variables (only for L-BFGS-B, TNC, COBYLA, and trust-constr methods).
constraints (dict or sequence of dict, optional) – Constraints definition (only for COBYLA and trust-constr methods).
tol (float or None, optional) – Tolerance for termination. For detailed control, use solver-specific options.
callback (callable, optional) – Called after each iteration, as callback(xk), where xk is the current parameter vector.
options (dict, optional) – A dictionary of solver options.
- MAXIMUM = 0
- MINIMUM = 1
- __init__(fun: Callable[[...], float], x0: Any, args: Tuple[Any] | None = (), method: str = 'BFGS', jac: Callable[[...], Any] | None = None, bounds: Sequence[Tuple[float, float]] | None = None, constraints: dict | Sequence[dict] | None = (), tol: float | None = None, callback: Callable[[...], Any] | None = None, options: Dict[str, Any] | None = None)[source]
Initialize the SciPyOpt object.
- Parameters:
fun (callable) – The objective function to be minimized.
x0 (array_like) – Initial guess.
args (tuple, optional) – Extra arguments passed to the objective function and its derivatives (if any).
method (str, optional) – Type of solver. Default is ‘BFGS’.
jac (callable or None, optional) – Jacobian (gradient) of the objective function. If None, it will be computed numerically.
bounds (sequence or None, optional) – Bounds for variables (only for L-BFGS-B, TNC, COBYLA, and trust-constr methods).
constraints (dict or sequence of dict, optional) – Constraints definition (only for COBYLA and trust-constr methods).
tol (float or None, optional) – Tolerance for termination. For detailed control, use solver-specific options.
callback (callable, optional) – Called after each iteration, as callback(xk), where xk is the current parameter vector.
options (dict, optional) – A dictionary of solver options.
- Return type:
None
- property current_datapoints: ndarray
Return the current datapoints that will be used if an optimization is started now.
In this case it is the initial guess.
- execute() None [source]
Execute the optimization using the specified parameters.
- Returns:
The optimization result.
- Return type:
scipy.optimize.OptimizeResult
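For orientation, the underlying scipy.optimize.minimize call named in the class description can be used directly as shown below. This sketch does not go through SciPyOpt; the objective shifted_bowl and the starting point are made up for illustration.

```python
from scipy.optimize import minimize


def shifted_bowl(x):
    """Quadratic bowl with its minimum at (3, -1)."""
    return (x[0] - 3.0) ** 2 + (x[1] + 1.0) ** 2


# method="BFGS" and no jac, matching the defaults listed above, so the
# gradient is estimated numerically.
result = minimize(shifted_bowl, x0=[0.0, 0.0], method="BFGS")
best_point, best_cost = result.x, result.fun
```

The returned OptimizeResult carries the solution in x and the objective value in fun, which correspond to the solution and fitness that get_best_solution is documented to return.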
- get_best_solution() Tuple[List[float], float, int] [source]
Get the best solution. Should be implemented in every derived class.
- Returns:
Solution, its fitness and its index in the list of data points.
- Return type:
solution, solution_fitness, solution_idx
- get_function() Callable[[...], float]
Returns the cost function.
- input_param_cost_df: pandas.DataFrame
- input_params_to_cost_value(solution: List[float], solution_idx: int) Any
Return value of cost function for given set of input parameter values and their index in the set of points.
- Parameters:
solution (list) – Set of input parameter values of shape [param_1, param_2, .., param_n].
solution_idx (int) – Index of the solution within the set of points.
- property max_iterations: int
Return maximum number of iterations of SciPy optimization if specified.
- update_internal_cost_data(data: DataFrame) None
Update internal dataframe mapping input parameters to the associated cost from input data.
- Parameters:
data (pandas.DataFrame) – A pandas dataframe containing the collected data in the format cost_value init_param_1 … init_param_n.