SG++-Doxygen-Documentation
optimization.py

On this page, we look at an example application of the sgpp::optimization module.

Versions of the example are given in all languages currently supported by SG++: C++, Python, Java, and MATLAB.

The example interpolates a bivariate test function with B-splines instead of piecewise linear basis functions to obtain a smoother interpolant. The resulting sparse grid function is then minimized with the method of steepest descent. For comparison, we also minimize the objective function with the Nelder-Mead method.

First, we import pysgpp and the required modules.

import pysgpp
import math
import sys

The function $$f\colon [0, 1]^d \to \mathbb{R}$$ to be minimized is called the objective function and has to derive from pysgpp.OptScalarFunction. In the constructor, we pass the dimensionality of the domain (in this case $$d = 2$$). The eval method evaluates the objective function at a given point $$\vec{x} \in [0, 1]^d$$ and returns the function value $$f(\vec{x})$$.

class ExampleFunction(pysgpp.OptScalarFunction):
    """Example objective function from the title of my Master's thesis."""

    def __init__(self):
        super(ExampleFunction, self).__init__(2)

    def eval(self, x):
        """Evaluates the function."""
        return math.sin(8.0 * x[0]) + math.sin(7.0 * x[1])


def printLine():
    print("----------------------------------------" +
          "----------------------------------------")

We have to disable OpenMP within pysgpp since it interferes with SWIG's director feature.

pysgpp.omp_set_num_threads(1)

print("sgpp::optimization example program started.\n")
# increase verbosity of the output
pysgpp.OptPrinter.getInstance().setVerbosity(2)

Here, we define some parameters: objective function, dimensionality, B-spline degree, maximal number of grid points, and adaptivity.

# objective function
f = ExampleFunction()
# dimension of domain
d = f.getNumberOfParameters()
# B-spline degree
p = 3
# maximal number of grid points
N = 30
# adaptivity of grid generation
gamma = 0.95

Next, we define a grid with modified B-spline basis functions and an iterative grid generator, which can generate the grid adaptively.

grid = pysgpp.Grid.createModBsplineGrid(d, p)
gridGen = pysgpp.OptIterativeGridGeneratorRitterNovak(f, grid, N, gamma)

With the iterative grid generator, we adaptively generate a sparse grid.

printLine()
print("Generating grid...\n")

if not gridGen.generate():
    print("Grid generation failed, exiting.")
    sys.exit(1)

Then, we hierarchize the function values to get hierarchical B-spline coefficients of the B-spline sparse grid interpolant $$\tilde{f}\colon [0, 1]^d \to \mathbb{R}$$.

printLine()
print("Hierarchizing...\n")
functionValues = gridGen.getFunctionValues()
coeffs = pysgpp.DataVector(len(functionValues))
hierSLE = pysgpp.OptHierarchisationSLE(grid)
sleSolver = pysgpp.OptAutoSLESolver()

# solve linear system
if not sleSolver.solve(hierSLE, functionValues, coeffs):
    print("Solving failed, exiting.")
    sys.exit(1)
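Conceptually, hierarchization solves a linear system so that the interpolant reproduces the function values at all grid points. The following is a minimal 1-D sketch with piecewise linear hat functions; SG++ performs the analogous computation with B-spline bases, and the grid, levels, and function here are illustrative only.

```python
import math

# 1-D toy grid: x = 0.5 on level 1, x = 0.25 and 0.75 on level 2
f = lambda x: math.sin(8.0 * x)

# hierarchical surpluses: function value minus the contribution
# of all coarser levels at that grid point
a05 = f(0.5)
a025 = f(0.25) - 0.5 * a05
a075 = f(0.75) - 0.5 * a05

def hat(center, halfWidth, x):
    """Piecewise linear hat basis function."""
    return max(0.0, 1.0 - abs(x - center) / halfWidth)

def interpolant(x):
    return (a05 * hat(0.5, 0.5, x)
            + a025 * hat(0.25, 0.25, x)
            + a075 * hat(0.75, 0.25, x))

# the interpolant reproduces f at every grid point
for xg in (0.25, 0.5, 0.75):
    assert abs(interpolant(xg) - f(xg)) < 1e-12
```

With B-splines of degree $$p > 1$$, the basis functions overlap, so the surpluses can no longer be computed level by level in closed form; this is why the example solves the system with a linear solver.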

We define the interpolant $$\tilde{f}$$ and its gradient $$\nabla\tilde{f}$$ for use with the gradient method (steepest descent). Of course, one can also use other optimization algorithms from sgpp::optimization::optimizer.

printLine()
print("Optimizing smooth interpolant...\n")
ft = pysgpp.OptInterpolantScalarFunction(grid, coeffs)
ftGradient = pysgpp.OptInterpolantScalarFunctionGradient(grid, coeffs)
gradientDescent = pysgpp.OptGradientDescent(ft, ftGradient)
x0 = pysgpp.DataVector(d)
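The method of steepest descent repeatedly steps in the direction of the negative gradient. As a standalone illustration in plain Python (not the pysgpp optimizer), here is a fixed-step descent applied directly to $$f$$ using its analytic gradient $$(8\cos(8x_1), 7\cos(7x_2))$$; the step size and iteration count are illustrative choices.

```python
import math

# fixed-step steepest descent on f(x) = sin(8 x[0]) + sin(7 x[1])
x = [0.625, 0.75]  # starting point; matches the best grid point in the output below
step = 0.01
for _ in range(200):
    x[0] -= step * 8.0 * math.cos(8.0 * x[0])  # df/dx[0]
    x[1] -= step * 7.0 * math.cos(7.0 * x[1])  # df/dx[1]

# both coordinates converge to the known minimum (3*pi/16, 3*pi/14)
```

The pysgpp gradient descent additionally adapts its step size, but the basic iteration is the same.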

The gradient method needs a starting point. We use a point of our adaptively generated sparse grid as the starting point. More specifically, we use the point with the smallest (most promising) function value and save it in x0.

gridStorage = gridGen.getGrid().getStorage()

# index of grid point with minimal function value
x0Index = 0
fX0 = functionValues[0]
for i in range(1, len(functionValues)):
    if functionValues[i] < fX0:
        fX0 = functionValues[i]
        x0Index = i

x0 = gridStorage.getCoordinates(gridStorage.getPoint(x0Index))
ftX0 = ft.eval(x0)

print("x0 = {}".format(x0))
print("f(x0) = {:.6g}, ft(x0) = {:.6g}\n".format(fX0, ftX0))

We apply the gradient method and print the results.

gradientDescent.setStartingPoint(x0)
gradientDescent.optimize()
xOpt = gradientDescent.getOptimalPoint()
ftXOpt = gradientDescent.getOptimalValue()
fXOpt = f.eval(xOpt)

print("\nxOpt = {}".format(xOpt))
print("f(xOpt) = {:.6g}, ft(xOpt) = {:.6g}\n".format(fXOpt, ftXOpt))

For comparison, we apply the classical gradient-free Nelder-Mead method directly to the objective function $$f$$.

printLine()
print("Optimizing objective function (for comparison)...\n")
nelderMead = pysgpp.OptNelderMead(f, 1000)
nelderMead.optimize()
xOptNM = nelderMead.getOptimalPoint()
fXOptNM = nelderMead.getOptimalValue()
ftXOptNM = ft.eval(xOptNM)

print("\nxOptNM = {}".format(xOptNM))
print("f(xOptNM) = {:.6g}, ft(xOptNM) = {:.6g}\n".format(fXOptNM, ftXOptNM))

printLine()
print("\nsgpp::optimization example program terminated.")

The example program outputs the following results:

sgpp::optimization example program started.

--------------------------------------------------------------------------------
Generating grid...

Adaptive grid generation (Ritter-Novak)...
100.0% (N = 29, k = 3)
Done in 3ms.
--------------------------------------------------------------------------------
Hierarchizing...

Solving linear system (automatic method)...
estimated nnz ratio: 59.8%
Solving linear system (Armadillo)...
constructing matrix (100.0%)
nnz ratio: 58.0%
Done in 0ms.
Done in 1ms.
--------------------------------------------------------------------------------
Optimizing smooth interpolant...

x0 = [0.625, 0.75]
f(x0) = -1.81786, ft(x0) = -1.81786

9 steps, f(x) = -2.000780
Done in 1ms.

xOpt = [0.589526, 0.673268]
f(xOpt) = -1.99999, ft(xOpt) = -2.00078

--------------------------------------------------------------------------------
Optimizing objective function (for comparison)...


We see that both the gradient-based optimization of the smooth sparse grid interpolant and the gradient-free optimization of the objective function find reasonable approximations of the minimum, which lies at $$(3\pi/16, 3\pi/14) \approx (0.58904862, 0.67319843)$$.
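The location of the minimum can be checked by hand: each sine term attains its minimum $$-1$$ where its argument equals $$3\pi/2$$, and both $$3\pi/2 / 8$$ and $$3\pi/2 / 7$$ lie in $$[0, 1]$$, so the global minimum is $$-2$$. A quick numerical check in plain Python, independent of SG++:

```python
import math

# each sine attains -1 where its argument is 3*pi/2, so the global
# minimum of f(x) = sin(8 x[0]) + sin(7 x[1]) on [0, 1]^2 is at
# (3*pi/16, 3*pi/14) with value -2 (up to floating-point rounding)
xMin = (3.0 * math.pi / 16.0, 3.0 * math.pi / 14.0)
fMin = math.sin(8.0 * xMin[0]) + math.sin(7.0 * xMin[1])

assert abs(xMin[0] - 0.58904862) < 1e-7
assert abs(xMin[1] - 0.67319843) < 1e-7
assert abs(fMin - (-2.0)) < 1e-12  # consistent with f(xOpt) = -1.99999 above
```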