New release - SUMO Toolbox 2014a

After several years of development, we are proud to release a new version of the SUMO Toolbox. With this release we have switched to a year-based versioning system, making this version "2014a". The new release can be downloaded through the download page and includes many bug fixes and new features.

The SUMO Toolbox 2014a supports the latest version of Matlab (2014a). Doxygen documentation is now available as well. We have highlighted a number of important changes and additions below; a more detailed changelog is included at the bottom of this page.

Note: We are planning to make significant changes to the SUMO Toolbox in the future, including reworking several subsystems (e.g., significantly reducing the overhead on model building) and removing features that are rarely used. To that end, we would like to ask you to fill in this small webform so we can get a better idea of which features and improvements our userbase is interested in and which features are rarely used. Thanks!

Configuration

Important: Several components in the SUMO toolbox have been renamed. This means that old xml files will not work without some modifications as several xml tags have also been renamed. In particular:

  • SampleSelector is now SequentialDesign
  • SampleEvaluator is now DataSource
  • AdaptiveModelBuilder is now ModelBuilder

Simply changing the affected xml tags to their new names should work, though we suggest referring to the default configuration file, as more changes may be needed.
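
As a rough illustration of the renaming step, the sketch below applies the three renames to a configuration held as text. The sample snippet and the file names in the comments are placeholders, not taken from a real SUMO configuration, and a blunt text replacement like this still needs a manual check against the default configuration file.

```matlab
% Sketch: rename the old tag names in a configuration given as text.
% The snippet below is a made-up stand-in, not a real SUMO config.
oldXml = sprintf('<SampleSelector>\n</SampleSelector>\n<AdaptiveModelBuilder/>');

% Old name -> new name, as listed in this release announcement.
renames = { ...
    'SampleSelector',       'SequentialDesign'; ...
    'SampleEvaluator',      'DataSource'; ...
    'AdaptiveModelBuilder', 'ModelBuilder'};

newXml = oldXml;
for k = 1:size(renames, 1)
    % strrep replaces every occurrence, including those inside attribute
    % values and comments, so review the result by hand.
    newXml = strrep(newXml, renames{k, 1}, renames{k, 2});
end
disp(newXml);

% For a real file (file names are placeholders):
%   txt = fileread('old-config.xml');
%   ... apply the loop above to txt ...
%   fid = fopen('new-config.xml', 'w'); fwrite(fid, txt); fclose(fid);
```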

This release of the SUMO toolbox also has proper support for input validation of the <Option> tags. If an option does not exist for a component (e.g., because of a small typo), an appropriate error is given. Moreover, SUMO will suggest valid options:

10:11 [SEVERE] The following error was caught in SUMODriver: Unmatched option: 'regressionFdunction'
Did you mean one of regressionFunction, creationFcn, rootDirectory, constraintFcn, crossoverFcn, multipleBasisFunctionsAllowed, mutationFcn, minimaOutput, inDim ?

In addition, it is now possible to construct many components in Matlab without using the xml framework of SUMO. Almost all components, except for ModelBuilders and SUMO Models, can be used directly in your own custom algorithms. For instance, you can use the initial designs of SUMO to easily generate datasets for your problems. Direct construction also fully supports the above-mentioned input validation improvements and gives hints on how to construct each class.

>> lhd = TPLatinHypercubeDesign                    
Error using InitialDesign (line 46)
Too few parameters. Missing: points, inDim, outDim.
Signature: (points, inDim, outDim, ...)

In the above example, the TPLatinHypercubeDesign has three mandatory arguments (points, inDim, outDim). The ellipsis (...) indicates the additional options (as accepted by the <Option> tags in the xml file for that component).

>> lhd = TPLatinHypercubeDesign(50, 2, 1 );
>> samples = lhd.generate();              
>> size(samples)

ans =

    50     2

Examples of custom algorithms utilizing SUMO components can be found in src/scripts/matlab/:

  • runKrigingOptimization.m: implements the traditional expected improvement-based optimization algorithm
  • runCoKrigingOptimization.m: implements an optimization algorithm using CoKriging for multi-fidelity data
  • runParegoOptimization.m: implements a multi-objective optimization algorithm similar to ParEGO
  • runMinimaxOptimization.m: uses the expected improvement to solve minimax optimization problems
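
For readers unfamiliar with the criterion, the following is a minimal sketch of the standard expected improvement formula for minimization, computed from a model's predicted mean and standard deviation. It does not reproduce the scripts above; the numbers and variable names are illustrative assumptions.

```matlab
% Expected improvement (minimization) from predictive mean and standard
% deviation. All values are made up for illustration.
mu    = [0.8; 0.5; 0.45];   % predicted mean at three candidate points
sigma = [0.1; 0.3; 0.05];   % predicted standard deviation
fMin  = 0.4;                % best (lowest) value observed so far

normCdf = @(z) 0.5 * erfc(-z ./ sqrt(2));         % standard normal CDF
normPdf = @(z) exp(-0.5 * z.^2) ./ sqrt(2 * pi);  % standard normal PDF

ei  = max(fMin - mu, 0);    % value used where sigma == 0
pos = sigma > 0;            % guard against division by zero
z   = (fMin - mu(pos)) ./ sigma(pos);
ei(pos) = (fMin - mu(pos)) .* normCdf(z) + sigma(pos) .* normPdf(z);

[~, best] = max(ei);        % index of the candidate to evaluate next
```

Here the second candidate wins: its mean is slightly worse than the current best, but its larger uncertainty makes an improvement more likely than for the other two.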

Extreme Learning Machines (ELM)

Extreme Learning Machines have been added as a model type. This new training method for single-layer feedforward neural networks is much faster than traditional training algorithms. This model type can be applied whenever the use of neural networks is appropriate. The default configuration file includes an example. The number of hidden neurons and network weights are optimized automatically.
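
To give an idea of why training is fast, here is a minimal sketch of the basic ELM idea on a toy one-dimensional problem: the hidden-layer weights are drawn at random and only the output weights are fitted, by linear least squares. This is not the toolbox implementation; the fixed number of hidden neurons, the tanh activation, and all variable names are assumptions made for the example.

```matlab
% Minimal ELM regression sketch on a toy 1-D problem.
x = linspace(-3, 3, 200)';        % inputs (n-by-1)
y = sin(x);                       % targets
nHidden = 30;                     % fixed here for simplicity

W = randn(size(x, 2), nHidden);   % random input weights, never trained
b = randn(1, nHidden);            % random biases, never trained
hiddenOut = @(X) tanh(bsxfun(@plus, X * W, b));

H = hiddenOut(x);                 % hidden-layer outputs (n-by-nHidden)
beta = pinv(H) * y;               % output weights by linear least squares

yHat = hiddenOut(x) * beta;       % predictions on the training inputs
rmse = sqrt(mean((yHat - y).^2)); % training error
```

Because the only fitted quantity is the solution of one linear system, no iterative gradient-based training is needed.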

Translational Propagation Latin Hypercube Designs (TPLHD)

Latin hypercubes have always been the primary initial design of the SUMO Toolbox due to their space-filling properties. An important aspect of Latin hypercubes is the optimization process that searches for the optimal configuration. This process can be very lengthy, as many configurations of the samples are possible. In the past, the toolbox downloaded pre-optimized Latin hypercubes from the excellent space-filling designs website. Unfortunately, this website is temporarily offline and the designs have disappeared.

An implementation of the Translational Propagation algorithm, which generates (nearly) optimal Latin hypercubes, has been added to the toolbox. This algorithm is very fast and, up to an input dimensionality of 6, approximates the optimal Latin hypercube very well. For problems of higher dimensionality, the design will still be space-filling, but the probability of hitting the optimal Latin hypercube is low. As this new algorithm is now the default choice, existing experiments may produce different results due to a different initial design, although the impact is minimal.
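
The sketch below illustrates the translational-propagation idea in a deliberately simplified setting: two dimensions, a one-point seed, and a point count that is a perfect square. A small seed is copied and shifted block by block until the design space is filled, so no optimization loop is needed. This is an illustration under those assumptions, not the toolbox implementation.

```matlab
% Simplified 2-D translational propagation with a one-point seed.
nd = 4;                  % number of blocks per dimension
n  = nd^2;               % number of points (integer levels 1..n)
design = [1 1];          % one-point seed in the first block

% Row 1 is the shift used while filling dimension 1, row 2 for dimension 2.
shifts = [nd 1; 1 nd];
for dim = 1:2
    block = design;      % everything built so far becomes the new seed
    for k = 2:nd
        block = bsxfun(@plus, block, shifts(dim, :));  % translate the seed
        design = [design; block];                      % append the copy
    end
end

% Latin hypercube property: every level occurs exactly once per dimension.
isLatin = isequal(sort(design(:, 1))', 1:n) && ...
          isequal(sort(design(:, 2))', 1:n);
```

The integer levels can be rescaled to the input range of the problem afterwards, e.g. (design - 1) / (n - 1) maps them to the unit square.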

Sequential initial designs

The toolbox supports two sequential space-filling design strategies (density and density-optimizer). An initial design type was added to allow generation of an initial space-filling design based on these methods. Custom sequential design methods can be used as well, although these methods may not use simulator responses or intermediate models (as these are not yet available during the initial design phase).
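
As a generic illustration of sequential space-filling selection that needs neither simulator responses nor models, the sketch below repeatedly adds the candidate that lies farthest from all points chosen so far (a greedy maximin rule). It is not the density or density-optimizer method of the toolbox; the candidate grid and the starting points are assumptions for the example.

```matlab
% Greedy maximin selection from a candidate grid in the unit square.
existing = [0 0; 1 1];              % points already in the design
[gx, gy] = meshgrid(0:0.25:1);      % candidate grid
candidates = [gx(:) gy(:)];
nNew = 3;                           % number of points to add

for k = 1:nNew
    % Distance from every candidate to its nearest existing point.
    minDist = inf(size(candidates, 1), 1);
    for j = 1:size(existing, 1)
        diffs = bsxfun(@minus, candidates, existing(j, :));
        minDist = min(minDist, sqrt(sum(diffs.^2, 2)));
    end
    [~, idx] = max(minDist);        % candidate farthest from the design
    existing = [existing; candidates(idx, :)];
end
```

Starting from two opposite corners, the rule first adds the two remaining corners and then the centre of the square.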

WEKA library

SUMO now supports the machine learning algorithms from the WEKA library. WEKA is a collection of algorithms for data mining and machine learning tasks such as regression, classification, clustering, association rule mining and data pre-processing and visualisation. In addition to the model types already available in SUMO, it is now possible to use models from WEKA.

The list of algorithms supported by WEKA can be found here. Only classification and regression algorithms from WEKA have been tested with SUMO; however, it is possible to use data pre-processing, clustering and other algorithms as well. Please refer to the default configuration file for an explanation of the usage.

To specify whether the WEKA model is being used for classification or regression, the variables ‘classificationMode’ and ‘numberOfClasses’ need to be set in the ContextConfig in the configuration file. The classificationMode needs to be ‘true’ for classification problems. The variable numberOfClasses is ignored for regression problems.

Multi-objective optimization

The SUMO toolbox has supported the single-objective expected improvement (and probability of improvement) criteria for a while. Now it is also possible to use the multi-objective expected improvement and probability of improvement criteria for efficient multi-objective surrogate-based optimization (MOSBO) of expensive problems.

An example of the hypervolume-based probability of improvement is found in the default configuration file as the sequential design with id 'paretoPoIHv'. In addition, a full demo configuration file is included (config/demo/demo-krigingMosbo.xml).
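
Multi-objective improvement criteria are measured relative to the current Pareto front, i.e. the set of evaluated points that no other point dominates. The sketch below shows that building block only, as a plain non-dominated filter for minimization; it does not compute the hypervolume-based criterion itself, and the objective values are made up.

```matlab
% Non-dominated (Pareto) filter. Rows are points, columns are objectives,
% all to be minimized. Values are made up for illustration.
F = [1 5; 2 3; 3 4; 4 1; 2 6];
n = size(F, 1);
dominated = false(n, 1);
for i = 1:n
    for j = 1:n
        % j dominates i: no worse in every objective, better in at least one.
        if all(F(j, :) <= F(i, :)) && any(F(j, :) < F(i, :))
            dominated(i) = true;
            break;
        end
    end
end
paretoFront = F(~dominated, :);
```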

Probability of Feasibility (PoF)

In addition, SUMO now supports expensive constraints using the probability of feasibility (PoF) criterion for both single- and multi-objective optimization [1]. An example is found as the 'constrainedExpectedImprovement' sequential design in the default configuration file. Please note that for constrained optimization problems, the expensive constraints are specified in the simulator configuration file as additional outputs. Thus, it is necessary to specify which of the outputs are ‘objectives’ (as opposed to constraints). The ‘objectivesIdx’ tag needs to be set within the candidate ranker in the sequential design to specify this. The indices corresponding to the true objectives should be set to ‘1’ in the binary array, and the number of elements in the array must equal the total number of objectives and constraints.
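
As a small illustration, assume a hypothetical problem with two objectives followed by one constraint among the simulator outputs; the binary array is then [1 1 0]. The sketch below splits model predictions accordingly and computes the textbook probability of feasibility from a Gaussian prediction of the constraint. The output order, the convention that a constraint is satisfied when its value is at most zero, and all numbers are assumptions; how the toolbox evaluates this internally is not shown here.

```matlab
% Hypothetical problem: 2 objectives followed by 1 constraint output.
objectivesIdx = logical([1 1 0]);   % 1 = objective, 0 = constraint

% Model predictions at one candidate point (made-up numbers).
mu    = [0.3 1.2 -0.1];             % predicted mean per output
sigma = [0.1 0.2  0.2];             % predicted standard deviation per output

objMean = mu(objectivesIdx);        % predictions for the objectives
conMean = mu(~objectivesIdx);       % predictions for the constraints
conStd  = sigma(~objectivesIdx);

normCdf = @(z) 0.5 * erfc(-z ./ sqrt(2));   % standard normal CDF

% Assumed convention: a constraint g is satisfied when g <= 0.
pofEach = normCdf((0 - conMean) ./ conStd);
pof = prod(pofEach);                % assumes independent constraint models
```

In constrained optimization, an improvement criterion such as the expected improvement is commonly multiplied by this probability, so that candidates that are unlikely to be feasible are ranked lower.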

[1] A Multi-Objective Surrogate-Based Constrained Optimization Algorithm
Prashant Singh, Ivo Couckuyt, Francesco Ferranti and Tom Dhaene
Proceedings of IEEE World Congress on Computational Intelligence 2014, Beijing, China.

Changelog

  • Support for new Matlab versions (tested up to Matlab 2014a)
  • Important: renamed several components in SUMO, existing xml files will not work anymore!
    • SampleSelector is now SequentialDesign
    • SampleEvaluator is now DataSource
    • AdaptiveModelBuilder is now ModelBuilder
  • Added new space-filling sequential design algorithms (see density and density-optimizer components in default.xml)
  • Added the TPLHD Algorithm, which is now the default for lhdWithCornerPoints
  • Added new Input-Output sequential design algorithm (FLOLA-Voronoi)
  • Added new surrogate model types (Eureqa, Extreme Learning Machines, WEKA library)
  • Added the CombinedModelBuilder
    • applies different ModelBuilders (and thus Model types) on the same data and optionally builds an Ensemble model of the best ones
    • similar to the heterogenetic ModelBuilder but faster
    • see the default.xml for an example (heterosequential)
  • Updated third-party surrogate model libraries (LS-SVM, FANN, GPML)
  • Input validation for the xml files (SUMO now gives an error for unknown options)
  • Many SUMO components (Optimizers, SequentialDesigns, ...) are now directly usable from custom Matlab scripts
    • This bypasses the xml configuration and allows implementation of custom surrogate modeling algorithms
    • See runKrigingOptimization, etc. scripts (inside src/scripts/matlab)
  • Added the Efficient Multi-objective Optimization (EMO) algorithm (multi-objective expected improvement)
  • Added the Probability of Feasibility (PoF) criterion for constrained optimization problems
  • Added the well-known NSGAII, SPEA2 and SMS-EMOA evolutionary algorithms (for multi-objective optimization)
  • Reworked and improved the Kriging surrogate model
  • New and much faster Matlab DataSource (formerly SampleEvaluator) that avoids the Java overhead (it is now the default)
  • Improved the log files generated by SUMO
    • Added a timestamp
    • Added the component name that outputs each line (useful for debugging)
  • Support for the Matlab Compiler Runtime (MCR) environment. This is useful for running the toolbox on a cluster where Matlab may not be available otherwise.

Comments

Hi,
I am trying to build a model with 8 inputs, and at present I only get plots in the form of slices, which aren't useful for me. What can I do to change that? Also, how can I make predictions using SUMO?

Thanks & Regards,
Tapas
