Class-local configuration proposal

I have several problems with a centralized config object. When you add a new configuration variable to a class, you have to change code in two separate places. The centralized config object often knows little about its readers; it is just a dict mapping string keys to values (using an any type or variant type). The string keys and the value types have to be kept in sync manually with the consumer class that uses the config. A typo in a string key usually goes undetected. If an option is removed from the consumer, nobody notices that its entry is left behind in the config.
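
To illustrate the typo problem, here is a minimal Python sketch. The names central_config and Consumer are hypothetical and only stand in for a central dict and one of its readers; this is not existing C8 code.

```python
# Hypothetical central config: a plain dict of string keys to untyped values.
central_config = {"energy_cut": 1.0, "obsolete_option": 42}


class Consumer:
    """Hypothetical consumer that reads its settings from the central dict."""

    def __init__(self, config):
        # Typo in the key ("energy_cutt"): the lookup silently falls back
        # to the default instead of raising an error.
        self.energy_cut = config.get("energy_cutt", 0.5)


consumer = Consumer(central_config)

# "obsolete_option" is read by no consumer, yet nothing reports it as unused.
unused_keys = set(central_config) - {"energy_cut"}
```

The consumer ends up with the default 0.5 rather than the configured 1.0, and the stale key stays in the dict without any warning.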

A centralized config object creates a tight coupling between two things that are far apart, both in the source code and conceptually. People mostly write a centralized config object because they want to dump the configuration in some way, and because some options affect several consumers (as in our case with the cuts). An elegant design for this was already discussed briefly at the end of the June workshop: build a tree of all objects that have a configuration, so that calling a dump method on the root node also dumps the configuration of all children. Children that need to access the configuration of a higher node can do so by walking up the tree. Qt uses something similar (for lifetime management).
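
To make the tree idea concrete, here is a minimal Python sketch. The Configurable class, its lookup and dump methods, and the "path.key: value" output format are all assumptions made for illustration; they are not the design agreed at the workshop and not existing C8 code.

```python
class Configurable:
    """Hypothetical base class: a node in the tree of configurable objects."""

    def __init__(self, name, parent=None, **settings):
        self.name = name
        self.parent = parent
        self.children = []
        self.settings = dict(settings)  # settings owned by this node only
        if parent is not None:
            parent.children.append(self)

    def lookup(self, key):
        """Return a setting, walking up the tree until some node defines it."""
        node = self
        while node is not None:
            if key in node.settings:
                return node.settings[key]
            node = node.parent
        raise KeyError(key)

    def dump(self, prefix=""):
        """Return 'path.key: value' lines for this node and all descendants."""
        path = prefix + self.name
        lines = [f"{path}.{k}: {v}" for k, v in sorted(self.settings.items())]
        for child in self.children:
            lines.extend(child.dump(prefix=path + "."))
        return lines


root = Configurable("sequence", energy_cut=1.0)
process1 = Configurable("process1", parent=root, some_option=5)
process2 = Configurable("process2", parent=root)
```

Here root.dump() collects the settings of the whole tree, and process2.lookup("energy_cut") finds the cut defined on the root, so a shared option is stored in exactly one place.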

I suppose we want C8 to be configurable from C++ and from Python. Let's discuss both options.

Configuration only in Python

If the configuration happens only in Python, we can just use Python to generate standard configured objects via some factory function. Some mock-up code:

The main script.

from corsika import ShowerSimulator, Particle, HDFWriter
from corsika.ProcessSequence import standard_sequence # a factory function
from corsika.units import GeV

def main():
   with HDFWriter(...) as w:
      seq = standard_sequence(energy_cut=1*GeV, output_writer=w)
      sim = ShowerSimulator(seq)
      proton = Particle(...)
      sim(proton)

ProcessSequence.py would look like this:

from corsika.cpp import _ProcessSequence as ProcessSequence
from corsika.Process1 import standard_process1
from corsika.Process2 import standard_process2

# makes a standard configured sequence, options without a default are passed as args
def standard_sequence(energy_cut, output_writer):
   seq = ProcessSequence(energy_cut=energy_cut)
   seq.add(standard_process1())
   seq.add(standard_process2())
   seq.add(output_writer)
   return seq

Process1.py would look like this:

from corsika.cpp import _Process1 as Process1

def standard_process1():
    p = Process1(1, 3, 4)
    p.set_some_option(5)
    return p

Make your own configuration:

custom_process1.py

from corsika.Process1 import Process1, standard_process1

# use standard, just change some setting
def modified_standard_process1():
    p = standard_process1()
    p.set_some_option(6)
    return p

# completely new settings: copy paste from standard_process1 and change values
def custom_process1():
    p = Process1(3, 4, 6)
    p.set_some_option(6)
    return p

This doesn't look like conventional configuration, but it requires no extra work, since the Python bindings will be written anyway.

Pros:

No extra code needs to be written for configuration
Support for setting options that need to be fixed after object construction

Cons:

Dumping the current configuration in a way that it can be read in again is not trivial
C++ programs that want to use C8 must embed a Python interpreter (absolutely doable, but additional trouble)

Configuration in C++

#include <iostream>
#include <string>
#include <vector>

// generic streamable named value
template <class T>
struct Setting {
  const char* name; // must be unique
  T value;

  friend std::ostream& operator<<(std::ostream& os, const Setting& s) {
    os << s.name << ": " << s.value << "\n";
    return os;
  }

  friend std::istream& operator>>(std::istream& is, Setting& s) {
    // read next line, check if key is our key name, if not, search for
    // key name in input stream and set value, otherwise keep current default value;
    // alternatively one could fail if the value is not found, to enforce setting
    // all values
    // ...
    return is;
  }
};

template <class Parent, class Child>
class MyProcess {
public:
  // implementation
  MyProcess(Parent* p) : parent_(p) {}

  // write own settings first, then those of all children
  friend std::ostream& operator<<(std::ostream& os, const MyProcess& m) {
    os << m.a_ << m.b_;
    for (auto&& ch : m.children_)
      os << ch;
    return os;
  }

  // read own settings first, then those of all children
  friend std::istream& operator>>(std::istream& is, MyProcess& m) {
    is >> m.a_ >> m.b_;
    for (auto&& ch : m.children_)
      is >> ch;
    return is;
  }

private:
  Parent* parent_; // pointer to parent not strictly needed here
  std::vector<Child> children_;

  Setting<double> a_ = {"MyProcess.a", 1};
  Setting<std::string> b_ = {"MyProcess.b", std::string("foo")};
};

All configurable objects are linked: parents know their children and vice versa. To read or write the whole configuration, one calls the stream operator of the root object, os << root or is >> root.

The streamers are boilerplate code that has to be written for all configurables, but other solutions also require writing boilerplate code, possibly more.

Pros:

Configuration is local, defaults are local, adding a new setting is easy
Dumping the current configuration in a way that it can be read in again is easy
No Python required for configuration

Cons:

No special support for setting options that need to be fixed after object construction
Typos in key names still silently trigger use of the default value (but one could enforce that all values are set)
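
The enforcement mentioned in the last point could look roughly like the following, sketched in Python for brevity; the C++ streamers would follow the same logic. The function read_settings and its strict flag are hypothetical, and the "name: value" line format is taken from the Setting output operator above.

```python
def read_settings(text, defaults, strict=True):
    """Parse 'name: value' lines and overlay them on the defaults.

    With strict=True, unknown keys (typically typos) and keys missing from
    the input raise an error instead of silently leaving defaults in place.
    """
    parsed = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, _, raw = line.partition(":")
        parsed[name.strip()] = raw.strip()

    unknown = set(parsed) - set(defaults)
    missing = set(defaults) - set(parsed)
    if strict and (unknown or missing):
        raise KeyError(f"unknown: {sorted(unknown)}, missing: {sorted(missing)}")

    result = dict(defaults)
    for name, raw in parsed.items():
        if name in defaults:
            # Convert the text using the type of the default value.
            result[name] = type(defaults[name])(raw)
    return result


defaults = {"MyProcess.a": 1.0, "MyProcess.b": "foo"}
good = read_settings("MyProcess.a: 2.5\nMyProcess.b: bar\n", defaults)
```

In strict mode a misspelled key such as "MyProcess.aa" raises an error, because it is unknown and the real key is missing. With strict=False the typo is ignored and the default for MyProcess.a is kept, which is the behaviour described in the con above.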
