-The format of CORSIKA~8 is designed to allow simple and robust management of large libraries, as well as high reading performance. There is a dedicated Python library to help with processing data.
+The format of CORSIKA8 is designed to allow simple and robust management of large libraries, as well as high reading performance. There is a dedicated Python library to help with processing data.
The output is organized directly on the filesystem as a small set of subdirectories and files. Each run of CORSIKA~8 creates a library that can contain any number of showers.
The format is equally suited to a single huge shower and to a very large number of very low-energy showers. Each module included in the run can
produce output inside this directory. The module output is separated into individual user-named subdirectories, each containing the files produced by that module. The file format is either YAML for basic configuration and summary data, or Apache Parquet for all other (binary, compressed)
-data. Parquet is well suited to columnar/tabular data such as that produced by CORSIKA~8.
+data. Parquet is well suited to columnar/tabular data such as that produced by CORSIKA8.
One advantage of this format is that users can manage the libraries with normal filesystem utilities. On all systems there are tools available to
directly read and process YAML as well as Parquet files. If, for example, you do not need the particle data and want to save space, it is very simple to remove from a library. Individual
...
@@ -19,13 +19,13 @@ For example, the output of the "vertical_EAS" example program looks like this:
vertical_EAS_outputs/
  config.yaml
  summary.yaml
- obsplane/
+ particles/
    config.yaml
    summary.yaml
    particles.parquet
-The names "vertical_EAS_outputs" and "obsplane" are user-defined and can be rearranged or changed. However, the type
-of data is well defined, e.g. "obsplane" stores the data from an ObservationPlane object. This is relevant,
+The names "vertical_EAS_outputs" and "particles" are user-defined and can be rearranged or changed. However, the type
+of data is well defined, e.g. "particles" stores the data from an ObservationPlane object. This is relevant,
since it allows Python to access this data in a controlled way.
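Because a library is just a directory tree, ordinary filesystem code is enough to inspect or prune it. The following is a minimal sketch using only the Python standard library, not part of the CORSIKA Python package. The helper names `list_modules`, `drop_parquet` and `make_demo_library` are hypothetical, and the sketch assumes that every module subdirectory carries its own `config.yaml`, as in the layout above. The demo library contains empty placeholder files only.

```python
import tempfile
from pathlib import Path


def list_modules(library: Path) -> list:
    """Return the names of subdirectories that look like module outputs.

    Assumption: a module output directory contains its own config.yaml.
    """
    return sorted(
        entry.name
        for entry in library.iterdir()
        if entry.is_dir() and (entry / "config.yaml").is_file()
    )


def drop_parquet(library: Path) -> list:
    """Delete every *.parquet file below the library; return the removed paths."""
    removed = sorted(library.rglob("*.parquet"))
    for path in removed:
        path.unlink()
    return removed


def make_demo_library(root: Path) -> Path:
    """Create a stand-in for the layout shown above (empty placeholder files)."""
    library = root / "vertical_EAS_outputs"
    module = library / "particles"  # user-defined name, chosen for illustration
    module.mkdir(parents=True)
    for name in ("config.yaml", "summary.yaml"):
        (library / name).touch()
        (module / name).touch()
    (module / "particles.parquet").touch()
    return library


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        lib_dir = make_demo_library(Path(tmp))
        print(list_modules(lib_dir))
        print([p.name for p in drop_parquet(lib_dir)])
```

Removing the Parquet files leaves the YAML configuration and summary files in place, so the pruned library still records how it was produced.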
The top-level "config.yaml" contains library-wide information:
...
@@ -58,15 +58,15 @@ CORSIKA~8 to facilitate analysis and output handling (>>> is python prompt):
'creator': 'CORSIKA8',
'version': '8.0.0-prealpha'}
>>> lib.names # get a list of all registered processes in the library
-['obsplane']
+['particles']
>>> lib.summary # you can also load the summary information
{'showers': 1,
'start time': '06/02/2021 23:46:18 HST',
'end time': '06/02/2021 23:46:30 HST',
'runtime': 11.13}
->>> lib.get("obsplane") # you can then get the process by its registered name
+>>> lib.get("particles") # you can then get the process by its registered name
-ObservationPlane('obsplane')
+ObservationPlane('particles')
->>> lib.get("obsplane").config # you can also get its config
+>>> lib.get("particles").config # you can also get its config
{'type': 'ObservationPlane',
'plane': {'center': [0, 0, 6371000],
'center.units': 'm',
...
@@ -74,8 +74,8 @@ CORSIKA~8 to facilitate analysis and output handling (>>> is python prompt):
'x-axis': [1, 0, 0],
'y-axis': [0, 1, 0],
'delete_on_hit': True,
-'name': 'obsplane'}
+'name': 'particles'}
->>> lib.get("obsplane").data # this returns the data as a pandas DataFrame
+>>> lib.get("particles").data # this returns the data as a pandas DataFrame
    shower  pdg        energy          x          y    radius
0        0  211  9.066702e+10   2.449931  -5.913341  7.093710
1        0   22  2.403024e+11  -1.561504  -1.276160  2.024900
...
@@ -83,7 +83,7 @@ CORSIKA~8 to facilitate analysis and output handling (>>> is python prompt):
3        0  211  1.773324e+11  -1.566567   4.172961  4.461556
4        0  211  7.835374e+10   3.152863  -1.049201  3.330416
..     ...  ...           ...        ...        ...       ...
->>> lib.get("obsplane").astype("arrow") # you can also request the data in a different format
+>>> lib.get("particles").astype("arrow") # you can also request the data in a different format
pyarrow.Table
shower: int32 not null
pdg: int32 not null
...
@@ -91,5 +91,9 @@ CORSIKA~8 to facilitate analysis and output handling (>>> is python prompt):
x: double not null
y: double not null
radius: double not null
->>> lib.get("obsplane").astype("pandas")
+>>> lib.get("particles").astype("pandas") # or astype("arrow"), or astype("pandas").to_numpy()
+You can locally install the CORSIKA Python analysis library from within your CORSIKA
+source code directory with `pip3 install --user -e python pyarrow==0.17.0`. Note that pinning the pyarrow
+version has proven necessary on some older systems. You may not need this, or you may