From e8e9b31f2ccdb6d1484a7ddb4ce4856668b6f508 Mon Sep 17 00:00:00 2001
From: ralfulrich <ralf.ulrich@kit.edu>
Date: Mon, 21 Jun 2021 17:02:18 +0200
Subject: [PATCH] output docu updates

---
 documentation/output.rst | 30 +++++++++++++++++-------------
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/documentation/output.rst b/documentation/output.rst
index 7360cdaa6..88edc4b3e 100644
--- a/documentation/output.rst
+++ b/documentation/output.rst
@@ -1,12 +1,12 @@
 Output
 ======
 
-The format of CORSIKA~8 is designed to allow simple and robust managmenet of large libraries, as well as high reading performance. There is a dedicated python library to help processing data. 
+The format of CORSIKA 8 is designed to allow simple and robust management of large libraries, as well as high reading performance. There is a dedicated Python library to help process the data.
 
 The basic structure of output is structured on the filesystem itself with a couple of subdirectories and files. Each run of CORSIKA~8 creates a library that can contain any number of showers. 
 The format is equally suited for single huge showers as well as for a very high number of very low-energy showers. Each module included in the run can
 produce output inside this directory. The module output is separeted in individual user-named sub-directories, each containing files produced by the module. The file format is either yaml for basic configuration and summary data, or Apache parquet for any other (binary, compressed) 
-data. Parquet is optimal for columnar/tabular data as it is produced by CORSIKA~8. 
+data. Parquet is optimal for columnar/tabular data as it is produced by CORSIKA 8. 
 
 One advantage of this format is that with normal filesystem utilties users can manage the libraries. On all systems there are tools available to 
 directly read/process yaml as well as parquet files. If you, for example, don't need the particle data for space reasons, this is very simple to remove from a library. Individual 
@@ -19,13 +19,13 @@ For example, the output of the "vertical_EAS" example program looks like this:
   vertical_EAS_outputs/
       config.yaml
       summary.yaml
-      obsplane/
+      particles/
           config.yaml
           summary.yaml
           particles.parquet
 
-The "vertical_EAS_outputs" and the "obsplane" are user-defined names and can be arranged/changed. But the type 
-of data is well defined, e.g. in "obsplane" the data from an ObservationPlane object is stored. This is relevant, 
+The names "vertical_EAS_outputs" and "particles" are user-defined and can be rearranged or changed. The type
+of data, however, is well defined: e.g. "particles" stores the data from an ObservationPlane object. This is relevant,
 since it allows python to access this data in a controlled way. 
 
 The top level "config.yaml" contains top-level library information:
@@ -58,15 +58,15 @@ CORSIKA~8 to facilitate analysis and output handling (>>> is python prompt):
    'creator': 'CORSIKA8',
    'version': '8.0.0-prealpha'}
    >>> lib.names  # get a list of all registered processes in the library
-   ['obsplane']
+   ['particles']
    >>> lib.summary  # you can also load the summary information
    {'showers': 1,
    'start time': '06/02/2021 23:46:18 HST',
    'end time': '06/02/2021 23:46:30 HST',
    'runtime': 11.13}
-   >>> lib.get("obsplane")  # you can then get the process by its registered name.
-   ObservationPlane('obsplane')
-   >>> lib.get("obsplane").config  # and you can also get its config as well
+   >>> lib.get("particles")  # you can then get the process by its registered name.
+   ObservationPlane('particles')
+   >>> lib.get("particles").config  # and you can also get its config as well
    {'type': 'ObservationPlane',
    'plane': {'center': [0, 0, 6371000],
    'center.units': 'm',
@@ -74,8 +74,8 @@ CORSIKA~8 to facilitate analysis and output handling (>>> is python prompt):
    'x-axis': [1, 0, 0],
    'y-axis': [0, 1, 0],
    'delete_on_hit': True,
-   'name': 'obsplane'}
-   >>> lib.get("obsplane").data  # this returns the data as a Pandas data frame
+   'name': 'particles'}
+   >>> lib.get("particles").data  # this returns the data as a Pandas data frame
       shower  pdg        energy         x         y    radius
    0         0  211  9.066702e+10  2.449931 -5.913341  7.093710
    1         0   22  2.403024e+11 -1.561504 -1.276160  2.024900
@@ -83,7 +83,7 @@ CORSIKA~8 to facilitate analysis and output handling (>>> is python prompt):
    3         0  211  1.773324e+11 -1.566567  4.172961  4.461556
    4         0  211  7.835374e+10  3.152863 -1.049201  3.330416
    ..      ...  ...           ...       ...       ...       ...
-   >>> lib.get("obsplane").astype("arrow")  # you can also request the data in a different format
+   >>> lib.get("particles").astype("arrow")  # you can also request the data in a different format
    pyarrow.Table
    shower: int32 not null
    pdg: int32 not null
@@ -91,5 +91,9 @@ CORSIKA~8 to facilitate analysis and output handling (>>> is python prompt):
    x: double not null
    y: double not null
    radius: double not null
-   >>>lib.get("obsplane").astype("pandas")
+   >>> lib.get("particles").astype("pandas")  # or astype("arrow"), or astype("pandas").to_numpy()
 
+You can install the corsika Python analysis library locally from within your corsika
+source code directory with ``pip3 install --user -e python pyarrow==0.17.0``. Note that pinning the pyarrow
+version has been shown to be necessary on some older systems. You may not need this, or you may
+need additional packages, too.
-- 
GitLab