- Setting up a datastore
- Importing a dataset
- Processing multiple log folders
- Listing and querying datasets
By convention, all data generated during a Syskit execution is placed in that Syskit instance's log directory, so everything that needs to be stored about a run is present in a single folder. The simplest way to save a successful mission's data is therefore to copy the whole Syskit log folder.
The tools/syskit-log package also offers a way to normalize data in a
datastore. In a datastore, the common data of a Syskit run (i.e. the Syskit event
log, component properties and output ports) is converted into a normalized form,
creating a dataset. Datasets are immutable, are given an immutable ID and can safely
be copied across machines.
All commands related to stores are under the
syskit ds command. See
syskit help ds
for a list.
The data export and analysis functionality of syskit-log relies on data being converted to a normalized dataset.
Setting up a datastore
A syskit-log datastore is a simple local folder. Just create it.
Subcommands of syskit ds may be given a datastore explicitly with the
--store option. Preferably, set a global datastore using the SYSKIT_LOG_STORE
environment variable.
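As a minimal sketch of the setup described above, the following creates a store folder and makes it the default for the current shell. The path `$HOME/syskit-store` is an arbitrary choice made for illustration, not a location required by syskit-log.

```shell
# Hypothetical location for the datastore; any local folder works.
store_dir="$HOME/syskit-store"

# A datastore is a plain folder, so creating it is enough.
mkdir -p "$store_dir"

# Make it the default store for later `syskit ds` commands in this shell.
export SYSKIT_LOG_STORE="$store_dir"
```

To keep the setting across sessions, the `export` line would typically go in your shell's startup file.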
Importing a dataset
To import a dataset, copy the data from your system and process it using
ds import. Using rsync, it would look like
rsync -r --compress REMOTE_URL:/path/to/logs/current .
syskit ds import current "Description of this dataset" \
  --tags a list of tags to refer to the dataset later
Within the store, datasets themselves are stored in the
core/ folder, under
their full ID. Each dataset has a
syskit-dataset.yml file that contains the
identity information for that set (i.e. the hash of the files is used to create
the set ID) as well as the Syskit event log. A
pocolog folder contains the
output log files, normalized to a single file per stream, named as
All other files that were contained in the original folder(s) are stored either in the
text/ folder (if they are text files) or in the
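The layout described in this section can be explored with ordinary shell tools. The sketch below is only an illustration: the helper names `looks_like_dataset` and `list_dataset_ids` are hypothetical, and it assumes that every dataset folder under core/ contains both a syskit-dataset.yml file and a pocolog folder.

```shell
# Hypothetical helper: check that a folder has the dataset layout
# described above (a syskit-dataset.yml file and a pocolog/ folder).
looks_like_dataset() {
    [ -f "$1/syskit-dataset.yml" ] && [ -d "$1/pocolog" ]
}

# Hypothetical helper: print the folder name (the dataset ID) of every
# dataset found under STORE/core. Usage: list_dataset_ids STORE
list_dataset_ids() {
    for dir in "$1"/core/*/; do
        if looks_like_dataset "$dir"; then
            basename "$dir"
        fi
    done
}
```

This is not a replacement for the syskit ds commands, which remain the supported way to inspect a store; it only makes the on-disk structure concrete.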
Processing multiple log folders
Each Syskit run creates a new dataset folder. During a day of operation, multiple datasets have often been created. Let's assume you have copied them all into a single (originally empty) local folder with:
rsync -r --compress REMOTE_URL:/path/to/logs/ .
You may decide to import them all separately in a single run using the
--auto parameter. This will create one dataset per subfolder.
syskit ds import --auto .
Alternatively, all datasets created by the same Syskit app can be imported and
processed together using the
--merge option to
import. It will create a single
dataset that can later be analyzed as a whole.
syskit ds import --merge .
Listing and querying datasets
The syskit ds list command lists all datasets currently present in the store,
ordered by increasing date (oldest first). The command also accepts a QUERY
parameter that restricts which datasets are listed.
The query is a list of
keyOPvalue arguments, where
key is one of the metadata
keys (as shown by
list without arguments) and
OP is either
= for strict equality or
~ for matching (in which case
value is interpreted as a regular expression).
Metadata of note is
roby:time, which is a timestamp of the form
YYYYMMDD-HHMM. One can for instance show all datasets from October 2020 with
syskit ds list roby:time~202010
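Because roby:time starts with the year and month, a month query is simply the key, the ~ operator and a YYYYMM value. The following sketch builds such a query string; the helper name `month_query` is hypothetical and not part of syskit-log.

```shell
# Hypothetical helper: build a roby:time query that matches one month.
# Usage: month_query YEAR MONTH
month_query() {
    # roby:time values look like YYYYMMDD-HHMM, so a YYYYMM value
    # selects a month when used with the ~ (regular expression) operator.
    # ${2#0} drops one leading zero so that "08" is not read as octal.
    printf 'roby:time~%04d%02d\n' "$1" "${2#0}"
}

month_query 2020 10   # prints roby:time~202010
```

The result can then be passed as the QUERY argument, for instance `syskit ds list "$(month_query 2020 10)"`.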
Once you have narrowed down the list of datasets to show, the
will display all the data streams available within the dataset.
The find-streams subcommand lets you look for specific data streams.
For instance, to look for all
/base/Time streams, run
syskit ds find-streams type=/base/Time
The --ds-filter argument filters datasets the same way the QUERY parameter of
syskit ds list does, e.g. to see all
/base/samples/RigidBodyState streams generated during October 2020:
syskit ds find-streams type~/base/samples/RigidBodyState --ds-filter roby:time~202010
Opaque types, and types that are
derived from them (e.g. structures that have opaques as fields), are stored under
the type name of the intermediate type and not the original type name. For
instance, /base/samples/RigidBodyState is actually stored as
/base/samples/RigidBodyState_m in the log files. If a call to find-streams
does not return any result, check a single dataset to find out whether you
are referring to the right type.
See syskit ds help find-streams for more details.
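The opaque type naming described above can be handled with a small wrapper that retries a query under the intermediate type name. This is only a sketch under stated assumptions: the helper name `find_streams_any_name` is hypothetical, it assumes that find-streams prints nothing when no stream matches, and it assumes the `_m` suffix seen in the RigidBodyState example applies to the type you are looking for.

```shell
# Hypothetical helper: query a type name, then retry with the `_m`
# suffix from the RigidBodyState example if nothing is found.
# Assumes find-streams prints nothing when no stream matches, and that
# the store is selected through SYSKIT_LOG_STORE.
find_streams_any_name() {
    type_name="$1"
    result="$(syskit ds find-streams "type=${type_name}")"
    if [ -z "$result" ]; then
        # Fall back to the assumed intermediate type name.
        result="$(syskit ds find-streams "type=${type_name}_m")"
    fi
    printf '%s\n' "$result"
}
```

A call such as `find_streams_any_name /base/samples/RigidBodyState` would then try both names in turn. Inspecting a single dataset, as suggested above, remains the reliable way to confirm the stored type name.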