LHCbDIRAC Logs Converter library documentation
Description
A Python library for migrating LHCbDIRAC log files to the new Zstandard + SQLite format and for managing the log data. The official repository is hosted on GitLab.
Overview
This library takes log files from finished productions, stored as ZIP files, and converts them to a new, more efficient format based on Zstandard compression and an SQLite database.
The idea is to have a single-file database for each sub-production instead of thousands of ZIP files. These SQLite databases use the following schema:
Schema:
create table dict -- Zstandard compression dictionaries
(
id INTEGER not null primary key, -- in-database id
name VARCHAR not null unique, -- name of the dictionary
zstd_id INTEGER, -- Zstandard dictionary id
data BLOB -- Zstandard dictionary data
);
create table job
(
id INTEGER not null primary key, -- the job id (e.g. 00000042)
dirac_id INTEGER, -- the job id in DIRAC
success BOOLEAN -- whether the job was successful or NULL if unknown
);
create table data
(
name VARCHAR not null, -- the name of the log file
job INTEGER not null references job on delete restrict, -- the associated job
dict INTEGER not null references dict on delete restrict, -- the associated dictionary
data BLOB not null, -- the compressed log data
primary key (name, job)
);
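As an illustration of how this schema can be queried, here is a minimal sketch that uses only Python's standard `sqlite3` module. It builds a throwaway in-memory database with the tables above and lists the log files stored for one job. The helper names `build_demo_db` and `list_logs`, as well as the sample rows, are hypothetical and are not part of this library's API.

```python
import sqlite3

# Schema copied from above (comments omitted for brevity).
SCHEMA = """
create table dict (
    id INTEGER not null primary key,
    name VARCHAR not null unique,
    zstd_id INTEGER,
    data BLOB
);
create table job (
    id INTEGER not null primary key,
    dirac_id INTEGER,
    success BOOLEAN
);
create table data (
    name VARCHAR not null,
    job INTEGER not null references job on delete restrict,
    dict INTEGER not null references dict on delete restrict,
    data BLOB not null,
    primary key (name, job)
);
"""


def build_demo_db():
    """Create an in-memory database with a few made-up rows (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the 'references' clauses
    conn.executescript(SCHEMA)
    # A dictionary row whose data is NULL: logs using it need no dictionary.
    conn.execute("INSERT INTO dict (id, name, zstd_id, data) VALUES (1, 'none', NULL, NULL)")
    conn.execute("INSERT INTO job (id, dirac_id, success) VALUES (42, 1234, 1)")
    conn.executemany(
        "INSERT INTO data (name, job, dict, data) VALUES (?, ?, 1, ?)",
        [("my-log.txt", 42, b"placeholder"), ("another.log", 42, b"placeholder")],
    )
    return conn


def list_logs(conn, job_id):
    """Return the names of the log files stored for one job, sorted by name."""
    rows = conn.execute("SELECT name FROM data WHERE job = ? ORDER BY name", (job_id,))
    return [name for (name,) in rows]
```

With a real converted database, `sqlite3.connect` would be given the path of the sub-production file instead of `":memory:"`.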
Data can be retrieved using this library, or obtained manually by fetching the compressed data and decompressing it with Zstandard. Decompression requires the dictionary that was used to compress the data.
SELECT data.data, dict.data
FROM data INNER JOIN dict ON data.dict = dict.id
WHERE data.name = 'my-log.txt' AND data.job = 42;
The dictionary data can be NULL, in which case the data can be decompressed directly with any Zstandard library, without a dictionary.
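The manual route can be sketched in Python as follows. This is not the library's own API: `fetch_compressed` and `decompress_log` are hypothetical helper names, and the decompression step assumes the third-party `zstandard` package is installed.

```python
import sqlite3

# Same query as above, with placeholders instead of literal values.
QUERY = """
SELECT data.data, dict.data
FROM data INNER JOIN dict ON data.dict = dict.id
WHERE data.name = ? AND data.job = ?
"""


def fetch_compressed(conn, name, job_id):
    """Return (compressed_log, dictionary_or_None), or None if no such log exists."""
    return conn.execute(QUERY, (name, job_id)).fetchone()


def decompress_log(compressed, dict_data=None):
    """Decompress one log, using a dictionary only when one is stored."""
    # Third-party package (assumption); imported here so that the lookup
    # above depends only on the standard library.
    import zstandard

    if dict_data is None:
        dctx = zstandard.ZstdDecompressor()
    else:
        dctx = zstandard.ZstdDecompressor(
            dict_data=zstandard.ZstdCompressionDict(dict_data)
        )
    # decompressobj() does not need the frame to record its content size.
    return dctx.decompressobj().decompress(compressed)
```

A typical call would be `row = fetch_compressed(conn, "my-log.txt", 42)` followed, when `row` is not `None`, by `decompress_log(*row)`, which returns the raw log bytes.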
For more information on the library usage, see the Examples and API sections.
Documentation
The official documentation can be found here.
Installation
See the Installation section in the documentation.
License
This project is under the GPLv3 license. See the LICENSE for more details.