LHCbDIRAC Logs Converter library documentation

Description

A Python library for migrating LHCbDIRAC log files to the new Zstandard + SQLite format and for managing the log data. The official repository is hosted on GitLab.

Overview

This library takes log files from finished productions, stored as ZIP archives, and converts them to a new, more efficient format based on Zstandard compression and an SQLite database.

The idea is to have a single-file database for each sub-production instead of thousands of ZIP files. These SQLite databases use the following schema:

Schema:

create table dict -- Zstandard compression dictionaries
(
    id INTEGER not null primary key, -- in-database id
    name VARCHAR not null unique,    -- name of the dictionary
    zstd_id INTEGER,                 -- Zstandard dictionary id
    data    BLOB                     -- Zstandard dictionary data
);

create table job
(
    id INTEGER not null primary key, -- the job id (e.g. 00000042)
    dirac_id INTEGER,                -- the job id in DIRAC
    success  BOOLEAN                 -- whether the job was successful or NULL if unknown
);

create table data
(
    name VARCHAR not null,  -- the name of the log file
    job  INTEGER not null references job on delete restrict,  -- the associated job
    dict INTEGER not null references dict on delete restrict, -- the associated dictionary
    data BLOB    not null, -- the compressed log data
    primary key (name, job)
);
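
Because each sub-production is a plain SQLite file, it can be explored with any SQLite client. The following is a minimal sketch using Python's standard `sqlite3` module; the helper names `list_log_names` and `count_jobs_by_status` are illustrative and are not part of this library's API.

```python
import sqlite3


def list_log_names(conn: sqlite3.Connection, job_id: int) -> list[str]:
    """Return the names of all log files stored for one job, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM data WHERE job = ? ORDER BY name", (job_id,)
    )
    return [name for (name,) in rows]


def count_jobs_by_status(conn: sqlite3.Connection) -> dict:
    """Count jobs per `success` value: True, False, or None when unknown."""
    rows = conn.execute("SELECT success, COUNT(*) FROM job GROUP BY success")
    # SQLite stores BOOLEAN values as integers, so convert 0/1 back to bool
    # and keep NULL as None.
    return {
        (None if success is None else bool(success)): count
        for success, count in rows
    }
```

A connection would typically be opened with `sqlite3.connect(path)`, where `path` points to one of the converted databases.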

Data can be retrieved using this library, or obtained manually by reading the compressed data and decompressing it with Zstandard. Decompression requires the dictionary that was used to compress the data.

SELECT data.data, dict.data 
FROM data INNER JOIN dict ON data.dict = dict.id 
WHERE data.name = 'my-log.txt' AND data.job = 42;

The dictionary data can be NULL, in which case the data can be decompressed directly using a Zstandard library, without any dictionary needed.
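
The manual route described above could look roughly like the sketch below. It assumes the third-party `zstandard` package is installed; the function names `fetch_compressed` and `decompress` are hypothetical helpers written for this example, not functions provided by this library.

```python
import sqlite3


def fetch_compressed(conn, name, job_id):
    """Return (compressed_log, dictionary_or_None) for one log file.

    Raises KeyError if no such log file exists for the given job.
    """
    row = conn.execute(
        "SELECT data.data, dict.data "
        "FROM data INNER JOIN dict ON data.dict = dict.id "
        "WHERE data.name = ? AND data.job = ?",
        (name, job_id),
    ).fetchone()
    if row is None:
        raise KeyError((name, job_id))
    return row


def decompress(blob, dict_data=None):
    """Decompress a Zstandard blob, using the dictionary when one is given."""
    import zstandard  # third-party package, assumed to be installed

    if dict_data is None:
        # NULL dictionary: plain Zstandard decompression is enough.
        dctx = zstandard.ZstdDecompressor()
    else:
        dctx = zstandard.ZstdDecompressor(
            dict_data=zstandard.ZstdCompressionDict(dict_data)
        )
    # decompressobj() does not need the content size in the frame header.
    return dctx.decompressobj().decompress(blob)
```

With an open connection `conn`, a log could then be read with `decompress(*fetch_compressed(conn, "my-log.txt", 42))`.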

For more information on library usage, see the Examples and API sections.

Documentation

The official documentation can be found here.

Installation

See the Installation section in the documentation.

License

This project is licensed under the GPLv3. See the LICENSE file for more details.