Simple API
This package provides the Zstandard compressor and decompressor wrapper classes.
Both classes implement a common interface, ZstandardProcessor, and can be used interchangeably. The ZstandardTrainer class is exposed through the same interface but behaves differently: its process method trains and returns a dictionary, and its process_stream method is not implemented.
Classes:
Name | Description |
---|---|
ZstandardCompressor | Zstandard compressor wrapper class. |
ZstandardDecompressor | Zstandard decompressor wrapper class. |
ZstandardProcessor | Zstandard base processor abstract interface. |
ZstandardTrainer | Zstandard trainer wrapper class. |
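To illustrate what "used interchangeably" means in practice, here is a minimal sketch of the pattern. It does not use the package's classes: zlib from the Python standard library stands in for Zstandard, and Processor, Compressor, Decompressor, and run are hypothetical names chosen for this example only.

```python
import zlib
from abc import ABC, abstractmethod


class Processor(ABC):
    """Hypothetical stand-in for a common processor interface."""

    @abstractmethod
    def process(self, data: bytes) -> bytes:
        """Transform one buffer and return the result."""


class Compressor(Processor):
    def process(self, data: bytes) -> bytes:
        # zlib is only a stdlib stand-in for Zstandard here.
        return zlib.compress(data)


class Decompressor(Processor):
    def process(self, data: bytes) -> bytes:
        return zlib.decompress(data)


def run(processor: Processor, payload: bytes) -> bytes:
    """Caller code depends only on the shared interface."""
    return processor.process(payload)


original = b"log line\n" * 100
packed = run(Compressor(), original)
restored = run(Decompressor(), packed)
```

Because run only relies on the shared process method, either wrapper can be passed in without changing the calling code.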
ZstandardCompressor
Bases: ZstandardProcessor[ZstdCompressor]
Wrapper for Zstandard compressor.
Performs data compression using Zstandard, with extra functionality:
- zstd-context management (automatic compressor reinstantiation)
- switchable dictionary
Notes
- not thread-safe
- dictionary switching requires reinstantiation of the internal context processor (slow)
Source code in src/lhcbdirac_log/zstd/processors/compressor.py
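The note on dictionary switching can be illustrated with a minimal sketch of lazy context reinstantiation. It does not reproduce the package's implementation: zlib's preset-dictionary support stands in for Zstandard, and DictSwitchingCompressor, its zdict property, and the rebuilds counter are hypothetical names used only for this example.

```python
import zlib
from typing import Optional


class DictSwitchingCompressor:
    """Hypothetical wrapper that rebuilds its context only when the dictionary changes."""

    def __init__(self, zdict: Optional[bytes] = None) -> None:
        self._zdict = zdict
        self._template = None  # pristine context, built lazily
        self.rebuilds = 0      # number of context instantiations, for illustration

    @property
    def zdict(self) -> Optional[bytes]:
        return self._zdict

    @zdict.setter
    def zdict(self, value: Optional[bytes]) -> None:
        if value != self._zdict:
            self._zdict = value
            self._template = None  # invalidate; the next call pays the rebuild cost

    def process(self, data: bytes) -> bytes:
        if self._template is None:
            self.rebuilds += 1
            if self._zdict is None:
                self._template = zlib.compressobj()
            else:
                self._template = zlib.compressobj(zdict=self._zdict)
        # zlib contexts are single-use, so each call works on a copy of the template.
        context = self._template.copy()
        return context.compress(data) + context.flush()


compressor = DictSwitchingCompressor()
plain = compressor.process(b"job finished\n" * 10)
compressor.process(b"job started\n")   # same dictionary: context reused
compressor.zdict = b"job finished\n"   # switch: context invalidated
with_dict = compressor.process(b"job finished\n" * 10)
```

In this sketch the context is built twice in total: once on first use and once after the dictionary switch. Grouping work by dictionary keeps the number of rebuilds low.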
ZstandardDecompressor
Bases: ZstandardProcessor[ZstdDecompressor]
Wrapper for Zstandard decompressor.
Performs data decompression using Zstandard, with extra functionality:
- zstd-context management (automatic decompressor reinstantiation)
- switchable dictionary
Notes
- not thread-safe
- dictionary switching requires reinstantiation of the internal context processor (slow)
Source code in src/lhcbdirac_log/zstd/processors/decompressor.py
process_stream(instream, outstream, insize=-1)
Process the provided stream.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
instream | BinaryIO | the stream to process | required |
outstream | BinaryIO | the stream to write the processed data to | required |
insize | int | ignored | -1 |
Returns:
Type | Description |
---|---|
tuple[int, int] | a tuple with the number of bytes read and written |
Source code in src/lhcbdirac_log/zstd/processors/decompressor.py
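The shape of a stream-processing call that returns the byte counts can be sketched as follows. This is an illustration under stated assumptions, not the package's code: zlib stands in for Zstandard, and decompress_stream and CHUNK_SIZE are hypothetical names. As in the signature above, insize is accepted but ignored.

```python
import io
import zlib
from typing import BinaryIO

CHUNK_SIZE = 64 * 1024  # hypothetical read size for this sketch


def decompress_stream(instream: BinaryIO, outstream: BinaryIO, insize: int = -1) -> tuple[int, int]:
    """Decompress instream into outstream and return (bytes read, bytes written)."""
    context = zlib.decompressobj()  # stdlib stand-in for a Zstandard context
    read = written = 0
    while True:
        chunk = instream.read(CHUNK_SIZE)
        if not chunk:
            break
        read += len(chunk)
        out = context.decompress(chunk)
        outstream.write(out)
        written += len(out)
    tail = context.flush()  # emit anything still buffered
    outstream.write(tail)
    written += len(tail)
    return read, written


payload = b"2024-01-01 INFO job finished\n" * 1000
compressed = zlib.compress(payload)
source, sink = io.BytesIO(compressed), io.BytesIO()
n_read, n_written = decompress_stream(source, sink)
```

Reading in fixed-size chunks keeps memory use bounded regardless of the input size, which is why the stream size does not need to be known in advance.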
ZstandardProcessor
Bases: ABC
Zstandard processor abstract base class.
Provides a common interface for the Zstandard compressor and decompressor wrappers, and for the trainer.
Source code in src/lhcbdirac_log/zstd/processors/processor.py
dict: DictEntry | None
property
writable
Get the dictionary entry.
Returns:
Type | Description |
---|---|
DictEntry \| None | the dictionary entry or None if no dictionary is set |
dict_name: str | None
property
Get the dictionary name of the processor or None.
Returns:
Type | Description |
---|---|
str \| None | the dictionary name or None if no dictionary is set |
processor: T
property
Get the internal processor.
Returns:
Type | Description |
---|---|
T | the internal processor |
__init__(config=DEFAULT_CONFIG, dict_entry=None)
Initialize the processor.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
config | Config | the configuration to use for the processing (default is DEFAULT_CONFIG) | DEFAULT_CONFIG |
dict_entry | DictEntry \| None | the dictionary entry to use (can be set/changed later) | None |
Source code in src/lhcbdirac_log/zstd/processors/processor.py
process(*data)
abstractmethod
Process the provided data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
*data | bytes \| bytearray \| memoryview | the data to process | () |
Returns:
Type | Description |
---|---|
bytes \| Iterator[bytes] | a bytes object (if single output) or a bytes iterator |
Notes
- empty data returns an empty bytes object
Source code in src/lhcbdirac_log/zstd/processors/processor.py
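The two return shapes of process and the empty-data rule can be sketched as below. How the real wrappers choose between a single bytes object and an iterator is not stated here; this sketch assumes the choice depends on the number of inputs. zlib stands in for Zstandard, and the process and _compress_one functions are hypothetical.

```python
import zlib
from typing import Iterator, Union

Buffer = Union[bytes, bytearray, memoryview]


def _compress_one(item: Buffer) -> bytes:
    # Empty input maps to empty output, mirroring the documented note.
    if len(item) == 0:
        return b""
    return zlib.compress(item)  # stdlib stand-in for Zstandard


def process(*data: Buffer) -> Union[bytes, Iterator[bytes]]:
    """One input yields a bytes object; several inputs yield an iterator (assumed rule)."""
    if len(data) == 1:
        return _compress_one(data[0])
    return (_compress_one(item) for item in data)


single = process(b"abc")
many = list(process(b"abc", bytearray(b"def"), b""))
empty = process(b"")
```

Callers that accept both shapes typically check isinstance(result, bytes) before iterating, since iterating over a bytes object would yield integers rather than buffers.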
process_stream(instream, outstream, insize=-1)
abstractmethod
Process the provided stream.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
instream | BinaryIO | the stream to process | required |
outstream | BinaryIO | the stream to write the processed data to | required |
insize | int | the size of the input stream, -1 for unknown | -1 |
Returns:
Type | Description |
---|---|
tuple[int, int] | a tuple with the number of bytes read and written |
Source code in src/lhcbdirac_log/zstd/processors/processor.py
ZstandardTrainer
Bases: ZstandardProcessor[ZstandardProcessor]
Wrapper for Zstandard trainer.
Performs dictionary training from data, using Zstandard.
Notes
- not thread-safe
Source code in src/lhcbdirac_log/zstd/processors/trainer.py
process(*data, dict_id=0, dict_size=0)
Train a dictionary from the provided data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
*data | bytes \| bytearray \| memoryview | the data to use for training | () |
dict_id | int | the zstd dictionary id to use (default is 0 = auto) | 0 |
dict_size | int | the final size of the dictionary to train (default is 0 = auto) | 0 |
Returns:
Type | Description |
---|---|
bytes | the trained dictionary data |
Raises:
Type | Description |
---|---|
ValueError | if the dataset is invalid (for example, empty or too small, or containing only oversized samples) |
ZstdError | if the training fails |
Source code in src/lhcbdirac_log/zstd/processors/trainer.py
process_stream(instream, outstream, insize=-1)
Process the provided stream. Not implemented.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
instream | BinaryIO | ignored | required |
outstream | BinaryIO | ignored | required |
insize | int | ignored | -1 |
Returns:
Type | Description |
---|---|
tuple[int, int] | a tuple with the number of bytes read and written |
Raises:
Type | Description |
---|---|
NotImplementedError | always |
Source code in src/lhcbdirac_log/zstd/processors/trainer.py
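Since process_stream always raises here, code that accepts any processor may want to guard the call. The following is a minimal sketch of one way to do that; TrainerLike and try_process_stream are hypothetical names and are not part of the package.

```python
import io
from typing import BinaryIO, Optional


class TrainerLike:
    """Hypothetical stand-in for a processor without stream support."""

    def process_stream(self, instream: BinaryIO, outstream: BinaryIO, insize: int = -1) -> tuple[int, int]:
        raise NotImplementedError("stream processing is not supported")


def try_process_stream(processor, instream: BinaryIO, outstream: BinaryIO) -> Optional[tuple[int, int]]:
    """Return (bytes read, bytes written), or None if the processor cannot stream."""
    try:
        return processor.process_stream(instream, outstream)
    except NotImplementedError:
        return None


result = try_process_stream(TrainerLike(), io.BytesIO(b"data"), io.BytesIO())
```

Returning None is only one possible policy; re-raising with a clearer message or checking the processor type up front would work equally well.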
train(dict_provider, *data, dict_size=0)
Train a dictionary from the sample data.
The newly trained dictionary is added to the dictionary provider and is available through the dict property.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dict_provider | DictProvider | the destination dictionary provider | required |
*data | DataEntry | the data to use for training | () |
dict_size | int | the final size of the dictionary to train (0: auto) | 0 |
Returns:
Type | Description |
---|---|
bytes | the trained dictionary data |
Raises:
Type | Description |
---|---|
ValueError | if the dataset is invalid (for example, empty or too small, or containing only oversized samples) |
ZstdError | if the training fails |
Source code in src/lhcbdirac_log/zstd/processors/trainer.py