Pycatzao Documentation
Welcome to the documentation for Pycatzao!
API Overview
Pycatzao is a pure Python library for encoding, decoding, and compressing Asterix CAT240 messages.
The methods exposed by the Pycatzao API are documented below, grouped into three topics. The “Decoder”, for decoding existing byte streams of Asterix CAT240 messages, is exposed as:
pycatzao.decode
pycatzao.decode_file
The “Encoder”, for compiling new messages, is exposed as:
pycatzao.encode
pycatzao.make_summary
pycatzao.make_video_header
pycatzao.make_video_message
The “Compressor”, for compressing (encoded) messages, is exposed as:
pycatzao.compress
pycatzao.compress_file
Typically, no interaction between methods of different topics is necessary, so their documentation can be studied separately.
Convenient helper functions for inferring the implicit binning scheme or for stacking decoded blocks are exposed as “Utilities”:
pycatzao.infer_bin_edges
pycatzao.join_blocks
Example
import itertools

import pandas as pd
import pycatzao

buffer_size = 100_000
join_n_blocks = 1_000_000

# compress data
data_file = "my-data.cat240"
with open("compressed.cat240", "wb") as f:
    for msg in pycatzao.compress_file(data_file, buffer_size=buffer_size):
        f.write(msg)

# convert to csv (itertools.batched requires Python 3.12 or newer)
for i, blocks in enumerate(
    itertools.batched(
        pycatzao.decode_file("compressed.cat240", buffer_size=buffer_size),
        join_n_blocks,
    )
):
    pd.DataFrame(pycatzao.join_blocks(blocks)).to_csv(
        f"csv-{i + 1}.csv.gz", header=True, index=False, float_format="%.2f"
    )
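The example above relies on itertools.batched, which is only available from Python 3.12 onward. On older interpreters a small stand-in can be used. The helper below, batched_compat, is a hypothetical name that is not part of Pycatzao; it is a minimal sketch built on itertools.islice.

```python
import itertools


def batched_compat(iterable, n):
    """Yield tuples of up to n items, mimicking itertools.batched (Python 3.12+)."""
    if n < 1:
        raise ValueError("n must be at least one")
    it = iter(iterable)
    # islice returns an empty tuple once the iterator is exhausted, which ends the loop
    while batch := tuple(itertools.islice(it, n)):
        yield batch


chunks = list(batched_compat(range(7), 3))
print(chunks)  # [(0, 1, 2), (3, 4, 5), (6,)]
```

Because the helper consumes its input lazily, it can wrap a generator of decoded blocks without loading all of them at once.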
CAT240 Decoder
Decoding of Asterix CAT240 messages.
Use decode() for decoding a byte sequence of encoded Asterix CAT240 messages. This function returns the trailing bytes that were not decoded, so loops are easy to implement without splitting CAT240 messages beforehand. Below, we show a simple example of how to decode data from a file in chunks. (Note that this is just a dull example to demonstrate how to use the returned state; check out decode_file() if you actually want to decode a file!)
Example
with open("my-cat240-data.bin", "rb") as f:
    state = b""
    while (chunk := f.read(1024)) != b"":
        blocks, state = pycatzao.decode(state + chunk)
        for block in blocks:
            process_blocks(block)  # do something meaningful with the data
Here, a decoded block is a dict that contains the payload of the respective CAT240 message, i.e.,
summary: ASCII string to deliver stream meta data and/or labels
for blocks of message type 001 (Video Summary message), and
idx: Message sequence identifier
az: The azimuth in degrees. This value is calculated as the circular mean of START_AZ and END_AZ.
az_cell_size: The azimuthal cell size in degrees. This value is calculated as the difference of START_AZ and END_AZ. Note that both values are often set to non-meaningful values and it is very easy to confuse their range with the sensor resolution. (In contrast, their circular mean, az, is less ambiguous and can be used safely for downstream tasks.)
r: Range of the data points in meters
r_cell_size: The radial cell size in meters. Similar to az_cell_size, don’t confuse this value with the radial sensor resolution which can be (significantly) larger.
amp: Video signal amplitude (aka the “data points”)
for message type 002 (Video message), as well as
sac: System Area Code (SAC)
sic: System Identification Code (SIC)
type: Type of the message, 1 (Video summary message) or 2 (Video message)
tod (if present): Time of Day in seconds (We literally just decode the encoded value. It is beyond the scope of this library to interpret this value, e.g., as an absolute UTC timestamp.)
in both messages.
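The az field is described above as the circular mean of START_AZ and END_AZ. The following is a minimal sketch of a circular mean for two angles in degrees; the function name circular_mean_deg is hypothetical, and the code illustrates the general technique rather than Pycatzao's actual implementation.

```python
import math


def circular_mean_deg(start_az, end_az):
    """Circular mean of two azimuths in degrees, mapped to [0, 360).

    The result is undefined for exactly opposite directions (e.g. 0 and 180).
    """
    a, b = math.radians(start_az), math.radians(end_az)
    # Average the unit vectors instead of the raw angles
    x = math.cos(a) + math.cos(b)
    y = math.sin(a) + math.sin(b)
    return math.degrees(math.atan2(y, x)) % 360.0


naive = (350.0 + 20.0) / 2  # 185.0, points in the opposite direction
circular = circular_mean_deg(350.0, 20.0)  # close to 5.0
```

The difference matters for sectors that straddle north: the arithmetic mean of 350 and 20 degrees points south, while the circular mean stays inside the sector.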
- pycatzao.decoder.decode(data)
Decode CAT240 data.
This function decodes a given binary blob of encoded Asterix CAT240 messages (type 001 or 002). The data have to start with a new CAT240 message but can end in the middle of one. The bytes of the incomplete message at the end of the input data are returned, so chunking a long byte sequence by subsequent calls to decode() becomes trivial.
Example
>>> data = b'\xf0\x00\x13\xd1...'
>>> blocks, state = pycatzao.decode(data[:128])
>>> blocks
[{'sac': 7, 'sic': 42, 'type': 1, 'summary': ...},
 {'sac': 7, 'sic': 42, 'type': 2, 'idx': 4711, 'az': 5.9, ...},
 {'sac': 7, 'sic': 42, 'type': 2, 'idx': 4712, 'az': 6.4, ...}]
>>> blocks, state = pycatzao.decode(state + data[128:])
>>> blocks
[{'sac': 7, 'sic': 42, 'type': 2, 'idx': 4713, 'az': 6.9, ...}]
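To illustrate why returning the trailing bytes makes chunked processing simple, here is a minimal sketch that only splits a byte stream into complete messages. It assumes the usual Asterix framing of a one-byte category followed by a two-byte big-endian length that covers the whole data block. The name split_messages is hypothetical, and this is not how Pycatzao is implemented; it performs no decoding at all.

```python
def split_messages(data, category=0xF0):
    """Split a byte string into complete Asterix data blocks.

    Returns the complete blocks and the trailing bytes that do not (yet)
    form a complete block.
    """
    messages = []
    pos = 0
    while len(data) - pos >= 3:  # need CAT (1 byte) + LEN (2 bytes)
        if data[pos] != category:
            raise ValueError(f"unexpected category {data[pos]} at offset {pos}")
        length = int.from_bytes(data[pos + 1 : pos + 3], "big")
        if length < 3:
            raise ValueError(f"invalid length {length} at offset {pos}")
        if pos + length > len(data):
            break  # incomplete message; keep it for the next call
        messages.append(data[pos : pos + length])
        pos += length
    return messages, data[pos:]


# Two dummy blocks (header plus zero padding, not valid CAT240 payloads)
msg_a = bytes([0xF0, 0x00, 0x05, 0x00, 0x00])
msg_b = bytes([0xF0, 0x00, 0x04, 0x00])
stream = msg_a + msg_b

first, state = split_messages(stream[:7])  # cut in the middle of msg_b
second, state = split_messages(state + stream[7:])
```

The caller never needs to know where message boundaries fall: prepending the returned state to the next chunk is enough.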
- pycatzao.decoder.decode_file(file_name, *, size=-1, buffer_size=-1)
Decode a CAT240 file.
This is a helper function that reads and decodes a file with (binary) Asterix CAT240 data.
- Parameters:
file_name (str | pathlib.Path) – Name of the file.
size (int) – Maximum number of bytes to read from the file. If negative, the entire file will be read.
buffer_size (int) – Process file in chunks of this size. If negative, the entire file will be loaded to RAM at once.
- Returns:
Decoded messages (the type of the generated items is the same as the first return value of decode()).
- Return type:
CAT240 Encoder
Encoding of Asterix CAT240 messages.
Use encode() to compile new Asterix CAT240 messages and feed it with the output of either make_summary() or make_video_message() for creating a type 001 or 002 message, respectively. The output is a byte sequence of a single CAT240 message that can directly be concatenated with others:
Example
import numpy as np
import pycatzao

payloads = [
    pycatzao.make_summary(summary="FoObaR"),
    pycatzao.make_video_message(
        np.array([255, 0, 255], dtype=np.uint8),  # signal at range = [9m, 12m, 15m]
        msg_index=1,
        header=pycatzao.make_video_header(
            start_az=45.0,
            end_az=45.0,
            cell_offset=9,  # (center of) first cell is at 9m
            cell_width=3,  # cell width is 3m
        ),
    ),
]
tod = ...  # get ToD in seconds
blocks = [pycatzao.encode(msg, sac=7, sic=42, tod=tod) for msg in payloads]
print(b"".join(blocks).hex())  # outputs: f00012d108072a01...
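The example leaves open how the Time of Day value is obtained. One possible way, sketched below, is to take the number of seconds elapsed since the most recent UTC midnight. This convention is an assumption made for illustration; this documentation does not state which reference encode() expects, and the helper name seconds_since_utc_midnight is hypothetical.

```python
from datetime import datetime, timezone


def seconds_since_utc_midnight(now=None):
    """Seconds elapsed since the most recent UTC midnight (assumed ToD convention)."""
    if now is None:
        now = datetime.now(timezone.utc)
    now = now.astimezone(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return (now - midnight).total_seconds()


example = datetime(2024, 1, 1, 12, 30, 15, tzinfo=timezone.utc)
tod = seconds_since_utc_midnight(example)
print(tod)  # 45015.0
```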
- pycatzao.encoder.encode(payload, *, sac, sic, tod=-1)
Compiles data into a CAT240 message.
Takes a payload generated by either make_summary() or make_video_message() and wraps it as an Asterix CAT240 message.
- pycatzao.encoder.make_summary(summary)
Create Video Summary.
Creates a type 001 message (Video Summary message) to be passed as the payload to encode().
- Parameters:
summary (str) – (ASCII) string to deliver stream meta data and/or labels.
- Returns:
Payload for encode().
- Return type:
- Raises:
ValueError – The ASCII representation of the summary must not be longer than 255 characters.
- pycatzao.encoder.make_video_header(*, start_az, end_az, cell_offset, cell_width)
Create Video Header.
Creates a video header (I240/040 or I240/041) for make_video_message().
- Parameters:
start_az (float) – Start azimuth in degrees.
end_az (float) – End azimuth in degrees.
cell_offset (float) – Offset of first cell in meters.
cell_width (float) – Cell width in meters. If this value is larger than 500m, the header type is I240/040 (Video Header Nano) and I240/041 (Video Header Femto) otherwise.
- Returns:
The header for make_video_message().
- Return type:
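The choice between the Nano and Femto header can be motivated with a small calculation. The sketch below assumes that the header stores the cell duration as the two-way travel time of the radar pulse across one cell, in a 32-bit unsigned field, in nanoseconds (Nano) or femtoseconds (Femto). These are assumptions made for illustration, not statements about Pycatzao's internals, and all names are hypothetical.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s
UINT32_MAX = 0xFFFFFFFF  # assumed width of the duration field


def cell_duration_seconds(cell_width_m):
    """Two-way travel time of a pulse across one range cell."""
    return 2.0 * cell_width_m / SPEED_OF_LIGHT


def fits_uint32(duration_s, unit_s):
    """Check whether the duration, expressed in the given unit, fits in 32 bits."""
    return round(duration_s / unit_s) <= UINT32_MAX


for width in (3.0, 1000.0):
    duration = cell_duration_seconds(width)
    print(width, fits_uint32(duration, 1e-15), fits_uint32(duration, 1e-9))
```

Under these assumptions, a 3 m cell can be expressed in femtoseconds, whereas a 1000 m cell overflows the field and needs the coarser nanosecond unit. This is consistent with switching header types at a cell width of a few hundred meters.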
- pycatzao.encoder.make_video_message(amp, *, msg_index, header, compress=True)
Create Video Message.
Creates a video message (I240/050, I240/051 or I240/052) for encode() from a video header.
- Parameters:
amp (np.ndarray) – Amplitude array given as an np.ndarray with dtype set to either np.uint8, np.uint16, or np.uint32.
msg_index (int) – Message Sequence Identifier (video record cyclic counter).
header – Header generated by make_video_header().
compress (bool) – Compress the amplitude array before compiling the message. Note that this is typically cheap yet very effective and should be enabled unless you know better.
- Returns:
Payload for encode().
- Return type:
CAT240 Compressor
Compression of Asterix CAT240 messages.
Use compress() to compress encoded Asterix CAT240 messages or compress_file() to compress an entire file. These operations do not change type 001 messages or already compressed messages of type 002.
Examples
import pycatzao
from tqdm import tqdm

# generate uncompressed messages
blocks = [
    pycatzao.encode(
        pycatzao.make_video_message(..., compress=False),
        ...
    ),
]

# compress encoded messages (compress() is documented as taking bytes, so join first)
compressed, _ = pycatzao.compress(b"".join(blocks))

# compress an uncompressed CAT240 file
with open("compressed.cat240", "wb") as f:
    for msg in tqdm(
        pycatzao.compress_file("uncompressed.cat240", buffer_size=100_000),
        desc="Compressing data",
    ):
        f.write(msg)
- pycatzao.compress.compress(data)
Compress Asterix CAT240 data.
Compresses non-compressed Asterix CAT240 type 002 messages. Other messages are returned unchanged. The data have to start with a new CAT240 message but can end in the middle of one. This requirement and the return type are identical to those of pycatzao.decoder.decode().
- Parameters:
data (bytes) – Encoded Asterix CAT240 messages.
- Returns:
Compressed messages and state (see pycatzao.decoder.decode() for details on how to use the latter).
- Return type:
- pycatzao.compress.compress_file(file_name, *, buffer_size=-1)
Compress an Asterix CAT240 file.
This is a helper function that compresses a file with (binary) Asterix CAT240 data.
- Parameters:
file_name (str | pathlib.Path) – Name of the file.
buffer_size (int) – Process file in chunks of this size. If negative, the entire file will be loaded to RAM at once.
- Returns:
Compressed messages.
- Return type:
Utilities
Helper functions for dealing with decoded Asterix CAT240 data.
If you need the decoded data in a more structured, tabular form, use join_blocks(). This function joins (type 002) messages by repeating scalar values and returns a single dict that can be fed directly to, e.g., a pandas.DataFrame.
Typically, the azimuth and range values follow an implicit binning scheme. Sometimes, this scheme can be inferred by parsing a handful of messages. We implement a simple heuristic that tries to infer this scheme in infer_bin_edges(). Note that the azimuth bins are circular and that, due to this periodicity, the contents of the first and last bin might have to be superimposed depending on the context.
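Superimposing the first and last azimuth bin can be sketched as follows. The bin edges here are made up for illustration (10 degree bins centered on multiples of 10, so the first edge is negative and the last exceeds 360); they are not the output of infer_bin_edges(), and circular_histogram is a hypothetical helper.

```python
import bisect


def circular_histogram(az_values, edges):
    """Count azimuths per bin and fold the last bin into the first one."""
    counts = [0] * (len(edges) - 1)
    for az in az_values:
        i = bisect.bisect_right(edges, az) - 1
        if 0 <= i < len(counts):
            counts[i] += 1
    # The first and last bin cover the same direction (around north)
    counts[0] += counts[-1]
    return counts[:-1]


edges = [-5.0 + 10.0 * i for i in range(38)]  # -5, 5, ..., 365
counts = circular_histogram([1.0, 4.0, 357.0, 10.0], edges)
```

With these edges, the azimuths 1, 4, and 357 degrees all end up in the bin centered on north, although 357 initially falls into the last bin.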
- pycatzao.utils.infer_bin_edges(data)
Infers binning scheme from CAT240 data.
This function infers the range and azimuthal binning scheme of the given CAT240 data using simple heuristics. The result can directly be fed into numpy.arange() and numpy.linspace() to get the bin edges in meters and degrees, respectively. The azimuth bins are assumed to be cyclic, whereas for the range bins no upper limit (aka stop) is inferred; this value has to be set manually:
Example
>>> data = b'\xf0\x00\x13\xd1...'
>>> bins, _ = pycatzao.infer_bin_edges(data)
>>> r_edges = np.arange(**bins["r"], stop=3_000)
>>> az_edges = np.linspace(**bins["az"])
Note that due to the monotonicity of numpy.arange() and the periodicity of the azimuth bins, the first bin edge is negative and the last is larger than 360. Depending on the context, one might have to superimpose the corresponding bin entries!
- Parameters:
data (bytes) – Encoded CAT240 messages as a binary blob that starts with a new message.
- Returns:
The first return value is a dictionary with the inferred binning scheme for range (key “r”) and azimuth (key “az”). If an insufficient number of bytes was provided, None and the full input data are returned. The second return value carries the state of the decoder and should be prepended to the input data for the subsequent call to infer_bin_edges(). Note that the inferred binning schemes of previous calls are not taken into account for the current inference.
- Return type:
- pycatzao.utils.join_blocks(blocks, *, show_progress=False)
Join decoded CAT240 blocks into columns.
Joins the fields tod, az, r, and amp of decoded Asterix CAT240 blocks into a rectangular table by repeating the scalar values (tod and az). The result can directly be fed to, e.g., pandas.DataFrame. Messages of type 001 are skipped.
Example
>>> data = b'\xf0\x00\x13\xd1...'
>>> blocks, _ = pycatzao.decode(data)
>>> pycatzao.join_blocks(blocks)
{'tod': array([100.0, 100.0, 100.1, 100.1, 100.1, 100.2, ...], dtype=float32),
 'az': array([ 5.9, 5.9, 6.4, 6.4, 6.4, 6.9, ...], dtype=float32),
 'r': array([ 94.2, 94.2, 89.2, 89.2, 89.2, 94.2, ...], dtype=float32),
 'amp': array([248, 250, 127, 125, 130, 255, ...], dtype=uint8)}
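The idea of repeating scalar values to obtain rectangular columns can be sketched in plain Python. The function below is a simplified stand-in with a hypothetical name (join_blocks_sketch) that works on lists instead of NumPy arrays; the sample blocks are made up, and the code is not Pycatzao's implementation.

```python
def join_blocks_sketch(blocks):
    """Flatten decoded type 002 blocks into equally long columns."""
    table = {"tod": [], "az": [], "r": [], "amp": []}
    for block in blocks:
        if block["type"] != 2:
            continue  # type 001 (summary) messages carry no video data
        n = len(block["amp"])
        # Scalars are repeated once per data point of the block
        table["tod"].extend([block.get("tod")] * n)
        table["az"].extend([block["az"]] * n)
        table["r"].extend(block["r"])
        table["amp"].extend(block["amp"])
    return table


# Made-up blocks that mimic the documented dict layout
sample = [
    {"type": 1, "summary": "demo"},
    {"type": 2, "tod": 100.0, "az": 5.9, "r": [9.0, 12.0], "amp": [248, 250]},
    {"type": 2, "tod": 100.1, "az": 6.4, "r": [9.0, 12.0, 15.0], "amp": [127, 125, 130]},
]
table = join_blocks_sketch(sample)
```

Every column of the result has one entry per data point, which is the shape a table constructor such as pandas.DataFrame expects.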