Pycatzao Documentation

Welcome to the documentation for Pycatzao!

API Overview

Pycatzao is a pure Python library for encoding, decoding and compressing Asterix CAT240 messages.

The methods exposed by the Pycatzao API are documented below, grouped under the topics “Decoder”, exposed as:

  • pycatzao.decode

  • pycatzao.decode_file

for decoding already existing bytestreams of Asterix CAT240 messages, “Encoder”, exposed as:

  • pycatzao.encode

  • pycatzao.make_summary

  • pycatzao.make_video_header

  • pycatzao.make_video_message

for compiling new messages, and “Compressor”, exposed as:

  • pycatzao.compress

  • pycatzao.compress_file

for compressing (encoded) messages.

Typically, no interaction between methods of different topics is necessary, so their documentation can safely be studied separately.

Convenient helper functions for inferring the implicit binning scheme or for stacking decoded blocks are exposed as “Utilities”:

  • pycatzao.infer_bin_edges

  • pycatzao.join_blocks

Example

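import itertools  # itertools.batched requires Python 3.12 or newer

import pandas as pd
import pycatzao

buffer_size = 100_000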
join_n_blocks = 1_000_000

# compress data
data_file = "my-data.cat240"
with open("compressed.cat240", "wb") as f:
    for msg in pycatzao.compress_file(data_file, buffer_size=buffer_size):
        f.write(msg)

# convert to csv
for i, blocks in enumerate(
    itertools.batched(
        pycatzao.decode_file("compressed.cat240", buffer_size=buffer_size),
        join_n_blocks,
    )
):
    pd.DataFrame(pycatzao.join_blocks(blocks)).to_csv(
        f"csv-{i + 1}.csv.gz", header=True, index=False, float_format="%.2f"
    )
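
The example above relies on itertools.batched(), which only exists in Python 3.12 and newer. On older interpreters, a small stand-in can be sketched as follows. The name batched_fallback is ours and is not part of Pycatzao:

```python
import itertools


def batched_fallback(iterable, n):
    """Yield tuples of up to n items, like itertools.batched() in Python 3.12+."""
    if n < 1:
        raise ValueError("n must be at least one")
    it = iter(iterable)
    # islice() pulls at most n items; an empty tuple ends the loop
    while batch := tuple(itertools.islice(it, n)):
        yield batch
```

It can be used as a drop-in replacement in the loop above, e.g., batched_fallback(pycatzao.decode_file(...), join_n_blocks).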

CAT240 Decoder

Decoding of Asterix CAT240 messages.

Use decode() to decode a byte sequence of encoded Asterix CAT240 messages. The function returns the trailing bytes that were not decoded, so loops are easy to implement without splitting CAT240 messages beforehand. The simple example below shows how to decode data from a file in chunks. (Note that it only demonstrates how to use the returned state; check out decode_file() if you actually want to decode a file!)

Example

with open("my-cat240-data.bin", "rb") as f:
    state = b""
    while (chunk := f.read(1024)) != b"":
        blocks, state = pycatzao.decode(state + chunk)
        for block in blocks:
            process_block(block)  # do something meaningful with the data

Here, a decoded block is a dict that contains the payload of the respective CAT240 message, i.e.,

  • summary: ASCII string to deliver stream meta data and/or labels

for blocks of message type 001 (Video Summary message), and

  • idx: Message sequence identifier

  • az: The azimuth in degrees. This value is calculated as the circular mean of START_AZ and END_AZ.

  • az_cell_size: The azimuthal cell size in degrees. This value is calculated as the difference of START_AZ and END_AZ. Note that both values are often set to non-meaningful values and it is very easy to confuse their range with the sensor resolution. (In contrast, their circular mean, az, is less ambiguous and can be used safely for downstream tasks.)

  • r: Range of the data points in meters

  • r_cell_size: The radial cell size in meters. Similar to az_cell_size, don’t confuse this value with the radial sensor resolution which can be (significantly) larger.

  • amp: Video signal amplitude (aka the “data points”)

for message type 002 (Video message), as well as

  • sac: System Area Code (SAC)

  • sic: System Identification Code (SIC)

  • type: Type of the message, 1 (Video summary message) or 2 (Video message)

  • tod (if present): Time of Day in seconds. (The encoded value is decoded as is. Interpreting it, e.g., as an absolute UTC time stamp, is beyond the scope of this library.)

in both messages.
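
To illustrate why az is described as a circular mean rather than an arithmetic one, here is a minimal sketch in plain Python. The function names are ours, and the exact formulas and wrapping conventions used inside Pycatzao are an assumption:

```python
import math


def circular_mean_deg(start_az, end_az):
    """Circular mean of two azimuths in degrees, wrapped to [0, 360)."""
    s = math.sin(math.radians(start_az)) + math.sin(math.radians(end_az))
    c = math.cos(math.radians(start_az)) + math.cos(math.radians(end_az))
    az = math.degrees(math.atan2(s, c)) % 360.0
    # tiny negative angles can round up to exactly 360.0 after the modulo
    return 0.0 if az >= 360.0 else az


def az_span_deg(start_az, end_az):
    """Span from start_az to end_az in degrees, wrapped to [0, 360)."""
    return (end_az - start_az) % 360.0
```

For START_AZ = 350 and END_AZ = 10, the arithmetic mean would be 180 degrees, i.e., pointing in the opposite direction, whereas the circular mean lies at north (0 degrees, up to rounding) and the span is 20 degrees.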

pycatzao.decoder.decode(data)

Decode CAT240 data.

This function decodes a given binary blob of encoded Asterix CAT240 messages (type 001 or 002). The data have to start with a new CAT240 message but can end in the middle of one. The bytes of the incomplete message at the end of the input data are returned, so that chunking a long byte sequence into subsequent calls to decode() becomes trivial.

Example

>>> data = b'\xf0\x00\x13\xd1...'
>>> blocks, state = pycatzao.decode(data[:128])
>>> blocks
[{'sac': 7, 'sic': 42, 'type': 1, 'summary': ...},
 {'sac': 7, 'sic': 42, 'type': 2, 'idx': 4711, 'az': 5.9, ...},
 {'sac': 7, 'sic': 42, 'type': 2, 'idx': 4712, 'az': 6.4, ...}]
>>> blocks, state = pycatzao.decode(state + data[128:])
>>> blocks
[{'sac': 7, 'sic': 42, 'type': 2, 'idx': 4713, 'az': 6.9, ...}]

Parameters:

data (bytes) – Encoded CAT240 messages as a binary blob that starts with a new message.

Returns:

Decoded messages and trailing bytes. The latter carry the state of the decoder and should be prepended to the input data for the subsequent call to decode().

Return type:

tuple[list[dict], bytes]
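
The trailing-bytes contract can be illustrated without the library. The sketch below assumes the standard ASTERIX framing (one category byte followed by a two-byte big-endian length that counts the whole data block) and only splits the stream; it is not the decoder used by Pycatzao, and the function name is ours:

```python
def split_complete_messages(data):
    """Split a byte string into complete ASTERIX data blocks and a remainder.

    Assumes the standard framing: 1 byte category, then a 2-byte big-endian
    length that counts the whole block including these 3 header bytes.
    """
    messages = []
    pos = 0
    while len(data) - pos >= 3:
        length = int.from_bytes(data[pos + 1 : pos + 3], "big")
        if length < 3:
            raise ValueError("invalid block length")
        if pos + length > len(data):
            break  # incomplete message: keep it for the next call
        messages.append(data[pos : pos + length])
        pos += length
    return messages, data[pos:]
```

As with decode(), the second return value is prepended to the next chunk, so a message that was cut in half is completed by the following call.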

pycatzao.decoder.decode_file(file_name, *, size=-1, buffer_size=-1)

Decode a CAT240 file.

This is a helper function that reads and decodes a file with (binary) Asterix CAT240 data.

Parameters:
  • file_name (str | pathlib.Path) – Name of the file.

  • size (int) – Maximum number of bytes to read from the file. If negative, the entire file will be read.

  • buffer_size (int) – Process file in chunks of this size. If negative, the entire file will be loaded to RAM at once.

Returns:

Decoded messages. (Each generated item has the same type as the items in the first return value of decode().)

Return type:

Generator[dict, None, None]

CAT240 Encoder

Encoding of Asterix CAT240 messages.

Use encode() to compile new Asterix CAT240 messages. Feed it the output of make_summary() or make_video_message() to create a type 001 or 002 message, respectively. The output is the byte sequence of a single CAT240 message that can be concatenated directly with others:

Example

import numpy as np
import pycatzao

payloads = [
    pycatzao.make_summary(summary="FoObaR"),
    pycatzao.make_video_message(
        np.array([255, 0, 255], dtype=np.uint8),  # signal at range = [9m, 12m, 15m]
        msg_index=1,
        header=pycatzao.make_video_header(
            start_az=45.0,
            end_az=45.0,
            cell_offset=9,  # (center of) first cell is at 9m
            cell_width=3,   # cell width is 3m
        )
    ),
]

tod = ...  # get ToD in seconds
blocks = [pycatzao.encode(msg, sac=7, sic=42, tod=tod) for msg in payloads]
print(b"".join(blocks).hex())  # outputs: f00012d108072a01...

pycatzao.encoder.encode(payload, *, sac, sic, tod=-1)

Compiles data into a CAT240 message.

Takes a payload generated by either make_summary() or make_video_message() and wraps it as an Asterix CAT240 message.

Parameters:
  • payload – The payload to encode.

  • sac (int) – System Area Code (SAC)

  • sic (int) – System Identification Code (SIC)

  • tod (float) – Time of Day (ToD) in seconds. If negative, no ToD is included in the compiled message.

Returns:

A single Asterix CAT240 message.

Return type:

bytes
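
One possible way to obtain a value for tod is sketched below. It assumes that ToD follows the usual ASTERIX convention of seconds elapsed since the last UTC midnight; this documentation only states that tod is given in seconds, so treat the convention and the helper name as assumptions:

```python
from datetime import datetime, timezone


def seconds_since_utc_midnight(now=None):
    """Return the time elapsed since the last UTC midnight in seconds."""
    # naive datetimes are interpreted as local time by astimezone()
    now = datetime.now(timezone.utc) if now is None else now.astimezone(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return (now - midnight).total_seconds()
```

The result is a non-negative float and could be passed as tod=seconds_since_utc_midnight() to encode().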

pycatzao.encoder.make_summary(summary)

Create Video Summary.

Creates a type 001 message (Video Summary message) to be passed as the payload to encode().

Parameters:

summary (str) – (ASCII) string to deliver stream meta data and/or labels.

Returns:

Payload for encode().

Return type:

Any

Raises:

ValueError – The ASCII representation of the summary must not be longer than 255 characters.
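
If you want to validate a summary before handing it to make_summary(), a check along the following lines might help. This is a sketch of the documented length limit, not the library's own code; the helper name is hypothetical, and how Pycatzao treats non-ASCII characters is an assumption:

```python
def check_summary(summary):
    """Return the ASCII bytes of summary or raise ValueError if unsuitable."""
    try:
        raw = summary.encode("ascii")
    except UnicodeEncodeError as exc:
        # assumption: non-ASCII input is rejected rather than replaced
        raise ValueError("summary must contain ASCII characters only") from exc
    if len(raw) > 255:
        raise ValueError("summary must not be longer than 255 characters")
    return raw
```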

pycatzao.encoder.make_video_header(*, start_az, end_az, cell_offset, cell_width)

Create Video Header.

Creates a video header (I240/040 or I240/041) for make_video_message().

Parameters:
  • start_az (float) – Start azimuth in degrees.

  • end_az (float) – End azimuth in degrees.

  • cell_offset (float) – Offset of first cell in meters.

  • cell_width (float) – Cell width in meters. If this value is larger than 500m, the header type is I240/040 (Video Header Nano); otherwise, it is I240/041 (Video Header Femto).

Returns:

The header for make_video_message().

Return type:

Any
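
The 500m rule for cell_width can be written down in a few lines. In the CAT240 specification, the two header items store the cell duration in nanoseconds and femtoseconds, respectively. The sketch below selects the item as documented above and converts a cell width to a two-way travel time; whether Pycatzao performs the conversion exactly like this is an assumption, and both function names are ours:

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def header_item(cell_width):
    """Pick the header item according to the documented 500 m rule."""
    return "I240/040" if cell_width > 500.0 else "I240/041"


def cell_duration(cell_width):
    """Two-way travel time in seconds for a range cell of the given width."""
    return 2.0 * cell_width / SPEED_OF_LIGHT
```

For the 3m cells of the encoder example, this yields I240/041 and a cell duration of roughly 20 nanoseconds.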

pycatzao.encoder.make_video_message(amp, *, msg_index, header, compress=True)

Create Video Message.

Creates a video message (I240/050, I240/051 or I240/052) for encode() from a video header.

Parameters:
  • amp (np.ndarray) – Amplitude array given as an np.ndarray with dtype set to either np.uint8, np.uint16, or np.uint32.

  • msg_index (int) – Message Sequence Identifier (video record cyclic counter).

  • header – Header generated by make_video_header().

  • compress (bool) – Compress the amplitude array before compiling the message. Note that this is typically cheap yet very effective and should be enabled unless you know better.

Returns:

Payload for encode().

Return type:

Any

CAT240 Compressor

Compression of Asterix CAT240 messages.

Use compress() to compress encoded Asterix CAT240 messages or compress_file() to compress an entire file. These operations do not change type 001 messages or already compressed messages of type 002.

Examples

import pycatzao
from tqdm import tqdm

# generate uncompressed messages
blocks = [
    pycatzao.encode(
        pycatzao.make_video_message(..., compress=False),
        ...
    ),
]

# compress encoded messages (compress() expects a single bytes object)
compressed, _ = pycatzao.compress(b"".join(blocks))

# compress an uncompressed CAT240 file
with open("compressed.cat240", "wb") as f:
    for msg in tqdm(
        pycatzao.compress_file("uncompressed.cat240", buffer_size=100_000),
        desc="Compressing data",
    ):
        f.write(msg)

pycatzao.compress.compress(data)

Compress Asterix CAT240 data.

Compresses non-compressed Asterix CAT240 type 002 messages. Other messages are returned unchanged. The data have to start with a new CAT240 message but can end in the middle of one. This requirement is the same as for pycatzao.decoder.decode(), and the trailing bytes are returned in the same way.

Parameters:

data (bytes) – Encoded Asterix CAT240 messages.

Returns:

Compressed messages and state (see pycatzao.decoder.decode() for details on how to use the latter.)

Return type:

tuple[bytes, bytes]
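
Because compress() and decode() share the same (output, state) contract, a chunked loop can be written once and reused. The sketch below is generic: run_chunked and take_pairs are hypothetical names, and take_pairs is only a toy stand-in so that the example runs without any CAT240 data:

```python
import io


def run_chunked(stream, step, chunk_size):
    """Feed stream to step() chunk by chunk, carrying over the returned state."""
    state = b""
    while chunk := stream.read(chunk_size):
        out, state = step(state + chunk)
        yield out
    if state:
        # design choice of this sketch: leftover bytes are treated as an error
        raise ValueError(f"{len(state)} trailing bytes were not processed")


def take_pairs(data):
    """Toy stand-in for step(): emit complete 2-byte units, keep the rest."""
    n = len(data) - len(data) % 2
    return data[:n], data[n:]


result = b"".join(run_chunked(io.BytesIO(b"abcdef"), take_pairs, 3))
```

Under the contract documented above, pycatzao.compress could be passed as step together with a file opened in binary mode; compress_file() already covers that case, so the sketch is mainly useful for other byte sources such as sockets.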

pycatzao.compress.compress_file(file_name, *, buffer_size=-1)

Compress an Asterix CAT240 file.

This is a helper function that compresses a file with (binary) Asterix CAT240 data.

Parameters:
  • file_name (str | pathlib.Path) – Name of the file.

  • buffer_size (int) – Process file in chunks of this size. If negative, the entire file will be loaded to RAM at once.

Returns:

Compressed messages.

Return type:

Generator[bytes, None, None]

Utilities

Helper functions for dealing with decoded Asterix CAT240 data.

If you need the decoded data in a more structured, tabular form, use join_blocks(). This function joins (type 002) messages by repeating scalar values and returns a single dict that can be fed directly to, e.g., a pandas.DataFrame.

Typically, the azimuth and range values follow an implicit binning scheme. Sometimes, this scheme can be inferred by parsing a handful of messages. We implement a simple heuristic that tries to infer this scheme in infer_bin_edges(). Note that the azimuth bins are circular and that due to this periodicity, the bin content of the first and last bin might have to be superimposed depending on the context.

pycatzao.utils.infer_bin_edges(data)

Infers binning scheme from CAT240 data.

This function infers the range and azimuthal binning scheme of the given CAT240 data using simple heuristics. The result can directly be fed into numpy.arange() and numpy.linspace() to get the bin edges in meters and degrees, respectively. The azimuth bins are assumed to be cyclic, whereas no upper limit (aka stop) is inferred for the range bins; this value has to be set manually:

Example

>>> data = b'\xf0\x00\x13\xd1...'
>>> bins, _ = pycatzao.infer_bin_edges(data)
>>> r_edges = np.arange(**bins["r"], stop=3_000)
>>> az_edges = np.linspace(**bins["az"])

Note that due to the monotonicity of numpy.linspace() and the periodicity of the azimuth bins, the first bin edge is negative and the last is larger than 360. Depending on the context, one might have to superimpose the corresponding bin entries!

Parameters:

data (bytes) – Encoded CAT240 messages as a binary blob that starts with a new message.

Returns:

The first return value is a dictionary with the inferred binning scheme for range (key “r”) and azimuth (key “az”). If an insufficient number of bytes was provided, None and the full input data are returned. The second return value carries the state of the decoder and should be prepended to the input data for the subsequent call to infer_bin_edges(). Note that the binning schemes inferred by previous calls are not taken into account for the current inference.

Return type:

tuple[dict | None, bytes]
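
Superimposing the first and last azimuth bin can be sketched in plain Python as follows. The edges are constructed by hand here, assuming evenly spaced cyclic bins centered at 0, w, 2w, ... degrees; the actual content of bins["az"] is not specified in this documentation, and the function name is ours:

```python
import bisect


def azimuth_histogram(az_values, n_bins):
    """Count azimuths in n_bins cyclic bins centered at 0, w, 2w, ... degrees."""
    width = 360.0 / n_bins
    # n_bins + 2 monotonic edges: the first is negative, the last exceeds 360
    edges = [(k - 0.5) * width for k in range(n_bins + 2)]
    counts = [0] * (n_bins + 1)
    for az in az_values:
        counts[bisect.bisect_right(edges, az % 360.0) - 1] += 1
    # the extra last bin covers the same directions as the first one
    counts[0] += counts.pop()
    return counts
```

With four bins, the azimuths 350 and 10 fall into the last and the first monotonic bin, respectively, and end up in the same cyclic bin after folding.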

pycatzao.utils.join_blocks(blocks, *, show_progress=False)

Join decoded CAT240 blocks into columns.

Joins the fields tod, az, r, and amp of decoded Asterix CAT240 blocks into a rectangular table by repeating the scalar values (tod and az). The result can directly be fed to, e.g., pandas.DataFrame. Messages of type 001 are skipped.

Example

>>> data = b'\xf0\x00\x13\xd1...'
>>> blocks, _ = pycatzao.decode(data)
>>> pycatzao.join_blocks(blocks)
{ 'tod': array([100.0, 100.0, 100.1, 100.1, 100.1, 100.2, ...], dtype=float32),
   'az': array([  5.9,   5.9,   6.4,   6.4,   6.4,   6.9, ...], dtype=float32),
    'r': array([ 94.2,  94.2,  89.2,  89.2,  89.2,  94.2, ...], dtype=float32),
  'amp': array([248,   250,   127,   125,   130,   255,   ...], dtype=uint8)}

Parameters:
  • blocks (Iterable[dict]) – Decoded blocks as returned by, e.g., decode() or decode_file().

  • show_progress (bool) – Show progress bar.

Returns:

Columns of a (rectangular) table.

Return type:

dict
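
The idea of repeating scalar values can be sketched with plain lists. This is an illustration of the concept only: the real function returns NumPy arrays, and details such as the handling of a missing tod are assumptions. The name join_blocks_sketch is ours:

```python
def join_blocks_sketch(blocks):
    """Flatten decoded type 002 blocks into equally long columns (plain lists)."""
    columns = {"tod": [], "az": [], "r": [], "amp": []}
    for block in blocks:
        if block["type"] != 2:
            continue  # Video Summary messages carry no data points
        n = len(block["amp"])
        columns["tod"].extend([block.get("tod")] * n)  # None if ToD is absent
        columns["az"].extend([block["az"]] * n)
        columns["r"].extend(block["r"])
        columns["amp"].extend(block["amp"])
    return columns
```

Each data point of a message becomes one row, and the scalar fields tod and az are repeated once per row so that all columns have the same length.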