Utilities

haxr.utilities.load_cycle(data: h5py.File, k: int) → pandas.DataFrame

Load one radar cycle from an open HDF5 radar file.

The cycle boundaries are stored as inclusive cell-index ranges in cycle/first and cycle/last. This function slices the per-cell datasets (tod, az1, az2, r1, r2, amp) for cycle k and returns them as a DataFrame indexed by the original cell indices.

Column names are annotated with units if the underlying dataset defines an HDF5 attribute unit (e.g. tod (second)). Two derived columns are added: az is the circular mean of az1 and az2 (degrees, rounded to 3 decimals) and r is the mid-range between r1 and r2 (rounded to 2 decimals). The derived column names use the units from az1 and r1. For the distinction between cycles and frames, see Cycle vs frame. To iterate frames, use iter_frames() and load_frames().
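
The two derived columns can be illustrated with a small standard-library sketch. The helper names circular_mean_deg and mid_range are hypothetical and not part of haxr. The sketch only mirrors the description above (circular mean in degrees rounded to 3 decimals, mid-range rounded to 2 decimals) and additionally assumes that the azimuth is normalized to [0, 360), which the text does not state.

```python
import math


def circular_mean_deg(az1: float, az2: float) -> float:
    """Circular mean of two azimuths in degrees (hypothetical helper)."""
    a1, a2 = math.radians(az1), math.radians(az2)
    # Average the unit vectors rather than the raw angles, so that a pair
    # straddling the 360/0 discontinuity is handled correctly.
    s = math.sin(a1) + math.sin(a2)
    c = math.cos(a1) + math.cos(a2)
    mean = math.degrees(math.atan2(s, c)) % 360.0
    # Assumption: normalize to [0, 360); the second modulo folds a value
    # that rounds up to 360.0 back to 0.0.
    return round(mean, 3) % 360.0


def mid_range(r1: float, r2: float) -> float:
    """Midpoint between the two range bounds of a cell (hypothetical helper)."""
    return round((r1 + r2) / 2.0, 2)


az = circular_mean_deg(350.0, 10.0)  # points North; a plain average would give 180
r = mid_range(100.0, 101.5)
```

A plain arithmetic mean of 350 and 10 degrees would point South, which is why a circular mean is the natural choice for cells that span the 360/0 boundary.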

Parameters:
  • data – Open h5py.File containing cycle/first, cycle/last, tod, az1, az2, r1, r2, and amp.

  • k – Cycle index into cycle/first and cycle/last.

Returns:

A pandas.DataFrame containing the raw per-cell fields plus the derived columns az and r for cycle k.

Raises:
  • IndexError – If k is out of bounds for cycle/first or cycle/last.

  • KeyError – If required datasets are missing, or if az1 or r1 lack the unit attribute needed to name the derived columns.

Example

from haxr import Store
from haxr.utilities import load_cycle

with Store(base_url=..., cache_dir=...) as store:
    chunk = store.get_chunk(...)
    with store.open(chunk.radar_file) as f:
        df = load_cycle(f, 0)

haxr.utilities.load_frames(data: h5py.File, k: int, *, n: int = 1) → tuple[pandas.DataFrame, int]

Load up to n frames starting at cycle index k.

A frame is a sparse, non-overlapping subset of cycles as produced by iter_frames() (see Cycle vs frame). This function loads the frames via load_cycle(), concatenates them into a single DataFrame, and adds a 1-based frame column (1..m) identifying which loaded frame each row belongs to.

Parameters:
  • data – Open h5py.File containing the cycle datasets required by load_cycle().

  • k – Start cycle index (interpreted as the first frame).

  • n – Maximum number of frames to load (must be >= 1).

Returns:

A tuple (df, m) where df is the concatenated DataFrame and m is the number of frames actually loaded (m <= n). The returned DataFrame contains all columns from load_cycle() plus the frame column.

Raises:
  • ValueError – If no frames are loaded (e.g. k is out of range), because pandas.concat() has no objects to concatenate.

  • IndexError – If k (or a subsequent frame index) is out of bounds.

  • KeyError – If required datasets are missing from data.

Example

from haxr import Store
from haxr.utilities import load_frames

with Store(base_url=..., cache_dir=...) as store:
    chunk = store.get_chunk(...)
    with store.open(chunk.radar_file) as f:

        # load 5 frames starting at cycle `k = 100`
        df, m = load_frames(f, k=100, n=5)

        first_frame = df[df["frame"] == 1]
        last_frame  = df[df["frame"] == m]
        ...

haxr.utilities.arg_next_frame(data: h5py.File, k: int) → int | None

Return the index of the next frame after frame k.

The radar stream is stored as overlapping cycles with inclusive cell-index bounds in cycle/first and cycle/last. A frame is defined here as the next cycle that starts strictly after the current cycle ends. This helper therefore returns the first cycle index whose start index is greater than cycle/last[k].

See Cycle vs frame for the underlying definition. This is a low-level primitive used by iter_frames() and load_frames().

Parameters:
  • data – Open h5py.File containing the datasets cycle/first and cycle/last.

  • k – Current cycle index (interpreted as the current frame).

Returns:

The integer index of the next frame, or None if no cycle starts after cycle/last[k].

Raises:
  • IndexError – If k is out of bounds for cycle/last.

  • KeyError – If cycle/first or cycle/last is missing from data.
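
The selection rule can be sketched without HDF5, using plain lists in place of cycle/first and cycle/last. The name next_frame_index is hypothetical, and the sketch assumes that cycle start indices never decrease as the cycle index grows. It illustrates the rule described above and is not the library's implementation.

```python
from typing import Optional, Sequence


def next_frame_index(
    first: Sequence[int], last: Sequence[int], k: int
) -> Optional[int]:
    """Return the first cycle after k that starts strictly after cycle k ends."""
    end = last[k]  # inclusive end cell index of the current cycle
    for j in range(k + 1, len(first)):
        if first[j] > end:
            return j
    return None  # no later cycle starts after the current one ends


# Toy data: six cycles of six cells each, shifted by two cells per cycle.
first = [0, 2, 4, 6, 8, 10]
last = [5, 7, 9, 11, 13, 15]
nxt = next_frame_index(first, last, 0)
```

With this toy data, cycles 1 and 2 still overlap cycle 0, so the next frame after cycle 0 is cycle 3.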

haxr.utilities.iter_frames(data: h5py.File, k: int, n: int | None = None) → collections.abc.Iterator[int]

Iterate cycle indices starting at k, jumping by frames.

Consecutive radar cycles overlap heavily. This iterator yields a sparse subset of cycle indices by repeatedly applying arg_next_frame(), i.e., it advances to the first cycle whose start index is strictly greater than the end index of the previously yielded cycle.

See Cycle vs frame for why adjacent cycles overlap and how frames are selected.

Parameters:
  • data – Open h5py.File containing cycle/first and cycle/last.

  • k – Start cycle index.

  • n – Maximum number of indices to yield (including k). If None (default), iterate until no next frame exists.

Yields:

Cycle indices k, k2, k3, ... corresponding to successive frames.

Raises:
  • KeyError – If cycle/first or cycle/last is missing from data.

  • IndexError – If k (or a subsequent index) is out of bounds.

Example

from haxr import Store
from haxr.utilities import iter_frames, load_cycle

with Store(base_url=..., cache_dir=...) as store:
    chunk = store.get_chunk(...)
    with store.open(chunk.radar_file) as f:

        # iterate 5 frames starting at cycle `k = 100`
        for k in iter_frames(f, k=100, n=5):
            df = load_cycle(f, k)
            ...

For loading several frames into one DataFrame, use load_frames().
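
To make the frame-hopping behaviour concrete, the following sketch reproduces it on plain lists instead of an HDF5 file. The generator name iter_frame_indices is hypothetical, and the sketch assumes non-decreasing cycle start indices. It is an illustration of the documented rule, not the code used by haxr.

```python
from typing import Iterator, Optional, Sequence


def iter_frame_indices(
    first: Sequence[int],
    last: Sequence[int],
    k: int,
    n: Optional[int] = None,
) -> Iterator[int]:
    """Yield k and then each following frame index (hypothetical helper)."""
    current: Optional[int] = k
    yielded = 0
    while current is not None and (n is None or yielded < n):
        yield current
        yielded += 1
        end = last[current]
        # Advance to the first later cycle that starts strictly after `end`.
        current = next(
            (j for j in range(current + 1, len(first)) if first[j] > end),
            None,
        )


# Toy data: eight cycles of six cells each, shifted by two cells per cycle.
first = [0, 2, 4, 6, 8, 10, 12, 14]
last = [5, 7, 9, 11, 13, 15, 17, 19]
frames = list(iter_frame_indices(first, last, k=0))       # [0, 3, 6]
limited = list(iter_frame_indices(first, last, k=0, n=2))  # [0, 3]
```

The start index k is always yielded first, so n counts k itself, matching the parameter description.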

haxr.utilities.fill_histogram(*, az: collections.abc.Sequence[float] | numpy.ndarray[tuple[int], numpy.dtype[numpy.float32]] | numpy.ndarray[tuple[int], numpy.dtype[numpy.float64]], r: collections.abc.Sequence[float] | numpy.ndarray[tuple[int], numpy.dtype[numpy.float32]] | numpy.ndarray[tuple[int], numpy.dtype[numpy.float64]], weights: collections.abc.Sequence[float] | numpy.ndarray[tuple[int], numpy.dtype[numpy.float32]] | numpy.ndarray[tuple[int], numpy.dtype[numpy.float64]] | None = None, az_edges: collections.abc.Sequence[float] | numpy.ndarray[tuple[int], numpy.dtype[numpy.float32]] | numpy.ndarray[tuple[int], numpy.dtype[numpy.float64]] | None = None, r_edges: collections.abc.Sequence[float] | numpy.ndarray[tuple[int], numpy.dtype[numpy.float32]] | numpy.ndarray[tuple[int], numpy.dtype[numpy.float64]] | None = None) → tuple[numpy.ndarray[tuple[int, int], numpy.dtype[numpy.float64]], numpy.ndarray[tuple[int], numpy.dtype[numpy.float64]], numpy.ndarray[tuple[int], numpy.dtype[numpy.float64]]]

Fill a 2D histogram in polar coordinates (azimuth, range).

This is a thin wrapper around numpy.histogram2d(). If weights is provided, it is passed through as the per-sample weight array; a canonical choice is the per-cell amplitude value. If weights is None, the histogram contains counts.

If az_edges or r_edges is None, bin edges are inferred from the data. The helper infer_az_edges() returns a schema compatible with numpy.linspace() (keys start, stop, num), and infer_r_edges() returns a schema compatible with numpy.arange() (keys start, step). These schemas are used to construct az_edges and r_edges for this function.

Parameters:
  • az – Azimuth samples (degrees).

  • r – Range samples.

  • weights – Optional per-sample weights (same length as az and r).

  • az_edges – Optional azimuth bin edges. If None, inferred via infer_az_edges().

  • r_edges – Optional range bin edges. If None, inferred via infer_r_edges().

Returns:

A tuple (hist, az_edges, r_edges) as returned by numpy.histogram2d(), where hist.shape == (len(az_edges) - 1, len(r_edges) - 1).
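
Because the function is described as a thin wrapper, the underlying numpy.histogram2d() call can be sketched directly. The sample values and the explicit edges below are made up for illustration. The sketch shows only the weighted versus unweighted call and the resulting shape, not the edge inference performed by fill_histogram().

```python
import numpy as np

# Toy per-cell samples (hypothetical values).
az = np.array([10.0, 10.0, 100.0, 190.0])  # azimuth in degrees
r = np.array([50.0, 150.0, 50.0, 150.0])   # range
amp = np.array([1.0, 2.0, 3.0, 4.0])       # amplitude, used as weight

# Explicit edges: four 90-degree sectors and two range rings of width 100.
az_edges = np.linspace(0.0, 360.0, 5)
r_edges = np.arange(0.0, 300.0, 100.0)

# Weighted histogram: each bin holds the sum of the amplitudes falling into it.
hist, az_out, r_out = np.histogram2d(az, r, bins=[az_edges, r_edges], weights=amp)

# Unweighted histogram: each bin holds the number of samples.
counts, _, _ = np.histogram2d(az, r, bins=[az_edges, r_edges])
```

The first axis of hist follows the azimuth bins and the second axis the range bins, which is the layout histogram_to_cartesian_meshgrid() expects.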

haxr.utilities.infer_r_edges(*, r: collections.abc.Sequence[float] | numpy.ndarray[tuple[int], numpy.dtype[numpy.float32]] | numpy.ndarray[tuple[int], numpy.dtype[numpy.float64]]) → dict[str, float]

Infer a range-bin schema for histogramming.

The returned schema is meant to be used with numpy.arange() to construct range bin edges. It contains start and step only (no stop), because the appropriate upper bound depends on the use case. A common choice is stop = float(np.max(r)) + step (as done in fill_histogram()).

The step size is estimated as the median of np.diff(r) and the first edge is placed half a step before the minimum value.
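
The heuristic described above can be sketched with the standard library only. The names sketch_r_schema and edges_from_schema are hypothetical, and the sketch assumes strictly increasing samples with near-constant spacing. It is not the library's implementation, which may treat edge cases differently.

```python
import math
import statistics
from typing import Dict, List, Sequence


def sketch_r_schema(r: Sequence[float]) -> Dict[str, float]:
    """Estimate start and step as described: median spacing, half-step offset."""
    if len(r) < 2:
        raise ValueError("need at least two range samples to estimate a step")
    diffs = [b - a for a, b in zip(r, r[1:])]
    step = float(statistics.median(diffs))
    start = float(min(r)) - step / 2.0
    return {"start": start, "step": step}


def edges_from_schema(schema: Dict[str, float], stop: float) -> List[float]:
    """Build edges the way numpy.arange(start, stop, step) would (stop excluded)."""
    start, step = schema["start"], schema["step"]
    count = max(0, math.ceil((stop - start) / step))
    return [start + i * step for i in range(count)]


r = [10.0, 12.0, 14.0, 16.0]
schema = sketch_r_schema(r)
edges = edges_from_schema(schema, stop=max(r) + schema["step"])
```

Placing the first edge half a step below the minimum centres each sample in its bin, and the extra step added to the maximum ensures that the last sample still falls inside the final bin.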

Warning

This function uses heuristics; the inferred edges may be incorrect for irregularly sampled or noisy inputs. If the inferred edges look unreasonable, prefer computing them explicitly.

Parameters:

r – Range samples with approximately constant spacing.

Returns:

A dict with keys start and step suitable for:

import numpy as np

schema = infer_r_edges(r=r)
stop = float(np.max(r)) + schema["step"]
r_edges = np.arange(stop=stop, **schema)

Raises:

ValueError – If r is empty or cannot be reduced (e.g. min / diff fails).

haxr.utilities.infer_az_edges(*, az: collections.abc.Sequence[float] | numpy.ndarray[tuple[int], numpy.dtype[numpy.float32]] | numpy.ndarray[tuple[int], numpy.dtype[numpy.float64]]) → dict[str, float | int]

Infer an azimuth-bin schema for histogramming.

The returned schema is meant to be used with numpy.linspace() to construct azimuth bin edges that cover one full turn. The edges are chosen such that the interval [0, 360] is fully covered and a wrap-around bin spanning the discontinuity at 360/0 is represented naturally (e.g., a bin 355 .. 5 degrees).

This helper is primarily intended for fill_histogram(). If inference fails (e.g., the azimuth step size cannot be determined reliably), an empty dict is returned.

Warning

This function uses heuristics; the inferred edges may be incorrect for irregularly sampled or noisy inputs. If the inferred edges look unreasonable, prefer computing them explicitly.

Parameters:

az – 1D azimuth samples in degrees, typically in acquisition order.

Returns:

A schema dict with keys start, stop, and num suitable for:

import numpy as np

schema = infer_az_edges(az=az)
az_edges = np.linspace(**schema)

The returned start may be negative and stop may be greater than 360 to ensure wrap-around coverage.

Raises:

ValueError – If az is empty or cannot be reduced (e.g. max / diff fails).

haxr.utilities.histogram_to_cartesian_meshgrid(hist: numpy.ndarray[tuple[int, int], numpy.dtype[numpy.float64]], az_edges: collections.abc.Sequence[float] | numpy.ndarray[tuple[int], numpy.dtype[numpy.float32]] | numpy.ndarray[tuple[int], numpy.dtype[numpy.float64]], r_edges: collections.abc.Sequence[float] | numpy.ndarray[tuple[int], numpy.dtype[numpy.float32]] | numpy.ndarray[tuple[int], numpy.dtype[numpy.float64]]) → tuple[numpy.ndarray[tuple[int, int], numpy.dtype[numpy.float64]], numpy.ndarray[tuple[int, int], numpy.dtype[numpy.float64]], numpy.ndarray[tuple[int, int], numpy.dtype[numpy.float64]]]

Convert a polar 2D histogram into a Cartesian meshgrid for plotting.

Given histogram values binned in polar coordinates (azimuth, range) together with their bin edges, this function computes Cartesian coordinate grids x and y for the bin corners. The result is suitable for visualizing the polar histogram in a Cartesian plot (e.g., via matplotlib.axes.Axes.pcolormesh()).

Azimuth edges are interpreted as degrees and converted to radians. The Cartesian mapping follows the usual radar convention (azimuth measured from North, increasing clockwise): x = r * sin(az) (East) and y = r * cos(az) (North).

Typically, hist, az_edges and r_edges come from fill_histogram().

Parameters:
  • hist – 2D histogram values binned over azimuth and range.

  • az_edges – 1D azimuth bin edges in degrees.

  • r_edges – 1D range bin edges.

Returns:

A tuple (x, y, values) where x and y are 2D arrays of bin-corner coordinates with shape (len(az_edges), len(r_edges)) and values are the corresponding per-bin histogram values suitable for plotting against these edges.
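
The coordinate mapping itself is small enough to sketch with the standard library. The function name corner_grids is hypothetical, and nested lists stand in for the 2D arrays. The sketch only applies the convention stated above (x = r * sin(az), y = r * cos(az), azimuth in degrees from North, clockwise) to every combination of edges.

```python
import math
from typing import List, Sequence, Tuple

Grid = List[List[float]]


def corner_grids(
    az_edges_deg: Sequence[float], r_edges: Sequence[float]
) -> Tuple[Grid, Grid]:
    """Return x (East) and y (North) for every (azimuth edge, range edge) pair."""
    x = [[rr * math.sin(math.radians(a)) for rr in r_edges] for a in az_edges_deg]
    y = [[rr * math.cos(math.radians(a)) for rr in r_edges] for a in az_edges_deg]
    return x, y


# Three azimuth edges (North, East, South) and three range edges.
x, y = corner_grids([0.0, 90.0, 180.0], [0.0, 1.0, 2.0])
```

Both grids have one row per azimuth edge and one column per range edge, so they are one element larger than the histogram along each axis, which is the corner layout pcolormesh() accepts.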

Example

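import matplotlib.pyplot as plt
from haxr.utilities import fill_histogram, histogram_to_cartesian_meshgrid

# az, r, amp: 1D per-cell arrays of azimuth (degrees), range, and amplitude
hist, az_edges, r_edges = fill_histogram(az=az, r=r, weights=amp)
x, y, values = histogram_to_cartesian_meshgrid(hist, az_edges, r_edges)

fig, ax = plt.subplots()
ax.pcolormesh(x, y, values, shading="auto")
ax.set_aspect("equal", adjustable="box")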