imgalz.utils¶

imgalz.utils.box_ops¶

imgalz.utils.box_ops.expand_box(xyxy: ndarray | Tuple[float, float, float, float], ratio: float | Tuple[float, float], w: int, h: int) → ndarray[source]¶

Expand a bounding box by the given ratio and clip it to the image dimensions.

Parameters:

xyxy (array-like) – Bounding box coordinates in format [xmin, ymin, xmax, ymax].
ratio (float or tuple of float) – Expansion ratio. If a float, both width and height are scaled by this ratio. If a tuple of two floats, width and height are scaled separately.
w (int) – Image width, used to clip bounding box.
h (int) – Image height, used to clip bounding box.

Returns:

Expanded and clipped bounding box in format [xmin, ymin, xmax, ymax].

Return type:

np.ndarray
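
The expand-then-clip behaviour described above can be sketched as follows. This is a minimal illustration, not the library's implementation: it assumes the box is scaled about its center, that a tuple ratio is ordered (width_ratio, height_ratio), and that clipping is to [0, w] and [0, h]. The helper name `expand_box_sketch` is hypothetical.

```python
import numpy as np

def expand_box_sketch(xyxy, ratio, w, h):
    """Scale a box about its center, then clip it to the image (assumed behaviour)."""
    x1, y1, x2, y2 = np.asarray(xyxy, dtype=float)[:4]
    # A scalar ratio applies to both axes; a pair is (width_ratio, height_ratio).
    rw, rh = (ratio, ratio) if np.isscalar(ratio) else ratio
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    half_w, half_h = (x2 - x1) * rw / 2, (y2 - y1) * rh / 2
    box = np.array([cx - half_w, cy - half_h, cx + half_w, cy + half_h])
    # Clip x coordinates to [0, w] and y coordinates to [0, h].
    box[[0, 2]] = np.clip(box[[0, 2]], 0, w)
    box[[1, 3]] = np.clip(box[[1, 3]], 0, h)
    return box

print(expand_box_sketch([10, 10, 30, 30], 2.0, 100, 100).tolist())
# [0.0, 0.0, 40.0, 40.0]
```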

imgalz.utils.box_ops.ltwh2xyxy(boxes: ndarray | list) → ndarray[source]¶

Convert bounding boxes from [xmin, ymin, w, h, …] format to [xmin, ymin, xmax, ymax, …] format. Supports arbitrary dimensional input as long as last dimension >= 4. Only converts the first 4 elements in the last dimension.

Parameters: boxes (Union[np.ndarray, list]) – Input boxes, shape (…, >=4).
Returns: Converted boxes, same shape as input.
Return type: np.ndarray

Example

>>> ltwh2xyxy([10, 20, 20, 20])
array([10, 20, 30, 40])

>>> ltwh2xyxy([[10, 20, 20, 20, 0], [5, 5, 10, 10, 1]])
array([[10, 20, 30, 40,  0],
       [ 5,  5, 15, 15,  1]])

>>> ltwh2xyxy(np.array([[[10,20,20,20],[5,5,10,10]], [[1,2,2,2],[6,7,2,2]]]))
array([[[10, 20, 30, 40],
        [ 5,  5, 15, 15]],

       [[ 1,  2,  3,  4],
        [ 6,  7,  8,  9]]])

imgalz.utils.box_ops.nms(boxes, probs, overlapThresh=0.3)[source]¶

imgalz.utils.box_ops.xywh2xyxy(boxes: ndarray | list) → ndarray[source]¶

Convert bounding boxes from [x_center, y_center, w, h, …] format to [xmin, ymin, xmax, ymax, …] format. Supports arbitrary dimensional input as long as last dimension >= 4. Only converts the first 4 elements in the last dimension.

Parameters: boxes (Union[np.ndarray, list]) – Input boxes, shape (…, >=4).
Returns: Converted boxes, same shape as input.
Return type: np.ndarray

Example

>>> xywh2xyxy([50, 50, 20, 20])
array([40., 40., 60., 60.])

>>> xywh2xyxy([[50, 50, 20, 20, 0], [10, 10, 4, 6, 1]])
array([[40., 40., 60., 60.,  0.],
       [ 8.,  7., 12., 13.,  1.]])
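
The "arbitrary leading dimensions, first four elements only" rule can be sketched with NumPy ellipsis indexing. This is an illustrative sketch under that assumption, not the library's code. The name `xywh2xyxy_sketch` is hypothetical.

```python
import numpy as np

def xywh2xyxy_sketch(boxes):
    """Convert [cx, cy, w, h, ...] to [xmin, ymin, xmax, ymax, ...] on the last axis."""
    out = np.asarray(boxes, dtype=float).copy()  # never modify the caller's array
    center = out[..., 0:2].copy()
    half = out[..., 2:4] / 2
    out[..., 0:2] = center - half  # top-left corner
    out[..., 2:4] = center + half  # bottom-right corner
    return out  # elements past index 3 are left untouched

print(xywh2xyxy_sketch([[50, 50, 20, 20, 0], [10, 10, 4, 6, 1]]).tolist())
# [[40.0, 40.0, 60.0, 60.0, 0.0], [8.0, 7.0, 12.0, 13.0, 1.0]]
```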

imgalz.utils.box_ops.xywh2xyxyxyxy(center)[source]¶

Convert oriented bounding boxes (OBB) from [cx, cy, w, h, angle] format to 4 corner points [x1, y1, x2, y2, x3, y3, x4, y4].

Parameters: center (np.ndarray) – Input array of shape (…, 5), last dimension is [cx, cy, w, h, angle in degrees].
Returns: Output array of shape (…, 8), each element is [x1, y1, x2, y2, x3, y3, x4, y4].
Return type: np.ndarray

Example

>>> box = np.array([100, 100, 40, 20, 45])
>>> xyxy = xywh2xyxyxyxy(box)
>>> print(xyxy.shape)  # (8,)

>>> batch_boxes = np.random.rand(2, 3, 5) * 100
>>> xyxy_batch = xywh2xyxyxyxy(batch_boxes)
>>> print(xyxy_batch.shape)  # (2, 3, 8)
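
One way to compute the four corners of a rotated box is to add and subtract the rotated half-width and half-height vectors from the center. The sketch below assumes the angle rotates the width axis from +x towards +y and picks an arbitrary corner order; the library's rotation direction and corner order are not stated here and may differ. The name `obb_to_corners_sketch` is hypothetical.

```python
import numpy as np

def obb_to_corners_sketch(obb):
    """[cx, cy, w, h, angle_deg] -> 8 corner coordinates (assumed order and direction)."""
    obb = np.asarray(obb, dtype=float)
    cx, cy, w, h = (obb[..., i] for i in range(4))
    theta = np.deg2rad(obb[..., 4])
    cos, sin = np.cos(theta), np.sin(theta)
    # Half-extent vectors along the rotated width and height axes.
    wx, wy = w / 2 * cos, w / 2 * sin
    hx, hy = -h / 2 * sin, h / 2 * cos
    corners = [
        cx + wx + hx, cy + wy + hy,
        cx + wx - hx, cy + wy - hy,
        cx - wx - hx, cy - wy - hy,
        cx - wx + hx, cy - wy + hy,
    ]
    return np.stack(corners, axis=-1)

print(obb_to_corners_sketch([100, 100, 40, 20, 0]).tolist())
# [120.0, 110.0, 120.0, 90.0, 80.0, 90.0, 80.0, 110.0]
```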

imgalz.utils.box_ops.xyxy2ltwh(boxes: ndarray | list) → ndarray[source]¶

Convert bounding boxes from [xmin, ymin, xmax, ymax, …] format to [xmin, ymin, width, height, …] format. Supports arbitrary dimensional input as long as last dimension >= 4. Only converts the first 4 elements per box, others remain unchanged.

Parameters: boxes (Union[np.ndarray, list]) – Input boxes, shape (…, >=4).
Returns: Converted boxes, same shape as input.
Return type: np.ndarray

Example

>>> xyxy2ltwh([10, 20, 30, 40])
array([10., 20., 20., 20.])

>>> xyxy2ltwh([[10, 20, 30, 40, 0], [5, 5, 15, 15, 1]])
array([[10., 20., 20., 20.,  0.],
       [ 5.,  5., 10., 10.,  1.]])

imgalz.utils.box_ops.xyxy2xywh(boxes: list | ndarray) → ndarray[source]¶

Convert bounding boxes from [xmin, ymin, xmax, ymax] format to [x_center, y_center, width, height] format. Supports arbitrary dimensional input as long as the last dimension is at least 4. Keeps any additional trailing elements unchanged.

Parameters: boxes (list or np.ndarray) – Input boxes with shape (…, >=4).
Returns: Converted boxes with the same shape as input.
Return type: np.ndarray

Example

>>> xyxy2xywh([10, 20, 30, 40])
array([20., 30., 20., 20.])

>>> xyxy2xywh([[10, 20, 30, 40, 1], [5, 5, 15, 15, 2]])
array([[20., 30., 20., 20.,  1.],
       [10., 10., 10., 10.,  2.]])

imgalz.utils.common¶

imgalz.utils.common.is_url(url: str, check: bool = False) → bool[source]¶

Validate if the given string is a URL and optionally check if the URL exists online.

Parameters:

url (str) – The string to be validated as a URL.
check (bool, optional) – If True, performs an additional check to see if the URL exists online.

Returns:

True if the string is a valid URL. If ‘check’ is True, the result is True only if the URL also exists online.

Return type:

bool

Examples

>>> valid = is_url("https://www.example.com")
>>> valid_and_exists = is_url("https://www.example.com", check=True)
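
The syntactic part of such a check can be sketched with the standard library. This is an assumption about how validity might be decided (a scheme and a network location are both present), not the library's implementation, and it makes no network request. The name `looks_like_url` is hypothetical.

```python
from urllib.parse import urlparse

def looks_like_url(text):
    """Return True if text parses with both a scheme and a network location."""
    try:
        parts = urlparse(str(text))
    except ValueError:
        # urlparse can raise on malformed input such as bad IPv6 brackets.
        return False
    return bool(parts.scheme and parts.netloc)

print(looks_like_url("https://www.example.com"))  # True
print(looks_like_url("/path/to/file.jpg"))        # False
```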

imgalz.utils.common.is_valid_image(path: str | Path) → bool[source]¶

Checks whether the given file is a valid image by attempting to open and verify it.

Parameters: path (Union[str, Path]) – Path to the image file.
Returns: True if the image is valid, False otherwise.
Return type: bool
Raises: None – All exceptions are caught internally and False is returned.

imgalz.utils.common.url_to_image(url: str, readFlag: int = 1, headers=None) → ndarray | None[source]¶

Download an image from a URL and decode it into an OpenCV image.

Parameters:

url (str) – URL of the image to download.
readFlag (int, optional) – Flag specifying the color type of a loaded image. Defaults to cv2.IMREAD_COLOR.

Returns:

Decoded image as a numpy array if successful, else None.

Return type:

Optional[np.ndarray]

imgalz.utils.file_utils¶

Recursively lists files in a directory, filtering by file extension and substring in filename.

Parameters:

base_path (Union[str, Path]) – Directory path to search for files.
valid_exts (Optional[Union[str, List[str], tuple]], optional) – File extensions to filter by (e.g., ‘.jpg’, [‘.png’, ‘.jpg’]). Case insensitive. If None, no filtering by extension. Defaults to None.
contains (Optional[str], optional) – Substring that filenames must contain. If None, no filtering by substring. Defaults to None.

Yields:

Iterator[str] – Full file paths matching the criteria.
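
A recursive, filtered listing of this kind can be sketched as a generator over `os.walk`. The function name `iter_files_sketch` is hypothetical, and details such as ordering are assumptions rather than documented behaviour.

```python
import os

def iter_files_sketch(base_path, valid_exts=None, contains=None):
    """Yield file paths under base_path, filtered by extension and substring."""
    if isinstance(valid_exts, str):
        valid_exts = [valid_exts]
    # Lower-case the extensions so the comparison is case-insensitive.
    exts = tuple(e.lower() for e in valid_exts) if valid_exts is not None else None
    for root, _dirs, files in os.walk(str(base_path)):
        for name in sorted(files):
            if contains is not None and contains not in name:
                continue
            if exts is not None and not name.lower().endswith(exts):
                continue
            yield os.path.join(root, name)
```

Because it is a generator, nothing is read from disk until the result is iterated, for example with `list(iter_files_sketch("images", valid_exts=[".jpg", ".png"]))`.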

imgalz.utils.file_utils.read_csv(csv_path: str | Path, delimiter: str = ',', skip_empty_lines: bool = True) → List[List[str]][source]¶

Reads a CSV file and returns its content as a list of rows.

Parameters:

csv_path (Union[str, Path]) – Path to the CSV file.
delimiter (str, optional) – Delimiter used in the CSV file. Defaults to ‘,’.
skip_empty_lines (bool, optional) – Whether to skip empty lines. Defaults to True.

Returns:

A list of rows, where each row is a list of strings.

Return type:

List[List[str]]
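
The documented behaviour maps closely onto the standard `csv` module. The sketch below is one possible reading, with UTF-8 encoding as an assumption. `read_csv_sketch` is a hypothetical name.

```python
import csv

def read_csv_sketch(csv_path, delimiter=",", skip_empty_lines=True):
    """Return the rows of a CSV file as lists of strings."""
    rows = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f, delimiter=delimiter):
            # csv.reader yields an empty list for a blank line.
            if skip_empty_lines and not row:
                continue
            rows.append(row)
    return rows
```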

imgalz.utils.file_utils.read_json(json_path: str | Path, mode: Literal['all', 'line'] = 'all') → List[Any][source]¶

Reads JSON content from a file.

Supports reading the entire file as a JSON object or reading line-by-line for JSONL (JSON Lines) formatted files.

Parameters:

json_path (Union[str, Path]) – The path to the JSON file.
mode (Literal['all', 'line'], optional) – How to read the file. ‘all’ reads the entire file as a single JSON object; ‘line’ reads the file line by line, each line being a JSON object. Defaults to ‘all’.

Returns:

A list of JSON-parsed Python objects. For ‘all’ mode, the list will contain the root JSON object(s). For ‘line’ mode, the list will contain one object per line.

Return type:

List[Any]
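
The two modes can be sketched with the standard `json` module. How a non-list root is returned in ‘all’ mode is not spelled out above, so wrapping it in a list is an assumption here, as is skipping blank lines in ‘line’ mode. `read_json_sketch` is a hypothetical name.

```python
import json

def read_json_sketch(json_path, mode="all"):
    """Read a JSON file ('all') or a JSON Lines file ('line') into a list."""
    with open(json_path, encoding="utf-8") as f:
        if mode == "line":
            # One JSON document per non-blank line.
            return [json.loads(line) for line in f if line.strip()]
        data = json.load(f)
    # Assumption: a non-list root is wrapped so the result is always a list.
    return data if isinstance(data, list) else [data]
```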

imgalz.utils.file_utils.read_pkl(pkl_path: str | Path) → Any[source]¶

Reads a pickle file and returns the deserialized data.

Parameters: pkl_path (Union[str, Path]) – Path to the pickle file.
Returns: The deserialized Python object stored in the pickle file.
Return type: Any

imgalz.utils.file_utils.read_txt(txt_path: str | Path) → List[str][source]¶

Reads a text file and returns a list of lines without trailing newline characters.

Parameters: txt_path (Union[str, Path]) – Path to the text file.
Returns: List of lines with trailing newline characters removed.
Return type: List[str]

imgalz.utils.file_utils.read_yaml(yaml_path: str | Path) → Any[source]¶

Reads and parses a YAML file.

Parameters: yaml_path (Union[str, Path]) – Path to the YAML file.
Returns: The parsed Python object from the YAML file, usually a dict or list.
Return type: Any

imgalz.utils.file_utils.read_yolo_txt(txt_path: str | Path, width: int, height: int)[source]¶

Read YOLO-format annotation file and convert boxes to [x1, y1, x2, y2, class_id] format.

Parameters:

txt_path (str or Path) – Path to the YOLO annotation text file.
width (int or float) – Width of the image the boxes are relative to.
height (int or float) – Height of the image the boxes are relative to.

Returns:

Array of shape (N, 5), where each row is [x1, y1, x2, y2, class_id].

Return type:

np.ndarray

Example

>>> boxes = read_yolo_txt("label.txt", 640, 480)
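
The conversion can be sketched on in-memory lines. It assumes the common YOLO label layout of `class_id cx cy w h`, with the four box values normalized to [0, 1]; whether the library handles other layouts is not stated. `parse_yolo_lines` is a hypothetical helper, not part of the documented API.

```python
import numpy as np

def parse_yolo_lines(lines, width, height):
    """Convert 'class cx cy w h' lines (normalized) to pixel [x1, y1, x2, y2, class_id]."""
    rows = []
    for line in lines:
        parts = line.split()
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        cls = int(float(parts[0]))
        cx, cy, bw, bh = (float(v) for v in parts[1:5])
        rows.append([
            (cx - bw / 2) * width,
            (cy - bh / 2) * height,
            (cx + bw / 2) * width,
            (cy + bh / 2) * height,
            cls,
        ])
    # reshape keeps the (N, 5) contract even when there are no boxes
    return np.array(rows, dtype=float).reshape(-1, 5)

print(parse_yolo_lines(["0 0.5 0.5 0.25 0.5"], 640, 480).tolist())
# [[240.0, 120.0, 400.0, 360.0, 0.0]]
```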

imgalz.utils.file_utils.save_csv(csv_path: str | Path, info: List[List[Any]], mode: Literal['w', 'a'] = 'w', header: List[str] | None = None) → None[source]¶

Saves a 2D list to a CSV file.

Parameters:

csv_path (Union[str, Path]) – Path to the CSV file.
info (List[List[Any]]) – Data to write, each sublist is a row.
mode (Literal['w', 'a'], optional) – Write mode. ‘w’ overwrites the file; ‘a’ appends to the file. Defaults to ‘w’.
header (Optional[List[str]], optional) – Optional column headers. Will be written as the first line if provided and mode is ‘w’. Defaults to None.

Returns:

None

imgalz.utils.file_utils.save_json(json_path: str | Path, info: Any, indent: int = 4, mode: Literal['w', 'a'] = 'w', with_return_char: bool = False) → None[source]¶

Saves a Python object to a JSON file.

Parameters:

json_path (Union[str, Path]) – Path to the JSON file to write.
info (Any) – The Python object to serialize as JSON.
indent (int, optional) – Number of spaces to use for indentation. Defaults to 4.
mode (Literal['w', 'a'], optional) – File write mode. ‘w’ overwrites the file; ‘a’ appends to the file. Defaults to ‘w’.
with_return_char (bool, optional) – Whether to append a newline character at the end. Defaults to False.

Returns:

None

imgalz.utils.file_utils.save_pkl(pkl_path: str | Path, pkl_data: Any) → None[source]¶

Saves Python object data to a pickle file.

Parameters:

pkl_path (Union[str, Path]) – Path to the pickle file to write.
pkl_data (Any) – Python object to serialize and save.

Returns:

None

imgalz.utils.file_utils.save_txt(txt_path: str | Path, info: List[str], mode: str = 'w') → None[source]¶

Saves a list of strings to a text file, adding a newline character after each line.

Parameters:

txt_path (Union[str, Path]) – Path to the text file.
info (List[str]) – List of strings to write, each string will be one line.
mode (str, optional) – File open mode, defaults to write mode ‘w’.

imgalz.utils.file_utils.save_yaml(yaml_path: str | Path, data: Any, header: str = '') → None[source]¶

Saves any YAML-serializable Python data (dict, list, etc.) to a YAML file.

Parameters:

yaml_path (str or Path) – The path to the output YAML file.
data (Any) – The Python data to save (dict, list, etc.).
header (str) – Optional header string to prepend to the file.

imgalz.utils.file_utils.save_yolo_txt(box: ndarray, cls: ndarray | Sequence[int], save_path: str | Path, format: Literal['xywh', 'xyxy'] = 'xyxy', is_normalized: bool = False, width: int | None = None, height: int | None = None) → None[source]¶

Save bounding boxes in YOLO format to a .txt file.

Parameters:

box (np.ndarray) – Array of shape (N, 4). Each row is [x1, y1, x2, y2] if format=‘xyxy’, or [cx, cy, w, h] if format=‘xywh’.
cls (np.ndarray or list) – Array/list of class IDs, shape (N,).
save_path (str or Path) – Output path for the YOLO-format .txt file.
format (str) – Format of input boxes: “xywh” or “xyxy”.
is_normalized (bool) – True if input box values are already normalized.
width (int, optional) – Image width (required if is_normalized=False).
height (int, optional) – Image height (required if is_normalized=False).
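
For pixel-space ‘xyxy’ input, the normalization step can be sketched as below. The output layout `class_id cx cy w h` and the six-decimal formatting are assumptions based on the common YOLO label convention, not confirmed details of this function. `to_yolo_lines` is a hypothetical helper.

```python
import numpy as np

def to_yolo_lines(box, cls, width, height):
    """Convert pixel [x1, y1, x2, y2] boxes to normalized YOLO label lines."""
    box = np.asarray(box, dtype=float).reshape(-1, 4)
    cx = (box[:, 0] + box[:, 2]) / 2 / width
    cy = (box[:, 1] + box[:, 3]) / 2 / height
    bw = (box[:, 2] - box[:, 0]) / width
    bh = (box[:, 3] - box[:, 1]) / height
    return [
        f"{int(c)} {x:.6f} {y:.6f} {w:.6f} {h:.6f}"
        for c, x, y, w, h in zip(cls, cx, cy, bw, bh)
    ]

print(to_yolo_lines([[240, 120, 400, 360]], [0], 640, 480))
# ['0 0.500000 0.500000 0.250000 0.500000']
```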

imgalz.utils.visualization¶

class imgalz.utils.visualization.VideoReader(filename: str, cache_capacity: int = 10, step: int = 1)[source]¶

Bases: object

current_frame() → Any | None[source]¶

property fourcc¶

property fps¶

property frame_cnt¶

get_frame(frame_id: int) → Any | None[source]¶

property height¶

next() → Any¶

property position¶

read() → Any | None[source]¶

property resolution¶

property step¶

property width¶

imgalz.utils.visualization.compute_color_for_labels(label)[source]¶

imgalz.utils.visualization.cv_imshow(title: str, image: ndarray, color_type: Literal['bgr', 'rgb'] = 'bgr', delay: int = 0, size: int | List | None = None) → bool | None[source]¶

Display an image in a window. Converts color if needed. Optionally resizes the image to fit screen if too large.

Parameters:

title (str) – Window title or filename prefix if saving.
image (np.ndarray) – Image array.
color_type (Literal['bgr', 'rgb'], optional) – Input image color space. Defaults to ‘bgr’.
delay (int, optional) – Delay in milliseconds for display. If 0, waits indefinitely. Defaults to 0.

imgalz.utils.visualization.draw_bbox(img: ndarray, box: List[float] | ndarray, score: float = 1.0, obj_id: str | None = None, line_thickness: int | None = None, label_format: str = '{score:.2f} {id}', txt_color: Tuple[int, int, int] = (255, 255, 255), box_color: List[int] | Tuple[int, int, int] = [255, 0, 0]) → ndarray[source]¶

Draws a bounding box with optional label on the image.

Parameters:

img (np.ndarray) – The image on which to draw.
box (List[float] or np.ndarray) – Bounding box in [x1, y1, x2, y2] format.
score (float, optional) – Confidence score for the object.
obj_id (str, optional) – Object ID or class index.
line_thickness (int, optional) – Line thickness of the box.
label_format (str, optional) – Format string for label. Use ‘{score}’ and ‘{id}’.
txt_color (Tuple[int, int, int], optional) – Text color in BGR format.
box_color (List[int] or Tuple[int, int, int], optional) – Box color in BGR.

Returns:

Image with bounding box and label drawn.

Return type:

np.ndarray
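
The default label_format uses standard Python format fields. Assuming the label is built with `str.format` and keyword fields named score and id (an assumption drawn from the parameter description, not from the source code), the label text would be produced like this:

```python
label_format = "{score:.2f} {id}"

# Hypothetical values for illustration only.
label = label_format.format(score=0.8731, id="person")
print(label)  # 0.87 person

# A custom format can drop or reorder the fields.
id_only = "{id}".format(score=0.8731, id="person")
print(id_only)  # person
```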

imgalz.utils.visualization.draw_keypoints(image: ndarray, keypoints: ndarray, skeleton: list, kpt_color: ndarray, limb_color: ndarray, image_shape: tuple | None = None, radius: int = 5, draw_limb: bool = True, conf_threshold: float = 0.3)[source]¶

Draw keypoints and skeletons on the image.

Parameters:

image (np.ndarray) – Input image.
keypoints (np.ndarray) – Keypoints array with shape (17, 3), format [x, y, conf].
skeleton (list) – List of index pairs defining limb connections.
kpt_color (np.ndarray) – Color array for each keypoint.
limb_color (np.ndarray) – Color array for each limb.
image_shape (tuple) – Optional, (h, w). Defaults to image.shape[:2].
radius (int) – Radius of keypoint circles.
draw_limb (bool) – Whether to draw connecting lines between keypoints.
conf_threshold (float) – Minimum confidence to render a keypoint or limb.

Returns:

Image with keypoints drawn, same shape as the input image, dtype uint8.

Return type:

np.ndarray

Example

>>> from imgalz.utils.dataset_info import CocoConfig
>>> from imgalz.utils.visualization import draw_keypoints
>>> import numpy as np
>>> image = np.ones((480, 640, 3), dtype=np.uint8) * 255
>>> keypoints = np.random.rand(17, 3)
>>> keypoints[:, 0] *= 640  # x, scaled to the image width
>>> keypoints[:, 1] *= 480  # y, scaled to the image height
>>> keypoints[:, 2] = 1.0   # conf
>>> skeleton = CocoConfig.skeleton
>>> kpt_color = CocoConfig.kpt_color
>>> limb_color = CocoConfig.limb_color
>>> out = draw_keypoints(image, keypoints, skeleton, kpt_color, limb_color)

imgalz.utils.visualization.draw_masks(masks: ndarray, colors: list | ndarray, image: ndarray, alpha: float = 0.5) → ndarray[source]¶

Overlay multiple binary masks onto an image with given colors and alpha blending.

Parameters:

masks (np.ndarray) – Boolean or float masks of shape (N, H, W), each mask is in [0, 1] or {0, 1}.
colors (np.ndarray|list) – RGB colors of shape (N, 3), each color is [R, G, B] in [0, 255].
image (np.ndarray) – Original image of shape (H, W, 3), dtype uint8, values in [0, 255].
alpha (float, optional) – Opacity of each mask, between 0 (transparent) and 1 (opaque). Default is 0.5.

Returns:

Image with masks overlaid, same shape as input image, dtype uint8.

Return type:

np.ndarray

Example

>>> output = draw_masks(masks, colors, image, alpha=0.5)
>>> cv2.imshow("Masked", output)
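
Alpha blending of masks can be sketched in plain NumPy: inside each mask the pixel becomes a weighted mix of the original pixel and the mask color. This is an illustrative sketch, not the library's implementation. In particular, blending masks one after another is an assumption, and `overlay_masks_sketch` is a hypothetical name.

```python
import numpy as np

def overlay_masks_sketch(masks, colors, image, alpha=0.5):
    """Blend N masks of shape (H, W) onto an (H, W, 3) uint8 image."""
    out = image.astype(np.float32)
    colors = np.asarray(colors, dtype=np.float32)
    for mask, color in zip(masks, colors):
        # Per-pixel weight: alpha inside the mask, 0 outside.
        weight = alpha * np.asarray(mask, dtype=np.float32)[..., None]
        out = out * (1.0 - weight) + color * weight
    return np.clip(out, 0, 255).astype(np.uint8)

image = np.zeros((2, 2, 3), dtype=np.uint8)
masks = np.array([[[1, 0], [0, 0]]], dtype=bool)
blended = overlay_masks_sketch(masks, [[200, 100, 0]], image, alpha=0.5)
print(blended[0, 0].tolist())  # [100, 50, 0]
```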

imgalz.utils.visualization.imread(path: str | Path, flags: int = 1) → ndarray[source]¶

Reads an image from a file, supporting paths with non-ASCII characters.

Parameters:

path (Union[str, Path]) – Path to the image file.
flags (int, optional) – Flags specifying the color type of a loaded image. Defaults to cv2.IMREAD_COLOR.

Returns:

The loaded image array.

Return type:

np.ndarray

imgalz.utils.visualization.imwrite(filename: str | Path, img: ndarray) → bool[source]¶

Saves an image to a file, supporting paths with non-ASCII characters.

Parameters:

filename (Union[str, Path]) – Path to save the image.
img (np.ndarray) – Image data array.

Returns:

True if the image is successfully saved, False otherwise.

Return type:

bool

imgalz.utils.image_filter¶

class imgalz.utils.image_filter.ImageFilter(image_dir, save_dir, hash='ahash', threshold=5, max_workers=8)[source]¶

Bases: object

A utility class for detecting and filtering duplicate or similar images based on perceptual or MinHash-based hashing.

Parameters:

image_dir (Union[str, Path]) – Path to the directory containing input images to be filtered.
save_dir (Union[str, Path]) – Path where filtered (non-duplicate) images will be saved.
hash (str) – Hashing method to use. Supported options are ‘ahash’ (Average Hash), ‘phash’ (Perceptual Hash), ‘dhash’ (Difference Hash), ‘whash’ (Wavelet Hash), and ‘minhash’ (MinHash, for scalable set similarity).
threshold (int) – Similarity threshold to determine duplicates. For non-MinHash methods, this is a Hamming distance threshold.
max_workers (int) – Maximum number of threads for parallel image hashing.

Example

```python
from imgalz import ImageFilter

deduper = ImageFilter(
    image_dir="/path/to/src",
    save_dir="/path/to/dst",
    hash="ahash",
    threshold=5,
    max_workers=8,
)
deduper.run()
```
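
To illustrate what a Hamming distance threshold means for perceptual hashes, here is a simplified average-hash sketch in NumPy. It downsamples by block averaging rather than a proper image resize, and it treats `distance <= threshold` as "duplicate". Both are assumptions for illustration and may not match what ImageFilter does internally. The function names are hypothetical.

```python
import numpy as np

def average_hash_sketch(gray, hash_size=8):
    """Return a flat boolean hash of a 2-D grayscale image (simplified aHash)."""
    gray = np.asarray(gray, dtype=np.float32)
    h, w = gray.shape
    # Crop so the image divides evenly into hash_size x hash_size blocks.
    bh, bw = h // hash_size, w // hash_size
    blocks = gray[: bh * hash_size, : bw * hash_size]
    small = blocks.reshape(hash_size, bh, hash_size, bw).mean(axis=(1, 3))
    # Each bit records whether a block is brighter than the overall mean.
    return (small > small.mean()).flatten()

def hamming_distance(a, b):
    """Number of differing bits between two hashes."""
    return int(np.count_nonzero(a != b))

img = np.zeros((16, 16))
img[:, 8:] = 255  # right half bright
same = hamming_distance(average_hash_sketch(img), average_hash_sketch(img))
flipped = hamming_distance(average_hash_sketch(img), average_hash_sketch(img[:, ::-1]))
print(same, flipped)  # 0 64
```

With threshold=5, the first pair would count as duplicates and the second would not.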

hash_exts = ('.jpg', '.jpeg', '.png', '.bmp', '.webp', '.tiff', '.gif')¶

run()[source]¶

imgalz.utils.dataset_info¶

class imgalz.utils.dataset_info.CocoConfig[source]¶

Bases: object

category = ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch', 'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush', 'text']¶

kpt_color = array([[ 0, 255, 0], [ 0, 255, 0], [ 0, 255, 0], [ 0, 255, 0], [ 0, 255, 0], [255, 128, 0], [255, 128, 0], [255, 128, 0], [255, 128, 0], [255, 128, 0], [255, 128, 0], [ 51, 153, 255], [ 51, 153, 255], [ 51, 153, 255], [ 51, 153, 255], [ 51, 153, 255], [ 51, 153, 255]], dtype=uint8)¶

limb_color = array([[ 51, 153, 255], [ 51, 153, 255], [ 51, 153, 255], [ 51, 153, 255], [255, 51, 255], [255, 51, 255], [255, 51, 255], [255, 128, 0], [255, 128, 0], [255, 128, 0], [255, 128, 0], [255, 128, 0], [ 0, 255, 0], [ 0, 255, 0], [ 0, 255, 0], [ 0, 255, 0], [ 0, 255, 0], [ 0, 255, 0], [ 0, 255, 0]], dtype=uint8)¶

palette = array([[255, 128, 0], [255, 153, 51], [255, 178, 102], [230, 230, 0], [255, 153, 255], [153, 204, 255], [255, 102, 255], [255, 51, 255], [102, 178, 255], [ 51, 153, 255], [255, 153, 153], [255, 102, 102], [255, 51, 51], [153, 255, 153], [102, 255, 102], [ 51, 255, 51], [ 0, 255, 0]], dtype=uint8)¶

skeleton = [[16, 14], [14, 12], [17, 15], [15, 13], [12, 13], [6, 12], [7, 13], [6, 7], [6, 8], [7, 9], [8, 10], [9, 11], [2, 3], [1, 2], [1, 3], [2, 4], [3, 5], [4, 6], [5, 7]]¶
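
The skeleton pairs above run from 1 to 17, so they are 1-based keypoint indices. When indexing a (17, 3) keypoint array directly, subtract 1 first. Whether draw_keypoints performs this shift internally is not stated here, so the sketch below only shows manual indexing, with made-up keypoint values.

```python
import numpy as np

# Copied from CocoConfig.skeleton as listed above.
skeleton = [[16, 14], [14, 12], [17, 15], [15, 13], [12, 13], [6, 12], [7, 13],
            [6, 7], [6, 8], [7, 9], [8, 10], [9, 11], [2, 3], [1, 2], [1, 3],
            [2, 4], [3, 5], [4, 6], [5, 7]]

limbs = np.array(skeleton) - 1          # convert to 0-based indices
keypoints = np.zeros((17, 3))           # placeholder [x, y, conf] rows
starts = keypoints[limbs[:, 0], :2]     # (19, 2) limb start points
ends = keypoints[limbs[:, 1], :2]       # (19, 2) limb end points
print(limbs.min(), limbs.max(), starts.shape)  # 0 16 (19, 2)
```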