pipert.utils.structures

Package Contents

Classes

Boxes

This structure stores a list of boxes as an Nx4 torch.Tensor.

BoxMode

Enum of different ways to represent a box.

ImageList

Structure that holds a list of images (of possibly varying sizes) as a single tensor.

Instances

This class represents a list of instances in an image.

Keypoints

Stores keypoint annotation data.

PolygonMasks

This class stores the segmentation masks for all objects in one image, in the form of polygons.

Functions

pairwise_iou(boxes1: pipert.utils.structures.boxes.Boxes, boxes2: pipert.utils.structures.boxes.Boxes) → torch.Tensor

Given two lists of boxes of size N and M, compute the IoU (intersection over union) between all N x M pairs of boxes.

heatmaps_to_keypoints(maps: torch.Tensor, rois: torch.Tensor) → torch.Tensor

Extract predicted keypoint locations from heatmaps.

rasterize_polygons_within_box(polygons: List[np.ndarray], box: numpy.ndarray, mask_size: int) → torch.Tensor

Rasterize the polygons into a mask image and crop the mask content in the given box.

class pipert.utils.structures.Boxes(tensor: torch.Tensor)

This structure stores a list of boxes as an Nx4 torch.Tensor. It supports common box operations (area, clip, nonempty, etc.) and also behaves like a Tensor: it supports indexing, to(device), .device, and iteration over all boxes.

Attributes:

tensor: float matrix of Nx4.

BoxSizeType
clone(self) → pipert.utils.structures.boxes.Boxes

Clone the Boxes.

Returns:

Boxes

to(self, device: str) → pipert.utils.structures.boxes.Boxes
area(self) → torch.Tensor

Computes the area of all the boxes.

Returns:

torch.Tensor: a vector with areas of each box.

clip(self, box_size: BoxSizeType) → None

Clip (in place) the boxes by limiting x coordinates to the range [0, width] and y coordinates to the range [0, height].

Args:

box_size (height, width): The clipping box’s size.
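The clipping rule can be sketched without PyTorch. The snippet below is only an illustration of the arithmetic described above, not the library's implementation: plain Python lists stand in for the Nx4 tensor, and clip_boxes_sketch is a hypothetical helper name.

```python
from typing import List, Tuple

Box = List[float]  # [x0, y0, x1, y1]; plain lists stand in for tensor rows


def clip_boxes_sketch(boxes: List[Box], box_size: Tuple[int, int]) -> None:
    """Clip boxes in place. box_size is (height, width), as documented above."""
    height, width = float(box_size[0]), float(box_size[1])
    for box in boxes:
        box[0] = min(max(box[0], 0.0), width)   # x0 limited to [0, width]
        box[1] = min(max(box[1], 0.0), height)  # y0 limited to [0, height]
        box[2] = min(max(box[2], 0.0), width)   # x1 limited to [0, width]
        box[3] = min(max(box[3], 0.0), height)  # y1 limited to [0, height]


boxes = [[-5.0, 10.0, 50.0, 120.0], [20.0, 30.0, 700.0, 40.0]]
clip_boxes_sketch(boxes, (100, 640))
print(boxes)  # [[0.0, 10.0, 50.0, 100.0], [20.0, 30.0, 640.0, 40.0]]
```

Note that box_size is ordered (height, width) while box coordinates are ordered (x, y), so the width bounds the x values and the height bounds the y values.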

nonempty(self, threshold: int = 0) → torch.Tensor

Find boxes that are non-empty. A box is considered empty if either of its sides is no larger than threshold.

Returns:

Tensor: a binary vector which represents whether each box is empty (False) or non-empty (True).
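As a rough illustration of the emptiness test, the sketch below applies the rule stated above to plain Python lists. The function name nonempty_sketch is hypothetical, and the real method returns a torch.Tensor rather than a list of booleans.

```python
from typing import List, Sequence


def nonempty_sketch(boxes: Sequence[Sequence[float]], threshold: float = 0) -> List[bool]:
    """Return True for each box whose width and height both exceed threshold."""
    keep = []
    for x0, y0, x1, y1 in boxes:
        keep.append((x1 - x0) > threshold and (y1 - y0) > threshold)
    return keep


boxes = [[0.0, 0.0, 10.0, 10.0], [5.0, 5.0, 5.0, 20.0], [0.0, 0.0, 1.0, 1.0]]
print(nonempty_sketch(boxes))               # [True, False, True]
print(nonempty_sketch(boxes, threshold=1))  # [True, False, False]
```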

__getitem__(self, item: Union[int, slice, torch.BoolTensor]) → pipert.utils.structures.boxes.Boxes
Returns:

Boxes: a new Boxes created by indexing.

The following usages are allowed:

  1. new_boxes = boxes[3]: return a Boxes which contains only one box.

  2. new_boxes = boxes[2:10]: return a slice of boxes.

  3. new_boxes = boxes[vector], where vector is a torch.BoolTensor with length = len(boxes). Nonzero elements in the vector will be selected.

Note that the returned Boxes might share storage with this Boxes, subject to PyTorch’s indexing semantics.

__len__(self) → int
__repr__(self) → str

Return repr(self).

inside_box(self, box_size: BoxSizeType, boundary_threshold: int = 0) → torch.Tensor
Args:

box_size (height, width): Size of the reference box.

boundary_threshold (int): Boxes that extend beyond the reference box boundary by more than boundary_threshold are considered “outside”.

Returns:

a binary vector, indicating whether each box is inside the reference box.

get_centers(self) → torch.Tensor

Returns: The box centers in an Nx2 array of (x, y).

scale(self, scale_x: float, scale_y: float) → None

Scale the box with horizontal and vertical scaling factors.

static cat(boxes_list: List['Boxes']) → pipert.utils.structures.boxes.Boxes

Concatenates a list of Boxes into a single Boxes.

Arguments:

boxes_list (list[Boxes])

Returns:

Boxes: the concatenated Boxes

property device(self) → str
__iter__(self) → Iterator[torch.Tensor]

Yield a box as a Tensor of shape (4,) at a time.

class pipert.utils.structures.BoxMode

Bases: enum.Enum

Enum of different ways to represent a box.

Attributes:

XYXY_ABS: (x0, y0, x1, y1) in absolute floating point coordinates. The coordinates are in the range [0, width or height].

XYWH_ABS: (x0, y0, w, h) in absolute floating point coordinates.

XYXY_REL: (x0, y0, x1, y1) in range [0, 1]. They are relative to the size of the image.

XYWH_REL: (x0, y0, w, h) in range [0, 1]. They are relative to the size of the image.

XYXY_ABS = 0
XYWH_ABS = 1
XYXY_REL = 2
XYWH_REL = 3
static convert(box: _RawBoxType, from_mode: pipert.utils.structures.boxes.BoxMode, to_mode: pipert.utils.structures.boxes.BoxMode) → _RawBoxType
Args:

box: can be a 4-tuple, 4-list or an Nx4 array/tensor.

from_mode, to_mode (BoxMode)

Returns:

The converted box of the same type.
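The relationship between the two absolute modes can be sketched as follows. This is a minimal pure-Python illustration of the coordinate arithmetic implied by the attribute descriptions, not the body of convert. The helper names are hypothetical, and the relative modes are left out because converting them would also require the image size, which the convert signature does not take.

```python
from typing import Tuple

Box4 = Tuple[float, float, float, float]


def xywh_abs_to_xyxy_abs(box: Box4) -> Box4:
    """(x0, y0, w, h) -> (x0, y0, x1, y1): add the size to the top-left corner."""
    x0, y0, w, h = box
    return (x0, y0, x0 + w, y0 + h)


def xyxy_abs_to_xywh_abs(box: Box4) -> Box4:
    """(x0, y0, x1, y1) -> (x0, y0, w, h): subtract the corners to get the size."""
    x0, y0, x1, y1 = box
    return (x0, y0, x1 - x0, y1 - y0)


xywh = (10.0, 20.0, 30.0, 40.0)
xyxy = xywh_abs_to_xyxy_abs(xywh)
print(xyxy)                                # (10.0, 20.0, 40.0, 60.0)
print(xyxy_abs_to_xywh_abs(xyxy) == xywh)  # True
```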

pipert.utils.structures.pairwise_iou(boxes1: pipert.utils.structures.boxes.Boxes, boxes2: pipert.utils.structures.boxes.Boxes) → torch.Tensor

Given two lists of boxes of size N and M, compute the IoU (intersection over union) between all N x M pairs of boxes. The box order must be (xmin, ymin, xmax, ymax).

Args:

boxes1, boxes2 (Boxes): two Boxes, containing N and M boxes respectively.

Returns:

Tensor: IoU, sized [N,M].
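For readers unfamiliar with the metric, here is a small sketch of the N x M IoU computation on plain lists. It assumes the (xmin, ymin, xmax, ymax) order required above. The name pairwise_iou_sketch is hypothetical, and the nested loops are for clarity only; the library function operates on Boxes and returns a tensor.

```python
from typing import List, Sequence


def pairwise_iou_sketch(
    boxes1: Sequence[Sequence[float]], boxes2: Sequence[Sequence[float]]
) -> List[List[float]]:
    """Return an N x M nested list where entry [i][j] is IoU(boxes1[i], boxes2[j])."""
    result = []
    for ax0, ay0, ax1, ay1 in boxes1:
        area_a = (ax1 - ax0) * (ay1 - ay0)
        row = []
        for bx0, by0, bx1, by1 in boxes2:
            area_b = (bx1 - bx0) * (by1 - by0)
            # Width and height of the overlap, clamped at zero for disjoint boxes.
            inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
            inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
            inter = inter_w * inter_h
            union = area_a + area_b - inter
            row.append(inter / union if inter > 0 else 0.0)
        result.append(row)
    return result


a = [[0.0, 0.0, 10.0, 10.0], [0.0, 0.0, 5.0, 5.0]]
b = [[0.0, 0.0, 10.0, 10.0], [5.0, 5.0, 15.0, 15.0], [20.0, 20.0, 30.0, 30.0]]
iou = pairwise_iou_sketch(a, b)
print(len(iou), len(iou[0]))  # 2 3
print(iou[0][0], iou[1][0])   # 1.0 0.25
```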

class pipert.utils.structures.ImageList(tensor: torch.Tensor, image_sizes: List[Tuple[int, int]])

Bases: object

Structure that holds a list of images (of possibly varying sizes) as a single tensor. This works by padding the images to the same size and storing the original size of each image in a field.

Attributes:

image_sizes (list[tuple[int, int]]): each tuple is (h, w)

__len__(self) → int
__getitem__(self, idx: Union[int, slice]) → torch.Tensor

Access the individual image in its original size.

Returns:

Tensor: an image of shape (H, W) or (C_1, …, C_K, H, W) where K >= 1

to(self, *args: Any, **kwargs: Any) → pipert.utils.structures.image_list.ImageList
static from_tensors(tensors: Sequence[torch.Tensor], size_divisibility: int = 0, pad_value: float = 0.0) → pipert.utils.structures.image_list.ImageList
Args:

tensors: a tuple or list of torch.Tensors, each of shape (Hi, Wi) or (C_1, …, C_K, Hi, Wi) where K >= 1. The Tensors will be padded with pad_value so that they will have the same shape.

size_divisibility (int): If size_divisibility > 0, also adds padding to ensure the common height and width are divisible by size_divisibility.

pad_value (float): value to pad with.

Returns:

an ImageList.
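The effect of size_divisibility on the padded shape can be sketched with plain integers. The helper below is hypothetical and covers only the shape computation: it takes the largest height and width in the batch and, when size_divisibility > 0, rounds each up to the next multiple. Where the padding is placed in the tensor is not covered here.

```python
from typing import List, Tuple


def padded_hw_sketch(
    image_sizes: List[Tuple[int, int]], size_divisibility: int = 0
) -> Tuple[int, int]:
    """Return the common (height, width) that all images would be padded to."""
    max_h = max(h for h, _ in image_sizes)
    max_w = max(w for _, w in image_sizes)
    if size_divisibility > 0:
        d = size_divisibility
        # Round up to the nearest multiple of d using integer arithmetic.
        max_h = (max_h + d - 1) // d * d
        max_w = (max_w + d - 1) // d * d
    return max_h, max_w


sizes = [(480, 640), (500, 333)]          # each tuple is (h, w)
print(padded_hw_sketch(sizes))            # (500, 640)
print(padded_hw_sketch(sizes, 32))        # (512, 640)
```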

class pipert.utils.structures.Instances(image_size: Tuple[int, int], **kwargs: Any)

This class represents a list of instances in an image. It stores the attributes of instances (e.g., boxes, masks, labels, scores) as “fields”. All fields must have the same __len__ which is the number of instances.

All other (non-field) attributes of this class are considered private: they must start with ‘_’ and are not modifiable by a user.

Some basic usage:

  1. Set/Get a field:

     instances.gt_boxes = Boxes(…)
     print(instances.pred_masks)
     print('gt_masks' in instances)

  2. len(instances) returns the number of instances

  3. Indexing: instances[indices] will apply the indexing on all the fields and returns a new Instances. Typically, indices is a binary vector of length num_instances, or a vector of integer indices.
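To make the "fields share one length" contract concrete, here is a toy stand-in written in plain Python. ToyInstances and its select method are hypothetical and exist only for illustration: the real Instances class stores tensors and structures such as Boxes, and is indexed with instances[indices] rather than a select call.

```python
from typing import Any, Dict, List, Tuple


class ToyInstances:
    """Hypothetical stand-in for Instances; fields are plain lists here."""

    def __init__(self, image_size: Tuple[int, int], **fields: Any) -> None:
        self._image_size = image_size
        self._fields: Dict[str, List[Any]] = {}
        for name, value in fields.items():
            self.set(name, value)

    def set(self, name: str, value: List[Any]) -> None:
        # Every field must describe the same number of instances.
        if self._fields and len(value) != len(self):
            raise ValueError(
                f"field {name!r} has length {len(value)}, expected {len(self)}"
            )
        self._fields[name] = value

    def get(self, name: str) -> List[Any]:
        return self._fields[name]

    def has(self, name: str) -> bool:
        return name in self._fields

    def __len__(self) -> int:
        # All fields agree on length, so the first one is representative.
        for value in self._fields.values():
            return len(value)
        return 0

    def select(self, mask: List[bool]) -> "ToyInstances":
        # Apply the same boolean mask to every field.
        kept = {
            name: [v for v, keep in zip(values, mask) if keep]
            for name, values in self._fields.items()
        }
        return ToyInstances(self._image_size, **kept)


inst = ToyInstances((480, 640), scores=[0.9, 0.2, 0.7], labels=[1, 3, 1])
confident = inst.select([s > 0.5 for s in inst.get("scores")])
print(len(confident), confident.get("labels"))  # 2 [1, 1]
```

Filtering every field with one mask is what keeps boxes, scores, and labels aligned after a confidence threshold or similar selection.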

property image_size(self) → Tuple[int, int]

Returns: tuple: height, width

__setattr__(self, name: str, val: Any) → None

Implement setattr(self, name, value).

__getattr__(self, name: str) → Any
set(self, name: str, value: Any) → None

Set the field named name to value. The length of value must be the number of instances, and must agree with other existing fields in this object.

has(self, name: str) → bool

Returns: bool: whether the field called name exists.

remove(self, name: str) → None

Remove the field called name.

get(self, name: str) → Any

Returns the field called name.

get_fields(self) → Dict[str, Any]
Returns:

dict: a dict which maps names (str) to data of the fields

Modifying the returned dict will modify this instance.

to(self, device: str) → pipert.utils.structures.instances.Instances

Returns: Instances: an Instances in which to(device) has been called on every field that has this method.

__getitem__(self, item: Union[int, slice, torch.BoolTensor]) → pipert.utils.structures.instances.Instances
Args:

item: an index-like object that will be used to index all the fields.

Returns:

If item is a string, return the data in the corresponding field. Otherwise, returns an Instances where all fields are indexed by item.

__len__(self) → int
static cat(instance_lists: List['Instances']) → pipert.utils.structures.instances.Instances
Args:

instance_lists (list[Instances])

Returns:

Instances

__str__(self) → str

Return str(self).

__repr__(self) → str

Return repr(self).

class pipert.utils.structures.Keypoints(keypoints: Union[torch.Tensor, np.ndarray, List[List[float]]])

Stores keypoint annotation data. GT Instances have a gt_keypoints property containing the x,y location and visibility flag of each keypoint. This tensor has shape (N, K, 3) where N is the number of instances and K is the number of keypoints per instance.

The visibility flag follows the COCO format and must be one of three integers:

  * v=0: not labeled (in which case x=y=0)

  * v=1: labeled but not visible

  * v=2: labeled and visible

__len__(self) → int
to(self, *args: Any, **kwargs: Any) → pipert.utils.structures.keypoints.Keypoints
to_heatmap(self, boxes: torch.Tensor, heatmap_size: int) → torch.Tensor
Arguments:

boxes: Nx4 tensor, the boxes to draw the keypoints to

Returns:

heatmaps: A tensor of shape (N, K) containing an integer spatial label in the range [0, heatmap_size**2 - 1] for each keypoint in the input.

valid: A tensor of shape (N, K) containing whether each keypoint is in the roi or not.
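One way to picture the integer spatial label is as a flattened cell index in a heatmap_size x heatmap_size grid laid over the box. The sketch below is an assumption-laden illustration, not the library's code: it assumes row-major flattening (label = row * heatmap_size + col), treats points on the box border as inside, and ignores the visibility flag. The helper name keypoint_to_label_sketch is hypothetical.

```python
import math
from typing import Sequence, Tuple


def keypoint_to_label_sketch(
    x: float, y: float, box: Sequence[float], heatmap_size: int
) -> Tuple[int, bool]:
    """Map one keypoint to (label, valid) for a box given as (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    if not (x0 <= x <= x1 and y0 <= y <= y1):
        return 0, False  # keypoint falls outside the ROI
    # Scale the offset inside the box to grid cells; clamp the far edge into the last cell.
    col = min(int(math.floor((x - x0) * heatmap_size / (x1 - x0))), heatmap_size - 1)
    row = min(int(math.floor((y - y0) * heatmap_size / (y1 - y0))), heatmap_size - 1)
    return row * heatmap_size + col, True  # assumed row-major flattening


box = [0.0, 0.0, 100.0, 100.0]
print(keypoint_to_label_sketch(30.0, 60.0, box, 4))    # (9, True)
print(keypoint_to_label_sketch(100.0, 100.0, box, 4))  # (15, True)
print(keypoint_to_label_sketch(150.0, 10.0, box, 4))   # (0, False)
```

With heatmap_size = 4 the labels span 0 to 15, matching the documented range [0, heatmap_size**2 - 1].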

__getitem__(self, item: Union[int, slice, torch.BoolTensor]) → pipert.utils.structures.keypoints.Keypoints

Create a new Keypoints by indexing on this Keypoints.

The following usages are allowed:

  1. new_kpts = kpts[3]: return a Keypoints which contains only one instance.

  2. new_kpts = kpts[2:10]: return a slice of key points.

  3. new_kpts = kpts[vector], where vector is a torch.BoolTensor with length = len(kpts). Nonzero elements in the vector will be selected.

Note that the returned Keypoints might share storage with this Keypoints, subject to PyTorch’s indexing semantics.

__repr__(self) → str

Return repr(self).

pipert.utils.structures.heatmaps_to_keypoints(maps: torch.Tensor, rois: torch.Tensor) → torch.Tensor
Args:

maps (Tensor): (#ROIs, #keypoints, POOL_H, POOL_W)

rois (Tensor): (#ROIs, 4)

Extract predicted keypoint locations from heatmaps. Output has shape (#rois, #keypoints, 4) with the last dimension corresponding to (x, y, logit, prob) for each keypoint.

Converts a discrete image coordinate in an NxN image to a continuous keypoint coordinate. We maintain consistency with keypoints_to_heatmap by using the conversion from Heckbert 1990: c = d + 0.5, where d is a discrete coordinate and c is a continuous coordinate.
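The c = d + 0.5 rule places a discrete cell index at the center of its cell. The sketch below applies it to map a heatmap cell back into image coordinates. It is a simplified illustration under stated assumptions: the heatmap is taken to cover the ROI uniformly, and the helper names are hypothetical rather than part of the library.

```python
from typing import Sequence, Tuple


def discrete_to_continuous(d: int) -> float:
    """Heckbert 1990 convention: cell d covers [d, d + 1), so its center is d + 0.5."""
    return d + 0.5


def heatmap_cell_to_image_xy(
    col: int, row: int, roi: Sequence[float], heatmap_w: int, heatmap_h: int
) -> Tuple[float, float]:
    """Map a heatmap cell (col, row) to image coordinates inside roi (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = roi
    x = x0 + discrete_to_continuous(col) * (x1 - x0) / heatmap_w
    y = y0 + discrete_to_continuous(row) * (y1 - y0) / heatmap_h
    return x, y


roi = [10.0, 20.0, 110.0, 220.0]  # 100 wide, 200 tall
print(heatmap_cell_to_image_xy(1, 2, roi, 4, 4))  # (47.5, 145.0)
```

Without the 0.5 offset, every predicted keypoint would be biased toward the top-left corner of its cell by half a cell.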

class pipert.utils.structures.PolygonMasks(polygons: List[List[Union[torch.Tensor, np.ndarray]]])

This class stores the segmentation masks for all objects in one image, in the form of polygons.

Attributes:

polygons: list[list[ndarray]]. Each ndarray is a float64 vector representing a polygon.

to(self, *args: Any, **kwargs: Any) → pipert.utils.structures.masks.PolygonMasks
get_bounding_boxes(self) → pipert.utils.structures.boxes.Boxes

Returns: Boxes: tight bounding boxes around polygon masks.

nonempty(self) → torch.Tensor

Find masks that are non-empty.

Returns:

Tensor: a BoolTensor which represents whether each mask is empty (False) or not (True).

__getitem__(self, item: Union[int, slice, List[int], torch.BoolTensor]) → pipert.utils.structures.masks.PolygonMasks

Support indexing over the instances and return a PolygonMasks object. item can be:

  1. An integer. It will return an object with only one instance.

  2. A slice. It will return an object with the selected instances.

  3. A list[int]. It will return an object with the selected instances, corresponding to the indices in the list.

  4. A vector mask of type BoolTensor, whose length is num_instances. It will return an object with the instances whose mask is nonzero.

__iter__(self) → Iterator[List[torch.Tensor]]

Yields: list[ndarray]: the polygons for one instance. Each Tensor is a float64 vector representing a polygon.

__repr__(self) → str

Return repr(self).

__len__(self) → int
crop_and_resize(self, boxes: torch.Tensor, mask_size: int) → torch.Tensor

Crop each mask by the given box, and resize results to (mask_size, mask_size). This can be used to prepare training targets for Mask R-CNN.

Args:

boxes (Tensor): Nx4 tensor storing the boxes for each mask

mask_size (int): the size of the rasterized mask.

Returns:

Tensor: A bool tensor of shape (N, mask_size, mask_size), where N is the number of predicted boxes for this image.

pipert.utils.structures.rasterize_polygons_within_box(polygons: List[np.ndarray], box: numpy.ndarray, mask_size: int) → torch.Tensor

Rasterize the polygons into a mask image and crop the mask content in the given box. The cropped mask is resized to (mask_size, mask_size).

This function is used when generating training targets for mask head in Mask R-CNN. Given original ground-truth masks for an image, new ground-truth mask training targets in the size of mask_size x mask_size must be provided for each predicted box. This function will be called to produce such targets.

Args:

polygons (list[ndarray[float]]): a list of polygons, which represents an instance.

box: 4-element numpy array

mask_size (int):

Returns:

Tensor: BoolTensor of shape (mask_size, mask_size)
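The crop-and-resize step can be thought of as a change of coordinates applied to each polygon before it is filled in. The sketch below shows only that coordinate transform, under two assumptions that the text above does not state: each polygon is a flat vector [x0, y0, x1, y1, ...], and the box is ordered (x0, y0, x1, y1). The helper name is hypothetical and the rasterization itself is omitted.

```python
from typing import List, Sequence


def polygon_to_mask_frame(
    polygon: Sequence[float], box: Sequence[float], mask_size: int
) -> List[float]:
    """Translate a flat [x, y, x, y, ...] polygon to the box origin and scale it to mask_size."""
    x0, y0, x1, y1 = box
    box_w = x1 - x0
    box_h = y1 - y0
    out = []
    for i in range(0, len(polygon), 2):
        # Shift so the box's top-left corner is the origin, then stretch to the mask grid.
        out.append((polygon[i] - x0) * mask_size / box_w)
        out.append((polygon[i + 1] - y0) * mask_size / box_h)
    return out


square = [10.0, 10.0, 30.0, 10.0, 30.0, 30.0, 10.0, 30.0]
print(polygon_to_mask_frame(square, [10.0, 10.0, 30.0, 30.0], 28))
# [0.0, 0.0, 28.0, 0.0, 28.0, 28.0, 0.0, 28.0]
```

A polygon that exactly fills the box maps to the full mask_size x mask_size square, which is why the resulting target has the same resolution for every predicted box.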

pipert.utils.structures.__all__