`pipert.utils.structures`¶

Package Contents¶

Classes¶

`Boxes`	This structure stores a list of boxes as a Nx4 torch.Tensor.
`BoxMode`	Enum of different ways to represent a box.
`ImageList`	Structure that holds a list of images (of possibly varying sizes) as a single tensor.
`Instances`	This class represents a list of instances in an image.
`Keypoints`	Stores keypoint annotation data. GT Instances have a gt_keypoints property containing the x,y location and visibility flag of each keypoint.
`PolygonMasks`	This class stores the segmentation masks for all objects in one image, in the form of polygons.

Functions¶

`pairwise_iou`(boxes1: pipert.utils.structures.boxes.Boxes, boxes2: pipert.utils.structures.boxes.Boxes) → torch.Tensor	Given two lists of boxes of size N and M, compute the IoU (intersection over union) between all N x M pairs of boxes.
`heatmaps_to_keypoints`(maps: torch.Tensor, rois: torch.Tensor) → torch.Tensor	Extract predicted keypoint locations from heatmaps.
`rasterize_polygons_within_box`(polygons: List[np.ndarray], box: numpy.ndarray, mask_size: int) → torch.Tensor	Rasterize the polygons into a mask image and crop the mask content in the given box.

class pipert.utils.structures.Boxes(tensor: torch.Tensor)[source]¶

This structure stores a list of boxes as an Nx4 torch.Tensor. It supports common box operations (area, clip, nonempty, etc.) and also behaves like a Tensor: it supports indexing, to(device), .device, and iteration over all boxes.

Attributes:: tensor: float matrix of shape Nx4.

BoxSizeType¶

clone(self) → pipert.utils.structures.boxes.Boxes ¶

Clone the Boxes.

Returns:: Boxes

to(self, device: str) → pipert.utils.structures.boxes.Boxes ¶

area(self) → torch.Tensor¶

Computes the area of all the boxes.

Returns:: torch.Tensor: a vector with areas of each box.

clip(self, box_size: BoxSizeType) → None ¶

Clip (in place) the boxes by limiting x coordinates to the range [0, width] and y coordinates to the range [0, height].

Args:: box_size (height, width): The clipping box’s size.
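The clamping described above can be sketched without torch for a single box. This is a minimal illustration rather than the library code: `clip_box` is a hypothetical helper, and unlike `Boxes.clip` it returns a new tuple instead of modifying anything in place.

```python
def clip_box(box, box_size):
    """Clamp one (x0, y0, x1, y1) box to an image of size (height, width).

    Hypothetical helper for illustration; not part of pipert.
    """
    height, width = box_size
    x0, y0, x1, y1 = box

    def clamp(value, upper):
        # Limit value to the closed range [0, upper].
        return min(max(value, 0), upper)

    return (clamp(x0, width), clamp(y0, height),
            clamp(x1, width), clamp(y1, height))


clipped = clip_box((-5, 10, 120, 90), (80, 100))
print(clipped)
```

Note the argument order: box_size is (height, width), while box coordinates are ordered x first.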

nonempty(self, threshold: int = 0) → torch.Tensor¶

Find boxes that are non-empty. A box is considered empty if either of its sides is no larger than threshold.

Returns:: Tensor: a binary vector which represents whether each box is empty (False) or non-empty (True).
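As a rough, torch-free sketch of the area and emptiness rules for one box in (x0, y0, x1, y1) form: `box_area` and `is_nonempty` are hypothetical names used only for illustration, and the real methods operate on the whole Nx4 tensor at once.

```python
def box_area(box):
    """Area of one (x0, y0, x1, y1) box. Hypothetical helper, not pipert API."""
    x0, y0, x1, y1 = box
    return (x1 - x0) * (y1 - y0)


def is_nonempty(box, threshold=0):
    """True if both sides of the box are strictly larger than threshold."""
    x0, y0, x1, y1 = box
    return (x1 - x0) > threshold and (y1 - y0) > threshold


boxes = [(0, 0, 10, 5), (0, 0, 10, 0)]
print([box_area(b) for b in boxes])
print([is_nonempty(b) for b in boxes])
```

A side that is exactly equal to threshold counts as empty, because the rule is "no larger than".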

__getitem__(self, item: Union[int, slice, torch.BoolTensor]) → pipert.utils.structures.boxes.Boxes ¶

Returns:: Boxes: Create a new Boxes by indexing.

The following usages are allowed:

new_boxes = boxes[3]: return a Boxes which contains only one box.
new_boxes = boxes[2:10]: return a slice of boxes.
new_boxes = boxes[vector], where vector is a torch.BoolTensor with length = len(boxes). Nonzero elements in the vector will be selected.

Note that the returned Boxes might share storage with this Boxes, subject to PyTorch’s indexing semantics.

__len__(self) → int ¶

__repr__(self) → str ¶: Return repr(self).

inside_box(self, box_size: BoxSizeType, boundary_threshold: int = 0) → torch.Tensor¶

Args:: box_size (height, width): Size of the reference box. boundary_threshold (int): Boxes that extend beyond the reference box boundary by more than boundary_threshold are considered “outside”.
Returns:: a binary vector, indicating whether each box is inside the reference box.
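One possible reading of this check, sketched for a single box without torch. `is_inside` is a hypothetical helper, and treating a box that extends by exactly boundary_threshold as still inside is an assumption drawn from the wording "by more than"; the actual implementation may use a strict comparison on some sides.

```python
def is_inside(box, box_size, boundary_threshold=0):
    """Check one (x0, y0, x1, y1) box against a (height, width) reference box.

    Hypothetical helper; inclusive bounds are an assumption, not confirmed
    against the pipert source.
    """
    height, width = box_size
    x0, y0, x1, y1 = box
    return (
        x0 >= -boundary_threshold
        and y0 >= -boundary_threshold
        and x1 <= width + boundary_threshold
        and y1 <= height + boundary_threshold
    )


print(is_inside((-1, 0, 50, 50), (80, 100)))
print(is_inside((-1, 0, 50, 50), (80, 100), boundary_threshold=1))
```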

get_centers(self) → torch.Tensor¶: Returns: The box centers in an Nx2 array of (x, y).

scale(self, scale_x: float, scale_y: float) → None ¶: Scale the box with horizontal and vertical scaling factors

static cat(boxes_list: List['Boxes']) → pipert.utils.structures.boxes.Boxes ¶

Concatenates a list of Boxes into a single Boxes

Arguments:: boxes_list (list[Boxes])
Returns:: Boxes: the concatenated Boxes

property device(self) → str ¶

__iter__(self) → Iterator[torch.Tensor]¶: Yield a box as a Tensor of shape (4,) at a time.

class pipert.utils.structures.BoxMode[source]¶

Bases: enum.Enum

Enum of different ways to represent a box.

Attributes:

XYXY_ABS: (x0, y0, x1, y1) in absolute floating-point coordinates. The coordinates are in the range [0, width or height].
XYWH_ABS: (x0, y0, w, h) in absolute floating-point coordinates.
XYXY_REL: (x0, y0, x1, y1) in range [0, 1]. They are relative to the size of the image.
XYWH_REL: (x0, y0, w, h) in range [0, 1]. They are relative to the size of the image.

XYXY_ABS = 0¶

XYWH_ABS = 1¶

XYXY_REL = 2¶

XYWH_REL = 3¶

static convert(box: _RawBoxType, from_mode: pipert.utils.structures.boxes.BoxMode, to_mode: pipert.utils.structures.boxes.BoxMode) → _RawBoxType¶

Args:: box: can be a 4-tuple, 4-list or a Nx4 array/tensor. from_mode, to_mode (BoxMode)
Returns:: The converted box of the same type.
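The arithmetic behind converting between the two absolute modes can be sketched as follows. The function names are hypothetical stand-ins, not the library's API, and only the XYWH_ABS and XYXY_ABS pair is shown because the relative modes additionally depend on an image size that this signature does not take.

```python
def xywh_to_xyxy(box):
    """(x0, y0, w, h) -> (x0, y0, x1, y1). Hypothetical helper."""
    x0, y0, w, h = box
    return (x0, y0, x0 + w, y0 + h)


def xyxy_to_xywh(box):
    """(x0, y0, x1, y1) -> (x0, y0, w, h). Hypothetical helper."""
    x0, y0, x1, y1 = box
    return (x0, y0, x1 - x0, y1 - y0)


corners = xywh_to_xyxy((10, 20, 30, 40))
print(corners)
print(xyxy_to_xywh(corners))
```

The two conversions are inverses of each other, so a round trip returns the original box.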

pipert.utils.structures.pairwise_iou(boxes1: pipert.utils.structures.boxes.Boxes, boxes2: pipert.utils.structures.boxes.Boxes) → torch.Tensor[source]¶

Given two lists of boxes of size N and M, compute the IoU (intersection over union) between all N x M pairs of boxes. The box order must be (xmin, ymin, xmax, ymax).

Args:: boxes1,boxes2 (Boxes): two Boxes. Contains N & M boxes, respectively.
Returns:: Tensor: IoU, sized [N,M].
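For intuition, the N x M IoU matrix can be computed with plain Python lists. This is only a sketch of the definition: `iou` and `pairwise_iou_lists` are hypothetical names, the real function works on Boxes and returns a torch.Tensor, and returning 0.0 for a zero-area union is an assumption.

```python
def iou(a, b):
    """IoU of two (xmin, ymin, xmax, ymax) boxes. Hypothetical helper."""
    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(min(a[2], b[2]) - max(a[0], b[0]), 0)
    inter_h = max(min(a[3], b[3]) - max(a[1], b[1]), 0)
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    # Assumption: degenerate pairs with zero union get IoU 0.
    return inter / union if union > 0 else 0.0


def pairwise_iou_lists(boxes1, boxes2):
    """Return an N x M nested list, with entry [i][j] = IoU(boxes1[i], boxes2[j])."""
    return [[iou(a, b) for b in boxes2] for a in boxes1]


matrix = pairwise_iou_lists([(0, 0, 2, 2)], [(0, 0, 2, 2), (1, 1, 3, 3)])
print(matrix)
```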

class pipert.utils.structures.ImageList(tensor: torch.Tensor, image_sizes: List[Tuple[int, int]])[source]¶

Bases: object

Structure that holds a list of images (of possibly varying sizes) as a single tensor. This works by padding the images to the same size and storing the original size of each image in a field.

Attributes:: image_sizes (list[tuple[int, int]]): each tuple is (h, w)

__len__(self) → int ¶

__getitem__(self, idx: Union[int, slice]) → torch.Tensor¶

Access the individual image in its original size.

Returns:: Tensor: an image of shape (H, W) or (C_1, …, C_K, H, W) where K >= 1

to(self, *args: Any, **kwargs: Any) → pipert.utils.structures.image_list.ImageList ¶

static from_tensors(tensors: Sequence[torch.Tensor], size_divisibility: int = 0, pad_value: float = 0.0) → pipert.utils.structures.image_list.ImageList ¶

Args:

tensors: a tuple or list of torch.Tensors, each of shape (Hi, Wi) or (C_1, …, C_K, Hi, Wi) where K >= 1. The Tensors will be padded with pad_value so that they will have the same shape.
size_divisibility (int): If size_divisibility > 0, also adds padding to ensure the common height and width are divisible by size_divisibility.
pad_value (float): value to pad.

Returns:

an ImageList.
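The padded shape implied by these arguments can be worked out from the image sizes alone. The sketch below assumes the common size is the per-dimension maximum rounded up to the next multiple of size_divisibility; `padded_size` is a hypothetical helper, not part of the package.

```python
def padded_size(image_sizes, size_divisibility=0):
    """Common (height, width) that every image is padded to.

    image_sizes is a list of (h, w) tuples. Hypothetical helper; the
    round-up rule is an assumption based on the argument description.
    """
    max_h = max(h for h, _ in image_sizes)
    max_w = max(w for _, w in image_sizes)
    if size_divisibility > 0:
        d = size_divisibility
        # Round each dimension up to the next multiple of d.
        max_h = (max_h + d - 1) // d * d
        max_w = (max_w + d - 1) // d * d
    return max_h, max_w


sizes = [(480, 640), (500, 600)]
print(padded_size(sizes))
print(padded_size(sizes, size_divisibility=32))
```

With a divisibility of 32, the height of 500 rounds up to 512 while the width of 640 is already a multiple and stays unchanged.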

class pipert.utils.structures.Instances(image_size: Tuple[int, int], **kwargs: Any)[source]¶

This class represents a list of instances in an image. It stores the attributes of instances (e.g., boxes, masks, labels, scores) as “fields”. All fields must have the same __len__ which is the number of instances.

All other (non-field) attributes of this class are considered private: they must start with ‘_’ and are not modifiable by a user.

Some basic usage:

Set/Get a field:
    instances.gt_boxes = Boxes(…)
    print(instances.pred_masks)
    print('gt_masks' in instances)
len(instances) returns the number of instances.
Indexing: instances[indices] will apply the indexing on all the fields and returns a new Instances. Typically, indices is a binary vector of length num_instances, or a vector of integer indices.
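The "fields of equal length" idea can be illustrated with a small stand-in class built on plain lists. `FieldBag` is entirely hypothetical and much simpler than Instances: it supports only boolean-mask indexing, and returning 0 for the length of an empty container is an assumption.

```python
class FieldBag:
    """Toy stand-in for Instances: named fields that share one length."""

    def __init__(self, image_size, **fields):
        self.image_size = image_size
        self._fields = {}
        for name, value in fields.items():
            self.set(name, value)

    def set(self, name, value):
        # Every field must describe the same number of instances.
        if self._fields and len(value) != len(self):
            raise ValueError("field length does not match existing fields")
        self._fields[name] = value

    def get(self, name):
        return self._fields[name]

    def __len__(self):
        for value in self._fields.values():
            return len(value)
        return 0  # assumption: an empty container has length 0

    def __getitem__(self, mask):
        # Apply the same boolean mask to every field.
        picked = {
            name: [v for v, keep in zip(values, mask) if keep]
            for name, values in self._fields.items()
        }
        return FieldBag(self.image_size, **picked)


bag = FieldBag((480, 640), scores=[0.9, 0.2, 0.7], labels=[1, 2, 3])
kept = bag[[True, False, True]]
print(len(kept), kept.get("labels"))
```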

property image_size(self) → Tuple[int, int]¶: Returns: tuple: height, width

__setattr__(self, name: str, val: Any) → None ¶: Implement setattr(self, name, value).

__getattr__(self, name: str) → Any¶

set(self, name: str, value: Any) → None ¶: Set the field named name to value. The length of value must be the number of instances, and must agree with other existing fields in this object.

has(self, name: str) → bool ¶: Returns: bool: whether the field called name exists.

remove(self, name: str) → None ¶: Remove the field called name.

get(self, name: str) → Any¶: Returns the field called name.

get_fields(self) → Dict[str, Any]¶

Returns:: dict: a dict which maps names (str) to data of the fields

Modifying the returned dict will modify this instance.

to(self, device: str) → pipert.utils.structures.instances.Instances ¶: Returns: Instances: all fields are called with a to(device), if the field has this method.

__getitem__(self, item: Union[int, slice, torch.BoolTensor]) → pipert.utils.structures.instances.Instances ¶

Args:: item: an index-like object that will be used to index all the fields.
Returns:: If item is a string, returns the data in the corresponding field. Otherwise, returns an Instances where all fields are indexed by item.

__len__(self) → int ¶

static cat(instance_lists: List['Instances']) → pipert.utils.structures.instances.Instances ¶

Args:: instance_lists (list[Instances])
Returns:: Instances

__str__(self) → str ¶: Return str(self).

__repr__(self) → str ¶: Return repr(self).

class pipert.utils.structures.Keypoints(keypoints: Union[torch.Tensor, np.ndarray, List[List[float]]])[source]¶

Stores keypoint annotation data. GT Instances have a gt_keypoints property containing the x,y location and visibility flag of each keypoint. This tensor has shape (N, K, 3) where N is the number of instances and K is the number of keypoints per instance.

The visibility flag follows the COCO format and must be one of three integers:

v=0: not labeled (in which case x=y=0)
v=1: labeled but not visible
v=2: labeled and visible

__len__(self) → int ¶

to(self, *args: Any, **kwargs: Any) → pipert.utils.structures.keypoints.Keypoints ¶

to_heatmap(self, boxes: torch.Tensor, heatmap_size: int) → torch.Tensor¶

Arguments:

boxes: Nx4 tensor, the boxes to draw the keypoints to.

Returns:

heatmaps: A tensor of shape (N, K) containing an integer spatial label in the range [0, heatmap_size**2 - 1] for each keypoint in the input.
valid: A tensor of shape (N, K) containing whether each keypoint is in the roi or not.
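To make the "integer spatial label" concrete, here is one way a single keypoint could be mapped to a flat heatmap index. This is a sketch under stated assumptions: `keypoint_to_label` is a hypothetical helper, the row-major layout (label = row * heatmap_size + col) is assumed rather than confirmed, and returning None stands in for a False entry in the valid tensor.

```python
def keypoint_to_label(x, y, box, heatmap_size):
    """Map one keypoint to a flat heatmap index, or None if outside the box.

    Assumes row-major indexing; hypothetical helper, not pipert API.
    """
    x0, y0, x1, y1 = box
    if not (x0 <= x <= x1 and y0 <= y <= y1):
        return None
    # Scale into heatmap bins; points on the far edge go in the last bin.
    col = min(int((x - x0) * heatmap_size / (x1 - x0)), heatmap_size - 1)
    row = min(int((y - y0) * heatmap_size / (y1 - y0)), heatmap_size - 1)
    return row * heatmap_size + col


box = (0, 0, 10, 10)
print(keypoint_to_label(5, 5, box, heatmap_size=4))
print(keypoint_to_label(11, 5, box, heatmap_size=4))
```

With heatmap_size = 4 the labels span 0 to 15, which matches the documented range [0, heatmap_size**2 - 1].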

__getitem__(self, item: Union[int, slice, torch.BoolTensor]) → pipert.utils.structures.keypoints.Keypoints ¶

Create a new Keypoints by indexing on this Keypoints.

The following usages are allowed:

new_kpts = kpts[3]: return a Keypoints which contains only one instance.
new_kpts = kpts[2:10]: return a slice of key points.
new_kpts = kpts[vector], where vector is a torch.BoolTensor with length = len(kpts). Nonzero elements in the vector will be selected.

Note that the returned Keypoints might share storage with this Keypoints, subject to PyTorch’s indexing semantics.

__repr__(self) → str ¶: Return repr(self).

pipert.utils.structures.heatmaps_to_keypoints(maps: torch.Tensor, rois: torch.Tensor) → torch.Tensor[source]¶

Extract predicted keypoint locations from heatmaps.

Args:: maps (Tensor): (#ROIs, #keypoints, POOL_H, POOL_W). rois (Tensor): (#ROIs, 4).
Returns:: Tensor of shape (#rois, #keypoints, 4), with the last dimension corresponding to (x, y, logit, prob) for each keypoint.

Discrete image coordinates in an NxN image are converted to continuous keypoint coordinates. To stay consistent with keypoints_to_heatmap, the conversion follows Heckbert 1990: c = d + 0.5, where d is a discrete coordinate and c is a continuous coordinate.
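The c = d + 0.5 rule places a continuous coordinate at the centre of its discrete cell. A small sketch of how a heatmap cell might be mapped back into image space under that rule is shown below; both function names are hypothetical, and the assumption that the heatmap spans the ROI uniformly is mine.

```python
def discrete_to_continuous(d):
    """Heckbert 1990 convention: the centre of cell d is at d + 0.5."""
    return d + 0.5


def heatmap_cell_to_image_xy(col, row, roi, heatmap_size):
    """Map a heatmap cell to image coordinates inside an (x0, y0, x1, y1) ROI.

    Hypothetical helper; assumes the heatmap covers the ROI uniformly.
    """
    x0, y0, x1, y1 = roi
    x = x0 + discrete_to_continuous(col) * (x1 - x0) / heatmap_size
    y = y0 + discrete_to_continuous(row) * (y1 - y0) / heatmap_size
    return x, y


print(heatmap_cell_to_image_xy(0, 3, (0, 0, 8, 8), heatmap_size=4))
```

Without the half-cell offset, every predicted keypoint would be biased toward the top-left corner of its cell.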

class pipert.utils.structures.PolygonMasks(polygons: List[List[Union[torch.Tensor, np.ndarray]]])[source]¶

This class stores the segmentation masks for all objects in one image, in the form of polygons.

Attributes:: polygons: list[list[ndarray]]. Each ndarray is a float64 vector representing a polygon.

to(self, *args: Any, **kwargs: Any) → pipert.utils.structures.masks.PolygonMasks ¶

get_bounding_boxes(self) → pipert.utils.structures.boxes.Boxes ¶: Returns: Boxes: tight bounding boxes around polygon masks.

nonempty(self) → torch.Tensor¶

Find masks that are non-empty.

Returns:: Tensor: a BoolTensor which represents whether each mask is empty (False) or not (True).

__getitem__(self, item: Union[int, slice, List[int], torch.BoolTensor]) → pipert.utils.structures.masks.PolygonMasks ¶

Support indexing over the instances and return a PolygonMasks object. item can be:

An integer. It will return an object with only one instance.
A slice. It will return an object with the selected instances.
A list[int]. It will return an object with the selected instances, corresponding to the indices in the list.
A vector mask of type BoolTensor, whose length is num_instances. It will return an object with the instances whose mask is nonzero.

__iter__(self) → Iterator[List[torch.Tensor]]¶: Yields: the list of polygons for one instance. Each element is a float64 vector representing a polygon.

__repr__(self) → str ¶: Return repr(self).

__len__(self) → int ¶

crop_and_resize(self, boxes: torch.Tensor, mask_size: int) → torch.Tensor¶

Crop each mask by the given box, and resize results to (mask_size, mask_size). This can be used to prepare training targets for Mask R-CNN.

Args:: boxes (Tensor): Nx4 tensor storing the boxes for each mask. mask_size (int): the size of the rasterized mask.
Returns:: Tensor: A bool tensor of shape (N, mask_size, mask_size), where N is the number of predicted boxes for this image.

pipert.utils.structures.rasterize_polygons_within_box(polygons: List[np.ndarray], box: numpy.ndarray, mask_size: int) → torch.Tensor[source]¶

Rasterize the polygons into a mask image and crop the mask content in the given box. The cropped mask is resized to (mask_size, mask_size).

This function is used when generating training targets for mask head in Mask R-CNN. Given original ground-truth masks for an image, new ground-truth mask training targets in the size of mask_size x mask_size must be provided for each predicted box. This function will be called to produce such targets.

Args:: polygons (list[ndarray[float]]): a list of polygons, which represents an instance. box: 4-element numpy array. mask_size (int): the size of the rasterized mask.
Returns:: Tensor: BoolTensor of shape (mask_size, mask_size)
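The crop-and-resize behaviour can be approximated in pure Python by sampling the centre of each output pixel and testing it against the polygons. This is a different, simpler technique (even-odd ray casting) than whatever rasterizer the library uses, and the flat [x0, y0, x1, y1, ...] polygon layout is an assumption; both function names are hypothetical.

```python
def point_in_polygon(px, py, poly):
    """Even-odd ray casting; poly is a flat [x0, y0, x1, y1, ...] list."""
    inside = False
    n = len(poly) // 2
    for i in range(n):
        j = (i + 1) % n
        xa, ya = poly[2 * i], poly[2 * i + 1]
        xb, yb = poly[2 * j], poly[2 * j + 1]
        # Count edges that cross the horizontal ray going right from (px, py).
        if (ya > py) != (yb > py):
            x_cross = xa + (py - ya) * (xb - xa) / (yb - ya)
            if px < x_cross:
                inside = not inside
    return inside


def rasterize_within_box(polygons, box, mask_size):
    """Return a mask_size x mask_size nested list of bools for one instance."""
    x0, y0, x1, y1 = box
    step_x = (x1 - x0) / mask_size
    step_y = (y1 - y0) / mask_size
    mask = []
    for r in range(mask_size):
        cy = y0 + (r + 0.5) * step_y  # centre of output row r in image space
        row = []
        for c in range(mask_size):
            cx = x0 + (c + 0.5) * step_x
            row.append(any(point_in_polygon(cx, cy, p) for p in polygons))
        mask.append(row)
    return mask


square = [0, 0, 4, 0, 4, 4, 0, 4]
for line in rasterize_within_box([square], (0, 0, 8, 8), mask_size=4):
    print(line)
```

Here a 4x4 square occupying the top-left quarter of an 8x8 box fills the top-left 2x2 block of the 4x4 output mask.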

pipert.utils.structures.__all__[source]¶
