7.6.7. Horizon Operators¶
- horizon_plugin_pytorch.nn.functional.filter(*inputs: Union[Tuple[torch.Tensor], Tuple[horizon_plugin_pytorch.qtensor.QTensor]], threshold: float, idx_range: Optional[Tuple[int, int]] = None) List[List[torch.Tensor]] ¶
Filter.
The output order differs from that of the BPU, because the compiler performs optimizations and slices the input according to complex rules that are hard to reproduce in the plugin.
All inputs are filtered along H and W by the max value within a channel-dim range of the first input. Each NCHW input is first split, transposed and flattened to List[Tensor[H * W, C]]. If the input is a QTensor, the output will be dequantized.
- Parameters
inputs – Data in NCHW format. Each input should have the same size in N, H, W. The output will be selected according to the first input.
threshold – Threshold, the lower bound of the output.
idx_range – The index range of channel values of the first input used in the comparison. Defaults to None, which means all values are used.
- Returns
A list whose length equals the batch size, where each element contains:
max_value: Flattened max value within idx_range in channel dim.
max_idx: Flattened max value index in channel dim.
coord: The original coordinates of the output data in the input data in the shape of [M, (h, w)].
(multi) data: Filtered data in the shape of [M, C].
- Return type
Union[List[List[Tensor]], List[List[QTensor]]]
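A minimal usage sketch (not from the original docs); the shapes, threshold and idx_range below are illustrative assumptions, not values required by the operator:

```python
import torch
from horizon_plugin_pytorch.nn.functional import filter as horizon_filter

scores = torch.rand(2, 4, 8, 8)   # first input; selection is based on its channel-wise max
bboxes = torch.rand(2, 16, 8, 8)  # additional input, filtered at the same spatial positions

# Keep positions whose max value over channels 0..3 of `scores` exceeds 0.5.
results = horizon_filter(scores, bboxes, threshold=0.5, idx_range=(0, 4))

for per_image in results:  # one entry per batch element
    max_value, max_idx, coord, score_data, bbox_data = per_image
    # coord: [M, (h, w)]; score_data: [M, 4]; bbox_data: [M, 16]
```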
- horizon_plugin_pytorch.nn.functional.point_pillars_preprocess(points_list: List[torch.Tensor], pc_range: torch.Tensor, voxel_size: torch.Tensor, max_voxels: int, max_points_per_voxel: int, use_max: bool, norm_range: torch.Tensor, norm_dims: torch.Tensor) Tuple[torch.Tensor, torch.Tensor] ¶
Preprocess PointPillars.
- Parameters
points_list – [(M1, ndim), (M2, ndim), …], list of point cloud data.
pc_range – (6,), the voxel range, in the format [x_min, y_min, z_min, x_max, y_max, z_max].
voxel_size – (3,), the voxel size in (x, y, z).
max_voxels – Maximum number of voxels.
max_points_per_voxel – Maximum number of points contained in a voxel.
use_max – Whether to enforce max_voxels; should be True for deployment.
norm_range – Feature range, like [x_min, y_min, z_min, …, x_max, y_max, z_max, …].
norm_dims – Dimensions to normalize.
- Returns
(features, coords), encoded feature and coordinates in (idx, z, y, x) format.
- Return type
(Tensor, Tensor)
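A minimal usage sketch; the point cloud range, voxel size and normalization settings below are illustrative assumptions (here each point carries x, y, z and reflectance, so norm_range is assumed to span those four features):

```python
import torch
from horizon_plugin_pytorch.nn.functional import point_pillars_preprocess

# Two point cloud frames, each point being (x, y, z, reflectance).
points_list = [
    torch.rand(1000, 4) * torch.tensor([70.0, 80.0, 4.0, 1.0])
    + torch.tensor([0.0, -40.0, -3.0, 0.0])
    for _ in range(2)
]
pc_range = torch.tensor([0.0, -40.0, -3.0, 70.0, 40.0, 1.0])  # [x_min, y_min, z_min, x_max, y_max, z_max]
voxel_size = torch.tensor([0.16, 0.16, 4.0])                  # (x, y, z)
norm_range = torch.tensor([0.0, -40.0, -3.0, 0.0, 70.0, 40.0, 1.0, 1.0])
norm_dims = torch.tensor([0, 1, 2, 3])                        # normalize x, y, z, reflectance

features, coords = point_pillars_preprocess(
    points_list,
    pc_range,
    voxel_size,
    max_voxels=12000,
    max_points_per_voxel=32,
    use_max=True,  # should be True for deployment
    norm_range=norm_range,
    norm_dims=norm_dims,
)
# coords is in (idx, z, y, x) format and can be passed to PointPillarsScatter.
```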
- class horizon_plugin_pytorch.nn.BgrToYuv444(channel_reversal: bool = False)¶
Convert image color format from BGR to YUV444.
- Parameters
channel_reversal – Color channel order, set to True when used on RGB input. Defaults to False.
- forward(input: torch.Tensor)¶
Forward pass of BgrToYuv444.
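A minimal usage sketch; the assumption that the input is a 0~255 float image in NCHW layout follows the other color-conversion operators in this section:

```python
import torch
from horizon_plugin_pytorch.nn import BgrToYuv444

convert = BgrToYuv444(channel_reversal=False)  # set channel_reversal=True for RGB input
bgr = torch.randint(0, 256, (1, 3, 224, 224)).float()
yuv = convert(bgr)  # YUV444 image with the same shape as the input
```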
- class horizon_plugin_pytorch.nn.Correlation(kernel_size: int = 1, max_displacement: int = 1, stride1: int = 1, stride2: int = 1, pad_size: int = 0, is_multiply: bool = True)¶
Perform multiplicative patch comparisons between two feature maps.
Correlation performs multiplicative patch comparisons between two feature maps. Given two multi-channel feature maps \(f_{1}, f_{2}\), with \(w\), \(h\), and \(c\) being their width, height, and number of channels, the correlation layer lets the network compare each patch from \(f_{1}\) with each patch from \(f_{2}\).
For now we consider only a single comparison of two patches. The ‘correlation’ of two patches centered at \(x_{1}\) in the first map and \(x_{2}\) in the second map is then defined as:
\[c(x_{1}, x_{2}) = \sum_{o \in [-k,k] \times [-k,k]} \langle f_{1}(x_{1} + o), f_{2}(x_{2} + o) \rangle\]
for a square patch of size \(K:=2k+1\).
Note that the equation above is identical to one step of a convolution in neural networks, but instead of convolving data with a filter, it convolves data with other data. For this reason, it has no training weights.
Computing \(c(x_{1}, x_{2})\) involves \(c \cdot K^{2}\) multiplications. Comparing all patch combinations involves \(w^{2} \cdot h^{2}\) such computations.
Given a maximum displacement \(d\), for each location \(x_{1}\) it computes correlations \(c(x_{1}, x_{2})\) only in a neighborhood of size \(D:=2d+1\), by limiting the range of \(x_{2}\). We use strides \(s_{1}, s_{2}\), to quantize \(x_{1}\) globally and to quantize \(x_{2}\) within the neighborhood centered around \(x_{1}\).
- The final output is defined by the following expression:
- \[out[n, q, i, j] = c(x_{i, j}, x_{q})\]
where \(i\) and \(j\) enumerate spatial locations in \(f_{1}\), and \(q\) denotes the \(q^{th}\) neighborhood of \(x_{i,j}\).
- Parameters
kernel_size – Kernel size for the correlation; must be an odd number.
max_displacement – Maximum displacement of the correlation.
stride1 – Stride used to quantize data1 globally.
stride2 – Stride used to quantize data2 within the neighborhood centered around data1.
pad_size – Padding size for the correlation.
is_multiply – Whether the operation type is multiplication or subtraction; only True (multiplication) is supported now.
- forward(data1: Union[torch.Tensor, horizon_plugin_pytorch.qtensor.QTensor], data2: Union[torch.Tensor, horizon_plugin_pytorch.qtensor.QTensor]) torch.Tensor ¶
Forward for Horizon Correlation.
- Parameters
data1 – shape of [N, C, H, W]
data2 – shape of [N, C, H, W]
- Returns
output
- Return type
Tensor
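A minimal usage sketch. The output channel count noted in the comment follows the usual FlowNet-style convention of D * D with D = 2 * max_displacement / stride2 + 1; treat it as an assumption rather than a documented guarantee:

```python
import torch
from horizon_plugin_pytorch.nn import Correlation

corr = Correlation(
    kernel_size=1,
    max_displacement=4,
    stride1=1,
    stride2=1,
    pad_size=4,
    is_multiply=True,  # only multiplication is supported
)
f1 = torch.rand(1, 64, 32, 32)
f2 = torch.rand(1, 64, 32, 32)
out = corr(f1, f2)  # expected shape (N, D * D, H, W) with D = 2 * 4 + 1 = 9
```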
- class horizon_plugin_pytorch.nn.DetectionPostProcessV1(num_classes: int, box_filter_threshold: float, class_offsets: List[int], use_clippings: bool, image_size: Tuple[int, int], nms_threshold: float, pre_nms_top_k: int, post_nms_top_k: int, nms_padding_mode: Optional[str] = None, nms_margin: float = 0.0, use_stable_sort: Optional[bool] = None, bbox_min_hw: Tuple[float, float] = (0, 0), exp_overwrite: Callable = torch.exp, input_shift: int = 4)¶
Post process for object detection models.
This operation is implemented on the BPU, so it is expected to be faster than a CPU implementation. It requires the input scale to be 1 / 2 ** input_shift; otherwise a rescale will be applied to the input data. You can therefore manually set the output scale of the previous op (a Conv2d, for example) to 1 / 2 ** input_shift to avoid the rescale and get the best performance and accuracy.
- Major differences from DetectionPostProcess:
1. Each anchor generates only one predicted bbox in total, whereas in DetectionPostProcess each anchor generates one bbox per class (num_classes bboxes in total).
2. NMS has a margin param: box2 will only be suppressed by box1 when box1.score - box2.score > margin (box1.score > box2.score in DetectionPostProcess).
3. An offset can be added to the output class indices (using class_offsets).
- Parameters
num_classes – Class number.
box_filter_threshold – Default threshold to filter boxes by max score.
class_offsets – Offset to be added to the output class index for each branch.
use_clippings – Whether to clip boxes to the image size. If the input is padded, you can clip boxes to the real content by providing the image size.
image_size – Fixed image size in (h, w); set to None if inputs have different sizes.
nms_threshold – IoU threshold for NMS.
nms_margin – Only suppress box2 when box1.score - box2.score > nms_margin.
pre_nms_top_k – Maximum number of bounding boxes in each image before NMS.
post_nms_top_k – Maximum number of output bounding boxes in each image.
nms_padding_mode – The way to pad bboxes so that the number of output bounding boxes matches post_nms_top_k; can be None, “pad_zero” or “rollover”.
bbox_min_hw – Minimum height and width of selected bounding boxes.
exp_overwrite – Overwrite the exp func in box decode.
input_shift – Customize the input shift of the quantized DPP.
- forward(data: List[torch.Tensor], anchors: List[torch.Tensor], image_sizes: Tuple[int, int] = None) torch.Tensor ¶
Forward pass of DetectionPostProcessV1.
- Parameters
data – (N, (4 + num_classes) * anchor_num, H, W)
anchors – (N, anchor_num * 4, H, W)
image_sizes – Defaults to None.
- Returns
List of (bbox (x1, y1, x2, y2), score, class_idx).
- Return type
List[Tuple[Tensor, Tensor, Tensor]]
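A minimal usage sketch for a single detection branch; the class count, anchor layout and thresholds are illustrative assumptions:

```python
import torch
from horizon_plugin_pytorch.nn import DetectionPostProcessV1

num_classes, anchor_num = 80, 9
dpp = DetectionPostProcessV1(
    num_classes=num_classes,
    box_filter_threshold=0.05,
    class_offsets=[0],
    use_clippings=True,
    image_size=(512, 512),
    nms_threshold=0.5,
    pre_nms_top_k=1000,
    post_nms_top_k=100,
)
data = [torch.rand(1, (4 + num_classes) * anchor_num, 16, 16)]
anchors = [torch.rand(1, anchor_num * 4, 16, 16)]
results = dpp(data, anchors)
# results[i] is a (bbox, score, class_idx) tuple for image i.
```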
- class horizon_plugin_pytorch.nn.MultiScaleDeformableAttention(embed_dims: int = 256, num_heads: int = 8, num_levels: int = 4, num_points: int = 4, im2col_step: int = 64, dropout: float = 0.1, batch_first: bool = False, value_proj_ratio: float = 1.0)¶
An attention module used in Deformable-Detr.
Deformable DETR: Deformable Transformers for End-to-End Object Detection.
- Parameters
embed_dims – The embedding dimension of Attention. Default: 256.
num_heads – Parallel attention heads. Default: 8.
num_levels – The number of feature maps used in Attention. Default: 4.
num_points – The number of sampling points for each query in each head. Default: 4.
im2col_step – The step used in image_to_column. Default: 64.
dropout – A Dropout layer on inp_identity. Default: 0.1.
batch_first – Whether Key, Query and Value have shape (batch, n, embed_dim) or (n, batch, embed_dim). Defaults to False.
value_proj_ratio – The expansion ratio of value_proj. Default: 1.0.
- forward(query: Union[torch.Tensor, horizon_plugin_pytorch.qtensor.QTensor], key: Optional[Union[torch.Tensor, horizon_plugin_pytorch.qtensor.QTensor]] = None, value: Optional[Union[torch.Tensor, horizon_plugin_pytorch.qtensor.QTensor]] = None, identity: Optional[Union[torch.Tensor, horizon_plugin_pytorch.qtensor.QTensor]] = None, query_pos: Optional[Union[torch.Tensor, horizon_plugin_pytorch.qtensor.QTensor]] = None, key_padding_mask: Optional[torch.Tensor] = None, reference_points: Optional[Union[torch.Tensor, horizon_plugin_pytorch.qtensor.QTensor]] = None, spatial_shapes: Optional[torch.Tensor] = None) torch.Tensor ¶
Forward function of MultiScaleDeformableAttention.
- Parameters
query – Query of Transformer with shape (num_query, bs, embed_dims).
key – The key tensor with shape (num_key, bs, embed_dims).
value – The value tensor with shape (num_key, bs, embed_dims).
identity – The tensor used for addition, with the same shape as query. Default: None. If None, query will be used.
query_pos – The positional encoding for query. Default: None.
key_padding_mask – ByteTensor for query, with shape [bs, num_key].
reference_points – The normalized reference points with shape (bs, num_query, num_levels, 2), all elements in the range [0, 1], top-left (0, 0), bottom-right (1, 1), including the padding area; or (bs, num_query, num_levels, 4), where the two additional dimensions are (w, h) to form reference boxes.
spatial_shapes – Spatial shapes of features at different levels. Int tensor with shape (num_levels, 2); the last dimension represents (h, w).
- Returns
A tensor with the same shape as query.
- Return type
Tensor
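A minimal usage sketch with a single feature level; the tensor shapes follow the forward() documentation above, everything else is an illustrative assumption:

```python
import torch
from horizon_plugin_pytorch.nn import MultiScaleDeformableAttention

attn = MultiScaleDeformableAttention(
    embed_dims=256, num_heads=8, num_levels=1, num_points=4, batch_first=False
)
bs, num_query = 2, 100
spatial_shapes = torch.tensor([[32, 32]])           # (num_levels, 2) as (h, w)
num_value = int(spatial_shapes.prod(dim=1).sum())   # 32 * 32 flattened positions
query = torch.rand(num_query, bs, 256)              # (num_query, bs, embed_dims)
value = torch.rand(num_value, bs, 256)
reference_points = torch.rand(bs, num_query, 1, 2)  # normalized to [0, 1]

out = attn(
    query,
    value=value,
    reference_points=reference_points,
    spatial_shapes=spatial_shapes,
)  # same shape as query
```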
- class horizon_plugin_pytorch.nn.PointPillarsScatter(output_shape=None)¶
- forward(voxel_features: torch.Tensor, coords: torch.Tensor, output_shape: Optional[Union[torch.Tensor, list, tuple]] = None) torch.Tensor ¶
Forward of Horizon PointPillarsScatter.
- Parameters
voxel_features – [M, …]; dimensions after M will be flattened.
coords – [M, (n, …, y, x)]; only the indices on N, H and W are used.
output_shape – Expected output shape. Defaults to None.
- Returns
The NCHW pseudo image.
- Return type
Tensor
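A minimal usage sketch; the feature size, grid size and the (N, C, H, W) interpretation of output_shape are illustrative assumptions:

```python
import torch
from horizon_plugin_pytorch.nn import PointPillarsScatter

scatter = PointPillarsScatter()
voxel_features = torch.rand(5000, 64)         # [M, C]
coords = torch.randint(0, 432, (5000, 4))     # [M, (n, z, y, x)]
coords[:, 0] = 0                              # a single sample in the batch
pseudo_image = scatter(
    voxel_features, coords, output_shape=(1, 64, 496, 432)  # assumed NCHW layout
)
```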
- class horizon_plugin_pytorch.nn.RcnnPostProcess(image_size: Tuple[int, int] = (1024, 1024), nms_threshold: float = 0.3, box_filter_threshold: float = 0.1, num_classes: int = 1, post_nms_top_k: int = 100, delta_mean: List[float] = (0.0, 0.0, 0.0, 0.0), delta_std: List[float] = (1.0, 1.0, 1.0, 1.0))¶
Post Process of RCNN output.
Given bounding boxes and the corresponding scores and deltas, decodes the bounding boxes and performs NMS. In detail, it consists of:
Argmax on multi-class scores
Filter out those below the given threshold
Non-linear transformation: convert box deltas to original image coordinates
Bin-sort the remaining boxes by score
Apply class-aware NMS and return the first nms_output_box_num boxes
- Parameters
image_size – an int tuple of (h, w), for fixed image size
nms_threshold – bounding boxes with IoU greater than nms_threshold will be suppressed
box_filter_threshold – bounding boxes with scores less than box_filter_threshold will be discarded
num_classes – total number of classes
post_nms_top_k – number of bounding boxes after NMS in each image
delta_mean – a float list of size 4
delta_std – a float list of size 4
- forward(boxes: List[torch.Tensor], scores: torch.Tensor, deltas: torch.Tensor, image_sizes: Optional[torch.Tensor] = None)¶
Forward of RcnnPostProcess.
- Parameters
boxes – list of boxes of shape [box_num, (x1, y1, x2, y2)]; can be Tensor(float) or QTensor(float, int)
scores – shape is [num_batch * num_box, num_classes + 1, 1, 1], dtype is float32
deltas – shape is [num_batch * num_box, (num_classes + 1) * 4, 1, 1], dtype is float32
image_sizes – shape is [num_batch, 2], dtype is int32, used for dynamic image sizes; can be None. Defaults to None
- Returns
Output data in the format [x1, y1, x2, y2, score, class_index], dtype is float32. If the number of output boxes is less than post_nms_top_k, the outputs are padded with -1.0.
- Return type
Tensor[num_batch, post_nms_top_k, 6]
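A minimal usage sketch for a single image; the box count and class count are illustrative assumptions:

```python
import torch
from horizon_plugin_pytorch.nn import RcnnPostProcess

num_classes, num_box = 80, 300
post_process = RcnnPostProcess(
    image_size=(1024, 1024),
    nms_threshold=0.3,
    box_filter_threshold=0.1,
    num_classes=num_classes,
    post_nms_top_k=100,
)
boxes = [torch.rand(num_box, 4) * 1024]                    # [box_num, (x1, y1, x2, y2)]
scores = torch.rand(num_box, num_classes + 1, 1, 1)
deltas = torch.rand(num_box, (num_classes + 1) * 4, 1, 1)
out = post_process(boxes, scores, deltas)                  # [num_batch, post_nms_top_k, 6]
```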
- class horizon_plugin_pytorch.nn.SoftmaxBernoulli2(dim: int = None, max_value_only: Optional[bool] = False)¶
SoftmaxBernoulli2 is designed to run on Bernoulli2.
This operator is considered hacky and should not be used by most users.
Roughly, the calculation logic of this operator is as follows:
y = exp(x - x.max(dim)) / sum(exp(x - x.max(dim)), dim)
The output of this operator is float type and cannot be fed into other quantized operators.
In the FLOAT phase, users can set qconfig to this operator as usual. However, there are some peculiarities in QAT and QUANTIZED inference phases. Please read the following carefully.
In the QAT phase, the operator only applies fake quantization to exp(x), then computes the division in the float domain and returns the un-fake-quantized (float) result directly. This operator ignores the qconfig set by users or propagated from the parent module. However, to integrate it into the workflow of converting QAT models to QUANTIZED models, a reasonable qconfig is still needed.
In the QUANTIZED inference phase, the operator retrieves the result of exp(x) from a lookup table and computes the division in the float domain.
When max_value_only is set to True, the maximum value of softmax along dim will be returned, which is equal to max(softmax(x, dim), dim). We combine softmax and max in this op because the hbdk compiler requires it to optimize performance without the effort of graph analysis. This argument is only intended for this specific purpose and should not be used in other cases.
- Parameters
dim – The dimension along which softmax will be computed. Only dim=1 is supported.
max_value_only – If True, return the max value along dim; if False, equivalent to a normal softmax. Refer to the above for more information.
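A minimal usage sketch; note that the output is a float tensor and must not be fed into other quantized operators:

```python
import torch
from horizon_plugin_pytorch.nn import SoftmaxBernoulli2

softmax = SoftmaxBernoulli2(dim=1, max_value_only=False)  # only dim=1 is supported
x = torch.rand(1, 10, 8, 8)
y = softmax(x)          # float softmax over the channel dim

max_softmax = SoftmaxBernoulli2(dim=1, max_value_only=True)
y_max = max_softmax(x)  # equal to max(softmax(x, dim=1), dim=1)
```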
- horizon_plugin_pytorch.bgr2centered_gray(input: torch.Tensor) torch.Tensor ¶
Convert color space.
Convert images from BGR format to centered gray
- Parameters
input – input image in BGR format of shape [N, 3, H, W], ranging 0~255
- Returns
centered gray image of shape [N, 1, H, W], ranging -128~127
- Return type
Tensor
- horizon_plugin_pytorch.bgr2centered_yuv(input: torch.Tensor, swing: str = 'studio') torch.Tensor ¶
Convert color space.
Convert images from BGR format to centered YUV444 BT.601
- Parameters
input – input image in BGR format, ranging 0~255
swing – “studio” for YUV studio swing (Y: -112~107, U, V: -112~112). “full” for YUV full swing (Y, U, V: -128~127). Default is “studio”.
- Returns
centered YUV image
- Return type
Tensor
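A minimal usage sketch of the float color-space helpers in this section; the input is assumed to be a float image with values in 0~255:

```python
import torch
import horizon_plugin_pytorch as hpp

bgr = torch.randint(0, 256, (1, 3, 224, 224)).float()

gray = hpp.bgr2centered_gray(bgr)              # [N, 1, H, W], values in -128~127
yuv = hpp.bgr2centered_yuv(bgr, swing="full")  # centered YUV444 BT.601, full swing
rgb = hpp.bgr2rgb(bgr)                         # channel reordering only
```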
- horizon_plugin_pytorch.bgr2gray(input: torch.Tensor) torch.Tensor ¶
Convert color space.
Convert images from BGR format to gray
- Parameters
input – input image in BGR format of shape [N, 3, H, W], ranging 0~255
- Returns
gray image of shape [N, 1, H, W], ranging 0~255
- Return type
Tensor
- horizon_plugin_pytorch.bgr2rgb(input: torch.Tensor) torch.Tensor ¶
Convert color space.
Convert images from BGR format to RGB
- Parameters
input – image in BGR format with shape [N, 3, H, W]
- Returns
image in RGB format with shape [N, 3, H, W]
- Return type
Tensor
- horizon_plugin_pytorch.bgr2yuv(input: torch.Tensor, swing: str = 'studio') torch.Tensor ¶
Convert color space.
Convert images from BGR format to YUV444 BT.601
- Parameters
input – input image in BGR format, ranging 0~255
swing – “studio” for YUV studio swing (Y: 16~235, U, V: 16~240). “full” for YUV full swing (Y, U, V: 0~255). Default is “studio”.
- Returns
YUV image
- Return type
Tensor
- horizon_plugin_pytorch.centered_yuv2bgr(input: horizon_plugin_pytorch.qtensor.QTensor, swing: str = 'studio', mean: Union[List[float], torch.Tensor] = (128.0,), std: Union[List[float], torch.Tensor] = (128.0,), q_scale: Union[float, torch.Tensor] = 0.0078125) horizon_plugin_pytorch.qtensor.QTensor ¶
Convert color space.
Convert images from centered YUV444 BT.601 format to transformed and quantized BGR. Only use this operator in the quantized model: insert it after the QuantStub, pass the scale of the QuantStub to the q_scale argument, and set the scale of the QuantStub to 1 afterwards.
- Parameters
input – Input images in centered YUV444 BT.601 format, centered by the pyramid with -128.
swing – “studio” for YUV studio swing (Y: -112~107, U, V: -112~112). “full” for YUV full swing (Y, U, V: -128~127). Default is “studio”.
mean – BGR mean, a list of float or a torch.Tensor; can be a scalar [float], or [float, float, float] for a per-channel mean.
std – BGR standard deviation, a list of float or a torch.Tensor; can be a scalar [float], or [float, float, float] for a per-channel std.
q_scale – BGR quantization scale.
- Returns
Transformed and quantized image in BGR color, dtype is qint8.
- Return type
QTensor
- horizon_plugin_pytorch.centered_yuv2rgb(input: horizon_plugin_pytorch.qtensor.QTensor, swing: str = 'studio', mean: Union[List[float], torch.Tensor] = (128.0,), std: Union[List[float], torch.Tensor] = (128.0,), q_scale: Union[float, torch.Tensor] = 0.0078125) horizon_plugin_pytorch.qtensor.QTensor ¶
Convert color space.
Convert images from centered YUV444 BT.601 format to transformed and quantized RGB. Only use this operator in the quantized model: insert it after the QuantStub, pass the scale of the QuantStub to the q_scale argument, and set the scale of the QuantStub to 1 afterwards.
- Parameters
input – Input images in centered YUV444 BT.601 format, centered by the pyramid with -128.
swing – “studio” for YUV studio swing (Y: -112~107, U, V: -112~112). “full” for YUV full swing (Y, U, V: -128~127). Default is “studio”.
mean – RGB mean, a list of float or a torch.Tensor; can be a scalar [float], or [float, float, float] for a per-channel mean.
std – RGB standard deviation, a list of float or a torch.Tensor; can be a scalar [float], or [float, float, float] for a per-channel std.
q_scale – RGB quantization scale.
- Returns
Transformed and quantized image in RGB color, dtype is qint8.
- Return type
QTensor
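A minimal sketch of the documented usage: the operator is inserted right after the QuantStub in the quantized model, the original QuantStub scale is passed through q_scale, and the QuantStub scale itself is set to 1. The QuantStub import path, the mean/std values and the placeholder backbone are assumptions for illustration:

```python
import torch
import torch.nn as nn
from horizon_plugin_pytorch import centered_yuv2rgb
from horizon_plugin_pytorch.quantization import QuantStub  # assumed import path


class RgbModelWithYuvInput(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = QuantStub(scale=1.0)  # set to 1; q_scale below carries the real scale
        self.backbone = nn.Identity()      # placeholder for the real RGB-trained network

    def forward(self, centered_yuv_image):
        x = self.quant(centered_yuv_image)
        # Intended for the quantized model, where x is a QTensor after QuantStub.
        x = centered_yuv2rgb(
            x,
            swing="full",
            mean=[128.0],
            std=[128.0],
            q_scale=0.0078125,  # 1 / 128, the scale originally configured on the QuantStub
        )
        return self.backbone(x)
```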
- horizon_plugin_pytorch.rgb2bgr(input: torch.Tensor) torch.Tensor ¶
Convert color space.
Convert images from RGB format to BGR
- Parameters
input – image in RGB format with shape [N, 3, H, W]
- Returns
image in BGR format with shape [N, 3, H, W]
- Return type
Tensor
- horizon_plugin_pytorch.rgb2centered_gray(input: torch.Tensor) torch.Tensor ¶
Convert color space.
Convert images from RGB format to centered gray
- Parameters
input – input image in RGB format of shape [N, 3, H, W], ranging 0~255
- Returns
centered gray image of shape [N, 1, H, W], ranging -128~127
- Return type
Tensor
- horizon_plugin_pytorch.rgb2centered_yuv(input: torch.Tensor, swing: str = 'studio') torch.Tensor ¶
Convert color space.
Convert images from RGB format to centered YUV444 BT.601
- Parameters
input – input image in RGB format, ranging 0~255
swing – “studio” for YUV studio swing (Y: -112~107, U, V: -112~112). “full” for YUV full swing (Y, U, V: -128~127). Default is “studio”.
- Returns
centered YUV image
- Return type
Tensor
- horizon_plugin_pytorch.rgb2gray(input: torch.Tensor) torch.Tensor ¶
Convert color space.
Convert images from RGB format to gray
- Parameters
input – input image in RGB format of shape [N, 3, H, W], ranging 0~255
- Returns
gray image of shape [N, 1, H, W], ranging 0~255
- Return type
Tensor
- horizon_plugin_pytorch.rgb2yuv(input: torch.Tensor, swing: str = 'studio') torch.Tensor ¶
Convert color space.
Convert images from RGB format to YUV444 BT.601
- Parameters
input – input image in RGB format, ranging 0~255
swing – “studio” for YUV studio swing (Y: 16~235, U, V: 16~240). “full” for YUV full swing (Y, U, V: 0~255). Default is “studio”.
- Returns
YUV image
- Return type
Tensor