10.6.4. models¶
Models widely used by upper-level modules in HAT.
10.6.4.1. models¶
Position encoding with sine and cosine functions. |
|
Position embedding with learnable embedding weights. |
10.6.4.1.1. backbones¶
A module of EfficientNet. |
|
A module of efficientnet. |
|
A module of efficientnet_lite. |
|
A module of adjusted swin transformer, running faster on bpu. |
|
Module of MixVarGENet. |
|
A module of mobilenetv1. |
|
A module of mobilenetv2. |
|
A module of resnet18. |
|
A module of resnet50. |
|
A module of resnetv2. |
|
A module of resnet50V2. |
|
A module of vargconvnet. |
|
A module of VarGDarkNet53. |
|
A module of vargnetv2. |
|
A module of TinyVargNetv2. |
|
CocktailVargNetV2. |
10.6.4.1.1.1. contrib¶
ResNet-18 model from "Deep Residual Learning for Image Recognition". |
|
ResNet-50 model from "Deep Residual Learning for Image Recognition". |
10.6.4.1.2. base_modules¶
A sequential container which extends nn.Sequential to support dict or nn.Module arguments. |
|
Extend features. |
|
Encode bounding box in XYWH ways (proposed in RCNN). |
|
Do dequant to data. |
|
Encode gt and matching results to separate bbox and class labels. |
|
Encode bounding box in XYWH ways (proposed in RCNN). |
|
One hot class encoder. |
|
RCNN keypoints detection label encoder. |
|
RCNN bin detection label encoder. |
|
RCNN vehicle ground line label encoder. |
|
RCNN 3d label encoder. |
|
Encode gt and matching results to track labels. |
|
Class wise track id encoder. |
|
Encode gt and matching results to separate bbox and class labels. |
|
Generate person position label from matched boxes. |
|
Bounding box classification label matcher by max iou. |
|
Ignore region matcher by max overlap (intersection over area of ignore region). |
|
Position embedding with learnable embedding weights. |
|
Do quant to data. |
|
Resize multi stride preds to specific size. |
|
Crop and Resize feature from feature_map. |
|
An attention module used in BEVFormer. |
|
An attention module used in Detr3d. |
|
An attention module used in BEVFormer based on Deformable-Detr. |
|
An attention module used in Detr3d. |
|
Base class for TransformerEncoder and TransformerDecoder in vision transformer. |
10.6.4.1.2.1. postprocess¶
Post process for anchor-based object detection models. |
|
Apply argmax of data in pred_dict. |
|
Apply argmax of data in pred_dict. |
|
Apply max of data in pred_dict. |
|
Apply run length encoding of data in pred_dict. |
10.6.4.1.2.2. target¶
BBox Target Generator for detection task. |
|
Proposal Target Generator for two-stage task. |
|
Generate heatmap target for 3D detection. |
|
Reshape target data in label_dict to specific shape. |
10.6.4.1.3. losses¶
Calculate cross entropy loss of multi stride output. |
|
The losses of cross-entropy with label smooth. |
|
The losses of cross-entropy with soft target. |
|
Cross-entropy loss with image-specific class weighted map within batch. |
|
CE loss with online hard negative mining and auto average factor. |
|
Calculate cross entropy with task weight. |
|
Sigmoid focal loss. |
|
Focal Loss. |
|
Gaussian focal loss. |
|
Modified focal loss, exactly the same as in CornerNet. |
|
Generalized Intersection over Union Loss. |
|
Elementwise L1 Hinge Loss. |
|
Elementwise L2 Hinge Loss. |
|
Weighted Squared ElementWiseHingeLoss. |
|
Smooth L1 Loss. |
|
LnNorm loss. |
|
MSE (mean squared error) loss with clip value. |
|
Segmentation loss wrapper. |
|
Calculate multi-losses with same prediction and target. |
|
Calculate multi-losses with multi-preds and correspondence targets. |
|
Multiple Stride Losses. |
|
Smooth L1 Loss. |
|
The loss module of YOLOv3. |
10.6.4.1.4. structures¶
Graph model used to construct multitask model structure. |
|
The basic structure of classifier. |
|
The basic structure of ClassifierHbirInfer. |
|
The basic structure of encoder decoder. |
|
The basic structure of EncoderDecoderHbirInfer. |
|
The basic structure of motion forecasting. |
|
The basic structure of MotionForecastingHbirInfer. |
|
The basic structure of segmentor. |
|
The basic structure of segmentor. |
|
The segmentor structure that inputs image metas into postprocess. |
|
The basic structure of SegmentorHbirInfer. |
|
The basic structure of bev. |
|
The basic structure of ViewFusionHbirInfer. |
|
The basic structure of ViewFusion4DHbirInfer. |
10.6.4.1.4.1. detectors¶
The basic structure of CenterPoint. |
|
The basic structure of CenterPointHbirInfer. |
|
The basic structure of detr. |
|
The basic structure of DetrHbirInfer. |
|
The basic structure of detr3d. |
|
The basic structure of Detr3dHbirInfer. |
|
The basic structure of fcos. |
|
The basic structure of FCOSHbirInfer. |
|
The basic structure of fcos3d. |
|
The basic structure of FCOS3DHbirInfer. |
|
The basic structure of PointPillars. |
|
The basic structure of PointPillarsDetectorHbirInfer. |
|
The basic structure of retinanet. |
|
The basic structure of RetinaNetHbirInfer. |
|
The basic structure of yolov3. |
|
The basic structure of YOLOHbirInfer. |
10.6.4.1.4.2. disparity_pred¶
The basic structure of StereoNet. |
|
The basic structure of StereoNetPlus. |
|
The basic structure of StereoNetHbirInfer. |
10.6.4.1.4.3. keypoints¶
HeatmapKeypointModel is a model for keypoint detection using heatmaps. |
|
The basic structure of HeatmapKeypointHbirInfer. |
10.6.4.1.4.4. lane_pred¶
The basic structure of GaNet. |
|
The basic structure of GaNetHbirInfer. |
10.6.4.1.4.5. lidar_multitask¶
The basic structure of LidarMultiTask. |
|
The basic structure of LidarMultiTaskHbirInfer. |
10.6.4.1.4.6. opticalflow¶
The basic structure of PWCNet. |
|
The basic structure of PwcNetHbirInfer. |
10.6.4.1.4.7. track_pred¶
The basic structure of Motr. |
|
The basic structure of MotrHbirInfer. |
10.6.4.1.5. model_convert¶
Define the process of converting a float model to a QAT model. |
|
Define the process of fusing bn in a QAT model. |
|
Define the process of converting a float model to a calibration model. |
|
Load the checkpoint from file to model and return the checkpoint. |
|
Load the Mean-teacher model checkpoint. |
|
Convert torch module to compile wrap module. |
|
Compile model(nn.Module) by torch.compile() in torch>=2.0. |
|
Convert Reparameterized model to deploy mode. |
|
Split graph model in deploy mode. |
|
Mapping input key in graph model for deploy mode. |
|
Fix qscale of weight during the calibration or QAT stage. |
|
Load hbir module from file. |
|
Convert pipeline for QAT Fuse BN case. |
|
Convert pipeline for QAT Fuse BN case. |
10.6.4.1.6. necks¶
Weighted Bi-directional Feature Pyramid Network(BiFPN). |
|
Unet segmentation neck structure. |
|
Upper neck module for segmentation. |
|
Path Aggregation Network for Instance Segmentation. |
|
Path Aggregation Network with BasicVargNetBlock or BasicMixVargNetBlock. |
|
FPN for RetinaNet. |
|
Second FPN modules. |
|
Unet neck module. |
|
Necks module of yolov3. |
|
Necks module of yolov3. |
10.6.4.1.6.1. pointpillars¶
Basic module of PointPillarsHead. |
|
PointPillars Loss Module. |
|
PointPillars PostProcess Module. |
|
Batch voxelization. |
|
Point Pillars preprocess, include voxelization and extend features. |
10.6.4.1.6.2. carfusion_keypoints¶
Decode heatmap prediction to landmark coordinates. |
|
Decoder head consisting of multiple deconv layers. |
10.6.4.1.6.3. centerpoint¶
Bbox coder for CenterPoint. |
|
The CenterPoint Decoder. |
|
CenterPointHead module. |
|
Generate centerpoint targets for bev task. |
|
Generate CenterPoint targets. |
|
CenterPoint loss module. |
|
CenterPoint PostProcess Module. |
|
Centerpoint preprocess, include voxelization and features encoder. |
10.6.4.1.6.4. deeplab¶
Head Module for FCN. |
10.6.4.1.6.5. detr¶
Compute an assignment between targets and predictions. |
|
This class computes the loss for DETR. |
|
Implements the DETR transformer head. |
|
Convert model's output into the format expected by evaluation. |
|
Implements the DETR transformer. |
10.6.4.1.6.6. detr3d¶
Detr3d decoder module. |
|
Detr3d Transformer module. |
|
Detr3d Head module. |
|
The Detr3d PostProcess. |
|
Generate detr3d targets. |
10.6.4.1.6.7. fcn¶
FCN Decoder. |
|
Head Module for FCN. |
|
Generate Target for FCN. |
10.6.4.1.6.8. fcos¶
Generate cls and reg targets for FCOS in training stage. |
|
Generate cls and box training targets for FCOS based on simOTA label assignment strategy used in YOLO-X. |
|
Generate fcos-style cls and reg targets for RPNHead and HingeLoss. |
|
Generate cls and box training targets for FCOS based on simOTA label assignment strategy used in YOLO-X. |
|
|
|
|
|
Decoder for FCOS+RCNN Architecture. |
|
|
|
The basic structure of FCOSDecoderForFilter. |
|
The basic structure of FCOSDecoderForFilterHbir. |
|
FCOS loss wrapper. |
|
VehicleSide Task FCOS Loss wrapper. |
|
A modified Filter used for post-processing of FCOS. |
|
A modified Filter used for post-processing of FCOS with cone invasion. |
|
Filter used for post-processing of FCOS. |
|
Anchor-free head used in FCOS <https://arxiv.org/abs/1904.01355>. |
|
Anchor-free head used in FCOS <https://arxiv.org/abs/1904.01355>. |
|
Anchor-free head used in FCOS <https://arxiv.org/abs/1904.01355>. |
10.6.4.1.6.9. fcos3d¶
Bounding box coder for FCOS3D. |
|
Loss for FCOS3D. |
|
Post-process for FCOS3D. |
|
Generate cls/reg targets for FCOS3D in training stage. |
10.6.4.1.6.10. ganet¶
Decoder for ganet, convert the output of the model to a prediction result in original image. |
|
A basic head module of ganet. |
|
The loss module of YOLOv3. |
|
Neck for ganet. |
|
Target for ganet, generate info using training from label. |
10.6.4.1.6.11. lidar¶
Lidar 3D Anchor Generator by stride. |
|
Box3d Coder for Lidar. |
|
TargetAssigner for Lidar. |
10.6.4.1.6.12. lidar_multitask¶
Segmentation decoder structure of lidar. |
|
Detection decoder structure of lidar. |
10.6.4.1.6.13. motion_forecasting¶
Implements the Densetnt head. |
|
Generate Densetnt loss. |
|
Postprocess for densetnt. |
|
Generate densetnt targets. |
|
Implements the vectornet encoder. |
10.6.4.1.6.14. motr¶
This class computes the loss for Motr. |
|
Implements the MOTR head. |
|
Implements the motr deformable transformer. |
|
10.6.4.1.6.15. petr¶
PETR decoder module. |
|
Petr Transformer module. |
|
Petr Head module. |
10.6.4.1.6.16. pwcnet¶
A basic head of PWCNet. |
|
An extra features module of PWCNet. |
10.6.4.1.6.17. retinanet¶
An anchor-based head used in RetinaNet. |
|
The postprocess of RetinaNet. |
10.6.4.1.6.18. seg¶
Semantic Segmentation Decoder. |
|
Semantic Segmentation Decoder. |
|
Head Module for segmentation task. |
|
Generate training targets for Seg task. |
|
Coordinate Conv. For more details, refer to https://arxiv.org/pdf/1807.03247.pdf. |
|
FRCNNSegHead module for segmentation task. |
10.6.4.1.6.19. stereonet¶
A basic head of StereoNet. |
|
An advanced head for StereoNet. |
|
An extra features module of stereonet. |
|
A basic post process for StereoNet. |
|
An advanced post process for StereoNet. |
10.6.4.1.6.20. view_fusion¶
The IPM view transform for converting image view to bev view. |
|
The Lift-Splat-Shoot view transform for converting image view to bev view. |
|
The GKT view transform for converting image view to bev view. |
|
Cross-View Fusion Transformer model for computer vision tasks. |
|
Auxiliary head module for the CFTTransformer. |
|
The segmentation decoder structure of bev. |
|
The detection decoder structure of bev. |
|
The basic structure of BevDetDecoderInfer. |
|
The basic encoder structure of bev. |
|
The bev Backbone using varg block. |
|
Simple Add Temporal fusion for bev feats. |
10.6.4.1.6.21. yolo¶
Anchors generator for yolov3. |
|
Filter used for post-processing of YOLOv3. |
|
Heads module of yolov3. |
|
Encode gt and matching results for yolov3. |
|
Bounding box classification label matcher by max iou. |
|
The postprocess of YOLOv3. |
|
The postprocess of YOLOv3 Hbir. |
10.6.4.2. API Reference¶
- class hat.models.embeddings.PositionEmbeddingLearned(num_pos_feats=256, row_num_embed=50, col_num_embed=50)¶
Position embedding with learnable embedding weights.
- Parameters
num_pos_feats – The feature dimension for each position along x-axis or y-axis. The final returned dimension for each position is twice this value.
row_num_embed – The dictionary size of row embeddings. Default 50.
col_num_embed – The dictionary size of col embeddings. Default 50.
- forward(mask)¶
Forward function for LearnedPositionalEncoding.
- Parameters
mask (Tensor) – ByteTensor mask. Non-zero values represent ignored positions, while zero values represent valid positions for this image. Shape [bs, h, w].
- Returns
Position embedding with shape [bs, num_pos_feats*2, h, w].
- Return type
pos (Tensor)
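The shape contract of the learned embedding can be illustrated without PyTorch. The following is a minimal pure-Python sketch, assuming the DETR-style layout in which the first num_pos_feats channels come from the column (x) table and the last num_pos_feats channels come from the row (y) table. The function name `learned_position_embedding` and the random initialisation are hypothetical and are not part of HAT.

```python
import random


def learned_position_embedding(h, w, num_pos_feats=4, row_num_embed=50,
                               col_num_embed=50, seed=0):
    """Return a nested list of shape [2 * num_pos_feats][h][w].

    Hypothetical stand-in for two learnable lookup tables; a real module
    would hold them as trainable parameters.
    """
    if h > row_num_embed or w > col_num_embed:
        raise ValueError("feature map is larger than the embedding tables")
    rng = random.Random(seed)
    # One vector of length num_pos_feats per row index and per column index.
    row_table = [[rng.uniform(0.0, 1.0) for _ in range(num_pos_feats)]
                 for _ in range(row_num_embed)]
    col_table = [[rng.uniform(0.0, 1.0) for _ in range(num_pos_feats)]
                 for _ in range(col_num_embed)]
    pos = []
    # First half of the channels: depends only on the column index x.
    for c in range(num_pos_feats):
        pos.append([[col_table[x][c] for x in range(w)] for _ in range(h)])
    # Second half of the channels: depends only on the row index y.
    for c in range(num_pos_feats):
        pos.append([[row_table[y][c] for _ in range(w)] for y in range(h)])
    return pos


pos = learned_position_embedding(h=3, w=5, num_pos_feats=4)
print(len(pos), len(pos[0]), len(pos[0][0]))
```

The dictionary sizes bound the feature map that can be embedded, which is why the sketch rejects an h or w larger than the corresponding table.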
- class hat.models.embeddings.PositionEmbeddingSine(num_pos_feats: int = 64, temperature: int = 10000, normalize: bool = False, scale: Optional[float] = None, offset: float = 0.0)¶
Position encoding with sine and cosine functions.
See End-to-End Object Detection with Transformers for details.
- Parameters
num_pos_feats – The feature dimension for each position along x-axis or y-axis. Note the final returned dimension for each position is twice this value.
temperature – The temperature used for scaling the position embedding. Default 10000.
normalize – Whether to normalize the position embedding. Default False.
scale – A scale factor that scales the position embedding. The scale will be used only when normalize is True. Default 2*pi.
- forward(mask)¶
Forward function for SinePositionalEncoding.
- Parameters
mask (Tensor) – ByteTensor mask. Non-zero values represent ignored positions, while zero values represent valid positions for this image. Shape [bs, h, w].
- Returns
Position embedding with shape [bs, num_pos_feats*2, h, w].
- Return type
pos (Tensor)
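The role of num_pos_feats, temperature, normalize and scale can be sketched in pure Python. This is a minimal sketch that follows the formulation of the DETR reference code (interleaved sine and cosine, y part before x part, 1-based positions); whether HAT matches it detail for detail is an assumption, the offset parameter is ignored, and the helper names are hypothetical.

```python
import math


def sine_encoding(position, num_pos_feats=64, temperature=10000):
    """Encode one scalar position into num_pos_feats interleaved sin/cos values."""
    feats = []
    for i in range(num_pos_feats):
        # Channels 2k and 2k+1 share the same frequency.
        dim_t = temperature ** (2 * (i // 2) / num_pos_feats)
        angle = position / dim_t
        feats.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return feats


def normalized_position(index, length, scale=2 * math.pi, eps=1e-6):
    """Map a 1-based index in [1, length] into (0, scale], as with normalize=True."""
    return index / (length + eps) * scale


def encode_pixel(x, y, h, w, num_pos_feats=64, normalize=False):
    """Return the 2 * num_pos_feats vector for pixel (x, y), 1-based indices."""
    if normalize:
        x = normalized_position(x, w)
        y = normalized_position(y, h)
    # y part first, then x part (assumed ordering, as in the DETR reference code).
    return sine_encoding(y, num_pos_feats) + sine_encoding(x, num_pos_feats)


vec = encode_pixel(x=1, y=1, h=4, w=4, num_pos_feats=8)
print(len(vec))
```

Each pixel gets num_pos_feats values for y and num_pos_feats values for x, which is where the factor of two in the returned channel dimension comes from.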
- class hat.models.backbones.efficientnet.EfficientNet(model_type: str, coefficient_params: tuple, num_classes: int, bn_kwargs: Optional[dict] = None, bias: bool = False, drop_connect_rate: Optional[float] = None, depth_division: int = 8, activation: str = 'relu', use_se_block: bool = False, blocks_args: Sequence[Dict] = (BlockArgs(kernel_size=3, num_repeat=1, in_filters=32, out_filters=16, expand_ratio=1, id_skip=True, strides=1, se_ratio=0.25), BlockArgs(kernel_size=3, num_repeat=2, in_filters=16, out_filters=24, expand_ratio=6, id_skip=True, strides=2, se_ratio=0.25), BlockArgs(kernel_size=5, num_repeat=2, in_filters=24, out_filters=40, expand_ratio=6, id_skip=True, strides=2, se_ratio=0.25), BlockArgs(kernel_size=3, num_repeat=3, in_filters=40, out_filters=80, expand_ratio=6, id_skip=True, strides=2, se_ratio=0.25), BlockArgs(kernel_size=5, num_repeat=3, in_filters=80, out_filters=112, expand_ratio=6, id_skip=True, strides=1, se_ratio=0.25), BlockArgs(kernel_size=5, num_repeat=4, in_filters=112, out_filters=192, expand_ratio=6, id_skip=True, strides=2, se_ratio=0.25), BlockArgs(kernel_size=3, num_repeat=1, in_filters=192, out_filters=320, expand_ratio=6, id_skip=True, strides=1, se_ratio=0.25)), include_top: bool = True, flat_output: bool = True, input_channels: int = 3, resolution: int = 0, split_expand_conv: bool = False, quant_input: bool = True)¶
A module of EfficientNet.
- Parameters
model_type (str) – Which EfficientNet variant to use (B0-B7 or lite0-4). For EfficientNet models, model_type must be one of: [‘b0’, ‘b1’, ‘b2’, ‘b3’, ‘b4’, ‘b5’, ‘b6’, ‘b7’]; for EfficientNet-lite models, model_type must be one of: [‘lite0’, ‘lite1’, ‘lite2’, ‘lite3’, ‘lite4’].
coefficient_params (tuple) – Parameter coefficients of EfficientNet, including: width_coefficient (float): scaling coefficient for net width; depth_coefficient (float): scaling coefficient for net depth; default_resolution (int): default input image size; dropout_rate (float): dropout rate for final classifier layer.
num_classes (int) – Num classes of output layer.
bn_kwargs (dict) – Dict for BN layer.
bias (bool) – Whether to use bias in module.
drop_connect_rate (float) – Dropout rate at skip connections.
depth_division (int) – Depth division. Defaults to 8.
activation (str) – Activation layer. Defaults to ‘relu’.
use_se_block (bool) – Whether to use SEBlock in module.
blocks_args (list) – A list of BlockArgs to MBConvBlock modules.
include_top (bool) – Whether to include output layer.
flat_output (bool) – Whether to view the output tensor.
input_channels (int) – Input channels of first conv.
split_expand_conv (bool) – Whether to split the expand conv into two convs. Set to True when the expand conv is too large to deploy on xj3.
quant_input (bool) – Whether to quantize the input.
- forward(inputs)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
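How width_coefficient, depth_coefficient and depth_division interact with blocks_args can be sketched with the rounding rules of the original EfficientNet reference implementation. Whether HAT uses exactly this rounding is an assumption, and the helper names below are hypothetical; the (out_filters, num_repeat) pairs are taken from the default blocks_args in the signature above.

```python
import math


def round_filters(filters, width_coefficient, depth_division=8):
    """Scale a channel count and round it to a multiple of depth_division."""
    scaled = filters * width_coefficient
    rounded = max(
        depth_division,
        int(scaled + depth_division / 2) // depth_division * depth_division,
    )
    # Never round down by more than 10 percent.
    if rounded < 0.9 * scaled:
        rounded += depth_division
    return int(rounded)


def round_repeats(num_repeat, depth_coefficient):
    """Scale the number of block repeats, always rounding up."""
    return int(math.ceil(depth_coefficient * num_repeat))


# (out_filters, num_repeat) pairs from the default blocks_args.
DEFAULT_STAGES = [(16, 1), (24, 2), (40, 2), (80, 3), (112, 3), (192, 4), (320, 1)]


def scale_stages(width_coefficient, depth_coefficient, depth_division=8):
    """Apply both coefficients to every stage of the default configuration."""
    return [
        (round_filters(f, width_coefficient, depth_division),
         round_repeats(r, depth_coefficient))
        for f, r in DEFAULT_STAGES
    ]


b0_stages = scale_stages(width_coefficient=1.0, depth_coefficient=1.0)
deeper_stages = scale_stages(width_coefficient=1.0, depth_coefficient=1.1)
print(b0_stages)
print(deeper_stages)
```

With both coefficients at 1.0 the default configuration is returned unchanged; a hypothetical depth coefficient of 1.1 only increases the repeat counts.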
- hat.models.backbones.efficientnet.efficientnet(model_type, **kwargs)¶
A module of efficientnet.
- hat.models.backbones.efficientnet.efficientnet_lite(model_type, **kwargs)¶
A module of efficientnet_lite.
- class hat.models.backbones.horizon_swin_transformer.HorizonSwinTransformer(depth_list: List[int], num_heads: List[int], num_classes: int = 1000, patch_size: Union[int, Tuple[int, int]] = 4, in_channels: int = 3, embedding_dims: int = 96, window_size: int = 7, mlp_ratio: float = 4.0, qkv_bias: bool = True, qk_scale: Optional[float] = None, dropout_ratio: float = 0.0, attention_dropout_ratio: float = 0.0, drop_path_ratio: float = 0.0, patch_norm: bool = True, out_indices: Sequence[int] = (0, 1, 2, 3), frozen_stages: int = -1, include_top: bool = True, flat_output: bool = True)¶
A module of adjusted swin transformer, running faster on bpu.
- Parameters
depth_list – Depths of each Swin Transformer stage. For swin_T, the numbers could be [2, 2, 6, 2]. For swin_S, swin_B, or swin_L, the numbers could be [2, 2, 18, 2].
num_heads – Number of attention heads in each stage. For swin_T or swin_S, the numbers could be [3, 6, 12, 24]. For swin_B, the numbers could be [4, 8, 16, 32]. For swin_L, the numbers could be [6, 12, 24, 48].
num_classes – Num classes of output layer.
patch_size – Patch size. Default: 4.
in_channels – Number of input image channels. Default: 3.
embedding_dims – Number of linear projection output channels. For swin_T or swin_S, the number could be 96. For swin_B, the number could be 128. For swin_L, the number could be 192.
window_size – Window size. Default: 7.
mlp_ratio – Ratio of mlp hidden dim to embedding dim. Default: 4.
qkv_bias – Whether to add a learnable bias to query, key, value. Default: True.
qk_scale – Override default qk scale of head_dim ** -0.5 if set.
dropout_ratio – Dropout rate. Default: 0.
attention_dropout_ratio – Attention dropout rate. Default: 0.
drop_path_ratio – Stochastic depth rate. Default: 0.
patch_norm – Whether to add normalization after patch embedding. Default: True.
out_indices – Output from which stages.
frozen_stages – Stages to be frozen (stop grad and set eval mode). Default: -1. -1 means not freezing any parameters.
include_top – Whether to include output layer. Default: True.
flat_output – Whether to view the output tensor. Default: True.
- forward(x)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- init_weights(m)¶
Initialize the weights in backbone.
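The variant-specific values suggested for depth_list, num_heads and embedding_dims can be collected into one table. The sketch below only restates the numbers from the parameter descriptions; the dictionary name and helper functions are hypothetical, and the per-stage channel doubling used in `head_dims` is the standard Swin Transformer convention, assumed here rather than confirmed for this adjusted variant.

```python
# Suggested settings per variant, copied from the parameter descriptions.
SWIN_VARIANTS = {
    "swin_T": dict(depth_list=[2, 2, 6, 2], num_heads=[3, 6, 12, 24], embedding_dims=96),
    "swin_S": dict(depth_list=[2, 2, 18, 2], num_heads=[3, 6, 12, 24], embedding_dims=96),
    "swin_B": dict(depth_list=[2, 2, 18, 2], num_heads=[4, 8, 16, 32], embedding_dims=128),
    "swin_L": dict(depth_list=[2, 2, 18, 2], num_heads=[6, 12, 24, 48], embedding_dims=192),
}


def head_dims(cfg):
    """Channels per attention head in each stage, assuming channels double per stage."""
    return [
        cfg["embedding_dims"] * 2 ** stage // heads
        for stage, heads in enumerate(cfg["num_heads"])
    ]


def default_qk_scale(head_dim):
    """The scale used when qk_scale is None: head_dim ** -0.5."""
    return head_dim ** -0.5


for name, cfg in SWIN_VARIANTS.items():
    print(name, head_dims(cfg))
```

Under that assumption every variant ends up with the same number of channels per head in every stage, so the default qk scale is identical across variants.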
- class hat.models.backbones.mixvargenet.MixVarGENet(net_config: List[hat.models.backbones.mixvargenet.MixVarGENetConfig], num_classes: int, bn_kwargs: dict, output_list: Union[List[int], Tuple[int]] = (), disable_quanti_input: bool = False, fc_filter: int = 1024, include_top: bool = True, flat_output: bool = True, bias: bool = False, input_channels: int = 3, input_sequence_length: int = 1, input_resize_scale: Optional[int] = None, warping_module: Optional[torch.nn.modules.module.Module] = None)¶
Module of MixVarGENet.
- Parameters
net_config (List[MixVarGENetConfig]) – Network settings.
num_classes (int) – Num classes.
bn_kwargs (dict) – Kwargs of BN layer.
output_list (List[int]) – Output id of net_config blocks. The outputs of these blocks will be the outputs of this net. Setting output_list to [] exports the outputs of all blocks.
disable_quanti_input (bool) – Whether to disable input quantization.
fc_filter (int) – The out_channels of the last_conv.
include_top (bool) – Whether to include output layer.
flat_output (bool) – Whether to view the output tensor.
bias (bool) – Whether to use bias.
input_channels (int) – Input image channels. The first conv has input_channels times input_sequence_length input channels.
input_resize_scale – This will resize the input image with the scale value.
- forward(x, uv_map=None)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- process_sequence_input(x: List) → Union[torch.Tensor, horizon_plugin_pytorch.qtensor.QTensor]¶
Process sequence input with cat.
- class hat.models.backbones.mobilenetv1.MobileNetV1(num_classes: int, bn_kwargs: dict, alpha: float = 1.0, bias: bool = True, dw_with_relu: bool = True, include_top: bool = True, flat_output: bool = True)¶
A module of mobilenetv1.
- Parameters
num_classes (int) – Num classes of output layer.
bn_kwargs (dict) – Dict for BN layer.
alpha (float) – Alpha for mobilenetv1.
bias (bool) – Whether to use bias in module.
dw_with_relu (bool) – Whether to use relu in dw conv.
include_top (bool) – Whether to include output layer.
flat_output (bool) – Whether to view the output tensor.
- forward(x)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.backbones.mobilenetv2.MobileNetV2(num_classes, bn_kwargs: dict, alpha: float = 1.0, bias: bool = True, include_top: bool = True, flat_output: bool = True, use_dw_as_avgpool: bool = False)¶
A module of mobilenetv2.
- Parameters
num_classes (int) – Num classes of output layer.
bn_kwargs (dict) – Dict for BN layer.
alpha (float) – Alpha for mobilenetv2.
bias (bool) – Whether to use bias in module.
include_top (bool) – Whether to include output layer.
flat_output (bool) – Whether to view the output tensor.
use_dw_as_avgpool (bool) – Whether to replace AvgPool with DepthWiseConv
- forward(x)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.backbones.resnet.ResNet18(num_classes: int, bn_kwargs: dict, bias: bool = True, include_top: bool = True, flat_output: bool = True, top_layer: Optional[torch.nn.modules.module.Module] = None, quant_input=True, dequant_output=True)¶
A module of resnet18.
- Parameters
num_classes (int) – Num classes of output layer.
bn_kwargs (dict) – Dict for BN layer.
bias (bool) – Whether to use bias in module.
include_top (bool) – Whether to include output layer.
flat_output (bool) – Whether to view the output tensor.
- class hat.models.backbones.resnet.ResNet50(num_classes: int, bn_kwargs: dict, bias: bool = True, include_top: bool = True, flat_output: bool = True, stride_change: bool = False, top_layer: Optional[torch.nn.modules.module.Module] = None, quant_input=True, dequant_output=True)¶
A module of resnet50.
- Parameters
num_classes (int) – Num classes of output layer.
bn_kwargs (dict) – Dict for BN layer.
bias (bool) – Whether to use bias in module.
include_top (bool) – Whether to include output layer.
flat_output (bool) – Whether to view the output tensor.
- class hat.models.backbones.resnet.ResNet50V2(num_classes: int, group_base: int, bn_kwargs: dict, bias: bool = True, extend_features: bool = False, include_top: bool = True, flat_output: bool = True)¶
A module of resnet50V2.
- Parameters
num_classes – Num classes of output layer.
group_base – Group base for ExtendVarGNetFeatures.
bn_kwargs – Dict for BN layer.
bias – Whether to use bias in module.
extend_features – Whether to extend features.
include_top – Whether to include output layer.
flat_output – Whether to view the output tensor.
- class hat.models.backbones.resnet.ResNetV2(num_classes: int, basic_block: torch.nn.modules.module.Module, expansion: int, unit: list, channels_list: list, group_base: int, bn_kwargs: dict, bias: bool = True, extend_features: bool = False, include_top: bool = True, flat_output: bool = True)¶
A module of resnetv2.
- Parameters
num_classes – Num classes of output layer.
basic_block – Basic block for resnet.
expansion – expansion of channels in basic_block.
unit – Unit num for each block.
channels_list – Channels for each block.
group_base – Group base for ExtendVarGNetFeatures.
bn_kwargs – Dict for BN layer.
bias – Whether to use bias in module.
extend_features – Whether to extend features.
include_top – Whether to include output layer.
flat_output – Whether to view the output tensor.
- forward(x)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.backbones.vargconvnet.VargConvNet(num_classes: int, bn_kwargs: dict, channels_list: list, repeats: list, group_list: int, factor_list: int, out_channels: int = 1024, bias: bool = True, include_top: bool = True, flat_output: bool = True, input_channels: int = 3, deep_stem: bool = True)¶
A module of vargconvnet.
- Parameters
num_classes – Num classes of output layer.
bn_kwargs – Dict for BN layer.
channels_list – List of output channels.
repeats – Depth of each stage.
group_list – Group of each stage.
factor_list – Factor for each stage.
out_channels – Output channels.
bias – Whether to use bias in module.
include_top – Whether to include output layer.
flat_output – Whether to view the output tensor.
input_channels – Input channels of first conv.
deep_stem – Whether to use a deep stem.
- forward(x)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.backbones.vargdarknet.VarGDarkNet53(max_channels: int, bn_kwargs: dict, num_classes: int, include_top: bool = True, flat_output: bool = True)¶
A module of VarGDarkNet53.
- Parameters
max_channels – Max channels.
bn_kwargs – Dict for BN layer.
num_classes – Number classes of output layer.
include_top – Whether to include output layer.
flat_output – Whether to view the output tensor.
- class hat.models.backbones.vargnetv2.CocktailVargNetV2(bn_kwargs: dict, model_type: str = 'VargNetV2', alpha: float = 1.0, group_base: int = 8, factor: int = 2, bias: bool = True, disable_quanti_input: bool = False, flat_output: bool = True, input_channels: int = 3, head_factor: int = 1, input_resize_scale: Optional[int] = None, top_layer: Optional[torch.nn.modules.module.Module] = None)¶
CocktailVargNetV2.
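A lightly modified version of VargNetV2. The main changes are that num_classes is no longer required as an argument and that a custom top_layer is supported.
TODO(ziyang01.wang): Refactoring plan: merge the corresponding changes into VargNetV2.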
- forward(x)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.backbones.vargnetv2.TinyVargNetV2(num_classes, bn_kwargs: dict, alpha: float = 1.0, group_base: int = 8, factor: int = 2, bias: bool = True, extend_features: bool = False, disable_quanti_input: bool = False, include_top: bool = True, flat_output: bool = True, input_channels: int = 3, input_sequence_length: int = 1, head_factor: int = 1, input_resize_scale: Optional[int] = None, channel_list: Tuple[int] = (32, 32, 64, 128, 256), units: Optional[Tuple[int]] = None)¶
A module of TinyVargNetv2.
- Parameters
num_classes (int) – Num classes of output layer.
bn_kwargs (dict) – Dict for BN layer.
alpha (float) – Alpha for tinyvargnetv2.
group_base (int) – Group base for tinyvargnetv2.
factor (int) – Factor for channel expansion in basic block.
bias (bool) – Whether to use bias in module.
extend_features (bool) – Whether to extend features.
include_top (bool) – Whether to include output layer.
flat_output (bool) – Whether to view the output tensor.
input_channels (int) – Input channels of first conv.
input_sequence_length (int) – Length of input sequence.
head_factor (int) – Factor for channels expansion of stage1(mod2).
input_resize_scale (int) – Input resize scale. Narrow_model needs the input resized with a 0.65 scale during int_infer, visualize, or eval.
channel_list (tuple) – Number of channels in each stage.
- class hat.models.backbones.vargnetv2.VargNetV2(num_classes, bn_kwargs: dict, model_type: str = 'VargNetV2', alpha: float = 1.0, group_base: int = 8, factor: int = 2, bias: bool = True, extend_features: bool = False, disable_quanti_input: bool = False, include_top: bool = True, flat_output: bool = True, input_channels: int = 3, input_sequence_length: int = 1, head_factor: int = 1, input_resize_scale: Optional[int] = None, channel_list: Tuple[int] = (32, 32, 64, 128, 256), units: Optional[Tuple[int]] = None)¶
A module of vargnetv2.
- Parameters
num_classes (int) – Num classes of output layer.
bn_kwargs (dict) – Dict for BN layer.
model_type (str) – Choose to use VargNetV2 or TinyVargNetV2.
alpha (float) – Alpha for vargnetv2.
group_base (int) – Group base for vargnetv2.
factor (int) – Factor for channel expansion in basic block.
bias (bool) – Whether to use bias in module.
extend_features (bool) – Whether to extend features.
include_top (bool) – Whether to include output layer.
flat_output (bool) – Whether to view the output tensor.
input_channels (int) – Input channels of first conv.
input_sequence_length (int) – Length of input sequence.
head_factor (int) – Factor for channels expansion of stage1(mod2).
input_resize_scale (int) – Input resize scale. Narrow_model needs the input resized with a 0.65 scale during int_infer, visualize, or eval.
channel_list (tuple) – Number of channels in each stage.
- forward(x)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- process_sequence_input(x: List) → Union[torch.Tensor, horizon_plugin_pytorch.qtensor.QTensor]¶
Process sequence input with cat.
- hat.models.backbones.contrib.resnet.resnet18(pretrained_path=None, **kwargs)¶
ResNet-18 model from “Deep Residual Learning for Image Recognition”.
- Parameters
pretrained (bool) – If True, returns a model pre-trained on ImageNet.
path (str) – The path of pretrained model.
- hat.models.backbones.contrib.resnet.resnet50(pretrained_path=None, **kwargs)¶
ResNet-50 model from “Deep Residual Learning for Image Recognition”.
- Parameters
pretrained (bool) – If True, returns a model pre-trained on ImageNet.
path (str) – The path of pretrained model.
- class hat.models.base_modules.extend_container.ExtSequential(modules: Iterable[torch.nn.modules.module.Module])¶
A sequential container which extends nn.Sequential to support dict or nn.Module arguments.
Same as nn.Sequential, ExtSequential can only forward one input argument:
input -> module1 -> input -> module2 -> input …
- Parameters
modules – List or tuple of nn.Module instances.
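The chaining described above (input -> module1 -> input -> module2 -> …) can be sketched without any framework. `TinySequential` below is a hypothetical stand-in written for illustration, not the HAT class, and it uses plain callables in place of nn.Module instances.

```python
from typing import Any, Callable, Iterable


class TinySequential:
    """Hypothetical stand-in for a sequential container (not the HAT class)."""

    def __init__(self, modules: Iterable[Callable[[Any], Any]]):
        self.modules = list(modules)

    def __call__(self, x: Any) -> Any:
        # Each module receives exactly one argument: the previous output.
        for module in self.modules:
            x = module(x)
        return x


pipeline = TinySequential([lambda x: x + 1, lambda x: x * 2])
result = pipeline(3)  # (3 + 1) * 2
```

Because every stage takes a single argument, a module that needs several inputs has to receive them packed into one object such as a tuple or dict.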
- class hat.models.base_modules.basic_vargnet_module.ExtendVarGNetFeatures(prev_channel, channels, num_units, group_base, bn_kwargs, factor=2.0, dropout_kwargs=None)¶
Extend features.
- Parameters
prev_channel (int) – Input channels.
channels (list) – Channels of output features.
num_units (list) – The number of units of each extend stride.
group_base (int) – The number of channels per group.
bn_kwargs (dict) – BatchNormEx kwargs.
factor (float, optional) – Channel factor, by default 2.0
dropout_kwargs (dict, optional) – QuantiDropout kwargs, None means do not use drop, by default None
- forward(features)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.base_modules.bbox_decoder.XYWHBBoxDecoder(legacy_bbox: Optional[bool] = False, reg_mean: Optional[Tuple] = (0.0, 0.0, 0.0, 0.0), reg_std: Optional[Tuple] = (1.0, 1.0, 1.0, 1.0))¶
Decode bounding box in XYWH ways (proposed in RCNN).
- Parameters
legacy_bbox (bool, optional) – Whether to represent bbox in legacy way. Default is False.
reg_mean (tuple, optional) – Mean value to be subtracted from bbox regression task in each coordinate.
reg_std (tuple, optional) – Standard deviation value to be divided from bbox regression task in each coordinate.
- forward(boxes: torch.Tensor, boxes_delta: torch.Tensor) → torch.Tensor ¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
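As a rough illustration of what decoding XYWH deltas means, here is a minimal sketch of the standard RCNN box parameterization in plain Python. It assumes the deltas were normalized with reg_mean and reg_std, ignores legacy_bbox, and is not taken from the HAT implementation; the function name `decode_xywh` is hypothetical.

```python
import math


def decode_xywh(box, delta, reg_mean=(0.0,) * 4, reg_std=(1.0,) * 4):
    """Apply (dx, dy, dw, dh) deltas to an (x1, y1, x2, y2) box (RCNN style)."""
    # Undo the normalization that is assumed to have been applied at encoding.
    dx, dy, dw, dh = (d * s + m for d, s, m in zip(delta, reg_std, reg_mean))
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h
    # Shift the center by a fraction of the box size, scale size exponentially.
    new_cx, new_cy = cx + dx * w, cy + dy * h
    new_w, new_h = w * math.exp(dw), h * math.exp(dh)
    return (
        new_cx - 0.5 * new_w,
        new_cy - 0.5 * new_h,
        new_cx + 0.5 * new_w,
        new_cy + 0.5 * new_h,
    )


shifted = decode_xywh((0.0, 0.0, 10.0, 10.0), (0.1, 0.0, 0.0, 0.0))
```

Zero deltas return the input box unchanged, and a dx of 0.1 moves a box of width 10 one pixel to the right.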
- class hat.models.base_modules.dequant_module.DequantModule(data_names: List)¶
Do dequant to data.
- Parameters
data_names – A list of data names that need dequantization.
- forward(pred_dict: Mapping, *args)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.base_modules.label_encoder.ClassWiseTrackIdEncoder(num_classes: int, exclude_background: Optional[bool] = False)¶
Class wise track id encoder.
- Parameters
num_classes – Number of classes, including background class.
exclude_background – Whether to exclude background class in the returned label (usually class 0).
- forward(track_id: torch.Tensor, cls_label: torch.Tensor) → torch.Tensor ¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.base_modules.label_encoder.MatchLabelGroundLineEncoder(limit_reg_length: bool = False, cls_use_pos_only: bool = False, cls_on_hard: bool = False, reg_on_hard: bool = False)¶
RCNN vehicle ground line label encoder.
This class encodes gt and matching results to separate bbox and class labels.
- Parameters
limit_reg_length – Whether to limit the length of regression.
cls_use_pos_only – Whether to use positive labels only during encoding. Default is False.
cls_on_hard – Whether to apply classification on hard labels only. Default is False.
reg_on_hard – Whether to apply regression on hard labels only. Default is False.
- forward(boxes: torch.Tensor, gt_boxes: torch.Tensor, gt_flanks: torch.Tensor, match_pos_flag: torch.Tensor, match_gt_id: torch.Tensor) → Dict[str, torch.Tensor] ¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- static get_intersections_to_vertical(points, coord_x1, coord_x2)¶
Intersection coordinates.
- class hat.models.base_modules.label_encoder.MatchLabelSepEncoder(bbox_encoder: Optional[torch.nn.modules.module.Module] = None, class_encoder: Optional[torch.nn.modules.module.Module] = None, cls_use_pos_only: Optional[bool] = False, cls_on_hard: Optional[bool] = False, reg_on_hard: Optional[bool] = False)¶
Encode gt and matching results to separate bbox and class labels.
- Parameters
bbox_encoder – BBox label encoder
class_encoder – Class label encoder
cls_use_pos_only – Whether to use positive labels only during encoding.
reg_on_hard – Regression on hard label only.
cls_on_hard – Classification on hard label only.
- forward(boxes: torch.Tensor, gt_boxes: torch.Tensor, match_pos_flag: torch.Tensor, match_gt_id: torch.Tensor, ig_flag: Optional[torch.Tensor] = None) → Dict[str, torch.Tensor] ¶
Encode gt and matching results to separate bbox and class labels.
- Parameters
boxes (torch.Tensor) – (B, N, 4), batched predicted boxes.
gt_boxes (torch.Tensor) – (B, M, 5+), batched ground truth boxes, might be padded.
match_pos_flag (torch.Tensor) – (B, N), matched result of each predicted box.
match_gt_id (torch.Tensor) – (B, N), matched gt box index of each predicted box.
ig_flag (torch.Tensor) – (B, N), ignore matched result of each predicted box.
- class hat.models.base_modules.label_encoder.MatchLabelTrackEncoder(track_use_pos_only: bool = True, track_on_hard: bool = False, track_label_encoder: Optional[torch.nn.modules.module.Module] = None)¶
Encode gt and matching results to track labels.
- Parameters
track_use_pos_only – Whether to use positive labels only during encoding.
track_on_hard – Whether to use neg class bbox for track.
track_label_encoder – Track label encoder.
- forward(boxes: torch.Tensor, gt_boxes: torch.Tensor, match_pos_flag: torch.Tensor, match_gt_id: torch.Tensor, ig_flag: Optional[torch.Tensor] = None) → Dict[str, torch.Tensor] ¶
- Parameters
boxes (torch.Tensor) – (B, N, 4), batched predicted boxes.
gt_boxes (torch.Tensor) – (B, M, 6+), batched ground truth boxes (x1, y1, x2, y2, cls, track_id, …), might be padded.
match_pos_flag (torch.Tensor) – (B, N), matched result of each predicted box.
match_gt_id (torch.Tensor) – (B, N), matched gt box index of each predicted box.
ig_flag (torch.Tensor) – (B, N), ignore matched result of each predicted box.
- class hat.models.base_modules.label_encoder.MultiClassMatchLabelSepEncoder(bbox_encoder: Optional[torch.nn.modules.module.Module] = None, class_encoder: Optional[torch.nn.modules.module.Module] = None, bg_in_label: bool = True)¶
Encode gt and matching results to separate bbox and class labels.
- Parameters
bbox_encoder – BBox label encoder
class_encoder – Class label encoder
bg_in_label – Whether the background is at label index 0. Defaults to True.
- forward(boxes: torch.Tensor, gt_boxes: torch.Tensor, match_pos_flag: torch.Tensor, match_gt_id: torch.Tensor, ig_flag: Optional[torch.Tensor] = None) → Dict[str, torch.Tensor] ¶
- Parameters
boxes (torch.Tensor) – (B, N, 4), batched predicted boxes.
gt_boxes (torch.Tensor) – (B, M, 5+), batched ground truth boxes, might be padded.
match_pos_flag (torch.Tensor) – (B, N), matched result of each predicted box.
match_gt_id (torch.Tensor) – (B, N), matched gt box index of each predicted box.
ig_flag (torch.Tensor) – (B, N), ignore matched result of each predicted box.
- class hat.models.base_modules.label_encoder.OneHotClassEncoder(num_classes: int, class_agnostic_neg: Optional[bool] = False, exclude_background: Optional[bool] = False)¶
One hot class encoder.
- Parameters
num_classes – Number of classes, including background class.
class_agnostic_neg – Whether the negative label should be class agnostic. If not, hard instances will remain the original values. Otherwise, all negative labels will be set to -1.
exclude_background – Whether to exclude background class in the returned label (usually class 0).
- forward(cls_label: torch.Tensor) → torch.Tensor ¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
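To make the effect of exclude_background concrete, the following is a minimal one-hot sketch in plain Python. It assumes non-negative integer class labels with background at index 0, and it deliberately leaves out the handling of negative (hard or ignored) labels governed by class_agnostic_neg; `one_hot` is a hypothetical helper, not the HAT module.

```python
def one_hot(labels, num_classes, exclude_background=False):
    """One-hot encode integer labels; optionally drop the background column."""
    rows = []
    for label in labels:
        row = [1 if cls == label else 0 for cls in range(num_classes)]
        if exclude_background:
            # Assumption: background is class 0, so its column is removed.
            row = row[1:]
        rows.append(row)
    return rows


full = one_hot([0, 2], num_classes=3)
foreground_only = one_hot([0, 2], num_classes=3, exclude_background=True)
```

With the background column removed, a background sample becomes an all-zero row, which is the usual target layout for per-class sigmoid classifiers.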
- class hat.models.base_modules.label_encoder.PersonPositionLabelFromMatch(dms_position_classes_weight: int, oms_position_classes_weight: int)¶
Generate person position label from matched boxes.
- Parameters
dms_position_classes_weight – DMS position class weight.
oms_position_classes_weight – OMS position class weight.
- forward(boxes: torch.Tensor, gt_boxes: torch.Tensor, match_pos_flag: torch.Tensor, match_gt_id: torch.Tensor, **kwargs)¶
Forward.
- Parameters
boxes – (B, N, 4), batched predicted boxes.
gt_boxes – (B, M, 7), batched ground truth boxes, might be padded if the number of gt boxes differs between samples.
(B, M, 0:3): gt boxes coordinates.
(B, M, 4): gt boxes class label.
(B, M, 5): gt boxes label type. 0: DMS; 1: OMS.
(B, M, 6): gt boxes position class label.
match_pos_flag – (B, N), matched result of each predicted box. Entries with value 1 represent positive matches, 0 negative, and -1 ignored.
match_gt_id – (B, N), matched gt box index of each predicted box.
- class hat.models.base_modules.label_encoder.RCNN3DLabelFromMatch(feat_h: int, feat_w: int, kps_num: int, gauss_threshold: float, gauss_3d_threshold: float, gauss_depth_threshold: float, undistort_depth_uv: bool = False, roi_expand_param: Optional[float] = 1.0)¶
RCNN 3d label encoder.
- Parameters
feat_h – Roi featuremap’s height.
feat_w – Roi featuremap’s width.
kps_num – number of keypoints to be predicted, its value must be 1 due to the center of box.
gauss_threshold – a threshold of score_map.
gauss_3d_threshold – a threshold of 3d offset reg map.
gauss_depth_threshold – a threshold of depth reg map.
gauss_dim_threshold – a threshold of 3d dim reg map.
undistort_depth_uv – whether the depth label is undistorted into depth_u/v.
roi_expand_param – a ratio of rois which need to be expanded.
- forward(boxes: torch.Tensor, gt_boxes: torch.Tensor, match_pos_flag: torch.Tensor, match_gt_id: torch.Tensor)¶
Forward.
The idea of top-down keypoint detection approach is adopted here.
- Parameters
boxes – (B, N, 4), batched predicted boxes.
gt_boxes – (B, M, 5+), batched ground truth boxes, might be padded.
match_pos_flag – (B, N), matched result of each predicted box. Entries with value 1 represent positive matches, 0 negative, and -1 ignored.
match_gt_id – (B, N), matched gt box index of each predicted box.
- class hat.models.base_modules.label_encoder.RCNNBinDetLabelFromMatch(roi_h_zoom_scale, roi_w_zoom_scale, feature_h, feature_w, num_classes, cls_on_hard, allow_low_quality_heatmap=False)¶
RCNN bin detection label encoder.
Bin detection is the detection task in the areas of bins, which are parent boxes.
Get label by anchor match. For example if anchor is matched by A, then A’s class label is GT’s class label.
- Parameters
roi_h_zoom_scale – Zoom scale of roi’s height.
roi_w_zoom_scale – Zoom scale of roi’s width.
feature_h – Roi featuremap’s height.
feature_w – Roi featuremap’s width.
num_classes – Num of classes.
cls_on_hard – Classification on hard label only.
allow_low_quality_heatmap – Whether to allow low quality heatmap.
- forward(boxes: torch.Tensor, gt_boxes: torch.Tensor, match_pos_flag: torch.Tensor, match_gt_id: torch.Tensor, ig_flag: Optional[torch.Tensor] = None)¶
Forward.
- Parameters
boxes – With shape (N, 4+) or (B, N, 4+), where 4 represents (x1, y1, x2, y2)
gt_boxes – With shape (B, num_gt_box, 5+), where 5 represents (x1, y1, x2, y2, class_id)
match_pos_flag – With shape (B, num_anchors), value 1: pos, 0: neg, -1: ignore
match_gt_id – With shape (B, num_anchors), the best matched gt box id, -1 means unavailable
ig_flag – With shape (B, N), ignore matched result of each predicted box
- Returns
non_neg_match_label: With shape (B, num_anchors, 1).
match_pos_flag > 0: label > 0 or label < 0, depending on the roi label.
match_pos_flag == 0: label == 0.
match_pos_flag < 0: label < 0.
label_map: With shape (B * num_anchors, num_classes, w, h).
offset: With shape (B * num_anchors, 4, w, h).
mask: With shape (B * num_anchors, num_classes).
- get_label(anchors, gt_anchor_box)¶
Get label.
- Parameters
anchors – With shape (B, num_anchors, 4+)
gt_anchor_box – With shape (B, num_anchors, 5)
- Returns
labelmap_onehot_label: With shape (B, num_anchors, w, h).
relative_box: gt_box (subbox) relative to rois (anchors).
labelmap: With shape (B, num_anchors, num_classes, w, h).
offset: With shape (B, num_anchors, 4, w, h).
- class hat.models.base_modules.label_encoder.RCNNKPSLabelFromMatch(feat_h: int, feat_w: int, kps_num: int, ignore_labels: Tuple[int], roi_expand_param: Optional[float] = 1.0, gauss_threshold: Optional[float] = 0.6)¶
RCNN keypoints detection label encoder.
- Parameters
feat_h – the height of the output feature.
feat_w – the width of the output feature.
kps_num – number of keypoints to be predicted.
ignore_labels – GT labels of keypoints which need to be ignored.
roi_expand_param – a ratio of rois which need to be expanded.
gauss_threshold – a threshold of score_map.
- forward(boxes: torch.Tensor, gt_boxes: torch.Tensor, match_pos_flag: torch.Tensor, match_gt_id: torch.Tensor)¶
Forward.
The idea of top-down keypoint detection approach is adopted here.
- Parameters
boxes – (B, N, 4), batched predicted boxes.
gt_boxes – (B, M, 5+), batched ground truth boxes, might be padded.
match_pos_flag – (B, N), matched result of each predicted box. Entries with value 1 represent positive matches, 0 negative, and -1 ignored.
match_gt_id – (B, N), matched gt box index of each predicted box.
- get_score_map(center, sigma_x=1.6, sigma_y=1.6, bin_offset=0.5)¶
Get score map by gauss.
The output of this module is a score map with shape (feat_h * feat_w,).
- Parameters
center – The projected coordinates of keypoints.
sigma_x – Gauss sigma_x.
sigma_y – Gauss sigma_y.
bin_offset – the offset of bins.
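A flattened Gaussian score map of this shape can be sketched as below. This is an assumption about how center, sigma_x, sigma_y, and bin_offset interact (each bin is evaluated at its index plus bin_offset, in feature-map coordinates), written in plain Python for illustration and not taken from the HAT source. Here feat_h and feat_w are passed explicitly, whereas the method presumably reads them from the encoder instance.

```python
import math


def gaussian_score_map(center, feat_h, feat_w,
                       sigma_x=1.6, sigma_y=1.6, bin_offset=0.5):
    """Return a flattened (feat_h * feat_w,) Gaussian score map."""
    cx, cy = center
    scores = []
    for row in range(feat_h):
        for col in range(feat_w):
            # Evaluate the Gaussian at the bin center (index + bin_offset).
            dx = (col + bin_offset) - cx
            dy = (row + bin_offset) - cy
            exponent = dx * dx / (2 * sigma_x ** 2) + dy * dy / (2 * sigma_y ** 2)
            scores.append(math.exp(-exponent))
    return scores


score_map = gaussian_score_map(center=(1.5, 0.5), feat_h=2, feat_w=3)
```

The score is 1.0 in the bin whose center coincides with the keypoint and decays with distance, so thresholding it (for example with gauss_threshold) selects a neighborhood around the keypoint.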
- class hat.models.base_modules.label_encoder.XYWHBBoxEncoder(legacy_bbox: Optional[bool] = False, reg_mean: Optional[Tuple] = (0.0, 0.0, 0.0, 0.0), reg_std: Optional[Tuple] = (1.0, 1.0, 1.0, 1.0))¶
Encode bounding box in XYWH ways (proposed in RCNN).
- Parameters
legacy_bbox – Whether to represent bbox in legacy way.
reg_mean – Mean value to be subtracted from bbox regression task in each coordinate.
reg_std – Standard deviation value to be divided from bbox regression task in each coordinate.
- forward(boxes: torch.Tensor, gt_boxes: torch.Tensor) → torch.Tensor ¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
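For reference, the standard RCNN XYWH encoding of a ground truth box relative to a reference box can be sketched as follows. The sketch assumes (x1, y1, x2, y2) corner boxes, ignores legacy_bbox, and applies reg_mean and reg_std as a subtract-then-divide normalization; `encode_xywh` is a hypothetical name and the code is not taken from HAT.

```python
import math


def encode_xywh(box, gt_box, reg_mean=(0.0,) * 4, reg_std=(1.0,) * 4):
    """Encode gt_box relative to box as normalized (dx, dy, dw, dh)."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h
    gx1, gy1, gx2, gy2 = gt_box[:4]
    gw, gh = gx2 - gx1, gy2 - gy1
    gcx, gcy = gx1 + 0.5 * gw, gy1 + 0.5 * gh
    # Center offsets are relative to the box size; sizes use a log ratio.
    raw = ((gcx - cx) / w, (gcy - cy) / h, math.log(gw / w), math.log(gh / h))
    return tuple((r - m) / s for r, m, s in zip(raw, reg_mean, reg_std))


target = encode_xywh((0.0, 0.0, 10.0, 10.0), (1.0, 0.0, 11.0, 10.0))
```

A ground truth box shifted right by one pixel relative to a box of width 10 yields dx = 0.1 and zeros elsewhere.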
- class hat.models.base_modules.matcher.IgRegionMatcher(num_classes: int, ig_region_overlap: float, legacy_bbox: Optional[bool] = False, exclude_background: Optional[bool] = False)¶
Ignore region matcher by max overlap (intersection over area of ignore region).
- Parameters
num_classes – Number of classes, including background class.
ig_region_overlap – Boxes whose IoA with an ignore region is greater than ig_region_overlap are regarded as ignored.
legacy_bbox – Whether to add 1 while computing box border.
exclude_background – Whether to clip off the label corresponding to background class (indexed as 0) in output flag.
- forward(boxes: torch.Tensor, ig_regions: torch.Tensor, ig_regions_num: torch.Tensor) → torch.Tensor ¶
- Parameters
boxes – Box tensor with shape (B, N, 4) or (N, 4) when boxes are identical in the whole batch.
ig_regions – Ignore region tensor with shape (B, M, 5+). In one sample, if the number of ig regions is less than M, the first M entries should be filled with real data, and others padded with arbitrary values.
ig_regions_num – Ignore region num tensor in shape (B). The actual number of ig regions for each sample. Cannot be greater than M.
- Returns
Flag tensor with shape (B, self._num_classes - 1) when self._exclude_background is True, or otherwise (B, self._num_classes). The range of the output is {0, 1}. Entries with value 1 are matched with ignore regions.
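The overlap measure named in the class description, intersection over the area of the ignore region (IoA), can be sketched as below for a single box and region. Boxes are assumed to be (x1, y1, x2, y2) corners without the legacy +1 border; `ioa` is a hypothetical helper and the class-wise flag layout of the real module is not reproduced.

```python
def ioa(box, region):
    """Intersection area divided by the area of the ignore region."""
    inter_w = max(0.0, min(box[2], region[2]) - max(box[0], region[0]))
    inter_h = max(0.0, min(box[3], region[3]) - max(box[1], region[1]))
    region_area = (region[2] - region[0]) * (region[3] - region[1])
    return inter_w * inter_h / region_area if region_area > 0 else 0.0


partial = ioa((0.0, 0.0, 10.0, 10.0), (5.0, 5.0, 15.0, 15.0))
covered = ioa((0.0, 0.0, 10.0, 10.0), (2.0, 2.0, 4.0, 4.0))
ig_region_overlap = 0.5  # illustrative threshold
is_ignored = covered > ig_region_overlap
```

Unlike IoU, IoA is asymmetric: a large box that fully covers a small ignore region scores 1.0 even though their IoU is small.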
- class hat.models.base_modules.matcher.MaxIoUMatcher(pos_iou: float, neg_iou: float, allow_low_quality_match: bool = True, low_quality_match_iou: float = 0.1, legacy_bbox: bool = False, overlap_type: str = 'iou', clip_gt_before_matching: bool = False)¶
Bounding box classification label matcher by max iou.
- Parameters
pos_iou – Boxes whose IoU is larger than pos_iou are regarded as positive samples for classification.
neg_iou – Boxes whose IoU is smaller than neg_iou are regarded as negative samples for classification.
allow_low_quality_match – Whether to allow low quality match. Default is True.
low_quality_match_iou – The IoU threshold for low quality match. A low quality match happens if a ground truth box is not matched to any box. Default is 0.1.
legacy_bbox – Whether to add 1 while computing box border. Default is False.
overlap_type – Overlap type for the calculation of correspondence, can be either “ioa” or “iou”. Default is “iou”.
clip_gt_before_matching – Whether to clip ground truth boxes to image shape before matching. Default is False.
- forward(boxes: torch.Tensor, gt_boxes: torch.Tensor, gt_boxes_num: torch.Tensor, im_hw: Optional[torch.Tensor] = None) → Tuple[torch.Tensor, torch.Tensor] ¶
- Parameters
boxes – Box tensor with shape (B, N, 4) or (N, 4) when boxes are identical in the whole batch.
gt_boxes – GT box tensor with shape (B, M, 5+). In one sample, if the number of gt boxes is less than M, the first M entries should be filled with real data, and others padded with arbitrary values.
gt_boxes_num – GT box num tensor with shape (B). The actual number of gt boxes for each sample. Cannot be greater than M.
im_hw – Image HW tensor with shape (B, 2), the height and width value of each input image.
- Returns
flag: Flag tensor with shape (B, N). Entries with value 1 represent positive matches, 0 negative, and -1 ignored.
matched_gt_id: Tensor with shape (B, anchor_num). The best matched gt box id; -1 means unavailable.
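The flag convention above (1 positive, 0 negative, -1 ignore) can be illustrated with a single-image max-IoU matcher in plain Python. This is a simplified sketch under stated assumptions: boxes whose best IoU falls between neg_iou and pos_iou are ignored, a gt id is reported only for positive boxes, and low quality matching, batching, padding, and gt clipping are all omitted. The helper names are hypothetical.

```python
def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def max_iou_match(boxes, gt_boxes, pos_iou, neg_iou):
    """Return (flags, matched_gt_ids) for one image."""
    flags, gt_ids = [], []
    for box in boxes:
        best_id, best_iou = -1, 0.0
        for gt_id, gt in enumerate(gt_boxes):
            overlap = iou(box, gt[:4])  # gt rows are (x1, y1, x2, y2, cls, ...)
            if overlap > best_iou:
                best_id, best_iou = gt_id, overlap
        if best_iou > pos_iou:
            flag = 1
        elif best_iou < neg_iou:
            flag = 0
        else:
            flag = -1  # assumption: in-between boxes are ignored
        flags.append(flag)
        gt_ids.append(best_id if flag == 1 else -1)
    return flags, gt_ids


gt = [(0.0, 0.0, 10.0, 10.0, 1)]
candidates = [(0.0, 0.0, 10.0, 10.0), (0.0, 0.0, 10.0, 5.0), (20.0, 20.0, 30.0, 30.0)]
flags, matched_gt_id = max_iou_match(candidates, gt, pos_iou=0.7, neg_iou=0.3)
```

The three candidates have IoU 1.0, 0.5, and 0.0 with the single ground truth box, so they come out as positive, ignored, and negative respectively.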
- class hat.models.base_modules.position_encoding.LearnedPositionalEncoding(num_feats: int, row_num_embed: int = 50, col_num_embed: int = 50)¶
Position embedding with learnable embedding weights.
- Parameters
num_feats – The feature dimension for each position along x-axis or y-axis. The final returned dimension for each position is twice this value.
row_num_embed – The dictionary size of row embeddings. Default 50.
col_num_embed – The dictionary size of col embeddings. Default 50.
- forward(mask: torch.Tensor) → torch.Tensor ¶
Forward function for LearnedPositionalEncoding.
- Parameters
mask – ByteTensor mask. Non-zero values represent ignored positions, while zero values mean valid positions for this image. Shape [bs, h, w].
- Returns
pos: Returned position embedding with shape [bs, num_feats*2, h, w].
- class hat.models.base_modules.quant_module.QuantModule(scale: Optional[float] = None)¶
Do quant to data.
- Parameters
scale – Scale value of quantization.
- forward(x)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.base_modules.resize_parser.ResizeParser(resize_kwargs: Mapping, data_name: Optional[str] = None, resized_data_name: Optional[str] = None, use_plugin_interpolate: bool = False, dequant_out: bool = True)¶
Resize multi stride preds to specific size.
e.g. segmentation, depth, flow and so on.
- Parameters
data_name – name of original data to resize.
resized_data_name – name of data after resize. None means update in data_name inplace.
resize_kwargs – key args of resize.
use_plugin_interpolate – whether use horizon_plugin_pytorch.nn.Interpolate.
dequant_out – whether dequant output when use_plugin_interpolate is True.
- forward(preds: Union[torch.Tensor, Sequence, Mapping])¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.base_modules.roi_feat_extractors.MultiScaleRoIAlign(*args, **kwargs)¶
- forward(featmaps: List[torch.Tensor], boxes: Union[torch.Tensor, List[torch.Tensor]], **kwargs)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.base_modules.roi_feat_extractors.RoiCropResize(in_strides: List[int], target_stride: int, output_size: Tuple[int, int], roi_box: List[int], resize_mode: str = 'bilinear')¶
Crop and Resize feature from feature_map.
- Parameters
in_strides – the strides of input feature maps.
target_stride – the target stride that roi_resize will use.
output_size – the output size of roi_resize, (h,w).
roi_box – the crop region of roi_resize, [x1,y1,x2,y2].
resize_mode – the interpolate method, by default “bilinear”, support “bilinear” and “nearest”.
- forward(data)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.base_modules.transformer_attentions.BevDeformableTemporalAttention(embed_dims: int = 256, num_heads: int = 8, num_levels: int = 4, num_points: int = 4, num_bev_queue: int = 2, im2col_step: int = 64, bev_h: int = 200, bev_w: int = 200, dropout: float = 0.1, batch_first: bool = False, qv_cat: bool = True)¶
An attention module used in BEVFormer.
- Parameters
embed_dims – The embedding dimension of Attention. Default: 256.
num_heads – Parallel attention heads. Default: 8.
num_levels – The number of feature map used in Attention. Default: 4.
num_points – The number of sampling points for each query in each head. Default: 4.
im2col_step – The step used in image_to_column. Default: 64.
dropout – A Dropout layer on inp_identity. Default: 0.1.
batch_first – Key, Query and Value are shape of (batch, n, embed_dim) or (n, batch, embed_dim). Default to False.
qv_cat – if True to concat query and value.
- forward(query: torch.Tensor, key: torch.Tensor, value: torch.Tensor, identity: Optional[torch.Tensor] = None, query_pos: Optional[torch.Tensor] = None, reference_points: Optional[torch.Tensor] = None, spatial_shapes: Optional[torch.Tensor] = None, level_start_index: Optional[torch.Tensor] = None, pre_bev_feat: Optional[torch.Tensor] = None, pre_ref_points: Optional[torch.Tensor] = None, start_of_sequence: Optional[torch.Tensor] = None, **kwargs: Any) → torch.Tensor ¶
Forward Function of MultiScaleDeformAttention.
- Parameters
query – Query of Transformer with shape (num_query, bs, embed_dims).
key – The key tensor with shape (num_key, bs, embed_dims).
value – The value tensor with shape (num_key, bs, embed_dims).
identity – The tensor used for addition, with the same shape as query. Default None. If None, query will be used.
query_pos – The positional encoding for query. Default: None.
key_pos – The positional encoding for key. Default None.
reference_points – The normalized reference points with shape (bs, num_query, num_levels, 2); all elements are in the range [0, 1], with top-left (0, 0) and bottom-right (1, 1), including the padding area. Alternatively, shape (N, Length_{query}, num_levels, 4), where the two additional dimensions (w, h) form reference boxes.
spatial_shapes – Spatial shape of features in different levels. With shape (num_levels, 2), last dimension represents (h, w).
level_start_index – The start index of each level. A tensor with shape (num_levels, ) that can be represented as [0, h_0*w_0, h_0*w_0+h_1*w_1, …].
pre_bev_feat – Previous frame’s BEV feat.
pre_ref_points – reference_points in current frame to previous frame.
- Returns
Forwarded results with shape [num_query, bs, embed_dims].
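The relationship between spatial_shapes and level_start_index described in the parameter list is a running sum of h * w over levels. A minimal sketch in plain Python, using lists in place of tensors (the function name is hypothetical):

```python
def compute_level_start_index(spatial_shapes):
    """Return [0, h_0*w_0, h_0*w_0 + h_1*w_1, ...] for (h, w) pairs."""
    starts, total = [], 0
    for h, w in spatial_shapes:
        starts.append(total)  # offset of this level in the flattened features
        total += h * w
    return starts


starts = compute_level_start_index([(4, 4), (2, 2), (1, 1)])
```

With multi-level features flattened and concatenated along the key dimension, starts[i] is the index of the first element belonging to level i.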
- init_weights() → None ¶
Initialize for Parameters of Module.
- class hat.models.base_modules.transformer_attentions.BevSpatialCrossAtten(pc_range: List[float], deformable_attention: torch.nn.modules.module.Module, embed_dims: int = 256, num_refs: int = 4, num_cams: int = 6, dropout: float = 0.1)¶
An attention module used in Detr3d.
- Parameters
pc_range – point cloud range.
deformable_attention – Module for deformable cross attn.
embed_dims – The embedding dimension of Attention. Default: 256.
num_refs – Number of reference points in head dimension. Default: 4.
num_cams – The number of cameras. Default: 6.
num_points – The number of sampling points for each query in each head. Default: 4.
dropout – A Dropout layer on inp_identity. Default: 0.1.
- forward(query: torch.Tensor, key: Optional[torch.Tensor] = None, value: Optional[torch.Tensor] = None, identity: Optional[torch.Tensor] = None, query_pos: Optional[torch.Tensor] = None, bev_reference_points: Optional[torch.Tensor] = None, mlvl_feats_spatial_shapes: Optional[torch.Tensor] = None, mlvl_feats_level_start_index: Optional[torch.Tensor] = None, **kwargs: Any) → torch.Tensor ¶
Forward Function of Detr3DCrossAtten.
- Parameters
query – Query of Transformer with shape (num_query, bs, embed_dims).
key – The key tensor with shape (num_key, bs, embed_dims).
value – The value tensor with shape (num_key, bs, embed_dims). (B, N, C, H, W)
query_pos – The positional encoding for query. Default: None.
bev_reference_points – The normalized reference points with shape (bs, num_query, 4); all elements are in the range [0, 1], with top-left (0, 0) and bottom-right (1, 1), including the padding area. Alternatively, shape (N, Length_{query}, num_levels, 4), where the two additional dimensions (w, h) form reference boxes.
mlvl_feats_spatial_shapes – Spatial shape of features in different level. With shape (num_levels, 2), last dimension represent (h, w).
mlvl_feats_level_start_index – The start index of each level. A tensor has shape (num_levels) and can be represented as [0, h_0*w_0, h_0*w_0+h_1*w_1, …].
residual – The tensor used for addition, with the same shape as x. Default None. If None, x will be used.
- Returns
Forwarded results with shape [num_query, bs, embed_dims].
- Return type
Tensor
- init_weights() → None ¶
Initialize for Parameters of Module.
- class hat.models.base_modules.transformer_attentions.MSDeformableAttention3D(embed_dims: int = 256, num_heads: int = 8, num_levels: int = 4, num_points: int = 8, im2col_step: int = 64, batch_first: bool = True)¶
An attention module used in BEVFormer, based on Deformable DETR (https://arxiv.org/pdf/2010.04159.pdf).
- Parameters
embed_dims – The embedding dimension of Attention. Default: 256.
num_heads – Parallel attention heads. Default: 8.
num_levels – The number of feature map used in Attention. Default: 4.
num_points – The number of sampling points for each query in each head. Default: 8.
im2col_step – The step used in image_to_column. Default: 64.
batch_first – Key, Query and Value are shape of (batch, n, embed_dim) or (n, batch, embed_dim). Default to True.
- forward(query: torch.Tensor, value: Optional[torch.Tensor] = None, spatial_shapes: Optional[torch.Tensor] = None, reference_points: Optional[torch.Tensor] = None, query_pos: Optional[torch.Tensor] = None, level_start_index: Optional[torch.Tensor] = None) → torch.Tensor ¶
Forward Function of MultiScaleDeformAttention.
- Parameters
query – Query of Transformer with shape (bs, num_query, embed_dims).
value – The value tensor with shape (bs, num_key, embed_dims).
query_pos – The positional encoding for query. Default: None.
reference_points – The normalized reference points with shape (bs, num_query, num_levels, 2); all elements are in the range [0, 1], with top-left (0, 0) and bottom-right (1, 1), including the padding area. Alternatively, shape (N, Length_{query}, num_levels, 4), where the two additional dimensions (w, h) form reference boxes.
spatial_shapes – Spatial shape of features in different levels. With shape (num_levels, 2), last dimension represents (h, w).
level_start_index – The start index of each level. A tensor with shape (num_levels, ) that can be represented as [0, h_0*w_0, h_0*w_0+h_1*w_1, …].
- Returns
Forwarded results with shape [num_query, bs, embed_dims].
- Return type
Tensor
- init_weights() → None ¶
Initialize for Parameters of Module.
- class hat.models.base_modules.transformer_attentions.ObjectDetr3DCrossAtten(embed_dims: int = 256, num_heads: int = 8, num_levels: int = 4, num_points: int = 4, im2col_step: int = 64, pc_range: Optional[List[float]] = None, dropout: float = 0.1, batch_first: bool = False)¶
An attention module used in Detr3d.
- Parameters
embed_dims – The embedding dimension of Attention. Default: 256.
num_heads – Parallel attention heads. Default: 8.
num_levels – The number of feature map used in Attention. Default: 4.
num_points – The number of sampling points for each query in each head. Default: 4.
im2col_step – The step used in image_to_column. Default: 64.
dropout – A Dropout layer on inp_identity. Default: 0.1.
- forward(query: torch.Tensor, key: torch.Tensor, value: torch.Tensor, identity: Optional[torch.Tensor] = None, query_pos: Optional[torch.Tensor] = None, reference_points: Optional[torch.Tensor] = None, bev_feat_shapes: Optional[torch.Tensor] = None, bev_feat_level_start_index: Optional[torch.Tensor] = None, **kwargs)¶
Forward Function of Detr3DCrossAtten.
- Parameters
query – Query of Transformer with shape (num_query, bs, embed_dims).
key – The key tensor with shape (num_key, bs, embed_dims).
value – The value tensor with shape (num_key, bs, embed_dims). (B, N, C, H, W)
residual – The tensor used for addition, with the same shape as x. Default None. If None, x will be used.
query_pos – The positional encoding for query. Default: None.
key_pos – The positional encoding for key. Default None.
reference_points – The normalized reference points with shape (bs, num_query, 4); all elements are in the range [0, 1], with top-left (0, 0) and bottom-right (1, 1), including the padding area. Alternatively, shape (N, Length_{query}, num_levels, 4), where the two additional dimensions (w, h) form reference boxes.
level_start_index – The start index of each level. A tensor has shape (num_levels) and can be represented as [0, h_0*w_0, h_0*w_0+h_1*w_1, …].
- Returns
Forwarded results with shape [num_query, bs, embed_dims].
- Return type
Tensor
- init_weights()¶
Initialize for Parameters of Module.
- class hat.models.base_modules.transformer_bricks.TransformerLayerSequence(transformerlayers: torch.nn.modules.module.Module, num_layers: int = 3)¶
Base class for TransformerEncoder and TransformerDecoder in vision transformer.
- Parameters
transformerlayers – Module of transformerlayer in TransformerCoder. Default: None.
num_layers – The number of TransformerLayer. Default: 3.
- class hat.models.base_modules.postprocess.anchor_postprocess.AnchorPostProcess(input_key: Hashable, num_classes: int, class_offsets: List[int], use_clippings: bool, image_hw: Tuple[int, int], nms_iou_threshold: float, pre_nms_top_k: int, post_nms_top_k: int, nms_margin: float = 0.0, box_filter_threshold: float = 0.0, nms_padding_mode: Optional[str] = None, bbox_min_hw: Tuple[float, float] = (0, 0), input_shift: int = 4, use_stable_sort: Optional[bool] = None)¶
Post process for anchor-based object detection models.
This operation is implemented on BPU and is therefore expected to be faster than a CPU implementation. It is only supported on bernoulli2.
This operation requires input_scale = 1 / 2 ** 4; otherwise a rescale will be applied to the input data. You can manually set the output scale of the previous op (Conv2d, for example) to 1 / 2 ** 4 to avoid the rescale and get the best performance and accuracy.
- Major differences from DetectionPostProcess:
1. Each anchor generates only one pred bbox in total, whereas in DetectionPostProcess each anchor generates one bbox for each class (num_classes bboxes in total).
2. NMS has a margin param: box2 is only suppressed by box1 when box1.score - box2.score > margin (box1.score > box2.score in DetectionPostProcess).
3. An offset can be added to the output class indices (using class_offsets).
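The margin rule can be illustrated with a greedy NMS written in plain Python. This is only a sketch of the suppression condition described above, assuming boxes in (x1, y1, x2, y2) format; the function names are hypothetical, and the BPU implementation may differ in ordering, tie handling and numeric precision.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms_with_margin(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float,
    margin: float = 0.0,
) -> List[int]:
    """Greedy NMS; a box is suppressed only by a kept box that overlaps it
    and beats its score by more than `margin`."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        suppressed = any(
            iou(boxes[k], boxes[i]) > iou_threshold
            and scores[k] - scores[i] > margin
            for k in keep
        )
        if not suppressed:
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.85, 0.3]
kept_plain = nms_with_margin(boxes, scores, iou_threshold=0.5, margin=0.0)
kept_margin = nms_with_margin(boxes, scores, iou_threshold=0.5, margin=0.1)
```

With margin 0.0 the second box is suppressed by the first. With margin 0.1 it survives, because the score gap of about 0.05 does not exceed the margin.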
- Parameters
input_key – Hashable object used to query detection output from input.
num_classes – Class number. Should be the number of foreground classes.
box_filter_threshold – Default threshold to filter boxes by max score.
class_offsets – Offset to be added to the output class index for each branch.
strides – input_size / feature_size in (h, w).
use_clippings – Whether to clip boxes to the image size. If the input is padded, you can clip boxes to the real content by providing the image size.
image_hw – Fixed image size in (h, w). Set to None if inputs have different sizes.
nms_iou_threshold – IoU threshold for NMS.
nms_margin – Only suppress box2 when box1.score - box2.score > nms_margin.
pre_nms_top_k – Maximum number of bounding boxes in each image before NMS.
post_nms_top_k – Maximum number of output bounding boxes in each image.
nms_padding_mode – The way to pad bboxes so that the number of output bounding boxes matches post_nms_top_k. Can be None, “pad_zero” or “rollover”.
bbox_min_hw – Minimum height and width of selected bounding boxes.
input_shift – Customize the input shift of quantized DPP.
use_stable_sort – Whether to use stable sort after post-process. Defaults to None.
- forward(anchors: List[torch.Tensor], head_out: Dict[str, List[torch.Tensor]], im_hw: Optional[Tuple[int, int]] = None) List[List[Tuple[torch.Tensor]]] ¶
Forward method.
The output keyed by “pred_boxes_out” is the float version of “pred_boxes”, which is used in qat&pt inference.
- class hat.models.base_modules.postprocess.argmax_postprocess.ArgmaxPostprocess(data_name: str, dim: int, keepdim: bool = False)¶
Apply argmax of data in pred_dict.
- Parameters
data_name (str) – name of data to apply argmax.
dim (int) – the dimension to reduce.
keepdim (bool) – whether the output tensor has dim retained or not.
- forward(pred_dict: Mapping, *args)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.base_modules.postprocess.argmax_postprocess.HorizonAdasClsPostProcessor(data_name: str, dim: int, keep_dim: bool = True, march: str = 'bayes')¶
Apply argmax of data in pred_dict.
- Parameters
data_name (str) – name of data to apply argmax.
dim (int) – the dimension to reduce.
keep_dim (bool) – whether the output tensor has dim retained or not.
- forward(pred_cls: Mapping, *args)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.base_modules.postprocess.max_postprocess.MaxPostProcess(data_names: list, out_names: List[List[str]], dim: int, keepdim: bool = False, return_indices: bool = True)¶
Apply max of data in pred_dict.
- Parameters
data_names – names of data to apply max.
out_names – output names of data after max, in the same order as data_names.
dim – the dimension to reduce.
keepdim – whether the output tensor has dim retained or not.
return_indices – whether to return the indices corresponding to max.
- forward(pred_dict: Mapping, *args)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.base_modules.postprocess.rle_postprocess.RLEPostprocess(data_name: str, dtype: torch.dtype)¶
Apply run length encoding of data in pred_dict.
Compress dense output with patches of identical value by run length encoding, e.g., for semantic segmentation results. Note that the current plugin rle only supports values processed by argmax.
- Parameters
data_name (str) – name of data to apply run length encoding.
dtype (torch.dtype) – The dtype of the value field in the compressed result. Note that this is not the dtype of the compressed result itself, which is int64. Supports torch.int8 or torch.int16. If the input is the indices output of torch.max, dtype must be torch.int16. If the value dtype is torch.int8, the num dtype is uint8 and the max num is 255. If the value dtype is torch.int16, the num dtype is uint16 and the max num is 65535.
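As an illustration of run length encoding with a bounded count field, the plain-Python sketch below encodes a flat sequence into (value, num) pairs. Splitting a run once the count reaches the maximum is an assumption made for this example; the actual output layout of the plugin is not described here, and the function name is hypothetical.

```python
from typing import List, Sequence, Tuple


def run_length_encode(
    values: Sequence[int], max_num: int = 255
) -> List[Tuple[int, int]]:
    """Encode `values` as (value, num) pairs with num <= max_num.

    max_num=255 mirrors a uint8 count field, 65535 a uint16 one.
    Runs longer than max_num are split into several pairs (assumed).
    """
    pairs: List[Tuple[int, int]] = []
    for v in values:
        if pairs and pairs[-1][0] == v and pairs[-1][1] < max_num:
            # Extend the current run.
            pairs[-1] = (v, pairs[-1][1] + 1)
        else:
            # Start a new run (new value, or count field is full).
            pairs.append((v, 1))
    return pairs


short_runs = run_length_encode([3, 3, 3, 1, 1, 7])
long_run = run_length_encode([5] * 300, max_num=255)
```

A run of 300 identical values does not fit a uint8 count, so in this sketch it becomes two pairs with counts 255 and 45.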
- forward(pred_dict: Mapping, *args)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.base_modules.target.bbox_target.BBoxTargetGenerator(matcher: torch.nn.modules.module.Module, label_encoder: torch.nn.modules.module.Module, ig_region_matcher: Optional[torch.nn.modules.module.Module] = None, sampler: Optional[torch.nn.modules.module.Module] = None)¶
BBox Target Generator for detection task.
BBoxTargetGenerator wraps matchers, a sampler and an encoder to generate training targets by first matching predictions with ground truths to build correspondences and then generating the training target for each prediction.
The details of matching and label encoding are implemented in the matcher classes.
- Parameters
matcher – Matcher defines how the matching between predictions and ground truths actually works.
label_encoder – Label encoder defines how to generate the training target for each prediction given ground truths and correspondences.
ig_region_matcher – Ignore region matcher is used to generate ignore flags for each pred box according to its overlap with input ignore regions.
sampler – Sampler defines how to sample the bboxes and targets. If provided, the boxes are sampled according to the match state. Defaults to None.
- forward(boxes: Union[torch.Tensor, List[torch.Tensor]], gt_boxes: Union[torch.Tensor, List[torch.Tensor]], gt_boxes_num: Optional[torch.Tensor] = None, ig_regions: Optional[Union[torch.Tensor, List[torch.Tensor]]] = None, ig_regions_num: Optional[torch.Tensor] = None, im_hw: Optional[torch.Tensor] = None) Tuple[torch.Tensor, Dict[str, torch.Tensor]] ¶
- Parameters
boxes – Box tensor with shape (B, N, 4) or a list of anchor tensors each with shape (B, N*4, H, W), where each tensor corresponds to anchors of one feature stride. B stands for batch size, N the number of boxes for each sample, H the height and W the width.
gt_boxes – GT box tensor with shape (B, M1, 5+), or a list of B 2d tensors with 5+ as the size of the last dim. For the former case, if the number of gt boxes in one sample is less than M1, the leading entries should be filled with real data and the rest padded with arbitrary values.
gt_boxes_num – If provided, it is the gt box num tensor with shape (B,), the actual number of gt boxes of each sample. Cannot be greater than M1.
ig_regions – Ignore region tensor with shape (B, M2, 5+), or a list of B 2d tensors with 5+ as the size of the last dim. For the former case, if the number of ig regions in one sample is less than M2, the leading entries should be filled with real data and the rest padded with arbitrary values.
ig_regions_num – If provided, it is the ignore region num tensor with shape (B,), the actual number of ig regions of each sample. Cannot be greater than M2.
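The padded layout described for gt_boxes and gt_boxes_num can be sketched with nested Python lists. The helper below is hypothetical and only shows the bookkeeping: real entries first, padding after, and a per-sample count. The box width of 5 and the pad value of 0.0 are assumptions for the example.

```python
from typing import List, Tuple

Box = List[float]


def pad_gt_boxes(
    per_sample: List[List[Box]],
    max_num: int,
    box_dim: int = 5,
    pad_value: float = 0.0,
) -> Tuple[List[List[Box]], List[int]]:
    """Pad each sample's boxes to `max_num` rows and return the real counts."""
    padded: List[List[Box]] = []
    nums: List[int] = []
    for boxes in per_sample:
        if len(boxes) > max_num:
            raise ValueError("a sample has more boxes than max_num")
        # Real boxes come first; the remaining rows hold arbitrary padding.
        filler = [[pad_value] * box_dim for _ in range(max_num - len(boxes))]
        padded.append([list(b) for b in boxes] + filler)
        nums.append(len(boxes))
    return padded, nums


per_sample = [
    [[0, 0, 4, 4, 1]],
    [[1, 1, 2, 2, 0], [3, 3, 5, 5, 2]],
]
padded, nums = pad_gt_boxes(per_sample, max_num=3)
```

Here padded has shape (2, 3, 5) and nums is [1, 2], which plays the role of gt_boxes_num.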
- class hat.models.base_modules.target.bbox_target.ProposalTarget(matcher: torch.nn.modules.module.Module, label_encoder: torch.nn.modules.module.Module, ig_region_matcher: Optional[torch.nn.modules.module.Module] = None, add_gt_bbox_to_proposal: bool = False, only_use_gt_rois: bool = False, sampler: Optional[torch.nn.modules.module.Module] = None)¶
Proposal Target Generator for two-stage task.
ProposalTarget Generator wraps matchers, a sampler and an encoder to generate training targets by first matching predictions with ground truths to build correspondences and then generating the training target for each proposal. If a sampler is given, the final proposal bboxes are sampled.
- Parameters
matcher – same as BBoxTargetGenerator.
label_encoder – same as BBoxTargetGenerator.
ig_region_matcher – same as BBoxTargetGenerator.
add_gt_bbox_to_proposal – Whether to add gt_bboxes to the pred boxes as positive proposal boxes. Defaults to False.
sampler – same as BBoxTargetGenerator.
- class hat.models.base_modules.target.bbox_target.ProposalTargetGroundLine(matcher: torch.nn.modules.module.Module, label_encoder: torch.nn.modules.module.Module, ig_region_matcher: Optional[torch.nn.modules.module.Module] = None, add_gt_bbox_to_proposal: bool = False, only_use_gt_rois: bool = False, sampler: Optional[torch.nn.modules.module.Module] = None)¶
- forward(boxes: Union[torch.Tensor, List[torch.Tensor]], gt_boxes: torch.Tensor, gt_flanks: torch.Tensor, gt_boxes_num: torch.Tensor = None, gt_flanks_num: torch.Tensor = None, im_hw: Optional[torch.Tensor] = None) Tuple[torch.Tensor, Dict[str, torch.Tensor]] ¶
- Parameters
gt_flanks – GT flanks tensor with shape (B, M1, 9), or a list of B 2d tensors with 9 as the size of the last dim.
- class hat.models.base_modules.target.bbox_target.ProposalTargetTrack(matcher: torch.nn.modules.module.Module, label_encoder: torch.nn.modules.module.Module, ig_region_matcher: Optional[torch.nn.modules.module.Module] = None, add_gt_bbox_to_proposal: bool = False, only_use_gt_rois: bool = False, sampler: Optional[torch.nn.modules.module.Module] = None)¶
- forward(boxes: Union[torch.Tensor, List[torch.Tensor]], gt_boxes: Union[torch.Tensor, List[torch.Tensor]], num_seq: int, seq_len: torch.Tensor, gt_boxes_num: Optional[torch.Tensor] = None, ig_regions: Optional[Union[torch.Tensor, List[torch.Tensor]]] = None, ig_regions_num: Optional[torch.Tensor] = None, im_hw: Optional[torch.Tensor] = None) Tuple[torch.Tensor, Dict[str, torch.Tensor]] ¶
Proposal target for track2d.
The difference from ProposalTarget is that track2d needs the num_seq and seq_len info.
- Parameters
num_seq – Number of video sequences in the batch.
seq_len – A tensor with shape (num_seq,), representing the length of each sequence in the batch.
- class hat.models.base_modules.target.heatmap_roi_3d_target.HeatMap3DTargetGenerator(num_classes: int, normalize_depth: bool, focal_length_default: float, min_box_edge: int, max_depth: int, max_objs: int, classid_map: dict, down_stride: Optional[int] = 4, undistort_2dcenter: Optional[bool] = False, undistort_depth_uv: Optional[bool] = False, input_padding: Optional[list] = None, depth_min_option: Optional[bool] = False)¶
Generate heatmap target for 3D detection.
Note that computation is currently performed on CPU instead of GPU.
- Parameters
num_classes – Number of classes.
normalize_depth – Whether to normalize depth.
focal_length_default – Default focal length.
min_box_edge – Minimum box edge.
max_depth – Maximum depth.
max_objs – Maximum number of objects.
down_stride – Down stride of heatmap.
undistort_2dcenter – Whether to undistort 2D center.
undistort_depth_uv – Whether to undistort depth uv.
input_padding – Padding of input image.
depth_min_option – Whether to use depth min option.
- forward(data)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.base_modules.target.reshape_target.ReshapeTarget(data_name: str, shape: Optional[Sequence] = None)¶
Reshape target data in label_dict to specific shape.
- Parameters
data_name (str) – name of original data to reshape.
shape (Sequence) – the new shape.
- class hat.models.losses.cross_entropy_loss.CEWithHardMining(use_sigmoid: bool = False, ignore_index: int = - 1, norm_type: str = 'none', reduction: str = 'mean', loss_weight: float = 1.0, class_weight: Optional[torch.Tensor] = None, hard_neg_mining_cfg: Optional[Dict] = None)¶
CE loss with online hard negative mining and auto average factor.
- Parameters
use_sigmoid – Whether the logits tensor is converted to probability through sigmoid. Defaults to False. If True, use F.binary_cross_entropy_with_logits. If False, use F.cross_entropy.
ignore_index – Specifies a target value that is ignored and does not contribute to the loss.
norm_type – Normalization method. Can be “fg_elt”, in which the normalization factor is the number of foreground elements; “fbg_elt”, the number of foreground and background elements; or “none”, in which the loss is not normalized. Defaults to “none”.
reduction – The method used to reduce the loss. Options are [“none”, “mean”, “sum”]. Defaults to “mean”.
loss_weight – Global weight of loss. Defaults to 1.0.
class_weight – Weight of each class. If given, must be a vector with length equal to the number of classes. Defaults to None.
hard_neg_mining_cfg – Hard negative mining config. Please refer to LossHardNegativeMining.
- forward(pred, target, weight=None, avg_factor=None)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.losses.cross_entropy_loss.CEWithLabelSmooth(smooth_alpha=0.1, ignore_index: int = - 100, loss_weight=1.0)¶
Cross-entropy loss with label smoothing.
- Parameters
smooth_alpha (float) – Alpha of label smoothing.
ignore_index (int) – Specifies a target value that is ignored and does not contribute to the loss.
loss_weight (float) – Global weight of loss. Defaults to 1.0.
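For intuition, one common form of label smoothing mixes the one-hot target with a uniform distribution over the K classes before taking the cross entropy. The plain-Python sketch below uses that form; whether HAT spreads smooth_alpha over K or K-1 classes is not stated here, so treat the exact target as an assumption. The function names are hypothetical.

```python
import math
from typing import List, Sequence


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax of a single logit vector."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def label_smooth_ce(
    logits: Sequence[float], target: int, smooth_alpha: float = 0.1
) -> float:
    """Cross entropy against (1 - alpha) * one_hot + alpha / K (assumed form)."""
    k = len(logits)
    logp = log_softmax(logits)
    smoothed = [
        smooth_alpha / k + (1.0 - smooth_alpha) * (1.0 if i == target else 0.0)
        for i in range(k)
    ]
    return -sum(q * lp for q, lp in zip(smoothed, logp))


uniform_loss = label_smooth_ce([0.0, 0.0, 0.0, 0.0], target=2)
hard_loss = label_smooth_ce([2.0, 0.5, -1.0], target=0, smooth_alpha=0.0)
```

With smooth_alpha set to 0 the function reduces to the ordinary cross entropy for the target class.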
- forward(input, target)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.losses.cross_entropy_loss.CEWithWeightMap(weight_min: float = 0.5, remap_params: Optional[Dict] = None, **kwargs)¶
Cross-entropy loss with an image-specific class weight map within the batch.
- Parameters
weight_min – Min weight for each label.
remap_params – Params for remapping labels.
- forward(pred, target, weight=None, avg_factor=None)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.losses.cross_entropy_loss.CrossEntropyLoss(use_sigmoid: bool = False, reduction: str = 'mean', class_weight: Optional[List[float]] = None, loss_weight: float = 1.0, ignore_index: int = - 1, loss_name: Optional[str] = None, auto_class_weight: Optional[bool] = False, weight_min: Optional[float] = None, weight_noobj: Optional[float] = None, num_class: int = 0)¶
Calculate cross entropy loss of multi stride output.
- Parameters
use_sigmoid (bool) – Whether the logits tensor is converted to probability through sigmoid. Defaults to False. If True, use F.binary_cross_entropy_with_logits. If False, use F.cross_entropy.
reduction (str) – The method used to reduce the loss. Options are [none, mean, sum].
class_weight (list[float]) – Weight of each class. Defaults to None.
loss_weight (float) – Global weight of loss. Defaults to 1.
ignore_index (int) – Only works when using cross_entropy.
loss_name (str) – The key of loss in the returned dict. If None, the loss is returned directly.
- Returns
cross entropy loss
- forward(pred, target, weight=None, avg_factor=None)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.losses.cross_entropy_loss.CrossEntropyLossWithTaskWeight(loss_weight: float = 1.0, **kwargs)¶
Calculate cross entropy with task weight.
- Parameters
loss_weight – The multiplier of the loss weight of this task.
kwargs – The kwargs of torch.nn.CrossEntropyLoss.
- Returns
cross entropy loss
- forward(*args, **kwargs)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.losses.cross_entropy_loss.SoftTargetCrossEntropy(loss_name=None)¶
Cross-entropy loss with soft targets.
- Parameters
loss_name (str) – The name of the returned loss.
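A soft-target cross entropy replaces the one-hot label with a full probability vector per sample. The plain-Python sketch below shows the usual definition, the negative sum of target times log-softmax, for a single sample. It is an illustration under that assumption, not HAT's implementation, and the function name is hypothetical.

```python
import math
from typing import Sequence


def soft_target_ce(logits: Sequence[float], target: Sequence[float]) -> float:
    """-sum_k target[k] * log_softmax(logits)[k] for one sample."""
    m = max(logits)
    # Stable log-sum-exp so large logits do not overflow.
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return -sum(t * (x - lse) for t, x in zip(target, logits))


even_split = soft_target_ce([0.0, 0.0], [0.5, 0.5])
one_hot = soft_target_ce([1.0, 3.0], [0.0, 1.0])
```

With a one-hot target the value matches the ordinary cross entropy for that class.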
- forward(input, target)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.losses.focal_loss.FocalLoss(loss_name, num_classes, alpha=0.25, gamma=2.0, loss_weight=1.0, eps=1e-12, reduction='mean')¶
Sigmoid focal loss.
- Parameters
loss_name (str) – The key of loss in the returned dict.
num_classes (int) – Number of classes including background, i.e. C+1, where C is the number of foreground categories.
alpha (float) – A weighting factor for pos-sample, (1-alpha) is for neg-sample.
gamma (float) – Gamma used in focal loss to compress the contribution of easy examples.
loss_weight (float) – Global weight of loss. Defaults to 1.0.
eps (float) – A small value to avoid zero denominator.
reduction (str) – The method used to reduce the loss. Options are [none, mean, sum].
- Returns
A dict containing the calculated loss, the key of loss is loss_name.
- Return type
dict
- forward(pred, target, weight=None, avg_factor=None, points_per_strides=None, valid_classes_list=None)¶
Forward method.
- Parameters
pred (Tensor) – Cls pred, with shape (N, C), where C is the number of foreground classes.
target (Tensor) – Cls target, with shape (N,). Values in [0, C-1] represent the foreground; C or a negative value represents the background.
weight (Tensor) – The weight of loss for each prediction. Default is None.
avg_factor (float) – Normalized factor.
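The roles of alpha and gamma can be seen in a per-sample sketch of the standard sigmoid focal loss. The code below is plain Python for one row of C logits and follows the target convention above, where any target outside [0, C-1] is treated as background. How HAT applies eps, weight and avg_factor is not shown here; the use of eps inside the logarithm is an assumption, and the function names are hypothetical.

```python
import math
from typing import Sequence


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_focal_loss(
    logits: Sequence[float],
    target: int,
    alpha: float = 0.25,
    gamma: float = 2.0,
    eps: float = 1e-12,
) -> float:
    """Sum of per-class focal terms for one sample."""
    total = 0.0
    for c, x in enumerate(logits):
        p = sigmoid(x)
        if c == target:
            # Positive class: down-weight samples that are already confident.
            total += -alpha * (1.0 - p) ** gamma * math.log(p + eps)
        else:
            # Negative class (includes every class when target is background).
            total += -(1.0 - alpha) * p ** gamma * math.log(1.0 - p + eps)
    return total


uncertain = sigmoid_focal_loss([0.0, 0.0], target=0)
confident = sigmoid_focal_loss([8.0, -8.0], target=0)
background = sigmoid_focal_loss([-8.0, -8.0], target=2)
```

The confident, correct prediction contributes far less than the uncertain one, which is the effect gamma is meant to produce.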
- class hat.models.losses.focal_loss.FocalLossV2(alpha: float = 0.25, gamma: float = 2.0, loss_weight: float = 1.0, eps: float = 1e-12, from_logits: bool = True, reduction: str = 'mean', ohem_fp_threshold: Optional[float] = 1.0, ohem_fp_loss_weight: Optional[float] = None)¶
- Parameters
alpha – A weighting factor for pos-sample, (1-alpha) is for neg-sample.
gamma – Gamma used in focal loss to compress the contribution of easy examples.
loss_weight – Global weight of loss. Defaults to 1.0.
eps – A small value to avoid zero denominator.
from_logits – Whether the input prediction is logits (before sigmoid).
reduction – The method to reduce the loss. Options are “none”, “mean” and “sum”. Defaults to “mean”.
ohem_fp_threshold – Score threshold to select ohem fp instance.
ohem_fp_loss_weight – Loss weight for ohem fp instance.
- forward(pred: torch.Tensor, target: torch.Tensor, weight: Optional[torch.Tensor] = None, avg_factor: Optional[Union[float, torch.Tensor]] = None)¶
Forward method.
- Parameters
pred – Cls pred, with shape (B, N, C), where C is the number of foreground classes.
target – Cls target, with shape (B, N, C), where C is the number of foreground classes.
weight – The weight of loss for each prediction. It is mainly used to filter the ignored box. Default is None.
avg_factor – Normalized factor.
- class hat.models.losses.focal_loss.GaussianFocalLoss(alpha: float = 2.0, gamma: float = 4.0, loss_weight: float = 1.0)¶
Gaussian focal loss.
- Parameters
alpha – A weighting factor for positive samples.
gamma – Used in focal loss to balance the contribution of easy examples and hard examples.
loss_weight – Weight factor for Gaussian focal loss.
- forward(logits, labels, grad_tensor=None)¶
Forward function.
- Parameters
logits (torch.Tensor) – The prediction.
labels (torch.Tensor) – The learning target of the prediction, in Gaussian distribution.
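A sketch of the CornerNet-style Gaussian focal loss on plain Python lists follows. It assumes the predictions are already probabilities in (0, 1) and that a label of exactly 1 marks a positive location, while labels below 1 are Gaussian-smoothed negatives. Whether HAT applies a sigmoid to the logits first, and how grad_tensor and loss_weight enter, is not covered here. The function name and eps are hypothetical.

```python
import math
from typing import Sequence


def gaussian_focal_loss(
    pred: Sequence[float],
    target: Sequence[float],
    alpha: float = 2.0,
    gamma: float = 4.0,
    eps: float = 1e-12,
) -> float:
    """Sum of CornerNet-style focal terms over all locations."""
    total = 0.0
    for p, t in zip(pred, target):
        if t == 1.0:
            # Positive location: penalise low confidence.
            total += -math.log(p + eps) * (1.0 - p) ** alpha
        else:
            # Negative location: penalty shrinks near a Gaussian peak.
            total += -math.log(1.0 - p + eps) * p ** alpha * (1.0 - t) ** gamma
    return total


at_peak = gaussian_focal_loss([0.5], [1.0])
near_peak = gaussian_focal_loss([0.5], [0.5])
far_away = gaussian_focal_loss([0.5], [0.0])
```

The same prediction of 0.5 is penalised less near a peak than far from one, because the (1 - target) ** gamma factor shrinks the negative term.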
- class hat.models.losses.focal_loss.LaneFastFocalLoss(alpha: float = 2.0, gamma: float = 4.0, loss_weight: float = 1.0)¶
Modified focal loss, exactly the same as in CornerNet. Runs faster and costs a little more memory. For the lane task, an effective loss is returned when num_pos > 2.
- Parameters
alpha – A weighting factor for positive samples.
gamma – Used in focal loss to balance the contribution of easy examples and hard examples.
loss_weight – Weight factor for Gaussian focal loss.
- class hat.models.losses.focal_loss.SoftmaxFocalLoss(loss_name: str, num_classes: int, alpha: float = 0.25, gamma: float = 2.0, reduction: str = 'mean', weight: Union[float, Sequence] = 1.0)¶
Focal Loss.
- Parameters
loss_name (str) – The key of loss in the returned dict.
num_classes (int) – Class number.
alpha (float, optional) – Alpha. Defaults to 0.25.
gamma (float, optional) – Gamma. Defaults to 2.0.
reduction (str, optional) – Specifies the reduction to apply to the output: 'mean' | 'sum'. Defaults to 'mean'.
weight (Union[float, Sequence], optional) – Weight to be applied to the loss of each input. Defaults to 1.0.
- forward(logits, labels)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.losses.giou_loss.GIoULoss(loss_name, loss_weight=1.0, eps=1e-06, reduction='mean')¶
Generalized Intersection over Union Loss.
- Parameters
loss_name (str) – The key of loss in the returned dict.
loss_weight (float) – Global weight of loss. Defaults to 1.0.
eps (float) – A small value to avoid zero denominator.
reduction (str) – The method used to reduce the loss. Options are [none, mean, sum].
- Returns
A dict containing the calculated loss, the key of loss is loss_name.
- Return type
dict
- forward(pred, target, weight=None, avg_factor=None)¶
Forward method.
- Parameters
pred (torch.Tensor) – Predicted bboxes in (x1, y1, x2, y2) format, representing the upper-left and lower-right points, with shape (N, 4).
target (torch.Tensor) – Corresponding gt_boxes, with the same shape as pred.
weight (torch.Tensor) – Element-wise loss weight, with shape (N,).
avg_factor (float) – Average factor that is used to average the loss.
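The generalized IoU subtracts from the plain IoU the fraction of the smallest enclosing box that is not covered by the union, and the loss is 1 - GIoU. A minimal plain-Python sketch for a single box pair in (x1, y1, x2, y2) format is shown below. Adding eps to both denominators is an assumption for this example, and the function name is hypothetical.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def giou_loss(pred: Box, target: Box, eps: float = 1e-6) -> float:
    """Return 1 - GIoU for one predicted box and one ground truth box."""
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_t = (target[2] - target[0]) * (target[3] - target[1])
    # Intersection rectangle (zero if the boxes do not overlap).
    iw = max(0.0, min(pred[2], target[2]) - max(pred[0], target[0]))
    ih = max(0.0, min(pred[3], target[3]) - max(pred[1], target[1]))
    inter = iw * ih
    union = area_p + area_t - inter
    iou = inter / (union + eps)
    # Smallest axis-aligned box that encloses both boxes.
    ew = max(pred[2], target[2]) - min(pred[0], target[0])
    eh = max(pred[3], target[3]) - min(pred[1], target[1])
    enclose = ew * eh
    giou = iou - (enclose - union) / (enclose + eps)
    return 1.0 - giou


same = giou_loss((0.0, 0.0, 1.0, 1.0), (0.0, 0.0, 1.0, 1.0))
disjoint = giou_loss((0.0, 0.0, 1.0, 1.0), (2.0, 0.0, 3.0, 1.0))
```

Identical boxes give a loss close to 0, while the disjoint pair gives about 4/3, so non-overlapping boxes still receive a distance-dependent signal.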
- class hat.models.losses.hinge_loss.ElementwiseL1HingeLoss(loss_bound_l1: float = 0.0, pos_label: int = 1, neg_label: int = 0, norm_type: str = 'positive_label_elt', loss_weight: float = 1.0, reduction: Optional[str] = None, hard_neg_mining_cfg: Optional[Dict] = None)¶
Elementwise L1 Hinge Loss.
- Parameters
loss_bound_l1 – Upper bound of l1 loss value in each entry.
pos_label – Value in label that represents positive entries.
neg_label – Value in label that represents negative entries.
norm_type – Normalization method. Can be “positive_label_elt”, in which the normalization factor is the number of positive elements, or “positive_loss_elt”, the number of positive losses.
loss_weight – Global weight of loss. Defaults to 1.0.
reduction – The method used to reduce the loss. Options are [none, mean, sum]. ‘mean’ is recommended.
hard_neg_mining_cfg – Hard negative mining config. Please refer to LossHardNegativeMining.
- Returns
loss value
- Return type
torch.Tensor
- class hat.models.losses.hinge_loss.ElementwiseL2HingeLoss(loss_bound_l1: float = 0.0, pos_label: int = 1, neg_label: int = 0, norm_type: str = 'positive_label_elt', loss_weight: float = 1.0, reduction: Optional[str] = None, hard_neg_mining_cfg: Optional[Dict] = None)¶
Elementwise L2 Hinge Loss.
- Parameters
loss_bound_l1 – Upper bound of l1 loss value in each entry.
pos_label – Value in label that represents positive entries.
neg_label – Value in label that represents negative entries.
norm_type – Normalization method. Can be “positive_label_elt”, in which the normalization factor is the number of positive elements, or “positive_loss_elt”, the number of positive losses.
loss_weight – Global weight of loss. Defaults to 1.0.
reduction – The method used to reduce the loss. Options are [none, mean, sum]. ‘mean’ is recommended.
hard_neg_mining_cfg – Hard negative mining config. Please refer to LossHardNegativeMining.
- Returns
loss value
- Return type
torch.Tensor
- class hat.models.losses.hinge_loss.WeightedSquaredHingeLoss(reduction: str, loss_weight: float = 1.0, weight_low_thr: float = 0.1, weight_high_thr: float = 1.0, hard_neg_mining_cfg: Optional[Dict] = None)¶
Weighted Squared ElementWiseHingeLoss.
- Parameters
reduction (str) – Possible values are {‘mean’, ‘sum’, ‘sum_mean’, ‘none’}.
loss_weight (float) – Global weight of loss, by default 1.0.
weight_low_thr (float) – Lower threshold for elementwise weight, by default 0.1.
weight_high_thr (float) – Upper threshold for pixel-wise weight, by default 1.0.
hard_neg_mining_cfg (dict) – Hard negative mining config.
- forward(pred, target, weight=None, avg_factor=None)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.losses.l1_loss.L1Loss(beta: float = 1.0, reduction: str = 'mean', loss_weight: Optional[float] = None, loss_name: Optional[str] = None, reduce_weight_shape=False, skip_neg_weight=False)¶
Smooth L1 Loss.
- Parameters
beta – The threshold in the piecewise function. Defaults to 1.0.
reduction – The method to reduce the loss. Options are “none”, “mean” and “sum”. Defaults to “mean”.
loss_weight – Loss weight.
- forward(pred: torch.Tensor, target: torch.Tensor, weight: Optional[torch.Tensor] = None, avg_factor: Optional[Union[float, torch.Tensor]] = None)¶
Forward function.
- Parameters
pred – The prediction.
target – The learning target of the prediction.
weight – The weight of loss for each prediction. Defaults to None.
avg_factor – Normalized factor.
- class hat.models.losses.lnnorm_loss.LnNormLoss(norm_order: int = 2, epsilon: float = 0.0, power: float = 1.0, reduction: Optional[str] = None, loss_weight: Optional[float] = None)¶
LnNorm loss.
Different from torch.nn.L1Loss, the loss function uses Ln norm to calculate the distance of two feature maps.
- Parameters
norm_order – The order of norm.
epsilon – A small constant for finetune.
power – The power applied to (norm + epsilon) in the loss.
reduction – Reduction mode.
loss_weight – If present, it will be used to weight the output.
- forward(pred: torch.Tensor, target: torch.Tensor, weight: Optional[torch.Tensor] = None, avg_factor: Optional[Union[float, torch.Tensor]] = None) torch.Tensor ¶
Forward method.
- Parameters
pred (Tensor) – Optical flow pred, with shape (N, 2, H, W).
target (Tensor) – Optical flow target, with shape (N, 2, H, W), which is obtained by ground truth sampling.
weight (Tensor) – The weight of loss for each prediction. Default is None.
avg_factor (float) – Normalized factor.
- class hat.models.losses.mse_loss.MSELoss(clip_val: Optional[float] = None, reduction: Optional[str] = None, loss_weight: Optional[float] = None)¶
MSE (mean squared error) loss with clip value.
- Parameters
clip_val – Clip value. If present, it is used to constrain the unweighted loss value between (-clip_val, clip_val). For the clipped entries, the gradient is calculated as if the label value equals prediction +- clip_val.
reduction – Reduction mode.
loss_weight – If present, it will be used to weight the output.
- forward(pred: torch.Tensor, target: torch.Tensor, weight: Optional[torch.Tensor] = None, valid_mask: Optional[torch.Tensor] = None, avg_factor: Optional[Union[float, torch.Tensor]] = None) torch.Tensor ¶
MSE loss between pred and target items.
- Parameters
pred – Predicted output.
target – Target ground truth.
weight – Weight of loss, with the same shape as pred.
valid_mask – Valid mask of loss.
avg_factor – Average factor of loss.
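One way to read the clip_val description is that the difference between prediction and target is limited to [-clip_val, clip_val] before squaring, so a clipped entry behaves as if its label sat exactly clip_val away from the prediction. The scalar sketch below follows that reading and returns both the loss and its gradient with respect to pred. This is an interpretation, not HAT's code, and the function name is hypothetical.

```python
from typing import Optional, Tuple


def clipped_mse(
    pred: float, target: float, clip_val: Optional[float] = None
) -> Tuple[float, float]:
    """Return (loss, d loss / d pred) for one element.

    For clipped entries the label is treated as pred -/+ clip_val,
    so the gradient magnitude is 2 * clip_val instead of growing further.
    """
    diff = pred - target
    if clip_val is not None:
        diff = max(-clip_val, min(clip_val, diff))
    loss = diff * diff
    grad = 2.0 * diff
    return loss, grad


inside = clipped_mse(0.5, 0.0, clip_val=1.0)
clipped = clipped_mse(3.0, 0.0, clip_val=1.0)
unclipped = clipped_mse(3.0, 0.0)
```

The outlier at distance 3 contributes a loss of 1 and a gradient of 2 when clipped, instead of 9 and 6 without clipping.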
- class hat.models.losses.seg_loss.MixSegLoss(losses: List[torch.nn.modules.module.Module], losses_weight: Optional[List[float]] = None, loss_name='mixsegloss')¶
Calculate multi-losses with same prediction and target.
- Parameters
losses – List of losses with the same input pred and target.
losses_weight – List of weights used for loss calculation. Default: None
- forward(pred, target)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.losses.seg_loss.MixSegLossMultipreds(losses: List[torch.nn.modules.module.Module], losses_weight: Optional[List[float]] = None, loss_name: str = 'multipredsloss')¶
Calculate multi-losses with multi-preds and corresponding targets.
- Parameters
losses – List of losses with different predictions and targets.
losses_weight – List of weights used for loss calculation. Default: None
loss_name – Name of the output loss.
- forward(pred, target)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.losses.seg_loss.MultiStrideLosses(num_classes: int, out_strides: List[int], loss: torch.nn.modules.module.Module, loss_weights: Optional[List[float]] = None)¶
Multiple Stride Losses.
Apply the same loss function with different loss weights to multiple outputs.
- Parameters
num_classes – Number of classes.
out_strides – Strides of output feature maps.
loss – Loss module.
loss_weights – Loss weight for each output.
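The idea of applying one loss function to several outputs with per-output weights can be sketched in a few lines of plain Python. The key format "stride_{s}_loss" and the function name are hypothetical, chosen only for this example; the keys HAT actually returns are not documented here.

```python
from typing import Callable, Dict, List, Optional


def multi_stride_losses(
    preds: List[float],
    targets: List[float],
    out_strides: List[int],
    loss_fn: Callable[[float, float], float],
    loss_weights: Optional[List[float]] = None,
) -> Dict[str, float]:
    """Apply `loss_fn` to each (pred, target) pair and weight the result."""
    if loss_weights is None:
        loss_weights = [1.0] * len(out_strides)
    if not (len(preds) == len(targets) == len(out_strides) == len(loss_weights)):
        raise ValueError("preds, targets, out_strides and loss_weights must align")
    return {
        f"stride_{s}_loss": w * loss_fn(p, t)  # hypothetical key format
        for p, t, s, w in zip(preds, targets, out_strides, loss_weights)
    }


def abs_err(p: float, t: float) -> float:
    """Stand-in loss for the example."""
    return abs(p - t)


result = multi_stride_losses(
    preds=[1.0, 2.0],
    targets=[0.0, 0.0],
    out_strides=[8, 16],
    loss_fn=abs_err,
    loss_weights=[1.0, 0.5],
)
```

Here the stride-16 output has twice the raw error but half the weight, so both entries end up equal.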
- forward(preds: List[torch.Tensor], targets: List[torch.Tensor]) Dict[str, torch.Tensor] ¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.losses.seg_loss.SegEdgeLoss(edge_graph: List[List[int]], kernel_half_size: int = 2, ignore_index: int = 255, loss_name: Optional[str] = None, loss_weight: float = 1e-05)¶
- forward(pred, target, weight=None, avg_factor=None)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.losses.seg_loss.SegLoss(loss: List[torch.nn.modules.module.Module])¶
Segmentation loss wrapper.
- Parameters
loss (dict) – loss config.
Note
This class is not universal. Make sure you understand its limitations before using it.
- forward(pred: Any, target: List[Dict]) Dict ¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.losses.smooth_l1_loss.SmoothL1Loss(beta: float = 1.0, reduction: str = 'mean', loss_weight: Optional[float] = None, hard_neg_mining_cfg: Optional[Dict] = None)¶
Smooth L1 Loss.
- Parameters
beta – The threshold in the piecewise function. Defaults to 1.0.
reduction – The method to reduce the loss. Options are “none”, “mean” and “sum”. Defaults to “mean”.
loss_weight – Loss weight.
hard_neg_mining_cfg – Hard negative mining cfg.
- forward(pred: torch.Tensor, target: torch.Tensor, weight: Optional[torch.Tensor] = None, avg_factor: Optional[Union[float, torch.Tensor]] = None)¶
Forward function.
- Parameters
pred – The prediction.
target – The learning target of the prediction.
weight – The weight of loss for each prediction. Defaults to None.
avg_factor – Normalized factor.
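For reference, the piecewise function that beta and reduction control can be written out in plain Python. This sketch follows the commonly used Smooth L1 definition (quadratic below beta, linear above it) and assumes beta > 0. It does not cover weight, avg_factor, loss_weight or hard negative mining, and it is not taken from the HAT source.

```python
from typing import List, Union


def smooth_l1(
    pred: List[float],
    target: List[float],
    beta: float = 1.0,
    reduction: str = "mean",
) -> Union[float, List[float]]:
    """Element-wise Smooth L1 loss over two equally long lists."""
    losses = []
    for p, t in zip(pred, target):
        diff = abs(p - t)
        if diff < beta:
            # Quadratic region: small errors are penalised smoothly.
            losses.append(0.5 * diff * diff / beta)
        else:
            # Linear region: large errors grow like L1.
            losses.append(diff - 0.5 * beta)
    if reduction == "none":
        return losses
    if reduction == "sum":
        return sum(losses)
    if reduction == "mean":
        return sum(losses) / len(losses)
    raise ValueError(f"unknown reduction: {reduction}")


per_element = smooth_l1([0.5, 2.0], [0.0, 0.0], reduction="none")
mean_loss = smooth_l1([0.5, 2.0], [0.0, 0.0])
```

With beta = 1.0, an error of 0.5 falls in the quadratic region and an error of 2.0 in the linear region, so the two branches meet continuously at |pred - target| = beta.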
- class hat.models.losses.yolo_losses.YOLOV3Loss(num_classes: int, anchors: list, strides: list, ignore_thresh: float, loss_xy: dict, loss_wh: dict, loss_conf: dict, loss_cls: dict, lambda_loss: list)¶
The loss module of YOLOv3.
- Parameters
num_classes – Num classes of class branch.
anchors – The anchors of YOLOv3.
strides – The strides of feature maps.
ignore_thresh – Ignore thresh of target.
loss_xy – Losses of xy.
loss_wh – Losses of wh.
loss_conf – Losses of conf.
loss_cls – Losses of cls.
lambda_loss – The list of weighted losses.
- forward(input, target=None)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.multitask_graph_model.MultitaskGraphModel(inputs: Dict[str, Any], task_inputs: Dict[str, Dict[str, Any]], task_modules: Dict[str, torch.nn.modules.module.Module], opt_inputs: Optional[Dict[str, Any]] = None, funnel_modules: Optional[Dict[Tuple[Tuple[str], str], torch.nn.modules.module.Module]] = None, flatten_outputs: bool = True, lazy_forward: Optional[bool] = True, force_cpu_init: Optional[bool] = False, force_eval_init: Optional[bool] = False)¶
Graph model used to construct multitask model structure.
Structures of each task can be declared independently (while some modules are actually shared among multiple tasks), each corresponds to a separately built computational graph.
Then, some other modules that take outputs of multiple tasks as inputs, named as ‘funnel modules’, are called to generate final outputs.
By defining that nodes with the same inputs and shared operator (module) are identical, we can conduct a node merge in the multitask graph in a layer-by-layer manner (implemented as BFS).
This class differs from GraphModel primarily in the graph initialization stage.
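The layer-by-layer node merge described above can be illustrated with a toy example. The sketch below is an assumption-laden simplification: each task is a linear chain of operator names (real tasks are arbitrary graphs of shared modules), a node is identified by its operator name plus its input node, and the function name merge_task_graphs is hypothetical.

```python
from typing import Dict, List, Tuple


def merge_task_graphs(tasks: Dict[str, List[str]], input_name: str = "img"):
    """Merge per-task operator chains that share the same operator and input."""
    nodes: Dict[Tuple[str, str], str] = {}  # (operator, input node) -> node name
    frontier = {task: input_name for task in tasks}  # current node of each task
    depth = 0
    # Walk all chains one layer at a time (breadth-first).
    while any(depth < len(ops) for ops in tasks.values()):
        for task, ops in tasks.items():
            if depth >= len(ops):
                continue
            key = (ops[depth], frontier[task])
            if key not in nodes:
                # First task to reach this (operator, input) pair creates the node.
                nodes[key] = f"{ops[depth]}({frontier[task]})"
            # Every other task with the same pair reuses it.
            frontier[task] = nodes[key]
        depth += 1
    return nodes, frontier


merged_nodes, task_outputs_by_name = merge_task_graphs(
    {
        "det": ["backbone", "neck", "det_head"],
        "seg": ["backbone", "neck", "seg_head"],
    }
)
```

Here the two tasks declare six operator applications in total, but the merged graph holds only four nodes because backbone and neck are shared.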
- Parameters
inputs – key-value pairs used to describe task-agnostic inputs. During initialization, they are used in tracing, to build the topology of the whole computational graph. Generally, keys are strings, while values can be tensor or None (for symbolic mode only).
task_inputs – key-value pairs used to describe task-specific inputs, which function similarly to inputs. The difference is that each task has its own namespace, so they are better represented as {task_name1: task_inputs1, task_name2: task_inputs2, …}.
task_modules – key-value pairs used to describe the model structure of each task.
opt_inputs – key-value pairs used to describe task-agnostic inputs that are optional to the whole graph.
funnel_modules – key-value pairs used to describe “funnel” modules that collect outputs from multiple tasks and generate final results. Each funnel module corresponds to a key structured as (input_names, out_name), which means it “absorbs” (dict pop) outputs keyed by input_names and then pushes back its output keyed by out_name to the output dict.
flatten_outputs – whether to flatten final outputs to NamedTuple, in order to support tracing.
lazy_forward – whether to conduct symbolic tracing or not. If contents of any outputs of a graph node need expanding (for example, query value of a dict with a key), lazy_forward is not available.
force_cpu_init – force to init model on cpu, mainly to avoid GPU OOM when the number of tasks increases.
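The pop-and-push behaviour described for funnel_modules can be sketched with plain dictionaries. This is only an illustration of the (input_names, out_name) key convention: the helper name apply_funnels is hypothetical, and passing the absorbed outputs as positional arguments is an assumption about how a funnel module is called.

```python
from typing import Any, Callable, Dict, Tuple

FunnelKey = Tuple[Tuple[str, ...], str]  # ((input names...), output name)


def apply_funnels(
    outputs: Dict[str, Any],
    funnel_modules: Dict[FunnelKey, Callable[..., Any]],
) -> Dict[str, Any]:
    """Let each funnel absorb some task outputs and push back one result."""
    outputs = dict(outputs)  # work on a copy, leave the caller's dict intact
    for (input_names, out_name), module in funnel_modules.items():
        # "Absorb": the consumed outputs are removed from the output dict.
        absorbed = [outputs.pop(name) for name in input_names]
        # "Push back": the funnel result is stored under out_name.
        outputs[out_name] = module(*absorbed)
    return outputs


raw_outputs = {"det_boxes": [1, 2], "track_ids": [7, 8], "seg_mask": "mask"}
funnels = {
    (("det_boxes", "track_ids"), "tracked_boxes"): lambda boxes, ids: list(zip(ids, boxes)),
}
final_outputs = apply_funnels(raw_outputs, funnels)
```

After the call, det_boxes and track_ids are gone from the result and tracked_boxes takes their place, while seg_mask passes through untouched.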
- forward(inputs: Dict[str, Any], out_names: Optional[Union[str, Sequence[str]]] = None) Union[NamedTuple, Dict] ¶
Forward full or subgraph given output names and input data.
- Parameters
out_names –
Graph output names, should be a subset of self._output_names , i.e. consistent with the keys of name2out returned by self.topology_builder .
If None, the whole graph is forwarded.
If not None, it is used to select a sub graph, which is then forwarded.
inputs –
A dict of (input name, data), should be a subset of self.inputs , providing the input data needed to forward the full or sub graph.
Note
Provide only the inputs that the graph forward actually uses; extra inputs will cause an error.
- get_sub_graph(out_names: Union[str, Sequence[str]]) None ¶
Select part of the graph outputs by out_names to get sub graph.
- Parameters
out_names – Names of graph outputs, should be a subset of self._output_names .
- Returns
A sub graph of self._graph .
- Return type
hatbc.workflow.symbol.Symbol
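Conceptually, selecting a sub graph by output names means keeping only the nodes that the requested outputs depend on. The sketch below shows that idea on a toy dependency dict. It is not the hatbc Symbol API: the function name select_sub_graph and the dict-based graph representation are assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Set, Union


def select_sub_graph(
    graph: Dict[str, List[str]],
    out_names: Union[str, Sequence[str]],
) -> Set[str]:
    """Return every node that the requested outputs depend on."""
    if isinstance(out_names, str):
        out_names = [out_names]
    needed: Set[str] = set()
    stack = list(out_names)
    # Walk backwards from the outputs through each node's inputs.
    while stack:
        node = stack.pop()
        if node in needed:
            continue
        needed.add(node)
        stack.extend(graph.get(node, []))
    return needed


# Toy graph: each node maps to the list of nodes it takes as input.
toy_graph = {
    "img": [],
    "backbone": ["img"],
    "neck": ["backbone"],
    "det_out": ["neck"],
    "seg_out": ["neck"],
}
det_nodes = select_sub_graph(toy_graph, "det_out")
```

Forwarding only det_out therefore never touches the seg_out branch, which is why a sub graph needs only a subset of the full graph's inputs.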
- property graph¶
Full graph which represents GraphModel’s computational topology.
- named_buffers_by_outname(out_names: Tuple[str], prefix: str = '') Tuple[str, Any] ¶
Get all named buffers contained in the sub-graph of out_names.
- named_modules_by_outname(out_names: Tuple[str], prefix: str = '') Tuple[str, Any] ¶
Get all named modules contained in the sub-graph of out_names.
- named_parameters_by_outname(out_names: Tuple[str], prefix: str = '') Tuple[str, Any] ¶
Get all named parameters contained in the sub-graph of out_names.
- property output_names¶
Names of graph output variables.
- split_module(out_names, split_node_name=None, start_node_name='img', common_module_flatten=False)¶
Split the model into two parts: the first is the common part and the second is the split part.
- Parameters
out_names (list) – output names of the model.
split_node_name (str) – the name of the node which is used to split the model, if None, will auto search the graph starting by start_node.
start_node_name (str) – the name of the node to start searching the computation graph.
Note
Due to a limitation of the current implementation, the split node encountered first will be used. See the function ‘get_split_node_v2’ for more details.
- class hat.models.structures.classifier.Classifier(backbone, losses=None, make_backbone_graph=False, num_warmup_iters=3)¶
The basic structure of classifier.
- Parameters
backbone – Backbone module.
losses – Losses module.
make_backbone_graph – whether to use cuda_graph in backbone.
num_warmup_iters – Num of iters for warmup of cuda_graph.
- forward(data, target=None)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.classifier.ClassifierHbirInfer(model_path: str)¶
The basic structure of ClassifierHbirInfer.
- Parameters
model_path – The path of hbir model.
- forward(data)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.encoder_decoder.EncoderDecoder(backbone: torch.nn.modules.module.Module, decode_head: torch.nn.modules.module.Module, target: Optional[object] = None, loss: Optional[torch.nn.modules.module.Module] = None, neck: Optional[torch.nn.modules.module.Module] = None, auxiliary_heads: Optional[List[Dict]] = None, decode: Optional[object] = None, with_target: Optional[torch.nn.modules.module.Module] = False)¶
The basic structure of encoder decoder.
- Parameters
backbone – Backbone module.
decode_head – Decode head module.
target – Target module for decode head. Default: None.
loss – Loss module for decode head. Default: None.
neck – Neck module. Default: None.
auxiliary_heads – List of auxiliary head modules, each of which contains “head”, “target” and “loss”. Default: None.
decode – decode. Default: None.
with_target – Whether return target during inference.
- forward(data: dict)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.encoder_decoder.EncoderDecoderHbirInfer(model_path: str, post_process: Optional[torch.nn.modules.module.Module] = None)¶
The basic structure of EncoderDecoderHbirInfer.
- Parameters
model_path – The path of hbir model.
post_process – Postprocess module.
- forward(data)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.motion_forecasting.MotionForecasting(encoder: torch.nn.modules.module.Module, decoder: torch.nn.modules.module.Module, target: Optional[torch.nn.modules.module.Module] = None, loss: Optional[torch.nn.modules.module.Module] = None, postprocess: Optional[torch.nn.modules.module.Module] = None)¶
The basic structure of motion forecasting.
- Parameters
encoder – encoder module.
decoder – decoder module.
target – target generator.
loss – loss module.
postprocess – post process module.
- forward(data: Dict)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.motion_forecasting.MotionForecastingHbirInfer(model_path: str, pad_batch: int = 30, postprocess: Optional[torch.nn.modules.module.Module] = None)¶
The basic structure of MotionForecastingHbirInfer.
- Parameters
model_path – The path of hbir model.
pad_batch – The padding number for batch data.
postprocess – Postprocess module.
- forward(data)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.segmentor.BMSegmentor(backbone: torch.nn.modules.module.Module, neck: torch.nn.modules.module.Module, head: torch.nn.modules.module.Module, target: Optional[torch.nn.modules.module.Module] = None, loss: Optional[torch.nn.modules.module.Module] = None, desc: Optional[torch.nn.modules.module.Module] = None, postprocess: Optional[torch.nn.modules.module.Module] = None)¶
The segmentor structure that inputs image metas into postprocess.
- forward(data: dict)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.segmentor.Segmentor(backbone, neck, head, losses=None)¶
The basic structure of segmentor.
- Parameters
backbone (torch.nn.Module) – Backbone module.
neck (torch.nn.Module) – Neck module.
head (torch.nn.Module) – Head module.
losses (torch.nn.Module) – Losses module.
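The backbone, neck, head and losses arguments suggest a simple pipeline. The toy class below sketches one plausible composition using plain callables instead of torch.nn.Module instances. The class name ToySegmentor, the data keys "img" and "labels", and the rule for when losses are computed are all assumptions, not the documented behaviour of Segmentor.

```python
from typing import Any, Callable, Dict, Optional


class ToySegmentor:
    """Chain backbone -> neck -> head, optionally followed by a loss."""

    def __init__(
        self,
        backbone: Callable[[Any], Any],
        neck: Callable[[Any], Any],
        head: Callable[[Any], Any],
        losses: Optional[Callable[[Any, Any], Any]] = None,
    ) -> None:
        self.backbone = backbone
        self.neck = neck
        self.head = head
        self.losses = losses

    def __call__(self, data: Dict[str, Any]) -> Any:
        feats = self.backbone(data["img"])  # "img" key is an assumption
        feats = self.neck(feats)
        preds = self.head(feats)
        # Assumed rule: compute losses only when a loss and labels are both given.
        if self.losses is not None and "labels" in data:
            return self.losses(preds, data["labels"])
        return preds


infer_model = ToySegmentor(lambda x: x + 1, lambda x: x * 2, lambda x: x - 3)
prediction = infer_model({"img": 4})

train_model = ToySegmentor(
    lambda x: x + 1, lambda x: x * 2, lambda x: x - 3, losses=lambda p, t: abs(p - t)
)
loss_value = train_model({"img": 4, "labels": 5})
```

Keeping each stage as a separately supplied component is what lets the same structure be reused with different backbones, necks and heads through configuration.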
- forward(data: dict)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.segmentor.SegmentorHbirInfer(model_path)¶
The basic structure of SegmentorHbirInfer.
- Parameters
model_path – The path of hbir model.
- forward(data)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.segmentor.SegmentorV2(backbone: torch.nn.modules.module.Module, neck: torch.nn.modules.module.Module, head: torch.nn.modules.module.Module, target: Optional[torch.nn.modules.module.Module] = None, loss: Optional[torch.nn.modules.module.Module] = None, desc: Optional[torch.nn.modules.module.Module] = None, postprocess: Optional[torch.nn.modules.module.Module] = None)¶
The basic structure of segmentor.
- Parameters
backbone – Backbone module.
neck – Neck module.
head – Head module.
loss – Loss module.
desc – Desc module
postprocess – Postprocess module.
- forward(data: dict)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.view_fusion.ViewFusion(backbone: torch.nn.modules.module.Module, neck: torch.nn.modules.module.Module, view_transformer: Optional[torch.nn.modules.module.Module] = None, temporal_fusion: Optional[torch.nn.modules.module.Module] = None, aux_heads: Optional[List[torch.nn.modules.module.Module]] = None, bev_encoder: Optional[torch.nn.modules.module.Module] = None, bev_decoders: Optional[List[torch.nn.modules.module.Module]] = None, bev_feat_index: int = 0, bev_transforms: Optional[List] = None, bev_upscale: int = 2, compile_model: bool = False)¶
The basic structure of bev.
- Parameters
backbone – Backbone module.
neck – Neck module.
view_transformer – View transformer module for transforming from img view to bev view.
aux_heads – List of auxiliary heads for training.
bev_encoder – Encoder for the feature of bev view. If set to None, bev feature is used for decoders directly.
bev_decoders – Decoder for bev feature.
bev_feat_index – Index for bev feats. Default 0.
bev_transforms – Transforms for bev training.
bev_upscale – Upscale parameter for bev feature.
compile_model – Whether in compile model.
- export_reference_points(data: Dict, feat_wh: Tuple[int, int]) Dict ¶
Export reference points.
- Parameters
data – A dictionary containing the input data.
feat_wh – View transformer input shape for generating reference points.
- Returns
The reference points.
- forward(data: Dict) Tuple[Dict, Dict] ¶
Perform the forward pass of the model.
- Parameters
data – A dictionary containing the input data, including the image and other relevant information.
- Returns
preds – The predictions of the model.
results – A dictionary containing the results of the model.
- fuse_model() None ¶
Perform model fusion on the specified modules within the class.
- img_encode(img: torch.Tensor) torch.Tensor ¶
Encode the input image and return the encoded features.
- Parameters
img – The input image to be encoded.
- Returns
feats – The encoded features of the input image.
- set_qconfig() None ¶
Set the quantization configuration.
- class hat.models.structures.view_fusion.ViewFusion4DHbirInfer(bev_size: List, in_channels: int, num_views: int, **kwargs)¶
The basic structure of ViewFusion4DHbirInfer.
- Parameters
bev_size – The bev size.
in_channels – The number of input channels.
num_views – The number of views.
kwargs – Same as the arguments described in the ViewFusionHbirInfer docstring.
- class hat.models.structures.view_fusion.ViewFusionHbirInfer(model_path: str, deploy_model: Optional[torch.nn.modules.module.Module] = None, model_convert_pipeline: Optional[List[callable]] = None, vt_input_hw: Optional[List[int]] = None, bev_decoder_infers: Optional[List[torch.nn.modules.module.Module]] = None)¶
The basic structure of ViewFusionHbirInfer.
- Parameters
deploy_model – The deploy model to generate refpoints.
model_convert_pipeline – Define the process of model convert.
vt_input_hw – Feature map shape.
model_path – The path of hbir model.
bev_decoder_infers – bev_decoder_infers module.
- forward(data)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.detectors.centerpoint.CenterPointDetector(feature_map_shape: List[int], pre_process: Optional[torch.nn.modules.module.Module] = None, reader: Optional[torch.nn.modules.module.Module] = None, backbone: Optional[torch.nn.modules.module.Module] = None, neck: Optional[torch.nn.modules.module.Module] = None, head: Optional[torch.nn.modules.module.Module] = None, targets: Optional[torch.nn.modules.module.Module] = None, loss: Optional[torch.nn.modules.module.Module] = None, postprocess: Optional[torch.nn.modules.module.Module] = None, quant_begin_neck: bool = False, is_deploy: bool = False)¶
The basic structure of CenterPoint.
- Parameters
feature_map_shape – Feature map shape, in (W, H, 1) format.
pre_process – pre_process module.
reader – reader module.
backbone – Backbone module.
neck – Neck module.
head – Head module.
targets – Target generator module.
loss – Loss module.
postprocess – Postprocess module.
quant_begin_neck – Whether to quantize beginning from neck.
is_deploy – Is deploy model or not.
- forward(example)¶
Perform the forward pass of the model.
- Parameters
example – A dictionary containing the input data: either points or extracted features, depending on the deploy flag.
- Returns
results – Results produced by post_process.
- fuse_model()¶
Fuse quantizable modules in the model.
This function fuses quantizable modules within the model to prepare it for quantization.
- set_calibration_qconfig()¶
Set calibration quantization configurations for the model.
This function is deprecated by calibration_v2.
- set_qconfig()¶
Set quantization configurations for the model.
This function sets quantization configurations for the model and its submodules. It configures quantization settings for different parts of the model based on the quant_begin_neck attribute.
- class hat.models.structures.detectors.centerpoint.CenterPointDetectorHbirInfer(model_path: str, pre_process: torch.nn.modules.module.Module, feature_map_shape: List[int], postprocess: torch.nn.modules.module.Module, tasks: Optional[List[dict]], headkeys: List[str] = ('reg', 'height', 'dim', 'rot', 'vel', 'heatmap'))¶
The basic structure of CenterPointDetectorHbirInfer.
- Parameters
model_path – The path of hbir model.
pre_process – pre_process module.
feature_map_shape – Feature map shape, in (W, H, 1) format.
postprocess – Postprocess module.
headkeys – The keys of head outputs.
tasks – Task information including class number and class names.
- forward(example)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.detectors.detr.Detr(backbone: torch.nn.modules.module.Module, neck: Optional[torch.nn.modules.module.Module] = None, head: Optional[torch.nn.modules.module.Module] = None, criterion: Optional[torch.nn.modules.module.Module] = None, post_process: Optional[torch.nn.modules.module.Module] = None)¶
The basic structure of detr.
- Parameters
backbone – backbone module.
neck – neck module.
head – head module with transformer architecture.
criterion – loss module.
post_process – post process module.
- extract_feat(img)¶
Directly extract features from the backbone + neck.
- forward(data: Dict)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.detectors.detr.DetrHbirInfer(model_path: str, post_process: Optional[torch.nn.modules.module.Module] = None)¶
The basic structure of DetrHbirInfer.
- Parameters
model_path – The path of hbir model.
post_process – Post process module.
- forward(data)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.detectors.detr3d.Detr3d(backbone: torch.nn.modules.module.Module, neck: Optional[torch.nn.modules.module.Module] = None, head: Optional[torch.nn.modules.module.Module] = None, target: Optional[torch.nn.modules.module.Module] = None, post_process: Optional[torch.nn.modules.module.Module] = None, loss_cls: Optional[torch.nn.modules.module.Module] = None, loss_reg: Optional[torch.nn.modules.module.Module] = None, compile_model: bool = False)¶
The basic structure of detr3d.
- Parameters
backbone – backbone module.
neck – neck module.
head – head module with transformer architecture.
target – detr3d target generator.
post_process – post process module.
loss_cls – loss module for classification.
loss_reg – loss module for regression.
compile_model – Whether in compile model.
- export_reference_points(data: Dict, feat_wh: Tuple[int, int]) Dict ¶
Export the reference points.
- Parameters
data – The data used for exporting the reference points.
feat_wh – The size of the feature map.
- Returns
The exported reference points.
- extract_feat(img: torch.Tensor) torch.Tensor ¶
Directly extract features from the backbone + neck.
- Parameters
img – The input image to be encoded.
- Returns
The encoded features of the input image.
- forward(data: Dict) Dict ¶
Perform the forward pass of the model.
- Parameters
data – A dictionary containing the input data.
- Returns
A dictionary containing the output of the forward pass.
- fuse_model() None ¶
Fuse the model.
- set_calibration_qconfig()¶
Set the calibration qconfig.
- set_qconfig() None ¶
Set the qconfig.
- class hat.models.structures.detectors.detr3d.Detr3dHbirInfer(deploy_model: torch.nn.modules.module.Module, model_convert_pipeline: List[callable], vt_input_hw: List[int], model_path: str, post_process: Optional[torch.nn.modules.module.Module] = None)¶
The basic structure of Detr3dHbirInfer.
- Parameters
deploy_model – The deploy model to generate refpoints.
model_convert_pipeline – Define the process of model convert.
vt_input_hw – Feature map shape.
model_path – The path of hbir model.
post_process – Postprocess module.
- forward(data)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.detectors.fcos.FCOS(backbone: torch.nn.modules.module.Module, neck: Optional[torch.nn.modules.module.Module] = None, head: Optional[torch.nn.modules.module.Module] = None, targets: Optional[torch.nn.modules.module.Module] = None, desc: Optional[torch.nn.modules.module.Module] = None, post_process: Optional[torch.nn.modules.module.Module] = None, loss_cls: Optional[torch.nn.modules.module.Module] = None, loss_reg: Optional[torch.nn.modules.module.Module] = None, loss_centerness: Optional[torch.nn.modules.module.Module] = None)¶
The basic structure of fcos.
- Parameters
backbone – Backbone module.
neck – Neck module.
head – Head module.
targets – Target module.
loss_cls – Classification loss module.
loss_reg – Regression loss module.
loss_centerness – Centerness loss module.
desc – Description module.
post_process – Postprocess module.
- extract_feat(img, uv_map=None)¶
Directly extract features from the backbone + neck.
- forward(data: Dict)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.detectors.fcos.FCOSHbirInfer(model_path: str, num_class: int = 80, post_process: Optional[torch.nn.modules.module.Module] = None)¶
The basic structure of FCOSHbirInfer.
- Parameters
model_path – The path of hbir model.
num_class – The number of classes.
post_process – Postprocess module.
- forward(data)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.detectors.fcos3d.FCOS3D(backbone: torch.nn.modules.module.Module, neck: Optional[torch.nn.modules.module.Module] = None, head: Optional[torch.nn.modules.module.Module] = None, targets: Optional[torch.nn.modules.module.Module] = None, post_process: Optional[torch.nn.modules.module.Module] = None, loss: Optional[torch.nn.modules.module.Module] = None)¶
The basic structure of fcos3d.
- Parameters
backbone – Backbone module.
neck – Neck module.
head – Head module.
targets – Target module.
post_process – post_process module.
loss – loss module.
- extract_feat(img)¶
Directly extract features from the backbone + neck.
- forward(data)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.detectors.fcos3d.FCOS3DHbirInfer(model_path: str, post_process: torch.nn.modules.module.Module, strides: Tuple[int])¶
The basic structure of FCOS3DHbirInfer.
- Parameters
model_path – The path of hbir model.
post_process – Postprocess module.
strides – A list of strides.
- forward(data)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.detectors.pointpillars.PointPillarsDetector(feature_map_shape: List[int], pre_process: Optional[torch.nn.modules.module.Module] = None, reader: Optional[torch.nn.modules.module.Module] = None, backbone: Optional[torch.nn.modules.module.Module] = None, neck: Optional[torch.nn.modules.module.Module] = None, head: Optional[torch.nn.modules.module.Module] = None, anchor_generator: Optional[torch.nn.modules.module.Module] = None, targets: Optional[torch.nn.modules.module.Module] = None, loss: Optional[torch.nn.modules.module.Module] = None, postprocess: Optional[torch.nn.modules.module.Module] = None, quant_begin_neck: bool = False, is_deploy: bool = False)¶
The basic structure of PointPillars.
- Parameters
feature_map_shape – Feature map shape, in (W, H, 1) format.
out_size_factor – Downsample factor.
reader – Reader module.
backbone – Backbone module.
neck – Neck module.
head – Head module.
anchor_generator – Anchor generator module.
targets – Target generator module.
loss – Loss module.
postprocess – Postprocess module.
- forward(example)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- fuse_model()¶
Fuse quantizable modules in the model, used in eager mode.
This function fuses quantizable modules within the model to prepare it for quantization.
- set_calibration_qconfig()¶
Set calibration quantization configurations for the model.
This function is deprecated by calibration_v2.
- set_qconfig()¶
Set quantization configurations for the model.
This function sets quantization configurations for the model and its submodules. It configures quantization settings for different parts of the model based on the quant_begin_neck attribute.
- class hat.models.structures.detectors.pointpillars.PointPillarsDetectorHbirInfer(model_path: str, postprocess: torch.nn.modules.module.Module, anchor_generator: torch.nn.modules.module.Module, max_points: int = 150000)¶
The basic structure of PointPillarsDetectorHbirInfer.
- Parameters
model_path – The path of hbir model.
postprocess – Postprocess module.
anchor_generator – The anchor generator module.
max_points – The maximum number of points.
- forward(example)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.detectors.retinanet.RetinaNet(backbone: torch.nn.modules.module.Module, neck: Optional[torch.nn.modules.module.Module] = None, head: Optional[torch.nn.modules.module.Module] = None, filter_module: Optional[torch.nn.modules.module.Module] = None, anchors: Optional[torch.nn.modules.module.Module] = None, targets: Optional[torch.nn.modules.module.Module] = None, post_process: Optional[torch.nn.modules.module.Module] = None, loss_cls: Optional[torch.nn.modules.module.Module] = None, loss_reg: Optional[torch.nn.modules.module.Module] = None)¶
The basic structure of retinanet.
- Parameters
backbone – backbone module or dict for building backbone module.
neck – neck module or dict for building neck module.
head – head module or dict for building head module.
anchors – anchors module or dict for building anchors module.
targets – targets module or dict for building target module.
post_process – post_process module or dict for building post_process module.
loss_cls – loss_cls module or dict for building loss_cls module.
loss_reg – loss_reg module or dict for building loss_reg module.
- forward(data: Dict)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.detectors.retinanet.RetinaNetHbirInfer(model_path: str, anchors: torch.nn.modules.module.Module, post_process: torch.nn.modules.module.Module, split_dim: List[int], featsizes: List[List[int]])¶
The basic structure of RetinaNetHbirInfer.
- Parameters
model_path – The path of the hbir model.
anchors – The AnchorGenerator.
post_process – Postprocess module.
split_dim – The dimensions used for splitting.
featsizes – The sizes of the feature maps.
- forward(data)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.detectors.yolov3.YOLOHbirInfer(model_path: str, postprocess: Optional[torch.nn.modules.module.Module] = None)¶
The basic structure of YOLOHbirInfer.
- Parameters
model_path – The path of hbir model.
postprocess – Postprocess module.
- forward(data)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.detectors.yolov3.YOLOV3(backbone: Optional[dict] = None, neck: Optional[dict] = None, head: Optional[dict] = None, filter_module: Optional[dict] = None, anchor_generator: Optional[dict] = None, target_generator: Optional[dict] = None, loss: Optional[dict] = None, postprocess: Optional[dict] = None)¶
The basic structure of yolov3.
- Parameters
backbone – Backbone module.
neck – Neck module.
head – Head module.
anchor_generator – Anchor generator module.
target_generator – Target generator module.
loss – Loss module.
postprocess – Postprocess module.
- forward(data)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.disparity_pred.stereonet.StereoNet(backbone: torch.nn.modules.module.Module, neck: Optional[torch.nn.modules.module.Module] = None, head: Optional[torch.nn.modules.module.Module] = None, post_process: Optional[torch.nn.modules.module.Module] = None, loss: Optional[torch.nn.modules.module.Module] = None, loss_weights: Optional[List[float]] = None)¶
The basic structure of StereoNet.
- Parameters
backbone – backbone module.
neck – neck module.
head – head module.
post_process – post_process module.
loss – loss module.
loss_weights – loss weights for each feature.
- forward(data: Dict) Union[List, Dict] ¶
Perform the forward pass of the model.
- Parameters
data – The input data.
- fuse_model() None ¶
Perform model fusion on the specified modules within the class.
- set_qconfig() None ¶
Set the quantization configuration.
- class hat.models.structures.disparity_pred.stereonet.StereoNetHbirInfer(model_path: str, post_process: torch.nn.modules.module.Module)¶
The basic structure of StereoNetHbirInfer.
- Parameters
model_path – The path of hbir model.
post_process – Postprocess module.
- forward(data)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.disparity_pred.stereonet.StereoNetPlus(backbone: torch.nn.modules.module.Module, neck: Optional[torch.nn.modules.module.Module] = None, head: Optional[torch.nn.modules.module.Module] = None, post_process: Optional[torch.nn.modules.module.Module] = None, loss: Optional[torch.nn.modules.module.Module] = None, loss_weights: Optional[List[float]] = None, num_fpn_feat: int = 3)¶
The basic structure of StereoNetPlus.
- Parameters
backbone – backbone module.
neck – neck module.
head – head module.
post_process – post_process module.
loss – loss module.
loss_weights – loss weights for each feature.
num_fpn_feat – the number of feature maps that use fpn.
- forward(data: Dict) Union[List, Dict] ¶
Perform the forward pass of the model.
- Parameters
data – The input data.
- class hat.models.structures.keypoints.keypoint_model.HeatmapKeypointHbirInfer(model_path: str, post_process: torch.nn.modules.module.Module)¶
The basic structure of HeatmapKeypointHbirInfer.
- Parameters
model_path – The path of hbir model.
post_process – Postprocess module.
- forward(data)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.keypoints.keypoint_model.HeatmapKeypointModel(backbone: torch.nn.modules.module.Module, decode_head: torch.nn.modules.module.Module, loss: Optional[torch.nn.modules.module.Module] = None, post_process: Optional[torch.nn.modules.module.Module] = None, deploy: bool = False)¶
HeatmapKeypointModel is a model for keypoint detection using heatmaps.
- Parameters
backbone – Backbone network used for feature extraction.
decode_head – Decode head that upsamples the features to generate the heatmap.
loss – Loss function that computes the loss.
post_process – Module that decodes keypoint predictions from the heatmap.
deploy – Flag indicating whether the model is used for deployment or training.
- forward(data: Dict)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.lane_pred.ganet.GaNet(backbone: torch.nn.modules.module.Module, neck: Optional[torch.nn.modules.module.Module] = None, head: Optional[torch.nn.modules.module.Module] = None, targets: Optional[torch.nn.modules.module.Module] = None, post_process: Optional[torch.nn.modules.module.Module] = None, losses: Optional[torch.nn.modules.module.Module] = None)¶
The basic structure of GaNet.
- Parameters
backbone – Backbone module.
neck – Neck module.
head – Head module.
targets – Target module.
post_process – Post process module.
losses – Loss module.
- forward(data: Dict)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.lane_pred.ganet.GaNetHbirInfer(model_path: str, post_process: torch.nn.modules.module.Module)¶
The basic structure of GaNetHbirInfer.
- Parameters
model_path – The path of hbir model.
post_process – Postprocess module.
- forward(data)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.lidar_multitask.lidar_multitask.LidarMultiTask(feature_map_shape: List[int], pre_process: Optional[torch.nn.modules.module.Module] = None, reader: Optional[torch.nn.modules.module.Module] = None, scatter: Optional[torch.nn.modules.module.Module] = None, backbone: Optional[torch.nn.modules.module.Module] = None, neck: Optional[torch.nn.modules.module.Module] = None, lidar_decoders: Optional[List[torch.nn.modules.module.Module]] = None, quant_begin_backbone: bool = False, is_deploy: bool = False)¶
The basic structure of LidarMultiTask.
- Parameters
feature_map_shape – Feature map shape, in (W, H, 1) format.
pre_process – Pre-process module.
reader – Reader module.
scatter – Scatter module.
backbone – Backbone module.
neck – Neck module.
lidar_decoders – List of Lidar Decoder modules.
quant_begin_backbone – Whether to quantize beginning from the backbone.
is_deploy – Whether this is a deploy model.
- forward(example)¶
Forward pass through the LidarMultiTask model.
- Parameters
example – Input data dictionary containing “points” and other relevant information.
- Returns
preds: Model predictions. results: Additional results if available.
- fuse_model()¶
Fuse model operations for quantization.
- set_qconfig()¶
Set quantization configuration for the model.
- class hat.models.structures.lidar_multitask.lidar_multitask.LidarMultiTaskHbirInfer(model_path: str, pre_process: torch.nn.modules.module.Module, feature_map_shape: List[int], lidar_decoders: List[torch.nn.modules.module.Module])¶
The basic structure of LidarMultiTaskHbirInfer.
- Parameters
model_path – The path of the hbir model.
pre_process – pre_process module.
feature_map_shape – Feature map shape, in (W, H, 1) format.
lidar_decoders – List of Lidar decoder modules.
- forward(example)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.opticalflow.pwcnet.PwcNet(backbone: torch.nn.modules.module.Module, neck: Optional[torch.nn.modules.module.Module] = None, head: Optional[torch.nn.modules.module.Module] = None, loss: Optional[torch.nn.modules.module.Module] = None, loss_weights: Optional[List[float]] = None)¶
The basic structure of PWCNet.
- Parameters
backbone – backbone module or dict for building backbone module.
neck – neck module or dict for building neck module.
head – head module or dict for building head module.
loss – loss module or dict for building loss module.
loss_weights – loss weights for each feature.
- forward(data: Dict)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.opticalflow.pwcnet.PwcNetHbirInfer(model_path: str)¶
The basic structure of PwcNetHbirInfer.
- Parameters
model_path – The path of hbir model.
- forward(data)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.track_pred.motr.Motr(backbone: torch.nn.modules.module.Module, neck: Optional[torch.nn.modules.module.Module] = None, head: Optional[torch.nn.modules.module.Module] = None, criterion: Optional[torch.nn.modules.module.Module] = None, post_process: Optional[torch.nn.modules.module.Module] = None, track_embed: Optional[torch.nn.modules.module.Module] = None, compile_motr: bool = False, compile_qim: bool = False, num_query_h: int = 2, batch_size: int = 1)¶
The basic structure of Motr.
- Parameters
backbone – backbone module.
neck – neck module.
head – head module with transformer architecture.
criterion – loss module.
post_process – post process module.
track_embed – track embed module.
compile_motr – Whether to compile the motr model.
compile_qim – Whether to compile the qim model.
num_query_h – The size of the h dim used for query reshape.
batch_size – batch size.
- extract_feat(img)¶
Directly extract features from the backbone + neck.
- forward(data: Dict)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.structures.track_pred.motr.MotrHbirInfer(model_path: str, qim_model_path: str, post_process: Optional[torch.nn.modules.module.Module] = None, num_query_h: int = 2, batch_size: int = 1, num_queries: int = 256, queries_dim: int = 256, LoadCheckpoint: Optional[Callable] = None, num_classes: int = 1)¶
The basic structure of MotrHbirInfer.
- Parameters
model_path – The path of the hbir model.
qim_model_path – The path of the qim hbir model.
post_process – post process module.
num_query_h – The size of the h dim used for query reshape.
batch_size – batch size.
num_queries – The number of queries.
queries_dim – The dimension of each query.
LoadCheckpoint – LoadCheckpoint func.
num_classes – The number of classes.
- forward(data: Dict)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.model_convert.converters.FixWeightQScale¶
Fix the qscale of weights during the calibration or qat stage.
- class hat.models.model_convert.converters.Float2Calibration(convert_mode='eager', hybrid=False, hybrid_dict=None, optimize_graph=False, qconfig_setter=None, example_inputs=None, example_data_loader=None, batch_transforms=None)¶
Define the process of converting a float model to a calibration model.
- Parameters
convert_mode – convert mechanism, can be chosen from (‘eager’, ‘symbolic’, ‘jit’, ‘jit-strip’).
hybrid – only used when convert_mode == ‘symbolic’; please refer to the doc of horizon.quantization.prepare_qat_fx for more info.
hybrid_dict – only used when convert_mode == ‘symbolic’; please refer to the doc of horizon.quantization.prepare_qat_fx for more info.
optimize_graph – whether to apply extra processing to the original model for special purposes. Currently only supports using torch.fx to fix the cat input scale (only used on Bernoulli).
qconfig_setter – set qconfig automatically. The value is a qconfig setter in horizon_plugin_pytorch.quantization.qconfig_template.
example_inputs – example inputs for tracing the graph. When using the ‘jit’/‘jit-strip’ convert_mode or a template qconfig setter, one of example_inputs and example_data_loader should be provided.
example_data_loader – example data loader used to get example inputs for tracing the graph. When using the ‘jit’/‘jit-strip’ convert_mode or a template qconfig setter, one of example_inputs and example_data_loader should be provided.
batch_transforms – batch transforms applied to the example data loader.
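The requirement on example_inputs and example_data_loader described above can be sketched as a small check. This is an illustrative helper only: `check_example_source` is a hypothetical name, and treating any non-None qconfig_setter as a "template qconfig setter" is an assumption rather than the converter's actual logic.

```python
def check_example_source(convert_mode="eager", qconfig_setter=None,
                         example_inputs=None, example_data_loader=None):
    """Return True when example data is required (and present), else False."""
    # Assumption: any provided qconfig_setter counts as a template setter.
    needs_example = (
        convert_mode in ("jit", "jit-strip") or qconfig_setter is not None
    )
    has_example = (
        example_inputs is not None or example_data_loader is not None
    )
    if needs_example and not has_example:
        raise ValueError(
            "one of example_inputs and example_data_loader should be provided"
        )
    return needs_example


eager_ok = check_example_source(convert_mode="eager")
jit_ok = check_example_source(convert_mode="jit", example_inputs=[0.0])
```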
- class hat.models.model_convert.converters.Float2QAT(convert_mode='eager', hybrid=False, hybrid_dict=None, optimize_graph=False, qconfig_setter=None, example_inputs=None, example_data_loader=None, batch_transforms=None, state='val')¶
Define the process of converting a float model to a qat model.
- Parameters
convert_mode – convert mechanism, can be chosen from (‘eager’, ‘symbolic’, ‘jit’, ‘jit-strip’).
hybrid – only used when convert_mode == ‘symbolic’; please refer to the doc of horizon.quantization.prepare_qat_fx for more info.
hybrid_dict – only used when convert_mode == ‘symbolic’; please refer to the doc of horizon.quantization.prepare_qat_fx for more info.
optimize_graph – whether to apply extra processing to the original model for special purposes. Currently only supports using torch.fx to fix the cat input scale (only used on Bernoulli).
qconfig_setter – set qconfig automatically. The value is a qconfig setter in horizon_plugin_pytorch.quantization.qconfig_template.
example_inputs – example inputs for tracing the graph. When using the ‘jit’/‘jit-strip’ convert_mode or a template qconfig setter, one of example_inputs and example_data_loader should be provided.
example_data_loader – example data loader used to get example inputs for tracing the graph. When using the ‘jit’/‘jit-strip’ convert_mode or a template qconfig setter, one of example_inputs and example_data_loader should be provided.
batch_transforms – batch transforms applied to the example data loader.
state – model state when tracing. Can be chosen from (‘train’, ‘val’).
- class hat.models.model_convert.converters.GraphModelInputKeyMapping(input_key_mapping: Dict[str, str])¶
Mapping input key in graph model for deploy mode.
- class hat.models.model_convert.converters.GraphModelSplit(split_nodes: List[str], next_bases: List[str], save_models: Optional[List[str]] = None, pick_models_index: Optional[int] = None)¶
Split graph model in deploy mode.
- class hat.models.model_convert.converters.LoadCheckpoint(checkpoint_path: str, state_dict_update_func: Optional[Callable] = None, check_hash: bool = True, allow_miss: bool = False, ignore_extra: bool = False, ignore_tensor_shape: bool = False, verbose: bool = False, enable_tracking: bool = False)¶
Load the checkpoint from file to model and return the checkpoint.
LoadCheckpoint usually happens before or after BaseConverter, which means the model needs to load parameters before or after BaseConverter.
- Parameters
checkpoint_path – Path of the checkpoint file.
state_dict_update_func – state_dict update function. The input of the function is a state_dict; the output is the modified state_dict.
check_hash – Whether to check the file hash.
allow_miss – Whether to allow missing keys while loading the state dict.
ignore_extra – Whether to ignore extra keys while loading the state dict.
ignore_tensor_shape – Whether to ignore tensors whose key name matches but whose shape does not while loading the state dict.
verbose – Whether to show unexpect_key and miss_key info.
return_checkpoint – Whether to return the values of the checkpoint.
enable_tracking – Whether to enable checkpoint tracking.
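A state_dict_update_func as described above takes a state_dict and returns a modified one. The sketch below shows one possible shape for such a function; `strip_prefix` is a hypothetical helper, and a plain OrderedDict with integer values stands in for a real state dict of tensors.

```python
from collections import OrderedDict


def strip_prefix(state_dict, prefix="module."):
    """Return a new state_dict with `prefix` removed from matching keys."""
    updated = OrderedDict()
    for key, value in state_dict.items():
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        updated[new_key] = value
    return updated


# Stand-in state dict; real values would be tensors.
example = OrderedDict(
    [("module.backbone.conv.weight", 1), ("head.bias", 2)]
)
renamed = strip_prefix(example)
```

A function of this form could presumably be passed as `state_dict_update_func=strip_prefix`; it returns a new mapping and leaves the input untouched.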
- class hat.models.model_convert.converters.LoadHbir(path)¶
Load hbir module from file.
- Parameters
path – hbir model path.
- class hat.models.model_convert.converters.LoadMeanTeacherCheckpoint(checkpoint_path: str, strip_prefix: str = 'module.', state_dict_update_func: Optional[Callable] = None, check_hash: bool = True, allow_miss: bool = False, ignore_extra: bool = False, verbose: bool = False)¶
Load the Mean-teacher model checkpoint.
The student and teacher models have the same structure. LoadMeanTeacherCheckpoint usually happens before or after BaseConverter, which means the model needs to load parameters before or after BaseConverter.
- Parameters
checkpoint_path – Path of the checkpoint file.
state_dict_update_func – state_dict update function. The input of the function is a state_dict; the output is the modified state_dict.
check_hash – Whether to check the file hash.
allow_miss – Whether to allow missing keys while loading the state dict.
ignore_extra – Whether to ignore extra keys while loading the state dict.
verbose – Whether to show unexpect_key and miss_key info.
return_checkpoint – Whether to return the values of the checkpoint.
- class hat.models.model_convert.converters.QATFusePartBN(qat_fuse_patterns: List[str], fuse_method: str = 'fuse_norm', regex: bool = True, strict: bool = False)¶
Define the process of fusing bn in a QAT model.
Usually used in the step that fuses bn. Note that a module fuses bn only when the block implements block.“fuse_method”().
- Parameters
qat_fuse_patterns – Regex patterns, compiled by re.
fuse_method – The fuse bn method that the block calls.
regex – Whether to match by regex. If not, match by module name.
strict – Whether every regular expression is required to be matched.
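The regex and strict options above can be pictured with a small name-matching sketch. `select_modules` is a hypothetical helper; the use of `re.match` and the choice to raise when a pattern matches nothing under strict=True are assumptions, not the converter's confirmed behavior.

```python
import re


def select_modules(module_names, patterns, regex=True, strict=False):
    """Collect module names that match any pattern, preserving order."""
    selected = []
    for pattern in patterns:
        if regex:
            hits = [n for n in module_names if re.match(pattern, n)]
        else:
            # Without regex, a pattern must equal the module name exactly.
            hits = [n for n in module_names if n == pattern]
        if strict and not hits:
            raise ValueError(f"pattern {pattern!r} matched no module")
        for name in hits:
            if name not in selected:
                selected.append(name)
    return selected


names = ["backbone.stage1.conv", "backbone.stage2.conv", "head.cls"]
by_regex = select_modules(names, [r"backbone\.stage\d+"])
by_name = select_modules(names, ["head.cls"], regex=False)
```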
- class hat.models.model_convert.converters.RepModel2Deploy¶
Convert Reparameterized model to deploy mode.
- class hat.models.model_convert.converters.Torch2Compile(compile_submodules: List[str] = None, skip_modules: List[str] = None, regex: bool = True, strict: bool = False, dynamo_cfg: Optional[Dict] = None, **kwargs)¶
Compile model(nn.Module) by torch.compile() in torch>=2.0.
Note
compile_submodules and skip_modules are mutually exclusive; only one of them can be used. If neither is used, the entire model will be compiled.
- Parameters
compile_submodules – Modules to compile; supports regex or module name.
skip_modules – Modules to skip when compiling; supports regex or module name.
regex – Whether to match by regex. If not, match by module name.
strict – Whether every regular expression is required to be matched.
dynamo_cfg – A dictionary of options to set torch._dynamo.config.
kwargs – Args of the torch.compile interface, see: https://pytorch.org/docs/stable/generated/torch.compile.html#torch.compile
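The mutually exclusive relationship between compile_submodules and skip_modules can be summarized in a short decision sketch. `compile_plan` and the returned labels are hypothetical and only mirror the three cases stated in the note; they are not part of the converter.

```python
def compile_plan(compile_submodules=None, skip_modules=None):
    """Return a (mode, names) pair describing what would be compiled."""
    if compile_submodules and skip_modules:
        raise ValueError(
            "compile_submodules and skip_modules are mutually exclusive"
        )
    if compile_submodules:
        return ("compile_selected", list(compile_submodules))
    if skip_modules:
        return ("compile_all_except", list(skip_modules))
    # Neither option given: the entire model is compiled.
    return ("compile_whole_model", [])


whole = compile_plan()
selected = compile_plan(compile_submodules=["backbone"])
skipped = compile_plan(skip_modules=["head"])
```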
- static compile_modules(model: torch.nn.modules.module.Module, compile_submodules: List[str], regex: bool = True, strict: bool = False, **kwargs)¶
Add a wrap hook to compile submodule.
- Parameters
model – Model to add the hook to.
compile_submodules – Submodules to compile; supports regex or module name.
regex – Whether to match by regex. If not, match by module name.
strict – Whether every regular expression is required to be matched.
- static skip_compile_modules(model: torch.nn.modules.module.Module, skip_modules: List[str], regex: bool = True, strict: bool = False)¶
Add a wrap hook to skip compile.
- Parameters
model – Model to add hook.
skip_modules – Module to skip compile, support regex or module name.
regex – Whether to match by regex. if not, match by module name.
strict – Whether regular expression is required to be all matched.
- class hat.models.model_convert.converters.TorchCompile(compile_backend: Optional[Union[str, Callable]] = None, load_extensions: Optional[Union[List[str], str]] = None)¶
Convert torch module to compile wrap module.
NOTE: Compilation occurs at the first model forward, so the first call is expected to be slower.
- Parameters
compile_backend – TorchDynamo compile optimizer backend.
load_extensions – Load extension from hat.utils.trt_fx_extension.py.
- class hat.models.model_convert.pipelines.FloatQatConvertPipeline(qat_mode: str, enable_qat: Optional[bool] = True, enable_calibraion: Optional[bool] = False, checkpoint_mode: Optional[str] = None, checkpoint_configs: Optional[Dict] = None, qconfig_params: Optional[Dict] = None)¶
Convert pipeline for QAT Fuse BN case.
This convert pipeline is created to simplify configurations of float-float_freeze_bn-qat training.
This class works closely with LoadCheckpoint converter, please refer to the documents for more detail.
- Parameters
qat_mode – whether bn needs to be fused or not.
enable_qat – whether to convert model to QAT.
checkpoint_mode – can be “resume” or “pre_step”, or left None, when no checkpoint provided. “resume” corresponds to the case where the provided checkpoint is saved from a module in current training stage, while “pre_step” the previous stage. Further details of the checkpoint loading (such as how to deal with missed or extra parameters in checkpoint) should be specified in “checkpoint_configs” arg.
checkpoint_configs – specify the checkpoint loading details, such as checkpoint_path, allow_miss, ignore_extra… During initialization, value of this arg is directly passed to LoadCheckpoint converter, please refer to its document for details.
qconfig_params – the params of qat config.
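A configuration for this pipeline might look like the fragment below. It is a sketch under assumptions: the registry-style `type` key, the `"fuse_bn"` value for qat_mode, and the checkpoint path are illustrative placeholders that are not given in this section; only the argument names and the “resume”/“pre_step” options come from the description above.

```python
# All values are illustrative placeholders (assumptions), not HAT defaults.
convert_pipeline = dict(
    type="FloatQatConvertPipeline",  # assumes a registry-style `type` key
    qat_mode="fuse_bn",              # placeholder; valid values are not listed here
    enable_qat=True,
    checkpoint_mode="pre_step",      # checkpoint was saved in the previous stage
    checkpoint_configs=dict(         # forwarded to the LoadCheckpoint converter
        checkpoint_path="/path/to/float-checkpoint.pth.tar",  # hypothetical path
        allow_miss=True,
        ignore_extra=True,
    ),
)
```

With checkpoint_mode set to “resume”, the same structure would instead point checkpoint_path at a checkpoint saved during the current training stage.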
- class hat.models.model_convert.pipelines.QATFuseBNConvertPipeline(qat_mode: str, pre_stage_fuse_patterns: List[hat.models.model_convert.converters.BaseConverter], cur_stage_fuse_patterns: List[hat.models.model_convert.converters.BaseConverter], fuse_part_configs: Optional[Dict] = None, checkpoint_mode: Optional[str] = None, checkpoint_configs: Optional[Dict] = None, qconfig_params: Optional[Dict] = None)¶
Convert pipeline for QAT Fuse BN case.
This convert pipeline is created to simplify configurations of QAT Fuse BN training. As the name indicates, this pipeline works only with QAT training. In each training stage, BatchNorms from some user-specified parts of the whole model are fused into the nearest Convs.
This class works closely with QATFusePartBN and LoadCheckpoint converter, please refer to the documents for more detail.
- Parameters
qat_mode – whether bn needs to be fused or not.
pre_stage_fuse_patterns – specify which parts of the module should be fused in previous stage.
cur_stage_fuse_patterns – specify which parts of the module should be fused in current stage.
fuse_part_configs – specify the kwargs of the QATFusePartBN converter, please refer to its document for details.
checkpoint_mode – can be “resume” or “pre_step”, or left None, when no checkpoint provided. “resume” corresponds to the case where the provided checkpoint is saved from a module in current training stage, while “pre_step” the previous stage. Further details of the checkpoint loading (such as how to deal with missed or extra parameters in checkpoint) should be specified in “checkpoint_configs” arg.
checkpoint_configs – specify the checkpoint loading details, such as checkpoint_path, allow_miss, ignore_extra… During initialization, value of this arg is directly passed to LoadCheckpoint converter, please refer to its document for details.
qconfig_params – the params of qat config.
- class hat.models.necks.bifpn.BiFPN(in_strides: List[int], out_strides: int, stride2channels: Dict, out_channels: Union[int, Dict], num_outs: int, stack: int = 3, start_level: int = 0, end_level: int = - 1, fpn_name: str = 'bifpn_sum', upsample_type: str = 'module', use_fx: bool = False)¶
Weighted Bi-directional Feature Pyramid Network(BiFPN).
This is an implementation of - EfficientDet: Scalable and Efficient Object Detection (https://arxiv.org/abs/1911.09070)
- Parameters
in_strides – Stride of input feature map.
out_strides – Stride of output feature map.
stride2channels – The key:value is stride:channel; the channels have already been multiplied by alpha.
out_channels – Channel number of output layer, the key:value is stride:channel.
num_outs – Number of BifpnLayer’s inputs; the value must be 5 because the bifpn layer is fixed.
stack – Number of BifpnLayer.
start_level – Index of the start input backbone level used to build the feature pyramid. Default: 0.
end_level – Index of the end input backbone level (exclusive) to build the feature pyramid. Default: -1, means the last level.
fpn_name – the value must be one of ‘bifpn_sum’ or ‘bifpn_fa’.
upsample_type – whether to upsample with a module or a function; the candidates are [‘module’, ‘function’].
use_fx – Whether to use fx mode qat. Default: False.
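The start_level and end_level semantics above (exclusive end, with -1 meaning the last level) can be sketched as a slice over the backbone features. `select_levels` is a hypothetical helper, and mapping -1 to the full length is an assumption based on the parameter description rather than on the implementation.

```python
def select_levels(features, start_level=0, end_level=-1):
    """Pick the backbone levels used to build the feature pyramid."""
    # Assumption: -1 means "up to and including the last level".
    stop = len(features) if end_level == -1 else end_level
    if not 0 <= start_level < stop <= len(features):
        raise ValueError("invalid start_level/end_level")
    return features[start_level:stop]


feats = ["c2", "c3", "c4", "c5"]  # stand-ins for feature tensors
from_second = select_levels(feats, start_level=1)
first_three = select_levels(feats, end_level=3)
```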
- forward(inputs)¶
Forward features.
- Parameters
inputs (list[tensor]) – Input tensors
- Returns
Output tensors (list[tensor])
- class hat.models.necks.dw_unet.DwUnet(base_channels: int, bn_kwargs: Optional[Dict] = None, act_type: torch.nn.modules.module.Module = <class 'torch.nn.modules.activation.ReLU'>, use_deconv: bool = False, dw_with_act: bool = False, output_scales: Sequence = (4, 8, 16, 32, 64))¶
Unet segmentation neck structure.
Built with separable convolution layers.
- Parameters
base_channels (int) – Output channel number of the output layer of scale 1.
bn_kwargs (Dict, optional) – Keyword arguments for BN layer. Defaults to {}.
use_deconv (bool, optional) – Whether to use deconv for the upsampling layer. Defaults to False.
dw_with_act (bool, optional) – Whether to use relu after the depthwise conv in SeparableConv. Defaults to False.
output_scales (Sequence, optional) – The scale of each output layer. Defaults to (4, 8, 16, 32, 64).
- forward(inputs)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.necks.fast_scnn.FastSCNNNeck(in_channels: List[int], feat_channels: List[int], indexes: List[int], bn_kwargs: Optional[Dict] = None, scale_factor: int = 4, split_pooling: bool = False)¶
Upper neck module for segmentation.
- Parameters
in_channels – channels of each input feature map.
feat_channels – channels for feature maps.
indexes – indexes of inputs.
bn_kwargs – Dict for Bn layer.
scale_factor – scale factor for fusion.
split_pooling – Whether to split pooling. For bernoulli2.
- forward(inputs)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.necks.fpn.FPN(in_strides: List[int], in_channels: List[int], out_strides: List[int], out_channels: List[int], fix_out_channel: Optional[int] = None, bn_kwargs: Optional[Dict] = None)¶
- forward(features: List[torch.Tensor]) List[torch.Tensor] ¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.necks.pafpn.PAFPN(in_channels, out_channels, out_strides, num_outs, start_level=0, end_level=- 1, add_extra_convs=False, relu_before_extra_convs=False, norm_cfg=None)¶
Path Aggregation Network for Instance Segmentation.
This is an implementation of the PAFPN in Path Aggregation Network <https://arxiv.org/abs/1803.01534>.
- Parameters
in_channels (List[int]) – Number of input channels per scale.
out_channels (int | Dict) – Output channels of each scale
out_strides (List[int]) – Stride of output feature map
num_outs (int) – Number of output scales.
start_level (int) – Index of the start input backbone level used to build the feature pyramid. Default: 0.
end_level (int) – Index of the end input backbone level (exclusive) to build the feature pyramid. Default: -1, which means the last level.
add_extra_convs (bool | str) –
If bool, it decides whether to add conv layers on top of the original feature maps. Default to False. If True, it is equivalent to add_extra_convs=’on_input’. If str, it specifies the source feature map of the extra convs. Only the following options are allowed:
’on_input’: Last feat map of neck inputs (i.e. backbone feature).
’on_lateral’: Last feature map after lateral convs.
’on_output’: The last output feature map after fpn convs.
relu_before_extra_convs (bool) – Whether to apply relu before the extra conv. Default: False.
norm_cfg (dict) – A dict of norm layer configuration. A typical norm_cfg can be {“norm_type”: “gn”, “num_groups”: 32, “affine”: True} or {“norm_type”: “bn”}. Default: None. If norm_cfg is none, no norm layer is used. If norm_cfg[“norm_type”] == “gn”, the group norm layer is used. If norm_cfg[“norm_type”] == “bn”, the batch norm layer is used.
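The add_extra_convs rule described above (a bool or one of three strings, with True equivalent to ’on_input’) can be sketched as a normalization step. `normalize_add_extra_convs` is a hypothetical helper, and returning None for False is a choice made for this sketch only.

```python
ALLOWED_SOURCES = ("on_input", "on_lateral", "on_output")


def normalize_add_extra_convs(value):
    """Map a bool or str add_extra_convs value to a source name or None."""
    if isinstance(value, bool):
        # True is documented as equivalent to 'on_input'.
        return "on_input" if value else None
    if isinstance(value, str):
        if value not in ALLOWED_SOURCES:
            raise ValueError(f"unsupported add_extra_convs: {value!r}")
        return value
    raise TypeError("add_extra_convs must be a bool or a str")


from_true = normalize_add_extra_convs(True)
from_false = normalize_add_extra_convs(False)
from_str = normalize_add_extra_convs("on_lateral")
```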
- forward(inputs)¶
Forward function.
- class hat.models.necks.pafpn.VargPAFPN(in_channels: List[int], out_channels: int, out_strides: List[int], num_outs: int, bn_kwargs: Dict, start_level: int = 0, end_level: int = - 1, with_pafpn_conv: bool = False, varg_block_type: str = 'BasicMixVarGEBlock', group_base: int = 16)¶
Path Aggregation Network with BasicVargNetBlock or BasicMixVargNetBlock.
- Parameters
in_channels – Number of input channels per scale.
out_channels – Output channels of each scale.
out_strides – Stride of each output feature map.
num_outs – Number of output scales.
bn_kwargs – Dict for BN layer.
start_level – Index of the start input backbone level used to build the feature pyramid. Default is 0.
end_level – Index of the end input backbone level used to build the feature pyramid. Default is -1, which means the last level.
with_pafpn_conv – Whether to apply an extra 3x3 conv_block to the output features. Default is False.
varg_block_type – Varg block type, one of ["BasicVarGBlock", "BasicMixVarGEBlock"]. Default is "BasicMixVarGEBlock".
group_base – Group base for the varg block. Default is 16.
- forward(inputs)¶
Forward function.
- class hat.models.necks.retinanet_fpn.RetinaNetFPN(in_strides: List[int], in_channels: List[int], out_strides: List[int], out_channels: List[int], fix_out_channel: Optional[int] = None)¶
FPN for RetinaNet.
The difference from FPN is that, in addition to the lateral convs, RetinaNetFPN has two extra convs corresponding to stride 64 and stride 128.
- Parameters
in_strides (list) – Strides of each input feature map.
in_channels (list) – Channels of each input feature map. The length of in_channels should be equal to that of in_strides.
out_strides (list) – Strides of each output feature map. Should be a subset of in_strides and contiguous (any subsequence of 2, 4, 8, 16, 32, 64 …). The largest stride in in_strides and out_strides should be equal.
out_channels (list) – Channels of each output feature map. The length of out_channels should be equal to that of out_strides.
fix_out_channel (int, optional) – If set, a 1x1 conv follows each output feature map so that each final output has fix_out_channel channels.
- forward(features: List[torch.Tensor]) → List[torch.Tensor]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- init_weights()¶
Initialize the weights of FPN module.
- class hat.models.necks.second_neck.SECONDNeck(in_feature_channel: int, down_layer_nums: List[int], down_layer_strides: List[int], down_layer_channels: List[int], up_layer_strides: List[int], up_layer_channels: List[int], bn_kwargs: Optional[Dict] = None, use_relu6: bool = False, quantize: bool = False, quant_scale: float = 0.0078125)¶
Second FPN modules.
Implements the network structure of PointPillars: <https://arxiv.org/abs/1812.05784>
Although the structure is called backbone in the original paper, we follow the publicly available code structure and use it as a neck module.
Adapted from GitHub second.pytorch: <https://github.com/traveller59/second.pytorch>
- Parameters
in_feature_channel – number of input feature channels.
down_layer_nums – number of layers for each down-sample stage.
down_layer_strides – stride for each down-sampling stage.
down_layer_channels – number of filters for each down-sample stage.
up_layer_strides – stride for each up-sample stage.
up_layer_channels – number of filters for each up-sampling stage.
bn_kwargs – batch norm kwargs.
use_relu6 – whether to use relu6.
quantize – whether to quantize the module.
quant_scale – Initial scale for the QuantStub.
- forward(x: torch.Tensor, quant: bool = False)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.necks.unet.Unet(in_strides: List[int], out_strides: List[int], stride2channels: Dict[int, int], out_stride2channels: Optional[Dict[int, int]] = None, factor: int = 2, use_bias: bool = False, bn_kwargs: Optional[Dict] = None, group_base: int = 8, fusion_block_name: str = 'default')¶
Unet neck module.
- Parameters
in_strides – contains the strides of feature maps from backbone.
out_strides – contains the strides of feature maps the neck output.
out_stride2channels – output stride to channel dict.
stride2channels – input stride to channel dict.
fusion_block_name – support FusionBlock and OnePathFusionBlock.
- forward(features)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.necks.yolov3.YOLOV3Neck(backbone_idx: list, in_channels_list: list, out_channels_list: list, bn_kwargs: dict, bias: bool = True)¶
Necks module of yolov3.
- Parameters
backbone_idx (list) – Index of backbone output for necks.
in_channels_list (list) – List of input channels.
out_channels_list (list) – List of output channels.
bn_kwargs (dict) – Config dict for BN layer.
bias (bool) – Whether to use bias in module.
- forward(x)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.necks.yolov3_group.YoloGroupNeck(backbone_idx: list, in_channels_list: list, out_channels_list: list, bn_kwargs: dict, bias: bool = True, head_group: bool = True)¶
Necks module of yolov3.
- Parameters
backbone_idx – Index of backbone output for necks.
in_channels_list – List of input channels.
out_channels_list – List of output channels.
bn_kwargs – Config dict for BN layer.
bias – Whether to use bias in module.
- forward(x)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.pointpillars.head.PointPillarsHead(in_channels: int = 128, num_classes: int = 1, anchors_num_per_class: int = 2, use_direction_classifier: bool = True, num_direction_bins: int = 2, box_code_size: int = 7)¶
Basic module of PointPillarsHead.
- Parameters
in_channels – Channel number of the input feature.
num_classes – Number of classes.
anchors_num_per_class – Number of anchors per class.
use_direction_classifier – Whether to use the direction classifier.
num_direction_bins – Number of direction bins.
box_code_size – BoxCoder size.
- forward(x)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.pointpillars.loss.PointPillarsLoss(num_classes: int, loss_cls: Optional[torch.nn.modules.module.Module] = None, loss_bbox: Optional[torch.nn.modules.module.Module] = None, loss_dir: Optional[torch.nn.modules.module.Module] = None, pos_cls_weight: float = 1.0, neg_cls_weight: float = 1.0, num_direction_bins: int = 2, direction_offset: float = 0.0)¶
PointPillars Loss Module.
- Parameters
num_classes – Number of classes.
loss_cls – Classification loss module.
loss_bbox – Bbox regression loss module.
loss_dir – Direction loss module.
pos_cls_weight – Positive weight. Defaults to 1.0.
neg_cls_weight – Negative weight. Defaults to 1.0.
num_direction_bins – Number of direction bins. Defaults to 2.
direction_offset – The offset of BEV rotation angles. Defaults to 0.0.
- add_sin_difference(boxes1: torch.Tensor, boxes2: torch.Tensor)¶
Convert the rotation difference to difference in sine function.
- Parameters
boxes1 – Original boxes in shape (NxC), where C>=7 and the 7th dimension is the rotation dimension.
boxes2 – Target boxes in shape (NxC), where C>=7 and the 7th dimension is the rotation dimension.
- Returns
boxes1: Rotation bbox by sin*cos.
boxes2: Rotation bbox by cos*sin.
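The encoding relies on the identity sin(a - b) = sin(a)cos(b) - cos(a)sin(b): once the rotation entries are replaced as described, their difference equals the sine of the rotation difference. A minimal plain-Python sketch for a single pair of boxes follows. The list-based signature is an assumption for illustration and does not mirror the tensor-based HAT method.

```python
import math
from typing import List, Tuple


def add_sin_difference(
    box1: List[float], box2: List[float], rot_idx: int = 6
) -> Tuple[List[float], List[float]]:
    """Replace the rotation entries so that enc1 - enc2 == sin(r1 - r2)."""
    r1, r2 = box1[rot_idx], box2[rot_idx]
    enc1, enc2 = list(box1), list(box2)
    enc1[rot_idx] = math.sin(r1) * math.cos(r2)  # sin * cos term
    enc2[rot_idx] = math.cos(r1) * math.sin(r2)  # cos * sin term
    return enc1, enc2


pred = [0.0] * 6 + [0.3]
target = [0.0] * 6 + [0.3 + math.pi]  # same heading line, flipped by pi
enc_pred, enc_target = add_sin_difference(pred, target)
# The difference is sin(-pi), i.e. close to zero: a flip by pi is not
# penalised by the regression loss, which is why a separate direction
# classifier is needed.
print(enc_pred[6] - enc_target[6])
```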
- forward(anchors: torch.Tensor, box_cls_labels: torch.Tensor, reg_targets: torch.Tensor, box_preds: torch.Tensor, cls_preds: torch.Tensor, dir_preds: torch.Tensor)¶
Forward pass, calculate losses.
- Parameters
anchors – Anchors.
box_cls_labels – Bbox classification label.
reg_targets – 3D bbox targets.
box_preds – 3D bbox predictions.
cls_preds – Classification predictions.
dir_preds – Direction classification predictions.
- Returns
cls_loss: Classification losses.
loc_loss: Box regression losses.
dir_loss: Direction classification losses.
- get_box_reg_loss(batch_anchors: torch.Tensor, box_cls_labels: torch.Tensor, reg_targets: torch.Tensor, box_preds: torch.Tensor, dir_cls_preds: Optional[torch.Tensor] = None)¶
Calculate bbox regression and direction classification losses.
- Parameters
batch_anchors – Anchors.
box_cls_labels – Bbox classification label.
reg_targets – 3D bbox targets.
box_preds – 3D bbox predictions.
dir_cls_preds – Direction classification predictions.
- Returns
loc_loss_reduced: Reduced bbox regression loss.
dir_loss_reduced: Reduced direction classification loss.
- get_cls_loss(cls_preds: torch.Tensor, box_cls_labels: torch.Tensor)¶
Calculate classification loss.
- Parameters
cls_preds – Class predictions.
box_cls_labels – Bbox classification label.
- Returns
cls_loss_reduced: Reduced classification loss.
- get_direction_target(anchors: torch.Tensor, reg_targets: torch.Tensor, one_hot: bool = True, dir_offset: float = 0.0)¶
Encode direction to 0 ~ num_bins-1.
- Parameters
anchors – Anchors.
reg_targets – Bbox regression targets.
one_hot – Whether to encode as one hot. Defaults to True.
dir_offset – Direction offset. Defaults to 0.
- Returns
dir_cls_targets: Encoded direction targets.
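One common way to map a rotation to a bin index in 0 ~ num_bins-1 is sketched below for a single anchor. It assumes the SECOND-style convention that the ground-truth yaw is the anchor yaw plus the regression residual; whether HAT follows exactly this convention is not confirmed by this page, and direction_bin is a hypothetical helper.

```python
import math


def direction_bin(
    anchor_rot: float, reg_rot: float, num_bins: int = 2, dir_offset: float = 0.0
) -> int:
    """Return the direction bin index for one anchor."""
    rot_gt = anchor_rot + reg_rot                        # assumed yaw recovery
    offset_rot = (rot_gt - dir_offset) % (2 * math.pi)   # wrap into [0, 2*pi)
    idx = int(offset_rot // (2 * math.pi / num_bins))
    return min(max(idx, 0), num_bins - 1)                # guard rounding edges


print(direction_bin(0.0, 0.5))   # first half of the circle
print(direction_bin(0.0, -0.5))  # wraps around into the second half
```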
- get_pos_neg_loss(cls_loss: torch.Tensor, labels: torch.Tensor)¶
Calculate positive and negative object losses.
- Parameters
cls_loss – Classification loss.
labels – Classification labels.
- Returns
cls_pos_loss: Positive classification losses.
cls_neg_loss: Negative classification losses.
- one_hot_f(tensor, depth, dim: int = - 1, on_value: float = 1.0, dtype=torch.float32)¶
Encode to one-hot.
- Parameters
tensor – Input tensor to be one-hot encoded.
depth – Number of classes for one-hot encoding.
dim – Dimension along which to perform one-hot encoding.
on_value – Value to fill in the "on" positions.
dtype – Data type of the resulting tensor.
- Returns
tensor_onehot: One-hot encoded tensor.
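The roles of depth and on_value can be shown with a plain-Python sketch on a flat list of labels. Treating out-of-range labels as all-zero rows is an assumption made here for illustration; the dim and dtype arguments of the real method are not modelled.

```python
from typing import List


def one_hot(labels: List[int], depth: int, on_value: float = 1.0) -> List[List[float]]:
    """One-hot encode integer labels into rows of length `depth`."""
    encoded = []
    for label in labels:
        row = [0.0] * depth
        if 0 <= label < depth:  # assumption: out-of-range labels stay all-zero
            row[label] = on_value
        encoded.append(row)
    return encoded


print(one_hot([0, 2], depth=3))
```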
- prepare_loss_weights(labels: torch.Tensor, dtype=torch.float32)¶
Calculate classification and regression weights.
- Parameters
labels – Classification labels.
dtype – Data type of the resulting tensor.
- Returns
cls_weights: Classification weights.
reg_weights: Regression weights.
cared: Cared mask.
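A sketch of how such weights are commonly derived in SECOND-style detectors is given below for a single sample. The label convention (positive > 0, negative == 0, ignored < 0) and the normalisation by the number of positives are assumptions borrowed from that family of implementations, not taken from this page.

```python
from typing import List, Tuple


def prepare_loss_weights(
    labels: List[int], pos_cls_weight: float = 1.0, neg_cls_weight: float = 1.0
) -> Tuple[List[float], List[float], List[bool]]:
    """Return (cls_weights, reg_weights, cared) for one sample."""
    cared = [label >= 0 for label in labels]       # anchors that are not ignored
    positives = [label > 0 for label in labels]
    negatives = [label == 0 for label in labels]
    normalizer = max(sum(positives), 1)            # avoid division by zero
    cls_weights = [
        (pos_cls_weight * p + neg_cls_weight * n) / normalizer
        for p, n in zip(positives, negatives)
    ]
    reg_weights = [float(p) / normalizer for p in positives]  # positives only
    return cls_weights, reg_weights, cared


print(prepare_loss_weights([1, 0, -1, 2]))
```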
- class hat.models.task_modules.pointpillars.postprocess.PointPillarsPostProcess(num_classes: int, box_coder: int, use_direction_classifier: bool = True, num_direction_bins: int = 2, direction_offset: float = 0.0, use_rotate_nms: bool = False, nms_pre_max_size: int = 1000, nms_post_max_size: int = 300, nms_iou_threshold: float = 0.5, score_threshold: float = 0.05, post_center_limit_range: List[float] = [0, - 39.68, - 5, 69.12, 39.68, 5], max_per_img: int = 100)¶
PointPillars PostProcess Module.
- Parameters
num_classes – Number of classes.
box_coder – BoxCoder module.
use_direction_classifier – Whether to use the direction classifier.
num_direction_bins – Number of direction bins per anchor. Defaults to 2.
direction_offset – Direction offset. Defaults to 0.0.
use_rotate_nms – Whether to use rotated NMS.
nms_pre_max_size – Max number of boxes kept before NMS.
nms_post_max_size – Max number of boxes kept after NMS.
nms_iou_threshold – IoU threshold of NMS.
score_threshold – Score threshold.
post_center_limit_range – Point cloud range.
max_per_img – Max number of objects per image.
- forward(box_preds: torch.Tensor, cls_preds: torch.Tensor, dir_preds: torch.Tensor, anchors: torch.Tensor)¶
Forward pass.
- Parameters
box_preds – BBox predictions.
cls_preds – Classification predictions.
dir_preds – Direction classification predictions.
anchors – Anchors.
- Returns
detections: Batch predictions.
- nms(boxes: torch.Tensor, scores: torch.Tensor, iou_threshold: float, pre_max_size: Optional[int] = None, post_max_size: Optional[int] = None)¶
NMS.
- Parameters
boxes – Shape (N, 4), boxes in (x1, y1, x2, y2) format.
scores – Shape (N), scores.
iou_threshold – IoU threshold.
pre_max_size – Keep the top n boxes by score before NMS.
post_max_size – Keep the top n boxes by score after NMS.
- Returns
Indices.
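The parameters above correspond to the usual greedy NMS procedure, sketched here in plain Python on axis-aligned boxes. This is a reference sketch of the general technique under the documented argument names, not the HAT implementation, which operates on tensors.

```python
from typing import List, Optional, Sequence


def iou(a: Sequence[float], b: Sequence[float]) -> float:
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    iw = max(min(a[2], b[2]) - max(a[0], b[0]), 0.0)
    ih = max(min(a[3], b[3]) - max(a[1], b[1]), 0.0)
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(
    boxes: List[Sequence[float]],
    scores: List[float],
    iou_threshold: float,
    pre_max_size: Optional[int] = None,
    post_max_size: Optional[int] = None,
) -> List[int]:
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    if pre_max_size is not None:
        order = order[:pre_max_size]      # top-n by score before NMS
    keep: List[int] = []
    for i in order:
        # keep a box only if it does not overlap any already-kept box too much
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    if post_max_size is not None:
        keep = keep[:post_max_size]       # top-n by score after NMS
    return keep


boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]]
print(nms(boxes, [0.9, 0.8, 0.7], iou_threshold=0.5))
```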
- class hat.models.task_modules.pointpillars.preprocess.BatchVoxelization(pc_range: List[float], voxel_size: List[float], max_voxels_num: Union[tuple, int] = 20000, max_points_in_voxel: int = 30)¶
Batch voxelization.
- Parameters
pc_range – Point cloud range.
voxel_size – Voxel size, (x, y, z) scale.
max_voxels_num – Max number of voxels to use. Defaults to 20000.
max_points_in_voxel – Number of points per voxel. Defaults to 30.
- forward(points_lst: List[torch.Tensor], is_deploy=False)¶
Forward pass.
- Parameters
points_lst – List of point cloud data.
is_deploy – Whether this is the deploy pipeline. Defaults to False.
- Returns
Voxel feature map, coordinates of the voxel features, and number of points per voxel.
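How pc_range, voxel_size, max_voxels_num and max_points_in_voxel interact can be seen in a plain-Python sketch for a single point cloud. It assumes pc_range is ordered (x_min, y_min, z_min, x_max, y_max, z_max) and that voxel coordinates are stored as (x, y, z) indices; the real module works on batched tensors and may order coordinates differently.

```python
from typing import Dict, List, Sequence, Tuple


def voxelize(
    points: List[Sequence[float]],
    pc_range: Sequence[float],
    voxel_size: Sequence[float],
    max_voxels_num: int = 20000,
    max_points_in_voxel: int = 30,
):
    """Group points into voxels; returns (voxels, coors, num_points)."""
    voxels: Dict[Tuple[int, ...], List[Sequence[float]]] = {}
    for p in points:
        # drop points outside the configured range
        if not all(pc_range[d] <= p[d] < pc_range[d + 3] for d in range(3)):
            continue
        key = tuple(int((p[d] - pc_range[d]) // voxel_size[d]) for d in range(3))
        if key not in voxels:
            if len(voxels) >= max_voxels_num:
                continue                  # voxel budget exhausted
            voxels[key] = []
        if len(voxels[key]) < max_points_in_voxel:
            voxels[key].append(p)         # extra points in a full voxel are dropped
    coors = list(voxels)
    num_points = [len(v) for v in voxels.values()]
    return list(voxels.values()), coors, num_points


pts = [(0.5, 0.5, 0.0), (0.6, 0.4, 0.1), (2.5, 0.5, 0.0), (100.0, 0.0, 0.0)]
_, coors, num_points = voxelize(pts, [0, 0, -1, 4, 4, 1], [1, 1, 2])
print(coors, num_points)
```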
- class hat.models.task_modules.pointpillars.preprocess.PointPillarsPreProcess(pc_range: List[float], voxel_size: List[float], max_voxels_num: int = 20000, max_points_in_voxel: int = 30, norm_range: Optional[List] = None, norm_dims: Optional[List] = None)¶
PointPillars preprocess, including voxelization and feature extension.
- Parameters
pc_range – Point cloud range.
voxel_size – Voxel size, (x, y, z) scale.
max_voxels_num – Max number of voxels to use. Defaults to 20000.
max_points_in_voxel – Number of points per voxel. Defaults to 30.
norm_range – Feature range, like [x_min, y_min, z_min, …, x_max, y_max, z_max, …].
norm_dims – Dims to do normalize.
- forward(points_lst, is_deploy=False)¶
Forward pass.
- Parameters
points_lst – List of point cloud data.
is_deploy – Whether this is the deploy pipeline. Defaults to False.
- Returns
Voxel feature map, coordinates of the voxel features, and number of points per voxel.
- class hat.models.task_modules.carfusion_keypoints.heatmap_decoder.HeatmapDecoder(scale: int, mode: str = 'diff_sign', k_size: int = 5)¶
Decode heatmap prediction to landmark coordinates.
- Parameters
scale – Same as the feature stride: the scale of heatmap coordinates relative to the original image.
mode – The decoding method; currently supports "diff_sign" and "averaged". In "averaged" mode, the coordinates and heatmap values of the k_size x k_size area surrounding the maximum point on the heatmap are weighted to obtain the coordinates of the keypoint.
k_size – Kernel size used by the "averaged" decoder.
- forward(heatmap: torch.Tensor)¶
Do post process for model predictions.
- Parameters
pred – Prediction tensors.
meta_data – Meta data used in post processor, e.g. image width, height.
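The "averaged" mode described for HeatmapDecoder amounts to a heatmap-weighted mean of coordinates in a k_size x k_size window around the peak, followed by scaling back to image coordinates. A minimal sketch for one heatmap channel, stored as a nested list of non-negative values, is shown below; border handling and the absence of any sub-pixel offset are assumptions, and decode_averaged is a hypothetical name.

```python
from typing import List, Tuple


def decode_averaged(
    heatmap: List[List[float]], scale: int, k_size: int = 5
) -> Tuple[float, float]:
    """Return the (x, y) keypoint in image coordinates for one heatmap."""
    h, w = len(heatmap), len(heatmap[0])
    # locate the peak (first maximum in row-major order)
    my, mx = max(
        ((y, x) for y in range(h) for x in range(w)),
        key=lambda yx: heatmap[yx[0]][yx[1]],
    )
    r = k_size // 2
    total = sum_x = sum_y = 0.0
    # weighted sum over the window, clipped at the heatmap border
    for y in range(max(my - r, 0), min(my + r + 1, h)):
        for x in range(max(mx - r, 0), min(mx + r + 1, w)):
            value = heatmap[y][x]
            total += value
            sum_x += value * x
            sum_y += value * y
    if total <= 0:
        return float(mx * scale), float(my * scale)  # degenerate heatmap
    return sum_x / total * scale, sum_y / total * scale


hm = [[0.0] * 5 for _ in range(5)]
hm[2][2] = 1.0
hm[2][3] = 1.0
print(decode_averaged(hm, scale=4, k_size=3))
```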
- class hat.models.task_modules.carfusion_keypoints.keypoint_head.DeconvDecoder(input_index, in_channels: int, out_channels: int, num_conv_layers, num_deconv_filters: List[int], num_deconv_kernels: List[int], final_conv_kernel: int)¶
Decoder head consisting of multiple deconv layers.
- Parameters
input_index – The stage index of the preceding backbone outputs.
in_channels – Number of input channels of the feature output from backbone.
out_channels – Number of out channels of the DeconvDecoder.
num_conv_layers – Number of convolutional layers for decoder.
num_deconv_filters – List of the number of filters for deconv layers.
num_deconv_kernels – List of the kernel sizes for deconv layers.
final_conv_kernel – Kernel size of the final convolutional layer.
- forward(x)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.centerpoint.bbox_coders.CenterPointBBoxCoder(pc_range: List[float], out_size_factor: int, voxel_size: List[float], post_center_range: Optional[List[float]] = None, max_num: Optional[int] = 100, score_threshold: Optional[float] = None)¶
Bbox coder for CenterPoint.
- Parameters
pc_range – Range of point cloud.
out_size_factor – Downsample factor of the model.
voxel_size – Size of voxel.
post_center_range – Limit of the center. Default: None.
max_num – Max number to be kept. Default: 100.
score_threshold – Threshold to filter boxes based on score. Default: None.
- decode(heat: torch.Tensor, rot_sine: torch.Tensor, rot_cosine: torch.Tensor, hei: torch.Tensor, dim: torch.Tensor, vel: torch.Tensor, reg: Optional[torch.Tensor] = None, task_id: int = - 1)¶
Decode bboxes.
- Parameters
heat – Heatmap with the shape of [B, N, W, H].
rot_sine – Sine of rotation with the shape of [B, 1, W, H].
rot_cosine – Cosine of rotation with the shape of [B, 1, W, H].
hei – Height of the boxes with the shape of [B, 1, W, H].
dim – Dim of the boxes with the shape of [B, 3, W, H].
vel – Velocity with the shape of [B, 2, W, H].
reg – Regression value of the boxes in 2D with the shape of [B, 2, W, H]. Default: None.
task_id – Index of task. Default: -1.
- Returns
Decoded boxes.
- Return type
list[dict]
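The role of pc_range, voxel_size and out_size_factor in decoding can be sketched for a single heatmap cell. The formula below is the one commonly used in CenterPoint-style coders (grid index plus regression offset, scaled back to metric coordinates); that HAT uses exactly this form is an assumption, and decode_center is a hypothetical helper.

```python
import math
from typing import Sequence, Tuple


def decode_center(
    xs: int,
    ys: int,
    reg: Sequence[float],
    rot_sine: float,
    rot_cosine: float,
    pc_range: Sequence[float],
    voxel_size: Sequence[float],
    out_size_factor: int,
) -> Tuple[float, float, float]:
    """Map one heatmap cell to a metric (x, y) center and a yaw angle."""
    # grid index + sub-cell offset -> voxel units -> metres, shifted by range
    x = (xs + reg[0]) * out_size_factor * voxel_size[0] + pc_range[0]
    y = (ys + reg[1]) * out_size_factor * voxel_size[1] + pc_range[1]
    yaw = math.atan2(rot_sine, rot_cosine)  # recover the angle from sin/cos
    return x, y, yaw


print(decode_center(10, 20, (0.5, 0.25), 0.0, 1.0,
                    [-50, -50, -5, 50, 50, 3], [0.1, 0.1, 0.2], 4))
```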
- class hat.models.task_modules.centerpoint.decoder.CenterPointDecoder(class_names: List[str], tasks: List[Dict], bev_size: Tuple[float], norm_bbox: bool = True, max_num: int = 50, use_max_pool: bool = True, max_pool_kernel: Optional[int] = 3, out_size_factor: int = 4, score_threshold: float = 0.1, nms_type: Optional[List[str]] = None, min_radius: Optional[List[int]] = None, nms_threshold: Optional[float] = None, pre_max_size: int = 1000, post_max_size: int = 100, decode_to_ego: bool = True)¶
The CenterPoint Decoder.
- Parameters
class_names – List of class names for the detection task.
tasks – List of tasks.
bev_size – BEV size.
norm_bbox – Whether to normalize the dim of the bbox.
max_num – Maximum number of bboxes for a single task.
use_max_pool – Whether to use max pooling as NMS.
max_pool_kernel – Kernel size if using max pooling for NMS.
out_size_factor – Factor for output bbox.
score_threshold – Threshold for filtering out low-score bboxes.
nms_type – NMS type used for each task. Choose from ["rotate", "circle"].
min_radius – Min radius for circle nms.
nms_threshold – NMS threshold.
pre_max_size – Max size before nms.
post_max_size – Max size after nms.
decode_to_ego – Whether decoding to ego coordinate.
- forward(preds: Sequence[torch.Tensor], meta_data: Dict[str, Any])¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.centerpoint.head.CenterPointHead(in_channels: int, tasks: List[dict], share_conv_channels: int, share_conv_num: int, common_heads: Dict, num_heatmap_convs: int = 2, bn_kwargs=None, **kwargs)¶
CenterPointHead module.
- Parameters
in_channels – Input channels for each task.
tasks – List of task info.
share_conv_channels – Channels of the shared convs.
share_conv_num – Number of shared convs.
common_heads – Common head for each task.
num_heatmap_convs – Number of heatmap convs.
bn_kwargs – Kwargs of the BN layer.
final_kernel – Kernel size of the final conv layer.
- forward(feats)¶
Perform the forward pass for extracted features.
- Parameters
feats – Input feature(s) to the model. If a sequence of features is provided, only the first one will be used.
- Returns
rets: A list of outputs from the individual task heads.
- fuse_model() → None¶
Perform model fusion on the modules.
- set_qconfig() → None¶
Set the quantization configuration.
- class hat.models.task_modules.centerpoint.head.DepthwiseSeparableCenterPointHead(in_channels: int, tasks: List[dict], share_conv_channels: int, share_conv_num: int, common_heads: Dict, num_heatmap_convs: int = 2, bn_kwargs=None, **kwargs)¶
- class hat.models.task_modules.centerpoint.head.VargCenterPointHead(group_base=8, merge_branch=False, factor=2, dw_with_relu=True, pw_with_relu=False, **kwargs)¶
- class hat.models.task_modules.centerpoint.target.CenterPointLidarTarget(grid_size: List[int], voxel_size: List[float], point_cloud_range: List[float], tasks: List[dict], dense_reg: int = 1, max_objs: int = 500, gaussian_overlap: float = 0.1, min_radius: int = 2, out_size_factor: int = 4, norm_bbox: bool = True, with_velocity: bool = False)¶
Generate CenterPoint targets.
- Parameters
grid_size – List of grid sizes (W, H, D).
voxel_size – List of voxel sizes (dx, dy, dz).
point_cloud_range – List specifying the point cloud range (x_min, y_min, z_min, x_max, y_max, z_max).
tasks – List of task dictionaries.
dense_reg – Density of regression targets.
max_objs – Maximum number of objects.
gaussian_overlap – Gaussian overlap for generating heatmap targets.
min_radius – Minimum radius for generating heatmap targets.
out_size_factor – Output size factor.
norm_bbox – Whether to use normalized bounding boxes.
with_velocity – Whether to include velocity information in targets.
- forward(gt_bboxes_3d, gt_labels_3d)¶
Generate CenterPoint training targets for a batch of samples.
- Parameters
gt_bboxes_3d – Ground truth 3D bounding boxes.
gt_labels_3d – Labels of the boxes.
- Returns
Tuple of target lists containing:
Heatmap scores.
Ground truth boxes.
Indexes indicating the position of the valid boxes.
Masks indicating which boxes are valid.
- get_targets_single(gt_bboxes_3d, gt_labels_3d)¶
Generate training targets for a single sample.
- Parameters
gt_bboxes_3d – Ground truth 3D bounding boxes.
gt_labels_3d – Labels of the boxes.
- Returns
Tuple of target lists containing:
Heatmap scores.
Ground truth boxes.
Indexes indicating the position of the valid boxes.
Masks indicating which boxes are valid.
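Heatmap targets of this kind are typically built by splatting a 2D Gaussian around each object center, with the radius clamped from below by min_radius. The sketch below follows the widely used CenterNet-style drawing routine (sigma = diameter / 6, element-wise maximum with the existing map); it is an illustration under those assumptions rather than the HAT code, and the radius computation from gaussian_overlap is omitted.

```python
import math
from typing import List, Tuple


def draw_gaussian(
    heatmap: List[List[float]], center: Tuple[int, int], radius: int
) -> List[List[float]]:
    """Splat a Gaussian peak at `center` (x, y) into `heatmap` in place."""
    cx, cy = center
    sigma = (2 * radius + 1) / 6.0
    h, w = len(heatmap), len(heatmap[0])
    for y in range(max(cy - radius, 0), min(cy + radius + 1, h)):
        for x in range(max(cx - radius, 0), min(cx + radius + 1, w)):
            g = math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma * sigma))
            # overlapping objects keep the stronger response
            heatmap[y][x] = max(heatmap[y][x], g)
    return heatmap


min_radius = 2
radius = max(min_radius, 1)  # a computed radius of 1 is clamped to min_radius
target = draw_gaussian([[0.0] * 5 for _ in range(5)], center=(2, 2), radius=radius)
print(target[2])
```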
- class hat.models.task_modules.centerpoint.target.CenterPointTarget(class_names: Sequence[str], tasks: Sequence[dict], gaussian_overlap: float = 0.1, min_radius: int = 2, out_size_factor: int = 4, norm_bbox: bool = True, max_num: int = 500, bbox_weight: Optional[float] = None, use_heatmap: bool = True)¶
Generate centerpoint targets for bev task.
- Parameters
class_names – List of class names for BEV detection.
tasks – List of tasks.
gaussian_overlap – Gaussian overlap for generating the heatmap target.
min_radius – Minimum value of the radius.
out_size_factor – Output size factor.
norm_bbox – Whether using norm bbox.
max_num – Max number for bbox.
bbox_weight – Weight for bbox meta.
- forward(label, preds, *args)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.centerpoint.loss.CenterPointLoss(loss_cls: Optional[torch.nn.modules.module.Module] = None, loss_bbox: Optional[torch.nn.modules.module.Module] = None, with_velocity: bool = False, code_weights: Optional[list] = None)¶
CenterPoint loss module.
- Parameters
loss_cls – Classification loss module. Default: None.
loss_bbox – Regression loss module. Default: None.
with_velocity – Whether velocity information is included.
code_weights – Weights for the regression loss. Default: None.
- forward(heatmaps: List[torch.Tensor], anno_boxes: List[torch.Tensor], inds: List[torch.Tensor], masks: List[torch.Tensor], preds_dicts: List[Dict[str, torch.Tensor]]) → Dict[str, torch.Tensor]¶
Compute CenterPoint loss.
- Parameters
heatmaps – List of heatmap tensors.
anno_boxes – List of ground truth annotation boxes.
inds – List of indexes indicating the position of the valid boxes.
masks – List of masks indicating which boxes are valid.
preds_dicts – List of predicted tensors.
- Returns
A dictionary containing loss components.
- Return type
Dict
- class hat.models.task_modules.centerpoint.post_process.CenterPointPostProcess(tasks: Optional[List[dict]] = None, norm_bbox: bool = True, bbox_coder: Optional[hat.models.task_modules.centerpoint.bbox_coders.CenterPointBBoxCoder] = None, max_pool_nms: bool = False, score_threshold: float = 0.0, post_center_limit_range: Optional[List[float]] = None, min_radius: Optional[List[float]] = None, out_size_factor: int = 1, nms_type: str = 'rotate', pre_max_size: int = 1000, post_max_size: int = 83, nms_thr: float = 0.2, use_max_pool: bool = False, max_pool_kernel: Optional[int] = 3, box_size: Optional[int] = 9)¶
CenterPoint PostProcess Module.
- Parameters
tasks – Task information including class number and class names. Default: None.
norm_bbox – Whether to normalize bounding boxes. Default: True.
bbox_coder – BoxCoder module. Default: None.
max_pool_nms – Whether to use max-pooling NMS. Default: False.
score_threshold – Score threshold for filtering detections.
post_center_limit_range – Point cloud range. Default: None.
min_radius – Minimum radius. Default: None.
out_size_factor – Output size factor. Default: 1.
nms_type – NMS type, either “rotate” or “circle”. Default: “rotate”.
pre_max_size – Maximum size of NMS preprocess. Default: 1000.
post_max_size – Maximum size of NMS postprocess. Default: 83.
nms_thr – IoU threshold for NMS. Default: 0.2.
use_max_pool – Whether to use max-pooling during NMS. Default: False.
max_pool_kernel – Max-pooling kernel size. Default: 3.
box_size – Size of bounding boxes. Default: 9.
- forward(preds_dicts)¶
Generate bboxes from bbox head predictions.
- Parameters
preds_dicts – Prediction results.
- Returns
ret_list: Decoded bbox, scores and labels after NMS.
- get_task_detections(num_class_with_bg: int, batch_cls_preds: List[torch.Tensor], batch_reg_preds: List[torch.Tensor], batch_cls_labels: List[torch.Tensor])¶
Rotate nms for each task.
- Parameters
num_class_with_bg – Number of classes for the current task.
batch_cls_preds – Prediction score with the shape of [N].
batch_reg_preds – Prediction bbox with the shape of [N, 9].
batch_cls_labels – Prediction label with the shape of [N].
- Returns
predictions_dicts: A dict containing the following keys:
bboxes: Prediction bboxes after NMS with the shape of [N, 9].
scores: Prediction scores after NMS with the shape of [N].
labels: Prediction labels after NMS with the shape of [N].
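Besides rotated NMS, nms_type also accepts "circle". Circle NMS, as popularised by CenterPoint, suppresses a box when its BEV center lies within min_radius of a higher-scoring kept box. The plain-Python sketch below illustrates that idea; the function name and list-based interface are hypothetical.

```python
from typing import List, Sequence


def circle_nms(
    centers: List[Sequence[float]],
    scores: List[float],
    min_radius: float,
    post_max_size: int = 83,
) -> List[int]:
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        xi, yi = centers[i]
        # keep only if farther than min_radius from every kept center
        if all(
            (xi - centers[j][0]) ** 2 + (yi - centers[j][1]) ** 2 > min_radius ** 2
            for j in keep
        ):
            keep.append(i)
    return keep[:post_max_size]


print(circle_nms([(0, 0), (1, 0), (5, 5)], [0.9, 0.8, 0.7], min_radius=2))
```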
- class hat.models.task_modules.centerpoint.pre_process.CenterPointPreProcess(pc_range: List[float], voxel_size: List[float], max_voxels_num: Union[tuple, int] = 30000, max_points_in_voxel: int = 20, norm_range: Optional[List] = None, norm_dims: Optional[List] = None)¶
CenterPoint preprocess, including voxelization and feature encoding.
- Parameters
pc_range – Point cloud range.
voxel_size – Voxel size, (x, y, z) scale.
max_voxels_num – Max number of voxels to use. Defaults to 30000.
max_points_in_voxel – Number of points per voxel. Defaults to 20.
norm_range – Feature range, like [x_min, y_min, z_min, …, x_max, y_max, z_max, …].
norm_dims – Dims to do normalize.
- forward(points_lst: List[torch.Tensor], is_deploy: bool = False) → Tuple[torch.Tensor, torch.Tensor]¶
Forward pass of CenterPoint preprocess.
- Parameters
points_lst – List of input point clouds.
is_deploy – Flag indicating whether the model is in deployment mode. Default is False.
- Returns
A tuple containing the following elements:
features: Voxel-encoded feature map.
coors_batch: Voxel coordinates for the batch.
- class hat.models.task_modules.deeplab.head.Deeplabv3plusHead(in_channels: int, c1_index: int, c1_in_channels: int, feat_channels: int, num_classes: int, dilations: List[int], num_repeats: List[int], argmax_output: Optional[bool] = False, dequant_output: Optional[bool] = True, int8_output: Optional[bool] = True, bn_kwargs: Optional[Dict] = None, dropout_ratio: Optional[float] = 0.1, upsample_output_scale: Optional[int] = None, upsample_decode_scale: Optional[int] = 4, bias=True)¶
Head module for DeepLabv3+.
- Parameters
in_channels – Input channels.
c1_index – Index for c1 input.
c1_in_channels – In channels of c1.
feat_channels – Channels for the module.
num_classes – Number of classes.
dilations – List of dilations for aspp.
num_repeats – List of repeat for each branch of ASPP.
argmax_output – Whether conduct argmax on output. Default: False.
dequant_output – Whether to dequant output. Default: True.
int8_output – If True, output int8, otherwise output int32. Default: True.
bn_kwargs – Extra keyword arguments for bn layers. Default: None.
dropout_ratio – Ratio for dropout during training. Default: 0.1.
upsample_decode_scale – upsample scale to c1. Default is 4.
upsample_output_scale – Output upsample scale, only used in qat model, default is None.
bias – Whether has bias. Default: True.
- forward(inputs)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.detr.matcher.HungarianMatcher(cost_class: float = 1, cost_bbox: float = 1, cost_giou: float = 1, use_focal: bool = False, alpha: float = 0.25, gamma: float = 2.0)¶
Compute an assignment between targets and predictions.
For efficiency reasons, the targets don’t include the no_object. Because of this, in general, there are more predictions than targets. In this case, we do a 1-to-1 matching of the best predictions, while the others are un-matched (and thus treated as non-objects).
- Parameters
cost_class – weight of the classification error.
cost_bbox – weight of the L1 error of the bbox coordinates.
cost_giou – weight of the giou loss of the bounding box.
use_focal – whether to use focal loss.
alpha – A weighting factor for pos-sample, (1-alpha) is for neg-sample.
gamma – Gamma used in focal loss to compress the contribution of easy examples.
- Returns
A list containing tuples of (index_i, index_j), where:
index_i is the indices of the selected predictions (in order).
index_j is the indices of the selected targets (in order).
For each batch element, it holds: len(index_i) = len(index_j) = min(num_queries, num_target_boxes).
- forward(outputs, data)¶
Perform the matching.
- Parameters
outputs – A dict containing at least these entries:
"pred_logits": Tensor of dim [bs, num_queries, num_classes].
"pred_boxes": Tensor of dim [bs, num_queries, 4].
data – A dict containing at least these entries:
"gt_classes": Tensor of dim [num_target_boxes].
"boxes": Tensor of dim [num_target_boxes, 4].
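The 1-to-1 matching can be illustrated on a tiny cost matrix. In practice the cost of pairing query q with target t would be a weighted sum of the classification, L1 and GIoU terms (weights cost_class, cost_bbox, cost_giou). The sketch below takes such a matrix as given and finds the minimum-cost assignment by brute force, which is only viable for very small inputs. DETR-style implementations usually solve this with the Hungarian algorithm, e.g. scipy.optimize.linear_sum_assignment; whether HAT does so is not stated on this page.

```python
from itertools import permutations
from typing import List, Tuple


def match(cost: List[List[float]]) -> Tuple[List[int], List[int]]:
    """Brute-force minimum-cost matching.

    cost[q][t] is the cost of assigning query q to target t. Assumes
    num_queries >= num_targets >= 1. Returns (index_i, index_j).
    """
    num_queries, num_targets = len(cost), len(cost[0])
    # try every way of picking one distinct query per target
    best = min(
        permutations(range(num_queries), num_targets),
        key=lambda queries: sum(cost[q][t] for t, q in enumerate(queries)),
    )
    pairs = sorted(zip(best, range(num_targets)))  # order by query index
    return [q for q, _ in pairs], [t for _, t in pairs]


cost = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]  # 3 queries, 2 targets
index_i, index_j = match(cost)
print(index_i, index_j)  # query 2 stays unmatched and counts as "no object"
```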
- class hat.models.task_modules.detr.criterion.DetrCriterion(num_classes: int, dec_layers: int = 6, cost_class: float = 1.0, cost_bbox: float = 5.0, cost_giou: float = 2.0, loss_ce: float = 1.0, loss_bbox: float = 5.0, loss_giou: float = 2.0, eos_coef: float = 0.1, losses: Sequence[str] = ('labels', 'boxes', 'cardinality'), aux_loss: bool = True)¶
This class computes the loss for DETR.
- Parameters
num_classes – number of object categories.
dec_layers – number of the decoder layers.
cost_class – weight of the classification error in the matching cost.
cost_bbox – weight of the L1 error of the bbox in the matching cost.
cost_giou – weight of the giou loss of the bbox in the matching cost.
loss_ce – weight of the classification loss.
loss_bbox – weight of the L1 loss of the bbox.
loss_giou – weight of the giou loss of the bbox.
eos_coef – classification weight applied to the no-object category.
losses – list of all the losses to be applied.
aux_loss – True if auxiliary decoding losses are to be used.
- forward(outs, targets)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- loss_boxes(outputs, targets, indices, num_boxes)¶
Compute the losses related to the bounding boxes.
The losses are the L1 regression loss and the GIoU loss. Target dicts must contain the key “gt_bboxes”, holding a tensor of dim [nb_target_boxes, 4]. Target boxes are expected in the format (center_x, center_y, w, h), normalized by the image size.
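As an illustration of the two box terms, the following sketch computes the L1 distance and the generalized IoU for a single pair of boxes in (center_x, center_y, w, h) format. It is a plain-Python sketch with hypothetical helper names, not the batched tensor code used by DetrCriterion; the GIoU loss is taken as 1 - GIoU.

```python
def cxcywh_to_xyxy(box):
    """Convert (center_x, center_y, w, h) to (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def generalized_iou(box_a, box_b):
    """GIoU of two boxes given in (center_x, center_y, w, h) format."""
    ax1, ay1, ax2, ay2 = cxcywh_to_xyxy(box_a)
    bx1, by1, bx2, by2 = cxcywh_to_xyxy(box_b)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    # Intersection is empty when the boxes do not overlap.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = area_a + area_b - inter
    # Smallest axis-aligned box enclosing both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return inter / union - (enclose - union) / enclose


def box_losses(pred_box, gt_box):
    """Return (l1_loss, giou_loss) for one matched pair."""
    l1 = sum(abs(p - g) for p, g in zip(pred_box, gt_box))
    return l1, 1.0 - generalized_iou(pred_box, gt_box)
```

For two disjoint boxes the IoU is zero but the GIoU is negative, so the GIoU loss still distinguishes near misses from distant ones.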
- loss_cardinality(outputs, targets, indices, num_boxes)¶
Compute absolute error in the number of predicted non-empty boxes.
This is not really a loss; it is intended for logging purposes only and does not propagate gradients.
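A minimal sketch of the cardinality error for one image is shown below. It assumes, as in the original DETR code, that a query counts as non-empty when its highest logit is not the no-object class; the function name and the `no_object_index` argument are hypothetical.

```python
def cardinality_error(pred_logits, num_targets, no_object_index):
    """Absolute difference between predicted and true object counts.

    pred_logits: list of per-query logit rows for a single image.
    """
    num_predicted = 0
    for row in pred_logits:
        # Index of the highest logit for this query.
        best_class = max(range(len(row)), key=lambda c: row[c])
        if best_class != no_object_index:
            num_predicted += 1
    return abs(num_predicted - num_targets)
```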
- loss_labels(outputs, targets, indices, num_boxes, log=True)¶
Classification loss (NLL).
- class hat.models.task_modules.detr.head.DetrHead(transformer: torch.nn.modules.module.Module, pos_embed: torch.nn.modules.module.Module, num_classes: int = 80, in_channels: int = 2048, max_per_img: int = 100, int8_output: bool = False, dequant_output: bool = True, set_int16_qconfig: bool = False, input_shape: tuple = (800, 1332))¶
Implements the DETR transformer head.
See paper: End-to-End Object Detection with Transformers for details.
- Parameters
transformer – transformer module.
pos_embed – position encoding module.
num_classes – Number of categories excluding the background.
in_channels – Number of channels in the input feature map.
max_per_img – Number of object queries, i.e. detection slots. This is the maximum number of objects DETR can detect in a single image. For COCO, we recommend 100 queries.
int8_output – If True, output int8, otherwise output int32. Default: False.
dequant_output – Whether to dequant output. Default: True.
set_int16_qconfig – Whether to set int16 qconfig. Default: False.
input_shape – shape used to construct masks for inference.
- forward(feats, img_meta)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- forward_single(x, img_meta)¶
Forward features of a single scale level.
- Parameters
x – FPN feature maps of the specified stride.
img_meta – Dict whose keys describe different image sizes. batch_input_shape is the image size after padding, while img_shape is the image size after data augmentation but before padding.
- class hat.models.task_modules.detr.post_process.DetrPostProcess¶
Convert model’s output into the format expected by evaluation.
- forward(outs, targets)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.detr.transformer.Transformer(embed_dims: int = 512, num_heads: int = 8, num_encoder_layers: int = 6, num_decoder_layers: int = 6, feedforward_channels: int = 2048, dropout: float = 0.1, act_layer: torch.nn.modules.module.Module = <class 'torch.nn.modules.activation.ReLU'>, normalize_before: bool = False, return_intermediate_dec: bool = False)¶
Implements the DETR transformer.
Following the official DETR implementation, this module is adapted from torch.nn.Transformer with the following modifications:
positional encodings are passed in MultiheadAttention
extra LN at the end of encoder is removed
decoder returns a stack of activations from all decoding layers
See paper: End-to-End Object Detection with Transformers for details.
- Parameters
embed_dims – The feature dimension.
num_heads – Parallel attention heads.
num_encoder_layers – Number of TransformerEncoderLayer.
num_decoder_layers – Number of TransformerDecoderLayer.
feedforward_channels – The hidden dimension for FFNs used in both encoder and decoder.
dropout – Probability of an element to be zeroed. Default 0.1.
act_layer – Activation module for FFNs used in both encoder and decoder. Default ReLU.
normalize_before – Whether the normalization layer is ordered first in the encoder and decoder. Default False.
return_intermediate_dec – Whether to return the intermediate output from each TransformerDecoderLayer or only the last TransformerDecoderLayer. Default False. If True, the returned hs has shape [num_decoder_layers, bs, num_query, embed_dims]. If False, the returned hs will have shape [1, bs, num_query, embed_dims].
- forward(x, mask, query_embed, pos_embed)¶
Forward function for Transformer.
- Parameters
x – Input query with shape [bs, c, h, w] where c = embed_dims.
mask – The key_padding_mask used for encoder and decoder, with shape [bs, h, w].
query_embed – The query embedding for decoder, with shape [num_query, c].
pos_embed – The positional encoding for encoder and decoder, with the same shape as x.
- Returns
A tuple containing the following tensors:
out_dec: Decoder output. If return_intermediate_dec is True, the output has shape [num_dec_layers, bs, num_query, embed_dims]; otherwise it has shape [1, bs, num_query, embed_dims].
memory: Output from the encoder, with shape [bs, embed_dims, h, w].
- Return type
tuple
- init_weights()¶
Initialize the transformer weights.
- class hat.models.task_modules.detr3d.head.Detr3dDecoder(num_layer: int = 6, **kwargs)¶
Detr3d decoder module.
- Parameters
num_layer – Number of layers.
- forward(query: torch.Tensor, value: torch.Tensor, query_pos: torch.Tensor, reference_points: torch.Tensor, masks: torch.Tensor) List[torch.Tensor] ¶
Forward pass of the module.
- Parameters
query – The query tensor.
value – The value tensor.
query_pos – The positional encoding of the query tensor.
reference_points – The reference points tensor.
masks – The masks tensor.
- Returns
The list of output tensors from each decoding layer.
- fuse_model() None ¶
Perform model fusion on the modules.
- class hat.models.task_modules.detr3d.head.Detr3dHead(transformer: torch.nn.modules.module.Module, num_query: int = 900, query_align: int = 8, embed_dims: int = 256, num_cls_fcs: int = 2, num_reg_fcs: int = 2, reg_out_channels: int = 10, cls_out_channels: int = 10, bev_range: Optional[Tuple[float]] = None, num_levels: int = 4, int8_output: bool = False, dequant_output: bool = True)¶
Detr3d Head module.
- Parameters
transformer – Transformer module for Detr3d.
num_query – Number of queries.
query_align – Alignment number for queries.
embed_dims – Embedding channels.
num_cls_fcs – Number of classification layers.
num_reg_fcs – Number of regression layers.
reg_out_channels – Number of regression output channels.
cls_out_channels – Number of classification output channels.
bev_range – BEV range.
num_levels – Number of levels for multiscale inputs.
int8_output – Whether the output is int8.
dequant_output – Whether to dequantize the output.
- build_res_list(feats: List[torch.Tensor]) Tuple[List[torch.Tensor], List[torch.Tensor]] ¶
Build the list of output tensors.
- Parameters
feats – The list of feature tensors.
reference_points – The reference points tensor.
- Returns
The list of output tensors for classification and regression branches.
- forward(feats: List[torch.Tensor], meta: Dict, compile_model: bool = False) List[torch.Tensor] ¶
Forward pass of the module.
- Parameters
feats – The feature tensor.
meta – The metadata dictionary.
compile_model – Whether the module runs in compile mode.
- Returns
The list of output tensors and the reference points tensor.
- fuse_model() None ¶
Perform model fusion on the modules.
- set_qconfig() None ¶
Set the quantization configuration.
- class hat.models.task_modules.detr3d.head.Detr3dTransformer(decoder: torch.nn.modules.module.Module, embed_dims: int = 256, num_views: int = 6, mode: str = 'bilinear', padding_mode: str = 'zeros', grid_quant_scales: Optional[List[float]] = None, homography: Optional[torch.Tensor] = None)¶
Detr3d Transformer module.
- Parameters
decoder – Decoder modules.
embed_dims – Embedding dims for output.
num_views – Number of views for input.
mode – Mode for grid sample.
padding_mode – Padding mode for grid sample.
grid_quant_scales – Quantization scales for grid sample.
homography – Homography for view transformation.
- forward(feats: List[torch.Tensor], query_embed: torch.Tensor, pos_embed: torch.Tensor, meta: Dict, bev_range: List[float], compile_model: bool) Tuple[torch.Tensor, torch.Tensor] ¶
Forward pass of the module.
- Parameters
feats – The feature tensor.
query_embed – The query embedding tensor.
pos_embed – The positional embedding tensor.
meta – The metadata dictionary.
bev_range – The BEV (Bird’s Eye View) range.
compile_model – A flag indicating whether to use pre-compiled homography matrix or use it from metadata.
- Returns
The output tensor and the reference points tensor.
- fuse_model() None ¶
Perform model fusion on the modules.
- init_weights() None ¶
Initialize the weights.
- set_qconfig() None ¶
Set the quantization configuration.
- class hat.models.task_modules.detr3d.post_process.Detr3dPostProcess(bev_range, max_num: int = 100, score_threshold: float = -1.0)¶
The Detr3d PostProcess.
- Parameters
max_num – Max number of output.
score_threshold – Score threshold for output.
- forward(cls_preds: torch.Tensor, reg_preds: torch.Tensor, reference_points: torch.Tensor) torch.Tensor ¶
Forward pass of the module.
- Parameters
cls_preds – The list of predicted classification tensors.
reg_preds – The list of predicted regression tensors.
- Returns
The list of decoded bounding box tensors.
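The roles of max_num and score_threshold can be sketched as a threshold followed by a top-k selection. This is an assumption about the behaviour, not a transcription of Detr3dPostProcess: the strict comparison, the flattening over classes, and the function name `select_detections` are all hypothetical, and box decoding against bev_range is left out.

```python
def select_detections(scores, max_num=100, score_threshold=-1.0):
    """Keep at most max_num (query, class, score) triples above a threshold.

    scores[q][c] is the score of class c for query q.
    """
    candidates = [
        (score, query, cls)
        for query, row in enumerate(scores)
        for cls, score in enumerate(row)
        if score > score_threshold
    ]
    # Highest scores first; the sort is stable for equal scores.
    candidates.sort(key=lambda item: item[0], reverse=True)
    return [(query, cls, score) for score, query, cls in candidates[:max_num]]
```

With the default threshold of -1.0, any score in [0, 1] passes, so only max_num limits the output.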
- class hat.models.task_modules.detr3d.target.Detr3dTarget(cls_cost: torch.nn.modules.module.Module, reg_cost: torch.nn.modules.module.Module, bev_range, num_classes: int = 10, bbox_weight: Optional[float] = None)¶
Generate detr3d targets.
- Parameters
cls_cost – Classification cost module.
reg_cost – Regression cost module.
num_classes – Number of classification categories.
bbox_weight – Weight for bbox meta.
- forward(label: torch.Tensor, cls_preds: torch.Tensor, reg_preds: torch.Tensor, reference_points: torch.Tensor) Tuple[Dict, Dict] ¶
Forward pass of the module.
- Parameters
label – The label tensor.
cls_preds – The predicted classification tensor.
reg_preds – The predicted regression tensor.
- Returns
Dictionaries containing the target values for the classification and regression branches.
- class hat.models.task_modules.fcn.decoder.FCNDecoder(upsample_output_scale: int = 8, use_bce: bool = False, bg_cls: int = 0, bg_threshold: float = 0.25)¶
FCN Decoder.
- Parameters
upsample_output_scale – Output upsample scale. Default: 8.
use_bce – Whether to use binary cross entropy. Default: False.
bg_cls – Background class id. Default: 0.
bg_threshold – Background threshold. Default: 0.25.
- forward(pred)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.fcn.head.DepthwiseSeparableFCNHead(in_channels, feat_channels, num_convs=1, **kwargs)¶
- class hat.models.task_modules.fcn.head.FCNHead(input_index: int, in_channels: int, feat_channels: int, num_classes: int, dropout_ratio: Optional[float] = 0.1, int8_output: Optional[bool] = False, argmax_output: Optional[bool] = False, dequant_output: Optional[bool] = True, upsample_output_scale: Optional[int] = None, num_convs: Optional[int] = 2, bn_kwargs: Optional[Dict] = None)¶
Head Module for FCN.
- Parameters
input_index – Index of inputs.
in_channels – Input channels.
feat_channels – Channels for the module.
num_classes – Number of classes.
dropout_ratio – Ratio for dropout during training. Default: 0.1.
int8_output – If True, output int8, otherwise output int32. Default: False.
argmax_output – Whether to apply argmax to the output. Default: False.
dequant_output – Whether to dequant output. Default: True.
upsample_output_scale – Output upsample scale. Default: None.
num_convs – Number of convs in the head. Default: 2.
bn_kwargs – Extra keyword arguments for bn layers. Default: None.
- forward(inputs: List[torch.Tensor])¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.fcn.target.FCNTarget(num_classes: Optional[int] = 19)¶
Generate Target for FCN.
- Parameters
num_classes – Number of classes. Default: 19.
- forward(label: torch.Tensor, pred: torch.Tensor) dict ¶
- Parameters
label – Data tensor with shape (n, h, w).
pred – Output tensor with shape (n, c, h, w).
- Returns
Loss inputs.
- Return type
dict
- class hat.models.task_modules.fcos.target.DynamicFcosTarget(strides: Sequence[int], topK: int, loss_cls: torch.nn.modules.module.Module, loss_reg: torch.nn.modules.module.Module, cls_out_channels: int, background_label: int, center_sampling: bool = False, center_sampling_radius: float = 2.5, bbox_relu: bool = False)¶
Generate cls and box training targets for FCOS based on simOTA label assignment strategy used in YOLO-X.
- Parameters
strides – Strides of points in multiple feature levels.
topK – Number of positive samples to keep for each ground truth.
cls_out_channels – Out_channels of cls_score.
background_label – Label ID of background, set as num_classes.
loss_cls – Loss for cls to choose positive target.
loss_reg – Loss for reg to choose positive target.
center_sampling – Whether to perform center sampling.
center_sampling_radius – The radius of the center sampling area.
bbox_relu – Whether to apply ReLU to bbox preds.
- forward(label, pred, *args)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
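The role of topK in SimOTA-style assignment can be sketched for a single ground-truth box as follows. The sketch follows the dynamic-k rule described for SimOTA in YOLOX (sum the topK largest IoUs, truncate to an integer, use at least 1) and may differ in detail from DynamicFcosTarget; the function name `simota_select` is hypothetical, and conflict resolution between ground truths is omitted.

```python
def simota_select(costs, ious, topk):
    """Pick positive candidates for one ground truth.

    costs[n] and ious[n] describe candidate point n against this ground truth.
    """
    # Dynamic k: integer part of the sum of the topk largest IoUs, at least 1.
    top_ious = sorted(ious, reverse=True)[:topk]
    dynamic_k = max(1, int(sum(top_ious)))
    # Candidates ranked by ascending cost; the cheapest dynamic_k are positive.
    ranked = sorted(range(len(costs)), key=lambda idx: costs[idx])
    return sorted(ranked[:dynamic_k])
```

A ground truth whose candidates overlap it well therefore receives more positive samples than one with poor overlaps.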
- class hat.models.task_modules.fcos.target.DynamicVehicleSideFcosTarget(strides: Sequence[int], topK: int, loss_cls: torch.nn.modules.module.Module, loss_reg: torch.nn.modules.module.Module, cls_out_channels: int, background_label: int, center_sampling: bool = False, center_sampling_radius: float = 2.5, bbox_relu: bool = False, decouple_h: bool = False)¶
Generate cls and box training targets for FCOS based on simOTA label assignment strategy used in YOLO-X.
- Parameters
strides – Strides of points in multiple feature levels.
topK – Number of positive samples to keep for each ground truth.
cls_out_channels – Out_channels of cls_score.
background_label – Label ID of background, set as num_classes.
loss_cls – Loss for cls to choose positive target.
loss_reg – Loss for reg to choose positive target.
center_sampling – Whether to perform center sampling.
center_sampling_radius – The radius of the center sampling area.
bbox_relu – Whether to apply ReLU to bbox preds.
decouple_h – Whether to decouple height when calculating targets.
- forward(label, pred, *args)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.fcos.target.FCOSTarget(strides: Tuple[int, ...], regress_ranges: Tuple[Tuple[int, int], ...], cls_out_channels: int, background_label: int, norm_on_bbox: bool = True, center_sampling: bool = True, center_sample_radius: float = 1.5, use_iou_replace_ctrness: bool = False, task_batch_list: Optional[List[int]] = None)¶
Generate cls and reg targets for FCOS in training stage.
- Parameters
strides – Strides of points in multiple feature levels.
regress_ranges – Regress range of multiple level points.
cls_out_channels – Out_channels of cls_score.
background_label – Label ID of background, set as num_classes.
center_sampling – If true, use center sampling.
center_sample_radius – Radius of center sampling. Default: 1.5.
norm_on_bbox – If true, normalize the regression targets with FPN strides.
use_iou_replace_ctrness – If true, use iou as box quality assessment method, else use ctrness. Default: false.
task_batch_list – Mask for different label source dataset.
- forward(label, pred, *args)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
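How regress_ranges, norm_on_bbox and centerness interact can be sketched for one feature-map point and one ground-truth box. The centerness formula is the one from the FCOS paper; the rest is a simplified sketch with a hypothetical function name, and it leaves out center sampling, the IoU-based quality option and multi-box disambiguation.

```python
import math


def fcos_point_target(point, box, stride, regress_range, norm_on_bbox=True):
    """Return ((l, t, r, b), centerness) or None if the point is negative.

    point is (x, y) in image coordinates; box is (x1, y1, x2, y2).
    """
    x, y = point
    x1, y1, x2, y2 = box
    left, top, right, bottom = x - x1, y - y1, x2 - x, y2 - y
    # The point must lie strictly inside the box.
    if min(left, top, right, bottom) <= 0:
        return None
    # The largest side distance must fall in this level's regress range.
    low, high = regress_range
    if not low <= max(left, top, right, bottom) <= high:
        return None
    centerness = math.sqrt(
        (min(left, right) / max(left, right))
        * (min(top, bottom) / max(top, bottom))
    )
    distances = (left, top, right, bottom)
    if norm_on_bbox:
        # Normalize the regression targets by the FPN stride.
        distances = tuple(d / stride for d in distances)
    return distances, centerness
```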
- class hat.models.task_modules.fcos.target.FCOSTarget4RPNHead(strides: Tuple[int, ...], regress_ranges: Tuple[Tuple[int, int], ...], cls_out_channels: int, background_label: int, norm_on_bbox: bool = True, center_sampling: bool = True, center_sample_radius: float = 1.5, use_iou_replace_ctrness: bool = False, soft_label: bool = False, task_batch_list: Optional[List[int]] = None, reference_anchor_width: int = 3, reference_anchor_height: int = 3)¶
Generate fcos-style cls and reg targets for RPNHead and HingeLoss.
- Parameters
strides – Strides of points in multiple feature levels.
regress_ranges – Regress range of multiple level points.
cls_out_channels – Out_channels of cls_score.
background_label – Label ID of background, set as num_classes.
center_sampling – If true, use center sampling.
center_sample_radius – Radius of center sampling. Default: 1.5.
norm_on_bbox – If true, normalize the regression targets with FPN strides.
use_iou_replace_ctrness – If true, use iou as box quality assessment method, else use ctrness. Default: false.
soft_label – If true, use IoU as the class ground truth.
task_batch_list – Mask for different label source dataset.
reference_anchor_width – The width of the corresponding anchor.
reference_anchor_height – The height of the corresponding anchor.
- class hat.models.task_modules.fcos.target.VehicleSideFCOSTarget(strides: Tuple[int, ...], regress_ranges: Tuple[Tuple[int, int], ...], cls_out_channels: int, background_label: int, norm_on_bbox: bool = True, center_sampling: bool = True, center_sample_radius: float = 1.5, use_iou_replace_ctrness: bool = False, task_batch_list: Optional[List[int]] = None, decouple_h: bool = False)¶
- forward(label, pred, *args)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.fcos.decoder.FCOSDecoder(num_classes: int, strides: Sequence[int], transforms: Optional[Sequence[dict]] = None, inverse_transform_key: Optional[Sequence[str]] = None, nms_use_centerness: bool = True, nms_sqrt: bool = True, test_cfg: Optional[dict] = None, input_resize_scale: Optional[Union[float, torch.Tensor]] = None, truncate_bbox: bool = True, filter_score_mul_centerness: bool = False, meta_data_bool: bool = True, label_offset: int = 0, upscale_bbox_pred: bool = False, bbox_relu: bool = False, to_cpu: bool = False)¶
- Parameters
num_classes – Number of categories excluding the background category.
strides – A list containing the strides of the fcos_head output.
transforms – A list containing the transform configs.
inverse_transform_key – A list containing the inverse transform info keys.
nms_use_centerness – If True, use centerness as a factor in NMS post-processing.
nms_sqrt – If True, sqrt(score_thr * score_factors).
test_cfg – Cfg dict, including some configurations of NMS.
input_resize_scale – The scale used to resize bboxes.
truncate_bbox – If True, truncate predicted bboxes that extend beyond the image boundary. Default True.
filter_score_mul_centerness – If True, filter out bboxes by score multiplied by centerness; otherwise filter out bboxes by score. Default False.
meta_data_bool – Whether to get shape info from meta data.
label_offset – Label offset.
upscale_bbox_pred – Whether to upscale bbox preds.
bbox_relu – Whether to apply ReLU to bbox preds.
- forward(pred: Sequence[torch.Tensor], meta_data: Dict[str, Any])¶
Do post process for model predictions.
- Parameters
pred – Prediction tensors.
meta_data – Meta data used in post processor, e.g. image width, height.
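The decoding step can be sketched for one point as below. This is a simplified illustration with hypothetical function names, not the FCOSDecoder code: it assumes that nms_sqrt takes the square root of the product of the class score and centerness, that upscale_bbox_pred multiplies the predicted distances by the stride, and that truncate_bbox clamps boxes to the image; NMS and inverse transforms are omitted.

```python
import math


def decode_box(point, ltrb, stride, upscale_bbox_pred=False, image_size=None):
    """Turn (left, top, right, bottom) distances at a point into (x1, y1, x2, y2).

    image_size is (width, height); when given, the box is clamped to it.
    """
    x, y = point
    left, top, right, bottom = ltrb
    if upscale_bbox_pred:
        left, top, right, bottom = (d * stride for d in (left, top, right, bottom))
    box = [x - left, y - top, x + right, y + bottom]
    if image_size is not None:
        width, height = image_size
        limits = (width, height, width, height)
        box = [min(max(v, 0), limit) for v, limit in zip(box, limits)]
    return box


def fused_score(cls_score, centerness, nms_use_centerness=True, nms_sqrt=True):
    """Combine the class score with centerness before NMS."""
    if not nms_use_centerness:
        return cls_score
    product = cls_score * centerness
    return math.sqrt(product) if nms_sqrt else product
```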
- class hat.models.task_modules.fcos.decoder.FCOSDecoder4RCNN(num_classes: int, strides: Sequence[int], input_shape: Tuple[int], nms_use_centerness: bool = True, nms_sqrt: bool = True, test_cfg: Optional[Dict] = None, input_resize_scale: Optional[Union[float, torch.Tensor]] = None)¶
Decoder for FCOS+RCNN Architecture.
- Parameters
num_classes – Number of categories excluding the background category.
strides – A list containing the strides of the fcos_head output.
input_shape – The shape of input_image.
nms_use_centerness – If True, use centerness as a factor in NMS post-processing.
nms_sqrt – If True, sqrt(score_thr * score_factors).
rescale – Whether to map the prediction result to the original image.
test_cfg – Cfg dict, including some configurations of NMS.
input_resize_scale – The scale used to resize bboxes.
- forward(pred: collections.OrderedDict)¶
Do post process for model predictions.
- Parameters
pred – Prediction tensors.
meta_data – Meta data used in post processor, e.g. image width, height.
- class hat.models.task_modules.fcos.decoder.FCOSDecoderWithConeInvasion(num_classes: int, strides: Sequence[int], transforms: Optional[Sequence[dict]] = None, inverse_transform_key: Optional[Sequence[str]] = None, nms_use_centerness: bool = True, nms_sqrt: bool = True, test_cfg: Optional[dict] = None, input_resize_scale: Optional[Union[float, torch.Tensor]] = None, truncate_bbox: bool = True, filter_score_mul_centerness: bool = False, meta_data_bool: bool = True, label_offset: int = 0, upscale_bbox_pred: bool = False, bbox_relu: bool = False)¶
- Parameters
num_classes – Number of categories excluding the background category.
strides – A list containing the strides of the fcos_head output.
transforms – A list containing the transform configs.
inverse_transform_key – A list containing the inverse transform info keys.
nms_use_centerness – If True, use centerness as a factor in NMS post-processing.
nms_sqrt – If True, sqrt(score_thr * score_factors).
test_cfg – Cfg dict, including some configurations of NMS.
input_resize_scale – The scale used to resize bboxes.
truncate_bbox – If True, truncate predicted bboxes that extend beyond the image boundary. Default True.
filter_score_mul_centerness – If True, filter out bboxes by score multiplied by centerness; otherwise filter out bboxes by score. Default False.
meta_data_bool – Whether to get shape info from meta data.
label_offset – Label offset.
upscale_bbox_pred – Whether to upscale bbox preds.
bbox_relu – Whether to apply ReLU to bbox preds.
- forward(pred: Sequence[torch.Tensor], meta_data: Dict[str, Any])¶
Do post process for model predictions.
- Parameters
pred – Prediction tensors.
meta_data – Meta data used in post processor, e.g. image width, height.
- class hat.models.task_modules.fcos.decoder.FCOSDocoderForFilter(**kwargs)¶
The basic structure of FCOSDocoderForFilter.
- Parameters
kwargs – Same as FCOSDecoder.
- forward(preds, meta_data)¶
Do post process for model predictions.
- Parameters
pred – Prediction tensors.
meta_data – Meta data used in post processor, e.g. image width, height.
- class hat.models.task_modules.fcos.decoder.FCOSDocoderForFilterHbir(**kwargs)¶
The basic structure of FCOSDocoderForFilterHbir.
- Parameters
kwargs – Same as FCOSDecoder.
- forward(outputs, meta_data)¶
Do post process for model predictions.
- Parameters
pred – Prediction tensors.
meta_data – Meta data used in post processor, e.g. image width, height.
- class hat.models.task_modules.fcos.decoder.VehicleSideFCOSDecoder(num_classes, strides, transforms=None, inverse_transform_key=None, nms_use_centerness=True, nms_sqrt=True, test_cfg=None, input_resize_scale=None, truncate_bbox=True, filter_score_mul_centerness=False, int8_output=True, decouple_h=False)¶
- Parameters
num_classes (int) – Number of categories excluding the background category.
strides (Sequence[int]) – A list containing the strides of the fcos_head output.
transforms (Sequence[dict]) – A list containing the transform configs.
inverse_transform_key (Sequence[str]) – A list containing the inverse transform info keys.
nms_use_centerness (bool, optional) – If True, use centerness as a factor in NMS post-processing.
nms_sqrt (bool, optional) – If True, sqrt(score_thr * score_factors).
test_cfg (dict, optional) – Cfg dict, including some configurations of NMS.
truncate_bbox (bool, optional) – If True, truncate predicted bboxes that extend beyond the image boundary. Default True.
filter_score_mul_centerness (bool, optional) – If True, filter out bboxes by score multiplied by centerness; otherwise filter out bboxes by score. Default False.
- forward(pred: Sequence[torch.Tensor], meta_data: Dict[str, Any])¶
Do post process for model predictions.
- Parameters
pred – Prediction tensors.
meta_data – Meta data used in post processor, e.g. image width, height.
- class hat.models.task_modules.fcos.fcos_loss.FCOSLoss(cls_loss: torch.nn.modules.module.Module, reg_loss: torch.nn.modules.module.Module, centerness_loss: Optional[torch.nn.modules.module.Module] = None)¶
FCOS loss wrapper.
- Parameters
losses (list) – Loss configs.
Note
This class is not universal. Make sure you understand its limitations before using it.
- forward(pred: Tuple, target: Tuple[Dict]) Dict ¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.fcos.fcos_loss.VehicleSideFCOSLoss(cls_loss: torch.nn.modules.module.Module, reg_bbox_loss: torch.nn.modules.module.Module, reg_alpha_loss: torch.nn.modules.module.Module, centerness_loss: torch.nn.modules.module.Module)¶
VehicleSide Task FCOS Loss wrapper.
- Parameters
cls_loss – Classification Loss.
reg_bbox_loss – Regression Loss for Vehicle Side BBox.
reg_alpha_loss – Regression Loss for Vehicle Side Alpha.
centerness_loss – FCOS Centerness Loss.
- forward(pred: Tuple, target: Tuple[Dict]) Dict ¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.fcos.filter.FCOSMultiStrideCatFilter(strides: Sequence[int], threshold: float, task_strides: Sequence[Sequence[int]], int16_output: bool = False, idx_range: Optional[Tuple[int, int]] = None)¶
A modified Filter used for post-processing of FCOS.
In each stride, concatenate the scores of each task as the first input of FilterModule, which can reduce latency in BPU.
- Parameters
strides (Sequence[int]) – A list containing the strides of feature maps.
idx_range (Optional[Tuple[int, int]], optional) – The index range of values in the first input that take part in the comparison. Defaults to None, which means all values are used.
threshold (float) – The lower bound of output.
task_strides (Sequence[Sequence[int]]) – A list of out_strides of each task.
- forward(preds: Sequence[torch.Tensor], **kwargs) Sequence[torch.Tensor] ¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.fcos.filter.FCOSMultiStrideCatFilterWithConeInvasion(strides: Sequence[int], threshold: float, task_strides: Sequence[Sequence[int]], int16_output: bool = False, idx_range: Optional[Tuple[int, int]] = None)¶
A modified Filter used for post-processing of FCOS with cone invasion.
In each stride, concatenate the scores of each task as the first input of FilterModule, which can reduce latency in BPU.
- Parameters
strides – A list containing the strides of feature maps.
idx_range – The index range of values in the first input that take part in the comparison. Defaults to None, which means all values are used.
threshold – The lower bound of output.
task_strides – A list of out_strides of each task.
- forward(preds: Sequence[torch.Tensor], **kwargs) Sequence[torch.Tensor] ¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.fcos.filter.FCOSMultiStrideFilter(strides: Sequence[int], threshold: float, idx_range: Optional[Tuple[int, int]] = None, for_compile: bool = False, decoder: Optional[torch.nn.modules.module.Module] = None)¶
Filter used for post-processing of FCOS.
- Parameters
strides – A list containing the strides of feature maps.
idx_range – The index range of values in the first input that take part in the comparison. Defaults to None, which means all values are used.
threshold – The lower bound of output.
for_compile – Whether the module is used for compilation. If True, postprocessing should not be included.
decoder – Decoder module.
- forward(preds: Sequence[torch.Tensor], meta_and_label: Optional[Dict] = None, **kwargs) Sequence[torch.Tensor] ¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
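The interplay of threshold and idx_range can be sketched as a per-position filter over one stride. This is an assumed reading of the parameter descriptions, written in plain Python with a hypothetical function name; the real module runs the filter on BPU-friendly tensors and its exact comparison and output layout may differ.

```python
def filter_positions(scores, threshold, idx_range=None):
    """Keep positions whose best score within idx_range exceeds threshold.

    scores[p] is the list of channel scores at flattened position p.
    Returns (position, channel, score) triples for the kept positions.
    """
    kept = []
    for position, channels in enumerate(scores):
        # idx_range=None means every channel takes part in the comparison.
        start, end = idx_range if idx_range is not None else (0, len(channels))
        window = channels[start:end]
        if not window:
            continue
        best = max(range(len(window)), key=lambda c: window[c])
        if window[best] > threshold:
            kept.append((position, start + best, window[best]))
    return kept
```

Filtering before decoding means that only the few surviving positions need box decoding and NMS.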
- class hat.models.task_modules.fcos.head.FCOSHead(num_classes: int, in_strides: Sequence[int], out_strides: Sequence[int], stride2channels: dict, upscale_bbox_pred: bool, feat_channels: int = 256, stacked_convs: int = 4, use_sigmoid: bool = True, share_bn: bool = False, dequant_output: bool = True, int8_output: bool = True, int16_output=False, nhwc_output=False, share_conv: bool = True, bbox_relu: bool = True, use_plain_conv: bool = False, use_gn: bool = False, use_scale: bool = False, add_stride: bool = False, output_dict: bool = False, set_all_int16_qconfig=False, pred_reg_channel: int = 4, skip_qtensor_check: bool = False, use_save_tensor: bool = True)¶
Anchor-free head used in FCOS <https://arxiv.org/abs/1904.01355>.
- Parameters
num_classes – Number of categories excluding the background category.
in_strides – A list containing the strides of feature maps from the backbone or neck.
out_strides – A list containing the strides that this head will output.
stride2channels – A stride to channel dict.
upscale_bbox_pred – If true, upscale bbox pred by FPN strides.
feat_channels – Number of hidden channels.
stacked_convs – Number of stacking convs of the head.
use_sigmoid – Whether the classification output is obtained using sigmoid.
share_bn – Whether to share bn between multiple levels. Default: False.
dequant_output – Whether to dequant output. Default: True
int8_output – If True, output int8, otherwise output int32. Default: True.
int16_output – If True, output int16, otherwise output int32. Default: False.
nhwc_output – Whether to transpose the output layout to nhwc.
share_conv – Whether the branches share convs. This can be True only when all strides have the same number of channels. Default: True.
bbox_relu – Whether to use ReLU for bbox. Default: True.
use_plain_conv – If True, use plain conv rather than depth-wise conv in some conv layers. This argument works when share_conv=True. Default: False.
use_gn – If True, use group normalization instead of batch normalization in some conv layers. This argument works when share_conv=True. Default: False.
use_scale – If True, add a scale layer to scale the predictions like what original FCOS does. This argument works when share_conv=True. Default: False.
add_stride – If True, add extra out_strides. Sometimes the out_strides is not a subset of in_strides, for example, the in_strides is [4, 8, 16, 32, 64] but the out_strides is [8, 16, 32, 64, 128], then we need to add an extra stride 128 in this head. This argument works when share_conv=True. Default: False.
skip_qtensor_check – if True, skip head qtensor check. The python grammar assert not support for TorchDynamo.
output_dict – If True, forward(self) will output a dict.
use_save_tensor – If true, turn off save tensor.
- forward(feats)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- forward_single(x, i, stride)¶
Forward features of a single scale level.
- Parameters
x (Tensor) – FPN feature maps of the specified stride.
i (int) – Index of feature level.
stride (int) – The corresponding stride for feature maps, only used to upscale bbox pred when self.upscale_bbox_pred is True.
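The role of stride in upscale_bbox_pred, and the add_stride case described in the parameter list, can be sketched as follows. This is a minimal illustration in plain Python, not HAT's implementation: the helper names `upscale_bbox_pred` and `extra_strides` are hypothetical, and multiplying the regression output by the stride is assumed from the parameter description.

```python
from typing import List, Sequence


def upscale_bbox_pred(bbox_pred: Sequence[float], stride: int) -> List[float]:
    """Scale box distances from feature-map units to input-image pixels."""
    return [value * stride for value in bbox_pred]


def extra_strides(in_strides: Sequence[int], out_strides: Sequence[int]) -> List[int]:
    """Return the output strides that have no matching input level."""
    return [stride for stride in out_strides if stride not in in_strides]


# A (left, top, right, bottom) prediction on the stride-8 level.
print(upscale_bbox_pred([1.0, 2.0, 0.5, 4.0], 8))
# The example from the add_stride description.
print(extra_strides([4, 8, 16, 32, 64], [8, 16, 32, 64, 128]))
```

When `extra_strides` is non-empty, the head has to create the missing levels itself, which is the situation add_stride=True addresses.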
- class hat.models.task_modules.fcos.head.FCOSHeadWithConeInvasion(num_classes: int, in_strides: Sequence[int], out_strides: Sequence[int], stride2channels: dict, upscale_bbox_pred: bool, upscale_invasion_scale: bool, feat_channels: int = 256, stacked_convs: int = 4, use_sigmoid: bool = True, share_bn: bool = False, dequant_output: bool = True, int8_output: bool = True, int16_output=False, nhwc_output=False, share_conv: bool = True, bbox_relu: bool = True, invasion_scale_relu: bool = True, use_plain_conv: bool = False, use_gn: bool = False, use_scale: bool = False, add_stride: bool = False, output_dict: bool = False, set_all_int16_qconfig=False, pred_reg_channel: int = 4, skip_qtensor_check: bool = False, use_save_tensor: bool = True)¶
Anchor-free head used in FCOS <https://arxiv.org/abs/1904.01355>.
- Parameters
num_classes – Number of categories excluding the background category.
in_strides – A list containing the strides of the feature maps from the backbone or neck.
out_strides – A list containing the strides that this head will output.
stride2channels – A dict mapping each stride to its channel count.
upscale_bbox_pred – If True, upscale the bbox predictions by the FPN strides.
feat_channels – Number of hidden channels.
stacked_convs – Number of stacked convs in the head.
use_sigmoid – Whether the classification output is obtained using sigmoid.
share_bn – Whether to share bn between multiple levels. Default: False.
dequant_output – Whether to dequantize the output. Default: True.
int8_output – If True, output int8, otherwise output int32. Default: True.
int16_output – If True, output int16, otherwise output int32. Default: False.
nhwc_output – Whether to transpose the output layout to NHWC. Default: False.
share_conv – If True, the branches share convs. This is only possible when all strides have the same number of channels. Default: True.
bbox_relu – Whether to apply relu to the bbox predictions. Default: True.
invasion_scale_relu – Whether to apply relu to the cone invasion scale. Default: True.
use_plain_conv – If True, use plain conv rather than depth-wise conv in some conv layers. This argument works when share_conv=True. Default: False.
use_gn – If True, use group normalization instead of batch normalization in some conv layers. This argument works when share_conv=True. Default: False.
use_scale – If True, add a scale layer to scale the predictions, as the original FCOS does. This argument works when share_conv=True. Default: False.
add_stride – If True, add extra out_strides. Sometimes out_strides is not a subset of in_strides; for example, when in_strides is [4, 8, 16, 32, 64] but out_strides is [8, 16, 32, 64, 128], an extra stride 128 must be added in this head. This argument works when share_conv=True. Default: False.
skip_qtensor_check – If True, skip the head qtensor check. The Python assert statement is not supported by TorchDynamo.
output_dict – If True, forward will output a dict.
use_save_tensor – If True, turn off save tensor.
- forward(feats)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- forward_single(x, i, stride)¶
Forward features of a single scale level.
- Parameters
x (Tensor) – FPN feature maps of the specified stride.
i (int) – Index of feature level.
stride (int) – The corresponding stride for feature maps, only used to upscale bbox pred when self.upscale_bbox_pred is True.
- class hat.models.task_modules.fcos.head.VehicleSideFCOSHead(num_classes, in_strides, out_strides, stride2channels, upscale_bbox_pred, feat_channels=256, stacked_convs=4, use_sigmoid=True, share_bn=False, dequant_output=True, int8_output=True, share_conv=True, enable_act=False, use_plain_conv=False, use_gn=False, use_scale=False, add_stride=False)¶
Anchor-free head used in FCOS <https://arxiv.org/abs/1904.01355>.
- Parameters
num_classes (int) – Number of categories excluding the background category.
in_strides (Sequence[int]) – A list containing the strides of the feature maps from the backbone or neck.
out_strides (Sequence[int]) – A list containing the strides that this head will output.
stride2channels (dict) – A dict mapping each stride to its channel count.
feat_channels (int) – Number of hidden channels.
stacked_convs (int) – Number of stacked convs in the head.
use_sigmoid (bool) – Whether the classification output is obtained using sigmoid.
share_bn (bool) – Whether to share bn between multiple levels. Default: False.
upscale_bbox_pred (bool) – If True, upscale the bbox predictions by the FPN strides.
dequant_output (bool) – Whether to dequantize the output. Default: True.
int8_output (bool) – If True, output int8, otherwise output int32. Default: True.
share_conv (bool) – If True, the branches share convs. This is only possible when all strides have the same number of channels. Default: True.
use_plain_conv – If True, use plain conv rather than depth-wise conv in some conv layers. This argument works when share_conv=True. Default: False.
use_gn – If True, use group normalization instead of batch normalization in some conv layers. This argument works when share_conv=True. Default: False.
use_scale – If True, add a scale layer to scale the predictions, as the original FCOS does. This argument works when share_conv=True. Default: False.
add_stride – If True, add extra out_strides. Sometimes out_strides is not a subset of in_strides; for example, when in_strides is [4, 8, 16, 32, 64] but out_strides is [8, 16, 32, 64, 128], an extra stride 128 must be added in this head. This argument works when share_conv=True. Default: False.
- forward(feats)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- forward_single(x, i, stride)¶
Forward features of a single scale level.
- Parameters
x (Tensor) – FPN feature maps of the specified stride.
i (int) – Index of feature level.
stride (int) – The corresponding stride for feature maps, only used to upscale bbox pred when self.upscale_bbox_pred is True.
- class hat.models.task_modules.fcos3d.bbox_coder.FCOS3DBBoxCoder(base_depths: Optional[Tuple[Tuple[float]]] = None, base_dims: Optional[Tuple[Tuple[float]]] = None, code_size: int = 7, norm_on_bbox: bool = True)¶
Bounding box coder for FCOS3D.
- Parameters
base_depths – Depth references for decoding the box depth. Defaults to None.
base_dims – Dimension references for decoding the box dimensions. Defaults to None.
code_size – The dimension of boxes to be encoded. Defaults to 7.
norm_on_bbox – Whether to apply normalization on the bounding box 2D attributes. Defaults to True.
- decode(bbox: torch.Tensor, scale: Tuple, stride: int, training: bool, cls_score: Optional[torch.Tensor] = None)¶
Decode regressed results into 3D predictions.
Note that offsets are not transformed to the projected 3D centers.
- Parameters
bbox – Raw bounding box predictions in shape [N, C, H, W].
scale – Learnable scale parameters.
stride – Stride for a specific feature level.
training – Whether the decoding is in the training procedure.
cls_score – Classification score map for deciding which base depth or dim is used. Defaults to None.
- Returns
Decoded boxes.
- Return type
torch.Tensor
- static decode_yaw(bbox: torch.Tensor, centers2d: torch.Tensor, dir_cls: torch.Tensor, dir_offset: float, cam2img: torch.Tensor)¶
Decode yaw angle and change it from local to global.
- Parameters
bbox – Bounding box predictions in shape [N, C] with yaws to be decoded.
centers2d – Projected 3D-center on the image planes corresponding to the box predictions.
dir_cls – Predicted direction classes.
dir_offset – Direction offset before dividing all the directions into several classes.
cam2img – Camera intrinsic matrix in shape [4, 4].
- Returns
Bounding boxes with decoded yaws.
- Return type
torch.Tensor
- class hat.models.task_modules.fcos3d.loss.FCOS3DLoss(num_classes: int, pred_attrs: False, num_attrs: int, group_reg_dims: Tuple[int], pred_velo: bool, use_direction_classifier: bool, dir_offset: float, dir_limit_offset: float, diff_rad_by_sin: bool, loss_cls: Dict, loss_bbox: Dict, loss_dir: Dict, loss_attr: Dict, loss_centerness: Dict, train_cfg: Dict)¶
Loss for FCOS3D.
- Parameters
num_classes – Number of categories excluding the background category.
pred_attrs – Whether to predict attributes. Defaults to False.
num_attrs – The number of attributes to be predicted. Default: 9.
group_reg_dims – The dimension of each regression target group. Default: (2, 1, 3, 1, 2).
pred_velo – Whether to predict velocity. Defaults to False.
use_direction_classifier – Whether to add a direction classifier.
dir_offset – Parameter used in direction classification. Defaults to 0.
dir_limit_offset – Parameter used in direction classification. Defaults to 0.
diff_rad_by_sin – Whether to change the difference into sin difference for box regression loss. Defaults to True.
loss_cls – Config of classification loss.
loss_bbox – Config of localization loss.
loss_dir – Config of direction classifier loss.
loss_attr – Config of attribute classifier loss, which is only active when pred_attrs=True.
loss_centerness – Config of centerness loss.
train_cfg – Training config of anchor head.
- static add_sin_difference(boxes1, boxes2)¶
Convert the rotation difference to difference in sine function.
- Parameters
boxes1 (torch.Tensor) – Original boxes in shape (NxC), where C>=7 and the 7th dimension is the rotation dimension.
boxes2 (torch.Tensor) – Target boxes in shape (NxC), where C>=7 and the 7th dimension is the rotation dimension.
- Returns
boxes1 and boxes2 whose 7th dimensions are changed.
- Return type
tuple[torch.Tensor]
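The conversion above relies on the identity sin(a - b) = sin(a)cos(b) - cos(a)sin(b). The following is a minimal sketch for a single pair of boxes stored as plain lists, assuming the rotation sits at index 6 (the 7th dimension). It illustrates the idea and is not HAT's tensor implementation; the `rot_index` argument is an assumption added for the example.

```python
import math
from typing import List, Tuple


def add_sin_difference(
    box1: List[float], box2: List[float], rot_index: int = 6
) -> Tuple[List[float], List[float]]:
    """Rewrite the rotation entries so their difference equals sin(r1 - r2)."""
    r1, r2 = box1[rot_index], box2[rot_index]
    out1, out2 = list(box1), list(box2)
    out1[rot_index] = math.sin(r1) * math.cos(r2)
    out2[rot_index] = math.cos(r1) * math.sin(r2)
    return out1, out2


pred = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.3]
target = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, -0.2]
new_pred, new_target = add_sin_difference(pred, target)
# The difference of the rewritten entries is sin(0.3 - (-0.2)).
print(new_pred[6] - new_target[6], math.sin(0.5))
```

A regression loss applied to the rewritten entries then penalizes sin(r1 - r2) rather than the raw angle difference, which avoids a large penalty when two yaws differ by a multiple of 2*PI.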
- forward(cls_scores, bbox_preds, dir_cls_preds, attr_preds, centernesses, labels_3d, bbox_targets_3d, centerness_targets, attr_targets)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
注解
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- static get_direction_target(reg_targets, dir_offset=0, dir_limit_offset=0.0, num_bins=2, one_hot=True)¶
Encode direction to 0 ~ num_bins-1.
- Parameters
reg_targets (torch.Tensor) – Bbox regression targets.
dir_offset (int, optional) – Direction offset. Default to 0.
dir_limit_offset (float, optional) – Offset to set the direction range. Default to 0.0.
num_bins (int, optional) – Number of bins to divide 2*PI. Default to 2.
one_hot (bool, optional) – Whether to encode as one hot. Default to True.
- Returns
Encoded direction targets.
- Return type
torch.Tensor
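One way to encode a yaw angle into a bin index in 0 ~ num_bins-1 is sketched below for a single scalar. The wrapping helper `limit_period` and the function name `direction_bin` are assumptions for illustration; the actual method operates on tensors and can additionally return a one-hot encoding.

```python
import math


def limit_period(value: float, offset: float, period: float) -> float:
    """Wrap value into [-offset * period, (1 - offset) * period)."""
    return value - math.floor(value / period + offset) * period


def direction_bin(
    yaw: float,
    dir_offset: float = 0.0,
    dir_limit_offset: float = 0.0,
    num_bins: int = 2,
) -> int:
    """Map a yaw angle to a bin index in 0 .. num_bins - 1."""
    wrapped = limit_period(yaw - dir_offset, dir_limit_offset, 2 * math.pi)
    index = math.floor(wrapped / (2 * math.pi / num_bins))
    # Clamp to guard against floating-point edge cases at the upper boundary.
    return min(max(index, 0), num_bins - 1)


for yaw in (0.5, -0.5, 4.0):
    print(yaw, direction_bin(yaw))
```

With the defaults, angles in [0, PI) fall into bin 0 and angles in [PI, 2*PI) into bin 1 after wrapping.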
- class hat.models.task_modules.fcos3d.post_process.FCOS3DPostProcess(num_classes: int, use_direction_classifier: bool, strides: Tuple[int], group_reg_dims: Tuple[int], pred_attrs: bool, num_attrs: int, attr_background_label: int, bbox_coder: Dict, bbox_code_size: int, dir_offset: float, test_cfg: Dict, pred_bbox2d: bool = False)¶
Post-process for FCOS3D.
- Parameters
num_classes – Number of categories excluding the background category.
use_direction_classifier – Whether to add a direction classifier.
strides – Downsample factor of each feature map.
group_reg_dims – The dimension of each regression target group. Default: (2, 1, 3, 1, 2).
pred_attrs – Whether to predict attributes. Defaults to False.
num_attrs – The number of attributes to be predicted. Default: 9.
attr_background_label – background label.
bbox_coder – bbox coder class.
bbox_code_size – Dimensions of predicted bounding boxes.
dir_offset – Parameter used in direction classification. Defaults to 0.
test_cfg – Testing config of anchor head.
pred_bbox2d – Whether to predict 2D boxes. Defaults to False.
- forward(cls_scores, bbox_preds, dir_cls_preds, attr_preds, centernesses, img_metas, cfg=None, rescale=None)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.fcos3d.target.FCOS3DTarget(num_classes: int, background_label: int, bbox_code_size: int, regress_ranges: Tuple[Tuple[int, int]], strides: Tuple[int], pred_attrs: bool, num_attrs: int, center_sampling: bool, center_sample_radius: float = 1.5, centerness_alpha: float = 2.5, norm_on_bbox: bool = True)¶
Generate cls/reg targets for FCOS3D in training stage.
- Parameters
num_classes – Number of categories excluding the background category.
background_label – Label ID of background.
bbox_code_size – Dimensions of predicted bounding boxes.
regress_ranges – Regress range of multiple level points.
strides – Downsample factor of each feature map.
pred_attrs – Whether to predict attributes.
num_attrs – The number of attributes to be predicted.
center_sampling – If true, use center sampling. Default: True.
center_sample_radius – Radius of center sampling. Default: 1.5.
centerness_alpha – Parameter used to adjust the intensity attenuation from the center to the periphery. Default: 2.5.
norm_on_bbox – If true, normalize the regression targets with FPN strides. Default: True.
- forward(cls_scores, bbox_preds, gt_bboxes_list, gt_labels_list, gt_bboxes_3d_list, gt_labels_3d_list, centers2d_list, depths_list, attr_labels_list)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
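The effect of centerness_alpha can be illustrated with an exponential decay of the target value as a location moves away from the projected object center. The sketch below assumes the form exp(-alpha * d), where d is the center offset normalized by sqrt(2) * stride, following the FCOS3D paper; whether FCOS3DTarget uses exactly this normalization is an assumption, and `centerness_target` is a hypothetical helper name.

```python
import math


def centerness_target(dx: float, dy: float, stride: int, alpha: float = 2.5) -> float:
    """Centerness in (0, 1] that decays with distance from the object center."""
    relative_dist = math.hypot(dx, dy) / (math.sqrt(2) * stride)
    return math.exp(-alpha * relative_dist)


for offset in (0.0, 2.0, 8.0):
    print(offset, centerness_target(offset, offset, stride=8))
```

A larger alpha makes the target fall off faster, so only locations close to the center receive a high centerness value.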
- class hat.models.task_modules.ganet.decoder.GaNetDecoder(root_thr: float = 1.0, kpt_thr: float = 0.4, cluster_thr: float = 4.0, downscale: int = 8, min_points: int = 10)¶
Decoder for ganet; converts the model output to prediction results in the original image.
- Parameters
root_thr – Threshold for selecting start points.
kpt_thr – Threshold of key points.
cluster_thr – Distance threshold of clustering point.
downscale – Down sampling scale for input data.
min_points – Minimum number of key points.
- forward(heat: torch.Tensor, offset: torch.Tensor, error: torch.Tensor, meta_data: Dict[str, Any])¶
Do post process for model predictions.
- Parameters
pred – Prediction tensors.
meta_data – Meta data used in post processor, e.g. image width, height.
- class hat.models.task_modules.ganet.head.GaNetHead(in_channel: int)¶
A basic head module of ganet.
- Parameters
in_channel – Number of channels in the input feature map.
- forward(x)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.ganet.losses.GaNetLoss(loss_kpts_cls: torch.nn.modules.module.Module, loss_pts_offset_reg: torch.nn.modules.module.Module, loss_int_offset_reg: torch.nn.modules.module.Module)¶
The loss module of GaNet.
- Parameters
loss_kpts_cls – Key points classification loss module.
loss_pts_offset_reg – Key points regression loss module.
loss_int_offset_reg – Int error of points regression loss module.
- forward(kpts_hm, pts_offset, int_offset, ganet_target)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.ganet.neck.GaNetNeck(fpn_module: torch.nn.modules.module.Module, attn_in_channels: List[int], attn_out_channels: List[int], attn_ratios: List[int], pos_shape: Tuple[int, int, int] = (1, 10, 25), num_feats: int = 3)¶
Neck for ganet.
- Parameters
fpn_module – FPN module for the ganet neck.
attn_in_channels – Channels of the attention layer input.
attn_out_channels – Channels of the attention layer output.
attn_ratios – ratios of channel in hidden layer of each attention layer.
pos_shape – Shape of pos embed.
num_feats – The number of feat map.
- forward(feats)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.ganet.target.GaNetTarget(hm_down_scale: int, radius: int = 2)¶
Target for ganet; generates the information used in training from the labels.
- Parameters
hm_down_scale – The downsample scale of the heatmap relative to the input data.
radius – Gaussian circle radius.
- forward(data)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
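How hm_down_scale and radius typically interact when building a keypoint heatmap can be sketched as follows. This is a generic Gaussian-splat illustration on nested lists, not GaNetTarget's implementation: the helper names are hypothetical, and the choice sigma = (2 * radius + 1) / 6 is an assumption borrowed from common keypoint-heatmap code.

```python
import math
from typing import List, Tuple


def keypoint_to_heatmap(x: float, y: float, hm_down_scale: int) -> Tuple[int, int]:
    """Map an image-space keypoint to integer heatmap coordinates."""
    return int(x / hm_down_scale), int(y / hm_down_scale)


def draw_gaussian(heatmap: List[List[float]], cx: int, cy: int, radius: int = 2) -> None:
    """Splat a Gaussian of the given radius onto heatmap in place."""
    sigma = (2 * radius + 1) / 6.0
    height, width = len(heatmap), len(heatmap[0])
    for y in range(max(0, cy - radius), min(height, cy + radius + 1)):
        for x in range(max(0, cx - radius), min(width, cx + radius + 1)):
            value = math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
            # Keep the maximum so overlapping keypoints do not erase each other.
            heatmap[y][x] = max(heatmap[y][x], value)


heatmap = [[0.0] * 10 for _ in range(6)]
cx, cy = keypoint_to_heatmap(40.0, 24.0, hm_down_scale=8)
draw_gaussian(heatmap, cx, cy, radius=2)
print(cx, cy, heatmap[cy][cx])
```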
- class hat.models.task_modules.lidar.anchor_generator.Anchor3DGeneratorStride(class_names: List[str], anchor_sizes: List[List[float]], anchor_strides: List[List[float]], anchor_offsets: List[List[float]], rotations: List[List[float]], match_thresholds: List[float], unmatch_thresholds: List[float], dtype: Any = torch.float32)¶
Lidar 3D Anchor Generator by stride.
- Parameters
anchor_sizes – 3D sizes of anchors.
anchor_strides – Strides of anchors.
anchor_offsets – Offsets of anchors.
rotations – Rotations of anchors in a feature grid.
class_names – Class names of data.
match_thresholds – Match thresholds of IoU.
unmatch_thresholds – Unmatch thresholds of IoU.
- property class_name¶
Class names of data.
- forward(feature_map_size, device)¶
Forward pass, generate anchors.
- Parameters
feature_map_size – Feature map size, (1, H, W).
device – device.
- Returns
Anchor list. Match thresholds of IoU. Unmatch thresholds of IoU.
- generate_anchors(feature_map_size, device=None)¶
Generate anchors.
- Parameters
feature_map_size – Feature map size, (1, H, W).
device – device.
- Returns
List of Anchors.
- property match_thresholds¶
Match thresholds of IoU.
- property num_anchors_per_localization¶
Get the number of anchors per location.
- property num_of_anchor_sets¶
Get number of anchor settings.
- property unmatch_thresholds¶
Unmatch thresholds of IoU.
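Generating anchors by stride amounts to placing one anchor center per feature-map cell at index * stride + offset. The sketch below enumerates such centers for a (D, H, W) feature map. It is an illustration under assumptions: the (x, y, z) ordering of strides and offsets and the helper name `anchor_centers` are not taken from HAT, and the real generator also combines each center with the configured sizes and rotations.

```python
from typing import List, Sequence, Tuple


def anchor_centers(
    feature_map_size: Sequence[int],
    anchor_stride: Sequence[float],
    anchor_offset: Sequence[float],
) -> List[Tuple[float, float, float]]:
    """Enumerate (x, y, z) anchor centers for a (D, H, W) feature map."""
    depth, height, width = feature_map_size
    sx, sy, sz = anchor_stride
    ox, oy, oz = anchor_offset
    return [
        (ix * sx + ox, iy * sy + oy, iz * sz + oz)
        for iz in range(depth)
        for iy in range(height)
        for ix in range(width)
    ]


centers = anchor_centers((1, 2, 3), (0.4, 0.4, 1.0), (0.2, -39.8, -1.0))
print(len(centers), centers[0])
```

Under this scheme, the number of anchors per location would be the number of sizes multiplied by the number of rotations configured for a class.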
- class hat.models.task_modules.lidar.box_coders.GroundBox3dCoder(linear_dim: bool = False, vec_encode: bool = False, n_dim: int = 7, norm_velo: bool = False)¶
Box3d Coder for Lidar.
- Parameters
linear_dim – Whether to smooth dimension. Defaults to False.
vec_encode – Whether to encode the angle as a vector. Defaults to False.
n_dim – dims of bbox3d. Defaults to 7.
norm_velo – Whether to normalize. Defaults to False.
- decode(box_encodings, anchors)¶
Box decode for lidar bbox.
- Parameters
boxes – normal boxes, shape [N, 7]: (x, y, z, w, l, h, r)
anchors – anchors, shape [N, 7]: (x, y, z, w, l, h, r)
- encode(boxes: torch.Tensor, anchors: torch.Tensor)¶
Box encode for Lidar boxes.
- Parameters
boxes – normal boxes, shape [N, 7]: x, y, z, l, w, h, r
anchors – anchors, shape [N, 7]: x, y, z, l, w, h, r
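A common residual encoding for 7-dimensional lidar boxes, used in SECOND and PointPillars style detectors, is sketched below for a single box in (x, y, z, l, w, h, r) order. Whether GroundBox3dCoder uses exactly this form is an assumption, and the options linear_dim, vec_encode and norm_velo are not modeled here.

```python
import math
from typing import List, Sequence


def encode(box: Sequence[float], anchor: Sequence[float]) -> List[float]:
    """Encode a box as residuals relative to an anchor."""
    xg, yg, zg, lg, wg, hg, rg = box
    xa, ya, za, la, wa, ha, ra = anchor
    diagonal = math.hypot(la, wa)
    return [
        (xg - xa) / diagonal,
        (yg - ya) / diagonal,
        (zg - za) / ha,
        math.log(lg / la),
        math.log(wg / wa),
        math.log(hg / ha),
        rg - ra,
    ]


def decode(encoding: Sequence[float], anchor: Sequence[float]) -> List[float]:
    """Invert encode: recover the box from residuals and the anchor."""
    xt, yt, zt, lt, wt, ht, rt = encoding
    xa, ya, za, la, wa, ha, ra = anchor
    diagonal = math.hypot(la, wa)
    return [
        xt * diagonal + xa,
        yt * diagonal + ya,
        zt * ha + za,
        math.exp(lt) * la,
        math.exp(wt) * wa,
        math.exp(ht) * ha,
        rt + ra,
    ]


anchor = [10.0, 2.0, -1.0, 3.9, 1.6, 1.56, 0.0]
box = [10.5, 1.8, -0.9, 4.2, 1.7, 1.5, 0.1]
print(decode(encode(box, anchor), anchor))
```

Normalizing the center offsets by the anchor diagonal and taking the log of the size ratios keeps the regression targets in a similar numeric range across object classes.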
- class hat.models.task_modules.lidar.pillar_encoder.PillarFeatureNet(num_input_features: int, num_filters: Tuple[int, ...] = (64,), with_distance: bool = False, voxel_size: Tuple[float, float, int] = (0.2, 0.2, 4), pc_range: Tuple[float, ...] = (0.0, - 40.0, - 3.0, 70.4, 40.0, 1.0), bn_kwargs: Optional[dict] = None, quantize: bool = False, use_4dim: bool = False, use_conv: bool = False, pool_size: Tuple[int, int] = (1, 1), normalize_xyz: bool = False, hw_reverse: bool = False)¶
- forward(features: torch.Tensor, num_voxels: Optional[torch.Tensor] = None, coors: Optional[torch.Tensor] = None, horizon_preprocess: bool = False)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.lidar.pillar_encoder.PointPillarScatter(num_input_features: int, use_horizon_pillar_scatter: bool = False, quantize=False, **kwargs)¶
- forward(voxel_features: torch.Tensor, coords: torch.Tensor, batch_size: int, input_shape: torch.Tensor)¶
Forward pass of the scatter module.
Note: batch_size has to be passed in separately because voxel features are concatenated along the M dimension. The number of voxels differs from frame to frame, so they cannot simply be stacked the way images are (CHW -> NCHW). Concatenating along M would otherwise require another tensor recording the number of voxels per frame, which in turn implies batch_size.
- Parameters
voxel_features (torch.Tensor) – MxC tensor of pillar features, where M is the number of pillars and C is each pillar’s feature dim.
coords (torch.Tensor) – Each pillar’s original BEV coordinate.
batch_size (int) – Batch size of the feature.
input_shape (torch.Tensor) – Shape of the expected BEV map, derived from the point-cloud range and voxel size.
- Returns
A BEV feature tensor with point features scattered on it.
- Return type
[torch.Tensor]
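The scatter step can be sketched with nested lists: each of the M pillar feature vectors is written to its cell of a zero-initialized BEV canvas. The coordinate layout (batch_idx, y, x) and the explicit height and width arguments are assumptions made for this illustration; the actual module works on tensors and takes input_shape instead.

```python
from typing import List, Sequence


def scatter_pillars(
    voxel_features: Sequence[Sequence[float]],
    coords: Sequence[Sequence[int]],
    batch_size: int,
    height: int,
    width: int,
) -> List[List[List[List[float]]]]:
    """Return a [batch_size][C][height][width] canvas filled from M pillars."""
    channels = len(voxel_features[0])
    canvas = [
        [[[0.0] * width for _ in range(height)] for _ in range(channels)]
        for _ in range(batch_size)
    ]
    for feature, (batch_idx, y, x) in zip(voxel_features, coords):
        for channel, value in enumerate(feature):
            canvas[batch_idx][channel][y][x] = value
    return canvas


features = [[1.0, 2.0], [3.0, 4.0]]   # M=2 pillars, C=2 channels
coords = [(0, 1, 2), (1, 0, 0)]       # (batch_idx, y, x) per pillar
canvas = scatter_pillars(features, coords, batch_size=2, height=2, width=3)
print(canvas[0][0][1][2], canvas[1][1][0][0])
```

Because the batch index travels with each pillar, the canvas can only be allocated once batch_size is known, which is why it is passed in explicitly.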
- class hat.models.task_modules.lidar.target_assigner.LidarTargetAssigner(box_coder: hat.models.task_modules.lidar.box_coders.GroundBox3dCoder, class_names: List[str], positive_fraction: Optional[int] = None, sample_size: int = 512)¶
TargetAssigner for Lidar.
- Parameters
box_coder – BoxCoder.
class_names – Class names.
positive_fraction – Positive fraction.
sample_size – Sample size.
- assign_per_class(classes_names, anchors_list, matched_thresholds, unmatched_thresholds, gt_boxes, gt_classes, gt_names)¶
Assign targets for each class.
- Parameters
classes_names – Class names.
anchors_list – List of anchors.
matched_thresholds – Match thresholds of IoU.
unmatched_thresholds – Unmatch thresholds of IoU.
gt_boxes – Ground truth boxes.
gt_classes – Ground truth classes.
gt_names – Names of ground truth.
- Returns
labels: Bbox classification label. bbox_targets: Bbox targets. reg_weights: Regression weights for each bbox.
- assign_targets(anchors_list: List[torch.Tensor], matched_thresholds: List[float], unmatched_thresholds: List[float], annos: Dict, device: Optional[Union[torch.device, str]] = None)¶
Generate targets.
- Parameters
anchors_list – List of anchors.
matched_thresholds – Match thresholds of IoU.
unmatched_thresholds – Unmatch thresholds of IoU.
annos – Annotations of ground truth.
device – The device on which the target will be generated.
- Returns
bbox_targets: BBox targets. cls_labels: Classification label for bbox. reg_weights: Regression weights for each bbox.
- property box_coder¶
3D boxCoder.
- property box_ndim¶
Dimension of box.
- create_targets_single(all_anchors: torch.Tensor, gt_boxes: torch.Tensor, similarity_fn: Callable, box_encoding_fn: Callable, gt_classes: Optional[torch.Tensor] = None, matched_threshold: float = 0.6, unmatched_threshold: float = 0.45, positive_fraction: Optional[float] = None, sample_size: int = 300, norm_by_num_examples: bool = False, box_code_size: int = 7)¶
Create targets.
- Parameters
all_anchors – [num_of_anchors, box_ndim] float tensor.
gt_boxes – [num_gt_boxes, box_ndim] float tensor.
similarity_fn – A function that accepts anchors and gt_boxes and returns a similarity matrix (such as IoU).
box_encoding_fn – A function that accepts gt_boxes and anchors and returns box encodings (offsets).
prune_anchor_fn – A function that accepts anchors and returns indices that indicate valid anchors.
gt_classes – [num_gt_boxes] int tensor indicating the gt classes; must start with 1.
matched_threshold – float. Anchors with IoU greater than matched_threshold are treated as positives.
unmatched_threshold – float. Anchors with IoU smaller than unmatched_threshold are treated as negatives.
positive_fraction – Float in [0, 1] or None. If not None, sampling tries to keep the ratio of pos/neg equal to positive_fraction. If there are not enough positives, the rest is filled with negatives.
sample_size – int. Sample size.
norm_by_num_examples – bool. Whether to normalize box_weight by the number of examples.
- Returns
box_cls_labels: Bbox classification label. bbox_reg_targets: Bbox regression targets. reg_weights: Regression weights for each bbox.
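The role of matched_threshold and unmatched_threshold can be sketched as a three-way split on each anchor's best IoU. This is a simplified illustration, not the method itself: `assign_labels` is a hypothetical helper, and using 0 for negatives and -1 for ignored anchors is an assumption, as is the omission of sampling and of any forced matching of each ground-truth box to its best anchor.

```python
from typing import List, Sequence


def assign_labels(
    max_ious: Sequence[float],
    matched_gt_classes: Sequence[int],
    matched_threshold: float = 0.6,
    unmatched_threshold: float = 0.45,
) -> List[int]:
    """Label each anchor: class id (>= 1) if positive, 0 if negative, -1 if ignored."""
    labels = []
    for iou, gt_class in zip(max_ious, matched_gt_classes):
        if iou > matched_threshold:
            labels.append(gt_class)
        elif iou < unmatched_threshold:
            labels.append(0)
        else:
            labels.append(-1)
    return labels


# Best IoU per anchor and the class of the ground-truth box it overlaps most.
print(assign_labels([0.7, 0.5, 0.1], [2, 1, 3]))
```

Anchors whose IoU falls between the two thresholds contribute to neither the positive nor the negative set, which is why ground-truth classes must start at 1.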
- forward(anchors_list: List[torch.Tensor], matched_thresholds: List[float], unmatched_thresholds: List[float], annos: Dict, device: Optional[Union[torch.device, str]] = None)¶
Forward pass, generate targets.
- Parameters
anchors_list – List of anchors.
matched_thresholds – Match thresholds of IoU.
unmatched_thresholds – Unmatch thresholds of IoU.
annos – Annotations of ground truth.
device – The device on which the target will be generated.
- Returns
bbox_targets: BBox targets. cls_labels: Classification label for bbox. reg_weights: Regression weights for each bbox.
- nearest_iou_similarity(boxes1, boxes2)¶
Compute the matrix of (negated) squared distances.
- Parameters
boxes1 – BoxList holding N boxes.
boxes2 – BoxList holding M boxes.
- Returns
A tensor with shape [N, M] representing negated pairwise squared distance.
- property num_anchors_per_location¶
Get number of anchors per location.
- class hat.models.task_modules.lidar_multitask.decoder.LidarDetDecoder(head: torch.nn.modules.module.Module, name: str, task_feat_index: int = 0, task_weight: float = 1.0, target: Optional[torch.nn.modules.module.Module] = None, loss: Optional[torch.nn.modules.module.Module] = None, decoder: Optional[torch.nn.modules.module.Module] = None)¶
Detection decoder structure of lidar.
- class hat.models.task_modules.lidar_multitask.decoder.LidarSegDecoder(feat_upscale: int = 1, **kwargs)¶
Segmentation decoder structure of lidar.
- Parameters
feat_upscale – Feature upscale factor. Defaults to 1.
**kwargs – Additional keyword arguments passed to the parent class.
- forward(feats, meta)¶
Forward pass through the LidarSegDecoder.
- Parameters
feats – Input features or sequence of features.
meta – Metadata.
- Returns
Predictions and additional results.
- class hat.models.task_modules.motion_forecasting.decoders.densetnt.head.Densetnt(in_channels: int = 128, hidden_size: int = 128, num_traj: int = 384, target_graph_depth: int = 2, pred_steps: int = 30, top_k: int = 150)¶
Implements the Densetnt head.
- Parameters
in_channels – input channels.
hidden_size – hidden_size.
num_traj – number of traj.
target_graph_depth – depth for traj decoder.
pred_steps – number of traj pred steps.
top_k – top k for candidates.
- forward(graph_feats: torch.Tensor, gobal_feats: torch.Tensor, traj_feats: torch.Tensor, lane_feats: torch.Tensor, instance_mask: torch.Tensor, data: Dict) Tuple[torch.Tensor, torch.Tensor, torch.Tensor] ¶
Perform forward pass.
- Parameters
graph_feats – Graph features.
gobal_feats – Global features.
traj_feats – Trajectory features.
lane_feats – Lane features.
instance_mask – Instance mask.
data – Data dictionary containing goals and goals mask.
- Returns
Tuple containing goals_preds, traj_preds, and pred_goals.
- set_qconfig() None ¶
Set the quantization configuration for the model.
- class hat.models.task_modules.motion_forecasting.decoders.densetnt.loss.DensetntLoss¶
Generate Densetnt loss.
- forward(goals_target: torch.Tensor, traj_target: torch.Tensor) Dict ¶
Compute the loss.
- Parameters
goals_target – Goals target containing goals_preds and goals_labels.
traj_target – Trajectory target containing traj_preds and traj_labels.
- Returns
Dictionary containing the goals_loss and traj_loss.
- class hat.models.task_modules.motion_forecasting.decoders.densetnt.post_process.DensetntPostprocess(threshold=2.0, pred_steps=30, mode_num=6)¶
Postprocess for densetnt.
- Parameters
threshold – threshold for nms.
pred_steps – steps for traj pred.
mode_num – number of mode.
- forward(goals_scores: torch.Tensor, traj_preds: torch.Tensor, pred_goals: torch.Tensor, data: Dict) Tuple[torch.Tensor, torch.Tensor] ¶
Perform forward pass.
- Parameters
goals_scores – Goals scores.
traj_preds – Trajectory predictions.
pred_goals – Predicted goals.
data – Data dictionary.
- Returns
Tuple containing the predicted trajectories and scores.
- select_goals_by_NMS(goals_scores: torch.Tensor, traj_preds: torch.Tensor, pred_goals: torch.Tensor) Tuple[torch.Tensor, torch.Tensor] ¶
Perform non-maximum suppression on predicted goals.
- Parameters
goals_scores – Predicted goals scores.
traj_preds – Predicted trajectories.
pred_goals – Predicted goals.
- Returns
Tuple containing the selected predicted trajectories and scores.
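Non-maximum suppression on goal points can be sketched as a greedy selection: visit goals from highest to lowest score and keep a goal only if it is at least threshold away from every goal already kept, stopping after mode_num goals. The version below works on plain Python sequences and returns indices. It is an illustration of the technique under these assumptions, not the method's tensor implementation, and the use of Euclidean distance is assumed.

```python
import math
from typing import List, Sequence, Tuple


def select_goals_by_nms(
    goals_scores: Sequence[float],
    pred_goals: Sequence[Tuple[float, float]],
    threshold: float = 2.0,
    mode_num: int = 6,
) -> List[int]:
    """Return indices of up to mode_num goals, chosen greedily by score."""
    order = sorted(range(len(goals_scores)), key=lambda i: goals_scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        if all(math.dist(pred_goals[i], pred_goals[j]) >= threshold for j in kept):
            kept.append(i)
        if len(kept) == mode_num:
            break
    return kept


scores = [0.9, 0.8, 0.7, 0.1]
goals = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (5.5, 0.0)]
print(select_goals_by_nms(scores, goals))
```

The kept indices can then be used to gather the matching trajectories and scores, so that the final modes are spatially diverse rather than clustered around one high-scoring goal.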
- class hat.models.task_modules.motion_forecasting.decoders.densetnt.target.DensetntTarget¶
Generate densetnt targets.
- forward(goals_preds: torch.Tensor, traj_preds: torch.Tensor, data: Dict) Tuple[torch.Tensor, torch.Tensor] ¶
Generate Densetnt targets.
- Parameters
goals_preds – Predicted goals.
traj_preds – Predicted trajectories.
data – Data dictionary.
- Returns
Tuple containing the goals target and trajectory target.
- class hat.models.task_modules.motion_forecasting.encoders.vectornet.Vectornet(depth: int = 3, traj_in_channels: int = 8, traj_num_vec: int = 9, lane_in_channels: int = 16, lane_num_vec: int = 19, hidden_size: int = 128)¶
Implements the vectornet encoder.
- Parameters
depth – Depth of the encoder layers.
traj_in_channels – Traj feat input channels.
traj_num_vec – Vector number of traj feat.
lane_in_channels – Lane feat input channels.
lane_num_vec – Vector number of lane feat.
hidden_size – hidden_size.
- forward(traj_feat: torch.Tensor, lane_feat: torch.Tensor, instance_mask: torch.Tensor) Tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor] ¶
Perform forward pass.
- Parameters
traj_feat – Trajectory features.
lane_feat – Lane features.
instance_mask – Instance mask.
- Returns
Tuple containing graph_feat, gobal_feat, traj_feat, lane_feat, instance_mask.
- set_qconfig() None ¶
Set the quantization configuration for the model.
- class hat.models.task_modules.motr.criterion.MotrCriterion(num_classes, num_dec_layers: int = 6, cost_class: float = 2.0, cost_bbox: float = 5.0, cost_giou: float = 2.0, cls_loss_coef: float = 2, bbox_loss_coef: float = 5, giou_loss_coef: float = 2, aux_loss: bool = True, max_frames_per_seq: int = 5)¶
Computes the loss for MOTR.
- Parameters
num_classes – number of object categories.
num_dec_layers – number of the decoder layers.
cost_class – weight of the classification error in the matching cost.
cost_bbox – weight of the L1 error of the bbox in the matching cost.
cost_giou – weight of the giou loss of the bbox in the matching cost.
cls_loss_coef – weight of the classification loss.
bbox_loss_coef – weight of the L1 loss of the bbox.
giou_loss_coef – weight of the giou loss of the bbox.
aux_loss – True if auxiliary decoding losses are to be used.
max_frames_per_seq – Maximum number of frames in a sequence.
- forward()¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- loss_boxes(outputs, gt_instances: List[Dict], indices: List[tuple], num_boxes)¶
Compute the losses related to the bounding boxes: the L1 regression loss and the GIoU loss. Target dicts must contain the key “gt_bboxes”, which holds a tensor of shape [nb_target_boxes, 4]. Target boxes are expected in the format (center_x, center_y, w, h), normalized by the image size.
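The two box terms can be illustrated for a single predicted/target pair in plain Python. This is a sketch under stated assumptions, not the HAT code: the helper names are hypothetical, boxes are plain tuples rather than tensors, and any averaging over num_boxes or weighting by bbox_loss_coef and giou_loss_coef is left out.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]


def cxcywh_to_xyxy(box: Sequence[float]) -> Box:
    """Convert (center_x, center_y, w, h) to (x1, y1, x2, y2)."""
    cx, cy, w, h = box
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def l1_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Sum of absolute coordinate differences."""
    return sum(abs(p - t) for p, t in zip(pred, target))


def giou(a: Box, b: Box) -> float:
    """Generalized IoU of two (x1, y1, x2, y2) boxes with positive area."""
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = area_a + area_b - inter
    # Smallest axis-aligned box enclosing both inputs.
    enclose = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    return inter / union - (enclose - union) / enclose


def box_losses(pred: Sequence[float], target: Sequence[float]) -> Tuple[float, float]:
    """Return (L1 loss, GIoU loss) for one pair of normalized cxcywh boxes."""
    l1 = l1_loss(pred, target)
    giou_loss = 1.0 - giou(cxcywh_to_xyxy(pred), cxcywh_to_xyxy(target))
    return l1, giou_loss


# Two side-by-side boxes that touch but do not overlap.
l1, g = box_losses((0.25, 0.5, 0.5, 1.0), (0.75, 0.5, 0.5, 1.0))
```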
- loss_labels(outputs, gt_instances: List[Dict], indices, num_boxes, log=False)¶
Classification loss (NLL).
- class hat.models.task_modules.motr.head.MotrHead(transformer: torch.nn.modules.module.Module, num_classes: int = 1, in_channels: int = 2048, max_per_img: int = 100)¶
Implements the MOTR head.
- Parameters
transformer – Transformer module.
num_classes – Number of categories excluding the background.
in_channels – Number of channels in the input feature maps.
max_per_img – Maximum number of objects in a single image.
- forward(feats, query_pos, ref_pts, mask_query)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.motr.motr_deformable_transformer.MotrDeformableTransformer(pos_embed: torch.nn.modules.module.Module, d_model: int = 256, nhead: int = 8, num_queries: int = 300, num_encoder_layers: int = 6, num_decoder_layers: int = 6, dim_feedforward: int = 1024, dropout: float = 0.1, return_intermediate_dec: bool = False, num_feature_levels: int = 1, enc_n_points: int = 4, dec_n_points: int = 4, extra_track_attn: bool = False)¶
Implements the MOTR deformable transformer.
- Parameters
pos_embed – The feature pos embed module.
d_model – The feature dimension.
nhead – Parallel attention heads.
num_queries – Number of queries.
num_encoder_layers – Number of TransformerEncoderLayer.
num_decoder_layers – Number of TransformerDecoderLayer.
dim_feedforward – The hidden dimension for FFNs used in both encoder and decoder.
dropout – Probability of an element to be zeroed. Default 0.1.
return_intermediate_dec – Whether to return the intermediate output from each TransformerDecoderLayer or only the last TransformerDecoderLayer. Default False.
num_feature_levels – Number of feature maps.
enc_n_points – Number of deformable attention points in the encoder.
dec_n_points – Number of deformable attention points in the decoder.
extra_track_attn – Whether to enable track attention.
- forward(srcs, query_embed, ref_pts, tgt_mask, track_mask)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.motr.post_process.MotrPostProcess(max_track: int = 256, area_threshold: int = 100, prob_threshold: float = 0.7, random_drop: float = 0.1, fp_ratio: float = 0.3, score_thresh: float = 0.7, filter_score_thresh: float = 0.6, miss_tolerance: int = 5)¶
- forward(track_instances, empty_track_instance, fake_track_instance, out_hs, outputs_classes_head, outputs_coords_head, criterion=None, targets=None, seq_data=None, frame_id=None, seq_frame_id=None, seq_name=None)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.motr.qim.QueryInteractionModule(dim_in, hidden_dim, dropout=0.0)¶
- forward(query_pos_all, output_embedding, query_mask)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.petr.head.PETRDecoder(num_layer: int = 6, **kwargs)¶
PETR decoder module.
- Parameters
num_layer – Number of layers.
- forward(query: torch.Tensor, key: torch.Tensor, value: torch.Tensor, query_pos: torch.Tensor, key_pos: torch.Tensor) List[torch.Tensor] ¶
Forward pass of the module.
- Parameters
query – The query tensor.
key – The key tensor.
value – The value tensor.
query_pos – The query positional tensor.
key_pos – The key positional tensor.
- Returns
The output tensors of each decoder layer.
- fuse_model() None ¶
Perform model fusion on the modules.
- set_qconfig() None ¶
Set the quantization configuration.
- class hat.models.task_modules.petr.head.PETRHead(transformer: torch.nn.modules.module.Module, num_query: int = 900, query_align: int = 8, embed_dims: int = 256, in_channels: int = 2048, num_cls_fcs: int = 2, num_reg_fcs: int = 2, reg_out_channels: int = 10, cls_out_channels: int = 10, position_range: Optional[Tuple[float]] = None, bev_range: Optional[Tuple[float]] = None, num_views: int = 6, depth_num: int = 64, depth_start: int = 1, positional_encoding: Optional[torch.nn.modules.module.Module] = None, int8_output: bool = False, dequant_output: bool = True)¶
PETR head module.
- Parameters
transformer – Transformer module for Detr3d.
num_query – Number of queries.
query_align – Alignment number for the queries.
embed_dims – Embedding channels.
in_channels – Input channels.
num_cls_fcs – Number of classification layers.
num_reg_fcs – Number of regression layers.
reg_out_channels – Number of regression output channels.
cls_out_channels – Number of classification output channels.
position_range – Position range.
bev_range – BEV range.
num_views – Number of input views.
depth_num – Maximum number of depth values.
depth_start – Start of the depth range.
positional_encoding – Positional encoding module.
int8_output – Whether the output is int8.
dequant_output – Whether to dequantize the output.
- export_reference_points(meta: Dict, feat_hw: Tuple[int, int])¶
Export the reference points.
- Parameters
meta – Additional metadata.
feat_hw – The feature height and width.
- Returns
A dictionary containing the position embeddings and reference points.
- forward(feats: List[torch.Tensor], meta: Dict, compile_model: bool = False) Tuple[torch.Tensor] ¶
Forward pass of the module.
- Parameters
feats – The list of feature tensors.
meta – The metadata dictionary.
compile_model – Whether the model is being compiled.
- Returns
The list of output results.
- fuse_model() None ¶
Perform model fusion on the modules.
- position_embeding(feat_hw: Tuple[int, int], meta: Dict) torch.Tensor ¶
Perform position embedding for the input feature map.
- Parameters
feat_hw – The height and width of the input feature map.
meta – A dictionary containing additional information, such as the shape of the image tensor.
- Returns
The position embedding tensor.
- set_calibration_qconfig()¶
Set the calibration quantization configuration.
- set_qconfig() None ¶
Set the quantization configuration.
- class hat.models.task_modules.petr.head.PETRTransformer(decoder: torch.nn.modules.module.Module)¶
PETR transformer module.
- Parameters
decoder – Decoder module for PETR.
- forward(feats: torch.Tensor, query_embed: torch.Tensor, pos_embed: torch.Tensor) torch.Tensor ¶
Forward pass of the module.
- Parameters
feats – The input feature tensor.
query_embed – The query embedding tensor.
pos_embed – The positional embedding tensor.
- Returns
The output tensor.
- fuse_model() None ¶
Perform model fusion on the modules.
- set_qconfig() None ¶
Set the quantization configuration.
- class hat.models.task_modules.pwcnet.head.PwcNetHead(in_channels: List[int], bn_kwargs: dict, use_bn: bool = False, md: int = 4, use_res: bool = True, use_dense: bool = True, flow_pred_lvl: int = 2, pyr_lvls: int = 6, bias: bool = True, act_type=None)¶
A basic head of PWCNet.
- Parameters
in_channels – Number of channels in the input feature map.
bn_kwargs – Dict for BN layer.
use_bn – Whether to use BN in module.
md – Search range of the correlation module.
use_res – Whether to use residual connections.
use_dense – Whether to use dense connections.
flow_pred_lvl – Which level to upsample to generate the final optical flow prediction.
pyr_lvls – Number of feature levels in the flow pyramid.
bias – Whether to use bias in module.
act_type – Activation layer.
- forward(features: List[List[torch.Tensor]]) List[torch.Tensor] ¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- init_weights()¶
Initialize the weights of head module.
- warp(x: torch.Tensor, up_flow: torch.Tensor, idx: int) torch.Tensor ¶
Warp an image/tensor (im2) back to im1, according to the optical flow.
- Parameters
x – [B, C, H, W] (im2)
up_flow – [B, 2, H, W] flow
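Backward warping with an optical flow can be sketched in plain Python on a single-channel image. This is an illustration under assumptions, not the HAT code: the flow is taken to be in pixel units, positions outside the image read as zero, the idx argument of the real method is not modeled, and bilinear_sample and warp_image are hypothetical helper names.

```python
import math
from typing import List

Image = List[List[float]]


def bilinear_sample(img: Image, x: float, y: float) -> float:
    """Sample img at (x, y); positions outside the image contribute 0."""
    h, w = len(img), len(img[0])
    x0, y0 = math.floor(x), math.floor(y)
    value = 0.0
    for yi, wy in ((y0, 1 - (y - y0)), (y0 + 1, y - y0)):
        for xi, wx in ((x0, 1 - (x - x0)), (x0 + 1, x - x0)):
            if 0 <= xi < w and 0 <= yi < h:
                value += wx * wy * img[yi][xi]
    return value


def warp_image(im2: Image, flow_x: Image, flow_y: Image) -> Image:
    """For each pixel of im1, read im2 at the position the flow points to."""
    h, w = len(im2), len(im2[0])
    return [
        [bilinear_sample(im2, x + flow_x[y][x], y + flow_y[y][x]) for x in range(w)]
        for y in range(h)
    ]


im2 = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]
one_pixel_right = [[1.0] * 3 for _ in range(2)]
no_vertical_motion = [[0.0] * 3 for _ in range(2)]
warped = warp_image(im2, one_pixel_right, no_vertical_motion)
```

With a uniform flow of one pixel to the right, each output pixel takes the value of its right-hand neighbour in im2, and the last column falls outside the image.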
- class hat.models.task_modules.pwcnet.neck.PwcNetNeck(out_channels: list, use_bn: bool, bn_kwargs: dict, bias: bool = True, pyr_lvls: int = 6, flow_pred_lvl: int = 2, act_type=None)¶
An extra feature module of PWCNet.
- Parameters
out_channels – Channels for each block.
use_bn – Whether to use BN in module.
bn_kwargs – Dict for BN layer.
bias – Whether to use bias in module.
pyr_lvls – Number of feature levels in the flow pyramid.
flow_pred_lvl – Which level to upsample to generate the final optical flow prediction.
act_type – Activation layer.
- forward(x)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- init_weights()¶
Initialize the weights of pwcnet module.
- class hat.models.task_modules.retinanet.filter.RetinanetMultiStrideFilter(strides: Sequence[int], threshold: float)¶
- forward(cls_scores, bbox_preds)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.retinanet.head.RetinaNetHead(num_classes: int, num_anchors: int, in_channels: int, feat_channels: int, stacked_convs: int = 4, int16_output: bool = True, dequant_output: bool = True)¶
An anchor-based head used in RetinaNet.
The head contains two subnetworks. The first classifies anchor boxes and the second regresses deltas for the anchors.
- Parameters
num_classes (int) – Number of categories excluding the background category.
num_anchors (int) – Number of anchors for each pixel.
in_channels (int) – Number of channels in the input feature map.
feat_channels (int) – Number of hidden channels.
stacked_convs (int) – Number of convs before cls and reg.
int16_output (bool) – If True, output int16, otherwise output int32. Default: True
- forward(features)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- forward_single(x)¶
Forward feature of a single scale level.
- Parameters
x (Tensor) – Feature of a single scale level.
- Returns
- cls_score (Tensor): Classification scores for a single scale level; the number of channels is num_anchors * num_classes.
- bbox_pred (Tensor): Box energies / deltas for a single scale level; the number of channels is num_anchors * 4.
- Return type
tuple
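The channel counts stated above can be checked with a small shape calculation. This is a sketch rather than HAT code: head_output_shapes is a hypothetical helper, and it assumes the head keeps the spatial size of the input feature map and uses an (N, C, H, W) layout.

```python
from typing import Tuple

Shape = Tuple[int, int, int, int]


def head_output_shapes(
    batch: int, height: int, width: int, num_anchors: int, num_classes: int
) -> Tuple[Shape, Shape]:
    """Return the assumed (N, C, H, W) shapes of cls_score and bbox_pred."""
    cls_score = (batch, num_anchors * num_classes, height, width)
    bbox_pred = (batch, num_anchors * 4, height, width)
    return cls_score, bbox_pred


cls_shape, box_shape = head_output_shapes(2, 32, 32, num_anchors=9, num_classes=80)
```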
- init_weights()¶
Initialize weights of the head.
- class hat.models.task_modules.retinanet.postprocess.RetinaNetPostProcess(score_thresh: float, nms_thresh: float, detections_per_img: int, topk_candidates: int = 1000)¶
The postprocess of RetinaNet.
- Parameters
score_thresh (float) – Filter out boxes whose score is lower than this.
nms_thresh (float) – Threshold for NMS.
detections_per_img (int) – Keep the top n boxes by score after NMS.
topk_candidates (int) – Keep the top n boxes by score after decoding.
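The order in which the four parameters apply can be sketched in plain Python. This is an illustration under assumptions, not the HAT implementation: boxes are already decoded to (x1, y1, x2, y2), NMS is class-agnostic and IoU-based, and the function names are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def postprocess(
    boxes: List[Box],
    scores: List[float],
    score_thresh: float,
    nms_thresh: float,
    detections_per_img: int,
    topk_candidates: int = 1000,
) -> List[int]:
    """Return the indices of the final detections, highest score first."""
    # 1. Drop boxes whose score is below the threshold.
    idx = [i for i, s in enumerate(scores) if s >= score_thresh]
    # 2. Keep the top-k candidates by score.
    idx.sort(key=lambda i: scores[i], reverse=True)
    idx = idx[:topk_candidates]
    # 3. Greedy NMS: skip a box that overlaps a kept box too much.
    kept: List[int] = []
    for i in idx:
        if all(iou(boxes[i], boxes[j]) <= nms_thresh for j in kept):
            kept.append(i)
    # 4. Cap the number of detections per image.
    return kept[:detections_per_img]


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 5, 5)]
scores = [0.9, 0.8, 0.7, 0.05]
final = postprocess(boxes, scores, score_thresh=0.1, nms_thresh=0.5, detections_per_img=10)
```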
- forward(boxes, preds, image_shapes)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.seg.decoder.SegDecoder(out_strides: List[int], decode_strides: List[int], upscale_times: Optional[List[int]] = None, transforms: Optional[List[dict]] = None, inverse_transform_key: Optional[List[str]] = None, output_names: Optional[str] = 'pred_seg')¶
Semantic Segmentation Decoder.
- Parameters
out_strides – List of output strides, represents the strides of the output from seg_head.
output_names – Keys of returned results dict.
decode_strides – Strides that need to be decoded, should be a subset of out_strides.
upscale_times – Bilinear upscale times for each decode stride. Defaults to None, which means the same as the decode stride.
transforms – A list containing the transform configs.
inverse_transform_key – A list containing the inverse transform info keys.
- class hat.models.task_modules.seg.decoder.VargNetSegDecoder(out_strides: List[int], input_padding: Sequence[int] = (0, 0, 0, 0))¶
Semantic Segmentation Decoder.
- Parameters
out_strides (list[int]) – List of output strides, represents the strides of the output from seg_head.
output_names (str or list[str]) – Keys of returned results dict.
decode_strides (int or list[int]) – Strides that need to be decoded, should be a subset of out_strides.
transforms (Sequence[dict]) – A list containing the transform configs.
inverse_transform_key (Sequence[str]) – A list containing the inverse transform info keys.
- forward(pred: Sequence[torch.Tensor])¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.seg.head.SegHead(num_classes, in_strides, out_strides, stride2channels, feat_channels=256, stride_loss_weights=None, stacked_convs=1, argmax_output=False, dequant_output=True, int8_output=True, upscale=False, upscale_stride=4, interpolation='bilinear', output_with_bn=False, bn_kwargs=None, upsample_output_scale=None, output_conf=False, only_export_first=False, with_relu6=False)¶
Head Module for segmentation task.
- Parameters
num_classes (int) – Number of classes.
in_strides (list[int]) – The strides corresponding to the inputs of seg_head, the inputs usually come from backbone or neck.
out_strides (list[int]) – List of output strides.
stride2channels (dict) – A stride to channel dict.
feat_channels (int or list[int]) – Number of hidden channels (of each output stride).
stride_loss_weights (list[int]) – loss weight of each stride.
stacked_convs (int) – Number of stacking convs of head.
argmax_output (bool) – Whether to conduct argmax on the output. Default: False
dequant_output (bool) – Whether to dequantize the output. Default: True
int8_output (bool) – If True, output int8, otherwise output int32. Default: True
upscale (bool) – If True, the feature map of stride{x} is upsampled by 2x, and the supervisory signal is applied to the upsampled feature. Default: False
upscale_stride (int) – Specifies which stride’s feature needs to be upsampled when upscale is True.
interpolation (str) – Interpolation method for image scaling; candidate values are [‘nearest’, ‘bilinear’].
output_with_bn (bool) – Whether to add a BN layer to the output conv.
bn_kwargs (dict) – Extra keyword arguments for BN layers.
upsample_output_scale (int) – Output upsample scale, only used in the QAT model. Default: None
with_relu6 (bool) – Whether to replace ReLU with ReLU6.
- forward(feats)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- forward_single(x, stride_index=0)¶
Forward features of a single scale level.
- Parameters
x (Tensor) – feature maps of the specified stride.
stride_index (int) – stride index of input feature map.
- Returns
Segmentation predictions for the input feature maps.
- Return type
tuple
- class hat.models.task_modules.seg.target.SegTarget(ignore_index: int = 255, label_name: str = 'gt_seg', interpolation: str = 'bilinear')¶
Generate training targets for Seg task.
- Parameters
ignore_index – Index of the ignore class.
label_name – The key corresponding to the gt seg in the label.
interpolation – Interpolation method for image scaling; candidate values are [‘nearest’, ‘bilinear’].
- class hat.models.task_modules.seg.utils.CoordConv(in_channels: int, out_channels: int, kernel_size: Union[int, Tuple[int, int]], stride: Union[int, Tuple[int, int]] = 1, padding: Union[int, Tuple[int, int]] = 0, dilation: Union[int, Tuple[int, int]] = 1, groups: int = 1, bias: bool = True, padding_mode: str = 'zeros')¶
Coordinate convolution (CoordConv). For more details, refer to https://arxiv.org/pdf/1807.03247.pdf.
- Parameters
Refer to torch.nn.Conv2d.
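The core idea of the referenced paper, appending coordinate channels before a regular convolution, can be sketched in plain Python. This is an illustration, not the HAT implementation: coord_channels and add_coords are hypothetical helpers, features are nested lists in (C, H, W) order, and scaling coordinates to [-1, 1] follows the paper rather than anything confirmed for this class.

```python
from typing import List, Tuple

Grid = List[List[float]]


def coord_channels(height: int, width: int) -> Tuple[Grid, Grid]:
    """Build x and y coordinate channels scaled to [-1, 1]."""

    def axis(n: int) -> List[float]:
        if n == 1:
            return [0.0]
        return [2.0 * i / (n - 1) - 1.0 for i in range(n)]

    xs, ys = axis(width), axis(height)
    x_channel = [list(xs) for _ in range(height)]
    y_channel = [[ys[i]] * width for i in range(height)]
    return x_channel, y_channel


def add_coords(feature: List[Grid]) -> List[Grid]:
    """Append the two coordinate channels to a (C, H, W) feature."""
    height, width = len(feature[0]), len(feature[0][0])
    x_channel, y_channel = coord_channels(height, width)
    return feature + [x_channel, y_channel]


x_ch, y_ch = coord_channels(2, 3)
augmented = add_coords([[[0.0] * 3 for _ in range(2)]])
```

In this sketch, the convolution that follows would see in_channels + 2 input channels.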
- forward(feats)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.seg.vargnet_seg_head.FRCNNSegHead(group_base: int, in_strides: List, in_channels: List, out_strides: List, out_channels: List, bn_kwargs: Dict, proj_channel_multiplier: float = 1.0, with_extra_conv: bool = False, use_bias: bool = True, linear_out: bool = True, argmax_output: bool = False, with_score: bool = False, rle_label: bool = False, dequant_output: bool = True, int8_output: bool = False, no_upscale_infer: bool = False)¶
FRCNNSegHead module for segmentation task.
- Parameters
group_base – Group base of the group conv.
in_strides – The strides corresponding to the inputs of seg_head, the inputs usually come from backbone or neck.
in_channels – Number of channels of each input stride.
out_strides – List of output strides.
out_channels – Number of channels of each output stride.
bn_kwargs – Extra keyword arguments for bn layers.
proj_channel_multiplier – Multiplier of channels of pw conv in block.
with_extra_conv – Whether to use extra conv module.
use_bias – Whether to use bias in conv module.
linear_out – Whether NOT to apply an activation after the pointwise (pw) conv.
argmax_output – Whether to conduct argmax on the output.
with_score – Whether to keep score in argmax operation.
rle_label – Whether to calculate rle representation of label output.
dequant_output – Whether to dequant output.
int8_output – If True, output int8, otherwise output int32.
no_upscale_infer – Load params from x2 scale if True.
- forward(x: List[torch.Tensor]) List[torch.Tensor] ¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.stereonet.head.StereoNetHead(maxdisp: int = 192, refine_levels: int = 4, bn_kwargs: Optional[Dict] = None, num_groups: int = 32)¶
A basic head of StereoNet.
- Parameters
maxdisp – The max value of disparity.
refine_levels – Number of refinement layers.
bn_kwargs – Dict for BN layer.
num_groups – Number of groups for the cost volume.
- build_concat_volume(refimg_fea: torch.Tensor, targetimg_fea: torch.Tensor, maxdisp: int) torch.Tensor ¶
Build the concat cost volume.
- Parameters
refimg_fea – Left image feature.
targetimg_fea – Right image feature.
maxdisp – Maximum disparity value.
- Returns
volume: Concatenated cost volume.
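A concatenation cost volume can be sketched for one image row in plain Python. This is an illustration of the general technique under assumptions, not the HAT code: features are lists of per-pixel channel vectors, positions with no valid match are zero-filled, and build_concat_volume_row is a hypothetical name.

```python
from typing import List

Feature = List[float]  # channel vector at one pixel


def build_concat_volume_row(
    left: List[Feature], right: List[Feature], maxdisp: int
) -> List[List[Feature]]:
    """Return volume[d][x] = left[x] concatenated with right[x - d]."""
    channels = len(left[0])
    width = len(left)
    volume: List[List[Feature]] = []
    for d in range(maxdisp):
        row: List[Feature] = []
        for x in range(width):
            if x >= d:
                # Pixel x in the left image pairs with pixel x - d in the right image.
                row.append(left[x] + right[x - d])
            else:
                # No right-image pixel exists at this disparity.
                row.append([0.0] * (2 * channels))
        volume.append(row)
    return volume


left = [[1.0], [2.0], [3.0]]
right = [[10.0], [20.0], [30.0]]
volume = build_concat_volume_row(left, right, maxdisp=2)
```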
- build_gwc_volume(refimg_fea: torch.Tensor, targetimg_fea: torch.Tensor, maxdisp: int, num_groups: int)¶
Build the cost volume using the same approach as GWC-Net.
- Parameters
refimg_fea – Left image feature.
targetimg_fea – Right image feature.
maxdisp – Maximum disparity value.
num_groups – Number of groups for groupwise correlation.
- Returns
volume: Groupwise correlation cost volume.
- dis_mul(x: torch.Tensor) torch.Tensor ¶
Mul weight to the disparity.
- dis_sum(x: torch.Tensor) torch.Tensor ¶
Get the low disparity.
- forward(features: List[torch.Tensor]) List[torch.Tensor] ¶
Perform the forward pass of the model.
- Parameters
features – The input feature maps.
- Returns
pred_pyramid_list: The normalized predictions of the model.
- fuse_model() None ¶
Perform model fusion on the specified modules within the class.
- groupwise_correlation(d: int, fea1: torch.Tensor, fea2: torch.Tensor, num_groups: int) torch.Tensor ¶
Compute groupwise correlation using the same approach as GWC-Net.
- Parameters
d – Index of the FloatFunctional.
fea1 – Left image feature map.
fea2 – Right image feature map.
num_groups – Number of groups for groupwise correlation.
- Returns
cost_new: Groupwise correlation result.
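Groupwise correlation in the style of GWC-Net can be sketched for a single pixel in plain Python. This is an illustration, not the HAT code: features are plain channel lists, the FloatFunctional index d of the real method is not modeled, and groupwise_correlation_pixel is a hypothetical name.

```python
from typing import List


def groupwise_correlation_pixel(
    fea1: List[float], fea2: List[float], num_groups: int
) -> List[float]:
    """Mean of the channel-wise products within each channel group."""
    channels = len(fea1)
    if channels % num_groups != 0:
        raise ValueError("channels must be divisible by num_groups")
    per_group = channels // num_groups
    result: List[float] = []
    for g in range(num_groups):
        start = g * per_group
        products = [fea1[c] * fea2[c] for c in range(start, start + per_group)]
        result.append(sum(products) / per_group)
    return result


cost = groupwise_correlation_pixel([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 2.0, 2.0], num_groups=2)
```

Each group yields one correlation value, so the output has num_groups channels per pixel.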
- init_weights() None ¶
Initialize the weights of head module.
- set_qconfig() None ¶
Set the quantization configuration.
- class hat.models.task_modules.stereonet.headplus.StereoNetHeadPlus(maxdisp: int = 192, refine_levels: int = 4, bn_kwargs: Optional[Dict] = None, max_stride: int = 32, num_costvolume: int = 3, num_fusion: int = 6, hidden_dim: int = 16, in_channels: List[int] = (32, 32, 16, 16, 16))¶
An advanced head for StereoNet.
- Parameters
maxdisp – The max value of disparity.
refine_levels – Number of refinement layers.
bn_kwargs – Dict for BN layer.
max_stride – The max stride for model input.
num_costvolume – Number of pyramid cost volumes.
num_fusion – Number of fusion modules.
hidden_dim – The hidden dim.
in_channels – The channels of input features.
- build_aanet_volume(refimg_fea, maxdisp, offset, idx)¶
Build the cost volume using the same approach as AANet.
- Parameters
refimg_fea – Feature maps.
maxdisp – Maximum disparity value.
offset – The offset of the gc_mul and gc_mean FloatFunctional.
idx – The index of the cat FloatFunctional.
- Returns
volume: Cost volume.
- dis_mul(x: torch.Tensor) torch.Tensor ¶
Mul weight to the disparity.
- dis_sum(x: torch.Tensor) torch.Tensor ¶
Get the low disparity.
- forward(features_inputs: List[torch.Tensor]) List[torch.Tensor] ¶
Perform the forward pass of the model.
- Parameters
features_inputs – The input feature maps.
- Returns
pred0: The low disparity.
pred0_unfold: The low disparity after unfold.
spx_pred: The weight of each point.
- get_l_img(img: torch.Tensor, B: int) torch.Tensor ¶
Get the left feature maps.
- Parameters
img – The input feature maps.
B – Batch size.
- get_offset(offset: int, idx: int) int ¶
Get offset of floatFunctional.
- set_qconfig() None ¶
Set the quantization configuration.
- class hat.models.task_modules.stereonet.neck.StereoNetNeck(out_channels: List, use_bn: bool = True, bn_kwargs: Optional[Dict] = None, bias: bool = False, act_type: Optional[torch.nn.modules.module.Module] = None)¶
An extra feature module of StereoNet.
- Parameters
out_channels – Channels for each block.
use_bn – Whether to use BN in module.
bn_kwargs – Dict for BN layer.
bias – Whether to use bias in module.
act_type – Activation layer.
- forward(imgs: torch.Tensor) List[torch.Tensor] ¶
Perform the forward pass of the model.
- Parameters
imgs – The input images.
- Returns
gwc_feature_left: The gwc features of the left image.
gwc_feature_right: The gwc features of the right image.
concat_feature_left: The concat features of the left image.
concat_feature_right: The concat features of the right image.
imgl: The left image.
- fuse_model() None ¶
Perform model fusion on the specified modules within the class.
- init_weights() None ¶
Initialize the weights of stereonet module.
- set_qconfig() None ¶
Set the quantization configuration.
- class hat.models.task_modules.stereonet.post_process.StereoNetPostProcess(maxdisp: int = 192)¶
A basic post process for StereoNet.
- Parameters
maxdisp – The max value of disparity.
- forward(pred_disps: List[torch.Tensor], gt_disps: Optional[List[torch.Tensor]] = None) Union[torch.Tensor, List[torch.Tensor]] ¶
Perform the forward pass of the model.
- Parameters
pred_disps – The model outputs.
gt_disps – The ground-truth disparities.
- Returns
pred_disps: The predicted disparities.
- class hat.models.task_modules.stereonet.post_process.StereoNetPostProcessPlus(maxdisp: int = 192, low_max_stride: int = 8)¶
An advanced post process for StereoNet.
- Parameters
maxdisp – The max value of disparity.
low_max_stride – The max stride of lowest disparity.
- forward(modelouts: List[torch.Tensor], gt_disps: Optional[List[torch.Tensor]] = None) Union[torch.Tensor, List[torch.Tensor]] ¶
Perform the forward pass of the model.
- Parameters
modelouts – The model outputs.
gt_disps – The ground-truth disparities.
- class hat.models.task_modules.view_fusion.view_transformer.GKTTransformer(kernel_size: Tuple[float] = (3, 3), embed_dims: int = 160, grid_size: Optional[Tuple[float]] = None, **kwargs)¶
The GKT view transform for converting image view to bev view.
- Parameters
kernel_size – Kernel size for points.
embed_dims – Dims for transformer.
- class hat.models.task_modules.view_fusion.view_transformer.LSSTransformer(in_channels: int, feat_channels: int, z_range: Tuple[float] = (- 10.0, 10.0), num_points: int = 10, depth: int = 60, mode: str = 'bilinear', padding_mode: str = 'zeros', depth_grid_quant_scale: float = 0.001953125, **kwargs)¶
The Lift-Splat-Shoot view transform for converting image view to bev view.
- Parameters
in_channels – Number of input feature channels.
feat_channels – Number of feature channels for the lift step.
z_range – The range of Z in BEV coordinates.
num_points – Number of points for each voxel.
depth – Depth value.
mode – Mode for grid sample.
padding_mode – Padding mode for grid sample.
depth_grid_quant_scale – Quantization scale for the depth grid sample.
- fuse_model()¶
Perform model fusion on the modules.
- gen_reference_point(meta: Dict, feats: List[torch.Tensor]) Any ¶
Generate reference points.
- Parameters
meta – A dictionary containing the input data.
feats – The input for reference point generator.
- Returns
The reference points.
- set_qconfig() None ¶
Set the quantization configuration.
- class hat.models.task_modules.view_fusion.view_transformer.WrappingTransformer(**kwargs)¶
The IPM view transform for converting image view to bev view.
- class hat.models.task_modules.view_fusion.cft_transformer.CFTAuxHead(out_size_factor: int = 1, min_radius: int = 3, upscale=4.0, loss: Optional[torch.nn.modules.module.Module] = None)¶
Auxiliary head module for the CFTTransformer.
- Parameters
out_size_factor – Output size factor.
min_radius – Minimum radius of the heatmaps.
upscale – Upscale factor for resizing the features.
loss – Loss function.
- forward(feat: torch.Tensor, meta: Dict) Dict ¶
Forward pass of the CFTAuxHead.
- Parameters
feat – Input feature tensor.
meta – Dictionary containing the input metadata.
- Returns
Dictionary containing the loss value.
- get_targets_single(feat: torch.Tensor, gt_bboxes: List[numpy.array]) torch.Tensor ¶
Compute the heatmap target for a single feature.
- Parameters
feat – Input feature tensor.
gt_bboxes – List of ground truth bounding boxes.
- Returns
heatmap: Heatmap tensor.
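A Gaussian heatmap target of the kind commonly used for center-based detection can be sketched in plain Python. This is an illustration only, not the HAT method: how gt_bboxes are mapped to centers and radii is not stated here, so centers and radii are taken as given in feature-map cells, and draw_gaussian, heatmap_target, and the sigma choice are assumptions.

```python
import math
from typing import List, Tuple

Heatmap = List[List[float]]


def draw_gaussian(heatmap: Heatmap, center: Tuple[int, int], radius: int) -> None:
    """Splat a Gaussian peak at center, keeping the per-cell maximum."""
    cx, cy = center
    sigma = (2 * radius + 1) / 6.0  # assumed spread: diameter / 6
    height, width = len(heatmap), len(heatmap[0])
    for y in range(max(0, cy - radius), min(height, cy + radius + 1)):
        for x in range(max(0, cx - radius), min(width, cx + radius + 1)):
            value = math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma * sigma))
            heatmap[y][x] = max(heatmap[y][x], value)


def heatmap_target(
    height: int,
    width: int,
    centers: List[Tuple[int, int]],
    radii: List[int],
    min_radius: int = 3,
) -> Heatmap:
    """Build a heatmap with one Gaussian per object center."""
    heatmap = [[0.0] * width for _ in range(height)]
    for center, radius in zip(centers, radii):
        # The radius never drops below min_radius.
        draw_gaussian(heatmap, center, max(min_radius, radius))
    return heatmap


hm = heatmap_target(9, 9, centers=[(4, 4)], radii=[1])
```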
- set_qconfig() None ¶
Set the quantization configuration for the model.
- class hat.models.task_modules.view_fusion.cft_transformer.CFTTransformer(embed_dims: int = 256, position_range: Optional[List[float]] = None, num_heads: int = 8, feedforward_channels: int = 2048, dropout: float = 0.1, encoder_layers: int = 1, decoder_layers: int = 2, num_pos: int = 16, **kwargs)¶
Cross-View Fusion Transformer model for computer vision tasks.
- Parameters
embed_dims – Embedding dimensions.
position_range – Range of position values.
num_heads – Number of attention heads.
feedforward_channels – Number of channels in the feedforward layers.
dropout – Dropout rate.
encoder_layers – Number of encoder layers.
decoder_layers – Number of decoder layers.
num_pos – Number of positions to embed.
**kwargs – Additional keyword arguments for the parent class.
- export_reference_points(meta: Dict, feat_hw: Tuple[int, int]) Dict ¶
Export reference points.
- Parameters
meta – A dictionary containing the input data.
feat_hw – View transformer input shape for generating reference points.
- Returns
The reference points.
- forward(feats: torch.Tensor, data: torch.Tensor, compile_model: bool) Tuple[torch.Tensor, torch.Tensor] ¶
Forward pass of the CFTTransformer.
- Parameters
feats – Input feature tensor.
data – Dictionary containing the input data.
compile_model – Flag indicating whether the model is being compiled.
- Returns
feats: Output feature tensor.
ref_h: Reference height tensor.
- set_qconfig() torch.Tensor ¶
Set the quantization configuration for the model.
- class hat.models.task_modules.view_fusion.decoder.BevDetDecoder(loss_cls: Optional[torch.nn.modules.module.Module] = None, loss_reg: Optional[torch.nn.modules.module.Module] = None, **kwargs)¶
The detection decoder structure of bev.
- Parameters
loss_cls – Classification loss module.
loss_reg – Regression loss module.
- class hat.models.task_modules.view_fusion.decoder.BevDetDecoderInfer(tasks, task_keys, **kwargs)¶
The basic structure of BevDetDecoderInfer.
- Parameters
tasks – The tasks for inference.
task_keys – The task keys for inference.
- forward(preds, meta)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.view_fusion.decoder.BevSegDecoder(loss: Optional[torch.nn.modules.module.Module] = None, use_bce: bool = False, **kwargs)¶
The segmentation decoder structure of bev.
- Parameters
loss – Loss module.
use_bce – Whether to use binary cross entropy.
- class hat.models.task_modules.view_fusion.decoder.BevSegDecoderInfer(name: str, decoder: Optional[torch.nn.modules.module.Module] = None)¶
- forward(pred, meta)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.view_fusion.encoder.BevEncoder(backbone: torch.nn.modules.module.Module, neck: Optional[torch.nn.modules.module.Module] = None)¶
The basic encoder structure of bev.
- Parameters
backbone – Backbone module.
neck – Neck module.
- forward(feat: torch.Tensor, meta: Dict) torch.Tensor ¶
Perform the forward pass through the model’s backbone and neck.
- Parameters
feat – The input feature.
meta – The meta information.
- Returns
feat: The output feature after passing through the backbone and neck.
- Return type
torch.Tensor
- fuse_model() None ¶
Perform model fusion on the backbone and neck modules.
- set_qconfig() None ¶
Set the quantization configuration (qconfig).
- class hat.models.task_modules.view_fusion.encoder.VargBevBackbone(**kwargs)¶
The bev Backbone using varg block.
- class hat.models.task_modules.view_fusion.temporal_fusion.AddTemporalFusion(**kwargs)¶
Simple Add Temporal fusion for bev feats.
- class hat.models.task_modules.yolo.anchor.YOLOV3AnchorGenerator(anchors: List, strides: List, image_size: List)¶
Anchors generator for yolov3.
- Parameters
anchors (List) – List of anchor sizes.
strides (List) – Strides of the feature maps for the anchors.
image_size (List) – Input size of the image.
- forward(x)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
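As a rough illustration of how the anchors, strides, and image_size arguments above relate to each other, the following minimal sketch lays out one anchor of each size at every cell of every feature map. It uses plain Python rather than torch, and the function name `make_anchors`, the (H, W) ordering of image_size, and the (x, y, w, h) output layout are assumptions made for this example, not the behaviour of YOLOV3AnchorGenerator itself.

```python
from typing import List, Sequence, Tuple

Anchor = Tuple[int, int, float, float]  # (cell_x_px, cell_y_px, anchor_w, anchor_h)


def make_anchors(
    anchors: Sequence[Sequence[Tuple[float, float]]],
    strides: Sequence[int],
    image_size: Sequence[int],
) -> List[List[Anchor]]:
    """Hypothetical helper: tile per-level anchor sizes over each feature map.

    anchors[i] holds the (w, h) sizes for level i, strides[i] is that level's
    stride, and image_size is assumed to be (height, width) in pixels.
    """
    levels = []
    for level_anchors, stride in zip(anchors, strides):
        # Feature map size follows from the input size and the stride.
        feat_h = image_size[0] // stride
        feat_w = image_size[1] // stride
        level = [
            (x * stride, y * stride, w, h)
            for y in range(feat_h)
            for x in range(feat_w)
            for (w, h) in level_anchors
        ]
        levels.append(level)
    return levels


example = make_anchors([[(10, 13), (16, 30)]], [32], (64, 96))
print(len(example[0]))  # 2 rows * 3 columns * 2 anchor sizes
```

A coarser stride yields a smaller feature map and therefore fewer anchors per level, which is why each stride is usually paired with its own set of anchor sizes.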
- class hat.models.task_modules.yolo.filter.YOLOv3Filter(strides: Sequence[int], threshold: float, idx_range: Optional[Tuple[int, int]] = None, last_channels: float = 75)¶
Filter used for post-processing of YOLOv3.
- Parameters
strides – A list containing the strides of the feature maps.
idx_range – The index range of the values compared in the first input. Defaults to None, which means all values are used.
threshold – The lower bound of the output.
last_channels – Last channels.
- forward(preds: Sequence[torch.Tensor], meta_and_label: Optional[Dict] = None, **kwargs) Sequence[torch.Tensor] ¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.yolo.head.YOLOV3Head(in_channels_list: list, feature_idx: list, num_classes: int, anchors: list, bn_kwargs: dict, bias: bool = True, reverse_feature: bool = True, int16_output: bool = True, dequant_output: bool = True)¶
Heads module of yolov3.
shared convs -> conv head (include all objs)
- Parameters
in_channels_list – List of input channels.
feature_idx – Index of feature for head.
num_classes – Number of object classes to detect.
anchors – Anchors for all feature maps.
bn_kwargs – Config dict for BN layer.
bias – Whether to use bias in module.
- forward(x)¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.yolo.label_encoder.YOLOV3LabelEncoder(class_encoder: torch.nn.modules.module.Module)¶
Encode gt and matching results for yolov3.
- Parameters
class_encoder (torch.nn.Module) – Config of the class label encoder.
- forward(boxes: torch.Tensor, gt_boxes: torch.Tensor, match_pos_flag: torch.Tensor, match_gt_id: torch.Tensor, ig_flag: Optional[torch.Tensor] = None) Dict[str, torch.Tensor] ¶
Forward method.
- Parameters
boxes (torch.Tensor) – (B, N, 4), batched predicted boxes
gt_boxes (torch.Tensor) – (B, M, 5+), batched ground truth boxes, might be padded.
match_pos_flag (torch.Tensor) – (B, N) matched result of each predicted box
match_gt_id (torch.Tensor) – (B, M) matched gt box index of each predicted box
ig_flag (torch.Tensor) – (B, N) ignore matched result of each predicted box
- class hat.models.task_modules.yolo.matcher.YOLOV3Matcher(ignore_thresh: float)¶
Bounding box classification label matcher by max iou.
Uses different rules and return conditions than MaxIoUMatcher. YOLOv3MaxIoUMatcher will be merged with MaxIoUMatcher in the future.
- Parameters
ignore_thresh (float) – Boxes whose IoU is larger than ignore_thresh are regarded as ignore samples for losses.
- forward(boxes: torch.Tensor, gt_boxes: torch.Tensor, gt_boxes_num: torch.Tensor, im_hw: Optional[torch.Tensor] = None) Tuple[torch.Tensor, torch.Tensor] ¶
Forward Method.
- Parameters
boxes (torch.Tensor) – Box tensor with shape (B, N, 4) or (N, 4) when boxes are identical in the whole batch.
gt_boxes (torch.Tensor) – GT box tensor with shape (B, M, 5+). In one sample, if the number of gt boxes is less than M, the leading entries should be filled with real data, and the remaining entries padded with arbitrary values.
gt_boxes_num (torch.Tensor) – GT box num tensor with shape (B). The actual number of gt boxes for each sample. Cannot be greater than M.
- Returns
A tuple containing:
- flag (torch.Tensor): Flag tensor with shape (B, N). Entries with value 1 represent ignore, 0 represents negative.
- matched_pred_id (torch.Tensor): matched_pred_id tensor with shape (B, M). The best match for each of the gt_boxes.
- Return type
tuple
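The matching rule described above can be sketched for a single sample as follows. This is a minimal, dependency-free illustration and not the YOLOV3Matcher implementation: it works on Python lists rather than batched tensors, the helper names `iou` and `match_one_sample` are hypothetical, and using -1 for padded ground-truth slots is an assumption made here.

```python
from typing import List, Sequence, Tuple

Box = Sequence[float]  # (x1, y1, x2, y2, ...); only the first four values are used


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two corner-format boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_one_sample(
    boxes: Sequence[Box],
    gt_boxes: Sequence[Box],
    gt_boxes_num: int,
    ignore_thresh: float,
) -> Tuple[List[int], List[int]]:
    """Hypothetical single-sample matcher returning (flag, matched_pred_id)."""
    valid_gts = gt_boxes[:gt_boxes_num]  # entries past gt_boxes_num are padding
    # flag[n] is 1 (ignore) when box n overlaps any real gt above the threshold.
    flag = [
        1 if any(iou(box, gt) > ignore_thresh for gt in valid_gts) else 0
        for box in boxes
    ]
    # matched_pred_id[m] is the index of the box with the highest IoU for gt m.
    matched_pred_id = [-1] * len(gt_boxes)  # -1 marks padded slots (assumption)
    for m, gt in enumerate(valid_gts):
        ious = [iou(box, gt) for box in boxes]
        matched_pred_id[m] = max(range(len(boxes)), key=ious.__getitem__)
    return flag, matched_pred_id


boxes = [[0, 0, 10, 10], [0, 0, 9, 10], [20, 20, 30, 30]]
gts = [[0, 0, 10, 10, 1], [0, 0, 0, 0, 0]]  # second row is padding
print(match_one_sample(boxes, gts, gt_boxes_num=1, ignore_thresh=0.5))
```

Slicing gt_boxes by gt_boxes_num before computing overlaps is what keeps the arbitrary padding values from influencing either output.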
- class hat.models.task_modules.yolo.postprocess.YOLOV3HbirPostProcess(anchors: list, strides: list, num_classes: int, score_thresh: float = 0.01, nms_thresh: float = 0.45, topK: int = 200)¶
The postprocess of YOLOv3 Hbir.
- Parameters
anchors – The anchors of YOLOv3.
strides – A list of strides.
num_classes – The number of classes in the class branch.
score_thresh – Score threshold applied before NMS.
nms_thresh – NMS threshold.
topK – The number of bboxes output after post-processing.
- forward(inputs: Sequence[torch.Tensor])¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- class hat.models.task_modules.yolo.postprocess.YOLOV3PostProcess(anchors: list, strides: list, num_classes: int, score_thresh: float = 0.01, nms_thresh: float = 0.45, topK: int = 200)¶
The postprocess of YOLOv3.
- Parameters
anchors – The anchors of YOLOv3.
strides – A list of strides.
num_classes – The number of classes in the class branch.
score_thresh – Score threshold applied before NMS.
nms_thresh – NMS threshold.
topK – The number of bboxes output after post-processing.
- forward(inputs: Sequence[torch.Tensor])¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
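To show how score_thresh, nms_thresh, and topK typically interact in a post-processing step like the ones above, here is a minimal sketch of score filtering followed by greedy NMS and a top-K cut. It is written in plain Python for already-decoded boxes of a single class; the function names are hypothetical, and the real modules may decode anchors, handle classes separately, or order the steps differently.

```python
from typing import List, Sequence, Tuple

Det = Tuple[float, float, float, float, float]  # (x1, y1, x2, y2, score)


def iou(a: Sequence[float], b: Sequence[float]) -> float:
    """Intersection over union of two corner-format boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def postprocess(
    dets: Sequence[Det],
    score_thresh: float = 0.01,
    nms_thresh: float = 0.45,
    top_k: int = 200,
) -> List[Det]:
    """Hypothetical single-class post-process: filter, greedy NMS, top-K."""
    # 1. Drop low-score detections, then visit the rest from best to worst.
    candidates = sorted(
        (d for d in dets if d[4] > score_thresh),
        key=lambda d: d[4],
        reverse=True,
    )
    kept: List[Det] = []
    for det in candidates:
        # 2. Keep a box only if it does not overlap an already kept box too much.
        if all(iou(det, k) <= nms_thresh for k in kept):
            kept.append(det)
            # 3. Stop once the requested number of outputs is reached.
            if len(kept) == top_k:
                break
    return kept


dets = [
    (0, 0, 10, 10, 0.9),
    (1, 1, 11, 11, 0.8),   # overlaps the first box heavily
    (50, 50, 60, 60, 0.7),
    (0, 0, 5, 5, 0.005),   # below score_thresh
]
print(postprocess(dets))
```

Applying the score threshold before NMS keeps the pairwise overlap checks cheap, since most raw predictions have very low scores.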