6.1.7.1. hat.data¶
Main data module for training in HAT, containing datasets, transforms, and samplers.
6.1.7.1.1. Data¶
6.1.7.1.1.1. collates¶
Each of the following functions merges a list of samples to form a mini-batch of Tensor(s); see the API reference below for their task-specific behavior:
- collate_2d
- collate_2d_with_diff_im_hw
- collate_3d
- collate_lidar
- collate_nuscenes
- collate_psd
- collate_seq_with_diff_im_hw
6.1.7.1.1.2. dataloaders¶
- PassThroughDataLoader: Directly pass through the input example.
6.1.7.1.1.3. datasets¶
- Cityscapes: provides the method of reading cityscapes data from target pack type.
- CityscapesPacker: converts the Cityscapes dataset in torchvision to the target DataType format.
- RepeatDataset: a wrapper of repeated dataset.
- ComposeDataset: dataset wrapper for multiple datasets with precise batch size.
- ResampleDataset: a wrapper of resample dataset.
- ConcatDataset: a wrapper of concatenated dataset with group flag.
- ImageNet: provides the method of reading imagenet data from target pack type.
- ImageNetPacker: converts the ImageNet dataset in torchvision to DataType format.
- ImageNetFromImage: ImageNet from image by torchvision.
- Kitti2DDetection: Kitti 2D detection dataset.
- Kitti2DDetectionPacker: converts the kitti2D dataset to the target DataType format.
- Kitti2D: provides the method of reading kitti2d data from target pack type.
- Coco: provides the method of reading coco data from target pack type.
- CocoDetection: Coco detection dataset.
- CocoDetectionPacker: packs the coco dataset to the target format.
- CocoFromImage: Coco from image by torchvision.
- PascalVOC: provides the method of reading voc data from target pack type.
- VOCDetectionPacker: packs the voc dataset to the target format.
- VOCFromImage: VOC from image by torchvision.
- FlyingChairsFromImage: dataset which gets img data from the data_path.
- FlyingChairs: provides the method of reading flyingChairs data from target pack type.
- FlyingChairsPacker: converts the FlyingChairs dataset to the target DataType format.
- Kitti3D: provides the method of reading kitti3d data from target pack type.
- Kitti3DDetection: Kitti 3D detection dataset.
- Kitti3DDetectionPacker: converts the kitti3D dataset to the target DataType format.
6.1.7.1.1.4. samplers¶
- DistributedCycleMultiDatasetSampler: in one epoch period, does cyclic sampling on the dataset according to iter_time.
- DistSamplerHook: the hook api for torch.utils.data.DistributedSampler.
- SelectedSampler: distributed sampler that supports user-defined indices.
- DistributedGroupSampler: sampler that restricts data loading to a subset of the dataset.
6.1.7.1.1.5. transforms¶
- ConvertLayout: used for layout conversion.
- BgrToYuv444: used for color format conversion.
- OneHot: converts labels to one-hot format.
- LabelSmooth: used for label smoothing.
- TimmTransforms: transforms of timm.
- TimmMixup: Mixup of timm.
- Resize: resize image & bbox & mask & seg.
- RandomFlip: flip image & bbox & mask & seg & flow.
- Normalize: normalize image.
- ToTensor: converts objects of various python types to torch.Tensor, optionally converting the img to yuv444 format.
- FixedCrop: crop image with fixed position and size.
- ColorJitter: randomly change the brightness, contrast, saturation and hue of an image.
- AugmentHSV: randomly add color disturbance.
- RandomExpand: randomly expand the image & bboxes.
- MinIoURandomCrop: randomly crop the image & bboxes with a minimum IoU requirement, the threshold randomly selected from min_ious.
- ToFasterRCNNData: prepare faster-rcnn input data.
- ToMultiTaskFasterRCNNData: convert multi-classes detection data to multi-task data.
- PadTensorListToBatch: list of image tensors to be stacked vertically.
- SegRandomCrop: random crop on data with gt_seg label; can only be used for segmentation.
- SegReWeightByArea: calculate the weight of each category according to its area.
- LabelRemap: remap labels.
- SegOneHot: one-hot conversion for segmentation labels.
- SegResize: apply resize for both image and label.
- SegRandomAffine: apply random affine for both image and label.
- Scale: scale input according to a scale list.
- ObjectNoise: apply noise to each GT object in the scene.
- ObjectSample: sample GT objects to the data.
- PointGlobalRotation: apply global rotation to a 3D scene.
- PointGlobalScaling: apply global scaling to a 3D scene.
- PointRandomFlip: flip the points & bbox.
- Voxelization: generate voxel from points.
- ObjectRangeFilter: filter objects by point cloud range.
- ListToDict: convert list args to dict.
- DeleteKeys: delete keys in input dict.
- RenameKeys: rename keys in input dict.
- Undistortion: convert a PIL Image or numpy.ndarray to an undistorted one.
- PILToTensor: convert PIL Image to Tensor.
- TensorToNumpy: convert tensor to numpy.
- IterableDetRoITransform: iterable transformer based on rois for object detection.
6.1.7.1.2. API Reference¶
- hat.data.collates.collate_2d(batch: List[Any]) Union[torch.Tensor, Dict] ¶
Merge a list of samples to form a mini-batch of Tensor(s).
Used in 2d task, for collating data with inconsistent shapes.
- Parameters
batch (list) – list of data.
- hat.data.collates.collate_2d_with_diff_im_hw(batch: List[Any]) Union[torch.Tensor, Dict] ¶
Merge a list of samples to form a mini-batch of Tensor(s).
Used in 2d task, for collating data with different image heights or widths. These inconsistent images will be vstacked in batch transform.
- Parameters
batch (list) – list of data.
- hat.data.collates.collate_3d(batch_data: List[Any])¶
Merge a list of samples to form a mini-batch of Tensor(s).
Used in bev task. If the output tensor from the dataset has shape (n, c, h, w), concat on axis 0 directly; if it has shape (c, h, w), expand_dim on axis 0 and then concat.
- Parameters
batch (list) – list of data.
- hat.data.collates.collate_lidar(batch_list: List[Any]) Union[torch.Tensor, Dict] ¶
Merge a list of samples to form a mini-batch of Tensor(s).
Used in rad task, for collating data with inconsistent shapes. Rad (Realtime and Accurate 3D Object Detection).
First converts List[Dict[str, …]] or List[Dict] to Dict[str, List], then processes values whose keys are related to training.
- Parameters
batch (list) – list of data.
- hat.data.collates.collate_nuscenes(batch: List[Any])¶
Merge a list of samples to form a mini-batch of Tensor(s).
- Parameters
batch (list) – list of data.
- hat.data.collates.collate_psd(batch: List[Any])¶
Merge a list of samples to form a mini-batch of Tensor(s).
Used in parking slot detection (psd) task, for collating data with inconsistent shapes.
- Parameters
batch (list) – list of data.
- hat.data.collates.collate_seq_with_diff_im_hw(batch: List[Dict]) Union[torch.Tensor, Dict] ¶
Merge a list of samples to form a mini-batch of Tensor(s).
Used in sequence task, for collating data with different image heights or widths. These inconsistent images will be vstacked in batch transform.
- Parameters
batch (list) – list of data.
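A minimal sketch of how these collate functions are typically wired up, assuming a standard torch.utils.data pipeline; the dummy dataset here is hypothetical and stands in for a real hat dataset:

```python
import torch
from torch.utils.data import DataLoader, Dataset

from hat.data.collates import collate_2d


class DummyDataset(Dataset):
    """Hypothetical stand-in for a real hat dataset."""

    def __len__(self):
        return 8

    def __getitem__(self, idx):
        return {"img": torch.rand(3, 224, 224), "labels": torch.tensor(idx % 2)}


# Any of the collate functions above can be plugged in as ``collate_fn``.
loader = DataLoader(DummyDataset(), batch_size=4, collate_fn=collate_2d)
batch = next(iter(loader))  # dict with batched "img" and "labels" tensors
```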
- class hat.data.dataloaders.PassThroughDataLoader(data: Any, *, length: int, clone: bool = False)¶
Directly pass through input example.
- Parameters
data (Any) – Input data.
length (int) – Length of dataloader.
clone (bool, optional) – Whether to clone input data.
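A minimal sketch, assuming the loader is iterated like a regular dataloader; this is handy for smoke-testing a training loop with a fixed batch:

```python
import torch

from hat.data.dataloaders import PassThroughDataLoader

# One fixed "batch" replayed ``length`` times.
example = {"img": torch.zeros(4, 3, 224, 224)}
loader = PassThroughDataLoader(example, length=10, clone=True)

for batch in loader:  # every iteration yields (a clone of) ``example``
    pass
```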
- class hat.data.datasets.BatchTransformDataset(dataset, transforms_cfgs, epoch_steps)¶
- class hat.data.datasets.Cityscapes(data_path: str, transforms: Optional[list] = None, pack_type: Optional[str] = None, pack_kwargs: Optional[dict] = None)¶
Cityscapes provides the method of reading cityscapes data from target pack type.
- Parameters
data_path (str) – The path of packed file.
pack_type (str) – The pack type.
transforms (list) – Transforms applied to cityscapes data before use.
pack_kwargs (dict) – Kwargs for pack type.
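A usage sketch, assuming the data has already been packed; the path and the "lmdb" pack type below are illustrative assumptions, not prescribed values:

```python
from hat.data.datasets import Cityscapes

# Hypothetical path to a previously packed Cityscapes split.
train_dataset = Cityscapes(
    data_path="./data/cityscapes/train_lmdb",
    transforms=None,     # or a list of hat.data.transforms instances
    pack_type="lmdb",    # assumption: lmdb is one of the supported pack types
)
sample = train_dataset[0]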
- class hat.data.datasets.CityscapesPacker(src_data_dir: str, target_data_dir: str, split_name: str, num_workers: int, pack_type: str, num_samples: Optional[int] = None, **kwargs)¶
CityscapesPacker is used for converting Cityscapes dataset in torchvision to target DataType format.
- Parameters
src_data_dir (str) – The dir of original cityscapes data.
target_data_dir (str) – Path for packed file.
split_name (str) – Split name of data, such as train, val and so on.
num_workers (int) – Num workers for reading data using multiprocessing.
pack_type (str) – The file type for packing.
num_samples (int) – The number of samples to pack. All samples are packed if num_samples is None.
- pack_data(idx)¶
Read original data from folder with some processing.
- Parameters
idx (int) – Idx for reading.
- Returns
Processed data for pack.
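A construction sketch; the paths are hypothetical, and since this page does not specify how a full packing run is launched, only the setup and a single pack_data call are shown:

```python
from hat.data.datasets import CityscapesPacker

packer = CityscapesPacker(
    src_data_dir="./data/cityscapes",               # hypothetical source dir
    target_data_dir="./data/cityscapes/train_lmdb", # hypothetical target path
    split_name="train",
    num_workers=4,
    pack_type="lmdb",   # assumption: a supported pack type
    num_samples=None,   # pack all samples
)
data = packer.pack_data(0)  # reads and processes the first sample
```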
- class hat.data.datasets.Coco(data_path: str, transforms: Optional[List] = None, pack_type: Optional[str] = None, pack_kwargs: Optional[dict] = None)¶
Coco provides the method of reading coco data from target pack type.
- Parameters
data_path (str) – The path of packed file.
transforms (list) – Transforms applied to data before use.
pack_type (str) – The pack type.
pack_kwargs (dict) – Kwargs for pack type.
- class hat.data.datasets.CocoDetection(root, annFile, num_classes=80, transform=None, target_transform=None, transforms=None)¶
Coco Detection Dataset.
- Parameters
root (string) – Root directory where images are downloaded to.
annFile (string) – Path to json annotation file.
num_classes (int) – The number of classes of coco. 80 or 91.
transform (callable, optional) – A function/transform that takes in a PIL image and returns a transformed version, e.g., transforms.ToTensor.
target_transform (callable, optional) – A function/transform that takes in the target and transforms it.
transforms (callable, optional) – A function/transform that takes an input sample and its target as entry and returns a transformed version.
- class hat.data.datasets.CocoDetectionPacker(src_data_dir: str, target_data_dir: str, split_name: str, num_workers: int, pack_type: str, num_classes: int = 80, num_samples: Optional[int] = None, **kwargs)¶
CocoDetectionPacker is used for packing coco dataset to target format.
- Parameters
src_data_dir (str) – The dir of original coco data.
target_data_dir (str) – Path for packed file.
split_name (str) – Split name of data, such as train, val and so on.
num_workers (int) – The num workers for reading data using multiprocessing.
pack_type (str) – The file type for packing.
num_classes (int) – The num of classes produced.
num_samples (int) – The number of samples to pack. All samples are packed if num_samples is None.
- pack_data(idx)¶
Read original data from folder with some processing.
- Parameters
idx (int) – Idx for reading.
- Returns
Processed data for pack.
- class hat.data.datasets.CocoFromImage(*args, **kwargs)¶
Coco from image by torchvision.
The params of CocoFromImage are the same as those of torchvision.datasets.CocoDetection.
- class hat.data.datasets.ComposeDataset(datasets: List[Dict], batchsize_list: List[int])¶
Dataset wrapper for multiple datasets with precise batch size.
- Parameters
datasets – Config for each dataset.
batchsize_list – Batch size for each task dataset.
- class hat.data.datasets.ConcatDataset(datasets, with_flag: bool = False, record_index: bool = False)¶
A wrapper of concatenated dataset with group flag.
Same as torch.utils.data.dataset.ConcatDataset, but additionally concatenates the group flag of all datasets.
- Parameters
datasets – A list of datasets.
with_flag – Whether to concatenate the datasets' flags. If True, all flags are concatenated (all datasets must have a flag attribute in this case). Default to False.
record_index – Whether to record the index. If True, record the index. Default to False.
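A minimal sketch with plain torch datasets, assuming the wrapper's length behaves like torch's own ConcatDataset (as the description above states):

```python
import torch
from torch.utils.data import TensorDataset

from hat.data.datasets import ConcatDataset

ds_a = TensorDataset(torch.zeros(10, 3))
ds_b = TensorDataset(torch.ones(20, 3))

# with_flag=False: behaves like torch's ConcatDataset; pass with_flag=True
# only when every wrapped dataset carries a ``flag`` attribute.
combined = ConcatDataset([ds_a, ds_b])
assert len(combined) == 30
```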
- class hat.data.datasets.FlyingChairs(data_path: str, transforms: Optional[list] = None, pack_type: Optional[str] = None, pack_kwargs: Optional[dict] = None, to_rgb: bool = True)¶
FlyingChairs provides the method of reading flyingChairs data from target pack type.
- Parameters
data_path – The path of packed file.
transforms – Transforms applied to data before use.
pack_type – The pack type.
pack_kwargs – Kwargs for pack type.
to_rgb – Whether to convert to rgb color_space.
- class hat.data.datasets.FlyingChairsFromImage(data_path: str, transforms: Optional[list] = None, to_rgb: bool = True, train_flag: bool = False, image1_name: str = '_img1', image2_name: str = '_img2', image_type: str = '.ppm', flow_name: str = '_flow', flow_type: str = '.flo')¶
Dataset which gets img data from the data_path.
- Parameters
data_path – The path where the image and gt_flow are stored.
transforms – List of transforms.
to_rgb – Whether to convert to rgb color_space.
train_flag – Whether the data is used for training or testing.
image1_name – The name suffix of image1.
image2_name – The name suffix of image2.
image_type – The image type of image1 and image2.
flow_name – The name suffix of flow.
flow_type – The flow type of flow.
- class hat.data.datasets.FlyingChairsPacker(src_data_dir: str, target_data_dir: str, split_name: str, num_workers: int, pack_type: str, num_samples: Optional[int] = None, **kwargs)¶
FlyingChairsPacker is used for converting FlyingChairs dataset to target DataType format.
- Parameters
src_data_dir – The dir of original FlyingChairs data.
target_data_dir – Path for packed file.
split_name – Split name of data, such as train, val and so on.
num_workers – Num workers for reading data using multiprocessing.
pack_type – The file type for packing.
num_samples – The number of samples to pack. All samples are packed if num_samples is None.
- pack_data(idx)¶
Read original data from folder with some processing.
- Parameters
idx (int) – Idx for reading.
- Returns
Processed data for pack.
- class hat.data.datasets.ImageNet(data_path: str, out_pil: bool = False, transforms: Optional[List] = None, pack_type: Optional[str] = None, pack_kwargs: Optional[dict] = None)¶
ImageNet provides the method of reading imagenet data from target pack type.
- Parameters
data_path (str) – The path of packed file.
transforms (list) – Transforms applied to imagenet data before use.
pack_type (str) – The pack type.
pack_kwargs (dict) – Kwargs for pack type.
- class hat.data.datasets.ImageNetFromImage(transforms=None, *args, **kwargs)¶
ImageNet from image by torchvision.
The params of ImageNetFromImage are the same as those of torchvision.datasets.ImageNet.
- class hat.data.datasets.ImageNetPacker(src_data_dir: str, target_data_dir: str, split_name: str, num_workers: int, pack_type: str, num_samples: Optional[int] = None, **kwargs)¶
ImageNetPacker is used for converting ImageNet dataset in torchvision to DataType format.
- Parameters
src_data_dir (str) – The dir of original imagenet data.
target_data_dir (str) – Path for LMDB file.
split_name (str) – Split name of data, such as train, val and so on.
num_workers (int) – Num workers for reading data using multiprocessing.
pack_type (str) – The file type for packing.
num_samples (int) – The number of samples to pack. All samples are packed if num_samples is None.
- pack_data(idx)¶
Read original data from folder with some processing.
- Parameters
idx (int) – Idx for reading.
- Returns
Processed data for pack.
- class hat.data.datasets.Kitti2D(data_path: str, transforms: Optional[List] = None, pack_type: Optional[str] = None, pack_kwargs: Optional[dict] = None)¶
Kitti2D provides the method of reading kitti2d data from target pack type.
- Parameters
data_path (str) – The path of LMDB file.
transforms (list) – Transforms applied to kitti2d data before use.
pack_type (str) – The pack type.
pack_kwargs (dict) – Kwargs for pack type.
- class hat.data.datasets.Kitti2DDetection(root: str, annFile: str, transforms: Optional[Callable] = None)¶
Kitti 2D Detection Dataset.
- Parameters
root (string) – Root directory where images are downloaded to.
annFile (string) – Path to json annotation file, kitti_train.json or kitti_eval.json. (For ground truth, we do not use the official txt-format data, but a json file annotated by Horizon Robotics.)
transforms (callable, optional) – A function/transform that takes an input sample and its target as entry and returns a transformed version.
- class hat.data.datasets.Kitti2DDetectionPacker(src_data_dir: str, target_data_dir: str, annFile: str, num_workers: int, pack_type: str, num_samples: Optional[int] = None, **kwargs)¶
Kitti2DDetectionPacker is used for converting kitti2D dataset to target DataType format.
- Parameters
src_data_dir (str) – The dir of original kitti2D data.
target_data_dir (str) – Path for LMDB file.
annFile (string) – Path to json annotation file, kitti_train.json or kitti_eval.json.
num_workers (int) – The num workers for reading data using multiprocessing.
pack_type (str) – The file type for packing.
num_samples (int) – The number of samples to pack. All samples are packed if num_samples is None.
- pack_data(idx)¶
Read original data from folder with some processing.
- Parameters
idx (int) – Idx for reading.
- Returns
Processed data for pack.
- class hat.data.datasets.Kitti3D(data_path: str, num_point_feature: int = 4, transforms: Optional[List] = None, pack_type: Optional[str] = None, pack_kwargs: Optional[dict] = None, is_testing: bool = False)¶
Kitti3D provides the method of reading kitti3d data from target pack type.
- Parameters
data_path (str) – The path of LMDB file.
transforms (list) – Transforms applied to kitti3d data before use.
pack_type (str) – The pack type.
pack_kwargs (dict) – Kwargs for pack type.
- class hat.data.datasets.Kitti3DDetection(source_path, split_name, transforms: Optional[Callable] = None, num_point_feature: int = 4)¶
Kitti 3D Detection Dataset.
- Parameters
source_path (string) – Root directory where images are downloaded to.
split_name (string) – Dataset split, 'train' or 'val'.
transforms (callable, optional) – A function/transform that takes an input sample and its target as entry and returns a transformed version.
- remove_dontcare(anno_info)¶
Remove annotations that need not be considered.
- Parameters
anno_info (dict) – Dict of annotation infos. The ‘DontCare’ annotations will be removed according to ann_file[‘name’].
- class hat.data.datasets.Kitti3DDetectionPacker(src_data_dir: str, target_data_dir: str, split_name: str, num_workers: int, pack_type: str, num_samples: Optional[int] = None, **kwargs)¶
Kitti3DDetectionPacker is used for converting kitti3D dataset to target DataType format.
- Parameters
src_data_dir (str) – The dir of original kitti3D data.
target_data_dir (str) – Path for LMDB file.
split_name (str) – Split name of data, such as train, val and so on.
num_workers (int) – The num workers for reading data using multiprocessing.
pack_type (str) – The file type for packing.
num_samples (int) – The number of samples to pack. All samples are packed if num_samples is None.
- pack_data(idx)¶
Read original data from folder with some processing.
- Parameters
idx (int) – Idx for reading.
- Returns
Processed data for pack.
- class hat.data.datasets.PascalVOC(data_path: str, transforms: Optional[List] = None, pack_type: Optional[str] = None, pack_kwargs: Optional[dict] = None)¶
PascalVOC provides the method of reading voc data from target pack type.
- Parameters
data_path (str) – The path of packed file.
transforms (list) – Transforms of voc before using.
pack_type (str) – The pack type.
pack_kwargs (dict) – Kwargs for pack type.
- class hat.data.datasets.RandDataset(length: int, example: Any, clone: bool = True, flag: int = 1)¶
- class hat.data.datasets.RepeatDataset(dataset, times)¶
A wrapper of repeated dataset.
Using RepeatDataset can reduce the data loading time between epochs.
- Parameters
dataset (torch.utils.data.Dataset) – The dataset to repeat.
times (int) – Repeat times.
- class hat.data.datasets.ResampleDataset(dataset, with_flag: bool = False, resample_interval: int = 1)¶
A wrapper of resample dataset.
Using ResampleDataset, the original dataset can be resampled with a specific interval.
- Parameters
dataset (dict) – The dataset for resampling.
with_flag (bool) – Whether to use dataset.flag. If True, resample dataset.flag with resample_interval (dataset must have a flag attribute in this case).
resample_interval (int) – Resample interval.
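A sketch of both wrappers around a toy dataset, under the assumption that the wrapped length scales by ``times`` and shrinks by ``resample_interval``, respectively:

```python
import torch
from torch.utils.data import TensorDataset

from hat.data.datasets import RepeatDataset, ResampleDataset

base = TensorDataset(torch.arange(100).float())

repeated = RepeatDataset(base, times=4)                 # ~4x samples per epoch
resampled = ResampleDataset(base, resample_interval=2)  # keep every 2nd sample
```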
- class hat.data.datasets.VOCDetectionPacker(src_data_dir: str, target_data_dir: str, split_name: str, num_workers: int, pack_type: str, num_samples: Optional[int] = None, **kwargs)¶
VOCDetectionPacker is used for packing voc dataset to target format.
- Parameters
src_data_dir (str) – Dir of original voc data.
target_data_dir (str) – Path for packed file.
split_name (str) – Split name of data, such as trainval and test.
num_workers (int) – Num workers for reading data using multiprocessing.
pack_type (str) – The file type for packing.
num_samples (int) – The number of samples to pack. All samples are packed if num_samples is None.
- pack_data(idx)¶
Read original data from folder with some processing.
- Parameters
idx (int) – Idx for reading.
- Returns
Processed data for pack.
- class hat.data.datasets.VOCFromImage(size=416, *args, **kwargs)¶
VOC from image by torchvision.
The params of VOCFromImage are the same as those of torchvision.datasets.VOCDetection.
- class hat.data.samplers.DistSamplerHook(dataset, num_replicas: Optional[int] = None, rank: Optional[int] = None, shuffle: bool = True, seed: int = 0, drop_last: bool = False)¶
The hook api for torch.utils.data.DistributedSampler. Used to get local rank and num_replicas before creating DistributedSampler.
- Parameters
dataset – Compose dataset.
num_replicas – Same as DistributedSampler.
rank – Same as DistributedSampler.
shuffle – Whether to shuffle data.
seed – Random seed.
- class hat.data.samplers.DistributedCycleMultiDatasetSampler(dataset: hat.data.datasets.dataset_wrappers.ComposeDataset, batchsize_list: List[int], num_replicas: Optional[int] = None, rank: Optional[int] = None, shuffle: bool = True, seed: int = 0)¶
In one epoch period, do cyclic sampling on the dataset according to iter_time.
- Parameters
dataset – Compose dataset.
num_replicas (int) – Same as DistributedSampler.
rank (int) – Same as DistributedSampler.
shuffle – Whether to shuffle data.
seed – Random seed.
- class hat.data.samplers.DistributedGroupSampler(dataset, samples_per_gpu: int = 1, num_replicas: Optional[int] = None, rank: Optional[int] = None, seed: int = 0)¶
Sampler that restricts data loading to a subset of the dataset.
The indices of each batch are sampled from one group among all groups. Groups are organized according to the dataset flags.
Note
The dataset is assumed to be of constant size and must have a flag attribute. Different numbers in the flag array represent different groups; for example, in an aspect-ratio group flag there are two groups, where 0 represents h/w >= 1 and 1 represents h/w < 1. The dataset flag must be a numpy array of dtype np.uint8 whose length along axis 0 equals the dataset length.
- Parameters
dataset – Dataset used for sampling.
samples_per_gpu – Number of samples for each gpu. Default is 1.
num_replicas – Number of processes participating in distributed training.
rank – Rank of the current process within num_replicas.
seed – Random seed used in torch.Generator(). This number should be identical across all processes in the distributed group. Default: 0.
- set_epoch(epoch)¶
Sets the epoch for this sampler. When shuffle=True, this ensures all replicas use a different random ordering for each epoch. Otherwise, the next iteration of this sampler will yield the same ordering.
- Parameters
epoch (int) – Epoch number.
- class hat.data.samplers.SelectedSampler(indices_function: Callable, dataset: torch.utils.data.dataset.Dataset, *, num_replicas: Optional[int] = None, rank: Optional[int] = None, shuffle: bool = True, seed: int = 0, drop_last: bool = False)¶
Distributed sampler that supports user-defined indices.
- Parameters
indices_function (Callable) – Callback function given by the user. Input is the dataset; returns an indices list.
dataset – Dataset used for sampling.
num_replicas (int, optional) – Number of processes participating in distributed training. By default, world_size is retrieved from the current distributed group.
rank (int, optional) – Rank of the current process in num_replicas. By default, rank is retrieved from the current distributed group.
shuffle (bool, optional) – If True (default), sampler will shuffle the indices.
seed (int, optional) – Random seed used to shuffle the sampler if shuffle=True. This number should be identical across all processes in the distributed group. Default: 0.
drop_last (bool, optional) – If True, the sampler will drop the tail of the data to make it evenly divisible across the number of replicas. If False, the sampler will add extra indices to make the data evenly divisible across the replicas. Default: False.
Warning
In distributed mode, calling the set_epoch() method at the beginning of each epoch before creating the DataLoader iterator is necessary to make shuffling work properly across multiple epochs. Otherwise, the same ordering will always be used.
- set_epoch(epoch: int) → None¶
Sets the epoch for this sampler. When shuffle=True, this ensures all replicas use a different random ordering for each epoch. Otherwise, the next iteration of this sampler will yield the same ordering.
- Parameters
epoch (int) – Epoch number.
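A minimal sketch of SelectedSampler; passing num_replicas and rank explicitly is assumed to avoid needing an initialized process group, as with torch's own DistributedSampler:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

from hat.data.samplers import SelectedSampler

dataset = TensorDataset(torch.arange(100).float())


def even_indices(ds):
    # User-defined callback: receives the dataset, returns an indices list.
    return [i for i in range(len(ds)) if i % 2 == 0]


sampler = SelectedSampler(even_indices, dataset, num_replicas=1, rank=0)
loader = DataLoader(dataset, batch_size=8, sampler=sampler)

for epoch in range(2):
    sampler.set_epoch(epoch)  # required for proper shuffling across epochs
    for batch in loader:
        pass
```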
- class hat.data.transforms.AugmentHSV(hgain=0.5, sgain=0.5, vgain=0.5)¶
Randomly add color disturbance.
Convert RGB img to HSV, and then randomly change the hue, saturation and value.
Note
Affected keys: 'img'.
- Parameters
hgain (float) – Gain of hue.
sgain (float) – Gain of saturation.
vgain (float) – Gain of value.
- class hat.data.transforms.BgrToYuv444(rgb_input=False)¶
BgrToYuv444 is used for color format conversion.
Note
Affected keys: 'img'.
- Parameters
rgb_input (bool) – Whether the input is rgb.
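A call sketch, assuming the transforms in this module are callables that take and return a data dict keyed by the affected keys; the expected image layout here is an assumption:

```python
import numpy as np

from hat.data.transforms import BgrToYuv444

transform = BgrToYuv444(rgb_input=False)
data = {"img": np.zeros((224, 224, 3), dtype=np.uint8)}  # assumed HWC BGR image
data = transform(data)  # 'img' is converted to yuv444
```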
- class hat.data.transforms.ColorJitter(brightness=0.5, contrast=(0.5, 1.5), saturation=(0.5, 1.5), hue=0.1)¶
Randomly change the brightness, contrast, saturation and hue of an image.
Support for detection tasks and dict input are the main differences from ColorJitter in torchvision; the default settings have also been changed to the most common settings.
Note
Affected keys: 'img'.
- Parameters
brightness (float or tuple of float (min, max)) – How much to jitter brightness.
contrast (float or tuple of float (min, max)) – How much to jitter contrast.
saturation (float or tuple of float (min, max)) – How much to jitter saturation.
hue (float or tuple of float (min, max)) – How much to jitter hue.
- class hat.data.transforms.ConvertLayout(hwc2chw=True, keys=None)¶
ConvertLayout is used for layout conversion.
Note
Affected keys: 'img'.
- Parameters
hwc2chw (bool) – Whether to convert hwc to chw.
keys (list) –
- class hat.data.transforms.DeleteKeys(keys: List[str])¶
Delete keys in input dict.
- Parameters
keys – Key list to delete.
- class hat.data.transforms.FixedCrop(size=None, min_area=- 1, min_iou=- 1, dynamic_roi_params=None)¶
Crop image with fixed position and size.
Note
Affected keys: ‘img’, ‘img_shape’, ‘pad_shape’, ‘layout’, ‘before_crop_shape’, ‘crop_offset’, ‘gt_bboxes’, ‘gt_classes’.
- inverse_transform(inputs, task_type, inverse_info)¶
Inverse option of transform to map the prediction to the original image.
- Parameters
inputs (array) – Prediction
task_type (str) – detection or segmentation.
inverse_info (dict) – The transform keyword is the key, and the corresponding value is the value.
- class hat.data.transforms.IterableDetRoITransform(target_wh, flip_prob, img_scale_range=(0.5, 2.0), roi_scale_range=(0.8, 1.25), min_sample_num=1, max_sample_num=1, center_aligned=True, inter_method=10, use_pyramid=True, pyramid_min_step=0.7, pyramid_max_step=0.8, pixel_center_aligned=True, min_valid_area=8, min_valid_clip_area_ratio=0.5, min_edge_size=2, rand_translation_ratio=0, rand_aspect_ratio=0, rand_rotation_angle=0, reselect_ratio=0, clip_bbox=True, rand_sampling_bbox=True, resize_wh=None, keep_aspect_ratio=False)¶
Iterable transformer base on rois for object detection.
- Parameters
resize_wh (list/tuple of 2 int, optional) – Resize input image to target size, by default None.
**kwargs – Please see AffineMatFromROIBoxGenerator and ImageAffineTransform.
- class hat.data.transforms.LabelRemap(mapping: Sequence)¶
Remap labels.
Note
Affected keys: 'gt_seg'.
- Parameters
mapping (Sequence) – Mapping from input to output.
- class hat.data.transforms.LabelSmooth(num_classes, eta=0.1)¶
LabelSmooth is used for label smoothing.
Note
Affected keys: 'labels'.
- Parameters
num_classes (int) – Num classes.
eta (float) – Eta of label smooth.
- class hat.data.transforms.ListToDict(keys: List[str])¶
Convert list args to dict.
- Parameters
keys – Keys for each object in args.
- class hat.data.transforms.MinIoURandomCrop(min_ious=(0.1, 0.3, 0.5, 0.7, 0.9), min_crop_size=0.3, bbox_clip_border=True, repeat_num=50)¶
Randomly crop the image & bboxes; the cropped patches satisfy a minimum IoU requirement with the original image & bboxes, and the IoU threshold is randomly selected from min_ious.
Note
Affected keys: 'img', 'gt_bboxes', 'gt_classes', 'gt_difficult'.
- Parameters
min_ious (tuple) – Minimum IoU threshold for all intersections with bounding boxes.
min_crop_size (float) – Minimum crop size (i.e. h,w := a*h, a*w, where a >= min_crop_size).
bbox_clip_border (bool) – Whether to clip objects outside the border of the image. Defaults to True.
repeat_num (float) – Max repeat num for finding an available bbox.
- class hat.data.transforms.Normalize(mean: Union[float, Sequence[float]], std: Union[float, Sequence[float]])¶
Normalize image.
Note
Affected keys: 'img', 'layout'.
- Parameters
mean – Mean of normalize.
std – Std of normalize.
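A sketch of a small preprocessing chain built from the transforms above. Composing them with torchvision's Compose is an assumption (any callable chaining works, since each transform is assumed to map a data dict to a data dict), and the exact set of keys a transform expects (e.g. 'layout', 'color_space') is dataset-dependent:

```python
import numpy as np
import torchvision.transforms as T

from hat.data.transforms import Normalize, ToTensor

# Hypothetical two-step pipeline: dict in, dict out at each step.
pipeline = T.Compose([
    ToTensor(to_yuv=True),             # numpy -> torch.Tensor, img -> yuv444
    Normalize(mean=128.0, std=128.0),  # normalize the yuv image
])

data = {
    "img": np.zeros((224, 224, 3), dtype=np.uint8),
    "color_space": "bgr",  # assumed key
    "layout": "hwc",       # assumed key
}
data = pipeline(data)
```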
- class hat.data.transforms.ObjectNoise(class_names, gt_rotation_noise, gt_loc_noise_std, global_random_rot_range, num_try=100)¶
Apply noise to each GT objects in the scene.
- Parameters
class_names (list[str]) – Class names.
gt_rotation_noise (list[float]) – Object rotation range.
gt_loc_noise_std (list[float]) – Object noise std.
global_random_rot_range (list[float]) – Global rotation to the scene.
num_try (int) – Number of times to try if the noise applied is invalid.
- class hat.data.transforms.ObjectRangeFilter(point_cloud_range: List[float])¶
Filter objects by point cloud range.
- Parameters
point_cloud_range – Point cloud range.
- class hat.data.transforms.ObjectSample(db_sampler, class_names, random_crop=False, remove_points_after_sample=False, remove_outside_points=False)¶
Sample GT objects to the data.
- Parameters
db_sampler (dict) – Config dict of the database sampler.
class_names (list[str]) – Class names.
random_crop (bool) – Whether to random crop.
remove_points_after_sample (bool) – Whether to remove points after sample.
remove_outside_points (bool) – Whether to remove outside points.
- class hat.data.transforms.OneHot(num_classes)¶
OneHot is used to convert labels to one-hot format.
Note
Affected keys: 'labels'.
- Parameters
num_classes (int) – Num classes.
- class hat.data.transforms.PILToTensor¶
Convert PIL Image to Tensor.
- class hat.data.transforms.PadTensorListToBatch(pad_val: int = 0, seg_pad_val: Optional[int] = 255)¶
List of image tensor to be stacked vertically.
Used for diff shape tensors list.
- Parameters
pad_val – Values to be filled in padding areas for img. Default to 0.
seg_pad_val – Value to be filled in padding areas for gt_seg. Default to 255.
- class hat.data.transforms.PointGlobalRotation(rotation=0.7853981633974483)¶
Apply global rotation to a 3D scene.
- Parameters
rotation (list[float]) – Range of rotation angle.
- class hat.data.transforms.PointGlobalScaling(min_scale: float = 0.95, max_scale: float = 1.05)¶
Apply global scaling to a 3D scene.
- Parameters
min_scale (float) – Min scale ratio.
max_scale (float) – Max scale ratio.
- class hat.data.transforms.PointRandomFlip(probability=0.5)¶
Flip the points & bbox.
- Parameters
probability (float) – The flipping probability.
- class hat.data.transforms.RandomExpand(mean=(0, 0, 0), ratio_range=(1, 4), prob=0.5)¶
Randomly expand the image & bboxes.
Randomly place the original image on a canvas of 'ratio' x original image size filled with mean values. The ratio is in the range of ratio_range.
Note
Affected keys: 'img', 'gt_bboxes'.
- Parameters
ratio_range (tuple) – Range of expand ratio.
prob (float) – Probability of applying this transformation.
- class hat.data.transforms.RandomFlip(px: Optional[float] = 0.5, py: Optional[float] = 0)¶
Flip image & bbox & mask & seg & flow.
Note
Affected keys: 'img', 'ori_img', 'img_shape', 'pad_shape', 'gt_bboxes', 'gt_seg', 'gt_flow', 'gt_mask', 'gt_ldmk', 'ldmk_pairs'.
- Parameters
px – Horizontal flip probability, range between [0, 1].
py – Vertical flip probability, range between [0, 1].
- class hat.data.transforms.RenameKeys(keys: List[str], split='|')¶
Rename keys in input dict.
- Parameters
keys – Key list to rename, in "old_name | new_name" format.
- class hat.data.transforms.Resize(img_scale: Optional[Union[Sequence[int], Sequence[Sequence[int]]]] = None, max_scale: Optional[Union[Sequence[int], Sequence[Sequence[int]]]] = None, multiscale_mode='range', ratio_range=None, keep_ratio=True)¶
Resize image & bbox & mask & seg.
Note
Affected keys: 'img', 'ori_img', 'img_shape', 'pad_shape', 'resized_shape', 'scale_factor', 'gt_bboxes', 'gt_seg', 'gt_ldmk'.
- Parameters
img_scale – See above.
max_scale – The max size of image. If the image's shape > max_scale, the image is resized to max_scale.
multiscale_mode (str) – Value must be one of "range" or "value". This transform resizes the input image and bbox to the same scale factor. There are 3 multiscale modes: (1) 'ratio_range' is not None: randomly sample a ratio from the ratio range and multiply it with the image scale, e.g. Resize(img_scale=(400, 500), multiscale_mode='range', ratio_range=(0.5, 2.0)); (2) 'ratio_range' is None and 'multiscale_mode' == "range": randomly sample a scale from a range; the length of img_scale[tuple] must be 2, representing the small and large img_scale, e.g. Resize(img_scale=((100, 200), (400, 500)), multiscale_mode='range'); (3) 'ratio_range' is None and 'multiscale_mode' == "value": randomly sample a scale from multiple scales, e.g. Resize(img_scale=((100, 200), (300, 400), (400, 500)), multiscale_mode='value').
ratio_range (tuple[float]) – Value represents (min_ratio, max_ratio), the scale factor range.
keep_ratio (bool) – Whether to keep the aspect ratio when resizing the image.
- inverse_transform(inputs, task_type, inverse_info)¶
Inverse option of transform to map the prediction to the original image.
- Parameters
inputs (array|Tensor) – Prediction.
task_type (str) – detection or segmentation.
inverse_info (dict) – The transform keyword is the key, and the corresponding value is the value.
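A sketch of the three multiscale modes described above, assuming a minimal data dict (real pipelines typically carry more keys such as 'img_shape' and 'layout'):

```python
import numpy as np

from hat.data.transforms import Resize

# Mode 1: ratio_range given - sample a ratio in [0.5, 2.0] of img_scale.
resize_a = Resize(img_scale=(400, 500), multiscale_mode="range",
                  ratio_range=(0.5, 2.0))

# Mode 2: sample a scale between a small and a large img_scale.
resize_b = Resize(img_scale=((100, 200), (400, 500)), multiscale_mode="range")

# Mode 3: pick one scale from a fixed set.
resize_c = Resize(img_scale=((100, 200), (300, 400), (400, 500)),
                  multiscale_mode="value")

data = {"img": np.zeros((300, 400, 3), dtype=np.uint8)}  # assumed HWC input
data = resize_a(data)
```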
- class hat.data.transforms.Scale(scales: Union[numbers.Real, Sequence], mode: str = 'nearest', mul_scale: bool = False)¶
Scale input according to a scale list.
Note
Affected keys: 'img', 'gt_flow', 'gt_ori_flow', 'gt_seg'.
- Parameters
scales (Union[Real, Sequence]) – The scales to apply on input.
mode (str) – Algorithm used for upsampling: 'nearest' | 'bilinear' | 'area'. Default: 'nearest'.
mul_scale (bool) – Whether to multiply the scale coefficient.
- class hat.data.transforms.SegOneHot(num_classes: int)¶
SegOneHot is used to convert segmentation labels to one-hot format.
Note
Affected keys: 'gt_seg'.
- Parameters
num_classes (int) – Num classes.
- class hat.data.transforms.SegRandomAffine(degrees: Union[Sequence, float] = 0, translate: Tuple = None, scale: Tuple = None, shear: Union[Sequence, float] = None, interpolation: torchvision.transforms.functional.InterpolationMode = InterpolationMode.NEAREST, fill: Union[tuple, int] = 0, label_fill_value: Union[tuple, int] = - 1, translate_p: float = 1.0, scale_p: float = 1.0)¶
Apply random affine for both image and label.
Please refer to RandomAffine for details.
Note
Affected keys: 'img', 'gt_flow', 'gt_seg'.
- Parameters
label_fill_value (tuple or int, optional) – Fill value for label. Defaults to -1.
translate_p – Translate probability, range between [0, 1].
scale_p – Scale probability, range between [0, 1].
- class hat.data.transforms.SegRandomCrop(size, cat_max_ratio=1.0, ignore_index=255)¶
Random crop on data with gt_seg label; can only be used for segmentation task.
Note
Affected keys: 'img', 'img_shape', 'pad_shape', 'layout', 'gt_seg'.
- Parameters
size (tuple) – Expected size after cropping, (h, w).
cat_max_ratio (float, optional) – The maximum ratio that single category could occupy.
ignore_index (int, optional) – When considering the cat_max_ratio condition, the area corresponding to ignore_index will be ignored.
- get_crop_bbox(data)¶
Randomly get a crop bounding box.
- class hat.data.transforms.SegReWeightByArea(seg_num_classes, lower_bound: int = 0.5, ignore_index: int = 255)¶
Calculate the weight of each category according to the area of each category.
For each category, the calculation formula of weight is as follows: weight = max(1.0 - seg_area / total_area, lower_bound)
Note
Affected keys: 'gt_seg', 'gt_seg_weight'.
- Parameters
seg_num_classes (int) – Number of segmentation categories.
lower_bound (float) – Lower bound of weight.
ignore_index (int) – Index of ignore class.
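A worked illustration of the weight formula above, in plain numpy (not calling the class):

```python
import numpy as np

# weight = max(1.0 - seg_area / total_area, lower_bound)
gt_seg = np.array([[0, 0, 1], [0, 1, 2]])  # toy 2x3 label map
lower_bound = 0.5
total_area = gt_seg.size                   # 6 pixels

for cls in np.unique(gt_seg):
    seg_area = int((gt_seg == cls).sum())
    weight = max(1.0 - seg_area / total_area, lower_bound)
    print(cls, round(weight, 3))  # class 0 -> 0.5, 1 -> 0.667, 2 -> 0.833
```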
- class hat.data.transforms.SegResize(size, interpolation=InterpolationMode.BILINEAR)¶
Apply resize for both image and label.
Note
Affected keys: 'img', 'gt_seg'.
- Parameters
size – Target size of resize.
interpolation – Interpolation method of resize.
- forward(data)¶
- Parameters
img (PIL Image or Tensor) – Image to be scaled.
- Returns
Rescaled image.
- Return type
PIL Image or Tensor
- class hat.data.transforms.TensorToNumpy¶
Convert tensor to numpy.
- class hat.data.transforms.TimmMixup(*args, **kwargs)¶
Mixup of timm.
Note
Affected keys: 'img', 'labels'.
- Parameters
Args are the same as timm.data.Mixup.
- class hat.data.transforms.TimmTransforms(*args, **kwargs)¶
Transforms of timm.
Note
Affected keys: 'img'.
- Parameters
Args are the same as timm.data.create_transform.
- class hat.data.transforms.ToFasterRCNNData(max_gt_boxes_num=500, max_ig_regions_num=500)¶
Prepare faster-rcnn input data.
Convert gt_bboxes (n, 4) and gt_classes (n,) to gt_boxes (n, 5), gt_boxes_num (1,), ig_regions (m, 5), ig_regions_num (m,). If gt_ids exists, it will be concatenated into gt_boxes, expanding the gt_boxes array shape from nx5 to nx6.
Convert key img_shape to im_hw; convert image layout to chw.
- Parameters
max_gt_boxes_num (int) – Max gt bboxes number in one image, Default 500.
max_ig_regions_num (int) – Max ignore regions number in one image, Default 500.
- Returns
Result dict with gt_boxes (max_gt_boxes_num, 5 or 6), gt_boxes_num (1,), ig_regions (max_ig_regions_num, 5 or 6), ig_regions_num (1,), im_hw (2,); layout converted to "chw".
- Return type
dict
- class hat.data.transforms.ToMultiTaskFasterRCNNData(taskname_clsidx_map: Dict[str, int], max_gt_boxes_num: int = 500, max_ig_regions_num: int = 500)¶
Convert multi-classes detection data to multi-task data.
Each class will be converted to a detection task.
- Parameters
taskname_clsidx_map – {cls1: cls_idx1, cls2: cls_idx2}.
max_gt_boxes_num – Same as ToFasterRCNNData.
max_ig_regions_num – Same as ToFasterRCNNData.
- Returns
Result dict with "task1": FasterRCNNDataDict1, "task2": FasterRCNNDataDict2, …
- Return type
dict
- class hat.data.transforms.ToTensor(to_yuv=False)¶
Convert objects of various python types to torch.Tensor and convert the img to yuv444 format if to_yuv is True.
Supported types are: numpy.ndarray, torch.Tensor, Sequence, int, float.
Note
Affected keys: 'img', 'img_shape', 'pad_shape', 'layout', 'gt_bboxes', 'gt_seg', 'gt_seg_weights', 'gt_flow', 'color_space'.
- Parameters
to_yuv (bool) – If true, convert the img to yuv444 format.
- class hat.data.transforms.Undistortion¶
Convert a PIL Image or numpy.ndarray to an undistorted PIL Image or numpy.ndarray.
- class hat.data.transforms.Voxelization(pc_range, voxel_size, max_points_in_voxel, max_voxel_num, use_max=False)¶
Generate voxel from points.
- Parameters
pc_range (list) – [x_min, y_min, z_min, x_max, y_max, z_max].
voxel_size (list) – List [x, y, z] size of three dimensions.
max_points_in_voxel (int) – Max number of points per voxel.
max_voxel_num (int) – Max number of voxels.
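A construction sketch with a KITTI-like range; the input key name "points" and the dict-in/dict-out call convention are assumptions:

```python
import numpy as np

from hat.data.transforms import Voxelization

voxelize = Voxelization(
    pc_range=[0.0, -40.0, -3.0, 70.4, 40.0, 1.0],  # [x_min, y_min, z_min, x_max, y_max, z_max]
    voxel_size=[0.05, 0.05, 0.1],
    max_points_in_voxel=5,
    max_voxel_num=20000,
)

data = {"points": np.random.rand(1000, 4).astype(np.float32)}  # assumed key
data = voxelize(data)
```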