
We aim to set up a large-scale robotic application benchmark for the LiDAR semantic segmentation task. We collected a total of 38904 frames of hybrid-solid LiDAR data in different substations with an industrial robot and annotated 25 categories.

Example of labeled cumulative point clouds in S.MID.

A scene demo of labeled cumulative point clouds in our novel dataset S.MID.
Semantic LiDAR dataset comparison. Frames are listed as train/val/test. Classes gives the number of classes used for single-frame evaluation, with the total number of annotated classes in brackets.

| Dataset | Frames (train/val/test) | LiDAR | Type of LiDAR | Classes | Application |
| --- | --- | --- | --- | --- | --- |
| nuScenes | 28130/6019/6008 | Velodyne-HDL-32E | Mechanical Spinning LiDAR | 16 (32) | Autonomous Vehicle |
| SemanticKITTI | 19130/4071/20351 | Velodyne-HDL-64E | Mechanical Spinning LiDAR | 19 (34) | Autonomous Vehicle |
| S.MID | 13101/5000/20803 | Livox Mid-360 | Hybrid-Solid LiDAR | 14 (25) | Industrial Robot |


The figures below show the sensors mounted on the industrial robot we used to collect S.MID. Please note that only the data collected by the Livox Mid-360 and the corresponding labels are released with SMID_beta_v1_2 and SMID_v1_3.

The sensors equipped on our industrial robot used to collect S.MID. Livox Mid-360.

The Livox Mid-360 suits industrial robots that perform scene understanding tasks because its non-repetitive scanning mode covers a broader range of the scene. This mode is a double-edged sword, however: it also makes the point cloud relatively sparse and randomly distributed. The single-frame hybrid-solid LiDAR segmentation task therefore poses additional challenges for network design. (More details can be found in our paper.)

Label distributions

For the single-frame segmentation task, we merge the annotated labels into 14 classes (knife switch, main transformer, arrester, voltage transformer, busbar, switch, current transformer, scaffold, support column, road, other-ground, fence, fire shelter, wall). Imbalanced class counts are common in substation scenes. As with the imbalanced class distributions observed in autonomous driving datasets, methods evaluated on S.MID must contend with this imbalance.
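To get a feel for the imbalance, one can count the points per class over a set of label arrays. The sketch below is only an illustration, not the procedure used for S.MID: it assumes the 14 merged classes are encoded as contiguous IDs 0 to 13 (the actual ID mapping is not stated here), and the inverse-frequency weighting is a common generic heuristic rather than a method from our paper. The function names are hypothetical.

```python
import numpy as np

NUM_CLASSES = 14  # number of merged single-frame classes in S.MID


def class_point_counts(label_arrays, num_classes=NUM_CLASSES):
    """Accumulate the number of points per class over several scans.

    Assumption: class IDs are contiguous integers in [0, num_classes).
    IDs outside that range are silently skipped.
    """
    counts = np.zeros(num_classes, dtype=np.int64)
    for labels in label_arrays:
        labels = np.asarray(labels).astype(np.int64)
        valid = labels[(labels >= 0) & (labels < num_classes)]
        counts += np.bincount(valid, minlength=num_classes)
    return counts


def inverse_frequency_weights(counts):
    """Generic inverse-frequency class weights, normalised to mean 1.

    Classes with zero points get weight 0 instead of an infinite weight.
    """
    counts = np.asarray(counts, dtype=np.float64)
    weights = np.zeros_like(counts)
    present = counts > 0
    if not present.any():
        return weights
    weights[present] = counts.sum() / counts[present]
    weights[present] /= weights[present].mean()
    return weights
```

Such weights could, for example, be passed to a weighted cross-entropy loss; whether this helps on S.MID is something to verify experimentally.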

A diagram of number of points in each class in S.MID.

Folder structure and format

Similar to SemanticKITTI, for each scan XXXXXX.bin in the hybrid folder we provide a file XXXXXX.label in the labels folder that contains a label for each point in binary format. Each label is a 32-bit unsigned integer (aka uint32_t), of which the lower 16 bits correspond to the label. See our project page to learn more about how to load our dataset.
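As a rough illustration of the label format described above, the following sketch reads a .label file with NumPy and masks out the lower 16 bits. It assumes little-endian byte order, and the function names are hypothetical; the upper 16 bits are returned unchanged because their meaning is not specified here. The layout of the .bin scan files is not covered, so please refer to the project page for the authoritative loader.

```python
import numpy as np


def decode_labels(raw):
    """Split raw uint32 label words into (lower 16 bits, upper 16 bits).

    The lower 16 bits are the per-point label. The upper 16 bits are
    returned as-is, without interpretation.
    """
    raw = np.asarray(raw, dtype=np.uint32)
    label = raw & np.uint32(0xFFFF)
    upper = raw >> np.uint32(16)
    return label, upper


def read_label_file(path):
    """Read one XXXXXX.label file: one uint32 per point.

    Assumption: values are stored in little-endian byte order.
    """
    raw = np.fromfile(path, dtype="<u4")
    return decode_labels(raw)
```

For example, `label, _ = read_label_file("labels/XXXXXX.label")` yields one label per point, in the same order as the points of the matching scan in the hybrid folder.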