
LRC-Net: learning discriminative features on point clouds by encoding local region contexts. (English) Zbl 1505.94004

Summary: Learning discriminative features directly on point clouds remains challenging for the understanding of 3D shapes. Recent methods usually partition a point cloud into sets of local regions, extract a feature for each local region with a fixed-size CNN or MLP, and finally aggregate all individual local features into a global feature by simple max pooling. However, because sampled point clouds are irregular and sparse, it is hard to encode the fine-grained geometry of local regions and their spatial relationships when only fixed-size filters and independent integration of local features are used, which limits the ability to learn discriminative features. To address this issue, we present a novel Local-Region-Context Network (LRC-Net) that learns discriminative features on point clouds by simultaneously encoding the fine-grained contexts inside and among local regions. LRC-Net consists of two main modules. The first module, named intra-region context encoding, captures the geometric correlation inside each local region with a novel variable-size convolution filter. The second module, named inter-region context encoding, integrates the spatial relationships among local regions based on spatial similarity measures. Experimental results show that LRC-Net is competitive with state-of-the-art methods in shape classification and shape segmentation.
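The summary describes a now-common pipeline (partition into local regions, per-region feature extraction, global max pooling) on top of which LRC-Net adds its two context-encoding modules. The following is a minimal NumPy sketch of that baseline pipeline, plus a toy similarity-weighted pooling stand-in for the inter-region context encoding. It is not the authors' implementation: all names (farthest_point_sampling, shared_mlp, similarity_weighted_pool, k, sigma, ...) and the Gaussian similarity measure are illustrative assumptions.

```python
# Minimal sketch of the local-region pipeline described in the summary.
# NOT the authors' code; hyper-parameters and the similarity measure are assumptions.
import numpy as np

def farthest_point_sampling(points, n_centroids):
    """Greedily pick n_centroids points that are mutually far apart."""
    n = points.shape[0]
    chosen = [np.random.randint(n)]
    dists = np.full(n, np.inf)
    for _ in range(n_centroids - 1):
        dists = np.minimum(dists, np.linalg.norm(points - points[chosen[-1]], axis=1))
        chosen.append(int(np.argmax(dists)))
    return points[chosen]

def knn_regions(points, centroids, k):
    """Group the k nearest points around each centroid into a local region."""
    d = np.linalg.norm(points[None, :, :] - centroids[:, None, :], axis=-1)
    idx = np.argsort(d, axis=1)[:, :k]
    return points[idx]                                  # (n_centroids, k, 3)

def shared_mlp(region, weights):
    """Fixed-size point-wise MLP followed by max pooling inside the region."""
    h = np.maximum(region @ weights, 0.0)               # ReLU
    return h.max(axis=0)                                # one feature per region

def similarity_weighted_pool(local_feats, centroids, sigma=0.5):
    """Toy stand-in for inter-region context encoding: weight each local
    feature by the spatial similarity of its region to all other regions."""
    d = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=-1)
    sim = np.exp(-d**2 / (2 * sigma**2))                # assumed Gaussian similarity
    enriched = sim @ local_feats / sim.sum(axis=1, keepdims=True)
    return enriched.max(axis=0)

def global_feature(point_cloud, n_centroids=32, k=16, feat_dim=64, rng=None):
    rng = rng or np.random.default_rng(0)
    weights = rng.standard_normal((3, feat_dim)) * 0.1
    centroids = farthest_point_sampling(point_cloud, n_centroids)
    regions = knn_regions(point_cloud, centroids, k)
    local_feats = np.stack([shared_mlp(r - c, weights)  # centre each region
                            for r, c in zip(regions, centroids)])
    return similarity_weighted_pool(local_feats, centroids)

# usage
cloud = np.random.default_rng(1).standard_normal((1024, 3))
print(global_feature(cloud).shape)                      # (64,)
```

In the paper, the fixed shared_mlp would be replaced by the variable-size convolution filters of the intra-region module, and the plain max pooling by the full inter-region context encoding; the similarity-weighted pooling above only hints at the latter idea.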

MSC:

94A08 Image processing (compression, reconstruction, etc.) in information and communication theory
65D10 Numerical smoothing, curve fitting
