HPNet: Detecting human parts in the wild

Claudemir Casa, Jhonatan Souza, Tiago Mota de Oliveira

Abstract


The objective of this work is to present a neural network model for the detection and segmentation of human body parts (HPNet). We offer a multiplatform, real-time solution that can run on ordinary computers as well as on mobile devices and embedded systems. Our proposal is characterized by its compactness and by investigating an area of object detection that remains little explored. One of its striking features is the ability to recognize parts of the human body even in uncontrolled environments, owing to the use of a random subset of Google's public Open Images Dataset, which contains images with objects in widely varying sizes, positions, lighting and occlusion conditions. At first, we offer a solution only for the detection and segmentation of common parts of the human body, but we intend to expand its capabilities to detect other, more specific parts and regions. The main purpose of our model is to solve specific problems that require the detection and segmentation of human body parts, for example, in user authentication.
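To illustrate the data-selection step described above, the following is a minimal Python sketch (not the authors' actual code) of how a random subset of human-body-part images might be drawn from the public Open Images bounding-box annotations. The CSV file names follow the standard Open Images release conventions; the class list in PART_CLASSES and the sample size of 10,000 images are illustrative assumptions, not the selection used in the paper.

```python
# Hedged sketch: sample a random subset of human-body-part images from
# the Open Images bounding-box annotations (assumes the public CSV files
# have already been downloaded locally).
import pandas as pd

# Illustrative choice of part classes; the paper's exact classes may differ.
PART_CLASSES = {"Human face", "Human hand", "Human foot",
                "Human arm", "Human leg", "Human head"}

# Map human-readable class names to Open Images label IDs (MIDs).
# This file has no header row, hence the explicit column names.
classes = pd.read_csv("class-descriptions-boxable.csv",
                      names=["LabelName", "DisplayName"])
part_ids = set(classes.loc[classes.DisplayName.isin(PART_CLASSES), "LabelName"])

# Keep only the boxes belonging to the selected part classes.
boxes = pd.read_csv("train-annotations-bbox.csv")
part_boxes = boxes[boxes.LabelName.isin(part_ids)]

# Draw a random subset of images (10,000 here is an arbitrary example;
# sample() raises an error if fewer unique images are available).
image_ids = part_boxes["ImageID"].drop_duplicates().sample(n=10_000,
                                                           random_state=0)
subset = part_boxes[part_boxes["ImageID"].isin(image_ids)]
subset.to_csv("hpnet_subset_bbox.csv", index=False)
```

Filtering the annotation CSVs in this way yields a class-balanced-by-chance, uncontrolled-environment subset of the kind the abstract attributes the model's robustness to.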


Keywords


Human part detection, compact model, real-time model, neural networks.






DOI: https://doi.org/10.34117/bjdv7n2-221
