Deep Convolutional neural networks

In computer science, a convolutional neural network is a type of feed-forward neural network whose neurons are tiled so that each responds to an overlapping region of the visual field.[1] Convolutional networks were inspired by biological processes[2] and are variations of multilayer perceptrons designed to require minimal preprocessing.[3] They are widely used for image recognition.[4]

 

Architecture

Convolutional neural networks consist of multiple layers of small collections of neurons, each of which looks at a small portion of the input image. The outputs of these collections are then tiled so that they overlap, giving a better representation of the original image; this is repeated at every such layer. Because of this tiling, the networks tolerate translation of the input image.[4] Most convolutional networks include local pooling or max pooling layers, which simplify and combine the outputs of neighboring neurons; if those outputs are viewed as an image, a pooling layer reduces its resolution.[5] The networks are built from various combinations of convolutional layers and fully connected layers, with a pointwise nonlinearity applied after each layer.[6]
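
As an illustration of the layer types described above, the following is a minimal sketch in plain Python rather than an implementation taken from any of the cited papers. The function names (conv2d_valid, relu, max_pool, forward), the 2x2 edge-detecting kernel, and the 5x5 test image are all assumptions chosen for the example. It applies one convolutional layer, a pointwise nonlinearity, and one max pooling layer, then checks that shifting the input by one pixel leaves the pooled output unchanged.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the response map.

    Every output value uses the same kernel weights, applied to an
    overlapping patch of the input.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Pointwise nonlinearity: negative responses are clipped to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window.

    Rows or columns that do not fill a complete window are dropped.
    """
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[i * size + di][j * size + dj]
                for di in range(size) for dj in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def forward(image, kernel):
    """One convolution -> nonlinearity -> pooling stage."""
    return max_pool(relu(conv2d_valid(image, kernel)))


# Hypothetical kernel that responds to a dark-to-bright vertical edge.
kernel = [[-1, 1],
          [-1, 1]]

# A 5x5 image with a vertical edge between columns 1 and 2 ...
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# ... and the same image shifted one pixel to the left.
shifted = [[0, 1, 1, 1, 1] for _ in range(5)]

print(conv2d_valid(image, kernel)[0])  # [0, 2, 0, 0]
print(forward(image, kernel))          # [[2, 0], [2, 0]]
print(forward(shifted, kernel))        # [[2, 0], [2, 0]]
```

The two pooled outputs agree only because the one-pixel shift keeps the edge response inside the same 2x2 pooling window. A larger shift would move the response to a different output cell, so pooling gives tolerance to small translations rather than full invariance.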

Some time-delay neural networks use an architecture very similar to that of convolutional neural networks, especially those built for image recognition or classification tasks, since the "tiling" of the neuron outputs can be carried out in timed stages in a manner useful for the analysis of images.[7]
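
The similarity can be seen in a one-dimensional sketch. The code below is a simplified illustration, not the network from the cited paper; the function name time_delay_layer and the example signal and weights are assumptions. It applies one shared set of weights to every window of consecutive time steps, which is the same weight sharing a convolutional layer performs across image positions.

```python
def time_delay_layer(signal, weights, bias=0):
    """Apply the same weights to every window of consecutive time steps."""
    width = len(weights)
    return [
        sum(w * x for w, x in zip(weights, signal[t:t + width])) + bias
        for t in range(len(signal) - width + 1)
    ]


# Hypothetical input: a short pulse, and the same pulse one step later.
signal = [0, 0, 1, 3, 1, 0, 0]
delayed = [0] + signal[:-1]
weights = [1, 2, 1]

early = time_delay_layer(signal, weights)
late = time_delay_layer(delayed, weights)

print(early)  # [1, 5, 8, 5, 1]
print(late)   # [0, 1, 5, 8, 5]
```

Delaying the input by one step delays the response by one step without changing its shape, so the peak response is the same wherever the pulse occurs.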

History

Convolutional neural networks were introduced in a 1980 paper by Kunihiko Fukushima.[6][8] Their design was improved in 1998 by Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner,[9] generalized in 2003 by Sven Behnke,[10] and simplified in the same year by Patrice Simard, David Steinkraus, and John C. Platt.[11] In 2011, Dan Ciresan et al. refined them and implemented them on a GPU, with strong reported performance.[12] In 2012, Ciresan et al. reported results that surpassed the previously published records on multiple image databases, including the MNIST database, the NORB database, the HWDB1.0 dataset (Chinese characters), and the CIFAR10 dataset (a subset of the 80 Million Tiny Images dataset).[6]

Usage

Convolutional neural networks are often used in image recognition systems. Applied to hand tracking in a video stream and to gesture recognition, they achieved almost perfect performance.[13] They have been reported to outperform humans by a factor of two on traffic sign recognition and to reach an error rate of 0.23 percent on the MNIST database, the lowest reported on that database at the time of publication.[6] Another paper on image classification with convolutional neural networks described their learning process as "surprisingly fast"; the same paper reported the best published results at the time on the MNIST and NORB databases.[12]

They have also been reported to achieve lower error rates than both deep neural networks and regular neural networks on large-vocabulary speech recognition tasks.[14]

When applied to facial recognition, they contributed to a large decrease in error rate.[15] In another paper, they achieved a 97.6 percent recognition rate on "5,600 still images of more than 10 subjects".[2] Convolutional neural networks have also been used to assess video quality objectively after being manually trained; the resulting system had a very low root mean square error.[7]

References

  1. "Convolutional Neural Networks (LeNet) - DeepLearning 0.1 documentation". DeepLearning 0.1. LISA Lab. Retrieved 31 August 2013.
  2. Matsugu, Masakazu; Katsuhiko Mori, Yusuke Mitari, and Yuji Kaneda (2003). "Subject independent facial expression recognition with robust face detection using a convolutional neural network". Neural Networks 16 (5): 555–559. Retrieved 17 November 2013.
  3. LeCun, Yann. "LeNet-5, convolutional neural networks". Retrieved 16 November 2013.
  4. Korekado, Keisuke; Takashi Morie, Osamu Nomura, Hiroshi Ando, Teppei Nakano, Masakazu Matsugu, and Atsushi Iwata (2003). "A Convolutional Neural Network VLSI for Image Recognition Using Merged/Mixed Analog-Digital Architecture". Knowledge-Based Intelligent Information and Engineering Systems: 169–176. Retrieved 16 November 2013.
  5. Krizhevsky, Alex. "ImageNet Classification with Deep Convolutional Neural Networks". Retrieved 17 November 2013.
  6. Ciresan, Dan; Ueli Meier, and Jürgen Schmidhuber (June 2012). "Multi-column deep neural networks for image classification". 2012 IEEE Conference on Computer Vision and Pattern Recognition (New York, NY: Institute of Electrical and Electronics Engineers (IEEE)): 3642–3649. arXiv:1202.2745v1. doi:10.1109/CVPR.2012.6248110. ISBN 9781467312264. OCLC 812295155. Retrieved 9 December 2013.
  7. Le Callet, Patrick; Christian Viard-Gaudin, and Dominique Barba (2006). "A Convolutional Neural Network Approach for Objective Video Quality Assessment". IEEE Transactions on Neural Networks 17 (5): 1316–1327. Retrieved 17 November 2013.
  8. Fukushima, Kunihiko (1980). "Neocognitron: A Self-organizing Neural Network Model for a Mechanism of Pattern Recognition Unaffected by Shift in Position". Biological Cybernetics 36 (4): 193–202. Retrieved 16 November 2013.
  9. LeCun, Yann; Léon Bottou, Yoshua Bengio, and Patrick Haffner (1998). "Gradient-based learning applied to document recognition". Proceedings of the IEEE 86 (11): 2278–2324. Retrieved 16 November 2013.
  10. Behnke, Sven (2003). Hierarchical Neural Networks for Image Interpretation. Lecture Notes in Computer Science, volume 2766. Springer.
  11. Simard, Patrice; David Steinkraus, and John C. Platt (2003). "Best Practices for Convolutional Neural Networks Applied to Visual Document Analysis". ICDAR, vol. 3: 958–962.
  12. Ciresan, Dan; Ueli Meier, Jonathan Masci, Luca M. Gambardella, and Jürgen Schmidhuber (2011). "Flexible, High Performance Convolutional Neural Networks for Image Classification". Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, Volume Two: 1237–1242. Retrieved 17 November 2013.
  13. Nowlan, Steven J.; John C. Platt (1995). "A convolutional neural network hand tracker". Advances in Neural Information Processing Systems: 901–908. Retrieved 16 November 2013.
  14. Sainath, Tara N.; Abdel-rahman Mohamed, Brian Kingsbury, and Bhuvana Ramabhadran (2013). "Deep convolutional neural networks for LVCSR". International Conference on Acoustics, Speech, and Signal Processing: 8614–8618. Retrieved 31 August 2013.
  15. Lawrence, Steve; C. Lee Giles, Ah Chung Tsoi, and Andrew D. Back (1997). "Face Recognition: A Convolutional Neural Network Approach". IEEE Transactions on Neural Networks 8 (1): 98–113. Retrieved 16 November 2013.
