Neural network
[Image: Neural network example.png. Simplified view of an artificial neural network]
Traditionally, the term neural network referred to a network or circuit of biological neurons. In modern usage the term often refers to artificial neural networks, which are composed of artificial neurons or nodes. Thus the term has two distinct usages:
- Biological neural networks are made up of real biological neurons that are connected or functionally related in the peripheral nervous system or the central nervous system. In the field of neuroscience, they are often identified as groups of neurons that perform a specific physiological function in laboratory analysis.
- Artificial neural networks are made up of interconnecting artificial neurons (programming constructs that mimic the properties of biological neurons). Artificial neural networks may be used either to gain an understanding of biological neural networks or to solve artificial intelligence problems without necessarily creating a model of a real biological system.
This article focuses on the relationship between the two concepts; for detailed coverage of each, refer to the separate articles: Biological neural network and Artificial neural network.
Characterization
In general, a biological neural network is composed of a group or groups of chemically connected or functionally associated neurons. A single neuron may be connected to many other neurons, and the total number of neurons and connections in a network may be extensive. Connections, called synapses, are usually formed from axons to dendrites, though dendrodendritic microcircuits (1) and other connections are possible. Apart from electrical signaling, there are other forms of signaling that arise from neurotransmitter diffusion and that affect electrical signaling. As such, neural networks are extremely complex.
Artificial intelligence and cognitive modeling try to simulate some properties of neural networks. While similar in their techniques, the former aims to solve particular tasks, while the latter aims to build mathematical models of biological neural systems.
In the artificial intelligence field, artificial neural networks have been applied successfully to speech recognition, image analysis and adaptive control, in order to construct software agents (in computer and video games) or autonomous robots. Most of the artificial neural networks currently employed for artificial intelligence are based on statistical estimation, optimization and control theory.
The cognitive modelling field involves the physical or mathematical modeling of the behaviour of neural systems, ranging from the individual neural level (e.g. modelling the spike response curves of neurons to a stimulus), through the neural cluster level (e.g. modelling the release and effects of dopamine in the basal ganglia), to the complete organism (e.g. behavioural modelling of the organism's response to stimuli).
The brain, neural networks and computers
Neural networks, as used in artificial intelligence, have traditionally been viewed as simplified models of neural processing in the brain, even though the relation between this model and the biological architecture of the brain is debated.
A subject of current research in theoretical neuroscience is the question of how complex individual neural elements must be, and which properties they must have, to reproduce something resembling animal intelligence.
Historically, computers evolved from the von Neumann architecture, which is based on sequential processing and execution of explicit instructions. On the other hand, the origins of neural networks are based on efforts to model information processing in biological systems, which may rely largely on parallel processing as well as implicit instructions based on recognition of patterns of 'sensory' input from external sources. In other words, at its very heart a neural network is a complex statistical processor (as opposed to being tasked to sequentially process and execute).
Neural networks and artificial intelligence
An artificial neural network (ANN), also called a simulated neural network (SNN) or commonly just neural network (NN), is an interconnected group of artificial neurons that uses a mathematical or computational model for information processing based on a connectionist approach to computation. In most cases an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network.
In more practical terms, neural networks are non-linear statistical data modeling or decision-making tools. They can be used to model complex relationships between inputs and outputs or to find patterns in data.
Background
An artificial neural network involves a network of simple processing elements (artificial neurons) which can exhibit complex global behaviour, determined by the connections between the processing elements and element parameters. One classical type of artificial neural network is the Hopfield net.
In a neural network model, simple nodes, which can be called variously "neurons", "neurodes", "Processing Elements" (PE) or "units", are connected together to form a network of nodes, hence the term "neural network". While a neural network does not have to be adaptive per se, its practical use comes with algorithms designed to alter the strength (weights) of the connections in the network to produce a desired signal flow.
In modern software implementations of artificial neural networks, the approach inspired by biology has more or less been abandoned for a more practical approach based on statistics and signal processing. In some of these systems, neural networks, or parts of neural networks (such as artificial neurons), are used as components in larger systems that combine both adaptive and non-adaptive elements.
The concept of a neural network appears to have first been proposed by Alan Turing in his 1948 paper "Intelligent Machinery".
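As an illustration of how connection weights between simple units can produce global behaviour, the following is a minimal sketch of a Hopfield-style network. The storage rule (a Hebbian outer product), the sign activation, and the names train_hopfield, recall, PATTERN and NOISY are assumptions chosen for this example and are not taken from the text above.

```python
# Minimal Hopfield-style sketch (illustrative assumptions throughout).
# Units take the values +1 or -1; weights are set once from the stored
# patterns and then used to pull a corrupted state back to a stored one.

def train_hopfield(patterns):
    """Build a symmetric weight matrix with a Hebbian outer-product rule."""
    n = len(patterns[0])
    weights = [[0.0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += pattern[i] * pattern[j] / n
    return weights

def recall(weights, state, sweeps=5):
    """Update units one at a time until the sweeps are exhausted."""
    state = list(state)
    for _ in range(sweeps):
        for i in range(len(state)):
            field = sum(w * s for w, s in zip(weights[i], state))
            state[i] = 1 if field >= 0 else -1
    return state

PATTERN = [1, -1, 1, -1, 1, -1]
NOISY = [-1, -1, 1, -1, 1, -1]  # PATTERN with its first unit flipped

if __name__ == "__main__":
    weights = train_hopfield([PATTERN])
    print(recall(weights, NOISY))
```

With a single stored pattern and one flipped unit, the recalled state should match the stored pattern; with many stored patterns, recall can fail, which this sketch does not explore.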
Applications
The utility of artificial neural network models lies in the fact that they can be used to infer a function from observations and also to use it. This is particularly useful in applications where the complexity of the data or task makes the design of such a function by hand impractical.
Real life applications
Artificial neural networks are applied in areas that include system identification and control (vehicle control, process control), game-playing and decision making (backgammon, chess, racing), pattern recognition (radar systems, face identification, object recognition, etc.), sequence recognition (gesture, speech, handwritten text recognition), medical diagnosis, financial applications, data mining (or knowledge discovery in databases, "KDD"), visualization and e-mail spam filtering.
Neural network software
Main article: Neural network software
Neural network software is used to simulate, research, develop and apply artificial neural networks, biological neural networks and, in some cases, a wider array of adaptive systems.
Learning paradigms
There are three major learning paradigms, each corresponding to a particular abstract learning task: supervised learning, unsupervised learning and reinforcement learning. Usually any given type of network architecture can be employed in any of those tasks.
- Supervised learning
In supervised learning, we are given a set of example pairs (x, y), x ∈ X, y ∈ Y, and the aim is to find a function f in the allowed class of functions that matches the examples. In other words, we wish to infer the mapping implied by the data; the cost function is related to the mismatch between our mapping and the data.
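The supervised setting can be sketched with a deliberately small example. The code below assumes the allowed class of functions is f(x) = w*x + b and that the cost is the mean squared mismatch between f(x) and y; both choices, and the names fit_linear, mean_squared_error and EXAMPLES, are illustrative assumptions rather than anything prescribed by the text.

```python
# Supervised learning sketch: infer f(x) = w*x + b from example pairs (x, y)
# by gradient descent on the mean squared error (an assumed cost function).

def mean_squared_error(pairs, w, b):
    """Average squared mismatch between the mapping and the data."""
    return sum((w * x + b - y) ** 2 for x, y in pairs) / len(pairs)

def fit_linear(pairs, learning_rate=0.05, steps=2000):
    """Return (w, b) after a fixed number of gradient descent steps."""
    w, b = 0.0, 0.0
    n = len(pairs)
    for _ in range(steps):
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in pairs) / n
        grad_b = sum(2.0 * (w * x + b - y) for x, y in pairs) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Example pairs generated from y = 2x + 1 (hypothetical data).
EXAMPLES = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]

if __name__ == "__main__":
    w, b = fit_linear(EXAMPLES)
    print(w, b, mean_squared_error(EXAMPLES, w, b))
```

On this data the fitted parameters should approach w = 2 and b = 1. A neural network replaces the linear function with a richer class of functions, but the idea of reducing a mismatch cost is the same.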
- Unsupervised learning
In unsupervised learning we are given some data x and a cost function to be minimized, which can be any function of x and the network's output, f. The cost function is determined by the task formulation. Most applications fall within the domain of estimation problems such as statistical modeling, compression, filtering, blind source separation and clustering.
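A very small instance of the unsupervised setting, given here only as a sketch: assume the model output is a single constant f = a and the cost is the mean of (x - f)^2 over the data. Minimizing that cost drives a towards the mean of the data. The model, the cost and the names fit_constant and DATA are assumptions made for this illustration.

```python
# Unsupervised learning sketch: no targets are given, only data x and a cost.
# Assumed model: a constant output a. Assumed cost: mean of (x - a)^2.

def fit_constant(data, learning_rate=0.1, steps=200):
    """Minimize the mean of (x - a)^2 over a by gradient descent."""
    a = 0.0
    for _ in range(steps):
        gradient = sum(2.0 * (a - x) for x in data) / len(data)
        a -= learning_rate * gradient
    return a

DATA = [2.0, 4.0, 9.0]  # hypothetical observations

if __name__ == "__main__":
    print(fit_constant(DATA), sum(DATA) / len(DATA))
```

The two printed values should agree closely, since the minimizer of this particular cost is the sample mean. Other cost functions lead to other tasks, such as compression or clustering.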
- Reinforcement learning
In reinforcement learning, data x is usually not given, but generated by an agent's interactions with the environment. At each point in time t, the agent performs an action y_t and the environment generates an observation x_t and an instantaneous cost c_t, according to some (usually unknown) dynamics. The aim is to discover a policy for selecting actions that minimises some measure of a long-term cost, i.e. the expected cumulative cost. The environment's dynamics and the long-term cost for each policy are usually unknown, but can be estimated. ANNs are frequently used in reinforcement learning as part of the overall algorithm. Tasks that fall within the paradigm of reinforcement learning are control problems, games and other sequential decision making tasks.
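The interaction loop described above can be sketched with tabular Q-learning on a tiny chain of states. This is one possible algorithm, swapped in for illustration; it uses a lookup table where a larger system might use a neural network. The environment (five states, a cost of 1 per step, a goal at the last state), the uniformly random exploration and all names are assumptions.

```python
import random

# Reinforcement learning sketch: the agent only sees observations and costs
# produced by step(); it never reads the dynamics directly.

N_STATES = 5        # states 0..4; state 4 is the terminal goal
ACTIONS = (-1, 1)   # move left or move right

def step(state, action):
    """Hypothetical environment: returns the next observation and a cost."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    return next_state, 1.0

def q_learning(episodes=2000, alpha=0.5, seed=0):
    """Estimate the long-term cost of each (state, action) pair."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)  # explore with random actions
            next_state, cost = step(state, action)
            # Entries for the terminal state are never updated and stay 0.
            future = min(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (cost + future - q[(state, action)])
            state = next_state
    return q

def greedy_policy(q):
    """Pick, in each non-terminal state, the action with the lowest cost."""
    return [min(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]

if __name__ == "__main__":
    print(greedy_policy(q_learning()))
```

Because every step costs the same, the policy that minimises cumulative cost is to move right in every state, and the learned estimate for moving right from state 0 should approach 4, the number of steps to the goal.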
Learning algorithms
There are many algorithms for training neural networks; most of them can be viewed as a straightforward application of optimization theory and statistical estimation. Evolutionary computation methods, simulated annealing, expectation maximization and non-parametric methods are among other commonly used methods for training neural networks. See also machine learning.
Recent developments in this field include the use of particle swarm optimization and other swarm intelligence techniques in the training of neural networks.
Neural networks and neuroscience
Theoretical and computational neuroscience is the field concerned with the theoretical analysis and computational modeling of biological neural systems.
Since neural systems are intimately related to cognitive processes and behaviour, the field is closely related to cognitive and behavioural modeling.
The aim of the field is to create models of biological neural systems in order to understand how biological systems work. To gain this understanding, neuroscientists strive to make a link between observed biological processes (data), biologically plausible mechanisms for neural processing and learning (biological neural network models) and theory (statistical learning theory and information theory).
Types of models
Many models are used in the field, each defined at a different level of abstraction and trying to model different aspects of neural systems. They range from models of the short-term behaviour of individual neurons, through models of how the dynamics of neural circuitry arise from interactions between individual neurons, to models of how behaviour can arise from abstract neural modules that represent complete subsystems. These include models of the long-term and short-term plasticity of neural systems and its relation to learning and memory, from the individual neuron to the system level.
Current research
While research was initially concerned mostly with the electrical characteristics of neurons, a particularly important part of the investigation in recent years has been the exploration of the role of neuromodulators such as dopamine, acetylcholine, and serotonin in behaviour and learning.
Biophysical models, such as BCM theory, have been important in understanding mechanisms for synaptic plasticity, and have had applications in both computer science and neuroscience. Research is ongoing in understanding the computational algorithms used in the brain, with some recent biological evidence for radial basis networks and neural backpropagation as mechanisms for processing data.
History of the neural network analogy
The concept of neural networks started in the late 1800s as an effort to describe how the human mind performed. These ideas started being applied to computational models with Turing's B-type machines and the Perceptron.
In the early 1950s Friedrich Hayek was one of the first to posit the idea of spontaneous order in the brain arising out of decentralized networks of simple units (neurons). In the late 1940s, Donald Hebb made one of the first hypotheses for a mechanism of neural plasticity (i.e. learning), Hebbian learning. Hebbian learning is considered to be a 'typical' unsupervised learning rule and it (and variants of it) was an early model for long-term potentiation.
The
Perceptron is essentially a linear classifier for classifying data x ∈ R^n, specified by parameters w ∈ R^n, b ∈ R and an output function f = w'x + b. Its parameters are adapted with an ad-hoc rule similar to stochastic steepest gradient descent. Because the inner product is a linear operator in the input space, the Perceptron can only perfectly classify a set of data for which different classes are linearly separable in the input space, while it often fails completely for non-separable data. While the development of the algorithm initially generated some enthusiasm, partly because of its apparent relation to biological mechanisms, the later discovery of this inadequacy caused such models to be abandoned until the introduction of non-linear models into the field.
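The limitation described for the Perceptron can be made concrete with a short sketch. The code below uses the classic error-driven update (add or subtract the input when a sample is misclassified), which is one common form of the ad-hoc rule; the threshold at zero, the learning rate, the epoch limit and the names predict, train_perceptron, AND_DATA and XOR_DATA are assumptions for this illustration.

```python
# Perceptron sketch: output is 1 when w'x + b > 0 and 0 otherwise.

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0

def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Adjust w and b after each misclassified sample."""
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, target in samples:
            delta = target - predict(w, b, x)
            if delta != 0:
                errors += 1
                w = [wi + learning_rate * delta * xi for wi, xi in zip(w, x)]
                b += learning_rate * delta
        if errors == 0:  # every sample classified correctly
            break
    return w, b

# AND is linearly separable; XOR is not.
AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def accuracy(samples):
    w, b = train_perceptron(samples)
    return sum(predict(w, b, x) == t for x, t in samples) / len(samples)

if __name__ == "__main__":
    print("AND:", accuracy(AND_DATA))
    print("XOR:", accuracy(XOR_DATA))
```

The AND samples can be separated by a line, so training reaches full accuracy. No choice of w and b classifies all four XOR samples correctly, so accuracy on XOR stays below 1 however long training runs.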
Cognitron (1975) was an early multilayered neural network with a training algorithm. The actual structure of the network and the methods used to set the interconnection weights change from one neural strategy to another, each with its advantages and disadvantages. Networks can propagate information in one direction only, or they can bounce back and forth until self-activation at a node occurs and the network settles on a final state. The ability for bi-directional flow of inputs between neurons/nodes was produced with Hopfield's network (1982), and specialization of these node layers for specific purposes was introduced through the first hybrid network.
The parallel distributed processing of the mid-1980s became popular under the name connectionism.
The rediscovery of the backpropagation algorithm was probably the main reason behind the repopularisation of neural networks after the publication of "Learning Internal Representations by Error Propagation" in 1986 (though backpropagation itself dates from 1974). The original network utilised multiple layers of weight-sum units of the type f = g(w'x + b), where g was a sigmoid function or logistic function such as used in logistic regression. Training was done by a form of stochastic steepest gradient descent. The employment of the chain rule of differentiation in deriving the appropriate parameter updates results in an algorithm that seems to 'backpropagate errors', hence the nomenclature. However, it is essentially a form of gradient descent. Determining the optimal parameters in a model of this type is not trivial, and steepest gradient descent methods cannot be relied upon to give the solution without a good starting point. In recent times, networks with the same architecture as the backpropagation network are referred to as Multi-Layer Perceptrons. This name does not impose any limitations on the type of algorithm used for learning.
The backpropagation network generated much enthusiasm at the time and there was much controversy about whether such learning could be implemented in the brain or not, partly because a mechanism for reverse signalling was not obvious at the time, but most importantly because there was no plausible source for the 'teaching' or 'target' signal.
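The chain-rule computation behind backpropagation can be sketched for a network with one hidden layer of units of the type f = g(w'x + b), with g the logistic function. The squared-error cost, the layer sizes, the random initialisation, the learning rate and every function name below are assumptions made for this illustration; it is not a reconstruction of the 1986 network.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_params(n_inputs, n_hidden, seed=0):
    """Small random weights; a poor starting point can stall training."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-1, 1) for _ in range(n_inputs)]
               for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.uniform(-1, 1) for _ in range(n_hidden)],
        "b2": 0.0,
    }

def forward(p, x):
    """Each unit computes g(w'x + b)."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(p["w1"], p["b1"])]
    output = sigmoid(sum(w * h for w, h in zip(p["w2"], hidden)) + p["b2"])
    return hidden, output

def squared_error(p, x, y):
    return (forward(p, x)[1] - y) ** 2

def backprop(p, x, y):
    """Gradient of the squared error, obtained with the chain rule."""
    hidden, output = forward(p, x)
    delta_out = 2.0 * (output - y) * output * (1.0 - output)
    # The output error is propagated back through the weights w2.
    delta_hidden = [delta_out * w * h * (1.0 - h)
                    for w, h in zip(p["w2"], hidden)]
    return {
        "w1": [[d * xi for xi in x] for d in delta_hidden],
        "b1": delta_hidden,
        "w2": [delta_out * h for h in hidden],
        "b2": delta_out,
    }

def sgd_step(p, g, lr):
    """One stochastic gradient descent update; returns new parameters."""
    return {
        "w1": [[w - lr * gw for w, gw in zip(wr, gr)]
               for wr, gr in zip(p["w1"], g["w1"])],
        "b1": [b - lr * gb for b, gb in zip(p["b1"], g["b1"])],
        "w2": [w - lr * gw for w, gw in zip(p["w2"], g["w2"])],
        "b2": p["b2"] - lr * g["b2"],
    }

def train(p, data, lr=0.5, epochs=5000):
    for _ in range(epochs):
        for x, y in data:
            p = sgd_step(p, backprop(p, x, y), lr)
    return p

XOR = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
       ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]

if __name__ == "__main__":
    trained = train(init_params(2, 3), XOR)
    for x, y in XOR:
        print(x, y, round(forward(trained, x)[1], 3))
```

Whether the printed outputs end up close to the targets depends on the starting point, which is the sensitivity noted above; a different seed or more hidden units may be needed.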
Criticism
A. K. Dewdney, a former Scientific American columnist, wrote in 1997, “Although neural nets do solve a few toy problems, their powers of computation are so limited that I am surprised anyone takes them seriously as a general problem-solving tool.” (Dewdney, p. 82)
Arguments against Dewdney's position are that neural nets have been successfully used to solve many complex and diverse tasks, ranging from autonomously flying aircraft to detecting credit card fraud.
Technology writer Roger Bridgman commented on Dewdney's statements about neural nets:
Neural networks, for instance, are in the dock not only because they have been hyped to high heaven, (what hasn't?) but also because you could create a successful net without understanding how it worked: the bunch of numbers that captures its behaviour would in all probability be "an opaque, unreadable table...valueless as a scientific resource".
In spite of his emphatic declaration that science is not technology, Dewdney seems here to pillory neural nets as bad science when most of those devising them are just trying to be good engineers. An unreadable table that a useful machine could read would still be well worth having. (2)
References
- (1) Arbib, p. 666.
- (2) Roger Bridgman's defence of neural networks.
- Arbib, Michael A. (Ed.) (1995). The Handbook of Brain Theory and Neural Networks.
- Alspector. US patent 4874963, "Neuromorphic learning networks". October 17, 1989.
- Agre, Philip E., et al. (1997). Comparative Cognitive Robotics: Computation and Human Experience. Cambridge University Press. ISBN 0-521-38603-9. p. 80.
- Bar-Yam, Yaneer (2003). Dynamics of Complex Systems. Chapter 2.
- Bar-Yam, Yaneer (2003). Dynamics of Complex Systems. Chapter 3.
- Bar-Yam, Yaneer (2005). Making Things Work. See chapter 3.
- Bertsekas, Dimitri P. (1999). Nonlinear Programming.
- Bertsekas, Dimitri P. & Tsitsiklis, John N. (1996). Neuro-dynamic Programming.
- Bhadeshia, H. K. D. H. (1992). "Neural Networks in Materials Science". ISIJ International 39: 966–979. doi:10.2355/isijinternational.39.966.
- Boyd, Stephen & Vandenberghe, Lieven (2004). Convex Optimization.
- Dewdney, A. K. (1997). Yes, We Have No Neutrons: An Eye-Opening Tour through the Twists and Turns of Bad Science. Wiley. 192 pp. See chapter 5.
- Egmont-Petersen, M., de Ridder, D. & Handels, H. (2002). "Image processing with neural networks - a review". Pattern Recognition 35 (10): 2279–2301. doi:10.1016/S0031-3203(01)00178-9.
- Fukushima, K. (1975). "Cognitron: A Self-Organizing Multilayered Neural Network". Biological Cybernetics 20: 121–136. doi:10.1007/BF00342633.
- Frank, Michael J. (2005). "Dynamic Dopamine Modulation in the Basal Ganglia: A Neurocomputational Account of Cognitive Deficits in Medicated and Non-medicated Parkinsonism". Journal of Cognitive Neuroscience 17: 51–72. doi:10.1162/0898929052880093.
- Gardner, E.J. & Derrida, B. (1988). "Optimal storage properties of neural network models". Journal of Physics A 21: 271–284. doi:10.1088/0305-4470/21/1/031.
- Krauth, W. & Mezard, M. (1989). "Storage capacity of memory with binary couplings". Journal de Physique 50: 3057–3066. doi:10.1051/jphys:0198900500200305700.
- Maass, W. & Markram, H. (2002). "On the computational power of recurrent circuits of spiking neurons". Journal of Computer and System Sciences 69 (4): 593–616.
- MacKay, David (2003). Information Theory, Inference, and Learning Algorithms.
- Mandic, D. & Chambers, J. (2001). Recurrent Neural Networks for Prediction: Architectures, Learning algorithms and Stability. Wiley.
- Minsky, M. & Papert, S. (1969). Perceptrons: An Introduction to Computational Geometry. MIT Press.
- Muller, P. & Insua, D.R. (1995). "Issues in Bayesian Analysis of Neural Network Models". Neural Computation 10: 571–592.
- Reilly, D.L., Cooper, L.N. & Elbaum, C. (1982). "A Neural Model for Category Learning". Biological Cybernetics 45: 35–41. doi:10.1007/BF00387211.
- Rosenblatt, F. (1962). Principles of Neurodynamics. Spartan Books.
- Sutton, Richard S. & Barto, Andrew G. (1998). Reinforcement Learning: An Introduction.
- Van den Bergh, F. & Engelbrecht, A. P. CIRG 2000.
- Wilkes, A.L. & Wade, N.J. (1997). "Bain on Neural Networks". Brain and Cognition 33: 295–305. doi:10.1006/brcg.1997.0869.
- Wasserman, P.D. (1989). Neural computing theory and practice. Van Nostrand Reinhold.
- Spooner, Jeffrey T., Maggiore, Manfredi, Ordóñez, Raúl & Passino, Kevin M. (2002). Stable Adaptive Control and Estimation for Nonlinear Systems: Neural and Fuzzy Approximator Techniques. John Wiley and Sons, NY.
- Dayan, Peter & Abbott, L.F. Theoretical Neuroscience. MIT Press.
- Gerstner, Wulfram & Kistler, Werner. Spiking Neuron Models: Single Neurons, Populations, Plasticity. Cambridge University Press.