Search results
Results from the WOW.Com Content Network
The activation function of a node in an artificial neural network is a function that calculates the output of the node based on its individual inputs and their weights. Nontrivial problems can be solved using only a few nodes if the activation function is nonlinear . [ 1 ]
Sigmoid functions most often show a return value (y axis) in the range 0 to 1. Another commonly used range is from −1 to 1. A wide variety of sigmoid functions including the logistic and hyperbolic tangent functions have been used as the activation function of artificial neurons.
The standard logistic function is the logistic function with parameters =, =, =, which yields = + = + = / / + /.In practice, due to the nature of the exponential function, it is often sufficient to compute the standard logistic function for over a small range of real numbers, such as a range contained in [−6, +6], as it quickly converges very close to its saturation values of 0 and 1.
For a concrete example, consider a typical recurrent network defined by = (,,) = + + where = (,) is the network parameter, is the sigmoid activation function [note 2], applied to each vector coordinate separately, and is the bias vector.
The Gudermannian function is a sigmoid function, and as such is sometimes used as an activation function in machine learning. The (scaled and shifted) Gudermannian function is the cumulative distribution function of the hyperbolic secant distribution. A function based on the Gudermannian provides a good model for the shape of spiral galaxy arms ...
Also, certain non-continuous activation functions can be used to approximate a sigmoid function, which then allows the above theorem to apply to those functions. For example, the step function works. In particular, this shows that a perceptron network with a single infinitely wide hidden layer can approximate arbitrary functions.
More specialized activation functions include radial basis functions (used in radial basis networks, another class of supervised neural network models). In recent developments of deep learning the rectified linear unit (ReLU) is more frequently used as one of the possible ways to overcome the numerical problems related to the sigmoids.
It has been demonstrated for the first time in 2011 to enable better training of deeper networks, [27] compared to the widely used activation functions prior to 2011, i.e., the logistic sigmoid (which is inspired by probability theory; see logistic regression) and its more practical [28] counterpart, the hyperbolic tangent.