Artificial intelligence


Learning

Artificial Neural Networks (ANN)

Universal Approximation Theorem

Class of theorems asserting that a specific type of ANN is dense in a specified function space

More informally: for any function in the specified function space, there exists an ANN that approximates it to an arbitrary degree of accuracy

\( \forall f \in X, \exists \{ \phi_n \} : \lim_{n \to \infty} \phi_n = f \)

FNN-C UAT

The UAT for feedforward NNs (FNNs) over the space \(C\) of continuous functions: FNNs are dense in \(C\) (on compact domains)

\( \forall f \in C, \exists \{ \phi_n \} : \lim_{n \to \infty} \phi_n = f \)

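As an illustrative sketch (Python/numpy, not the theorem's proof): a one-hidden-layer ReLU network \(\phi_n\) that linearly interpolates \(f\) at \(n + 1\) knots; the sup-norm error shrinks as \(n\) grows, matching the density claim for continuous functions on a compact interval.

    import numpy as np

    def relu(z):
        return np.maximum(0.0, z)

    def relu_interpolant(f, a, b, n):
        # One-hidden-layer ReLU network phi_n that linearly
        # interpolates f at n + 1 equally spaced knots on [a, b].
        knots = np.linspace(a, b, n + 1)
        vals = f(knots)
        slopes = np.diff(vals) / np.diff(knots)  # slope of each segment
        coeffs = np.diff(slopes, prepend=0.0)    # slope changes = hidden-unit coefficients
        return lambda x: vals[0] + relu(np.subtract.outer(x, knots[:-1])) @ coeffs

    f = np.sin
    x = np.linspace(0.0, 2.0 * np.pi, 1000)
    for n in (4, 16, 64):
        phi = relu_interpolant(f, 0.0, 2.0 * np.pi, n)
        print(n, np.max(np.abs(phi(x) - f(x))))  # sup-norm error decreases with n
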
Weights

Numerical value assigned to a connection between nodes, denoting the strength and direction of that connection's influence

Cost function

Function that measures the error between a predicted value and the true value, typically penalizing larger errors more heavily. Its output is known as the cost, and it quantifies the quality of a NN's predictions.

Quadratic cost function

\( \lambda(x) = C (t - x)^2 \), where \(x\) is the prediction, \(t\) the true value, and \(C\) a positive scaling constant

Activation function

Function that maps a node's weighted input sum to its output. This introduces nonlinearity into a NN, making it more powerful than linear regression techniques.

Rectifier

Analogous to a half-wave rectifier in electronics: scalar function that returns its input if it is positive and zero otherwise

\(r(x) = \max (0,x)\)

Saturation

Regime in which an activation function's output approaches its asymptotic bounds, so its gradient approaches zero and gradient-based learning stalls (e.g. the tails of the sigmoid)

Gradient descent

Iterative optimization algorithm that minimizes a cost function by repeatedly stepping the weights in the direction of the negative gradient

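A minimal 1-D sketch (values chosen arbitrarily) of gradient descent on the quadratic cost above: the derivative of \((t - x)^2\) with respect to \(x\) is \(-2(t - x)\), so stepping against it drives \(x\) toward \(t\).

    t = 3.0    # true value (arbitrary)
    x = 0.0    # initial estimate
    lr = 0.1   # learning rate
    for step in range(50):
        grad = -2.0 * (t - x)  # d/dx of (t - x)^2
        x -= lr * grad         # step against the gradient
    print(x)                   # converges toward t = 3.0
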
Pooling

A function applied to each \(n \times n\) square of an image or, more generally, to each portion of data of size \(n\) (call this a pool)

Max pooling

Returning the maximum element in the pool and replacing the pool with this element

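A minimal numpy sketch of non-overlapping \(n \times n\) max pooling, assuming the image dimensions divide evenly by \(n\):

    import numpy as np

    def max_pool(img, n):
        # Split the image into non-overlapping n x n pools and keep
        # each pool's maximum. Assumes img.shape is divisible by n.
        h, w = img.shape
        return img.reshape(h // n, n, w // n, n).max(axis=(1, 3))

    img = np.arange(16).reshape(4, 4)
    print(max_pool(img, 2))
    # [[ 5  7]
    #  [13 15]]
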
Support Vector Machines (SVM)

Supervised models that classify data by finding a hyperplane partitioning the classes with the maximum margin; also usable for regression

Kernel

Distinct from the definition in Linear Algebra (null space): here, a kernel is the weighting function of a weighted sum or integral; in an SVM it measures the similarity \(k(x, y)\) between pairs of data points

Kernel trick

Replacing inner products with kernel evaluations \(k(x, y) = \langle \varphi(x), \varphi(y) \rangle\), so a linear method operates implicitly in a high-dimensional feature space without ever computing the mapping \(\varphi\)

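A minimal numpy sketch using the RBF (Gaussian) kernel, a common choice: the Gram matrix an SVM works with is filled with kernel evaluations, never with explicit feature vectors.

    import numpy as np

    def rbf_kernel(X, Y, gamma=1.0):
        # k(x, y) = exp(-gamma * ||x - y||^2): an inner product in an
        # infinite-dimensional feature space, never computed explicitly
        sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-gamma * sq_dists)

    X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
    print(rbf_kernel(X, X))  # Gram matrix of pairwise similarities
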
Convolutional Neural Networks (CNN)

Region-Based Convolutional Neural Networks (R-CNN)

Residual Neural Networks (ResNet)

Attention

Soft weights

Weights that can change during each runtime, recomputed for every input (as in attention)

Hard weights

Weights that converge to fixed values during training and remain frozen across runtimes

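A minimal numpy sketch of scaled dot-product attention: the soft weights are a softmax (defined below) over query-key similarity scores, computed at runtime from the input itself.

    import numpy as np

    def attention(Q, K, V):
        # Soft weights: recomputed from the current input at runtime
        scores = Q @ K.T / np.sqrt(K.shape[-1])
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
        return weights @ V

    rng = np.random.default_rng(0)
    Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
    print(attention(Q, K, V).shape)  # (4, 8)
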
Softmax

Vector function that normalizes a vector into a probability distribution by exponentiating each element and dividing by the sum of all the exponentials

\( \sigma (\vec{z})_{i} = \frac{e^{z_{i}}}{ \sum_{j=1}^{|\vec{z}|} e^{z_{j}} } \)

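A minimal numpy sketch; subtracting \(\max(\vec{z})\) first is a standard trick that leaves the result unchanged but avoids overflow in the exponentials:

    import numpy as np

    def softmax(z):
        # Shift by max(z) for numerical stability; the constant
        # cancels in the ratio
        e = np.exp(z - np.max(z))
        return e / e.sum()

    p = softmax(np.array([1.0, 2.0, 3.0]))
    print(p, p.sum())  # a probability distribution summing to 1
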
Overfitting

Training a model on a dataset to the point where it predicts that dataset (nearly) perfectly but generalizes poorly to other datasets

Genetic algorithm

Fitness function

Genetic representation

Generative Adversarial Network (GAN)

Recurrent Neural Network (RNN)

Transformer

Generative Pre-trained Transformer (GPT)

K-nearest neighbors

Confusion matrix

Matrix \(C\) with entries \(c_{ij}\) giving the number of times an object with true label \(i\) was classified as \(j\) by a machine learning algorithm. A diagonal matrix is therefore the ideal.

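A minimal numpy sketch with made-up labels:

    import numpy as np

    def confusion_matrix(true, pred, k):
        # C[i, j] = number of times true label i was classified as j
        C = np.zeros((k, k), dtype=int)
        for i, j in zip(true, pred):
            C[i, j] += 1
        return C

    true = [0, 0, 1, 1, 2, 2]
    pred = [0, 1, 1, 1, 2, 0]
    print(confusion_matrix(true, pred, 3))  # off-diagonal entries are errors
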
Supervision

Backpropagation

Computing the gradient of a cost function with respect to a NN's weights by applying the chain rule layer by layer, from output back to input

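A minimal numpy sketch (toy task and sizes chosen arbitrarily): one forward pass, then the chain rule applied layer by layer to get the gradient of the quadratic cost with respect to every weight, followed by a gradient-descent step.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(64, 2))
    t = X[:, :1] * X[:, 1:]                # toy target: product of the inputs
    W1, b1 = 0.5 * rng.normal(size=(2, 8)), np.zeros(8)
    W2, b2 = 0.5 * rng.normal(size=(8, 1)), np.zeros(1)
    lr = 0.1

    for step in range(500):
        # forward pass
        h = np.maximum(0.0, X @ W1 + b1)   # ReLU hidden layer
        y = h @ W2 + b2
        cost = ((t - y) ** 2).mean()       # quadratic cost
        # backward pass: chain rule, layer by layer
        dy = 2.0 * (y - t) / len(X)        # d cost / d y
        dW2, db2 = h.T @ dy, dy.sum(axis=0)
        dz = (dy @ W2.T) * (h > 0)         # through the ReLU
        dW1, db1 = X.T @ dz, dz.sum(axis=0)
        # gradient descent step
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2

    print(cost)                            # cost shrinks over training
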
Deep learning

Optimization