Differential calculus of functions with several variables


Basically

If a function with 2 input variables is differentiable, its derivative with respect to one variable is defined as:

$$\frac{\partial f(x,y)}{\partial x} = \lim_{h \to 0} \frac{f(x+h,\,y) - f(x,\,y)}{h}$$


This is called the partial derivative, and for a function of the variables x1…xn it can generally be written as:


$$\frac{\partial f}{\partial x_i} = \lim_{h \to 0} \frac{f(x_1,\dots,x_i+h,\dots,x_n) - f(x_1,\dots,x_n)}{h}$$


Of course, a partial derivative is not usually computed from this limit. It is carried out in the same manner as the derivative of a function of just one variable (see Differential calculus); the only difference is that all the other variables in the function are regarded as constants, and the differentiation is done with respect to each included variable in turn. For instance, for the function


$$f(x,y) = x^2 y$$

the partial derivatives are

$$\frac{\partial f}{\partial x} = 2xy$$

and

$$\frac{\partial f}{\partial y} = x^2$$
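As a quick sketch (the example function f(x, y) = x²y is a hypothetical choice), the analytic partial derivatives can be checked numerically against the limit definition:

```python
def f(x, y):
    # hypothetical example function f(x, y) = x^2 * y
    return x * x * y

def partial(g, x, y, var, h=1e-6):
    # central-difference approximation of a partial derivative:
    # all other variables are held constant while one is varied
    if var == "x":
        return (g(x + h, y) - g(x - h, y)) / (2 * h)
    return (g(x, y + h) - g(x, y - h)) / (2 * h)

# analytic results at (2, 3): df/dx = 2xy = 12, df/dy = x^2 = 4
print(partial(f, 2.0, 3.0, "x"))
print(partial(f, 2.0, 3.0, "y"))
```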




Gradient

All partial derivatives combined into one vector form the Gradient, which is usually denoted by the nabla operator ∇:

$$\nabla f = \left(\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \dots, \frac{\partial f}{\partial x_n}\right)^T$$
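A minimal sketch, assuming a hypothetical two-variable example function: the gradient can be assembled numerically by taking a central-difference partial derivative in each coordinate.

```python
def grad(f, x, h=1e-6):
    # numerical gradient: central difference in each coordinate
    g = []
    for i in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g

# hypothetical example: f(x1, x2) = x1^2 + 3*x2
f = lambda v: v[0] ** 2 + 3 * v[1]
print(grad(f, [2.0, 1.0]))  # analytic gradient at (2, 1): (2*x1, 3) = (4, 3)
```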






Directional derivative

The directional derivative should not be confused with the Gradient. The directional derivative sums the derivatives with respect to each variable, each multiplied by the corresponding component of a unit vector pointing in a certain direction. Its result is a single value, not a vector.

With the unit vector a, that looks like:


$$\frac{\partial f}{\partial \vec{a}} = \nabla f \cdot \vec{a} = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\, a_i$$
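A sketch of this with a hypothetical example function and direction: the direction vector is normalized to unit length, then dotted with a numerical gradient.

```python
import math

def grad(f, x, h=1e-6):
    # numerical gradient via central differences
    g = []
    for i in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g

def directional_derivative(f, x, a):
    # normalize a to a unit vector, then take the dot product with the gradient
    norm = math.sqrt(sum(c * c for c in a))
    unit = [c / norm for c in a]
    return sum(g * u for g, u in zip(grad(f, x), unit))

# hypothetical example: f(x, y) = x^2 + y^2 at (1, 1) in direction (1, 1)
f = lambda v: v[0] ** 2 + v[1] ** 2
# gradient is (2, 2), unit vector is (1/sqrt(2), 1/sqrt(2)) -> result 2*sqrt(2)
print(directional_derivative(f, [1.0, 1.0], [1.0, 1.0]))
```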




Hessian Matrix

If the function f(x1…xn) can be differentiated twice, collecting all second partial derivatives gives the Hessian Matrix:



$$H_f = \begin{pmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \cdots & \dfrac{\partial^2 f}{\partial x_1\,\partial x_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial^2 f}{\partial x_n\,\partial x_1} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2} \end{pmatrix}$$




The Hessian matrix is symmetric about the main diagonal since, for continuous second derivatives (Schwarz's theorem),


$$\frac{\partial^2 f}{\partial x_i\,\partial x_j} = \frac{\partial^2 f}{\partial x_j\,\partial x_i}$$
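A sketch that builds the Hessian numerically for a hypothetical example function and confirms the symmetry of the mixed derivatives:

```python
def hessian(f, x, h=1e-4):
    # numerical Hessian via nested central differences
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            xpp = list(x); xpm = list(x); xmp = list(x); xmm = list(x)
            xpp[i] += h; xpp[j] += h
            xpm[i] += h; xpm[j] -= h
            xmp[i] -= h; xmp[j] += h
            xmm[i] -= h; xmm[j] -= h
            H[i][j] = (f(xpp) - f(xpm) - f(xmp) + f(xmm)) / (4 * h * h)
    return H

# hypothetical example: f(x, y) = x^2 * y + y^3 at (1, 2)
# analytic: f_xx = 2y = 4, f_xy = f_yx = 2x = 2, f_yy = 6y = 12
f = lambda v: v[0] ** 2 * v[1] + v[1] ** 3
H = hessian(f, [1.0, 2.0])
print(H[0][1], H[1][0])  # the mixed derivatives agree
```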





Taylor's theorem

According to Taylor's theorem, each n times differentiable function can be expressed as a polynomial around a point x0 plus a remainder Rn(x):


$$f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(x_0)}{k!}\,(x - x_0)^k + R_n(x)$$



where

$$f^{(k)}(x_0)$$




is the k-th derivative of f at the point x0.
(see Taylor Polynomials)
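As a sketch of the one-variable case, take the hypothetical example f(x) = e^x expanded at x0 = 0, where every derivative f^(k)(0) equals 1:

```python
import math

def taylor_exp(x, n):
    # Taylor polynomial of e^x at x0 = 0: sum of x^k / k! for k = 0..n
    return sum(x ** k / math.factorial(k) for k in range(n + 1))

# the approximation approaches e^x as n grows
print(taylor_exp(1.0, 4))   # 1 + 1 + 1/2 + 1/6 + 1/24 = 2.7083...
print(math.exp(1.0))        # 2.71828...
```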


This formulation can be extended to functions with more than one variable as well.

For a function with 2 variables x and y, that means we have to replace the k-th order term by

$$\frac{1}{k!}\left((x-x_0)\frac{\partial}{\partial x} + (y-y_0)\frac{\partial}{\partial y}\right)^{k} f(x_0, y_0)$$



For instance, for k = 2 that would be:

$$\frac{1}{2!}\left((x-x_0)^2\,\frac{\partial^2 f(x_0,y_0)}{\partial x^2} + 2\,(x-x_0)(y-y_0)\,\frac{\partial^2 f(x_0,y_0)}{\partial x\,\partial y} + (y-y_0)^2\,\frac{\partial^2 f(x_0,y_0)}{\partial y^2}\right)$$



Now, as

$$\left((x-x_0)\frac{\partial}{\partial x} + (y-y_0)\frac{\partial}{\partial y}\right)^{k} f = \sum_{i=0}^{k} \binom{k}{i}\,(x-x_0)^{k-i}(y-y_0)^{i}\,\frac{\partial^k f}{\partial x^{k-i}\,\partial y^{i}}$$



with the binomial coefficients

$$\binom{k}{i} = \frac{k!}{i!\,(k-i)!}$$



we can write

$$f(x,y) = \sum_{k=0}^{n} \frac{1}{k!} \sum_{i=0}^{k} \binom{k}{i}\,(x-x_0)^{k-i}(y-y_0)^{i}\,\frac{\partial^k f(x_0,y_0)}{\partial x^{k-i}\,\partial y^{i}} + R_n$$




With the remainder:


$$R_n = \frac{1}{(n+1)!}\left((x-x_0)\frac{\partial}{\partial x} + (y-y_0)\frac{\partial}{\partial y}\right)^{n+1} f(\xi, \eta)$$




where (ξ,η) is an unknown point between (x0,y0) and (x,y).




If, for instance, the function

$$f(x,y) = e^{x+y}$$

shall be approximated at the point x = 0.3 and y = 0.3 by the Taylor polynomial built at the position x0 = 0 and y0 = 0, the first derivatives are:

$$\frac{\partial f}{\partial x} = \frac{\partial f}{\partial y} = e^{x+y}, \qquad \left.\frac{\partial f}{\partial x}\right|_{(0,0)} = \left.\frac{\partial f}{\partial y}\right|_{(0,0)} = 1$$



and the second:

$$\frac{\partial^2 f}{\partial x^2} = \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y^2} = e^{x+y}, \qquad \text{all equal to } 1 \text{ at } (0,0)$$



With these the approximation becomes:

$$f(0.3,\,0.3) \approx 1 + 0.3 + 0.3 + \frac{1}{2}\left(0.3^2 + 2 \cdot 0.3 \cdot 0.3 + 0.3^2\right) = 1.78$$



whereas the original function yields:

$$f(0.3,\,0.3) = e^{0.6} \approx 1.8221$$
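A sketch of this computation, assuming the example function f(x, y) = e^{x+y} expanded at (0, 0), where all first and second partial derivatives equal 1:

```python
import math

def f(x, y):
    # example function f(x, y) = e^(x + y)
    return math.exp(x + y)

def taylor2(x, y):
    # 2nd-order Taylor polynomial of f at (0, 0); all the
    # partial derivatives of e^(x+y) equal 1 at the origin
    return 1 + x + y + 0.5 * (x * x + 2 * x * y + y * y)

print(taylor2(0.3, 0.3))  # quadratic approximation ≈ 1.78
print(f(0.3, 0.3))        # actual value e^0.6 ≈ 1.8221
```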




For a function with n variables things become really complicated:

$$f(\vec{x}) = \sum_{k=0}^{m} \frac{1}{k!} \sum_{i_1=1}^{n} \cdots \sum_{i_k=1}^{n} \frac{\partial^{k} f(\vec{x}_0)}{\partial x_{i_1} \cdots \partial x_{i_k}}\,(x_{i_1} - x_{0,i_1}) \cdots (x_{i_k} - x_{0,i_k}) + R_m$$




Let’s leave that like this :-)



If m = 2, the Taylor polynomial is the quadratic approximation of f.

With the direction vector

$$\vec{d} = \vec{x} - \vec{x}_0$$




the Gradient of f

$$\nabla f(\vec{x}_0) = \left(\frac{\partial f(\vec{x}_0)}{\partial x_1}, \dots, \frac{\partial f(\vec{x}_0)}{\partial x_n}\right)^T$$




and the Hessian matrix

$$H_f(\vec{x}_0) = \begin{pmatrix} \dfrac{\partial^2 f}{\partial x_1^2} & \cdots & \dfrac{\partial^2 f}{\partial x_1\,\partial x_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial^2 f}{\partial x_n\,\partial x_1} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2} \end{pmatrix}_{\vec{x}_0}$$




The formulation for the quadratic approximation can be written as:

$$f(\vec{x}) \approx f(\vec{x}_0) + \nabla f(\vec{x}_0)^T\,\vec{d} + \frac{1}{2}\,\vec{d}^{\,T}\, H_f(\vec{x}_0)\,\vec{d}$$
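A sketch of this quadratic form with plain Python lists standing in for the vectors and the matrix; the numbers are the hypothetical e^{x+y} example at (0, 0), where f = 1, the gradient is (1, 1) and the Hessian is all ones:

```python
def quad_approx(f0, grad, H, d):
    # f(x0 + d) ≈ f(x0) + grad·d + 0.5 * d^T H d
    n = len(d)
    lin = sum(grad[i] * d[i] for i in range(n))
    quad = sum(d[i] * H[i][j] * d[j] for i in range(n) for j in range(n))
    return f0 + lin + 0.5 * quad

# hypothetical example: f(x, y) = e^(x+y) at (0, 0), step d = (0.3, 0.3)
print(quad_approx(1.0, [1.0, 1.0], [[1.0, 1.0], [1.0, 1.0]], [0.3, 0.3]))  # ≈ 1.78
```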



This formulation is often mentioned in books about machine learning. With a few matrix operations it looks a bit simpler, and it can easily be extended to n variables :-)