nlcpy.cov

nlcpy.cov(a, y=None, rowvar=True, bias=False, ddof=None, fweights=None, aweights=None)

Estimates a covariance matrix, given data and weights.

Covariance indicates the level to which two variables vary together. If we examine N-dimensional samples, X = [x_1, x_2, ... x_N]^T, then the covariance matrix element C_{ij} is the covariance of x_i and x_j. The element C_{ii} is the variance of x_i. See the notes for an outline of the algorithm.
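The definition above can be sketched in plain Python. This is only an illustration of the formula, not how nlcpy computes the result; the helper name cov_matrix is hypothetical, and it assumes the default layout (one row per variable) and the default (N - 1) normalization.

```python
def cov_matrix(variables):
    """Unbiased covariance matrix; `variables` holds one list per variable."""
    n_obs = len(variables[0])
    means = [sum(v) / n_obs for v in variables]
    # Subtract each variable's mean from its observations.
    centered = [[x - mu for x in v] for v, mu in zip(variables, means)]
    # C[i][j] is the covariance of variable i and variable j;
    # C[i][i] is the variance of variable i.
    return [
        [sum(p * q for p, q in zip(ci, cj)) / (n_obs - 1) for cj in centered]
        for ci in centered
    ]

C = cov_matrix([[0, 1, 2], [2, 1, 0]])
print(C)  # [[1.0, -1.0], [-1.0, 1.0]]
```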

Parameters
a : array_like

A 1-D or 2-D array containing multiple variables and observations. Each row of a represents a variable, and each column a single observation of all those variables. Also see rowvar below.

y : array_like, optional

An additional set of variables and observations. y has the same form as that of a.

rowvar : bool, optional

If rowvar is True (default), then each row represents a variable, with observations in the columns. Otherwise, the relationship is transposed: each column represents a variable, while the rows contain observations.
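The two layouts differ only by a transpose. The plain-Python sketch below (variable names are illustrative) converts observation-per-row data into the default variable-per-row layout; going by the description above, passing the first form with rowvar=False should be equivalent to passing the second with rowvar=True.

```python
# Observations stored one per row: each inner list is (x_0, x_1).
by_observation = [[0, 2], [1, 1], [2, 0]]

# Transposing gives the default rowvar=True layout: one row per variable.
by_variable = [list(col) for col in zip(*by_observation)]
print(by_variable)  # [[0, 1, 2], [2, 1, 0]]
```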

bias : bool, optional

Default normalization (False) is by (N - 1), where N is the number of observations given (unbiased estimate). If bias is True, then normalization is by N. Either choice can be overridden with the ddof keyword.

ddof : int, optional

If not None the default value implied by bias is overridden. Note that ddof=1 will return the unbiased estimate, even if both fweights and aweights are specified, and ddof=0 will return the simple average. See the notes for the details. The default value is None.
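For a single variable, the effect of the normalization can be illustrated with Python's standard statistics module, which is used here only as a stand-in for the arithmetic: variance divides by N - 1 (the ddof=1 case) and pvariance divides by N (the ddof=0 case, which is what bias=True selects).

```python
import statistics

x = [-2.1, -1.0, 4.3]

# ddof=1 (the default when bias is False): divide by N - 1.
unbiased = statistics.variance(x)

# ddof=0 (what bias=True selects): divide by N.
biased = statistics.pvariance(x)

print(round(unbiased, 6))  # 11.71
print(round(biased, 6))    # 7.806667
```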

fweights : array_like, int, optional

1-D array of integer frequency weights; the number of times each observation vector should be repeated.

aweights : array_like, optional

1-D array of observation vector weights. These relative weights are typically large for observations considered “important” and smaller for observations considered less “important”. If ddof=0 the array of weights can be used to assign probabilities to observation vectors.
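A frequency weight of k is meant to behave like repeating the observation k times. The plain-Python sketch below checks that for a single variable; the helper weighted_var is hypothetical and only mirrors the description of fweights, not nlcpy's implementation.

```python
import statistics

def weighted_var(x, f, ddof=1):
    """Variance of x where observation x[i] counts f[i] times (integer f)."""
    total = sum(f)
    mean = sum(fi * xi for fi, xi in zip(f, x)) / total
    ss = sum(fi * (xi - mean) ** 2 for fi, xi in zip(f, x))
    return ss / (total - ddof)

x = [1.0, 2.0, 4.0]
f = [2, 1, 3]

# The same data with each observation written out f[i] times.
repeated = [xi for xi, fi in zip(x, f) for _ in range(fi)]

from_weights = weighted_var(x, f)
from_repeats = statistics.variance(repeated)
print(abs(from_weights - from_repeats) < 1e-9)  # True
```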

Returns
out : ndarray

The covariance matrix of the variables.

See also

corrcoef

Normalized covariance matrix

Note

Assume that the observations are in the columns of the observation array m (the first argument of cov) and let f = fweights and a = aweights for brevity; within this note, a denotes the weights, not the input array. The steps to compute the weighted covariance are as follows:

>>> import nlcpy as vp
>>> m = vp.arange(10, dtype=vp.float64)
>>> f = vp.arange(10) * 2
>>> a = vp.arange(10) ** 2.
>>> ddof = 9 # N - 1
>>> w = f * a
>>> v1 = vp.sum(w)
>>> v2 = vp.sum(w * a)
>>> m -= vp.sum(m * w, axis=None, keepdims=True) / v1
>>> cov = vp.dot(m * w, m.T) * v1 / (v1**2 - ddof * v2)

Note that when a == 1, v1 and v2 both equal vp.sum(f), so the normalization factor v1 / (v1**2 - ddof * v2) reduces to 1 / (vp.sum(f) - ddof), as it should.
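The reduction can be checked numerically. The sketch below recomputes the normalization factor from the steps above in plain Python; norm_factor is a hypothetical helper and the weights are arbitrary illustrative values.

```python
def norm_factor(f, a, ddof):
    """v1 / (v1**2 - ddof * v2), with w = f * a as in the steps above."""
    w = [fi * ai for fi, ai in zip(f, a)]
    v1 = sum(w)
    v2 = sum(wi * ai for wi, ai in zip(w, a))
    return v1 / (v1 ** 2 - ddof * v2)

f = [2, 1, 3]
ones = [1.0, 1.0, 1.0]
ddof = 1

# With a == 1: v1 == v2 == sum(f), so the factor is 1 / (sum(f) - ddof).
with_unit_weights = norm_factor(f, ones, ddof)
expected = 1 / (sum(f) - ddof)
print(abs(with_unit_weights - expected) < 1e-12)  # True
```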

Examples

Consider two variables, x_0 and x_1, which correlate perfectly, but in opposite directions:

>>> import nlcpy as vp
>>> x = vp.array([[0, 2], [1, 1], [2, 0]]).T
>>> x
array([[0, 1, 2],
       [2, 1, 0]])

Note how x_0 increases while x_1 decreases. The covariance matrix shows this clearly:

>>> vp.cov(x) 
array([[ 1., -1.],
       [-1.,  1.]])

Note that element C_{0,1}, the covariance of x_0 and x_1, is negative.

Further, note how x and y are combined:

>>> x = [-2.1, -1,  4.3]
>>> y = [3,  1.1,  0.12]
>>> vp.cov(x, y) 
array([[11.71      , -4.286     ],
       [-4.286     ,  2.14413333]])
>>> vp.cov(x)    
array(11.71)