Heinrich Hartmann opinion consulting about

Generative Models For Time Series

« 2014-12-11 »

In this note we explore several mechanisms of how to construct time series. For us a time series is a sequence of real numbers

where $t$ is the time index parameter. A model for a time series is a sequence of random variables

Given a time series model we can generate many different time series (instances) by drawing random samples $Y_t(\omega)$ for $\omega \in \Omega$.

The following classes of time series will be discussed:

As we will see this constructions allow us to create a great variety of different time series. The generation of sample time series is the first step in gaining an understanding of these models. Time series analysis is concerned with reversing these constructions and infere model type and parameters from time series data.

Realization in Python

The easiest representation of a time series is using a simple array:

y = [1,2,1,0,-3,-1]

Elements can be addressed with the syntax y[t]. The drawback of these representation is that there is an explicit length of the time series, which was supposed to go on forever. Also the time dependence is very explicit. It is often not meaningful to say, give me the sample at time $0$, but instead we have a sample $y_t$ at a time $t=now$, and we want to have the next 5 samples, say.

I turns out that, python comes with a built in iterator type that is well suited for representing time series. Iterators are simple objects I that provide a method I.next() that returns the next value of the series. Note that there is no way to access previously generated values.

Iterator object can be conveniently generated using the yield keyword. The following example constructs a white noise iterator:

# Example construction of an iterator
def WhiteNoise():
    while True:
         yield np.random.random()

eps = WhiteNoise()
print eps.next(), eps.next(), eps.next()
0.963354997711 0.175658838635 0.995743134001

For convenience we provide a class Series which subclasses the iterator class and implements methods for infix operators, e.g. __add__. In this way we can write Z=Y + 4 for the time series with Z.next() = Y.next() + 4.

Also we have written some plotting helper function

The source code for these extensions can be found on GitHub.

Fundamental Examples of Time Series Models

We start by constucting some fundamental examples of time series.

The constant series, $Const(c)$ with value $c \in \mathbb{R}$ is defined as a series with .

Given a probability distribution $\mathcal{D}$ on the real line, we say that a process is $\mathcal{D}$-iid noise if

and all random variables $Y_t$ are independent. The constant series is IID noise with distribution $\delta_c$.

Standard white noise $\epsilon$ is defined as $IID(\mathcal{N}(0,1))$ noise with standard gaussian normal distribution.

Other important distributions for our purposes are the Bernoulli Distribution $\mathcal{D} = Ber_p$, with

and the Binomial Distribution $\mathcal{B}(n,p)$ with

def IID(Dist):
    while True:
        yield Dist()

def Const(c):
    "Constant series"
    return IID(lambda: c)
def Ber(p):
    "Bernoulli noise"
    return IID(stats.bernoulli(p).rvs)

def B(n,p):
    "Binomial noise"
    return IID(stats.binom(n,p).rvs)
def N(mean, variance):
    "Normal noise"
    return IID(stats.norm(mean, variance).rvs)

def Exp(l):
    "Exponentaial noise"
    return IID(stats.expon.rvs)

eps = N(0,1)
nul = Const(0)
# Normal Noise

# Random Bernoulli spikes
FancyPlot(Ber(0.05), 'o--',N=500)

# Binomial spikes

# Binomial spikes with exponential height

Arithmetic with time series

Our convenience class Series, allows us to do simple arithemtic on time series using the familiar infix notation.



Integration of Series and Random Walks

Random wals are realized as comulative sums over a series.

def Walk(I):
    s = 0
    for y in I:
        s += y
        yield s
# Standard random walk over white noise

# random walk over squared white noise

# Walking a Bernoulli process

# Walking spikes of random height

Differencing Time Series

Differencing is the inverse to comulative sums.

def Diff(I):
    l = I.next()
    for y in I:
        yield y - l
        l = y
def DiffK(k,I):
    "returns the k-fold iterated difference of I"
    if k == 0: return I
    if k >  0: return DiffK(k-1,Diff(I))
    if k <  0: return DiffKa(k+1,Walk(I))
# Differences of white noise looks like whie noise

# Differencing reverses a random walk (up to one sample dealy)
Y = Sample(eps, N=100)
FancyPlot(Walk(Y),'b--')  # random walk in blue
Plot(Y,'g')             # original samples in green
Plot(Diff(Walk(Y)),'r') # differences in red

# Differencing a Bernoulli series

Smoothing and Moving Averages

import copy
def Window(N,I):
    H = deque(maxlen=N)
    for i in range(N): H.appendleft(I.next())
    for y in I:
        yield list(H)
def Conv(C,I):
    "Convolution of a Series I with a vector C"
    for W in Window(len(C),I):
        yield sum(map(lambda x: x[0]*x[1], zip(C[::-1],W)))
def Smooth(N,I):
    return Conv([1./N]*N,I)

def ExpSmooth(l, I):
    g = I.next()
    for y in I:
        g = l * y + (1-l) * g
        yield g
# Smoothing a random walk
Y = Sample(Walk(eps), N = 100)
Plot(Smooth(10, Series(Y)))

# Exponential Smoothing of a Bernoulli series
Y = Sample(B(3,0.05), N=500)
Plot(Y,'o--', alpha=0.5)

Markov Models

A markov chain consists of 1. a set of states $S$ which we will assume to be natural numbers $S={ 1,2,3,…,N }$, 2. a probability transition matrix $T_{i,j} \in [0,1]$, where $i,j \in S$ 3. an initial distribution $\pi$, over $S$.

A realization of a markov chain is a sequence of random variables $Y_t \in S$, such that:

  1. The distribution of $Y_0$ is equal to $\pi$.
  2. The transition probabilities are given by:

  3. The markov property has to hold:

Makrov chains are very easy to generate. We start with a random choice of start state $y_0 \in S$ sampled according to $\pi$. Then, we iteratively generate the further states $y_t$. If we are in a state the next state is ranomly sampled from $1, \dots, N$ according to the the distribution .

def RandIndex(W):
    returns random index of P, where 
        P[RandIndex(W) = i] = W[i].
    t = np.random.random()
    for i,p in enumerate(np.cumsum(W)):
        if p > t: return i

def Markov(T, pi=None, start=0):
    Markov chain series with:
    T     - transition matrix
    pi    - initial distribution either as
    start - initial state (if pi == None)
    # determine initial state
    if pi is None:
        s = start
        s = RandIndex(pi)
    # Walk in chain
    while True:
        yield s
        s = RandIndex(T[s])
def MBer(p): 
    "Ber_p experiement as markov chain"
    return Markov(
    [[ 1-p, p ],
     [ 1-p, p ]]

def M1(p): 
    "Change state with probability p"
    return Markov(
    [[1-p,   p   ],
     [p,     1-p ]]


# Three state markov model
def M2(p): return Markov(
    [[1-p, p  , 0  ], # state 0: change to 1 with prob p
     [0  , 1-p, p  ], # state 1: cahnge to 2 with prob p
     [1  , 0  , 0  ]] # state 3: drop back to 1


Two-Step Modles

Given a time series model $S_t$ with discrete values in ${1, \dots, K}$, and for each $k = 1…K$ another time series model $Y^k_t$, we can produce a two step model as follows. Depending on the value $S_t$ we select one of the time series $Y^1_t,\dots,Y^K_t$ to draw the sample from:

def Selector(S,D):
    Two step time series model:
    S -- selector series with values in [1..len(D)]
    D -- list of series to be selected from
    for s in S:
        yield D[s].next()

Example: Random Jumps

We have $Y_t ~ \N(0,s)$, and $s = 1$ or $s = 10$ depending on a Bernoulli experiment.

FancyPlot(Selector(Ber(0.01),[eps, 10*eps]))

Example: Latent Markov Model

In this case the selector process $S_t$ is a markov model.

S = Sample(M1(0.01))
FancyPlot(Selector(S, [eps, 5*eps]))

# Change in mean according to three step process

S = Sample(M2(0.01))
D = [eps, 5+eps, 10+eps]
FancyPlot(Selector(S, D))