Maximum a Posteriori

using Distributions
using HMMBase
using PyPlot
using Seaborn

Let's consider a simple time series with one outlier:

y = rand(1000)
y[100] = 10000
plot(y)

An MLE approach for the observations distributions parameters may fail with a singularity (variance = 0) if an outlier becomes the only observation associated to some state:

hmm = HMM([0.9 0.1; 0.1 0.9], [Normal(0, 1), Normal(5, 1)])
try
    fit_mle(hmm, y, display = :iter)
catch e
    println(e)
end
Iteration 0: logtot = -1230.6657301979515
Iteration 1: logtot = -186.8339402902164
Iteration 2: logtot = -11.713017504274575
Iteration 3: logtot = NaN
ArgCheck.CheckError("isprobvec(hmm.a) must hold. Got\nhmm.a => [NaN, NaN]")

We can avoid this by putting a prior on the variance:

import ConjugatePriors: InverseGamma, NormalKnownMu, posterior_canon
import StatsBase: Weights

function fit_map(::Type{<:Normal}, observations, responsibilities)
    μ = mean(observations, Weights(responsibilities))

    ss = suffstats(NormalKnownMu(μ), observations, responsibilities)
    prior = InverseGamma(2, 1)
    posterior = posterior_canon(prior, ss)
    σ2 = mode(posterior)

    Normal(μ, sqrt(σ2))
end

hmm, hist = fit_mle(hmm, y, estimator = fit_map, display = :iter)
plot(hist.logtots)
hmm.B
2-element Array{Distribution{Univariate,S} where S<:ValueSupport,1}:
 Normal{Float64}(μ=0.5002249742142206, σ=0.29004634497715975)
 Normal{Float64}(μ=10000.0, σ=0.5345224838248488)            

This page was generated using Literate.jl.