Time Series Analysis in the Presence of Missing and Inconsistent Data

The analysis of data-rich time series, such as financial product transactions, is one of the most studied fields in data science. But what happens when data is missing, such as when a financial product trades only rarely? Or when data is inconsistent, as when items that are indistinguishable in the data trade at widely divergent prices simultaneously?

This is the challenge presented by cannabis trading data. While there are thousands of purportedly distinct cannabis products, the overwhelming majority trade only sporadically. When they do trade, since there is no centralized marketplace, it is not uncommon for apparently identical products to trade in nearby locations at different prices at the same time. These characteristics present a serious business challenge for market participants, who must decide what to grow months before they bring a product to market, and cannot hedge using forward or futures contracts.

This talk will present approaches based on the Multivariate Bayesian Structural Time Series model of Rao Jammalamadaka, Qiu and Ning (2018). The original MBSTS model is presented as implemented in Stan. An overview is presented of the model's limitations, as well as practical issues related to convergence, identifiability, prediction, and the reconstruction of missing time series data. The costs and benefits of Hierarchical Shrinkage Priors (Griffin & Brown, 2017) are discussed.

A practical simplification of the model is then presented that improves performance, convergence, and identifiability while retaining the model's ability to reconstruct missing time series data. The simplified model offers significant improvements, but struggles as the amount of missing data increases.

The model is then extended using Gaussian processes to enable it to reconstruct and predict even when a large portion of the data is missing.
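As a rough illustration of the general idea only, the sketch below fills gaps in a single series with the posterior mean of a Gaussian process. It is not the MBSTS extension described in the talk: the function names (`rbf_kernel`, `gp_impute`), the squared-exponential kernel, and the fixed hyperparameters are all assumptions chosen for brevity, and a real model would estimate them and handle multiple correlated series.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=5.0, variance=1.0):
    """Squared-exponential covariance between two sets of time points."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_impute(t, y, noise=0.1, length_scale=5.0, variance=1.0):
    """Fill NaNs in y with the GP posterior mean given the observed points.

    Hypothetical helper for illustration; hyperparameters are fixed
    assumptions rather than estimated from the data.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    obs = ~np.isnan(y)

    # Use the observed sample mean as a constant prior mean.
    mean = y[obs].mean()

    # Covariance among observed points, plus observation noise.
    K = rbf_kernel(t[obs], t[obs], length_scale, variance)
    K += noise ** 2 * np.eye(int(obs.sum()))

    # Cross-covariance between every time point and the observed points.
    K_star = rbf_kernel(t, t[obs], length_scale, variance)

    # Posterior mean and (latent) posterior variance at every time point.
    post_mean = mean + K_star @ np.linalg.solve(K, y[obs] - mean)
    v = np.linalg.solve(K, K_star.T)
    post_var = variance - np.sum(K_star * v.T, axis=1)

    # Keep observed values as they are; fill only the gaps.
    filled = np.where(obs, y, post_mean)
    return filled, np.sqrt(np.maximum(post_var, 0.0))

# Example: a smooth series with a few interior observations removed.
t = np.arange(30)
truth = np.sin(t / 5.0)
y = truth.copy()
missing = [5, 6, 12, 20, 21]
y[missing] = np.nan
filled, sd = gp_impute(t, y)
```

The returned standard deviation gives a per-point measure of how uncertain each reconstructed value is, which matters when much of the series is unobserved.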

Presenter biography:
Amos Elberg

Amos is a data scientist who has spent the past three years focused on the unique problems of the nascent cannabis industry. Amos was originally trained in law and economics. As head of data science for Confident Cannabis, Amos developed economic and statistical models for wholesale cannabis trading. He was then Director of Data Science at Tilray, where he worked on modelling the health effects of cannabis, and on the complex problems of assessing market demand and pricing for cannabis products globally.