The purpose of this vignette is to present the calculations for a piecewise quantile regression in which each time step has multiple independent observations.
In the following, variables identified by Greek letters are considered unknown.
Data belong to group \(k\), whose time stamps form the set \(T_{k}\); all \(t \in T_{k}\) share common regression parameters \(\theta_{k}\) and residual variance \(\sigma_{k}\). At time step \(t\) the vector of iid observations \(\mathbf{y}_{t}=\left\{y_{t,1},\ldots,y_{t,n_{t}}\right\}\) is explained by the design matrix \(\mathbf{X}_{t}\).
For a given quantile \(\tau\) and the check function \(\rho\left(u,\tau\right) = u\left(\tau - I\left(u<0\right)\right)\), Koenker and Bassett (1978) show that an estimate of \(\theta_{k}\) in the quantile regression (QR) model can be obtained by solving the convex optimization problem \[ \min_{\theta_{k}} \left( \sum_{i=1}^{n_{t}} \rho\left(\mathbf{y}_{t,i}- \mathbf{X}_{t,i}\left(\mathbf{m}_{t}+\theta_{k}\right),\tau\right) \right) \] Here \(\mathbf{m}_{t}\), being denoted by a Latin letter, is treated as known.
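The check function and the minimization above can be illustrated with a minimal sketch. It assumes the simplest possible case, an intercept-only design (\(\mathbf{X}_{t,i} = 1\), \(\mathbf{m}_{t} = 0\)), for which the objective is piecewise linear and convex, so a minimizer can be found among the observed values. The function names and the data are hypothetical and chosen only for illustration.

```python
def check_loss(u, tau):
    """Check function rho(u, tau) = u * (tau - I(u < 0))."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def total_check_loss(y, theta, tau):
    """Sum of check losses of the residuals y_i - theta (intercept-only model)."""
    return sum(check_loss(yi - theta, tau) for yi in y)


def intercept_quantile_fit(y, tau):
    """Brute-force minimizer for the intercept-only case.

    The objective is convex and piecewise linear with kinks at the data
    points, so it is enough to evaluate it at each observed value.
    """
    return min(sorted(y), key=lambda c: total_check_loss(y, c, tau))


# Made-up observations at a single time step.
y = [2.0, 5.0, 1.0, 9.0, 4.0]
theta_hat = intercept_quantile_fit(y, tau=0.5)  # the sample median for tau = 0.5
```

For a general design matrix the same objective is usually minimized by linear programming rather than by enumeration; the brute-force search here is only meant to make the definition concrete.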
Solving this gives the maximum likelihood estimator under the asymmetric Laplace (AL) distribution (Geraci and Bottai, 2007; Yu, Lu, and Stander, 2003), which has likelihood \[ L\left(\mathbf{y}_{t} \left| \theta_k\right.\right) = \tau^{n_{t}}\left(1-\tau\right)^{n_{t}}\exp\left(- \sum_{i=1}^{n_{t}} \rho\left(\mathbf{y}_{t,i}- \mathbf{X}_{t,i}\left(\mathbf{m}_{t}+\theta_{k}\right),\tau\right) \right) \]
With \(\hat{\mathbf{y}}_{t} = \mathbf{y}_{t} - \mathbf{X}_{t} \mathbf{m}_{t}\), the log-likelihood is given by \[ l\left(\mathbf{y}_{t} \left| \theta_k,\sigma_k \right.\right) = n_{t}\log \left(\tau \left(1-\tau\right)\right) - \sum_{i=1}^{n_{t}} \rho\left(\hat{\mathbf{y}}_{t,i} - \mathbf{X}_{t,i}\theta_{k},\tau\right) \]
With \(n_{k}=\sum\limits_{t\in T_{k}} n_{t}\), the log-likelihood of \(\mathbf{y}_{t \in T_{k}}\) is \[ l\left(\mathbf{y}_{t \in T_{k}} \left| \theta_k,\sigma_k,\mathbf{X}_{t}\right.\right) = n_{k}\log\left(\tau \left(1-\tau\right)\right) - \sum_{t \in T_{k}}\sum_{i=1}^{n_{t}} \rho\left(\hat{\mathbf{y}}_{t,i} - \mathbf{X}_{t,i}\theta_{k},\tau\right) \]
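The group log-likelihood can be sketched directly from its definition. The data layout below (a list of time steps, each holding pairs of a design row and an offset-corrected observation \(\hat{\mathbf{y}}_{t,i}\)) is an assumption made for this example only, as are the function names and the numbers.

```python
import math


def check_loss(u, tau):
    """Check function rho(u, tau) = u * (tau - I(u < 0))."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def group_log_likelihood(group, theta, tau):
    """AL log-likelihood of all observations in one group.

    group: list of time steps; each time step is a list of
           (x_row, y_hat) pairs, where x_row is one row of X_t and
           y_hat is the matching offset-corrected observation.
    theta: list of regression parameters shared by the whole group.
    """
    n_k = sum(len(obs) for obs in group)  # total number of observations
    loss = 0.0
    for obs in group:
        for x_row, y_hat in obs:
            fitted = sum(xj * tj for xj, tj in zip(x_row, theta))
            loss += check_loss(y_hat - fitted, tau)
    return n_k * math.log(tau * (1.0 - tau)) - loss


# Two time steps with 2 and 1 observations, intercept-only design (made up).
group = [
    [([1.0], 1.0), ([1.0], -1.0)],
    [([1.0], 2.0)],
]
ll = group_log_likelihood(group, theta=[0.0], tau=0.5)
```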
The cost is twice the negative log-likelihood plus a penalty \(\beta\), giving
\[ C\left(\mathbf{y}_{t \in T_{k}} \left| \theta_k,\sigma_k\right.\right) = 2\sum_{t \in T_{k}}\sum_{i=1}^{n_{t}} \rho\left(\hat{\mathbf{y}}_{t,i} - \mathbf{X}_{t,i}\theta_{k},\tau\right) - 2n_{k}\log\left(\tau \left(1-\tau\right)\right) + \beta \]
Here \(\theta_{k}=0\) and there is no penalty, so \(\beta = 0\).
Estimate \(\theta_{k}\) using ??? and then, with penalty \(\beta\), the cost is \[ C\left(\mathbf{y}_{t \in T_{k}} \left| \theta_k,\sigma_k\right.\right) = 2\sum_{t \in T_{k}}\sum_{i=1}^{n_{t}} \rho\left(\hat{\mathbf{y}}_{t,i} - \mathbf{X}_{t,i}\hat{\theta}_{k},\tau\right) - 2n_{k}\log\left(\tau \left(1-\tau\right)\right) + \beta \]
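To make the comparison between the unpenalized cost with \(\theta_{k}=0\) and the penalized cost with an estimated \(\hat{\theta}_{k}\) concrete, here is a small sketch. The cost is coded as twice the negative log-likelihood plus \(\beta\), following the definition given in the text. The estimator is left unspecified above, so the sketch swaps in a stand-in that is only valid under an assumed intercept-only design: a brute-force search over the observed values. The data, \(\tau\) and \(\beta\) are hypothetical.

```python
import math


def check_loss(u, tau):
    """Check function rho(u, tau) = u * (tau - I(u < 0))."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def segment_cost(residuals, tau, beta):
    """Twice the negative AL log-likelihood of the residuals plus beta."""
    n_k = len(residuals)
    loss = sum(check_loss(r, tau) for r in residuals)
    return 2.0 * loss - 2.0 * n_k * math.log(tau * (1.0 - tau)) + beta


def intercept_estimate(y_hat, tau):
    """Stand-in estimator (assumption): intercept-only model, minimized by
    evaluating the convex piecewise-linear objective at each data point."""
    return min(y_hat, key=lambda c: sum(check_loss(v - c, tau) for v in y_hat))


# Made-up offset-corrected observations pooled over one group.
y_hat = [3.0, 4.0, 5.0, 4.5, 3.5]
tau, beta = 0.5, 4.0  # hypothetical quantile level and penalty

# Cost with theta_k = 0 and no penalty.
baseline_cost = segment_cost(y_hat, tau, beta=0.0)

# Cost with an estimated theta_k and penalty beta.
theta_hat = intercept_estimate(y_hat, tau)
anomaly_cost = segment_cost([v - theta_hat for v in y_hat], tau, beta)

# A smaller penalized cost means the fitted segment is preferred.
prefer_fitted_segment = anomaly_cost < baseline_cost
```

With these numbers the fitted segment has the lower cost, because the reduction in check loss outweighs the penalty.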
If \(n_t > 0\), one could proceed as for a collective anomaly. Otherwise, select \(\hat{\theta}_{k}\) such that \(\hat{\mathbf{y}}_{t,i} - \mathbf{X}_{t,i}\hat{\theta}_{k}= 0\).
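As a worked consequence, and assuming the cost (twice the negative log-likelihood plus \(\beta\)) is applied to the single time step \(t\): when every residual is zero, \(\rho\left(0,\tau\right) = 0\), so the check-function sum vanishes and the cost depends only on the number of observations, the quantile and the penalty:

\[ C = - 2n_{t}\log\left(\tau \left(1-\tau\right)\right) + \beta \]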