YASIN SIMSEK PhD Candidate at Duke University

Job Market Paper

Beyond Returns: A Candlestick-Based Approach to Spot Covariance Estimation (updated frequently)
Abstract · Latest Version
Spot covariance estimation is commonly based on high-frequency open-to-close return data over short time windows, but such approaches face a trade-off between statistical accuracy and localization. In this paper, I introduce a new estimation framework using high-frequency candlestick data that include open, high, low, and close prices, effectively addressing this trade-off. By exploiting the information contained in candlesticks, the proposed method improves estimation accuracy relative to the benchmarks while preserving local structure. I further develop a test for spot covariance inference based on candlesticks, which demonstrates reasonable size control and a notable increase in power, particularly in small samples. Motivated by recent work in the finance literature, I test empirically the market neutrality of the iShares Bitcoin Trust ETF (IBIT) using 1-minute candlestick data for the full year of 2024. The results show systematic deviations from market neutrality, especially in periods of market stress. An event study around FOMC announcements further illustrates the new method’s ability to detect subtle shifts in response to relatively mild information events.
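As a rough illustration of why high and low prices carry information beyond open-to-close returns, the sketch below computes the classical Parkinson range-based variance estimate from a handful of candlesticks. This is a textbook univariate estimator, not the spot covariance method proposed in the paper; the function name and the (open, high, low, close) tuple layout are assumptions made for this example.

```python
import math

def parkinson_variance(candles):
    """Classical Parkinson range-based variance estimate.

    `candles` is a list of (open, high, low, close) tuples (hypothetical
    layout). Only the high and low prices enter this particular estimator.
    NOTE: illustrative textbook estimator, not the paper's method.
    """
    if not candles:
        raise ValueError("need at least one candlestick")
    # Squared log range of each candlestick.
    squared_ranges = [math.log(high / low) ** 2 for _, high, low, _ in candles]
    # Parkinson scaling constant 1 / (4 ln 2).
    return sum(squared_ranges) / (4.0 * math.log(2.0) * len(candles))

# Three made-up 1-minute candlesticks.
example = [(100.0, 100.4, 99.8, 100.1),
           (100.1, 100.5, 100.0, 100.3),
           (100.3, 100.6, 99.9, 100.0)]
per_bar_variance = parkinson_variance(example)
```

A flat candlestick (high equal to low) contributes zero, while a wide range raises the estimate even when the open-to-close return is close to zero.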

Working Papers

Intraday Variation in Systematic Risks and Information Flows (with Andrew J. Patton)
Abstract · SSRN · VTSS Workshop Recording
This paper analyzes variation in the factor structure of asset returns within a trading day by combining non-parametric kernel methods with principal component analysis. We estimate the model on a collection of over 400 high-frequency US equity returns over the period 1996-2020 and show that the proposed model has superior explanatory power relative to a collection of well-known observable factor models and standard PCA. We present a stylized model of asset prices and information flows and show that the factor structure of asset returns varies with the arrival of news. Using data on individual firm earnings announcements, FOMC announcements, and other macroeconomic announcements, we provide evidence consistent with our stylized model, that the superior performance of the proposed model is due to time variation in the factor structure of asset returns around times of information flows.

Generalized Autoregressive Score Trees and Forests (with Andrew J. Patton)
Revised and resubmitted at Journal of Business & Economic Statistics.
Abstract · SSRN · Online Appendix
We propose methods to improve the forecasts from generalized autoregressive score (GAS) models (Creal et al., 2013; Harvey, 2013) by localizing their parameters using decision trees and random forests. These methods avoid the curse of dimensionality faced by kernel-based approaches, and allow one to draw on information from multiple state variables simultaneously. We apply the new models to four distinct empirical analyses, and in all applications the proposed new methods significantly outperform the baseline GAS model. In our applications to stock return volatility and density prediction, the optimal GAS tree model reveals a leverage effect and a variance risk premium effect. Our study of stock-bond dependence finds evidence of a flight-to-quality effect in the optimal GAS forest forecasts, while our analysis of high-frequency trade durations uncovers a volume-volatility effect.
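For readers unfamiliar with the baseline being improved, the following is a minimal sketch of a Gaussian GAS(1,1) variance filter with inverse-information scaling, in which case the scaled score reduces to the squared return minus the current variance. The function name, argument names, and parameter values are assumptions made for illustration; the sketch keeps the parameters constant and does not implement the tree or forest localization described in the abstract.

```python
def gas_variance_filter(returns, omega, alpha, beta, f0):
    """Baseline Gaussian GAS(1,1) filter for a time-varying variance f_t.

    With y_t ~ N(0, f_t) and inverse-information scaling, the scaled score
    is s_t = y_t**2 - f_t, and the update is
        f_{t+1} = omega + alpha * s_t + beta * f_t.
    Parameters are held constant here (an assumption of this sketch).
    """
    f = f0
    path = []
    for y in returns:
        path.append(f)               # variance used for observation t
        scaled_score = y * y - f     # s_t under the Gaussian density
        f = omega + alpha * scaled_score + beta * f
    return path

# Made-up returns and parameter values, purely for illustration.
variances = gas_variance_filter([0.5, -1.2, 0.3, 2.0],
                                omega=0.05, alpha=0.10, beta=0.90, f0=1.0)
```

In a localized version, `omega`, `alpha`, and `beta` would depend on observed state variables rather than being fixed across the sample.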

Publications

Bridging the Covid-19 Data and the Epidemiological Model using the Time-varying Parameter SIRD Model
(with Cem Cakmakli)
Journal of Econometrics, 242/1, May 2024.
Abstract · Published Version
This paper extends the canonical model of epidemiology, the SIRD model, to allow for time-varying parameters for real-time measurement and prediction of the trajectory of the Covid-19 pandemic. Time variation in model parameters is captured using the score-driven modeling structure designed for the typical daily count data related to the pandemic. The resulting specification permits a flexible yet parsimonious model with a low computational cost. The model is extended to allow for unreported cases using a mixed-frequency setting. Results suggest that these cases’ effects on the parameter estimates might be sizeable. Full sample results show that the flexible framework accurately captures the successive waves of the pandemic. A real-time exercise indicates that the proposed structure delivers timely and precise information on the pandemic’s current stance.
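To make the underlying compartmental structure concrete, here is a minimal sketch of a discrete-time SIRD recursion in which the infection rate is allowed to change each day. The path of the infection rate is simply supplied by the caller; the score-driven updating and the treatment of unreported cases in the paper are not implemented. The function name, argument names, and numerical values are all assumptions made for this example.

```python
def simulate_sird(beta_path, gamma, nu, s0, i0, r0=0.0, d0=0.0):
    """Discrete-time SIRD recursion with a time-varying infection rate.

    beta_path : infection rate for each day (supplied, not estimated)
    gamma, nu : daily recovery and death rates (held constant in this sketch)
    Returns a list of (S, I, R, D) tuples, including the initial state.
    """
    n = s0 + i0 + r0 + d0            # total population, constant over time
    s, i, r, d = s0, i0, r0, d0
    path = [(s, i, r, d)]
    for beta in beta_path:
        new_infected = beta * s * i / n
        new_recovered = gamma * i
        new_dead = nu * i
        s -= new_infected
        i += new_infected - new_recovered - new_dead
        r += new_recovered
        d += new_dead
        path.append((s, i, r, d))
    return path

# Made-up example: the infection rate falls from 0.30 to 0.10 after 30 days.
betas = [0.30] * 30 + [0.10] * 30
trajectory = simulate_sird(betas, gamma=0.10, nu=0.01, s0=990.0, i0=10.0)
```

Because every flow leaves one compartment and enters another, the four compartments always sum to the initial population.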