LETS Forecast: Learning Embedology for Time Series Forecasting
- Abrar Majeedi
- Viswanatha Reddy Gajjala
- Satya Sai Srinath Namburi GNVV
- Nada Magdi Elkordi
- Yin Li

University of Wisconsin–Madison
Accepted at ICML 2025!

Abstract
Real-world time series are often governed by complex nonlinear dynamics, and understanding these underlying dynamics is crucial for accurate prediction. While deep learning has achieved major success in time series forecasting, many existing approaches do not explicitly model the dynamics. To bridge this gap, we introduce DeepEDM, a framework that integrates nonlinear dynamical systems modeling with deep neural networks. Inspired by empirical dynamic modeling (EDM) and rooted in Takens' theorem, DeepEDM presents a novel deep model that learns a latent space from time-delayed embeddings and employs kernel regression to approximate the underlying dynamics, leveraging an efficient implementation of softmax attention to enable accurate prediction of future time steps. To evaluate our method, we conduct comprehensive experiments on synthetic data from nonlinear dynamical systems as well as real-world time series across domains. Our results show that DeepEDM is robust to input noise and outperforms state-of-the-art methods in forecasting accuracy.
Background
DeepEDM builds on the foundations of Takens' theorem and empirical dynamic modeling (EDM). Takens' theorem guarantees that, under certain smoothness and dimensionality conditions, the state space of a deterministic dynamical system can be reconstructed from time-delayed embeddings of univariate observations. EDM operationalizes this by forming a time-delay embedding of the observed signal and using local geometric methods, such as Simplex projection, to forecast future values. However, Simplex projection is sensitive to noise, models each time series independently, and is fundamentally limited to short-term forecasting.
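To make the background concrete, here is a minimal sketch of a time-delay embedding and one-step Simplex projection. The embedding dimension `dim`, delay `tau`, and neighbor count `k` are hypothetical choices for illustration, not values from the paper.

```python
import numpy as np

def delay_embed(x, dim, tau=1):
    """Build time-delay embedding vectors from a univariate series.

    Row i is (x[i], x[i+tau], ..., x[i+(dim-1)*tau]): a reconstructed
    state of the underlying system, per Takens' theorem.
    """
    n = len(x) - (dim - 1) * tau
    return np.stack([x[j * tau : j * tau + n] for j in range(dim)], axis=1)

def simplex_forecast(x, dim=3, tau=1, k=None):
    """One-step Simplex projection: find the nearest embedded neighbors
    of the latest state and average their one-step-ahead continuations."""
    k = k or dim + 1                       # a common choice: dim + 1 neighbors
    E = delay_embed(x, dim, tau)
    query, library = E[-1], E[:-1]         # latest state vs. all earlier states
    dists = np.linalg.norm(library - query, axis=1)
    nn = np.argsort(dists)[:k]             # indices of the k nearest neighbors
    w = np.exp(-dists[nn] / (dists[nn].min() + 1e-12))
    w /= w.sum()                           # exponential distance weighting
    # neighbor at embedding row j continues to x[j + (dim-1)*tau + 1]
    future = x[nn + (dim - 1) * tau + 1]
    return float(w @ future)
```

Note how the forecast uses only a discrete set of `k` neighbors; this hard selection is exactly what DeepEDM later relaxes into soft attention.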
Method

DeepEDM extends traditional EDM by introducing a learnable neural framework that preserves the core principle of dynamical system reconstruction while addressing key practical limitations. The method operates on an extended time series formed by concatenating the original lookback window with initial predictions from a simple base model, effectively removing the constraint that forecasting horizons must be shorter than the lookback window. Rather than working directly with noisy time-delayed embeddings, DeepEDM employs a learned encoder to project these embeddings into a robust latent space where meaningful comparisons can be made despite measurement noise.
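The extended-series construction above can be sketched as follows. The text only specifies "a simple base model," so a linear extrapolation stands in here as a hypothetical placeholder; `base_forecast` and `extend_series` are illustrative names, not the paper's API.

```python
import numpy as np

def base_forecast(lookback, horizon):
    """Placeholder base model: linear extrapolation of the lookback window.
    (Hypothetical; the paper only requires some simple initial predictor.)"""
    t = np.arange(len(lookback))
    slope, intercept = np.polyfit(t, lookback, 1)
    return intercept + slope * np.arange(len(lookback), len(lookback) + horizon)

def extend_series(lookback, horizon):
    """Concatenate the lookback window with initial base-model predictions,
    lifting the usual EDM constraint that the forecast horizon must be
    shorter than the lookback window."""
    return np.concatenate([lookback, base_forecast(lookback, horizon)])
```

The extended series can then be delay-embedded and passed through the learned encoder; with this construction the horizon (here 30) may exceed the lookback length (here 20).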
The forecasting process replaces traditional nearest neighbor search with differentiable kernel regression using the Nadaraya-Watson estimator, where attention weights are computed across all data points in the latent space rather than selecting discrete neighbors. This soft attention mechanism, implemented efficiently through optimized softmax operations, allows the model to leverage information from the entire lookback window while maintaining the theoretical foundation of Simplex projection. A final decoder network reconstructs predictions from the time-delayed outputs and provides additional denoising capabilities. The entire architecture is trained end-to-end using a combined loss function that minimizes both prediction errors and temporal differences, enabling the model to learn generalizable patterns across multiple time series while maintaining the dynamical systems perspective of EDM.
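A minimal sketch of the two ingredients above: the Nadaraya-Watson estimator written as softmax attention over all latent points (replacing Simplex's discrete neighbor selection), and a combined loss on values and first differences. The Gaussian kernel, `bandwidth`, and the `alpha` weighting are illustrative assumptions, not values from the paper.

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def nadaraya_watson_attention(queries, keys, values, bandwidth=1.0):
    """Nadaraya-Watson kernel regression as softmax attention: a Gaussian
    kernel over latent distances yields attention weights over ALL library
    points, a soft version of Simplex's k-nearest-neighbor selection.
    `bandwidth` is a hypothetical smoothing parameter."""
    # squared Euclidean distances between each query and every key
    d2 = ((queries[:, None, :] - keys[None, :, :]) ** 2).sum(-1)
    w = softmax(-d2 / (2.0 * bandwidth ** 2), axis=1)  # rows sum to 1
    return w @ values                                  # weighted continuations

def combined_loss(pred, target, alpha=0.5):
    """Combined objective: MSE on predicted values plus MSE on their first
    differences (temporal-difference term); `alpha` is a hypothetical weight."""
    mse = ((pred - target) ** 2).mean()
    tdiff = ((np.diff(pred) - np.diff(target)) ** 2).mean()
    return mse + alpha * tdiff
```

Because the attention weights are a differentiable function of latent distances, gradients flow through the regression step into the encoder, which is what allows end-to-end training.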
Experiments and Results

Results on Synthetic Data Experiments

Results on Standard Forecasting Benchmarks
