SAMformer
A lightweight transformer for multivariate long-term time series forecasting that combines sharpness-aware minimization (SAM) with channel-wise attention. The paper identifies the attention mechanism as the component responsible for poor generalization in forecasting transformers and addresses it via SAM optimization. SAMformer surpasses TSMixer by 14.33% on average while using roughly 4x fewer parameters. Presented as an oral at ICML 2024 by Huawei's Paris Noah's Ark Lab.
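To make the SAM idea concrete, here is a minimal sketch of one sharpness-aware minimization update on a toy quadratic loss. The loss, the `rho` and `lr` values, and the function names are illustrative assumptions, not the paper's actual training setup; the two-step structure (ascent to a worst-case perturbation, then a descent step using the gradient taken there) is the core of SAM.

```python
import numpy as np

def loss(w):
    # Toy quadratic loss standing in for the forecasting loss (assumption).
    return 0.5 * np.sum(w ** 2)

def grad(w):
    # Gradient of the toy loss above.
    return w

def sam_step(w, rho=0.05, lr=0.1):
    """One SAM update:
    1. move to the (approximate) worst-case point within an L2 ball
       of radius rho around the current weights;
    2. step from the ORIGINAL weights using the gradient evaluated
       at that perturbed point."""
    g = grad(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # ascent direction
    g_adv = grad(w + eps)                        # gradient at perturbed weights
    return w - lr * g_adv

w = np.array([1.0, -2.0])
for _ in range(50):
    w = sam_step(w)
print(loss(w))
```

Because the descent direction is computed at the perturbed weights, the optimizer is biased toward flat minima, which is how SAMformer tames the sharp loss landscape that attention induces in forecasting transformers.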
Outputs (2)
SAMformer
Model: SAMformer: Unlocking the Potential of Transformers in Time Series Forecasting with Sharpness-Aware Minimization and Channel-Wise Attention
Paper: arXiv:2402.10198