SPIRAL

Self-supervised Perturbation-Invariant Representation Learning for speech pre-training. Learns denoising representations of perturbed data in a teacher-student framework. Achieves results competitive with or better than wav2vec 2.0 while reducing training cost by 80% for the BASE model and 65% for the LARGE model. Published at ICLR 2022.
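
The teacher-student idea can be illustrated with a toy NumPy sketch. This is not the SPIRAL implementation: the linear encoder, the additive Gaussian noise, the cosine-distance loss, and the moving-average teacher update with a decay of 0.99 are all assumptions chosen for brevity, and every function name is hypothetical. The student's gradient step is omitted.

```python
import numpy as np

def perturb(x, rng, noise_std=0.1):
    """Hypothetical perturbation: additive Gaussian noise on the input frames."""
    return x + noise_std * rng.standard_normal(x.shape)

def encode(weights, x):
    """Toy encoder (linear map + tanh); a stand-in for a real speech encoder."""
    return np.tanh(x @ weights)

def cosine_loss(student_out, teacher_out):
    """Mean (1 - cosine similarity) over frames; a simple stand-in objective."""
    s = student_out / np.linalg.norm(student_out, axis=-1, keepdims=True)
    t = teacher_out / np.linalg.norm(teacher_out, axis=-1, keepdims=True)
    return float(np.mean(1.0 - np.sum(s * t, axis=-1)))

def ema_update(teacher_w, student_w, decay=0.99):
    """Assumed teacher update: exponential moving average of student weights."""
    return decay * teacher_w + (1.0 - decay) * student_w

rng = np.random.default_rng(0)
frames = rng.standard_normal((8, 16))          # 8 frames, 16 features each
student_w = 0.1 * rng.standard_normal((16, 4))
teacher_w = student_w.copy()                   # teacher starts as a copy

# Teacher sees the clean input; student sees the perturbed input.
teacher_out = encode(teacher_w, frames)
student_out = encode(student_w, perturb(frames, rng))

# The student would be trained to minimise this loss (gradient step omitted).
loss = cosine_loss(student_out, teacher_out)

# After a student step, the teacher drifts slowly toward the student.
teacher_w = ema_update(teacher_w, student_w)
```

Because the student only receives the perturbed frames while its target comes from the clean ones, a lower loss means its output has become less sensitive to the perturbation.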
