Abstract: A generative adversarial network (GAN)-based vocoder trained with an adversarial discriminator is commonly used for speech synthesis because of its fast, lightweight, and high-quality ...
Abstract: Denoising diffusion probabilistic models (diffusion models for short) require a large number of iterations in inference to achieve the generation quality that matches or surpasses the ...