Important voice-AI papers by Korean authors
https://arxiv.org/abs/2106.06103
Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
Several recent end-to-end text-to-speech (TTS) models enabling single-stage training and parallel sampling have been proposed, but their sample quality does not match that of two-stage TTS systems. In this work, we present a parallel end-to-end TTS method…
VITS
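In one line: VITS trains a conditional VAE end to end (latent straight to raw waveform, no separate vocoder) and adds a GAN loss on the generated audio. A minimal sketch of that combined objective, assuming placeholder modules for the posterior encoder, text-conditional prior, decoder, and discriminator; duration prediction and alignment search from the paper are left out, and this is my paraphrase, not the authors' code:

```python
# Minimal sketch of a VITS-style training objective: conditional VAE (reconstruction + KL)
# plus an adversarial loss on the generated waveform. Module names are placeholders,
# and duration prediction / alignment search from the paper are omitted.
import torch
import torch.nn.functional as F


def gaussian_kl(q_mean, q_logstd, p_mean, p_logstd):
    """Closed-form KL(q || p) for diagonal Gaussians parameterized by mean and log-std."""
    kl = (p_logstd - q_logstd
          + (torch.exp(2 * q_logstd) + (q_mean - p_mean) ** 2)
          / (2 * torch.exp(2 * p_logstd))
          - 0.5)
    return kl.mean()


def generator_loss(batch, posterior_enc, prior_enc, decoder, discriminator, mel_fn):
    text, mel = batch["text"], batch["mel"]

    # Posterior q(z|x): encode the target spectrogram into a latent and reparameterize.
    q_mean, q_logstd = posterior_enc(mel)
    z = q_mean + torch.randn_like(q_mean) * torch.exp(q_logstd)

    # Conditional prior p(z|c): the text defines a prior over the same latent space.
    p_mean, p_logstd = prior_enc(text)

    # Decoder p(x|z): raw waveform is generated straight from the latent
    # (single-stage, no separate vocoder).
    wav_hat = decoder(z)

    # Reconstruction term: L1 between mel spectrograms of real and generated audio.
    recon = F.l1_loss(mel_fn(wav_hat), mel)

    # KL term pulls the posterior toward the text-conditional prior.
    kl = gaussian_kl(q_mean, q_logstd, p_mean, p_logstd)

    # Adversarial term (least-squares GAN): the decoder tries to fool the discriminator.
    adv = ((discriminator(wav_hat) - 1.0) ** 2).mean()

    return recon + kl + adv
```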
https://music-audio-ai.tistory.com/22
[Paper Review] Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech (ICML21)
Title: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech / Authors: Jaehyeon Kim, Jungil Kong, Juhee Son / Affiliation: Kakao Enterprise, KAIST / Venue: ICML 2021 / Paper: https://arxiv.org/abs/2106.06103 / Code: https:
VITS paper review
https://arxiv.org/abs/2111.12203
KUIELab-MDX-Net: A Two-Stream Neural Network for Music Demixing
Recently, many methods based on deep learning have been proposed for music source separation. Some state-of-the-art methods have shown that stacking many layers with many skip connections improves the SDR performance. Although such a deep and complex architecture…
MDX-Net
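As far as I understand the paper, the two streams are a spectrogram-domain separator and a time-domain separator, and the final stems come from blending the two outputs per source. A rough sketch of just that blending step, with placeholder model callables and assumed blend weights (not the actual KUIELab-MDX-Net code):

```python
# Rough sketch of the two-stream + blending idea: run a spectrogram-domain separator and a
# time-domain separator on the same mixture, then blend their waveform outputs per source.
# The model callables and blend weights are placeholders, not the actual KUIELab-MDX-Net code.
def blend_stems(mixture, spec_model, time_model, weights=None):
    """mixture: (channels, samples) waveform array.

    spec_model / time_model: callables returning {source_name: (channels, samples)}.
    weights: {source_name: w}, the spectrogram branch's share in [0, 1] (assumed values).
    """
    spec_out = spec_model(mixture)   # e.g. {"vocals": ..., "drums": ..., "bass": ..., "other": ...}
    time_out = time_model(mixture)
    if weights is None:
        weights = {name: 0.5 for name in spec_out}

    blended = {}
    for name in spec_out:
        w = weights[name]
        # Waveform-domain weighted average of the two branches' estimates.
        blended[name] = w * spec_out[name] + (1.0 - w) * time_out[name]
    return blended
```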
There are also many voice conversion papers written by Korean authors.
For AI covers, RVC is widely used to apply the target voice, and the core model behind RVC is VITS.
To separate the original song's vocals from the accompaniment, UVR5 is widely used, and its core model is MDX-Net.
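So the usual AI-cover pipeline is: separate, convert, remix. A sketch with hypothetical helpers (separate_sources and convert_voice are placeholders standing in for the UVR5 and RVC steps, not their actual APIs):

```python
# Sketch of the AI-cover pipeline: separate stems, convert only the vocals, remix.
# separate_sources and convert_voice are hypothetical callables, not UVR5/RVC APIs.
def make_ai_cover(song_wav, target_voice_model, separate_sources, convert_voice):
    """song_wav: (channels, samples) array of the original track.

    separate_sources: MDX-Net-style separator returning {"vocals": ..., "instrumental": ...}.
    convert_voice: VITS-based (RVC-style) voice conversion applied to the vocal stem.
    """
    # 1) Split the original song into vocals and accompaniment (the UVR5 step).
    stems = separate_sources(song_wav)
    vocals, instrumental = stems["vocals"], stems["instrumental"]

    # 2) Convert only the vocal stem into the target voice (the RVC step).
    converted_vocals = convert_voice(vocals, target_voice_model)

    # 3) Mix the converted vocals back over the untouched accompaniment.
    return converted_vocals + instrumental
```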