Important Voice AI Papers by Korean Authors

지알오알지 2024. 12. 26. 16:25

https://arxiv.org/abs/2106.06103

Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

Several recent end-to-end text-to-speech (TTS) models enabling single-stage training and parallel sampling have been proposed, but their sample quality does not match that of two-stage TTS systems. In this work, we present a parallel end-to-end TTS method

VITS

https://music-audio-ai.tistory.com/22

[Paper Review] Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech (ICML21)

Title: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
Authors: Jaehyeon Kim, Jungil Kong, Juhee Son
Affiliation: Kakao Enterprise, KAIST
Venue: ICML 2021
Paper: https://arxiv.org/abs/2106.06103
Code: https:

VITS paper review

https://arxiv.org/abs/2111.12203

KUIELab-MDX-Net: A Two-Stream Neural Network for Music Demixing

Recently, many methods based on deep learning have been proposed for music source separation. Some state-of-the-art methods have shown that stacking many layers with many skip connections improve the SDR performance. Although such a deep and complex archit

MDX-Net

It turns out that many of the papers related to voice conversion were written by Korean authors.

For AI covers, RVC is widely used to apply a target voice to a song, and the core model behind RVC is VITS.

To separate the vocals from the accompaniment of the original song, UVR5 is widely used, and MDX-Net is one of its core models.
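
To make the two-stage flow concrete, here is a minimal Python sketch of how an AI cover could be assembled: separate the stems, convert the vocals, then mix them back over the accompaniment. Everything in it is an assumption for illustration. `make_ai_cover`, `toy_separate`, and `toy_convert` are hypothetical names, audio is simplified to a list of mono samples, and neither UVR5 nor RVC is actually called.

```python
from typing import Callable, List, Tuple

Audio = List[float]  # mono samples in [-1.0, 1.0]; a simplification for illustration


def make_ai_cover(
    mix: Audio,
    separate: Callable[[Audio], Tuple[Audio, Audio]],
    convert_voice: Callable[[Audio], Audio],
) -> Audio:
    """Separate the vocals, convert them, and mix them back over the accompaniment."""
    # Stage 1: source separation (the role an MDX-Net model plays in UVR5).
    vocals, accompaniment = separate(mix)
    # Stage 2: voice conversion (the role a VITS-based RVC model plays).
    converted = convert_voice(vocals)
    if len(converted) != len(accompaniment):
        raise ValueError("converted vocals and accompaniment must have equal length")
    # Sum the two stems sample by sample and clip to the valid range.
    return [max(-1.0, min(1.0, v + a)) for v, a in zip(converted, accompaniment)]


# Toy stand-ins so the sketch runs without any model.
# They are NOT real separators or voice converters.
def toy_separate(mix: Audio) -> Tuple[Audio, Audio]:
    half = [s / 2 for s in mix]
    return half, half


def toy_convert(vocals: Audio) -> Audio:
    return [-s for s in vocals]  # only flips polarity


cover = make_ai_cover([0.5, -0.25, 1.0], toy_separate, toy_convert)
```

In a real setup the two callables would wrap whatever separation and conversion tools you actually run; passing them in as arguments keeps the mixing logic independent of any particular model.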