Microsoft AI creates realistic speech with little training

The system relies in part on Transformers, a type of deep neural network loosely inspired by neurons in the brain. Transformers weigh every input and output against the others on the fly, much like synaptic links, which helps them process even lengthy sequences (say, a complex sentence) very efficiently. Combine that with a noise-removing encoder component and the AI can do a lot with relatively little.
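To make the "weighing" idea concrete, here is a minimal sketch of scaled dot-product attention, the standard weighting step in Transformer models. It is not Microsoft's code: the function names and the toy vectors are assumptions made purely for illustration, and a real system would learn separate query, key and value projections rather than reuse the raw inputs.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    scale = math.sqrt(len(keys[0]))
    outputs, all_weights = [], []
    for q in queries:
        # Score this position against every position in the sequence.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Blend the value vectors according to those weights.
        mixed = [sum(w * v[d] for w, v in zip(weights, values))
                 for d in range(len(values[0]))]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical toy sequence: three 2-dimensional token vectors.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(tokens, tokens, tokens)
```

Each row of `weights` says how much one position draws on every other position, and because all positions are scored at once, distant words in a long sentence can influence each other directly.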

The results aren’t perfect, with a slight robotic sound, but they’re highly accurate, with a word intelligibility of 99.84 percent. More importantly, this could make text-to-speech more accessible. You wouldn’t need to spend much effort to get realistic voices, which puts the technology within reach of small companies and even amateurs. This also bodes well for the future. Researchers hope to train on unmatched data, so it might require even less work to create realistic dialogue.
