AI auto-corrects notes that are sung off key

The end to out-of-tune karaoke! AI auto-corrects notes that are sung off key (but it doesn’t make them sound robotic like conventional autotune)

The study using AI looked specifically at pitch correction in a karaoke setting
An algorithm learns to estimate the amount of tuning required to stay in tune
It works without using a musical score as a reference, unlike commercial systems
The system corrects off-key notes to harmonise with the accompanying music
Nuances in the voice are maintained, which stops corrected vocals sounding robotic

Published: 16:54 GMT, 19 February 2019 | Updated: 00:46 GMT, 20 February 2019

Listening to your friends butcher your favourite songs during karaoke could become a thing of the past, thanks to software created by scientists.

Their AI system can bring your pitch closer to the original artist’s intentions, without making your voice sound robotic or artificial.

That means nuances in your voice are kept and that intentional vocal flourishes aren’t completely erased.

Any minor wobbles, or even massive deviations from the pitch of a song, are simply moved closer to how the song was meant to be sung.

The software, which is not yet commercially available, does this by shifting the pitch of individual sung notes to align them more closely with the accompanying music.

Most commercial autotuning systems require the user to input a melody score, or instructions to shift notes towards a particular pitch or scale.

Sanna Wager, a PhD candidate at Indiana University and main author of the study, told New Scientist: ‘When looking at how to correct the current note, we look at what the singer did over the past few seconds.’

The current tool must be applied to recordings after they have been made, but the end product could be used to make changes on-the-fly.

Her paper, published on the pre-print repository arXiv.org, contains audio samples of how a voice altered by the commercial version of the product could sound (below).

This clip is taken from an original karaoke recording of Frank Sinatra’s The Way You Look Tonight

This clip is of the same recording of Frank Sinatra’s The Way You Look Tonight which has been processed by the AI software

To create the system, researchers at Indiana University Bloomington used 4,702 amateur voice recordings from the online karaoke platform Smule to ‘train’ their AI algorithm to recognise and correct off-key notes.

The team selected 500 tracks that were performed ‘in-tune’ and split the tracks into separate files, one for voice and one for the accompanying music.

They then intentionally created an ‘out-of-tune’ version of the voice track by randomly shifting notes up to a semitone higher, while the accompaniment music was kept the same.
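The detuning step described above can be sketched roughly as follows. This is a minimal illustration, not the researchers’ code: it assumes each note is represented by a single frequency in hertz, and the function name `detune_notes`, the example frequencies and the use of cents (100 cents to a semitone) are choices made here for illustration.

```python
import random

CENTS_PER_SEMITONE = 100  # 1,200 cents make up one octave


def detune_notes(note_freqs_hz, max_shift_cents=CENTS_PER_SEMITONE, seed=None):
    """Return (detuned_freqs, shifts_cents) for a list of in-tune note frequencies.

    Each note is raised by a random amount between 0 and max_shift_cents,
    mirroring the article's 'up to a semitone higher' description.
    """
    rng = random.Random(seed)
    detuned, shifts = [], []
    for freq in note_freqs_hz:
        shift = rng.uniform(0.0, max_shift_cents)
        # A shift of c cents multiplies the frequency by 2 ** (c / 1200).
        detuned.append(freq * 2 ** (shift / 1200))
        shifts.append(shift)
    return detuned, shifts


# Hypothetical in-tune notes (roughly A3, B3 and C4).
in_tune = [220.0, 246.94, 261.63]
detuned, shifts = detune_notes(in_tune, seed=0)

# The correction a model would need to learn is the reverse of each shift.
targets = [-s for s in shifts]
```

Because the accompaniment is left untouched, each detuned note paired with its reverse shift gives a training example with a known answer.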

The AI learnt to predict the amount that each voice note needed to be adjusted in order to stay ‘in-pitch’ with the instrumental accompaniment.

This modulation was then applied to all the off-key notes in each solo voice recording to correct the entire voice track.
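As a rough sketch of that final step, the snippet below applies one predicted correction per note. It is illustrative only: the function name `apply_corrections`, the representation of a note as a single frequency in hertz, and the example values are all assumptions, and the predictions are typed in by hand rather than produced by a trained model.

```python
def apply_corrections(note_freqs_hz, corrections_cents):
    """Shift each note by its predicted correction, given in cents."""
    if len(note_freqs_hz) != len(corrections_cents):
        raise ValueError("one correction is needed per note")
    return [
        freq * 2 ** (cents / 1200)  # 1,200 cents per octave
        for freq, cents in zip(note_freqs_hz, corrections_cents)
    ]


# Hypothetical values: a note sung 50 cents sharp of 440 Hz, and one already in tune.
sung = [440.0 * 2 ** (50 / 1200), 330.0]
predicted = [-50.0, 0.0]
corrected = apply_corrections(sung, predicted)
```

A correction of zero leaves a note exactly as it was sung, which is how notes that are already in tune would pass through unchanged.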

Writing in the paper, its authors said: ‘This approach differs from commercially used automatic pitch correction systems, where notes in the vocal tracks are shifted to be centered around notes in a user-defined score or mapped to the closest pitch among the twelve equal-tempered scale degrees.’
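For comparison, the conventional ‘closest pitch’ approach mentioned in that quote can be sketched in a few lines. This is a simplified illustration, not any commercial product’s algorithm: it assumes a tuning reference of A4 = 440 Hz and uses MIDI note numbers to index the twelve equal-tempered scale degrees, and the function name `snap_to_semitone` is made up for this example.

```python
import math

A4_HZ = 440.0   # assumed tuning reference
A4_MIDI = 69    # MIDI note number of A4


def snap_to_semitone(freq_hz):
    """Return (snapped_freq_hz, correction_cents) for the nearest equal-tempered pitch."""
    # Position of the sung pitch on a continuous semitone scale.
    midi = A4_MIDI + 12 * math.log2(freq_hz / A4_HZ)
    nearest = round(midi)
    snapped = A4_HZ * 2 ** ((nearest - A4_MIDI) / 12)
    correction_cents = 100 * (nearest - midi)
    return snapped, correction_cents


# A note sung slightly sharp of A4 is pulled back down to 440 Hz.
snapped, cents = snap_to_semitone(450.0)
```

Snapping of this kind pays no attention to the accompaniment, which is the difference the authors highlight: their system predicts the shift from the singer and the backing track together.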

This clip is an original karaoke performance of R Kelly’s I Believe I Can Fly

This clip is the version of I Believe I Can Fly after random notes have been shifted to sound ‘off-tune’

This clip is the AI-autocorrected version of the off-pitch I Believe I Can Fly performance

HOW DOES ARTIFICIAL INTELLIGENCE LEARN?

AI systems rely on artificial neural networks (ANNs), which try to simulate the way the brain works in order to learn.

ANNs can be trained to recognise patterns in information – including speech, text data, or visual images – and are the basis for a large number of the developments in AI over recent years.

Conventional AI uses input to ‘teach’ an algorithm about a particular subject by feeding it massive amounts of information.

Practical applications include Google’s language translation services, Facebook’s facial recognition software and Snapchat’s image altering live filters.

The process of inputting this data can be extremely time consuming, and is limited to one type of knowledge.

A new breed of ANNs called Adversarial Neural Networks pits the wits of two AI bots against each other, which allows them to learn from each other.

This approach is designed to speed up the process of learning, as well as refining the output created by AI systems.
