For the second idea, the user needs to know how to sing: the voice would be modulated in real time to change its timbre, as seen in the movie Terminator 2 when the T-1000 talks on the phone to try to deceive Connor.

Do you think they can make this possible?
Honestly, I think AI is the way forward for music tools, and FL Studio could be the first to implement this, as you did with the native stem separator; now Logic Pro and other DAWs have that function.
Please consider it.
