this post was submitted on 18 Sep 2023
I'm sorry, but as somebody who's tried out the tech, the amount of voice data required is still many hours. Even the more professional AI voice-cloning websites that let you clone your own voice require that you submit "a couple of hours" of voice data. The reason musicians and voice actors are in the middle of this is that they already have many hours of voice work out there. And in many cases, the matching text transcript, which is required to train a voice model, is already available. An audiobook, for example.
You think scam call centers are going to spend the time to look for voice clips, parse them out, transcribe them into text, put them in a model, train that model for many hours, realize the Python code needs some goddamn dependency that will take many more hours to debug, fix parameter settings, and then get a subpar voice model that couldn't fool anybody because they don't have enough voice clips?
They can't even be bothered to look up public information about the caller they are making the call to. Fuck, the last call I got was from a "support center for your service", and when I asked "which service?", they immediately hung up. They do not give a fuck about trying to come prepared with your personal details. They want the easiest mark possible that doesn't ask questions and can get scammed without even knowing their name.
Who's Gamgam?
Yeah, sorry, you need more than a "few times" or a "few voice clips".
Imagine making this post refuting new AI tech, but being unable to figure out that Gamgam means grandmother.
Shit like this has already happened.
I dunno... have you tried Googling it? I figured it was some word for grandmother, but I've never heard of it, and neither has Google, apparently.
Huh, DuckDuckGo came up with "favorite Southern grandma names" and "50 best grandma names" as the second and third articles. You do have to know what to type in and not always look at the first thing that comes up. I searched "who is a gamgam" and found tons of stuff about grandmas.
It's not at all clear that what you're saying is true now: "They thought loved ones were calling for help. It was an AI scam."
And it's a nailed-on guarantee that it won't remain true for very long at all.
This is the kind of thing that AI actually is good at. Hollywood will use it to make out like bandits and so will criminals.
There's a lot of hyped-up scaremongering about AI but this particular cat is out of this particular bag.
https://arstechnica.com/information-technology/2023/01/microsofts-new-ai-can-simulate-anyones-voice-with-3-seconds-of-audio/
I have personally tried out VALL-E. What they are claiming is absolute bullshit. It is "a" voice, but it's certainly nowhere close to "your" voice. Don't believe me? You can try it out yourself.
Real AI training requires putting in the work.
Well that's weird, since the article says:
You think maybe that GitHub release, which isn't from Microsoft, might not be the same thing despite the name?
All of the AI tech is based on research papers, which are semi-publicly available.
Amazon showed off voice cloning over a year ago, and iirc it was claimed not to require hours of content. You’re lagging in your understanding of current capabilities, never mind the fact that I was talking about the near future.