this post was submitted on 12 Jan 2025

666 points (98.0% liked)

VLC player demos real-time AI subtitling for videos (www.theverge.com)

submitted 3 months ago* (last edited 3 months ago) by [email protected] to c/[email protected]

cross-posted from: https://lemmy.ca/post/37011397

The popular open-source VLC video player was demonstrated on the floor of CES 2025 with automatic AI subtitling and translation, generated locally and offline in real time. Parent organization VideoLAN shared a video on Tuesday in which president Jean-Baptiste Kempf shows off the new feature, which uses open-source AI models to generate subtitles for videos in several languages.

[–] [email protected] 282 points 3 months ago (3 children)

Finally, some good fucking AI

[–] [email protected] 169 points 3 months ago (1 children)

I was just thinking, this is exactly what AI should be used for. Pattern recognition, full stop.

[–] [email protected] 67 points 3 months ago (2 children)

Yup, and if it isn't perfect that is ok as long as it is close enough.

Like getting name spellings wrong or mixing homophones is fine because it isn't trying to be factually accurate.

[–] [email protected] 34 points 3 months ago (7 children)

The problem is that people will now say they don't need to create accurate subtitles because VLC is doing the job for them.

Accessibility might suffer from that, because all subtitles are now just "good enough"

[–] [email protected] 32 points 3 months ago

Or they can get OK ones with this tool, and fix the errors. Might save a lot of time

[–] [email protected] 25 points 3 months ago

Regular old live broadcast closed captioning is pretty much 'good enough' and that is the standard I'm comparing to.

Actual subtitles created ahead of time should be perfect because they have the time to double check.

[–] [email protected] 11 points 3 months ago

I have a feeling that if you care enough about subtitles you're going to look for good ones, instead of using "ok" ai subs.

[–] [email protected] 15 points 3 months ago (1 children)

I'd like to see this fix the most annoying part about subtitles: timing. Find a transcript or any subs on the Internet and have the AI align them with the audio properly.
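
To make the alignment idea concrete, here is a rough sketch of one simple way to estimate a constant timing offset. It assumes you already have subtitle start times and speech start times detected from the audio (for example from a speech recognizer); the function names and sample numbers are made up for illustration, and this is not what VLC or any particular tool does. Every subtitle/speech pair votes for the offset that would line them up, and the most common vote wins.

```python
from collections import Counter


def estimate_offset(sub_starts, speech_starts, resolution=0.1):
    """Guess a constant offset (seconds) to add to subtitle times.

    Every (speech, subtitle) pair votes for the offset that would align
    them, bucketed to `resolution` seconds; the most common vote wins.
    """
    votes = Counter(
        round((speech - sub) / resolution)
        for sub in sub_starts
        for speech in speech_starts
    )
    best_bucket, _ = votes.most_common(1)[0]
    return best_bucket * resolution


def shift(times, offset):
    """Apply the offset to each time, clamping at zero."""
    return [max(0.0, t + offset) for t in times]


# Made-up data: the subtitles run about 2 seconds early.
subtitle_starts = [1.0, 4.5, 9.0, 15.2]
speech_starts = [3.0, 6.5, 11.0, 17.2, 20.0]  # one extra, unsubtitled line
offset = estimate_offset(subtitle_starts, speech_starts)
print(offset, shift(subtitle_starts, offset))
```

This only handles a fixed delay. Subtitles made for a different cut or frame rate drift over time and would need something smarter.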

[–] [email protected] 187 points 3 months ago (7 children)

What’s important is that this is running on your machine locally, offline, without any cloud services. It runs directly inside the executable

YES, thank you JB

[–] [email protected] 148 points 3 months ago (6 children)

This sounds like a great thing for deaf people and just in general, but I don't think AI will ever replace anime fansub makers who have no problem throwing a wall of text on screen for a split second just to explain an obscure untranslatable pun.

[–] [email protected] 58 points 3 months ago

Bless those subbers. I love those walls of text.

[–] [email protected] 31 points 3 months ago

Translator's note: keikaku means plan

[–] [email protected] 23 points 3 months ago

[–] [email protected] 22 points 3 months ago

They are like the * in any Terry Pratchett (GNU) novel, sometimes a funny joke can have a little more spice added to make it even funnier

[–] [email protected] 9 points 3 months ago (1 children)

It's unlikely to even replace good subtitles, fan or not. It's just a nice thing to have for a lot of content though.

[–] [email protected] 11 points 3 months ago* (last edited 3 months ago) (1 children)

I have family members who can't really understand spoken English because it's a bit fast, and can't read English subtitles either because, again, they're too fast for them.

Sometimes you download a movie and all the Estonian subtitles are for an older release and they desynchronize. Sometimes you can barely even find synchronized English subtitles, so even that doesn't work.

This seems like a godsend, honestly.

Funnily enough, of all the streaming services, I'm again going to have to commend Apple TV+ here. Their shit has Estonian subtitles. Netflix, Prime, etc, do not. Meaning if I'm watching with a family member who doesn't understand English well, I'll watch Apple TV+ with a subscription, and everything else is going to be pirated for subtitles. So I don't bother subscribing anymore. We're a tiny country, but for some reason Apple of all companies has chosen to acknowledge us. Meanwhile, I was setting up an Xbox for someone a few years ago, and Estonia just... straight up doesn't exist. I'm not talking about language support - you literally couldn't pick it as your LOCATION.

[–] [email protected] 71 points 3 months ago* (last edited 3 months ago) (1 children)

Now I want some AR glasses that display subtitles above someone's head when they talk à la Cyberpunk that also auto-translates. Of course, it has to be done entirely locally.

[–] [email protected] 20 points 3 months ago (5 children)

I guess we have most of the ingredients to make this happen. Software-wise we're there; hardware-wise I'm still waiting for AR glasses I can replace my normal glasses with (which I wear 24/7 except for sleep). I'd accept having to carry a spare in a charging case so I can swap them out once a day or something, but other than that I want them to be close enough in weight and comfort to my regular glasses and just give me AR features like overlaid GPS, notifications, etc. And indeed, instant translation with subtitles is a function I could see having a massive impact on civilization tbh.

[–] [email protected] 8 points 3 months ago

I believe you can put prescription lenses in most AR glasses out there, but I suppose the battery is a concern..

I'm in the same boat, I gotta wear my glasses 24/7.

[–] [email protected] 49 points 3 months ago* (last edited 3 months ago) (5 children)

As VLC is open source, can we expect this technology to also be available for, say, Jellyfin, so that I can once and for all have subtitles done right?

Edit: I think it's great that vlc has this, but this sounds like something many other apps could benefit from

[–] [email protected] 22 points 3 months ago (4 children)

It's already available for anyone to use. https://github.com/openai/whisper

They're using OpenAI's Whisper model for this: https://code.videolan.org/videolan/vlc/-/merge_requests/5155
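
If you want to try it outside VLC, here is a minimal sketch of turning Whisper-style output into an .srt file. It assumes the segment shape that the openai/whisper Python package returns from `transcribe` (dicts with `start` and `end` in seconds, plus `text`). The helper names `format_timestamp` and `segments_to_srt` are invented for this example, and none of this is taken from VLC's implementation.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = max(0, round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from dicts with 'start', 'end' (seconds) and 'text'."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        cues.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(cues)


# Hand-written sample segments so the sketch runs without the model.
# With the openai-whisper package installed, you could instead use:
#   import whisper
#   segments = whisper.load_model("base").transcribe("video.mp4")["segments"]
sample = [
    {"start": 0.0, "end": 2.5, "text": " Hello there."},
    {"start": 2.5, "end": 5.04, "text": " Nice to meet you."},
]
print(segments_to_srt(sample))
```

Writing the returned string to a file next to the video (same name, `.srt` extension) is usually enough for most players to pick it up.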

[–] [email protected] 12 points 3 months ago* (last edited 3 months ago) (1 children)

I hope it's available for Stash App. I wanna know what these JAV girls are saying.

[–] [email protected] 48 points 3 months ago (1 children)

This might be one of the few times I’ve seen AI being useful and not just slapped on something for marketing purposes.

[–] [email protected] 15 points 3 months ago (1 children)

And not to do evil shit

[–] [email protected] 7 points 3 months ago

But the toppings contain potassium benzoate.

[–] [email protected] 39 points 3 months ago (1 children)

As long as the models are open source, I have no complaints.

[–] [email protected] 32 points 3 months ago

And the data stays local.

[–] [email protected] 27 points 3 months ago (3 children)

And yet they turned down having thumbnails for seeking because it would be too resource intensive. 😐

[–] [email protected] 15 points 3 months ago (2 children)

I mean, it would. For example Jellyfin implements it, but it does so by extracting the pictures ahead of time and saving them. It takes days to do this for my library.

[–] [email protected] 10 points 3 months ago (1 children)

Video decoding is resource intensive. We're used to it, and we have hardware acceleration for some of it, but spewing out something around 52 million pixels every second from a highly compressed data source is not cheap. I'm not sure how the two compare, but small LLMs are not that costly to run if you don't factor in their creation.
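
The 52 million figure checks out if you assume 1080p at 25 frames per second (that is an assumption on my part; the comment doesn't state a resolution or frame rate). A quick back-of-the-envelope sketch, with 4K at 60 fps added for comparison:

```python
def pixels_per_second(width: int, height: int, fps: float) -> float:
    """Raw pixels a decoder has to output each second."""
    return width * height * fps


full_hd_25 = pixels_per_second(1920, 1080, 25)  # assumed 1080p at 25 fps
uhd_60 = pixels_per_second(3840, 2160, 60)      # 4K at 60 fps for comparison

print(f"1080p @ 25 fps: {full_hd_25 / 1e6:.1f} million pixels/s")
print(f"2160p @ 60 fps: {uhd_60 / 1e6:.1f} million pixels/s")
```

This counts output pixels only and says nothing about how expensive each pixel is to decode, which depends on the codec and on hardware acceleration.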

[–] [email protected] 26 points 3 months ago (1 children)

I hope Mozilla can benefit from a good local translation engine that could come out of it as well.

[–] [email protected] 16 points 3 months ago* (last edited 3 months ago) (1 children)

They technically already do with Project Bergamot.

[–] [email protected] 5 points 3 months ago (2 children)

I know they do, but it's lacking so many languages.

[–] [email protected] 23 points 3 months ago (1 children)

The nice thing is, now at least this can be used with live TV from other countries and languages.

Say you want to watch Japanese TV or Korean channels without bothering with downloading, searching for, and syncing subtitles.

[–] [email protected] 13 points 3 months ago (1 children)

I prefer watching Mexican football announcers, and it would be nice to know what they're saying. Though that might actually detract from the experience.

[–] [email protected] 6 points 3 months ago (2 children)

GOOOOOOAAAAAAAAALLLLLLLLLL

[–] [email protected] 19 points 3 months ago (1 children)

Will it be possible to export these AI subs?

[–] [email protected] 8 points 3 months ago

Imagine the possibilities!

[–] [email protected] 19 points 3 months ago (1 children)

Amazing. I can finally find out exactly what that nurse is yelling about while she gets railed by the local basketball team.

[–] [email protected] 12 points 3 months ago (7 children)

The technology is nowhere near being good though. On synthetic tests, on the data it was trained and tweaked on, maybe, I don't know.
I co-run an event where we invite speakers from all over the world, and we tried every way to generate subtitles. All of them perform at the level of YouTube's autogenerated ones. It's better than nothing, but you can't really rely on it.

[–] [email protected] 6 points 3 months ago (1 children)

No, but I think it would be super helpful to synchronize subtitles that are not aligned to the video.

[–] [email protected] 9 points 3 months ago

Haven't watched the video yet, but it makes a lot of sense that you could train an AI using already subtitled movies and their audio. There are times when official subtitles paraphrase the speech to make it easier to read quickly, so I wonder how that would work. There's also just a lot of voice recognition everywhere nowadays, so maybe that's all they need?
