this post was submitted on 30 Nov 2023
Technology
I agree with you in general, but for Stable Diffusion, "2.0/2.1" was not a direct incremental improvement on "1.5"; it was trained differently and behaves differently. XL is not a simple upgrade from 2.0 either. Since they say this Turbo model doesn't produce images that are as detailed, calling it SDXL 2.0 would be more confusing: it would be worse but faster than base SDXL, and then a more direct improvement to SDXL would presumably be called SDXL 3.0 (but really it's version 2), etc.
It's less like Windows 95->Windows 98 and more like DOS->Windows NT.
That's not to say it all couldn't have been better named. Personally, instead of 'XL' I'd rather they start including the base resolution and something that indicates whether it uses a refiner model, etc.
(Note: I use Stable Diffusion but am not involved with the AI/ML community and don't fully understand the tech. I'm not trying to claim expert knowledge; this is just my interpretation.)
AFAIU SDXL is actually an, erm, genetic descendant of SD1.5, with its architecture expanded, weights transferred from 1.5, and then trained on bigger inputs (512x512 is awfully small in the end). SD2.0 is a completely new model, trained from scratch, and as far as I'm aware no one's actually using it. Also, no one is using the SDXL refiner: if you go to civitai, it's all models with detailer capabilities baked in. What you do see is workflows that generate an image, add some noise at the very end, and repeat the last couple of steps. Using the base SDXL refiner on the output of other SDXL models is sometimes downright comical, because it sometimes has no idea what it's looking at and then produces exquisite surface texture details of the wrong material. Say, a silk keyboard, because it doesn't realise that it's supposed to be ABS and, well, black silk exists.
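For anyone wondering what "add some noise at the very end and repeat the last couple of steps" looks like mechanically, here is a rough toy sketch. To be clear, this is not real Stable Diffusion code: `toy_step` is a made-up stand-in for one sampler step (it just nudges the array toward a known target), and every name and number here is my own assumption for illustration. It only shows the shape of the loop.

```python
import numpy as np

def toy_step(x, target, strength=0.5):
    # Hypothetical stand-in for one sampler/denoiser step: a real model
    # predicts the noise to remove; here we simply move x halfway toward
    # a known "clean" target so the loop structure is visible.
    return x + strength * (target - x)

def run_steps(x, target, n_steps):
    # Apply the toy step n_steps times in a row.
    for _ in range(n_steps):
        x = toy_step(x, target)
    return x

def renoise_and_redo(image, target, redo_last, noise_scale, rng):
    # Add a small amount of fresh Gaussian noise to a finished image,
    # then rerun only the last `redo_last` steps on the noisy version.
    noisy = image + noise_scale * rng.standard_normal(image.shape)
    refined = run_steps(noisy, target, redo_last)
    return noisy, refined

rng = np.random.default_rng(0)
target = np.zeros((8, 8))                      # pretend "clean" image
first_pass = run_steps(rng.standard_normal(target.shape), target, 20)
noisy, refined = renoise_and_redo(
    first_pass, target, redo_last=3, noise_scale=0.1, rng=rng
)
```

In an actual workflow the second pass would presumably go through the same model (or a different checkpoint) rather than a toy function, and the amount of noise decides how much the image is allowed to change.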
Yeah I got some good replies to my comment explaining it. Makes more sense now.