Ask Lemmy
A Fediverse community for open-ended, thought provoking questions
Please don't post about US Politics.
You did what? That first sentence is entirely meaningless to me. Mind explaining some more?
Stable Diffusion is a popular open-source text-to-image AI model that runs offline. You type text, and it makes images. It is like ChatGPT, but where ChatGPT turns text into text, Stable Diffusion turns text into images.
Basically, it is a similar type of neural network, but it takes an image of mathematically random static and processes it in a series of steps that slowly turn the static into an image matching the text prompt.
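To give a feel for that step-by-step denoising idea, here is a deliberately toy sketch in Python/numpy. It is not how Stable Diffusion actually works internally (there is no neural network, no text conditioning, and all the numbers are made up); the fixed `target` array just stands in for whatever the prompt would condition the real model to predict, and each loop iteration nudges the random static a little closer to it.

```python
import numpy as np

# Toy illustration of iterative denoising (NOT the real algorithm):
# start from pure random static and, over a fixed number of steps,
# nudge it toward a target that stands in for the prompted image.
rng = np.random.default_rng(0)
target = np.full((8, 8), 0.5)      # stand-in for "what the prompt describes"
image = rng.normal(size=(8, 8))    # pure random static

steps = 50
for t in range(steps):
    # each step removes a fraction of the remaining "noise"
    image = image + 0.1 * (target - image)

# after enough steps the static has converged near the target
residual = float(np.abs(image - target).max())
```

The real model does the analogous thing with a learned noise predictor instead of a hard-coded target, which is why the same starting static can become completely different images under different prompts.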
If you follow so far: Stable Diffusion is trained on billions of images that each have descriptive captions. That means you can only generate images from text that appeared in the captions of the original training data set. If, say, you want to generate images of yourself in places around the world, you are very unlikely to already be defined in the model unless you are a public figure or celebrity. However, it is possible to add yourself to the AI without retraining the entire neural network. Retraining everything from scratch would basically require owning or renting serious data-center-level hardware that costs around half a million dollars. Instead, you can patch a small extra layer onto the model's neural network so that it learns what you look like and associates that with text.
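The "patch a layer on" trick can be sketched the same toy way. This is the shape of low-rank fine-tuning (the idea behind LoRA-style adapters): the big pretrained weight matrix `W` stays frozen, and you learn only two tiny matrices whose product gets added on top. The sizes below are made up for illustration; the point is the parameter count.

```python
import numpy as np

# Toy sketch of low-rank "patching": instead of retraining the big
# frozen weight matrix W, learn a small low-rank update B @ A that
# is added on top of it.
rng = np.random.default_rng(0)
d = 512                      # layer width (hypothetical)
r = 4                        # rank of the patch

W = rng.normal(size=(d, d))  # frozen pretrained weights
A = rng.normal(size=(r, d))  # small trainable matrix
B = np.zeros((d, r))         # starts at zero, so the patch is a no-op at first

x = rng.normal(size=d)
y = (W + B @ A) @ x          # patched forward pass

full_params = W.size                 # what full retraining would touch
patch_params = A.size + B.size       # what the patch actually trains
```

Here the patch trains 4,096 numbers instead of the 262,144 in the full matrix, which is why this kind of fine-tuning fits on a single consumer GPU while full retraining does not.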
If you get the settings wrong for one of these patched training layers, you can get all kinds of crazy errors, like turning people into abstract art. That was the result of my first test, which took 4 hours. Maybe I'll do better today; I found some faster training tools to try.