TheHobbyist

joined 1 year ago
[–] [email protected] 51 points 3 days ago

This. I will resume my recommendation of Bitwarden.

[–] [email protected] 3 points 3 days ago (1 children)

I didn't say it can't. But I'm not sure how well it is optimized for it. From my initial testing, it queues queries and submits them to the model one after another; I have not seen it batch-compute them, but maybe it's a setup issue on my side. vLLM, on the other hand, is designed specifically for the multi-user, concurrent use case and has multiple optimizations for it.
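
A rough way to see the difference yourself (a sketch only; the ports, model name and endpoint path are assumptions based on the two projects' OpenAI-compatible APIs, so adjust them to your setup) is to fire a few identical requests in parallel and watch how the response times stack up:

# ollama usually listens on 11434, vLLM's server on 8000 -- point the URL and model at whichever backend you're testing
for i in 1 2 3 4; do
  curl -s http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "mistral-nemo", "messages": [{"role": "user", "content": "Write a haiku about GPUs."}]}' \
    -o /dev/null -w "request $i finished in %{time_total}s\n" &
done
wait
# if the backend batches the requests, the four times come out similar; if it queues them, they roughly add up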

[–] [email protected] 22 points 4 days ago* (last edited 3 days ago) (5 children)

I run Mistral-Nemo (12B) and Mistral-Small (22B) on my GPU and they are pretty good. As others have said, GPU memory is one of the most limiting factors. 8B models are decent, 15-25B models are good and 70B+ models are excellent (solely based on my own experience). Go for q4_K quantizations, as they run many times faster than higher-precision quantizations with little quality degradation. They typically come in S (Small), M (Medium) and L (Large) variants; take the largest that fits in your GPU memory. If you go below q4, you may see more severe and noticeable quality degradation.
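
As a back-of-envelope check (my own rule of thumb, not an exact formula): the weights take roughly parameter count x bits per weight / 8, plus a couple of GB for the KV cache and runtime overhead. For a 22B model at ~4.5 bits per weight (roughly where q4_K_M lands):

# rough VRAM estimate: params (in billions) x bits per weight / 8 = GB of weights
awk -v params=22 -v bpw=4.5 'BEGIN { printf "~%.1f GB of weights, plus KV cache and overhead\n", params * bpw / 8 }'
# prints ~12.4 GB, which is why a 22B q4 model wants a 16 GB card, while 12B models fit comfortably in 12 GB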

If you need to serve only one user at a time, ollama + Open WebUI works great. If you need multiple users at the same time, check out vLLM.
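
For reference, a minimal sketch of both setups (the image name and ports are the commonly documented ones, but treat the exact model IDs and flags as assumptions and check each project's docs):

# single user: ollama + Open WebUI (the UI talks to ollama running on the host)
ollama pull mistral-nemo
docker run -d --name open-webui -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  ghcr.io/open-webui/open-webui:main

# multiple concurrent users: vLLM's OpenAI-compatible server with continuous batching
pip install vllm
vllm serve mistralai/Mistral-Nemo-Instruct-2407 --max-model-len 8192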

Edit: I'm simplifying things a lot, but hopefully it is simple and actionable as a starting point. I've also seen great stuff from Gemma2-27B.

Edit2: added links

Edit3: a decent bang-for-buck GPU IMO is the RTX 3060 with 12GB. It can be found on the used market for a decent price and offers a good amount of VRAM and GPU performance for the cost. I would like to recommend AMD GPUs, as they offer much more GPU memory for the price, but they are not all well supported by ROCm and I'm not sure about compatibility with these tools, so perhaps others can chime in.

Edit4: you can also pair Open WebUI with VS Code via the continue.dev extension, giving you a Copilot-style LLM in your editor.
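
If it helps, the continue.dev side is mostly a config entry pointing at your local model; here is a hedged sketch (the config path and exact keys may differ between extension versions, so treat this as a starting point and check their docs):

# point continue.dev at a local ollama model (path and schema assumed from the versions I've used)
cat > ~/.continue/config.json <<'EOF'
{
  "models": [
    { "title": "Mistral Nemo (local)", "provider": "ollama", "model": "mistral-nemo" }
  ]
}
EOF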

[–] [email protected] 10 points 6 days ago* (last edited 6 days ago) (4 children)

I'm wondering: for integrated RAM like Intel did on Lunar Lake, could the same performance be achieved with the latest CAMM modules? The only way going integrated really pays off is with HBM; anything else seems like a bad trade-off.

So either you go HBM, with real bandwidth and latency gains, or CAMM, with decent performance and upgradeable RAM modules. But on-package RAM like Intel's offers neither HBM's performance nor CAMM's modularity.
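
To put rough numbers on that (my own back-of-envelope; the exact parts and speeds are assumptions): peak bandwidth is roughly bus width in bytes times transfer rate, so:

# peak bandwidth ~= (bus width in bits / 8) x MT/s / 1000, in GB/s
awk 'BEGIN {
  printf "128-bit LPDDR5X-8533 (Lunar Lake class):  ~%.0f GB/s\n", 128/8 * 8533 / 1000
  printf "256-bit LPDDR5X-8533 (wider on-package):  ~%.0f GB/s\n", 256/8 * 8533 / 1000
  printf "one 1024-bit HBM2e stack at 3.2 GT/s:     ~%.0f GB/s\n", 1024/8 * 3200 / 1000
}'
# CAMM/LPCAMM2 modules carry the same LPDDR5X class of bandwidth while keeping the RAM replaceable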

[–] [email protected] 16 points 1 week ago

They used PimEyes, nothing new.

Importantly: they do not intend to release the tool, but rather use it as a way to raise awareness.

[–] [email protected] 9 points 1 week ago (1 children)

You mean between the French article and the English comment? :)

[–] [email protected] 2 points 2 weeks ago

Sure, anytime: create a new post and tag me if you need me specifically to have a look. I've used Docker on Synology for years and have gone through major updates; while I'm certainly no expert, I've learned some things which could be helpful.

[–] [email protected] 3 points 2 weeks ago (2 children)

I know what you're talking about; it happens to us all when we're learning something new.

Want to share the details of a specific issue that's blocking you?

[–] [email protected] 1 points 2 weeks ago* (last edited 2 weeks ago) (4 children)

I understand your position. There is a learning curve to containers, but I can assure you that getting the basics down will open up a whole new world of possibilities and also make everything much easier for yourself. The vast majority of people run containers, which make services less brittle because each has its own tailored environment and doesn't depend on the host's libraries and packages. They also bring increased security: services can't easily escape their boundaries, so their potential vulnerabilities are less of an issue than when running the same services on bare metal.

I started on Synology too. There is a website called Marius Hosting which focuses on tutorials for containers on Synology, but his instructions have been updated over the last few years to focus on spinning up containers manually rather than through the UI, which makes it more intimidating than it needs to be for beginners... I'll link it here just as a reference. I'll check whether the Wayback Machine has him showing the easier way and report back if I find something.

Edit: yes here is an original tutorial for Jellyfin (this method still works for me and is still how I use docker lately): https://web.archive.org/web/20210305002024/https://mariushosting.com/how-to-install-jellyfin-on-your-synology-nas/
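
For what it's worth, what that tutorial sets up through the UI boils down to something like this as a plain docker run (the /volume1 paths are examples from my layout, adjust them to your shares):

# Jellyfin in a container on Synology; the web UI ends up on port 8096
docker run -d --name jellyfin \
  -p 8096:8096 \
  -v /volume1/docker/jellyfin/config:/config \
  -v /volume1/video:/data/media:ro \
  --restart unless-stopped \
  jellyfin/jellyfin:latest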

[–] [email protected] 7 points 2 weeks ago* (last edited 2 weeks ago) (6 children)

To answer your question more specifically, most people set up the Pi with Docker, running services which have a front end accessible in the browser. They basically use their browser to navigate to the front end of the service they want to use and administer it that way. For instance, Portainer to manage their Docker containers, Pi-hole for network-wide DNS filtering and ad blocking, or even Jellyfin for their media, which is both the website where they consume the media and an administrator dashboard.

Edit: this complements something like Tailscale, which basically allows you to access these services away from home. They work in conjunction.
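
As a concrete sketch of that pattern (the ports, volumes and image tag are the commonly documented ones, but double-check against each project's docs):

# Portainer: a web UI for managing the containers themselves, reachable at https://<pi-address>:9443
docker run -d --name portainer \
  -p 9443:9443 \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v portainer_data:/data \
  --restart unless-stopped \
  portainer/portainer-ce:latest
# Pi-hole, Jellyfin, etc. follow the same shape: run the container, then administer it from the browser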

[–] [email protected] 9 points 2 weeks ago* (last edited 2 weeks ago) (8 children)

Tailscale is a good option.

Edit: I'm assuming you mean away from home, but if you mean in your local network just use SSH?
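
In case it's useful, the away-from-home setup really is just a couple of commands (the install script URL is Tailscale's documented one; the IP below is an example placeholder):

# on the machine you want to reach (your laptop/phone join the tailnet via their apps)
curl -fsSL https://tailscale.com/install.sh | sh
sudo tailscale up

# note its tailnet address, then SSH to it from anywhere as if you were on the LAN
tailscale ip -4
ssh pi@100.x.y.z   # replace with the address printed above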

[–] [email protected] 34 points 2 weeks ago (9 children)

From what I understand, bsky's architecture seems to allow federation at multiple levels. On one side, individual profiles are actually hosted like small websites of their own, and the app aggregates the content almost like an RSS reader. I do see some profiles which are independently hosted, like Jeff Geerling's, so yes, federation at the profile level seems to work.

And this is really important, because it is one way to prevent your data from being held hostage by the service. Then there is another level of federation. I'm not entirely sure of the terminology here, but there is an aggregator layer, which is quite compute intensive, and I don't know whether another instance of it exists. But functionally speaking, I'm quite impressed by the technical side of bsky. A lot of thought has been put into it.
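
One way to poke at the profile-level federation yourself (a sketch; the domain is a placeholder and the endpoint is the public XRPC method I believe the AppView exposes):

# custom-domain handles are verified via a DNS TXT record pointing at the account's DID
dig +short TXT _atproto.example.com

# ask the public API which DID a handle resolves to (com.atproto.identity.resolveHandle)
curl -s "https://public.api.bsky.app/xrpc/com.atproto.identity.resolveHandle?handle=example.com"
# the DID stays with the account even if the underlying data host changes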

And monetizing it is not the issue in itself; the question is mostly how. That they have some paid features is fine; it's even important that there are ways to monetize it without milking their users' privacy.

Let's hope this works out and becomes sustainable while respecting the users!

 

Hi folks,

I'm seeing that there are multiple services which externalise the "identity provider" task (e.g. log in with Facebook, Google or whatnot).

In my case, I am curious about Tailscale, a VPN service which lets one choose an identity provider/SSO among Google, Microsoft, GitHub, Apple or a custom OIDC provider.

How can I find out what data is actually communicated to the identity provider? Its task should simply be to confirm that I am who I claim to be, nothing more. But I'm guessing there may be some subtleties.

In the case of Tailscale, would the identity provider know where I'm trying to connect? Or more?

Answers and insights much appreciated! The topic does not seem to have much information online.
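
One hands-on thing I've tried elsewhere (a sketch, assuming you can capture the OIDC ID token from the login flow into $ID_TOKEN): the token is a JWT, so its claims are just base64url-encoded JSON and show what the provider asserted about you.

# decode the middle (payload) segment of a JWT; expect claims like iss, aud, sub, email, name, exp
printf '%s' "$ID_TOKEN" | cut -d. -f2 | tr '_-' '/+' | base64 -d 2>/dev/null | jq .
# in the other direction, a standard OIDC provider mostly learns the client_id/redirect_uri of the app
# you sign in to and when you sign in, not what you do inside it afterwards -- but that's the generic
# protocol view, not a statement about Tailscale specifically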

 

Hi folks, I'm looking for a specific YouTube video which I watched around 5 months ago.

The gist of the video is that it compared the transcoding performance of an Intel iGPU used natively versus passed through to a VM. From what I recall there was a significant performance hit, around 50% or so (in terms of transcoding fps). I believe the test was performed with Jellyfin. I don't remember whether it was using XCP-ng, Proxmox or another OS, nor which channel published the video or when; just that I watched it sometime between April and June this year.
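
(If the video never turns up, the comparison itself is straightforward to reproduce: run the same hardware transcode once on the host and once inside the VM with the iGPU passed through, and compare the fps ffmpeg reports. A sketch, assuming Intel QSV via ffmpeg and a sample file of your own:)

# run on bare metal, then inside the VM, and compare the reported speed/fps
ffmpeg -hide_banner -benchmark \
  -hwaccel qsv -c:v h264_qsv -i sample.mkv \
  -c:v hevc_qsv -preset medium -f null -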

Anyone recall or know what video I'm talking about? Possible keywords include: quicksync, passthrough, sriov, iommu, transcoding, iGPU, encoding.

Thank you in advance!

 

Hi y'all,

I am exploring TrueNAS and configuring some ZFS datasets. Since ZFS provides some parameters to fine-tune a dataset to the type of data it stores, I was thinking it would be good to take advantage of them. So I'm here with the simple task of choosing the appropriate "record size".

Initially I thought: well, this is simple. The dataset is meant to store videos, movies and TV shows for a Jellyfin Docker container, so mostly large files, and a recordsize of 1M sounds like a good idea (as suggested in Jim Salter's cheatsheet).
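
For reference, the knob itself is just a dataset property, and it can be set at creation or changed later (dataset names below are examples):

# create the media dataset with a 1M recordsize, or adjust it on an existing one
zfs create -o recordsize=1M tank/media
zfs set recordsize=1M tank/media       # only applies to newly written files
zfs get recordsize,compression tank/media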

Out of curiosity, I ran Wendell's magic command from level1 tech to get a sense for the file size distribution:

find . -type f -print0 | xargs -0 ls -l | awk '{ n=int(log($5)/log(2)); if (n<10) { n=10; } size[n]++ } END { for (i in size) printf("%d %d\n", 2^i, size[i]) }' | sort -n | awk 'function human(x) { x[1]/=1024; if (x[1]>=1024) { x[2]++; human(x) } } { a[1]=$1; a[2]=0; human(a); printf("%3d%s: %6d\n", a[1],substr("kMGTEPYZ",a[2]+1,1),$2) }'

That's when I discovered it was not so simple. The directory is obviously filled with videos, but also with tiny files: subtitles, NFOs and small illustration images, all valuable for Jellyfin's media organization.

That's where I'm at. The way I see it, there are several options:

    1. Let's not overcomplicate it: just run with the default 128K ZFS recordsize and roll with it. It won't be such a big deal.
    2. Let's try to be clever about it: make 2 datasets, one with a recordsize of 4K for the small files and one with a recordsize of 1M for the videos, then select one as the "main" dataset and use symbolic links for each file to the other dataset so that all content is "visible" from within one file structure (rough sketch after this list). I haven't dug much into how I would automate it, and it might not play nicely with the *arr suite? Perhaps overly complicated...
    3. Make all video files MKVs, embed the subtitles, and rename the videos to make NFOs as unnecessary as possible for movies and TV shows (though they will still be useful for private videos, YT downloads, etc.)
    4. Other?
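
For option 2, the dataset side would look something like this (pool/dataset names are made up, and the symlink automation is the part I still haven't worked out):

# one dataset tuned for large sequential video files, one for the tiny sidecar files
zfs create -p -o recordsize=1M tank/media/video
zfs create -p -o recordsize=4K tank/media/sidecar
# then keep videos in tank/media/video and link the sidecar files in, e.g.
# ln -s /tank/media/sidecar/Show/episode.nfo /tank/media/video/Show/episode.nfo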

So what do you think? And how have you personally set it up? I would love to get some feedback, especially if you are also using ZFS and have a video library on a dedicated dataset. Thanks!

Edit: Alright, I found the following post by Jim Salter which goes into more detail on recordsize. It cleared up my misconception: recordsize is not a fixed block size, it's just an upper limit on the size of the chunks data is read and written in, and it can easily be changed at any time (affecting newly written files). So I'll stick with a 1M recordsize and leave it at that despite the many smaller files, because what matters most is streaming the larger files efficiently. Thank you all!
