The best option is to run the models locally. You'll need a good enough GPU - I have an RTX 3060 with 12 GB of VRAM, which is enough to do a lot of local AI work.
I use Ollama, and my favourite model to use with it is Mistral-7b-Instruct. It's a 7-billion-parameter model optimised for instruction following, and it's usable with 4-bit quantisation, so the model takes about 4 GB of storage.
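The 4 GB figure follows from simple arithmetic, which a rough sketch can make concrete. The function name here is made up for illustration, and the calculation counts raw weights only; real quantised files typically carry some extra scaling data and metadata, which is why the download ends up nearer 4 GB than 3.5 GB.

```python
def quantised_size_gb(params, bits_per_weight):
    """Raw weight storage in GB (10^9 bytes), ignoring metadata and overhead."""
    return params * bits_per_weight / 8 / 1e9


# Assumption: a round 7 billion parameters, as described above.
print(quantised_size_gb(7e9, 4))   # 3.5  (4-bit quantised)
print(quantised_size_gb(7e9, 16))  # 14.0 (16-bit weights, too big for a 12 GB card)
```

So on a 12 GB card the 4-bit version leaves plenty of room for context, while unquantised 16-bit weights would not fit at all.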
You can run it from the command line rather than a web interface - run the container for the server, and then something like docker exec -it ollama ollama run mistral, giving a command line interface. The model performs pretty well; not quite as well on some tasks as GPT-4, but also not brain-damaged from attempts to censor it.
By default it keeps a local history, but you can turn that off.
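If you'd rather script it than type into the interactive prompt, the same server also answers HTTP requests. What follows is a minimal sketch, assuming the container publishes Ollama's default port 11434 on localhost and that the model has already been pulled under the name mistral; the helper names build_request and ask are my own, not part of Ollama.

```python
import json
import urllib.request

# Assumption: the Ollama container publishes its default port 11434 on localhost.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="mistral"):
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, model="mistral"):
    """Send the prompt to the server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, something like print(ask("Explain 4-bit quantisation in one sentence.")) should print the model's reply. Setting stream to False makes the server return one JSON object instead of a stream of partial ones, which keeps the parsing simple.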
I think that you are right as to why the publishers picked them specifically to go after in the first place. I don't think they should have done the "emergency library".
That said, the publishers' arguments show they have an anti-library agenda that goes beyond just the emergency library.
The trouble is that the publishers are not just going after them for infinite lend-outs. The publishers are arguing that they shouldn't be allowed to lend out any digital copies of a book they've scanned from a physical copy, even if they lock away the corresponding number of physical copies.
Worse, they got a court to agree with them on that, which is where the appeal comes in.
The publishers want physical copies to be lent out only as physical copies, and for digital copies they want libraries to purchase a subscription for a set number of library patrons and concurrent borrows, specifically for digital lending, and with a finite life. This is all about growing publisher revenue. The publishers are not stopping at saying the number of digital copies lent must be less than or equal to the number of physical copies, and are going after archive.org for their entire digital library programme.