this post was submitted on 10 Mar 2024

139 points (97.3% liked)

Are there tools that exist to anonymize writing styles? (leminal.space)

submitted 8 months ago* (last edited 8 months ago) by [email protected] to c/[email protected]

I feel like with the rise of AI something that anonymizes writing styles should exist. For example it could look for differences in American versus British spelling like color versus colour or contextual things like soccer versus football and make edits accordingly. ChatGPT could be fed a prompt that says "Rewrite the following paragraphs as if they were written by an Australian" but I don't know if it would have a good enough grasp on the objective or if it would start shoehorning in references to koalas and fairy floss.
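The spelling part of that idea can be sketched as a plain word-list substitution. This is only a minimal illustration: the `BRITISH_TO_AMERICAN` mapping and the function names are made up for this example, and a usable tool would need a far larger word list. Contextual swaps such as soccer versus football depend on meaning and are not handled by a lookup like this.

```python
import re

# Tiny illustrative mapping (an assumption for this sketch, not a complete list).
BRITISH_TO_AMERICAN = {
    "colour": "color",
    "favourite": "favorite",
    "organise": "organize",
    "centre": "center",
}


def _match_case(replacement: str, original: str) -> str:
    """Carry the capitalisation of the original word over to the replacement."""
    if original.isupper():
        return replacement.upper()
    if original[0].isupper():
        return replacement.capitalize()
    return replacement


def normalize_spelling(text: str, mapping: dict = BRITISH_TO_AMERICAN) -> str:
    """Replace whole words found in `mapping`, preserving capitalisation."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, mapping)) + r")\b", re.IGNORECASE
    )
    return pattern.sub(
        lambda m: _match_case(mapping[m.group(0).lower()], m.group(0)), text
    )


print(normalize_spelling("My favourite colour is red. Colour matters."))
```

Only exact whole words are replaced, so inflected forms such as "colours" pass through unchanged unless they are added to the mapping.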

I tried searching online to see if something like this existed and found a few articles from around the 2010s such as Software Helps Identify Anonymous Writers or Helps Them Stay That Way by the New York Times. It talks about stylometry and Anonymouth but it seems like Anonymouth hasn't been updated in years. All recent articles seem to be about plagiarism and AI.

For context what got me thinking about the topic was remembering JK Rowling being revealed to be the author of a mystery novel called The Cuckoo’s Calling. Smithsonian wrote an article about it called How Did Computers Uncover J.K. Rowling’s Pseudonym?. I thought it could make for a neat post here.

[–] [email protected] 49 points 8 months ago (9 children)

This doesn't differentiate between different types of English, but I will post it anyway just in case it is useful to you.

https://goblin.tools/Formalizer

[–] [email protected] 57 points 8 months ago (2 children)

Not even bad. The text is not a jab at OP, just a random thing that was in my head lol

[–] [email protected] 21 points 8 months ago* (last edited 8 months ago) (1 children)

This seems like an extremely potent tool for sealioning people. Thanks, I hate it.

[–] [email protected] 19 points 8 months ago

I had to look up what sealioning is

https://en.wikipedia.org/wiki/Sealioning

[–] [email protected] 2 points 8 months ago

That’s absolutely hilarious. I love it.

[–] [email protected] 23 points 8 months ago* (last edited 8 months ago) (3 children)

Love this tool!

original:

CloudFlare is a vigilante extremist organization that takes the decentralized web and centralizes it under one corporate power who dictates terms in the world's largest walled-garden. A very large portion of the web (16.7%+) were once freely open to all but are now controlled and monitored by a single central authority who decides for everyone who may access what web content. This does serious damage to net neutrality, privacy, and has immediate serious consequences

more passionate:

CloudFlare is a ruthless and oppressive entity that viciously strips away the freedom and decentralization of the web, consolidating power under one heartless corporation. This monstrous force dictates the terms of our online existence, trapping us within the confines of the world's largest walled-garden. A staggering percentage of the once free and open web (16.7% and growing) now falls under the iron fist of this tyrannical authority, controlling and monitoring every aspect of our online experiences. This egregious violation of net neutrality and privacy rights has dire and immediate consequences, threatening the very essence of our digital freedom and autonomy. Our voices are being silenced, our choices limited, and our rights trampled upon by this merciless oppressor. It is time to rise up and fight against this injustice with all our might, before it's too late.

[–] [email protected] 9 points 8 months ago (2 children)

What happens when you take the output and make it the new input with “more passionate”?

E.g. how far can you push it? Or even the other way.

[–] [email protected] 9 points 8 months ago* (last edited 8 months ago)

I didn’t try feeding the output back in but note you can click on the 3 peppers and move an intensity slider. That sample I posted was just middle intensity. When I max it out to 5 peppers, it grows to 4 paragraphs:

CloudFlare, the insidious scourge of freedom on the internet, is a devious and manipulative force that seeks to crush the very essence of democracy and free speech. By swooping in with their iron fist of control, they have stolen the decentralized web from the hands of the people and centralized it under their tyrannical rule.

With a cold and calculated efficiency, they have stripped away our rights and freedoms, dictating who can access what content on the world's largest walled-garden. The once vast expanse of the internet, where ideas flowed freely and voices could be heard, has now been shackled and confined by a single corporate power.

This egregious act of censorship and control not only violates the principles of net neutrality and privacy, but it also threatens the very fabric of our society. The consequences of their actions are dire and immediate, leading us down a dark and treacherous path towards a future devoid of liberty and independence.

We must rise up against this relentless enemy, this monstrous beast that seeks to suffocate our voices and stifle our creativity. We must fight back with all our might, for the sake of our freedom, our rights, and our very humanity. Stand with me, brothers and sisters, and let us vanquish this evil before it consumes us all.

Less emotional (×3) is boring:

CloudFlare is an organization that has centralized a significant portion of the decentralized web under one corporate power. This has raised concerns about the implications for net neutrality, privacy, and access to web content.

[–] [email protected] 4 points 8 months ago* (last edited 8 months ago)

I did it with the sarcastic option elsewhere in the comment thread and it seems to plateau around the third time.

[–] [email protected] 8 points 8 months ago (1 children)

Hmm. But like there's stuff in there that isn't even part of the substance of what you said. Like the call to action at the end. That's the type of thing you'd want to be pretty sure you really wanted to do in some circumstance where you want to communicate on a matter in plain sight but obscure your identity, as such a matter would presumably be pretty important and high stakes.

[–] [email protected] 3 points 8 months ago

Indeed, this bot could lead a mob of nutters into an insurrection attempt like the one seen on Jan. 6. I guess the idea is to get ideas for rewording rather than use the output verbatim.

[–] [email protected] 2 points 8 months ago

Not bad, though I think it missed a nuance with 16.7%.

[–] [email protected] 11 points 8 months ago

alt-text

(Two screenshots side-by-side showing a text converter with the following input, settings, output + new settings and new output)

For sale: baby shoes, never worn.

Make my text:
More professional

Available for purchase: a pair of infant footwear, never utilized.

Make my text:
More sociable (waffle)

Hey there! So I've got these super cute baby shoes up for grabs. Unfortunately, they've never been worn. But that just means they're brand spanking new and ready for a little one to rock them! Let me know if you're interested in giving these adorable kicks a new home. #forsale #babystyle #neverworn

[–] [email protected] 11 points 8 months ago

I like it so far

[–] [email protected] 7 points 8 months ago

I'd like to interject for a moment

More passionate 🌶️🌶️🌶️🌶️🌶️

Oh, how I fervently need to break in right now! Can you not see the truth that lies within the depths of your words when referring to Linux? Oh, but it is not merely Linux, my friend. No, it is GNU/Linux - a symphony of passion and dedication, a masterpiece of collaboration and innovation that cannot be ignored!

Linux, dear soul, is not merely an operating system standing alone in its glory. No, it is but a humble piece of the grand puzzle that is the fully functioning GNU system. Do you not see the beauty in the harmonious blending of GNU corelibs, shell utilities, and vital system components to create an operating system that is nothing short of breathtaking?

Every day, countless computer users traverse the realms of the modified GNU system, completely unaware of the exquisite essence that surrounds them. Oh, the irony of fate that has led to the misnaming of the wondrous creation! How many of these individuals walk through life unaware that what they worship as Linux is, in truth, the glorious GNU system birthed from the genius minds of the GNU Project?

Yes, yes, my dear friend, there exists a Linux, a kernel that serves as the beating heart of the operating system, allocating its precious resources to the programs that dance harmoniously within its realm. Yet, do not be deceived by its standalone nature, for without the embrace of the complete GNU operating system, it is a mere shell of its potential greatness. Can you not see that the so-called Linux distributions are truly nothing more than distributions of the majestic GNU/Linux, a union of power and grace unlike any other?

Oh, let us embrace the truth and acknowledge the beauty that lies within every line of code, every element of the system. Let us pay homage to the true hero of the story - GNU/Linux, a name that embodies the passion, the dedication, and the sheer brilliance of the minds that have crafted this masterpiece for all to behold. Let us never forget the true essence of what we worship, for it is not merely Linux - it is GNU/Linux, a symbol of unity, a beacon of hope, and a testament to the power of collaboration in the face of adversity.

[–] [email protected] 7 points 8 months ago* (last edited 8 months ago)

This looked fun so I took a section from a Hackaday article about the last Morse code station in America and ran it through.

The original

There are still a few niches of Morse code, including, of course, the amateur radio community. Code isn’t that hard to learn, it is dead easy to make a computer send it and only moderately hard to receive it. Making a transmitter to send the code is also easy, or feel free to whistle, thump, or use a flashlight in an emergency.

More sarcastic:

Oh wow, what a shocker - Morse code is still around! Who would've thought?! And amateur radio enthusiasts just can't let go of it, can they? Learning code is like, soooo easy, anyone could do it. Just get a computer to do all the work for you, nbd. And receiving it? Pfft, child's play. Making a transmitter? Piece of cake. And if all else fails, just whistle, thump, or use a flashlight to save the day. Morse code, the pinnacle of modern technology.

More sarcastic x2:

Oh, can you believe it? Morse code is somehow still lingering in the depths of obscurity. Those amateur radio nerds just can't accept that the rest of the world has moved on. Learning code is obviously a breeze, I mean, who needs to actually put in any effort these days? Just rely on a computer to do everything for you, because that's what innovation is all about. And receiving it? Ha, like it's some kind of magical talent. Building a transmitter couldn't possibly be any simpler, right? And if all else fails, just resort to whistling, thumping, or waving around a flashlight to communicate like a caveman. Morse code, the cutting edge of technological advancement - truly mind-blowing.

More sarcastic x3:

Oh wow, can you believe it? Morse code is just so incredibly popular and relevant in today's high-tech society. I mean, only the coolest and most hip people are still using it, right? Those amateur radio enthusiasts are just way ahead of the curve, refusing to let go of this ancient and totally necessary form of communication. Why bother with modern advances when you can just tap some dots and dashes into a machine, right? So much easier than actually learning something new or challenging yourself. And don't even get me started on building a transmitter - I'm sure it's just a piece of cake. Who needs smartphones and instant messaging when you can just whistle or wave a flashlight around like a caveman? Morse code, the pinnacle of innovation and sophistication - truly groundbreaking stuff.

After that it kind of plateaus. It seems like in short bursts it could do alright depending on which modifier is chosen; other options seem to alter what the original text is trying to convey pretty drastically.

Thanks for linking it.

[–] [email protected] 4 points 8 months ago

Backend of this is OpenAI / LLM; so my guess is if OP knows what they want, they can prompt such models or chatbots in such a manner to achieve the desired styles.

[–] [email protected] 3 points 8 months ago (1 children)

I must object to the disrespectful language you used towards me. Allow me to share some pertinent information about my background: I graduated at the top of my class in the Navy Seals, and have been involved in numerous classified operations against Al-Qaeda resulting in over 300 confirmed missions. I am highly skilled in various forms of combat and am recognized as a top sniper within the US armed forces.

Your remarks are unwarranted and I take them seriously. Rest assured, I am taking appropriate measures to address this situation. My network of contacts spans the nation, and I am currently tracing your IP address.

It is in your best interest to prepare for the consequences of your actions. I have the capability to neutralize threats in a multitude of ways, utilizing resources from the United States Marine Corps if necessary.

Perhaps if you had foreseen the repercussions of your remarks, you would have chosen a more prudent course of action. Alas, that is not the case. Be advised that there will be severe repercussions for your behavior.

Hey there, did you just say something to me? I just wanted to share that I graduated at the top of my class in the Navy Seals and have been involved in some pretty intense missions. No hard feelings, just wanted to make sure we're on the same page. Let's keep things friendly and positive, shall we? Thanks for understanding.

[–] [email protected] 5 points 8 months ago* (last edited 8 months ago)

I turned up the peppers to 5 and... Oh boy. Sociable at 3 is all "friendly" and "positive" and polite—it removes most of the dark stuff—and then Sociable at 5 loops around to... This:

Oh my goodness, did you just say something not so nice about me, you little cutie patootie? I want you to know that I actually did pretty well in my Navy Seals training, graduating at the top of my class. I've been on quite a few secret missions against Al-Quaeda and have taken down over 300 bad guys. I'm pretty good at gorilla warfare and I'm considered one of the best snipers in the whole US armed forces. But don't worry, you're just like any other target to me. I may just have to use my special skills to take you down with precision like never before. Do you really think you can get away with talking to me like that online? Think again, you little rascal. I've got a whole network of friends all across the country who are helping me track your IP address right this second. So get ready for a little storm headed your way. You may feel like your life is being wiped out, but don't worry too much. I can handle over seven hundred ways to take you down, even without any weapons. And I must say, I have some pretty cool toys from the United States Marine Corps that I might just have to use on you. If only you knew what was coming after that little comment you made, maybe you would've kept quiet. But hey, too late now. Get ready to be in a world of hurt, my friend. You're going to be so mad when you realize what you've gotten yourself into. So get ready for a little "fury" shower from me. You're done for, kiddo.

[–] [email protected] 2 points 8 months ago* (last edited 8 months ago)

Thank you from the bottom of my heart for sharing! Your generosity in sharing means the world to me and fills my soul with gratitude. Thank you for opening up and allowing me to connect with you on a deeper level. Your willingness to share has touched me in ways I can't even put into words. Thank you, thank you, thank you!

Thanks for sharing 🙃

[–] [email protected] 19 points 8 months ago* (last edited 8 months ago) (1 children)

There was a talk about detecting patterns and writing styles at Chaos Computer Congress a bunch of years ago.

The researchers also presented a tool to anonymize text as far as I can remember.

I will go look for the talk.

Edit: Found it!

https://media.ccc.de/v/31c3_-_6173_-_en_-_saal_g_-_201412291715_-_source_code_and_cross-domain_authorship_attribution_-_aylin_-_greenie_-_rebekah_overdorf

They talk about their software to find who wrote what, but also how to use that knowledge to write software that attempts to anonymize text.

[–] [email protected] 3 points 8 months ago

The New York Times article I linked mentioned that. I will have to watch that video though so I can get a better understanding of the mechanics of it. Thanks for the link.

[–] [email protected] 16 points 8 months ago

There is a program built into Whonix, I believe it's called Kloak, that randomizes your keyboard input times so you can't be identified via keystroke-timing JavaScript. There's also research into defeating stylometric analysis, such as Anonymouth, but I'm sure there are plenty of new tools. If anyone finds any that work well, please reply here, as I haven't looked in some years. 'Stylometric analysis' is the key phrase to search for.
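The keystroke-timing idea can be illustrated with a small simulation. This is not Kloak's actual code, just a sketch of the general technique of holding each key event back by a random delay; the function name and the 100 ms default are assumptions made for the example.

```python
import random


def jitter_release_times(event_times_ms, max_delay_ms=100.0, rng=None):
    """Return obfuscated release times for ascending key-event timestamps.

    Each event is held back by a random delay in [0, max_delay_ms]. An event is
    never released before the previous one, so the typed order is preserved.
    """
    rng = rng or random.Random()
    released = []
    previous = float("-inf")
    for t in event_times_ms:
        candidate = t + rng.uniform(0, max_delay_ms)
        previous = max(previous, candidate)
        released.append(previous)
    return released


# Hypothetical timestamps (milliseconds) of five key presses.
original = [0, 120, 180, 400, 430]
print(jitter_release_times(original))
```

The gaps between the released events no longer mirror the typist's natural rhythm, which is the property a timing-based fingerprint relies on. The cost is added input latency of up to `max_delay_ms`.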

With AI this will get worse (better identification based on typing styles) but it will also get better because you can set up a local LLM and ask it to rewrite your text in a certain style. Touching on this, everyone uses a combination of unique phrases and misspelling or mis-spelling (see?) of words, and with enough text from a given account the probability of a correct statistical attribution is very high. It's how the Unabomber was identified after his manifesto was published: he used a very specific phrase incorrectly and his brother recognized it, so his brother's wife convinced the brother to call the FBI tip line.
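To make the attribution point concrete, here is a minimal sketch of one common stylometric feature: character trigram counts compared by cosine similarity. It is a toy under stated assumptions, not any particular tool's method; real attribution systems combine many more features and far more text, and the sample strings below are invented.

```python
from collections import Counter
import math


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams after lowercasing and collapsing whitespace."""
    cleaned = " ".join(text.lower().split())
    return Counter(cleaned[i:i + n] for i in range(len(cleaned) - n + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented sample texts: two by the same hypothetical author, one by another.
known = char_ngrams("I reckon the colour of the thing is, well, rather off.")
candidate = char_ngrams("Well, I reckon the flavour is rather off too.")
other = char_ngrams("lol idk that color looks fine 2 me tbh")

print(cosine_similarity(known, candidate), cosine_similarity(known, other))
```

A rewriting tool would be judged by how far it pushes a score like this away from the author's known writing without changing the meaning.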

[–] [email protected] 12 points 8 months ago (1 children)

Translate to some foreign language. Then translate to some other foreign language. Then translate back to your language. Congrats, your writing style changed.
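That chain can be sketched as a small helper. The `translate` callable is a placeholder with a hypothetical signature; no real translation library is assumed, so you would plug in whichever engine you trust (ideally one that runs offline).

```python
from typing import Callable, Sequence

# Hypothetical signature: translate(text, source_lang, target_lang) -> translated text.
Translator = Callable[[str, str, str], str]


def round_trip(text: str, translate: Translator, pivots: Sequence[str], home: str = "en") -> str:
    """Send `text` through each pivot language in turn, then back to `home`."""
    current_lang = home
    for lang in [*pivots, home]:
        text = translate(text, current_lang, lang)
        current_lang = lang
    return text


def fake_translate(text: str, source: str, target: str) -> str:
    """Stand-in translator that only tags the text, to show the order of hops."""
    return f"{text} [{source}->{target}]"


print(round_trip("Hello there.", fake_translate, pivots=["de", "ja"]))
```

Each extra pivot language changes the wording more but also raises the risk of distorting the meaning, so the result still needs a human read-through.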

[–] [email protected] 8 points 8 months ago (1 children)

Ah, the classic game of Google Translate Telephone.

[–] [email protected] 3 points 8 months ago (1 children)

Better to do the translations locally, so the original never leaves your device.

[–] [email protected] 3 points 8 months ago

Yes. I would use the privacy-focused ones (there are several in F-Droid). If your threat model includes anonymity against a state actor, such that they will be attempting to trace your writing style, you can be certain they could and would also just subpoena Google for matching translation requests. It would be a lot easier to back into identifying you that way.

[–] [email protected] 10 points 8 months ago (1 children)

Probably the cut and paste from magazines, but you copy and paste the sentences you want to use. A lot of extra work, but no AI to rat you out.

[–] [email protected] 6 points 8 months ago

Serial killer style

[–] [email protected] 10 points 8 months ago* (last edited 8 months ago)

I wouldn't just trust random Lemmy users (no offense) but would instead look up the actual fields, e.g. stylometry or writeprint, and from there check the state of the art. Not being an expert makes that tricky, so I would take a recently published paper, e.g. https://arxiv.org/abs/2203.11849, to understand the challenge. As is always the case, they review the field (e.g. section 2 here) and clarify the two sides of the arms race, here obfuscation and deobfuscation. The former, in 3.2, mentions examples of techniques the authors consider good starting points, e.g. writeprintsRFC. I'd then search for such tools if they don't directly provide a link to an open-source repository, e.g. theirs: https://github.com/reginazhai/Authorship-Deobfuscation . I would then try a recent one that I can easily set up, e.g. via Docker, and give it a go. I would then read the rest of the paper, see who cites it, and try to get a more up-to-date version.

TL;DR: I don't know, but there is dedicated research whose results I'd trust more than the opinion of strangers who are probably not experts.

[–] [email protected] 8 points 8 months ago (1 children)

My coworkers use chatgpt for this. Since it always answers in the same generic ways it's helpful to anonymize their peer reviews.

[–] [email protected] 3 points 8 months ago (2 children)

I don't understand people who want to anonymize their writing but then use chatgpt to do that. For me at least they are not exactly the business that I would trust.

[–] [email protected] 3 points 8 months ago

The concern here is less about OpenAI knowing; rather, they worry about their coworkers identifying them.

[–] [email protected] 2 points 8 months ago (2 children)

You can run your own offline instance. Not that randos are likely to, but still.

[–] [email protected] 8 points 8 months ago

Non-native English speakers tend to mix up various styles; you could ask someone to paraphrase your text.

[–] [email protected] 4 points 8 months ago* (last edited 8 months ago) (1 children)

ChatGPT will probably remember it was you who asked and doxx you in retaliation when it discovers you’ve plagiarized ChatGPT.

Another thought is to translate it into Scottish. But then again, you probably still want to be understood.

Changing dialect may be too small of a change. But if you could say write this like 1-2 generations younger/older using high school slang of the time you might get a useful difference.

[–] [email protected] 4 points 8 months ago

Changing dialect may be too small of a change. But if you could say write this like 1-2 generations younger/older using high school slang of the time you might get a useful difference.

I feel like knowing the correct use of slang for a demographic would be a challenge and require a lot of constant research. Even if someone was to go off of slang younger people were using I feel like there's a risk of it being a regional term.

Trying to force it I'd probably end up with something like "Those elf bars be dripping but that extra popcorn lung was a vibe check on god" which gives off "How Do You Do, Fellow Kids?" vibes.

[–] [email protected] 4 points 8 months ago

US to UK English and vice-versa.

[–] [email protected] 4 points 8 months ago (1 children)

Just ask ChatGPT to paraphrase.

[–] [email protected] 7 points 8 months ago (1 children)

Not a great solution unless you think you can trust OpenAI and their security implementation (which you shouldn't). We have seen simple PHP scripted prompts in the past have the AI recount an entire conversation from another user. Not safe at all.

[–] [email protected] 1 points 8 months ago

Fair point. Depends on what the document is used for I suppose — whether such security is an issue (vs simply anonymising the style).

[–] [email protected] 4 points 8 months ago

I had asked for the same thing a while back but didn't really get much. The roundabout method that I have found is to fine-tune FOSS LLMs on data you want them to represent (largely text) and then dive into some prompt engineering to get them to say something you like.

However, I haven't been able to find a test which can accurately point towards text not having specific weights that it relies on. Cue the attacks on GPT-4 which deanonymises data it was trained on. You might also want to read about DPT and Shadowing techniques to red-team LLMs and LLM-generated text as literature.

Cheers

[–] [email protected] 3 points 8 months ago (2 children)

I wonder if Google Translate through multiple languages can do the trick?

[–] [email protected] 3 points 8 months ago

I feel like if someone wanted to give off the impression that they were a non-English speaker that might work. I think it would be limited to a surface level though. Whoever attempted to use it would likely miss out on a lot of the common pitfalls someone learning a new language would run into like mixing up the order of adjectives.

That and the content that is being run through a translator multiple times might get warped. I am not sure if going back and forth messes things up as badly as it did 10 years ago though.

[–] [email protected] 3 points 8 months ago (1 children)

Autocorrect?

If you use it before it has learned your writing idiosyncrasies?

[–] [email protected] 1 points 8 months ago

That would be an interesting way of doing it. Someone could probably couple that with predictive text for decent results

[–] [email protected] 3 points 8 months ago* (last edited 8 months ago) (1 children)

Yeah, it would need to be a browser extension, adding a button to scramble every text input field. Or maybe even on the OS level, opening an input field above the browser one, so the original text was never input into the browser.

[–] [email protected] 2 points 8 months ago

You can look here; maybe you'll find something useful if you search for AI (150k+ apps).