this post was submitted on 27 Dec 2023
96 points (68.2% liked)

Technology


I often find myself explaining the same things in real life and online, so I recently started writing technical blog posts.

This one is about why it was a mistake to call 1024 bytes a kilobyte. It's about a 20-minute read, so thank you very much in advance if you find the time to read it.

Feedback is very much welcome. Thank you.

top 50 comments
[–] [email protected] 87 points 1 year ago* (last edited 1 year ago) (5 children)

A lot of people are replying as if OP asked a question. It's a link to a blog post explaining why a kilobyte is 1000 and not 1024 bytes (exactly as the title says!). OP knows the answer, in fact they know it so well they wrote an extensive post about it.

Thank you for the write up! You should re-check the spelling and grammar as some sections had some troubles. I have a sentence I need to go to the post to get, so let me edit this later!

Edit: the second half of this sentence is a mess: "The factors don’t solely consist of twos, but ten are certainly lot of them." Otherwise nothing jumped out at me but I would reread it just in case!

[–] [email protected] 34 points 1 year ago (17 children)

I also assume that people are answering that way because they thought it was a question.

However, it's also possible that they saw it described as a 20 minute read, and knew that the answer actually takes about 10 seconds to read, and figured that they'd save people 19 minutes and 50 seconds.

[–] [email protected] 9 points 1 year ago

Thank you very much. I'll try to fix that sentence later. I'm not a native speaker, so it's not always obvious to me when a sentence doesn't sound right, even though I run the sentences I'm not sure about through spell checks, the MS Word grammar check, and ChatGPT 🤣

[–] [email protected] 8 points 1 year ago

This is a great example of how a lot of people don't read the posts they are replying to.

This is even more prevalent when arguments break out in the comments: people misunderstand each other, or argue about something one side said and then qualified later in the same comment, because the other side didn't read the whole comment and instead hyperfocused on the one sentence that really garbled their goolies.

I trust that none of these people would have read the article even if they had realised it was there.

P.S. I fully agree with you. It's a great blog post. Good write-up. Very informative. The only quibble I have is that I've always loved the words mebibyte, gibibyte, etc.

[–] [email protected] 52 points 1 year ago (9 children)

Well, it’s because computer science has been around for 60+ years and computers are binary machines. It was natural for everything to be base 2. The most infuriating part is that drive manufacturers arbitrarily started calling 1000 bytes a kilobyte, 1000 kilobytes a megabyte, 1000 megabytes a gigabyte, and 1000 gigabytes a terabyte, when until then 1 TB was 1099511627776 bytes. They did this simply because it made their drives appear 10% bigger. So good ol’ shrinkflation: you could make drives 10% smaller and sell them for the same price.
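
For anyone who wants to check the "10%" figure, here is a minimal Python sketch of the arithmetic. The helper name `binary_excess_percent` is made up for this example; it only compares 1024^n with 1000^n and says nothing about what any particular manufacturer does.

```python
# Compare the decimal (1000^n) and binary (1024^n) reading of each prefix.
PREFIXES = ["kilo", "mega", "giga", "tera"]

def binary_excess_percent(power: int) -> float:
    """Percentage by which 1024**power exceeds 1000**power."""
    return (1024**power / 1000**power - 1) * 100

for power, name in enumerate(PREFIXES, start=1):
    print(
        f"{name:>4}: 1000^{power} = {1000**power:>17,}  "
        f"1024^{power} = {1024**power:>17,}  "
        f"gap = {binary_excess_percent(power):.1f}%"
    )
```

The gap grows with each prefix step: about 2.4% at kilo and about 10% at tera, which is where the figure in the comment comes from.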

[–] [email protected] 43 points 1 year ago* (last edited 1 year ago) (4 children)

I genuinely don't understand your disdain for using base 2 on something that calculates in base 2. Do you know how counting works in binary? Every byte is made up of 8 bits, and goes from 0000 0000 to 1111 1111, or 0-255. When converted to larger scales, 1024 bytes is a clean mathematical derivation in base 2, while 1000 is not a power of 2. Your pedantry seems to hinge on the use of the prefix, right? I think 1024 is a better representation of kilo- in base 2, because a kilo- can be directly translated up to exabytes and down to nybbles, while "1000" in base 2 is extremely difficult. The point of metric is specifically to facilitate easy measuring, right? So measuring in the units that the computer uses makes perfect sense. It's like me saying that a kilogram should be measured in base 60, because that was the original number system.

[–] [email protected] 28 points 1 year ago (11 children)

Stop blaming drive manufacturers

The most scientific shill I've seen in a while

CC BY-NC-SA 4.0

[–] [email protected] 28 points 1 year ago (5 children)

I was confused when I just read the headline. Should be "Why I (that would be you not me) think a kilobyte should be 1000 instead of 1024". Unpopular opinion would be a better sub for it.

[–] [email protected] 28 points 1 year ago (6 children)

Thanks for this article. Unfortunately, you used the word “prefix” when you really meant “unit symbol”. So, “kilo” and “mega” are prefixes, kB and MB are unit symbols. You repeatedly called the latter “prefixes”.

[–] [email protected] 19 points 1 year ago* (last edited 1 year ago) (1 children)

A kilobyte (kB) is 1000 bytes, that's what the prefix kilo means. A kibibyte (KiB) is 1024 bytes (the "bi" in the prefix means base 2 or binary). People often confuse them, but they're similar enough for smaller units, 10^3 ~ 2^10.

Oh and at first, kilobyte was used for both amounts, which is why kibibytes were introduced to fix the confusion, which perhaps was a bit late anyway.
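
To make the two units concrete, here is a small Python sketch. The constant and function names (`KILOBYTE`, `KIBIBYTE`, `to_kb`, `to_kib`) are just illustrative choices, not from any library.

```python
KILOBYTE = 10**3  # kB: SI prefix "kilo", 1000 bytes
KIBIBYTE = 2**10  # KiB: IEC prefix "kibi", 1024 bytes

def to_kb(n_bytes: int) -> float:
    """Size in decimal kilobytes (kB)."""
    return n_bytes / KILOBYTE

def to_kib(n_bytes: int) -> float:
    """Size in binary kibibytes (KiB)."""
    return n_bytes / KIBIBYTE

size = 65_536  # example value: 2**16 bytes
print(f"{size} bytes = {to_kb(size):.3f} kB = {to_kib(size):.3f} KiB")
```

The same byte count comes out as 65.536 kB or 64 KiB, so the number alone is ambiguous unless the unit symbol is written out.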

[–] [email protected] 18 points 1 year ago* (last edited 1 year ago) (1 children)

Because a kilo is 1000. That's why you have the kibi, mebi, and gibi binary prefixes for those times when powers of 2 such as 1024 matter.

[–] [email protected] 10 points 1 year ago

I know, that's what the post is about 😉

[–] [email protected] 16 points 1 year ago (4 children)

I know it's already been explained but here is a visualization of why.

1 2 4 8 16 32 64 128 256 512 1024
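
The same doubling sequence can be sketched in a few lines of Python (the variable names are only for illustration). It also shows how 1000 and 1024 look when written in binary.

```python
# Powers of two from 2**0 up to 2**10.
powers = [2**n for n in range(11)]
print(" ".join(str(p) for p in powers))

# 1024 is a single set bit in binary; 1000 is not.
print(f"1024 in binary: {1024:b}")
print(f"1000 in binary: {1000:b}")
```

1000 sits between 2^9 = 512 and 2^10 = 1024, which is why 1024 was the nearest "round" binary number to a thousand.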

[–] [email protected] 11 points 1 year ago

Kilo = 1000

Byte = Byte

Kilobyte = 1000 bytes

Kibibyte = 1024 bytes

[–] [email protected] 10 points 1 year ago (4 children)

It's a scam by HDD makers to sell less storage for more money.

[–] [email protected] 10 points 1 year ago (2 children)

Because SI prefixes are always powers of the base. Base 10 is the most common, but that's more human psychology than math.

[–] [email protected] 9 points 1 year ago (1 children)

I suggest considering this from a linguistic perspective rather than a technical perspective.

For years (decades, even), KB, MB, GB, etc. were broadly used to mean 2^10, 2^20, 2^30, etc. Throughout the 80s and 90s, the only place you would likely see base-10 units was in marketing materials, such as those for storage media and modems. Mac OS exclusively used base-2 definitions well into the 21st century. Windows, as noted in the article, still does. Many Unix/POSIX tools do, as well, and this is unlikely to change.

I will spare you my full rant on the evils of linguistic prescriptivism. Suffice it to say that I am a born-again descriptivist, fully recovered from my past affliction.

From a descriptivist perspective, the only accurate way to define kilobyte, megabyte, etc. is to say that there are two common usages. This is what you will see if you look up the words in any decent dictionary. e.g.:

I don't recall ever seeing KiB/MiB/etc. in the 90s, although Wikipedia tells me they "were defined in 1999 by the International Electrotechnical Commission (IEC), in the IEC 60027-2 standard".

While I wholeheartedly agree with the goal of eliminating ambiguity, I am frustrated with the half-measure of introducing unambiguous terms on one side (KiB, MiB, etc.) while failing to do the same on the other. The introduction of new terms has no bearing on the common usage of old terms. The correct thing to have done would have been to introduce two new unambiguous terms, with the goal of retiring KB/MB/etc. from common usage entirely. If we had KiB and KeB, there'd be no ambiguity. KB will always have ambiguity because that's language, baby! regardless of any prescriptivist's opinion on the matter.

Sadly, even that would do nothing to solve the use of common single-letter abbreviations. For example, Linux's ls -l -h command will return sizes like 1K, 1M, 1G, referring to the base-2 definitions. Only if you specify the non-default --si flag will you receive base-10 values (again with just the first letter!). Many other standard tools have no such options and will exclusively use base-2 numbers.
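
To illustrate the single-letter problem, here is a rough Python sketch of a formatter with a base-2 default and a base-10 switch. `human_size` and its `si` parameter are hypothetical names for this example; this is not how `ls` is implemented and it does not reproduce its exact rounding.

```python
def human_size(n_bytes: int, si: bool = False) -> str:
    """Format a byte count with a single-letter suffix.

    si=False divides by 1024 per step, si=True divides by 1000.
    The suffix letters are identical in both modes, which is the ambiguity.
    """
    base = 1000 if si else 1024
    suffixes = ["", "K", "M", "G", "T"]
    value = float(n_bytes)
    for suffix in suffixes:
        # Stop once the value fits, or when we run out of suffixes.
        if value < base or suffix == suffixes[-1]:
            break
        value /= base
    return f"{value:.1f}{suffix}"

n = 1_000_000_000
print(human_size(n))           # base-2 interpretation
print(human_size(n, si=True))  # base-10 interpretation
```

The same file shows up as roughly 954M in one mode and 1G in the other, with nothing in the output telling you which definition was used.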

[–] [email protected] 9 points 1 year ago

This is stupid af.

[–] [email protected] 8 points 1 year ago (3 children)

This whole mess regularly frustrates me... why can't the units be used consistently?!

The other peeve of mine with this debacle is that drive capacities using SI units do not use the full available address space (since it's binary). Is the difference between 250GB and 256GiB really used effectively for wear-levelling (which only applies to SSDs) or spare sectors?

[–] [email protected] 11 points 1 year ago (6 children)

Power of 2 makes more sense to the computer. 1000 makes more sense to people.

[–] [email protected] 9 points 1 year ago (2 children)

Huh? How does the way a drive's size is measured affect the available address space at all? Drives are broken up into blocks, and each block is addressable. This is independent of whether you measure the capacity in GB or GiB, and it does not change the address or block size. Hell, you can have a block size in binary units and the overall capacity in SI units and it does not matter - that is how it is typically done, with typical block sizes being 512 bytes or 4096 bytes (4 KiB).

Or how does it have anything to do with wear leveling at all? If you buy a 250GB SSD then you will be able to write 250GB to it - it will have some hidden capacity for wear leveling, but that could be 10GB, 20GB, 50GB or any number they want. No relation to unit conversions at all.
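
A quick Python sketch of the numbers being argued about here. It assumes a drive that exposes exactly 250 * 10^9 bytes, which is a simplification for illustration (real drives report their own sector counts), and the variable names are made up.

```python
GB = 10**9   # decimal gigabyte
GIB = 2**30  # binary gibibyte

capacity = 250 * GB  # assumed: exactly 250 GB of user-visible space

# A decimal capacity can still be addressed in binary-sized blocks.
for block_size in (512, 4096):
    blocks, leftover = divmod(capacity, block_size)
    print(f"{block_size}-byte blocks: {blocks:,} whole blocks, {leftover} bytes left over")

# The same capacity expressed in GiB, and the gap up to 256 GiB.
print(f"250 GB = {capacity / GIB:.2f} GiB")
gap = 256 * GIB - capacity
print(f"256 GiB - 250 GB = {gap:,} bytes ({gap / GB:.2f} GB)")
```

The block count is just the capacity divided by the block size, whatever unit the label on the box uses. The roughly 24.9 GB between 250 GB and 256 GiB is only an upper bound on what could be held back if the flash really totals 256 GiB; how much is actually reserved is up to the manufacturer.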
