this post was submitted on 27 Jul 2024
847 points (97.9% liked)

Programmer Humor

19967 readers
695 users here now

Welcome to Programmer Humor!

This is a place where you can post jokes, memes, humor, etc. related to programming!

For sharing awful code theres also Programming Horror.

Rules

founded 2 years ago
MODERATORS
 
you are viewing a single comment's thread
view the rest of the comments
[–] [email protected] 19 points 5 months ago (3 children)

SQL blows for hierarchical data though.

Want to fetch a page of posts AND their tags in normalized SQL? Either do a left join and repeat all the post values for every tag or do two round-trip queries and manually join them in code.

If you have the tags in a JSON blob on the post object, you just fetch and decide that.

[–] [email protected] 7 points 5 months ago (4 children)

I'm no expert in JSON, but don't you lose the ability to filter it before your application receives it all? If you had a reasonable amount of data then in SQL you can add WHERE clause and cut down what you get back so you could end up processing a lot less data than in your JSON example, even with the duplicated top table data. Plus if you're sensible you can ensure you're not bringing back more fields than you need.

[–] [email protected] 3 points 5 months ago

It's entirely possible to sort and filter inside JSON data in most SQL dialects. You can even add indexes.

[–] [email protected] 2 points 5 months ago (1 children)

In a traditional SQL database, yeah. In various document-oriented (NoSQL) databases, though, you can do that.

[–] [email protected] 5 points 5 months ago (1 children)

Modern relational databases have support for it too including indexes etc. For example postgres.

[–] [email protected] 2 points 5 months ago

Every major SQL database supports json manipulation nowadays. I know MariaDB and MySQL and SQLite at least support it natively.

[–] [email protected] 2 points 5 months ago* (last edited 5 months ago)

If there's commonly used data that would be good for indexing or filtering, you can take a few key values and keep them stored in their own fields.

There are also often functions that can parse structured text like XML or JSON so you can store data in blobs but not actually need to query all the blobs out to a client to use them on the database side and retrieve specific values. Another nice thing about blobs is the data can be somewhat flexible in structure. If i need to add a field to something that is a key/value pair inside a blob, i dont necessarily have to change a bunch of table schemas to get the functionality on the front end that I'm after. Just add a few keys inside the blob.

[–] [email protected] 4 points 5 months ago (1 children)

If you only join on indexed columns and filter it down to a reasonable number of results it's easily fast enough.

For true hierarchical structures there's tricks. Like using an extra Path table, which consists of AncestorId, DescendentId and NumLevel.

If you have this structure:

A -> B -> C

Then you have:

A, A, 0

A, B, 1

A, C, 2

B, B, 0

B, C, 1

C, C, 0

That way you can easily find out all children below a node without any joins in simple queries.

[–] [email protected] 2 points 5 months ago (1 children)

The fact that you'd need to keep this structure in SQL and make sure it's consistent and updated kinda proves my point.

It's also not really relevant to my example, which involves a single level parent-child relationship of completely different models (posts and tags).

[–] [email protected] 1 points 5 months ago (1 children)

I mean in my case it's for an international company where customers use this structure and the depth can basically be limitless. So trying to find the topmost parent of a child or getting all children and their children anywhere inside this structure becomes a performance bottleneck.

If you have a single level I really don't understand the problem. SQL joins aren't slow at all (as long as you don't do anything stupid, or you start joining a table with a billion entries with another table with a billion entries without filtering it down to a smaller data subset).

[–] [email protected] 1 points 5 months ago* (last edited 5 months ago) (1 children)

My point is that SQL works with and returns data as a flat table, which is ill fitting for most websites, which involve many parent-child object relationships. It requires extra queries to fetch one-to-many relationships and postprocessing of the result set to match the parents to the children.

I'm just sad that in the decades that SQL has been around, there hasn't been anything else to replace it. Most NoSQL databases throw out the good (ACID, transactions, indexes) with the bad.

[–] [email protected] 2 points 5 months ago

I really don't see the issue there, you're only outputting highly specific data to a website, not dumping half the database.

Do you mean your typical CRUD structure? Like having a User object (AuthId, email, name, phone, ..), the user has a Location (Country, zip, street, house number, ..), possibly Roles or Permissions, related data and so on?

SQL handles those like a breeze and doesn't care at all about having to resolve the User object to half a dozen other tables (it's just a 1..1 relation, on 1..n, but with a foreign key on the user id it's all indexed anyway). You also don't just grab all this data, join it and throw it to the website (or rather the enduser API), you map the data to objects again (JSON in the end).

What does it matter there if you fetched the data from a NoSQL document or from a relational database?

The only thing SQL is not good at is if you have constantly changing fields. Then JSON in SQL or NoSQL makes more sense as you work with documents. For example if you offer the option to create user forms and save form entries. The rigid structure of SQL wouldn't work for a dynamic use-case like that.

[–] [email protected] 2 points 5 months ago* (last edited 5 months ago)

Either do a left join and repeat all the post values for every tag or do two round-trip queries and manually join them in code.

JSON_ARRAYAGG. You'll get the object all tidied up by database in one trip with no need to manipulate on the receiving client.

I recently tried MariaDB for a project and it was kinda neat, having only really messed with DynamoDB and 2012 era MsSQL. All the modern SQL languages support it, though MariaDB and MySQL don't exactly follow the spec.