Peter Bocan's little corner

ORMs - a cul-de-sac of the software engineering

· Peter Bocan

One of the most contentious topics in software engineering is the use of ORMs (the Object-Relational mappers). To some, they are a blessing, to others they might as well be a plague. Here is my take on this topic, and I will use my anecdotal experience with the library called GORM why I think that ORMs are just obnoxious piece of software.

You are relinquishing too much control

I consider interaction with the database as the most crucial and sensitive work a backend engineer can do. It has a crucial impact on the performance of both, your API and the database server itself. So why do you want to introduce an extra layer of “magic sauce” which sits between you and your database? It’s a good thing to hone your SQL skills every day.

Why give it up to some library, hoping that it is reliable and it won’t break? Do you think that ORM library can not break? Well, it not only can, it did.

It’s reliable, right? Right?!

We updated the minor(!) version of GORM due to a vulnerability found in one of the GORM’s dependencies and then things started to break. With a little bit of investigation we figured out that GORM introduced a naming breaking change in the minor release. This really did not bolster my trust in this ORM library.

Updating dependencies - almost always break something

Another thing we discovered, is a “feature” which completely truncated more than 60% of the query, and when the query was executed it was no longer making sense to the database engine. (Hint: the culprit - the Regex seems to be wrong. Mandatory XKCD linked.)

1type Product struct {
2	gorm.Model
3	Code  string
4	Price uint
5}
6
7var product Product
8print(db.Select("code").Table("public.products p").Find(&product).Error) // err!

Solution to this problem was forcing the table alias as FROM foo AS f instead of the more common, natural FROM foo f. None of this behaviour is documented, of course. You have to find it out yourself the hard way. However, if the query somehow is not malformed, it can cause trouble. Not something you want to debug in production.

Questionable performance

The main functionality of the ORM is the mapping between objects and database tables. Simply put, the ORM library reconciles the column names in the database with the fields in the structs of your models. If a field has been added, it will create a new column for you automatically. If the field was removed, the column will remain in the database, and it is up to you to decide what you want to do with the data in the columns.

Sure, it’s handy for prototyping but at some point you will have to clean up all the zombie tables and columns yourself. This also makes writing queries and keeping track of what is actually in use close to impossible. The moment you rename any column you start accruing technical debt.

Did I mention to you that the reconciliation that GORM does takes good 40-50 seconds on a database of about 200 tables? Now, oddly enough, this behaviour is inconsistent, sometimes it manages to reconcile the tables in 40-50 millisecond, but it happens regularly. Something worth of investigation at some point in the future.

Fighting the ORM

One of the worst parts about ORMs is when you have to do something which your ORM of choice does not support, or the support is limited and/or buggy. Quite a bit of frustration and you are stuck using the tool that is supposed to help you. I guess it’s a part of the trade off.

Consider the following example. The example sets up a connection and all tables are prefixed with a schema name. But this query will fail to execute, because the preconfigured schema is not a part of the Join statement.

 1
 2func main() {
 3   db, err := gorm.Open(sqlite.Open("gorm.db"), &gorm.Config{ 
 4       NamingStrategy: schema.NamingStrategy{  
 5          TablePrefix: "schema_name.", 
 6       },
 7   }
 8
 9	var user models.User
10   db.Find(&user).Joins("left join payment_information pi on pi.user_id = users.id")
11   // ...
12)

Another big fight I had with GORM was when I was doing the tables migrations. My task was to separate tables into different database schemas. The problem was that I had to migrate the tables manually (or introduce a new DB migration, using raw SQL into the database migration tool) because the ORM did not offer any centralised solution to this problem. Another solution would be to specify the name of the schema on every table manually, which also sucked. So I chose running a raw SQL before the reconciliation step occurs.

The ORM-specific “features”

Some ORMs have specific features, which I find dubious at best. For example, GORM offers functionality called “eager loading” which they call Preloading which, as the copied example below shows, executes an unbounded subquery.

 1type User struct {
 2  gorm.Model
 3  Username string
 4  Orders   []Order
 5}
 6
 7type Order struct {
 8  gorm.Model
 9  UserID uint
10  Price  float64
11}
12
13var users []User
14// Preload Orders when find users
15db.Preload("Orders").Find(&users)
16// SELECT * FROM users;
17// SELECT * FROM orders WHERE user_id IN (1,2,3,4);

In this case If you have 10 million users, and on average they have 100 orders, your query may not fit into memory, do the sequential scan, etc. This problem only amplifies the more preloading you do. Also note that this preloading doesn’t really allow you to fetch only specific information.

Of course, in the prototyping stage it doesn’t really matter. Even for small production deployments it doesn’t really matter. This is going to be a scaling issue bottleneck later down the line, where you may end up with hundreds of queries which do preloading and somebody will have to address them one by one.

Another questionable feature of GORM are hooks. I particularly don’t like this feature as it is a very lazy way where to hide your business logic. I can imagine a few months later wondering where the values in specific columns get populated with specific values and/or dummy values. The documentation doesn’t really direct people how or when to use the hooks, or what is their primary intention. So as you may imagine, people will stick all sorts of crazy stuff in it, guaranteed.

Unreadable tangled mess

Admittedly, this is not an issue with the ORM itself, it’s just one extra foot gun baked into ORMs. People who haven’t done much programming will fall into a trap of passing the SQL builder across the codebase, entangle it with business logic and filtering logic which then results in all sorts of problems.

I have seen SQL queries that joined on 33 tables across 4 or 5 different places in the codebase. Why? Because the API of the SQL query builder allows you to abstract the heaps of nastiness into a nice interface and pile more on top of it. If you are not on top of it and strict guidelines how you write your queries, it will become a mess.

The “not all” argument

Don’t get me wrong, I think ORMs are a good prototyping tool, but the longer you use the ORM, the more you will fight it for one reason or another. I think there is definitely an inflection point where the ORM becomes hinderance. However, I don’t know where exactly is that inflection point. I don’t think anybody can tell for sure. But my experience tells me that the sooner you get away from an ORM, the easier your life will be in the long term.

#orm #engineering

Reply to this post by email ↪