Note: this article was in the works for too long, and I decided to publish the latest draft, despite not being fully satisfied with it, to unclog the pipeline. Enjoy!

We are going to write an app to publish posts

Imagine that you have a requirement to build an API endpoint for an app that allows users to publish posts.
A Post has a title and a body, and can be in either a published or an unpublished state. When a post is published, its publication date should be recorded.

The easy win

Let's assume that we used Phoenix generators to generate all the necessary boilerplate code - mix phx.gen.json Feed Post posts title:string body:text status:text published_at:utc_datetime - and ended up with the following schema:

  schema "posts" do
    field :title, :string
    field :body, :string
    field :published_at, :utc_datetime
    field :status, :string
  end

With this, we have two distinct scenarios to cover:

  • when a Post is created as published
  • when a Post is created as unpublished and then published later on

That's reasonably straightforward logic to implement, but we have a couple of options for how and when to tap into the Post status transition.

For example, we can modify Feed.create_post and Feed.update_post to check the status and conditionally set the published_at field.

def create_post(attrs) do
  attrs = add_published_at(attrs)

  %Post{}
  |> Post.changeset(attrs)
  |> Repo.insert()
end

# Expects string-keyed params, as they arrive from a controller.
defp add_published_at(attrs) do
  if attrs["status"] == "published" do
    Map.put(attrs, "published_at", DateTime.utc_now())
  else
    attrs
  end
end

The update is a bit trickier:

def update_post(%Post{} = post, attrs) do
  attrs = add_published_at(post, attrs)

  post
  |> Post.changeset(attrs)
  |> Repo.update()
end

# Sets published_at only on the unpublished -> published transition.
defp add_published_at(%Post{} = post, attrs) do
  if post.status == "unpublished" and attrs["status"] == "published" do
    Map.put(attrs, "published_at", DateTime.utc_now())
  else
    attrs
  end
end

Nice. We just achieved the goal with minimal effort and can pat ourselves on the back - the job's done!

But is it, actually? And, more importantly, is this solution as simple as it looks at first sight?

As we say in TDD - we have a "green bar", and now it's time to refactor.

Refactoring towards a meaningful model

First, let's define a model as a meaningful approximation of an object, system, or process that serves a specific goal. A model can be good or bad, not in terms of how accurate it is, but in terms of how well it serves its purpose.

Here is a twist - there is a model of our code, and our code itself is a model of something. The model of our code is roughly the following:

On post creation:

  • check if the status attribute of post params is set to published
  • if yes - set published_at to now
  • if not - return params as is
  • convert params into a Changeset
  • insert Post Changeset into a database

On post update:

  • check if the current post status is unpublished and the post status in params is published
  • if yes - set published_at to now
  • if not - return params as is
  • convert params into a Changeset
  • update Post with new Changeset
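The two step lists above can be tried out without a database. Here is a minimal sketch, assuming plain maps stand in for the Post struct and for the request params; the module name PublishedAtSketch, the function names on_create and on_update, and the idea of passing `now` in as an argument are mine and not part of the generated Feed context.

```elixir
defmodule PublishedAtSketch do
  # Hypothetical module: the same checks as add_published_at/1 and
  # add_published_at/2, but on plain maps and with `now` passed in.

  # Creation: stamp published_at only when the params ask for "published".
  def on_create(attrs, now) do
    if attrs["status"] == "published" do
      Map.put(attrs, "published_at", now)
    else
      attrs
    end
  end

  # Update: stamp only on the unpublished -> published transition.
  def on_update(post, attrs, now) do
    if post.status == "unpublished" and attrs["status"] == "published" do
      Map.put(attrs, "published_at", now)
    else
      attrs
    end
  end
end

now = ~U[2024-01-01 12:00:00Z]

# Created as published: the timestamp is added.
IO.inspect(PublishedAtSketch.on_create(%{"status" => "published"}, now))

# Created as unpublished: params come back untouched.
IO.inspect(PublishedAtSketch.on_create(%{"status" => "unpublished"}, now))

# Unpublished post being published: the timestamp is added.
IO.inspect(PublishedAtSketch.on_update(%{status: "unpublished"}, %{"status" => "published"}, now))

# Already published post: params come back untouched.
IO.inspect(PublishedAtSketch.on_update(%{status: "published"}, %{"status" => "published"}, now))

# Atom-keyed params: attrs["status"] is nil, so nothing is stamped.
IO.inspect(PublishedAtSketch.on_create(%{status: "published"}, now))
```

The last call is worth a look: the checks only see string keys, so params built with atom keys pass through silently without a publication date.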

Now it's time to ask: what does our code model?

Here is the fun part - we don't know. In software development, we can write something without knowing its exact purpose. I believe that in plain English, doing something without understanding its meaning is called a "monkey job". Yet this is how most software is written.

There are some good reasons why this is the state of the industry, but we will not get into that today.

What are we modeling?

By reading the code, we can deduce that a Post can be published or not, which is determined by its status. That's about it.

But why do we need more? Let me give you a few very personal reasons.

First, communicating meaning keeps you in sync with your colleagues. At the very minimum, being on the same page is a great feeling.

Second, it gives you a chance to keep your professional satisfaction. The only way to do that is to keep growing, which is impossible if you can't reason clearly about what your code is expected to do solution-wise.

How?

Here are a few key ideas, in my opinion:

  • Source code has a shape and a meaning.
  • Meaning is given to source code by a model that it tries to implement.
  • By controlling the shape, we can communicate meaning.
  • Good source code is code that leads another person to the same conceptual model after reading and understanding it.

For those of you interested in specific technical patterns, good starting points are "Domain Driven Design" and "Clean Code".

For example, the earlier snippet could look like this:

def create_post(attrs) do
  if published?(attrs) do
    attrs
    |> mark_as_published()
    |> insert_record()
  else
    attrs
    |> mark_as_draft()
    |> insert_record()
  end
end
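The snippet above leaves published?, mark_as_published, mark_as_draft and insert_record undefined. Here is one way they could be filled in - a minimal, database-free sketch rather than the definitive implementation. The module name FeedSketch is hypothetical, `now` and `insert` are passed in as arguments so that the example runs without Ecto, and I am assuming that a "draft" is simply a post with the unpublished status and no publication date.

```elixir
defmodule FeedSketch do
  # Hypothetical sketch of the helpers the snippet relies on.
  # `insert` stands in for insert_record and would wrap
  # Post.changeset/2 and Repo.insert/1 in the real context.

  def create_post(attrs, now, insert) do
    if published?(attrs) do
      attrs
      |> mark_as_published(now)
      |> insert.()
    else
      attrs
      |> mark_as_draft()
      |> insert.()
    end
  end

  defp published?(attrs), do: attrs["status"] == "published"

  # A published post always carries its publication date.
  defp mark_as_published(attrs, now) do
    attrs
    |> Map.put("status", "published")
    |> Map.put("published_at", now)
  end

  # Assumption: a draft is "unpublished" and never has a publication date,
  # so any published_at supplied by the client is dropped.
  defp mark_as_draft(attrs) do
    attrs
    |> Map.put("status", "unpublished")
    |> Map.delete("published_at")
  end
end

# Usage with a fake insert function that just echoes the params back.
insert = fn attrs -> {:ok, attrs} end
now = ~U[2024-01-01 12:00:00Z]

IO.inspect(FeedSketch.create_post(%{"title" => "Hello", "status" => "published"}, now, insert))
IO.inspect(FeedSketch.create_post(%{"title" => "Hello"}, now, insert))
```

Written this way, each branch names a concept from the model (a published post, a draft) instead of a field manipulation, and each concept has a single place where its rules live.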

Afterword

The meaning of source code exists only within a given model. "This doesn't make sense", when said while reading source code, roughly means "I can't connect the meaning of this part of the code to a part of the model that it tries to implement".

One of the biggest problems of our industry is that code can be almost meaningless and yet serve its purpose well enough to make stakeholders believe that falling velocity is just a temporary circumstance and that they will be able to regain it later.

Depending on the perspective, some call that "job security" and others "technical debt".