Relationships and Rails OOP: The Missing Link
In a vanilla Rails app, an association is really the byproduct of two objects referring to each other. Object 1 has an ID that gets referenced in a join table that also references Object 2’s ID, or maybe Object 1 has a column in its table that references Object 2’s ID. It doesn’t really matter how they refer to each other, because the point is that the association doesn’t really exist the way the rows describing Objects 1 and 2 do. Since the association is just that, a literal association of one object to another, and not an object itself, it doesn’t need much management, per se: two objects are either related or they’re not.
In Neo4j, the idea that relationships are objects and are therefore just as real as nodes is the central part of its philosophy. It’s what separates a graph database from all other databases: your relationships are part of the data, your relationships are objects.
I’ve been using Neo4j.rb, an ActiveRecord replacement that uses Neo4j, in Phillymetal and other projects for about a year and a half now and started contributing to the development of v3.0 a little while ago. An issue I’ve always had with the shoehorning of Neo4j into Rails is that Rails isn’t really equipped to handle relationships as objects. I mean, it’s easy to picture how an ActiveRecord model and a Neo4j model parallel one another: what was once a row in the database now becomes a node. Associations in ActiveRecord make sense in principle — your association is a relationship and where once lived an ID referencing another row, you now have a relationship — but this doesn’t really go far enough, does it? In ActiveRecord, an association is a byproduct; in Neo4j, a relationship is truly an object.
This leads us to a philosophical issue when we approach it from an object-oriented perspective. Say I have two models, Student and Lesson. A student has many lessons, a lesson has many students. Which model is responsible for the relationship between student and lesson? In ActiveRecord, it wouldn’t be much of an issue since the association isn’t something that really needs responsibility. There are no properties, there are no methods, there are no instances; each object is essentially responsible for itself and the association is something that happens. But if we are adding properties to a relationship in Neo4j, if we are to assume that it truly is an object, should either model be responsible for that relationship? I’d argue no.
Here’s an example of where it goes wrong:
    class User
      include Neo4j::ActiveNode
      has_many :out, :managed_lessons, model_class: Lesson, type: 'manages'

      def create_lesson_rel(lesson)
        if lesson.respond_to?(:managed_by)
          # create relationship
          # add properties to the relationship
          # call methods on lesson to modify the node
        else
          # failure behavior
        end
      end
    end
Why is User responsible for doing all of that? Because either it or the lesson object has to, even though Neo4j makes it clear that the relationship is part of our data and is its own model. In response, we need a new type of model, we need a relationship model. It is literally the missing link between nodes.
In Neo4j.rb 3.0, the ActiveRecord replacement for nodes is called ActiveNode. A few days ago, we released v3.0.0.alpha.10, containing ActiveRel, the relationship wrapper. It solves this problem by offering the ability to create relationship models that behave exactly as you would expect, complete with instances that support validations, callbacks, declared properties, and so on. They offer a separation of relationship logic, so models only need to be aware of the associations between nodes.
In practice, a slightly advanced implementation can look like this:
    class User
      include Neo4j::ActiveNode
      # store the number of managed objects to improve performance
      property :managed_stats, type: Integer

      has_many :out, :managed_lessons,  model_class: Lesson,  rel_class: ManagedRel
      has_many :out, :managed_teachers, model_class: Teacher, rel_class: ManagedRel
      has_many :out, :managed_events,   model_class: Event,   rel_class: ManagedRel
      has_many :out, :managed_objects,  model_class: false,   rel_class: ManagedRel

      def update_stats
        # Explicit self is required: a bare `managed_stats += 1` would only
        # create a local variable. The `|| 0` covers a not-yet-set counter.
        self.managed_stats = (managed_stats || 0) + 1
        save
      end
    end

    class ManagedRel
      include Neo4j::ActiveRel
      after_create :update_user_stats
      before_create :set_performance_review
      validate :manageable_object

      from_class User
      to_class :any
      type 'manages'

      property :created_at
      property :updated_at
      property :next_performance_review, type: DateTime

      def update_user_stats
        from_node.update_stats
      end

      def set_performance_review
        # Explicit self again, so this calls the property setter
        self.next_performance_review = 6.months.from_now
      end

      def manageable_object
        errors.add(:to_node) unless to_node.respond_to?(:managed_by)
      end
    end

    # elsewhere
    rel = ManagedRel.new(from_node: user, to_node: any_node)
    if rel.save
      # validation passed, to_node is a manageable object
    else
      # something is wrong
    end
As you can see, our User model is aware of associations to three different classes plus one, `managed_objects`, that does not specify a class and returns all nodes related by type `manages`. Since they all refer back to the ManagedRel class, they all draw from the same `type`. As long as we create the relationship using an instance of ManagedRel, our validations and callbacks will run. The relationship sets its own timestamps and uses a callback to set its own property, then calls the method on the user to update its stats. We have separated the bulk of our relationship logic and can sleep soundly.
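If you want to poke at that flow without a database, here is a rough plain-Ruby sketch of it. To be clear, none of this is Neo4j.rb code: `SketchUser`, `SketchLesson`, and `SketchManagedRel` are hypothetical stand-ins I made up for illustration, and the ordering inside `save` (validate, then the before-create step, then the write, then the after-create step) is my assumption about the lifecycle, modeled on how ActiveModel-style callbacks usually behave.

```ruby
# Plain-Ruby sketch with no Neo4j dependency. Every class here is a
# hypothetical stand-in, not part of Neo4j.rb.
class SketchUser
  attr_reader :managed_stats

  def initialize
    @managed_stats = 0
  end

  def update_stats
    @managed_stats += 1
  end
end

class SketchLesson
  # The presence of this method is what marks a node as manageable.
  def managed_by; end
end

class SketchManagedRel
  attr_reader :from_node, :to_node, :errors, :created_at, :next_performance_review

  def initialize(from_node:, to_node:)
    @from_node = from_node
    @to_node = to_node
    @errors = []
  end

  # Assumed lifecycle: validate, before-create, write, after-create.
  def save
    return false unless valid?

    set_performance_review    # stands in for before_create
    @created_at = Time.now    # stands in for the actual database write
    from_node.update_stats    # stands in for after_create
    true
  end

  private

  def valid?
    @errors = []
    @errors << :to_node unless to_node.respond_to?(:managed_by)
    @errors.empty?
  end

  def set_performance_review
    # Roughly six months out, without ActiveSupport's 6.months.from_now.
    @next_performance_review = Time.now + (182 * 24 * 60 * 60)
  end
end

user = SketchUser.new
ok  = SketchManagedRel.new(from_node: user, to_node: SketchLesson.new)
bad = SketchManagedRel.new(from_node: user, to_node: Object.new)

ok_saved  = ok.save   # validation passes, the user's counter is bumped
bad_saved = bad.save  # validation fails, nothing else runs
```

The thing to notice is that the user object only exposes `update_stats`; deciding whether the relationship is allowed, and what happens around its creation, lives entirely in the relationship class.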