A Java Crash Course for Ruby Developers, Part 0
I've been working with Neo4j for something like two years now. I started with Ruby and Neo4j.rb 2.3, which used JRuby 1.7.x with Neo4j Embedded 1.9, and learned Ruby, Neo4j, Rails, and the Neo4j.rb gem concurrently. Funny as it might be, I made it pretty far without ever writing much more than an extremely basic Cypher query for over a year. Phillymetal.com is built entirely on this stack but there's very little "graphy" stuff going on. Neo4j Embedded is so fast that working with the database has almost no overhead for simple find/return/traverse operations, so I was mostly exploiting the easy data modeling and schema-free goodness.
Things changed last year when I started contributing heavily to the Neo4j.rb gem, then got even more intense when I ramped up work on my own project. As performance became more of a concern, I started writing more Cypher to limit the number of queries in and out of the database. As it turns out, Ruby seems to be Neo4j.rb's biggest problem, but the fact is still that the REST API with Neo4j 2.1.7 is just not as performant as Java and Neo4j Embedded.
Enter unmanaged extensions and the Java API. Neo4j has a very cool feature: since it is open source and exposes a REST API, they provide an easy way to write your own plugins, "unmanaged extensions," that expose new REST endpoints. Once your data hits that endpoint, you can use the Java API to do all your work, then you return whatever you want back to your application. This gives you the best of both worlds: Cypher when you need it, Java API for the heavy lifting. Without exception, everyone I know who needs serious performance from Neo4j has told me that they fall back to this method.
At work, we're building an ambitious social media analytics platform that relies heavily on Neo4j. It's collecting a lot of data and revealing a lot of interesting connections. The front-end app is a client of the JSON API my team is building and the goal is for it to have the feel of a desktop app. This means it wants a ton of data very fast. Unfortunately, the combination of our query, the amount of data, the REST API, and Neo4j.rb just does not make for the performance we're looking for. Without getting into how much data we're returning, I can say that we've been seeing 1.5s responses when we really need something closer to 400ms.
I took some steps to improve performance but each time, I got smacked down. I patched Neo4j-core's `Query` class with an `unwrapped` method that returns simpler objects, I implemented the amazing GraphAware TimeTree plugin, I optimized the hell out of my query, but we were still having these issues; in fact, the 1.5s response mentioned above is AFTER these optimizations!
After hearing so much about the power of unmanaged extensions, I thought this might be a good opportunity to dive in. Problem was, I had never written a line of Java before. Hell, I had never successfully read a line of Java! Still, I knew what I wanted to do: take the complex Cypher query we're running, rewrite it in Java, return JSON that is usable in the app. To do this, I started looking for resources about Java that were aimed at Rubyists. I figured that with the popularity of both languages, there had to be something out there for Ruby devs looking to learn Java... right? Wrong. Everything I could find was about going from Java to Ruby. Coooool...
I figured I might as well dive into TimeTree to see if I could figure some of it out. I started with the API, since it was all I was really interacting with directly. https://github.com/graphaware/neo4j-timetree/blob/master/src/main/java/com/graphaware/module/timetree/api/TimeTreeApi.java has all the endpoints and figuring out what methods were being called was easy enough. It was also pretty trivial to expose a new endpoint, take a few params, and... make the whole thing crash. A lot.
A few searches and half a book on a plane ride later, I made it to a point that I was able to hack together my own unmanaged extension. In the span of seven days, I was able to write it, extract it from TimeTree, test it, and get it into staging. Our report now runs at around 400ms and I'm going to move more of our app's code into Java ASAP. Victory!
I'm writing this post and what I hope will be a few companion pieces for two reasons.
First, from a Neo4j dev's perspective, I want to illustrate the power of the Java API compared to Cypher. This is not a knock on Cypher in any way, because it is a fantastically expressive, readable, powerful language that I'm always happy to work with, but the superior control you have over your data within Java just can't be overstated.
Second, separate from Java, I want to provide a resource for other people who may be in my shoes: Ruby devs who want or need to learn Java quickly but don't want to read through a book that spends a significant amount of time on familiar principles of OOP. As I work with Java, it's going to be harder to remember how I felt about it when I got started, so now is the time to do this.
I don't expect there to be more than one or two other posts about this and I doubt that either of them will be very heavy on Neo4j, if they're mentioned at all. Though I may mention it, it's really not much more than a reference for how I figured some things out. If you are an experienced Java developer, please don't blast me if/when I misstate details, since I'm still learning here.
I'm going to start working on the first piece of this immediately, so stay tuned!
2014: Best/Worst Year in Review
While this blog might not really show it, 2014 was one of the best and worst years of my life.
Professionally, it was a landmark year. In March, I left my job with Valiant Technology, the place that got me out of Philadelphia and into NYC. The end of my time with Valiant also marked the end of my time in that part of the IT industry after nearly 10 years. Goodbye, Hyper-V; hello, Heroku. The plan was to get a startup off the ground. It didn't exactly work out that way, but something even better happened: I realized how much I needed to grow by allowing Neo4j.rb to consume me.
Working with Neo4j and parent Neo Technology resulted in trips to Malmö, Sweden and San Francisco. It opened countless doors, helped me make a lot of new friends, and helped me find a sense of both community and professional satisfaction that I've never really known. I wrote a lot of blog posts and made nearly 1500 commits on GitHub. Phillymetal.com relaunched, I participated in a whirlwind hackathon (that I'll link to when I move it to its new home), and just generally made a ton of stuff I'm extremely proud of. I put my ADHD hyper-focus to use and learned, learned, learned.
There was stress — a shitload of it. If not for the patience and support of my amazing girlfriend, who also left her job for the big ???? of freelance life only to end up in a fantastic new role of her own, I'd have been homeless or living on some generous person's couch sometime around September. I had some lucky breaks along the way (thank you, Utpal and Michael!) and just this morning, I accepted a full-time offer that came largely as a result of my obsessive open source work, dedication to not being shitty at things, and a whole lot of luck. This whole rags-to-something-better-than-rags story will make a better post sometime; for now, the point is that I feel as though it was all worth it because I no longer feel like I'm treading water in an industry that doesn't really have a future for me.
That was the good. It was a banner year, one I'll remember as a massive turning point thanks to the new opportunities and irons on the fire as we enter 2015. But if it was to sit on a scale, if I heaped in every single positive event, big and small, every tiny victory and massive conquest, and condensed it into something tangible, I don't think it could outweigh the single moment of devastation that occurred on January 29, 2014: my mom died of cancer. It crept up on us barely 8 months after it reappeared from 15 years of remission. Melanoma. She went into the hospital with stomach pain one night and was gone the next.
I haven't really written much about it and, other than a post on Facebook and some time off of work, I don't think I've even typed it out in months. Seeing it on the screen still kind of feels like a shock. Things changed immediately after it happened: I couldn't handle life in the office, I couldn't deal with other people's schedules or attitudes; hell, I just couldn't deal with other people. A lot of my behavior since then (the decision to switch fields and work by myself from home, the unblinking focus on code, the freezing of every music project, silence on so many public tragedies that normally would have found me engaging in debate): I wonder how much of it was influenced by this. So much of this year was spent trying to process what happened. When the smoke cleared enough for me to see the other side, I think my priorities had changed. I found myself wanting to provide for my father, in whose household my mom was always CEO. I'm so much less interested in drama and dealing with personalities, which made me unwilling to pursue recording bands or playing shows or arguing politics or social issues. I'm filled with regret for things I never said or did. I like to think that I'm more careful with words, more deliberate with actions.
Even now, the world is darker, dirtier. It's as if everything is just slightly out of tune and I feel that dissonance all around me, subtle and unnerving; a constant, creeping vibration, profoundly wrong on a visceral level. For months, I kept waiting for my phone to ring and for someone to say, "Chris, we made a mistake! She's here, she's doing great!" Like a superhero who refuses to stay dead, cause nobody really stays dead, right?
No, the world doesn't work like that, and this is a part of life that every child eventually has to deal with. It's just frustrating that it should happen this year. I feel robbed of the opportunity to show her how things are working out, that her unflinching support and confidence for nearly 30 years wasn't misplaced. It's an incomprehensible feeling, to be so angry at existence itself but not really have anyone or anything specific to blame.
I try to reserve my blog for helpful things, how-tos and records of triumph, or at least cool things I come across as I go, but it didn't seem right to let this year close out without documenting 2014's twisting river of success and failure. I don't mean to reduce the loss of my mother to some sort of apologue, but if I had to try and explain why I wanted to present this publicly, it would be to make a statement about the interconnectedness of all things. More than ever before, I am aware that we are all products of innumerable people's decisions and actions, nature's amorality, random coincidences, good and bad timing, and a whole lot of generous people. No matter how hard we work at things, we can take a step back and think about what led up to the opportunities, taken and missed, that shape our lives. Hard work is important, but so is awareness of the influence of everyone and everything around us; more importantly, we should take inventory of how cruelly fate could work out, how much worse things could always be.
I think of all the ways it could have been worse, of kids robbed of their parents as children, or kids who never know their parents, or kids who know their parents and hate them or are hated by them. My mom would tell me to think back on this year and be proud of my accomplishments, look after my father, love my friends and girlfriend, and do everything to the best of my abilities. She'd tell me to focus not on loss, but on success, the future, and the things I do have instead of the things I don't. It's a process, I guess, one I've never been very good at, but I'm doing my best, and I guess that's all anyone can really ask for.
Dynamically Adding Nested Resource Routes in Rails
I’m working on what feels like a rather large project using the Neo4j.rb gem (which recently had its 3.0 release!). One feature of this project allows users to share different types of events with other users. Access to an endpoint in the API is based on whether a given user has a relationship to the target and, if so, some properties of that relationship. So, for instance, a User who has a direct relationship to an Event with the right score may see some restricted properties; a User who is related to an object that is related to that Event will see some limited properties; a User with no relationship whatsoever will not even be able to get to the page.
One of the nice things about the way this is set up is that all of the relationships share properties and some behavior, so it’s begging to be abstracted out into a module that I can test once and share with my resources. It also means that if I’m not careful, I’ll have to duplicate a lot of basic setup code: routes, controllers, etc. I want to do this:
class Api::V1::EventsController < ApplicationController
  has_users_route

  # normal methods
end
…and have it automatically add a `:users` resource under `event`, so I can do `/api/v1/events/:event_id/users/:user_id` and it will route to the `UserSecurity` controller. This will prevent me from having to do this:
namespace :api do
  namespace :v1 do
    resources :events do
      resources :users, controller: 'user_security'
    end

    resources :bands do
      resources :users, controller: 'user_security'
    end
    # repeat about 15 times
  end
end
The question, then, is… how the hell do I do this? I did some research and found a lot of information on using Engines to add routes to apps, but this didn’t feel quite right. Someone on StackOverflow pointed me to an answer. It was different from what I had in mind in that it was registering the new resources directly from `routes.rb`. At first, I didn’t think it was going to fit. I found a way to make it use a method in my controller, celebrated, began writing tests and this post… and realized that it was wiping out all the existing routes and just adding the new ones! Womp womp.
So I abandoned the idea of calling methods from controllers and I’m sticking it in `routes.rb`. This is smarter because (as we already covered) this really is a routing issue. By calling a method from the routes file, I can very easily manage which resources are taking advantage of this feature.
With all that said, my `UserAuthorization` module ended up looking like this:
module UserAuthorization
  extend ActiveSupport::Concern

  module ClassMethods
    def register_new_resource(controller_name)
      MyApp::Application.routes.draw do
        puts "Adding #{controller_name}"
        namespace :api do
          namespace :v1 do
            resources controller_name.to_sym do
              resources :users, controller: 'user_security', param: :given_id
            end
          end
        end
      end
    end
  end
end
`routes.rb` looks like this:
['events', 'bands', 'and so on'].each { |resource| ApplicationController.register_new_resource(resource) }
Calling `Rails.application.routes.named_routes.helpers` from the Rails console showed that my new resources were added. Victory! My request specs also suddenly changed and showed that my endpoints had come to life. There was a new problem, though, in the form of `params`.
It’s like this: since UserSecurityController is receiving data from any number of endpoints, I have a mystery resource and mystery param ID: `/api/v1/mystery_resource/:mystery_id/users/:id`. My controller actions need an easy way to get access to each of those and find the appropriate models the user is trying to load.
I started by trying to use the `param` option in the routes like this:
resources controller_name.to_sym, param: :target_id do
  resources :users, controller: 'user_security', param: :given_id
end
All that did was give me `params[:mystery_resource_target_id]` and `params[:given_id]`. The mystery resource, the target the user is trying to modify, was still unidentified; it just had `target_id` appended to it. Some more searches indicated that it might not be possible to change this, so I went in the other direction: if I can’t change the param’s key, I can figure out the path taken that ended up at the controller and set the param accordingly. While I was at it, I added a method to help me find the model that is responsible for the target so I can do things like `target_model.where(whatever)`.
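The path-based lookup described above can be tried outside Rails. What follows is only a sketch in plain Ruby with no ActiveSupport: `naive_singularize` is a hypothetical stand-in for `String#singularize` that just strips a trailing "s", the helper names are made up, and the `params` hash is invented for illustration.

```ruby
# Hypothetical stand-in for ActiveSupport's String#singularize.
# It only handles plain trailing-"s" plurals such as "events" or "bands".
def naive_singularize(word)
  word.end_with?('s') ? word[0...-1] : word
end

# "/api/v1/events/42/users/7".split('/') gives
# ["", "api", "v1", "events", "42", "users", "7"], so index 3 is the resource.
def root_resource_for(path)
  naive_singularize(path.split('/')[3])
end

# Builds the key Rails uses for the parent resource's ID, e.g. :event_id.
def target_param_key(path)
  "#{root_resource_for(path)}_id".to_sym
end

# Made-up params, shaped like what a nested route would provide.
params = { event_id: '42', given_id: '7' }
path = '/api/v1/events/42/users/7'

puts root_resource_for(path)         # event
puts params[target_param_key(path)]  # 42
```

The fixed index only holds as long as every resource sits exactly two namespaces deep, which is the case for the `/api/v1/...` routes in this post.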
Here’s the resultant class.
class Api::V1::UserSecurityController < ApplicationController
  before_action :authenticate_user!
  before_action :target_id

  private

  attr_reader :root_resource

  def target_id
    @target_id ||= get_target_id
  end

  def get_target_id
    @root_resource = request.fullpath.split('/')[3].singularize
    params["#{root_resource}_id".to_sym]
  end

  def target_model
    @target_model ||= root_resource.capitalize.constantize
  end

  def given_id
    params[:given_id]
  end
end
I don’t love how I had to do that but it gets the job done and I can and will always refactor. There’s still a lot to do but this is a start. Hope it gives you some ideas.
How I Refactor
You can find a gist of this at https://gist.github.com/subvertallchris/1c6397ea7d66be0c0aab.
/u/zaclacgit on Reddit posted a topic the other day asking for thoughts on his exercise of recreating some basic enumerable methods. I gave him some tips on refactoring one method in particular and a few days later, he asked me to elaborate. I thought the easiest way might be to go through each step of the refactor and I’d get a nice blog post out of it in the process.
To use this, start by commenting out each `my_inject` definition except the first. As you encounter new ones, uncomment them. Save this file as `refactor.rb` in the folder of your choice, ensure you have the rspec gem installed and run `rspec refactor.rb` from CLI to execute.
When you can look at your code and see repeating patterns, work to reduce all the unique parts of the repeating code. Reduce everything to variables before you enter your `if` statements. Never repeat yourself if you can avoid it.
Before we write a line of code, we’re going to write specs based off of the simple comparisons you were doing in your version. It makes it much clearer when there’s a problem. I’m going to show it as a comment here but it’s really going to live at the bottom of my file. You can redefine the same method over and over again; Ruby will execute the last one it finds.
public :my_inject, :my_each
require 'rspec'
describe 'inject with' do
  let(:a) { [1, 2, 3, 4] }
  context 'sym' do
    subject { a.my_inject(:+) }
    it { is_expected.to eq a.inject(:+) }
  end
  context 'int and sym' do
    subject { a.my_inject(100, :+) }
    it { is_expected.to eq a.inject(100, :+) }
  end
  context 'block' do
    subject { a.my_inject { |memo, x| memo + x } }
    it { is_expected.to eq a.inject { |memo, x| memo + x } }
  end
  context 'int and block' do
    subject { a.my_inject(100) { |memo, x| memo + x } }
    it { is_expected.to eq a.inject(100) { |memo, x| memo + x } }
  end
end
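On the point that Ruby runs the last definition it finds: a tiny sketch with a made-up `greeting` method shows why one file can hold several versions of `my_inject` at once, with only the final uncommented one taking effect.

```ruby
# Two definitions of the same method in one file: the second replaces the first.
def greeting
  'first definition'
end

def greeting
  'second definition'
end

puts greeting # second definition
```

Running with warnings enabled (`ruby -w`) will mention the redefinition, which is harmless here.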
It’s important that we run the spec as we refactor. Pretty code is nice, working code is better, and working code that’s pretty is best. You should code in that order: make it work first, then worry about what you can do to make it more efficient and easier to read.
Since all the methods rely on your `my_each` method, we’ll include that here at the top.
def my_each
  n = self.length
  i = 0
  while i < n
    yield(self[i])
    i += 1
  end
  self
end
We start here.
def my_inject(initial = nil, sym = nil)
  case
  when initial.is_a?(Symbol)
    sym = initial
    memo = self[0]
    self[1..-1].my_each do |x|
      memo = memo.send(sym, x)
    end
    memo
  when initial && sym
    memo = initial
    self.my_each do |x|
      memo = memo.send(sym, x)
    end
    memo
  when initial && block_given?
    memo = initial
    self.my_each do |x|
      memo = yield(memo, x)
    end
    memo
  when block_given?
    memo = self[0]
    self[1..-1].my_each do |x|
      memo = yield(memo, x)
    end
    memo
  end
end
Each of the `when` statements is very similar. There are subtle differences but we don’t want to focus on that, we want to expose patterns. They all kind of look like this:
when some_condition
  memo = something
  something_or_other.my_each do |x|
    memo = some_combination_of_memo(var1, var2)
  end
  memo
end
To me, the most obvious place to start is with the starting `memo`. In two cases, it’s equal to `initial`, in the other two it’s `self[0]`. Rather than defining that each time, let’s define it before we enter the `when` clauses.
def my_inject(initial = nil, sym = nil)
  memo = if initial.is_a?(Symbol) || (!initial && block_given?)
    self[0]
  else
    initial
  end
  case
  when initial.is_a?(Symbol)
    sym = initial
    self[1..-1].my_each do |x|
      memo = memo.send(sym, x)
    end
    memo
  when initial && sym
    self.my_each do |x|
      memo = memo.send(sym, x)
    end
    memo
  when initial && block_given?
    self.my_each do |x|
      memo = yield(memo, x)
    end
    memo
  when block_given?
    self[1..-1].my_each do |x|
      memo = yield(memo, x)
    end
    memo
  end
end
That’s cool. Each `when` clause has a call to `my_each` after either `self` or `self[1..-1]`. Why is it different each time? Set it once at the beginning.
def my_inject(initial = nil, sym = nil)
  memo = if initial.is_a?(Symbol) || (!initial && block_given?)
    self[0]
  else
    initial
  end
  starting = if initial.is_a?(Symbol) || (!initial && block_given?)
    self[1..-1]
  else
    self
  end
  case
  when initial.is_a?(Symbol)
    sym = initial
    starting.my_each do |x|
      memo = memo.send(sym, x)
    end
    memo
  when initial && sym
    starting.my_each do |x|
      memo = memo.send(sym, x)
    end
    memo
  when initial && block_given?
    starting.my_each do |x|
      memo = yield(memo, x)
    end
    memo
  when block_given?
    starting.my_each do |x|
      memo = yield(memo, x)
    end
    memo
  end
end
We’re getting better but what do we have now? We have duplicate code right at the start! That can be fixed easily. We don’t want to set the `memo` and `starting` variables within the `if` statement, so let’s do it using the output of `if`. We can set multiple variables to elements of an array.
def my_inject(initial = nil, sym = nil)
  memo, starting = if initial.is_a?(Symbol) || (!initial && block_given?)
    [self[0], self[1..-1]]
  else
    [initial, self]
  end
  case
  when initial.is_a?(Symbol)
    sym = initial
    starting.my_each do |x|
      memo = memo.send(sym, x)
    end
    memo
  when initial && sym
    starting.my_each do |x|
      memo = memo.send(sym, x)
    end
    memo
  when initial && block_given?
    starting.my_each do |x|
      memo = yield(memo, x)
    end
    memo
  when block_given?
    starting.my_each do |x|
      memo = yield(memo, x)
    end
    memo
  end
end
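The trick of assigning two variables from the value of an `if` can be seen in isolation. This is a minimal sketch; `split_start` is a hypothetical helper written for illustration and is not part of the gist.

```ruby
# `if` is an expression in Ruby, so its value can be assigned.
# Returning a two-element array from each branch fills two variables at once.
def split_start(list, initial = nil)
  memo, rest = if initial.nil?
                 [list[0], list[1..-1]]
               else
                 [initial, list]
               end
  [memo, rest]
end

p split_start([1, 2, 3, 4])      # [1, [2, 3, 4]]
p split_start([1, 2, 3, 4], 100) # [100, [1, 2, 3, 4]]
```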
Getting there. What pops up now? Hopefully that the last two `when` clauses are identical aside from their conditions. The conditions used to matter when we were setting variables within them. Since that’s not happening anymore, we can clean that up.
def my_inject(initial = nil, sym = nil)
  memo, starting = if initial.is_a?(Symbol) || (!initial && block_given?)
    [self[0], self[1..-1]]
  else
    [initial, self]
  end
  case
  when initial.is_a?(Symbol)
    sym = initial
    starting.my_each do |x|
      memo = memo.send(sym, x)
    end
    memo
  when initial && sym
    starting.my_each do |x|
      memo = memo.send(sym, x)
    end
    memo
  when block_given?
    starting.my_each do |x|
      memo = yield(memo, x)
    end
    memo
  end
end
What about that whole initial/sym thing? If not for that, the first two `when` clauses would be identical, so let’s declare `symbol` before the `case` and then we won’t need both of those.
def my_inject(initial = nil, sym = nil)
  memo, starting = if initial.is_a?(Symbol) || (!initial && block_given?)
    [self[0], self[1..-1]]
  else
    [initial, self]
  end
  symbol = sym || initial
  case
  when initial
    starting.my_each do |x|
      memo = memo.send(symbol, x)
    end
    memo
  when block_given?
    starting.my_each do |x|
      memo = yield(memo, x)
    end
    memo
  end
end
This seems to make sense but we run our specs and… a failure!
1) inject with int and block
   Failure/Error: memo = memo.send(symbol, x)
   TypeError:
     100 is not a symbol
   # ./refactor.rb:249:in `block in my_inject'
   # ./refactor.rb:5:in `my_each'
   # ./refactor.rb:248:in `my_inject'
   # ./refactor.rb:309:in `block (3 levels) in <top (required)>'
   # ./refactor.rb:310:in `block (3 levels) in <top (required)>'
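The error in that trace can be reproduced on its own, away from `my_inject`. A minimal sketch, not taken from the gist; the exact wording of the message varies between Ruby versions, so it is printed rather than assumed.

```ruby
# send takes the method name first, as a Symbol or String.
puts 1.send(:+, 2) # 3

# Passing an Integer where the method name belongs raises TypeError,
# which is what happens when `symbol` ends up holding 100.
begin
  1.send(100, 2)
rescue TypeError => e
  puts "TypeError: #{e.message}"
end
```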
What gives? Well, since our starting value can be a symbol or an integer, we need to approach that line we just added a bit differently. We have been looking for the presence of `initial` but maybe that’s not the best way of handling it. Since rspec is complaining about what we are feeding `send`, let’s focus on that. We need to determine if there is a symbol to send and, if so, what is it? Then we only want to send if there’s a symbol.
Along the way, we’re going to fix that whole `case`/`when` situation. Both of the remaining cases start and end the same way, so let’s wrap that around the conditions and then perform some logic inside.
def my_inject(initial = nil, sym = nil)
  memo, starting = if initial.is_a?(Symbol) || (!initial && block_given?)
    [self[0], self[1..-1]]
  else
    [initial, self]
  end
  symbol = sym ? sym : (initial.is_a?(Symbol) ? initial : nil)
  starting.my_each do |x|
    memo = memo.send(symbol, x) if symbol
    memo = yield(memo, x) if block_given?
  end
  memo
end
We are aaaaalmost done. This new code is better but this line is kind of silly:
symbol = sym ? sym : (initial.is_a?(Symbol) ? initial : nil)
It’s certainly better than this:
if sym
  sym
else
  if initial.is_a?(Symbol)
    initial
  else
    nil
  end
end
But we can just use the `||` operator to handle some of the logic.
def my_inject(initial = nil, sym = nil)
  memo, starting = if initial.is_a?(Symbol) || (!initial && block_given?)
    [self[0], self[1..-1]]
  else
    [initial, self]
  end
  symbol = sym || (initial.is_a?(Symbol) ? initial : nil)
  starting.my_each do |x|
    memo = memo.send(symbol, x) if symbol
    memo = yield(memo, x) if block_given?
  end
  memo
end
It’s still an important line to understand because it makes our line #317 possible
symbol = sym || (initial.is_a?(Symbol) ? initial : nil)
The key is that we’re declaring `symbol`, just like we declare `initial` and `sym` as defaulting to nil in the method. We need `symbol` to exist or this line will fail:
memo = memo.send(symbol, x) if symbol
We use the parentheses to control the flow of logic, with the ternary operator reducing this:
if initial.is_a?(Symbol)
  initial
else
  nil
end
To a simple one-line expression.
expression_returning_boolean ? do_this_if_true : do_this_if_false
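To see how `||` and the ternary combine, the `symbol = ...` line can be lifted into a small method and fed each argument shape. The method name `pick_symbol` is made up for this sketch.

```ruby
# Mirrors the `symbol = ...` line: prefer sym; otherwise use initial only
# when it is a Symbol; otherwise nil.
def pick_symbol(initial = nil, sym = nil)
  sym || (initial.is_a?(Symbol) ? initial : nil)
end

p pick_symbol(:+)      # :+
p pick_symbol(100, :+) # :+
p pick_symbol(100)     # nil
p pick_symbol          # nil
```

Because the result is `nil` whenever no symbol was passed, the `if symbol` guard on the `send` line skips it cleanly in the block-only cases.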
I was testing it and realized that we’ve left something out, haven’t we? We’re testing sym, int and sym, block, int and block, but what if someone gives int and block and sym? They shouldn’t do that, so let’s look for that right at the beginning.
def my_inject(initial = nil, sym = nil)
  raise 'Cannot pass int, sym, and block' if initial && sym && block_given?
  memo, starting = if initial.is_a?(Symbol) || (!initial && block_given?)
    [self[0], self[1..-1]]
  else
    [initial, self]
  end
  symbol = sym || (initial.is_a?(Symbol) ? initial : nil)
  starting.my_each do |x|
    memo = memo.send(symbol, x) if symbol
    memo = yield(memo, x) if block_given?
  end
  memo
end
And we’ll modify our tests to check for that, too. Uncomment the tests below, comment out the tests at the bottom of the page to execute.
public :my_inject, :my_each
require 'rspec'
describe 'inject with' do
  let(:a) { [1, 2, 3, 4] }
  context 'sym' do
    subject { a.my_inject(:+) }
    it { is_expected.to eq a.inject(:+) }
  end
  context 'int and sym' do
    subject { a.my_inject(100, :+) }
    it { is_expected.to eq a.inject(100, :+) }
  end
  context 'block' do
    subject { a.my_inject { |memo, x| memo + x } }
    it { is_expected.to eq a.inject { |memo, x| memo + x } }
  end
  context 'int and block' do
    subject { a.my_inject(100) { |memo, x| memo + x } }
    it { is_expected.to eq a.inject(100) { |memo, x| memo + x } }
  end
  context 'int, block, and sym' do
    it 'raises an error' do
      expect { a.my_inject(100, :+) { |memo, x| memo + x } }.to raise_error
    end
  end
end
And there you have it! From 27 lines down to 12. It could always be a bit more concise but for me, it’s perfectly readable and won’t require a ton of head-scratching if I ever have to revisit it.
Hope this helps. Get in touch if you have any questions.
public :my_inject, :my_each
require 'rspec'
describe 'inject with' do
  let(:a) { [1, 2, 3, 4] }
  context 'sym' do
    subject { a.my_inject(:+) }
    it { is_expected.to eq a.inject(:+) }
  end
  context 'int and sym' do
    subject { a.my_inject(100, :+) }
    it { is_expected.to eq a.inject(100, :+) }
  end
  context 'block' do
    subject { a.my_inject { |memo, x| memo + x } }
    it { is_expected.to eq a.inject { |memo, x| memo + x } }
  end
  context 'int and block' do
    subject { a.my_inject(100) { |memo, x| memo + x } }
    it { is_expected.to eq a.inject(100) { |memo, x| memo + x } }
  end
end
Relationships and Rails OOP: The Missing Link
In a vanilla Rails app, an association is really the byproduct of two objects referring to each other. Object 1 has an ID that gets referenced in a join table that also references Object 2’s ID, or maybe Object 1 has a column in its table that references Object 2’s ID; it doesn’t really matter how they refer to each other, cause the point is that the association doesn’t really exist the way the rows describing Objects 1 or 2 do. Since the association is just that, a literal association of one object to another, and not an object, it doesn’t really need much management, per se, since two objects are either related or they’re not.
In Neo4j, the idea that relationships are objects and are therefore just as real as nodes is the central part of its philosophy. It’s what separates a graph database from all other databases: your relationships are part of the data, your relationships are objects.
I’ve been using Neo4j.rb, an ActiveRecord replacement that uses Neo4j, in Phillymetal and other projects for about a year and a half now and started contributing to the development of v3.0 a little while ago. An issue I’ve always had with the shoehorning of Neo4j into Rails is that Rails isn’t really equipped to handle relationships as objects. I mean, it’s easy to picture how an ActiveRecord model and a Neo4j model parallel one another: what was once a row in the database now becomes a node. Associations in ActiveRecord make sense in principle — your association is a relationship and where once lived an ID referencing another row, you now have a relationship — but this doesn’t really go far enough, does it? In ActiveRecord, an association is a byproduct; in Neo4j, a relationship is truly an object.
This leads us to a philosophical issue when we approach it from an object-oriented perspective. Say I have two models, Student and Lesson. A student has many lessons, a lesson has many students. Which model is responsible for the relationship between student and lesson? In ActiveRecord, it wouldn’t be much of an issue since the association isn’t something that really needs responsibility. There are no properties, there are no methods, there are no instances; each object is essentially responsible for itself and the association is something that happens. But if we are adding properties to a relationship in Neo4j, if we are to assume that it truly is an object, should either model be responsible for that relationship? I’d argue no.
Here’s an example of where it goes wrong:
class User
  include Neo4j::ActiveNode
  has_many :out, :managed_lessons, model_class: Lesson, type: 'manages'

  def create_lesson_rel(lesson)
    if lesson.respond_to?(:managed_by)
      # create relationship
      # add properties to the relationship
      # call methods on lesson to modify the node
    else
      # failure behavior
    end
  end
end
Why is User responsible for doing all of that? Because either it or the lesson object has to, even though Neo4j makes it clear that the relationship is part of our data and is its own model. In response, we need a new type of model, we need a relationship model. It is literally the missing link between nodes.
In Neo4j.rb 3.0, the ActiveRecord replacement for nodes is called ActiveNode. A few days ago, we released v3.0.0.alpha.10, containing ActiveRel, the relationship wrapper. It solves this problem by offering the ability to create relationship models that behave exactly as you would expect, complete with instances that support validations, callbacks, declared properties, and so on. They offer a separation of relationship logic, so models only need to be aware of the associations between nodes.
In practice, a slightly advanced implementation can look like this:
class User
  include Neo4j::ActiveNode
  # store the number of managed objects to improve performance
  property :managed_stats, type: Integer
  has_many :out, :managed_lessons, model_class: Lesson, rel_class: ManagedRel
  has_many :out, :managed_teachers, model_class: Teacher, rel_class: ManagedRel
  has_many :out, :managed_events, model_class: Event, rel_class: ManagedRel
  has_many :out, :managed_objects, model_class: false, rel_class: ManagedRel

  def update_stats
    # `self.` is required: a bare `managed_stats += 1` would assign a local variable
    self.managed_stats = (managed_stats || 0) + 1
    save
  end
end

class ManagedRel
  include Neo4j::ActiveRel
  after_create :update_user_stats
  before_create :set_performance_review
  validate :manageable_object
  from_class User
  to_class :any
  type 'manages'
  property :created_at
  property :updated_at
  property :next_performance_review, type: DateTime

  def update_user_stats
    from_node.update_stats
  end

  def set_performance_review
    # `self.` makes this a property assignment rather than a local variable
    self.next_performance_review = 6.months.from_now
  end

  def manageable_object
    errors.add(:to_node) unless to_node.respond_to?(:managed_by)
  end
end

# elsewhere
rel = ManagedRel.new(from_node: user, to_node: any_node)
if rel.save
  # validation passed, to_node is a manageable object
else
  # something is wrong
end
As you can see, our User model is aware of associations to three different classes plus one, `managed_objects`, that does not specify a class and will return all nodes related by type `manages`. Since they all refer back to the ManagedRel class, they all draw from the same `type`. As long as we create the relationship using an instance of ManagedRel, our validations and callbacks will run. The relationship sets its own timestamps and uses a callback to set its own property, then calls the method on the user to update its stats. We have separated the bulk of our relationship logic and can sleep soundly.
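Stripped of the gem entirely, the idea of a relationship that owns its own validation and callbacks can be sketched in plain Ruby. None of the classes below come from Neo4j.rb; `Manager`, `Lesson`, and `Manages` are hypothetical names, and `save` here only runs the check and the after-create step rather than touching a database.

```ruby
# Plain-Ruby sketch: the relationship is its own object with its own rules.
class Manager
  attr_reader :managed_count

  def initialize
    @managed_count = 0
  end

  def update_stats
    @managed_count += 1
  end
end

class Lesson
  def managed_by; end # responding to this marks the object as manageable
end

class Manages
  attr_reader :from_node, :to_node, :errors

  def initialize(from_node:, to_node:)
    @from_node = from_node
    @to_node = to_node
    @errors = []
  end

  # Validation lives on the relationship, not on either endpoint.
  def valid?
    @errors = []
    @errors << 'to_node is not manageable' unless to_node.respond_to?(:managed_by)
    @errors.empty?
  end

  # Runs validation, then the after-create step on the start node.
  def save
    return false unless valid?
    from_node.update_stats
    true
  end
end

user = Manager.new
puts Manages.new(from_node: user, to_node: Lesson.new).save # true
puts Manages.new(from_node: user, to_node: Object.new).save # false
puts user.managed_count                                     # 1
```

Neither endpoint class knows anything about the rules for being connected, which is the separation the post is arguing for.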