Wednesday 27 June 2012

Persisting to Neo4j via Spring Data (or, "Aren't We Persistent?")

Hi gang!

Ok, I'm back with a new post, this time with a post about a couple quirks I ran into while implementing some test cases for Spring Data using Neo4j.  They're not bugs by any stretch; it's just new behaviour to get used to as you venture into the Spring Data world (which I'm loving, by the way).

During my copious amounts of downtime (that should be read while imagining me rolling my eyes so hard that I fall over backwards in my chair), I've been putting together a little playground for me to mess around with and play with Spring Data.

It's definitely evolving and changing as I change things up and try out new ideas, and I fully plan on sharing more about this in future posts.

For now, though, I'm just discussing a couple potential pitfalls newcomers to Spring Data might fall prey to (please pardon the prepositional phrase; I'm sure it won't be the last one).

Background
I've wanted to play with Spring Data a bit more seriously for some time now, so I started a few weeks ago and, I have to say, I'm loving every second of it.  (I'm already a huge Spring fan, and the annotations continue to make my life easier.)

My sandbox goes something like this: After having gone through the docs for Spring Data (especially "Good Relationships"), I thought I'd try out something similar for myself, borrowing the whole "store" concept (as it seems to me to be the best, first choice for implementing a graph database).

Instead of using the whole "movie store" concept, I switched to something a little different to avoid total code reuse (I do borrow some code from the link above but modify it an awful lot).

For any geeks around my age (or older), you will remember a certain computer software retailer called Babbage's.  I have fond memories of begging my parents to go into the store every time we passed one, which wasn't often (at least I don't think it was...).  GameStop Corporation went on to purchase Babbage's (and EB Games, and a bunch of other software retailers), so you're unlikely to see a Babbage's by that name.

With that short trip down memory lane finished (more like memory cul de sac), I decided to model my domain after the concept of a software retailer.  My store will cleverly enough be called Von Neumann's (any CS major and most geeks out there are currently groaning at that joke).

Domain Model
Currently, the domain model consists of the following:



Even looking at the UML diagram above, we can see that it's based on a graph model (can you pick out the entities and/or the relationship(s)?).

Test Cases' Setup
After setting up the relevant project (which I did as a Maven project), I set forth creating some tasks using JUnit.  Before creating the actual test cases, I needed to make sure I had my testing context setup.  I also needed away to ensure that any data being persisted was wiped clean after each run.

Fortunately, instead of having to create such functionality for my project, I learned that Neo4j already has a handy solution!  The ImpermanentGraphDatabase.  This little gem can be found in the Neo4j kernel.  Specifically, I added these lines to my POM (you can see the specific version I'm using, too):


<dependency>
<groupId>org.neo4j</groupId>
<artifactId>neo4j-kernel</artifactId>
<version>1.8.M03</version>
</dependency>


...and then adding the following line to my testing context:

<bean id="graphDBService" class="org.neo4j.test.ImpermanentGraphDatabase" destroy-method="shutdown"/>

And presto!  A suitable testing graph database for my test cases!

(Warning: I am using the latest version that I found worked best for me and is compatible with all my other dependencies.  ImpermanentGraphDatabase as available in earlier versions of Neo4j, as well.)

It should also be noted that I make use of both the Neo4j repositories interfaces AND the Neo4jOperations class for persisting and retrieval.

Another note is that I've made the entire test class @Transactional.

Test Cases Proper
I'll list two of them below and a couple of the quirks I noticed.

Ensuring a Customer Can Make a Purchase
This test case consists of creating a Customer object, a couple Game objects, and making sure that Purchases can be created, persisted and retrieved (along with the associated entities). 


 
 @Test
 public void customerCanMakePurchases()
 {
  // setup our game constants
  final int QTY = 1;
  final String GAME_TITLE = "Space Weasel 3.5";
  final String GAME_TITLE_2 = "The Space Testing Game";
  final String GAME_DESC = "Rodent fun in space!";
  final String GAME_DESC_2 = "Tests in space!";
  final int STOCK_QTY = 10;
  final float PRICE = 59.99f;
  
  // setup our customer constants
  final String FIRST_NAME = "Edgar";
  final String LAST_NAME = "Neubauer";
  
  // create our customer for this test
  Customer customer1 = new Customer();
  
  // set the customer's properties (NOTE: "firstName" is an indexed property in the Customer entity, but "lastName" is not!)
  customer1.setFirstName(FIRST_NAME);
  customer1.setLastName(LAST_NAME);

  // create our games for this test
  Stock game1 = new Game(GAME_TITLE, GAME_DESC, STOCK_QTY, PRICE);
  Stock game2 = new Game(GAME_TITLE_2, GAME_DESC_2, STOCK_QTY + 5, PRICE + 5);

First, do the setup.  (And, for the sake of brevity, I'm leaving out the annotated entities.)

Nothing strange going on here--just creating two games and a single customer.  It is worth noting (for later on) that "firstName" is an indexed property of the Customer entity/node.  This means that it is searchable (also recall that Neo4j's default indexing engine is Lucene).


It had to be done.
The games we've chosen are clearly AAA-title games.  These tests should be interesting.


  // save entities BEFORE saving the relationships!
  template.save(game1);
  template.save(game2);
  template.save(customer1);

  // make those purchases! Support our test economy!
  // (NOTE: "makePurchase" actually uses the "template" parameter to persist the relationship, so no need to do it again)
  Purchase p1 = customer1.makePurchase(template, game1, QTY);  
  Purchase p2 = customer1.makePurchase(template, game2, QTY);

Above, we make sure to persist the 2 games and single customer.  We also do this prior to persisting any relationships.  This is necessary.  In this case, I make use of an instance variable called "template" which is actually an instance of 
Neo4jOperations.  This is one way of accessing the necessary persistence/retrieval functionality we need.

We then create 2 Purchase objects/relationships (Purchase is actually a relationship entity).  It is also worth noting that, instead of using the "template" object to persist the relationships, I've followed the Neo4j tutorial book "Good Relationships" and attempted another way of doing persistence, i.e. by passing the "template" object into the necessary method and having the method (in this case makePurchase) actually do the persisting of the newly-created Purchase.

Again, both "game1" and "game2" need to be persisted prior to persisting any relationships between them.

Still with me?


  // retrieve the customer
  Customer customer1Found = this.customerRepository.findByPropertyValue("firstName", FIRST_NAME);
  
  //
  // Tests
  //
  
  // can we find/retrieve the customer?
  assertNotNull("Unable to find customer.", customer1Found);
  
  // can we find the specific customer for which we are looking?
  assertEquals("Returned customer but not the one searched for.", FIRST_NAME, customer1Found.getFirstName());
  
  // does the retrieved customer have its non-indexed properties returned, as well?
  assertEquals("Returned customer doesn't have non-indexed properties returned.", LAST_NAME, customer1Found.getLastName());

  // retrieve the customer's purchases
  // (NOTE: We case as a Collection just to make checking the number of puchases easier)  
  Iterable purchasesIt = customer1Found.getPurchases();
  Collection purchases = IteratorUtil.asCollection(purchasesIt);
  
  // do we have the correct number of purchases?
  assertEquals("Number of purchases do not match.", 2, purchases.size());

So now we get to some actual testing.

The tests above are all straightforward.  We ensure the following:
  1. We can retrieve a persisted node, specifically via an indexed property.
  2. We can retrieve the correct persisted node for which we are searching.
  3. We can view non-indexed properties from the retrieved node.
  4. We can retrieve the correct number of relationships of the retrieved node.
As noted in Section 9.3 of "Good Relationships", we use Iterable for those node properties that are collections and are to be left as read-only, and Collection or Set for those collections that can be modified.



  // go through the actual purchases...
  Iterator purchIt = purchasesIt.iterator();
  Purchase purchase1 = purchIt.next();
  
  // retrieving objects via Spring Data pulls lazily by default; for eager mapping, use @Fetch (but be forewarned!)
  // ...this means we have to use the fetch() method to finish loading related objects
  Stock s1 = template.fetch(purchase1.getItem());

What if we want to view a node's related nodes' data?


By default, Spring Data loads an entity's relationships lazily, which makes perfect sense (just picture how much memory would be needed if you had a very large, highly connected graph).  Also remember that there are implicit relationships between entities if an entity is contained as a property of another entity.


(Courtesy of Paramount Picture's Forrest Gump)
"Mama said eager loading is like a box of chocolates: You never know what you're gonna get."
Well, at least the chocolates had an easily-determined, finite number in the box...

It is possible to have an eager retrieval by using the @Fetch annotation (be warned, though, that it will currently only work, by default, on node entities and collections of relationships that are based on Collection, Set, or Iterable; Spring Data may expand that in later releases, but I believe you can extend the mappings to work with other classes, if you so desire).

So, with our lazily-loaded relationships, we can use "template"'s fetch method to finish loading in the missing data.  It's as simple as that!  Anyone familiar with ORM will get this immediately.


  // can we retrieve our first purchase successfully w/ its details?
  assertEquals("Purchased item not persisted properly.", GAME_TITLE, s1.getTitle());

  purchase1 = purchIt.next();  
  Stock s2 = template.fetch(purchase1.getItem());
  
  // can we retrieve our second purchase successfully w/ its details?
  assertEquals("Purchased item not persisted properly.", GAME_TITLE_2, s2.getTitle());
  
  // if we're here, then all test ran succesfully.  Hooray!
 }

Above, we run a couple more tests to ensure that we can, in fact, retrieve and view lazily-loaded objects from Neo4j.

Nothing to it!

Making Friends the Easy Way: By Creating Them!
For these tests, we're going to have a look at something a bit more social, i.e. customers befriending other customers (how Utopian!).  I suppose we could make them "rivals" or "enemies", but that's a bit too sinister for this blog (for now...).

Anyway, I have prior to this test method a setup method (annotated with the @Before JUnit annotation) that creates 5 customers (if you're interested, I persist them using a CustomerRespoitory I created by extending the GraphRepository and RelationshipOperationsRepository interfaces).


 @Test
 public void customerFriends()
 {
  // add friends
  c1.addFriend(c2);
  c1.addFriend(c3);
  c1.addFriend(c4);
  c1.addFriend(c5);

  // be careful! setting a "Direction.BOTH" relationship in one node entity will have the ENTIRE relationship saved (*including the adjoining node*) when saving just ONE of the two entities!
  // ...if you save both, Neo4j will remove the duplication (and you'll be left wondering why c1 is a friend of c2, but not vice versa)
  
  // save c1's friends
  customerRepository.save(c1);

In the code above, we have the customer "c1" make friends with the other customers (he's a social butterfly).

Now, perhaps the most important part of this whole blog is shown here (and below).  It has to do with relationships, specifically those that are annotated as being "Direction.BOTH".

As you can see from the comments in the code above, we need to be careful about how we create relationships between nodes and save them.  If we were to create the relationship between, say "c1" and "c2", and then persist each node (and therefore the relationships, which the customer repository will handle), we would notice that the relationships have gone awry, and that the duplicate relationship from "c2" has been removed.

So, what we're going to do is the following (keeping in mind that "c1" and "c2" have already been persisted in the setup method):

  1. Persist "c1" (and thereby its friendship to "c2").
  2. Retrieve "c2".
  3. Add any other friends' relationships to the retrieved "c2" (while not befriending back to "c1").
  4. Persist "c2" (and thereby its friendships to those added in Step 3).

Step 1 is done above.


  // we can't just continue to add friends to this.c2, as once we try to save this.c2, it'll remove the duplicate relationship between c1 and c2.
  // ...so, to get around this, we retrieve the persisted object from the DB
  Customer c2Found = customerRepository.findByPropertyValue("lastName", C2_LNAME);
  c2Found.addFriend(c3);
  c2Found.addFriend(c4);
  c2Found.addFriend(c5);

  // save c2's friends, which will preserve the existing relationship with c1! Old friends can remain friends!
  customerRepository.save(c2Found);

As you can see above, we finish the remaining steps (2 through 4).

Again, note that we DO NOT create a reciprocal relationship from "c2" to "c1".  (One would hope a friendship relationship would be reciprocal; unless you have stalkers or something...)


This would totally help.

All that's left now is to run some tests to ensure that our friends have remained friends throughout all this persisting!


  // retrieve c1 for some tests
  Customer c1Found = customerRepository.findByPropertyValue("lastName", C1_LNAME);
  
  Iterable c1Friends = c1Found.getFriends();
  Collection c1FriendsSet = IteratorUtil.asCollection(c1Friends);
  Iterator custIt = c1Friends.iterator();
  
  int numFriends = 0;
  
  // let's make sure all of c1's friends were retrieved
  assertTrue("Friend not found.", c1FriendsSet.containsAll(IteratorUtil.asCollection(c1.getFriends())));
  
  // let's also make sure that c1 and c2 are still buds specifically (these two are inseparable...you should see them at ComicCon!)
  assertTrue("Friend not found.", c1FriendsSet.contains(c2));
  
  // let's make sure the exact number of friends returned is correct 
  while (custIt.hasNext())
  {   
   custIt.next();
   numFriends++;
  } // while
  
  assertEquals("Number of friends returned incorrect.", 4, numFriends);

  // if we're here, all is well! Huzzah!
}

Above, as in the first test, we make sure that all of the friendships have been properly preserved, both from "c1"'s and "c2"'s perspective.

Conclusion
In this post, we have seen the basics of persisting with Spring Data and a couple of the quirks I ran into.  These are documented within the Spring Data documentation, but it never hurts to bring these little nuances out into the light even further.

We also saw that the ImpermanentGraphDatabase is available to us through the Neo4j kernel which is a wonderful tool for implementing test cases with quick setup and teardown--no needing to write initializers and cleaners for a Neo4j installation!

So there we have it!  A first pass through persisting with Spring Data and implementing some unit tests using Neo4j and JUnit.

If anyone has any questions or I've made a mistake, please feel free to leave feedback.

We'll see you on the next post!