I built a small project to accompany my Datanami article. It illustrates collaborative filtering using MLlib on Apache Spark, and it accesses the data in Cassandra using the DataStax connector.

I wrote the key class twice, in both Java 7 and Java 8, to illustrate how much easier lambdas make things. Java 8 is not quite as good as Scala for Big Data, and Spark itself is written in Scala, but if your team is already deep into Java and you’re not ready to change, you can get half the benefits of functional programming (and strongly typed functional programming, for that matter) by upgrading to Java 8.
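To give a flavor of the difference, here is a minimal sketch using only the plain JDK. It is not the code from the project. The `Mapper` interface, the `map` helper, and the `user,product,score` line format are all made up for illustration; `Mapper` simply stands in for the kind of single-method function interface that Spark's Java API accepts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LambdaComparison {

    // Hypothetical single-method interface, standing in for the
    // function interfaces a Java API like Spark's takes as arguments.
    interface Mapper<T, R> {
        R apply(T input);
    }

    // Hypothetical helper: applies the mapper to every item in the list.
    static <T, R> List<R> map(List<T> items, Mapper<T, R> mapper) {
        List<R> out = new ArrayList<>();
        for (T item : items) {
            out.add(mapper.apply(item));
        }
        return out;
    }

    // Java 7 style: the function is passed as an anonymous inner class.
    static List<Double> scoresJava7(List<String> lines) {
        return map(lines, new Mapper<String, Double>() {
            @Override
            public Double apply(String line) {
                // Assumed line format: "user,product,score"
                return Double.parseDouble(line.split(",")[2]);
            }
        });
    }

    // Java 8 style: the same function written as a lambda.
    static List<Double> scoresJava8(List<String> lines) {
        return map(lines, line -> Double.parseDouble(line.split(",")[2]));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("1,10,4.0", "2,10,3.5");
        System.out.println(scoresJava7(lines)); // prints [4.0, 3.5]
        System.out.println(scoresJava8(lines)); // prints [4.0, 3.5]
    }
}
```

Both methods do exactly the same thing and are checked by the compiler in the same way. The lambda just drops the boilerplate, and in a job that chains many such transformations that adds up quickly.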

GitHub