Friday, January 29, 2016

Testing Neo4j 3 with embedded server with Bolt

Together with my colleague Stijn van Drunen, I'm working on a project where we're using Neo4j 3, since we want to use Neo4j's new binary Bolt protocol.

See Stijn's blogpost on how to use an embedded Neo4j server to run integration tests against an application that uses the new Bolt driver. A minimal sketch follows below.
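For the impatient, here is a minimal sketch of what such a test setup can look like. It assumes the neo4j-harness in-process test server and the Java Bolt driver; the Cypher statement is just an illustration.

import org.neo4j.driver.v1.GraphDatabase
import org.neo4j.harness.TestServerBuilders

object EmbeddedBoltTest extends App {

  // starts an in-process Neo4j server; Bolt is exposed on a random free port
  val server = TestServerBuilders.newInProcessBuilder().newServer()
  val driver = GraphDatabase.driver(server.boltURI())
  val session = driver.session()

  try {
    val result = session.run("CREATE (p:Person {name: 'Stijn'}) RETURN p.name")
    println(result.single().get(0).asString())
  } finally {
    session.close()
    driver.close()
    server.close()
  }
}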

Friday, December 11, 2015

Building JavaCPP presets for OpenCV 3 for Raspberry Pi (linux-arm)

Native image processing

To speed up image processing in a Java/Scala application on a Raspberry Pi, we resorted to OpenCV. OpenCV already provides a native Java binding. The disadvantage of this, however, is that you must manually load those native libraries in your Java application.

JavaCV/JavaCPP to the rescue!

JavaCV is a wrapper around the JavaCPP Presets for libraries like OpenCV.
JavaCPP provides a way to use OpenCV without manually adding code to load the native library. It does this via a static initialiser in the JavaCPP classes, which makes sure the correct native library is loaded by the JVM. JavaCPP supports several C/C++ libraries, among which OpenCV. It provides separate jar files containing the platform-dependent native libraries, for example for macosx, linux-x86, linux-x86_64, windows-x86, windows-x86_64, android-arm and android-x86 (see the JavaCPP presets in Maven Central).

The advantages of JavaCV over the native OpenCV bindings are that
- JavaCV combines all C libraries in a single jar with a classifier per platform
- it's easy to include platform-specific dependencies in a project
- JavaCV comes with a loader which automatically loads the C libraries from the jar (see the sketch below).
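To illustrate that last point, here is a minimal sketch of using the OpenCV preset from Scala (the image path is made up): there is no manual loadLibrary call anywhere.

import org.bytedeco.javacpp.opencv_core.Mat
import org.bytedeco.javacpp.opencv_imgcodecs.imread

object LoaderDemo extends App {

  // the native library for the current platform is loaded automatically
  // by a static initialiser the moment an opencv_* class is first used
  val img: Mat = imread("/tmp/example.jpg")
  println(s"loaded image of ${img.cols()} x ${img.rows()} pixels")
}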

However, they do not provide a 'linux-arm' build, so you have to build it yourself. The easiest way, I think, is to build it on a real Pi. This can take several hours, however.

If you just need the opencv JavaCPP preset, then I provide these jars:
Note that these jars probably do not support video processing, since I did not build opencv with any video dependency.

Note that JavaCPP 1.1 fixes a native-library-loading issue on amd64 linux systems (virtual machines, docker images, etc.) where, on an 'amd64' architecture, the native libs of the linux-x86_64 packages did not get loaded.


Building for the Pi

JavaCPP comes with a 'cppbuild' script to build the opencv sources and create the Java binding for them, but 'linux-arm' is not supported yet. In the SNAPSHOT version there is some work in progress, but it seems this is only for cross compiling, which currently does not support creating the Java bindings.
So, the only solution is to build it on the Pi itself.
The 'regular' opencv sources can be built for the Pi. These libs can then be used to create the javacpp-opencv jar.

In short, these are the steps to take:

  • build opencv on the Pi (building the 'real' opencv project for the Pi is easier than trying to tweak the javacpp/opencv cppbuild script; support for building linux-arm was added in SNAPSHOT, but only for cross compilation, which does not work with the Java wrappers, see below, and does not run on a real Pi)
  • get opencv sources
  • run cmake to create native make files
  • run 'make -j5' to compile opencv on the Pi. Use -j5 to keep all cores busy!
  • run 'make install' to install libs in /usr/local/.. folders.
  • build javacpp for opencv using compiled opencv libs
  • make links to /usr/local's bin, include, lib and share folders in 'javacpp-presets/opencv/cppbuild/linux-arm'.
  • build javacpp jar: 'mvn package'

Pitfalls:


  • make sure to check out the same version, by git tag, in both the 'opencv' and 'opencv_contrib' source folders.
  • run the build inside 'screen' so it does not stop when you accidentally disconnect.
  • Ant is required to be able to build the Java wrappers. Set both ANT_HOME and JAVA_HOME.
  • javacpp requires libraries from 'opencv_contrib' (e.g. the 'face' module), so the opencv build must include those modules.

Building OpenCV

sudo apt-get update
We did not need video libraries, but add those when you need them.
sudo apt-get install build-essential cmake pkg-config libpng12-0 libpng12-dev libpng++-dev libpng3 libpnglite-dev zlib1g-dbg zlib1g zlib1g-dev pngtools libtiff4-dev libtiff4 libtiffxx0c2 libtiff-tools libjpeg8 libjpeg8-dev libjpeg8-dbg libjpeg-progs 
sudo apt-get install screen
sudo apt-get install ant
in a screen session:
screen -S opencv
OpenCV Contrib is needed because javacpp depends on its 'face' module:
git clone https://github.com/Itseez/opencv_contrib.git
check out the correct version to build (same as the opencv version!)
cd opencv_contrib
git checkout 3.0.0
cd ..
git clone https://github.com/Itseez/opencv.git
cd opencv
git checkout 3.0.0

ANT_HOME and JAVA_HOME are needed to be able to build the Java wrappers and Java tests.
It's assumed Java 8 is already installed on your Pi!
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export ANT_HOME=/usr/share/ant
create a dir to build in:
mkdir build
cd build
All build options are mentioned in this presentation (slide 12): http://www.slideshare.net/andredsm/apresentacao-36089830
cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local -D BUILD_EXAMPLES=OFF -D BUILD_PNG=ON -D OPENCV_EXTRA_MODULES_PATH=~/opencv_contrib/modules -D BUILD_opencv_face=ON -D BUILD_opencv_ximgproc=ON -D BUILD_opencv_optflow=ON ..
Check that the Java section of the cmake output shows:
--   Java:
--     ant:                         /usr/bin/ant (ver 1.8.2)
--     JNI:                         /usr/lib/jvm/java-8-oracle/include /usr/lib/jvm/java-8-oracle/include/linux /usr/lib/jvm/java-8-oracle/include
--     Java wrappers:               YES
--     Java tests:                  YES
Start compiling, with the number of parallel jobs set to the number of cores + 1:
make -j5
Install the bins, libs, includes and shares into /usr/local, to be used by javacpp:
sudo make install 

Building JavaCPP

Use the natively built OpenCV libs.
Note that JavaCPP needs Face.hpp, which is in opencv_contrib!
cd    # back to the home dir
git clone https://github.com/bytedeco/javacpp-presets.git
cd javacpp-presets
check out the version to build:
git checkout 1.1
install the main pom in the local repo (might not really be necessary):
mvn install -N 
make links in javacpp-presets/opencv/cppbuild/linux-arm to /usr/local/bin, /usr/local/share, /usr/local/lib and /usr/local/include:
cd opencv/cppbuild/linux-arm
ln -s /usr/local/bin bin
ln -s /usr/local/include include
ln -s /usr/local/lib lib
ln -s /usr/local/share share
back to the javacpp-presets/opencv folder:
cd ../.. 
build the javacpp library using the previously built opencv libs:
mvn clean package
Once completed, the 'target' folder contains an 'opencv-linux-arm.jar'. Rename it to include the version:
mv target/opencv-linux-arm.jar target/opencv-3.0.0-1.1-linux-arm.jar
Install the jar in your local repo or Nexus using these artifact details, as shown below (also see 'target/maven-archive/pom.properties'):
groupId: org.bytedeco.javacpp-presets
artifactId: opencv
version: 3.0.0-1.1
classifier: linux-arm
packaging: jar
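For a local repo that boils down to something like this (a single line; adjust the file path as needed):
mvn install:install-file -Dfile=target/opencv-3.0.0-1.1-linux-arm.jar -DgroupId=org.bytedeco.javacpp-presets -DartifactId=opencv -Dversion=3.0.0-1.1 -Dclassifier=linux-arm -Dpackaging=jar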

Now use this jar in your project using the details above. Don't forget the classifier!
<dependency>
    <groupId>org.bytedeco.javacpp-presets</groupId>
    <artifactId>opencv</artifactId>
    <version>3.0.0-1.1</version>
    <classifier>linux-arm</classifier>
</dependency>
or for SBT in build.sbt:
// Platform classifier for native library dependencies for javacpp-presets
lazy val platform = org.bytedeco.javacpp.Loader.getPlatform
in libraryDependencies:
"org.bytedeco" % "javacpp" % javacppVersion,
"org.bytedeco" % "javacv" % javacppVersion excludeAll(ExclusionRule(organization = "org.bytedeco.javacpp-presets")),
"org.bytedeco.javacpp-presets" % "opencv" % ("3.0.0-"+javacppVersion) classifier "",
"org.bytedeco.javacpp-presets" % "opencv" % ("3.0.0-"+javacppVersion) classifier platform,
"org.bytedeco.javacpp-presets" % "opencv" % ("3.0.0-"+javacppVersion) classifier "linux-arm",
in project/plugins.sbt
// `javacpp` is packaged with maven-plugin packaging; we need to make SBT aware that it should be added to the classpath.
classpathTypes += "maven-plugin"
// javacpp's `Loader` is used to determine the `platform` classifier in the project's `build.sbt`.
// We define the dependency here (in the `project` folder) since it is used by the build itself.
libraryDependencies += "org.bytedeco" % "javacpp" % "1.0"

Note on cross compiling:

OpenCV can also be cross compiled. Philipz's docker containers help a lot in setting up such an environment. The problem, however, is that you also want to build the Java wrappers. Because the build architecture is set to 'arm', the build will also look for an 'arm' version of the JVM and AWT libraries, whereas you must use the native (ubuntu) Java instead. I was not able to get a cross compile working yet.
I had to hack /usr/local/cmake-2.8/Modules/FindJNI.cmake to make sure it found the 'amd64' Java libraries (it might be possible to set the JAVA_AWT_LIBRARY and JAVA_JVM_LIBRARY vars instead), so that cmake would generate a build file which builds the Java wrappers and tests.
So, for now, this opencv-3.0.0-1.1-linux-arm.jar was still built on a real Pi (v2), and yes, it took hours to complete. ;-)

Friday, September 4, 2015

Chaining rejection handlers

Serving resources via Spray is as easy as using the 'getFromFile' directive. If you want to fall back to an alternative when the file is not available, you can define a RejectionHandler to serve an alternative file.

In a web application I wanted to try some alternative names before falling back to a default image. The standard solution requires you to nest all RejectionHandlers.
 def alternativesHandler = RejectionHandler {
  case rejection =>
   handleRejections(RejectionHandler {
    case rejection => getFromFile("second-alternative")
   }) {
    getFromFile("first-alternative")
   }
 }
This becomes quite messy very fast and is not very easy to read.
Chaining instead of nesting improves it quite a bit.
 def alternativesHandler = chaining(RejectionHandler {  
  case rejection => getFromFile("first-alternative")  
 } >> RejectionHandler {  
  case rejection => getFromFile("second-alternative")  
 } >> RejectionHandler {  
  case rejection => getFromFile("third-alternative")  
 })  
Here is the code which makes this possible; a usage example follows after it. It only needs to be imported where the route is defined. The 'chaining' method simply tries each handler in the List. The implicit classes extend the handler types with a '>>' method which allows the handlers to be chained into a list.
 trait RejectionHandlingChain {  
  type RejectionHandlerList = List[RejectionHandler]  

  implicit class RejectionHandlerListExt(handler: RejectionHandlerList) {  
   def >>(other: RejectionHandler): RejectionHandlerList = handler :+ other  
  }  

  implicit class RejectionHandlerExt(handler: RejectionHandler) {  
   def >>(other: RejectionHandler): RejectionHandlerList = List(handler, other)  
  }  

  import spray.routing.directives.RouteDirectives.reject  
  import spray.routing.directives.ExecutionDirectives.handleRejections  

  final def chaining(handlers: RejectionHandlerList): RejectionHandler = RejectionHandler {  
   case rejection => handlers match {  
    case Nil     =>  
     reject(rejection: _*)  
    case head :: tail =>  
     handleRejections(chaining(tail)) {  
      head(rejection)  
     }  
   }  
  }  
 }  

 object RejectionHandlingChain extends RejectionHandlingChain  
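Using it in a route then looks something like this (a sketch, assuming the usual Spray directives are in scope; the file names are made up):

 import RejectionHandlingChain._

 val route = handleRejections(chaining(
  RejectionHandler { case _ => getFromFile("images/fallback-1.png") } >>
  RejectionHandler { case _ => getFromFile("images/fallback-2.png") } >>
  RejectionHandler { case _ => getFromFile("images/default.png") }
 )) {
  getFromFile("images/requested.png")
 }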

Wednesday, July 8, 2015

Using Akka Http to perform a Rest call and deserialise json

I have been playing with Akka Streams and Akka Http to create a flow which gets some data from a public Rest endpoint and deserialises the json using Json4s.
Since there are not that many examples yet, and the documentation only has a few, I'm sharing my little app here.

By default Akka Http only supports Spray Json, but fortunately Heiko Seeberger already created a small akka-http-json library with support for Json4s and Play Json.

Here's a small code sample which creates an Akka Streams Flow and runs it. This was just to test calling the Rest endpoint and deserialising the result json into a case class. The next step is to extend the flow to do something useful with the retrieved data: I'll be putting it into a time series database called Prometheus, and maybe also into Mongo.

package enphase

import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.model.{HttpRequest, Uri}
import akka.http.scaladsl.unmarshalling.Unmarshal
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Sink, Source}
import de.heikoseeberger.akkahttpjson4s.Json4sSupport
import org.json4s.{DefaultFormats, Formats, Serialization, jackson}

import scala.concurrent.{Await, Future}

/**
 * Enphase API Client which gets Enphase data and puts it into InfluxDB
 *
 * - Start with HTTP GET request to Enphase API.
 * - Transform response into json
 * - Transform json into time series data
 * - Put time series data into InfluxDB using HTTP POST request
 */
object Client extends App with Json4sSupport {

  val systemId = 999999 // replace with your system id
  val apiKey   = "replace-with-your-api-key"
  val userId   = "replace-with-your-user-id"

  val systemSummaryUrl = s"""/api/v2/systems/$systemId/summary?key=$apiKey&user_id=$userId"""
  println(s"Getting from: $systemSummaryUrl")

  implicit val system = ActorSystem()
  implicit val materializer = ActorMaterializer()
  implicit val formats: Formats = DefaultFormats
  implicit val jacksonSerialization: Serialization = jackson.Serialization
  import concurrent.ExecutionContext.Implicits.global

  val httpClient = Http().outgoingConnectionTls(host = "api.enphaseenergy.com")

  private val flow: Future[SystemSummary] = Source.single(HttpRequest(uri = Uri(systemSummaryUrl)))
      .via(httpClient)
      .mapAsync(1)(response => Unmarshal(response.entity).to[SystemSummary])
      .runWith(Sink.head)

  import concurrent.duration._

  val start = System.currentTimeMillis()
  val result = Await.result(flow, 15 seconds)
  val end = System.currentTimeMillis()

  println(s"Result in ${end-start} millis: $result")
}

/**
 * Entity for system summary json:
 * {
 * "current_power": 3322,
 * "energy_lifetime": 19050353,
 * "energy_today": 25639,
 * "last_report_at": 1380632700,
 * "modules": 31,
 * "operational_at": 1201362300,
 * "size_w": 5250,
 * "source": "microinverters",
 * "status": "normal",
 * "summary_date": "2014-01-06",
 * "system_id": 123
 * }
 */
case class SystemSummary(system_id: Int, summary_date: String, status: String, source: String,
                          size_w: Int, operational_at: Long, modules: Int, last_report_at: Long,
                          energy_today: Int, energy_lifetime: Long, current_power: Int)


At first I could not get Heiko's Unmarshallers working, so I wrote my own Unmarshaller, which is not that difficult when looking at some other implementations. The problem was a very vague error saying something was missing, but not exactly what. Today I figured out it was just missing one of the required implicit arguments, the Json4s Serialization, and then it all worked nicely.

Here's how to implement a custom Unmarshaller which unmarshalls an HttpResponse instance:

  implicit def responseUnmarshaller[T : Manifest]: FromResponseUnmarshaller[T] = {
    import concurrent.duration._
    import enphase.json.Json4sProtocol._
    import org.json4s.jackson.Serialization._

    new Unmarshaller[HttpResponse, T] {
      override def apply(resp: HttpResponse)(implicit ec: ExecutionContext): Future[T] = {
        resp.entity.withContentType(ContentTypes.`application/json`)
            .toStrict(1 second)
            .map(_.data)
            .map(_.decodeString(resp.entity.contentType.charset.value))
            .map(json => { println(s"Deserialized to: $json"); json })
            .map(json => read[T](json))
      }
    }
  }

The only change in the application needed to use this unmarshaller is to replace the 'mapAsync' line with:

    .mapAsync(1)(Unmarshal(_).to[SystemSummary])

The project build.sbt contains these dependencies:

scalaVersion := "2.11.6"

libraryDependencies ++= Seq(
  "com.typesafe.akka" % "akka-http-experimental_2.11" % "1.0-RC4",
  "de.heikoseeberger" %% "akka-http-json4s" % "0.9.1",
  "org.json4s" %% "json4s-jackson" % "3.2.11",
  "org.scalatest" % "scalatest_2.11" % "2.2.4" % "test"
)

Happy Akka-ing.

Joost



Saturday, July 4, 2015

Selecting a time series database

For a new hobby project, I want to use a time series database to visualise gathered data over time and to be able to use functions like sum/div/etc. on the time series data. I thought I could just pick a database and start coding on my project, but instead I ended up spending several evenings trying out several databases to find the right one for my needs.

1) InfluxDB

InfluxDB was my first choice. The installation is very easy and it has a simple CLI, a bit like Mongo.
Data can be injected via an HTTP POST request to 'http://<host>:8086/write?db=$database&precision=ms'. If you pass a timestamp, InfluxDB assumes a nano-timestamp. If you use a different precision, for example the system time which is in milliseconds, you need to tell InfluxDB via the 'precision' parameter. If no timestamp is provided, InfluxDB will use the timestamp of the request.
Although InfluxDB is still a very young db, there are already many client libraries available in many languages, among which Java and Scala.
The data format is very easy: <measurement>[,<tag-key>=<tag-value>...] <field-key>=<field-value>[,<field2-key>=<field2-value>...] [unix-nano-timestamp]
The measurement is the name of the time series. At least one field key-value pair is required; more are optional. Tags are also optional. InfluxDB automatically creates indexes for tag keys, which makes querying very powerful.
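For example, a (made-up) living room measurement with one tag, one field and a millisecond timestamp, sent with precision=ms, looks like this:
temperature,room=living value=21.3 1436129400000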
Installation is very easy, but even simpler via one of the available Docker containers. I used tutum/influxdb.

Disadvantages:
Version 0.9 is a complete rebuild and incompatible with 0.8, which means there are not that many query functions available yet, compared to Graphite, and not all 3rd-party tools, like Grafana, fully work with the latest InfluxDB version. There are no functions yet to combine multiple series, and therefore I am unfortunately not able to use InfluxDB for my project yet.
Although there is some basic support for InfluxDB v0.9 in Grafana already, it was still a bit tricky to use. Even though I had configured Grafana correctly, it didn't show any graph, and then suddenly it did, and I have no idea what I did differently.

2) Graphite

So, fall back to Graphite then? The reason I did not start out with Graphite is that the installation is just too damn hard to get right. Graphite consists of 3 parts: Carbon (for capturing data), Whisper (a fixed-size db) and Graphite-web (a simple web interface). There are no downloadable installations, so those applications have to be built for your environment.
In the past I have looked multiple times for Vagrant or Docker containers which set up a complete, and up-to-date (!!), Graphite environment for me, but I always gave up after spending another couple of hours trying to get it working, without result. Recently I came across the Kamon Grafana-Graphite Docker container, which was the first one that worked right out of the box.

The main disadvantage is the fixed-size database. You have to determine up front how much detail you want to keep, and for how long; for example, per second for 2 weeks, per hour for a month, per day for a year. For now I'd like to keep all data, so I do not want to have to choose when data gets aggregated. Also, because the whole Graphite setup is so complex, I do not want to have to change the Graphite configuration for my project.

The biggest advantage is the extended support by Grafana, which is logical since Grafana was created for Graphite in the first place. Performing functions over multiple series is child's play with Graphite+Grafana.

Since Graphite is often used with Statsd, I did a small test with Statsd + Graphite, but Statsd is not suited for what I want to do, since Statsd aggregates and summarises metrics before sending them to Graphite. Whereas in a similar test application for InfluxDB I was able to get a graph with multiple sine waves, this was impossible with Statsd.
An alternative would be to store the metrics in Graphite directly, but given Graphite's complex installation and fixed-size database, I would rather use something else.

So ... then what else is there? Are there other time series databases ...? Yes there are! Plenty!! See this list of time series databases.
I looked at some of the databases in the list and eventually settled on Prometheus, which looked the most promising. So ...

3) Prometheus

I chose to try out Prometheus since it has its own Grafana-like dashboard and it has alerting, which none of the other databases have. The alerting allows you to receive notifications when a series goes above a threshold, when an event has not occurred for x amount of time, or when some value rises or falls too quickly (a trend).
All Prometheus parts are available as Docker containers which makes it very easy to get started.
Prometheus has one characteristic which makes it very different from other (time series) databases: Prometheus uses a pull mechanism instead of a push mechanism like all the other databases, which means Prometheus wants to gather all data itself by calling Rest endpoints of applications. Fortunately, they do provide a Push Gateway which makes it possible to push your own data after all; Prometheus will then pull it from the Push Gateway. To use the Docker containers for both Prometheus and the PushGateway, you must link the PushGateway container to the Prometheus container so Prometheus can reach the PushGateway Rest endpoint.
Prometheus runs on port 9090 by default. The Push Gateway runs on port 9091.

Prometheus has some UI of its own to display its configuration and to do some queries, visualised in either a table or a graph. The Grafana-like dashboard is called PromDash, which by default runs on port 3000.

There is a Java client available for using the PushGateway, but it works a bit strangely. The construction of a Registry and a Gauge for each metric value seems overkill, but trying to reuse those objects resulted in an error, probably because the whole registry gets pushed to the PushGateway; the Java client is actually meant to be used for batch processing. Maybe it's easier to just push the value yourself via an HTTP request, but I haven't tried that yet.
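For reference, here is a minimal sketch of how the Java client (the simpleclient and simpleclient_pushgateway artifacts) is used from Scala; the metric and job names are made up:

import io.prometheus.client.exporter.PushGateway
import io.prometheus.client.{CollectorRegistry, Gauge}

object PushExample extends App {

  // a fresh registry and gauge per push; reusing them resulted in errors
  val registry = new CollectorRegistry()
  val gauge = Gauge.build()
    .name("sinus_value")
    .help("Current value of the sinus test series")
    .register(registry)

  gauge.set(0.42)

  // pushes all metrics in the registry to the Push Gateway (default port 9091)
  new PushGateway("localhost:9091").pushAdd(registry, "sinus_job")
}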

As with InfluxDB and Statsd/Graphite, I created another test application which produces sine values. Running multiple instances created multiple series. Prometheus has functions to sum or divide multiple series; just make sure the time series have the same name and vary only the job or instance name. PromDash displayed a graph containing both sine values and the sum of both, which is exactly what I want for my project.
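The query for such a combined graph is then as simple as something like 'sum(sinus_value)', with 'sinus_value' being the (made-up) series name from the sketch above.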

Conclusion

So, after spending 4 evenings playing with these databases, I'm going forward with Prometheus. At least until InfluxDB also has the functions I need and Grafana supports them too, because I still do like InfluxDB's data structure and its ease of installation and use.
I think I will also use another database besides Prometheus to store each and every metric value, so I can easily replay all data when a more mature InfluxDB is available. Although Mongo is not suitable for write-heavy applications, I'll probably use it anyway, since my application does not need to write much data and it is just easy to use.

More on my mysterious project in next posts ....

Saturday, August 9, 2014

Preventing NPE when creating an Option

Sometimes you don't know whether values are present, and you want to use an Option to represent that possibility. While Scala APIs often return an Option, this is not the case when integrating with Java. You might want to get a value somewhere from an object hierarchy like this:
user.getAddress().getStreet()
Since you might expect a value to be absent, you want to wrap this in an Option. However, if either 'user' or the address is null, the expression throws an NPE before Option can wrap it.

To guard against this, you want to catch the NPE and return a normal None. However, the Option trait is sealed, so it is not possible to extend it. Here is a solution using an object with an 'apply' method. This SafeOption returns a normal Option (Some or None), but catches any NPE and returns None in that case. Note the by-name parameter ('value: => A'): the expression is only evaluated inside the try block, so that is where the NPE surfaces and gets caught.
/**
 * A 'safe' way to create an Option which also catches a NullPointerException
 * thrown while evaluating the value.
 * So 'SafeOption(user.getAddress().getStreet())' returns None when 'user' is null.
 */
object SafeOption {

  def apply[A](value: => A): Option[A] = try {
    Option(value)
  } catch {
    case _: NullPointerException => None
  }
}
Use SafeOption like this, not much different from creating a normal Option:
SafeOption(user.getAddress().getStreet())
Note: use this only in cases where you don't care about the nulls and a None is a valid outcome for you.

Another note: it would be possible for SafeOption to catch any kind of Exception or Throwable instead of only a NullPointerException, but whether that is a valid solution totally depends on your application. I chose to only catch NullPointerException so other kinds of exceptions can still be handled differently by the application.

Happy Optioning ;-)

Wednesday, July 30, 2014

Running Jetty as an application for fast starts and easy debugging

Here's a simple utility that will save you a lot of time on restarts and redeployments of your webapp in Jetty. It also makes debugging a lot easier. And integrating tools like JRebel is now a real breeze.
Before this I used the Jetty Maven plugin, but that always recompiled all code, and since I have mixed Scala with Java code, compiling has not become any faster.

Here is the code of a simple App which embeds Jetty and allows you to run or debug your webapp from any IDE. The code is in Scala (of course :-)), but is easily translatable to Java.

import org.eclipse.jetty.server.Server
import org.eclipse.jetty.webapp.WebAppContext

/**
 * App to run Jetty for testing.
 */
object JettyApp extends App {

  val server = new Server(8080)

  Runtime.getRuntime.addShutdownHook(new Thread(new Runnable {
    override def run(): Unit = server.stop
  }))

  val context = new WebAppContext()
  context.setDescriptor("src/main/webapp/WEB-INF/web.xml")
  context.setResourceBase("src/main/webapp")
  context.setContextPath("/")
  context.setParentLoaderPriority(true)

  server.setHandler(context)

  server.start()
  server.join()
}


Put this in your src/test/[scala|java] folder and the server will use the resources and classes from both the main and test locations.

The only dependency you need to add to your project is org.eclipse.jetty:jetty-webapp:9.2.2.v20140723:test.
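In build.sbt that would be:

libraryDependencies += "org.eclipse.jetty" % "jetty-webapp" % "9.2.2.v20140723" % "test"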

The shutdown hook might not be necessary, but I threw it in anyway.

Happy Jetty-ing. ;-)