Understanding Elasticsearch QueryDSL for Analytics

This month I spoke at Big Data Jax on the topic of Elasticsearch and QueryDSL as it pertains to data science. Matt Berseth from NLP Logix was good enough to arrange speakers for my Drink and Think event, so I’m returning the favor.

My presentation covered the following general topics:

  • Brief history of Elasticsearch
  • Download and installation of Elasticsearch 5.2
  • Nomenclature and cluster overview
  • Indexing data
    • Inverted indexes
    • Mappings
    • Analyzers
  • QueryDSL
    • Common query types
    • Aggregates
    • Statistical aggregates
  • Usage
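
To give a flavor of the statistical aggregates on that list, here is a minimal sketch of a QueryDSL request wrapped in a small shell script. It is not taken from the slides: the index name (visits) and the field names (department, duration_minutes) are made up for illustration, and the host assumes a default local node. The script only prints the request body; the actual call lives in a function you can run yourself.

```shell
#!/usr/bin/env bash
# Sketch only: the index and field names below are hypothetical.
ES_HOST="http://localhost:9200"   # assumes a default local node
INDEX="visits"                    # hypothetical index name

# A match query narrows the documents; the stats aggregation then reports
# count, min, max, avg, and sum for one numeric field.
# "size": 0 skips the hits themselves, since only the aggregate is wanted.
QUERY_BODY='{
  "size": 0,
  "query": { "match": { "department": "cardiology" } },
  "aggs": {
    "duration_stats": { "stats": { "field": "duration_minutes" } }
  }
}'

# Sends the request to a running node. Defined here but not called.
run_search() {
  curl -s -XPOST "${ES_HOST}/${INDEX}/_search" \
    -H 'Content-Type: application/json' \
    -d "${QUERY_BODY}"
}

# Print the body so it can be inspected or pasted elsewhere.
echo "${QUERY_BODY}"
```

With a node running and an index that matches, calling run_search at the end of the script would return the aggregate under aggregations.duration_stats in the response.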

You can download the full PDF of my presentation here: Elasticsearch Presentation

For more events, please check out Big Data Jax on Meetup.

Setting a New Healthcare Paradigm with Elasticsearch

The healthcare industry needs to make real-time decisions (lives are literally on the line), and traditional methods of information capture and transfer are no longer sufficient. Forcura is a document workflow and healthcare solutions provider for the home health and hospice industry that uses Elasticsearch to quickly index and search patient and physician data. Having access to the correct data, and securing it to comply with HIPAA regulations, is critically important to their business, and it is setting a new paradigm for the way work gets done in the healthcare industry.

Thank you to everyone at Elastic for setting this up, editing the video, and publishing it!

Elasticsearch on a Raspberry Pi

I’m posting this as a step-by-step guide to installing Elasticsearch on a Raspberry Pi. While I don’t honestly think a cluster of Raspberry Pi computers running Elasticsearch is a true enterprise solution, I think there could be something to this in the future. As cheap as hardware keeps getting, it’s entirely reasonable to believe that a myriad of ARM-based Systems-on-a-Chip (SoCs) could be a completely feasible cluster solution. Today you can pay $40 for a single little Raspberry Pi. If you forked out $4,000, you could have 100 of these tiny machines all running Elasticsearch in memory. What does one VM slice cost an enterprise right now? Just sayin’…

  • Format your SD card (I used a 32GB card; a 64GB card would be nice) using this formatter.
  • Download NOOBS for Raspberry Pi here.
  • Copy the files to the card, insert into the Raspberry Pi, and boot.
  • When prompted, choose to install Raspbian.
  • After install is complete, reboot to command line (bash shell)
  • Install Oracle Java (7, at the time of this posting), using “sudo apt-get update && sudo apt-get install oracle-java7-jdk”
  • Create a directory to store Elasticsearch using “sudo mkdir /usr/share/elasticsearch”.
  • Change to the directory using “cd /usr/share/elasticsearch”.
  • Download Elasticsearch (v0.90.7 at the time of this posting) using “sudo wget http://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.7.tar.gz”. Please note you may have to change the file name if the version has increased since this article.
  • Extract the archive: “sudo tar -zxvf elasticsearch-0.90.7.tar.gz”.
  • Delete the archive: “sudo rm elasticsearch-0.90.7.tar.gz”.
  • Change to the extracted directory: “cd elasticsearch-0.90.7”.
  • Run Elasticsearch! “sudo bin/elasticsearch”
  • Verify the install: “curl -XGET http://localhost:9200/”
  • Install the _head plugin: “sudo bin/plugin -install mobz/elasticsearch-head”.
  • Get your local IP address: “sudo ifconfig”. Look for “inet addr”. Copy your IP address.
  • From any browser on the network, navigate to “http://[your Raspberry Pi IP]:9200/_plugin/head/”.
  • Start searching!
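
Since the file name changes with every release, here is a rough sketch that derives the download and extract commands from a single version variable. It is a dry run: it only prints the commands so you can check them before running anything. I’m assuming newer archives follow the same URL pattern as 0.90.7, which may not hold, and I’m assuming wget for the download. The variable and function names are my own.

```shell
#!/usr/bin/env bash
# Dry run: prints the install commands for one version without executing them.
# Assumption: other releases use the same URL pattern as 0.90.7.
ES_VERSION="0.90.7"                       # change this for a newer release
ES_PARENT="/usr/share/elasticsearch"      # directory created in the steps above
ES_ARCHIVE="elasticsearch-${ES_VERSION}.tar.gz"
ES_URL="http://download.elasticsearch.org/elasticsearch/elasticsearch/${ES_ARCHIVE}"

# Echo each command in the order it should be run.
print_install_steps() {
  echo "sudo mkdir -p ${ES_PARENT}"        # -p: no error if it already exists
  echo "cd ${ES_PARENT}"
  echo "sudo wget ${ES_URL}"               # wget is an assumption
  echo "sudo tar -zxvf ${ES_ARCHIVE}"
  echo "sudo rm ${ES_ARCHIVE}"
  echo "cd elasticsearch-${ES_VERSION}"    # the folder the archive extracts to
  echo "sudo bin/elasticsearch"
}

print_install_steps
```

Change ES_VERSION, run the script, and paste the printed lines into the shell once they look right.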

Hope it helps!