Get Started with Ampool ADS

Try Ampool ADS developer edition for FREE

Ampool Usage Patterns (Part 1): Near-App Analytics

January 9, 2018|by: Milind Bhandarkar

Wishing our blog readers a very happy new year 2018.

In the first blog series of this new year, we will outline three broad patterns in which Ampool Active Data Store (ADS) and In-Memory Platform is being used for real use cases by our customers and in pilots. In this post, we describe the first usage pattern for Ampool: Near-App Analytics.

Previous State

Modern applications need to provide hyper-personalized experiences to their end users, improving user engagement by presenting the most relevant information. To achieve this, the application needs to accurately model user behavior during interaction with the application and tailor the presented information to the user's activity, as opposed to applying a generic one-size-fits-all model. Previously, in data lake architectures, user activity was logged by the applications and stored in log files on application servers. Periodically, this collection of user activity logs was transported and ingested into a centralized data lake as a set of raw data files. Later, in batches, these raw data files were cleansed and denormalized by combining them with other reference datasets, and user activity sessions were created. Activity modelling algorithms using machine learning techniques were applied to these datasets over a sliding window of time (for example, the last 30 days), and the resulting model was evaluated against previously known, labelled user behavior to determine its efficacy. If the model was found to be effective and beneficial, it was uploaded to the application servers and applied the next time the user visited the application.
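To make the batch flow concrete, here is a minimal Python sketch of its last two stages: trimming events to the sliding window, and deciding whether a newly trained model should replace the one currently deployed. The function names, the event layout (a dict with a `ts` timestamp), and the use of plain accuracy as the efficacy measure are our illustrative assumptions; they are not part of any Ampool or data lake API.

```python
from datetime import datetime, timedelta

WINDOW_DAYS = 30  # the "last 30 days" example from the text


def events_in_window(events, now, days=WINDOW_DAYS):
    """Keep only events whose timestamp falls inside the trailing window."""
    cutoff = now - timedelta(days=days)
    return [e for e in events if cutoff <= e["ts"] <= now]


def accuracy(model, labelled):
    """Fraction of (features, label) pairs that the model predicts correctly."""
    if not labelled:
        return 0.0
    hits = sum(1 for features, label in labelled if model(features) == label)
    return hits / len(labelled)


def should_deploy(candidate, current, labelled):
    """Promote the candidate only if it beats the deployed model on labelled data."""
    return accuracy(candidate, labelled) > accuracy(current, labelled)
```

In this sketch a "model" is any callable that maps features to a predicted label. In the batch architecture described above, every one of these steps runs only after the logs have reached the data lake.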

Deficiencies

The user activity data is staged on “the edge”, i.e. the application servers, and goes through multiple transports, format conversions, batch ingestion, etc. before it finally lands in the centralized data lake and becomes available for analytics. As a result, the business value of the data, and the actionability of the insights that could be gained from analyzing it, diminish rapidly with time. In addition, because of the delay between user activity and the insights generated, the machine-learned user behavior models are often stale. Paul Maritz (Executive Chairman of Pivotal) succinctly described the core application of “Big Data” as the ability of “Companies & Organizations to catch people or things in the act of doing something and affect the outcome.” Clearly, feeding stale insights back to applications means losing the ability to catch users in the act of interacting with the application and to affect the outcome. The core problem that needs to be solved is to create “Real-Time, Personalized, Actionable Information, in Current Context”. This is where the Ampool In-Memory Platform comes into the picture.

“Companies need to learn how to catch people or things in the act of doing something and affect the outcome.”

-Paul Maritz, Executive Chairman, Pivotal

With Ampool

Near-App Analytics Dataflow

The block diagram above shows where the Ampool platform fits in Near-App Analytics, along with the data flow:

  1. Application emits data exhaust (user activity events) on a message queue (e.g. Apache Kafka)
  2. Ampool connector for message queue fetches, sessionizes, & denormalizes events in Ampool
  3. Real-time Analytics (model refinement, dashboards, anomaly detection) performed in Ampool
  4. Results of analytics (models, visualization, alerts) emitted on message queue
  5. Serving store is updated with results of analytics
  6. Application uses results of analytics
  7. Colder data persisted in data lake for historical analytics (e.g. large scale batch model training)
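Step 2 of the flow above (sessionizing and denormalizing events) can be sketched in a few lines of Python. This is only an in-memory illustration under our own assumptions: the event fields (`user_id`, `ts`, `action`), the 30-minute inactivity timeout, and the `user_profiles` reference data are hypothetical, and the code does not use the Ampool connector or any Kafka client API.

```python
from collections import defaultdict

SESSION_GAP_SECONDS = 30 * 60  # assumed inactivity timeout; not from the post


def sessionize(events, gap=SESSION_GAP_SECONDS):
    """Group events into per-user sessions.

    A new session starts whenever a user has been idle for more than `gap` seconds.
    """
    sessions = defaultdict(list)  # user_id -> list of sessions (lists of events)
    for event in sorted(events, key=lambda e: e["ts"]):
        user_sessions = sessions[event["user_id"]]
        if user_sessions and event["ts"] - user_sessions[-1][-1]["ts"] <= gap:
            user_sessions[-1].append(event)  # continue the current session
        else:
            user_sessions.append([event])  # idle too long (or first event)
    return dict(sessions)


def denormalize(event, user_profiles):
    """Attach reference data (a hypothetical user profile) to an event."""
    profile = user_profiles.get(event["user_id"], {})
    return {**event, **{f"user_{key}": value for key, value in profile.items()}}


events = [
    {"user_id": "u1", "ts": 0, "action": "view"},
    {"user_id": "u1", "ts": 600, "action": "click"},
    {"user_id": "u1", "ts": 4000, "action": "view"},
    {"user_id": "u2", "ts": 100, "action": "view"},
]
user_profiles = {"u1": {"segment": "premium"}}

sessions = sessionize(events)
enriched = [denormalize(e, user_profiles) for e in events]
```

Here user `u1` ends up with two sessions, because the third event arrives more than 30 minutes after the second. The point of the near-app pattern is that this grouping and enrichment happens as events arrive from the message queue, rather than in a later batch job on the data lake.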

Why Ampool

The following features of Ampool enable Near-App Analytics seamlessly:

In addition to the above-mentioned features, Ampool can be deployed in familiar application deployment frameworks, such as Docker containers, and can be orchestrated using modern platforms such as Kubernetes, thus reducing the need to learn new or unproven platforms.

If you are using Pivotal GemFire (or Apache Geode) for the caching needs of your applications, you will find Ampool ADS (Powered by Apache Geode) a natural fit for your near-app analytics needs.

When to consider Ampool for Near-App Analytics?

You should consider deploying Ampool for your near-app analytics needs, if:

If you are interested in exploring Ampool, write to us to schedule a demo.
