Case Study in End-to-End Self-Service Analytics in the Cloud

My client was a FinTech company that needed to add self-service analytics to its customer-facing product. You can do this quickly too. In this post I'll show how I focused on small MVPs and on demonstrating value early.

The Business Problem

A financial services company offers a business expense management tool to customers for free as part of a larger offering. The tool is quite standard for 2018: scan your receipt, categorize your transaction, submit it. Someone approves the expense, and the reimbursement process happens. From the back office the expenses can be downloaded in various formats for GL ingestion. These are commodity solutions that need some kind of competitive differentiator in 2018. There are a few ways to do this, but it's best to start simple:

  • build customizable analytics dashboards for the customer/user, in real time, with live data. Generally these tools offer simple queued reporting: you submit a request for a report and it is generated and emailed to you after some period of time, which makes real-time dashboarding quite difficult. THIS is something very few products in FinTech do, at least at scale for HUGE customers.
  • allow users to slice-and-dice data in real-time
  • apply “AI and ML” to the submitted expenses, in real time, looking for fraud, miscategorized expenses, etc.
    • if the receipt can be scanned and the expense form can be auto-completed with image recognition, that’s even better
  • allow one company to see how its expense metrics stack up against other companies in its geography or industry.

I preach this almost daily: the real Digital Transformation is the ability to monetize your existing data for new use cases you never thought of or have never attempted before. The easiest way to do this is to take your customer data, de-identify it, merge it with your other customers’ data in a data lake, and “sell” that data back to your customers (a sketch of this follows the list below). Customers LOVE to know how they stack up against their competitors along various demographic axes.

  • allow reports to be created faster for customers.
    • right now a customer requests a report from the professional services team; the request gets entered into a backlog, may finally be released two months later, and often needs to be reworked due to poor requirements gathering.
    • said differently, “improve cycle times by allowing better self-service data analytics tooling”
  • add NLP solutions for customers that may not be savvy with tools like Power BI or Tableau
  • most reports are simple CSV extracts with embedded SQL everywhere. Is there a way to build an evolutionary database schema, possibly using a semantic tier (SSAS)?
  • since this is finance data, there is a sensitivity to change. Only three product releases occur per year, which is far too slow.
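
To make the de-identified benchmarking idea above concrete, here is a minimal sketch of how it might look. This is my own illustration, not the client's code: the column names, salt handling, and aggregation keys are all assumptions.

```python
import hashlib
import pandas as pd

SALT = "rotate-me-regularly"  # hypothetical salt; in practice, pull from a secrets store


def deidentify(expenses: pd.DataFrame) -> pd.DataFrame:
    """Replace direct identifiers with salted hashes before data leaves the tenant."""
    out = expenses.copy()
    out["company_id"] = out["company_id"].apply(
        lambda x: hashlib.sha256(f"{SALT}{x}".encode()).hexdigest()[:16]
    )
    return out.drop(columns=["employee_name", "employee_email"], errors="ignore")


def industry_benchmarks(lake: pd.DataFrame) -> pd.DataFrame:
    """Aggregate de-identified expenses so one customer can compare against its peers."""
    return (
        lake.groupby(["industry", "geography", "expense_category"])["amount"]
            .agg(["median", "mean", "count"])
            .reset_index()
    )
```

The point is that the benchmark table only ever contains aggregates over hashed identifiers, so it can be sold back to customers without exposing any one company's raw expenses.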

The Minimum Viable Product

The customer wanted us to do a little “experiment” (read: proof of concept) to see how fast our consultancy could deliver value before they signed on for the long-term capital project. Based on the requirements above, I couldn't even fathom the effort to roll out everything they wanted. We needed to start small. But whatever we did needed to have that “competitive differentiator”.

Doing something like AI or ML, while sexy, likely wouldn't be a good first step, since it requires real-time data to be meaningful and my customer didn't have that yet. But let's face it: every customer wants to “infuse AI” into their product offerings.

After some deeper analysis we determined that our client hated when their customers downloaded raw data into CSVs and then did the analytics themselves in Excel spreadmarts. Why? The client had no visibility into what their customers were doing with the data and couldn't create value-added services to keep that data within their own tooling.

Keep the customer sticky.

So, we started here. We didn’t want to lose all of that telemetry that may tell us what our next product should look like. This melds nicely with our client’s desire to make the data even more valuable for their customers.

The Deliverable

Our customer was already using Kafka-like technologies. We built upon those to ingest the necessary data in real time and stand up a real-time expense portal (at least, enough for a PoC). This allowed us to display expenses at least 24 hours sooner than waiting for the source systems to batch-process the data.
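
The post doesn't name the exact broker or client library, so the sketch below assumes a Kafka topic of JSON expense events and the confluent-kafka Python client. It only illustrates the shape of the ingestion loop, not the client's actual pipeline; the broker address and topic name are hypothetical.

```python
import json
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "broker:9092",   # assumed broker address
    "group.id": "expense-portal",
    "auto.offset.reset": "latest",
})
consumer.subscribe(["expenses.submitted"])  # hypothetical topic name

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        expense = json.loads(msg.value())
        # Upsert into the portal's live serving store so dashboards reflect
        # the expense within seconds rather than after the nightly batch.
        print(expense["company_id"], expense["amount"], expense["category"])
finally:
    consumer.close()
```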

For the dashboarding we used a combination of Power BI Embedded, PBI-RS (Power BI Report Server) for on-premises paginated reports, and embedded analytics graphing with D3.js where it made sense. Most importantly, we wired up the portal so we could capture telemetry on what data users were downloading.
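
Here is a minimal sketch of what that download-telemetry wiring could look like, assuming a Flask backend for the portal. The endpoint path, field names, and the emit_event helper are illustrative stand-ins, not the actual implementation.

```python
from datetime import datetime, timezone
from flask import Flask, jsonify, request

app = Flask(__name__)


def emit_event(event: dict) -> None:
    # Stand-in for publishing to the same streaming backbone used for ingestion.
    print("telemetry:", event)


@app.route("/api/exports/csv", methods=["POST"])
def export_csv():
    payload = request.get_json()
    emit_event({
        "type": "csv_export",
        "user_id": payload.get("user_id"),
        "report": payload.get("report"),
        "filters": payload.get("filters", {}),
        "ts": datetime.now(timezone.utc).isoformat(),
    })
    # ...generate and return the CSV as before. The telemetry tells us which
    # extracts are popular and hints at what the next in-portal report should be.
    return jsonify({"status": "queued"})
```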

This was a 240 man-hour project to prove the concept: that we could capture data in real time, dashboard it, and provide telemetry showing how customers were using the data they downloaded. Within the project timeframe we also identified some interesting use cases where we could apply AI and ML in the future.

Case Study Summary

This project was meant to prove that “we know this stuff” and win the long-term analytics project. We were successful and began scoping the follow-on MVPs to solve some of the business challenges noted above. We also laid the groundwork for a streaming data processing engine that serves real-time dashboarding, telemetry ingestion, and future AI/ML pipelines.

After talking with the project stakeholders we learned of an ancillary benefit we hadn't realized we introduced. We showed our client that adopting a DevOps mentality meant they could iterate faster, yet still safely, compared to what they were accustomed to. They built a Center of Excellence around Site Reliability Engineering to infuse this culture into other departments and business initiatives.

Engage with me for your next assignment ›


Thanks for reading. If you found this interesting please subscribe to my blog.