How to run a Discovery Workshop

In this post I'll show you what a Discovery Workshop is and why it is so valuable for my clients.

Clients ask what they are getting when I propose a Discovery Workshop. All data projects, even those managed internally, should start with basic Discovery. In this post we’ll pull back the curtain and show you why these workshops are so successful. These are NOT the typical Discovery Sessions you may be accustomed to.

What is a Discovery Workshop?

I almost never, NEVER, engage with a client for a long-term data project without doing a Discovery Workshop. These engagements are short (2 weeks to 2 months) and give us the chance to really dive in and understand your problem. Let’s be honest, most consultancies will listen to your problem on a single hour-long conference call (maybe 2) and will propose a rigidly scoped, fixed-fee Statement of Work (SOW) for a 6-month project involving Project Managers, Business Analysts, and an army of ETL coders. That’s asinine. There is no way I can scope a project after a couple of calls.

In the past, when I’ve been the customer sitting on the other side of the table, I immediately toss any proposal created like this right into the trash. It’s not thoughtful. It’s arrogant. It has no possibility of succeeding.

“Discovery sounds like a waste of time with no deliverables.”

I know. I’ve tried using terms other than Discovery for this. I wish I had a better answer. I sometimes call these Envisioning Sessions, Project Kickoffs, Discovery Sessions, and Scoping Workshops.

But the goals are always the same:

  • 2 weeks (even less is fine) to 2 months (if the project is HUGE)
  • We learn about you, our client
  • You learn about us, your trusted advisor
  • We learn about the pressing data problem that you are hoping we can solve
  • We give you some suggestions based on our past experiences and what we hear
    • You tell us how stoopid we are and how little we’ve heard
    • We learn
  • We do small-scale experiments to prove to you that we actually do know this stuff
    • You tell us how stoopid we are and why our ideas won’t work
    • We learn
  • We iterate on the process and always REVIEW with you.
  • We provide a roadmap for the future state of whatever is your problem. The roadmap includes an executive summary, a project timeline, a cost rough order of magnitude, risks, and a proposal to do the actual work.
    • You tell us how stoopid we are and why this won’t work
    • We review and iterate on the plan with you so you feel confident in the proposal

I love doing experiments with customers. This is where we show you how easy our methods are. But often customers need something more bespoke for their use case and evolving needs. Other consultancies call these Proofs-of-Concept (PoC). I hate that term. It connotes getting to a decision and stopping your work. You proved something. Great. But that’s not how the data world works. We are constantly adjusting our approach based on our learnings. If we find ourselves in a hole, we stop digging. Said differently, we do small-scale experiments and iterate quickly.

“I can’t have my staff spend 2 weeks locked in a conference room with you scoping a project”

Exactly. And that’s not what we propose. Here’s the process:

  • We have 2 workshops per day, max
  • Each workshop is two hours.
    • a morning session is usually 10am-Noon.
    • an afternoon session is usually 2-4pm.
    • We want to be respectful of your staff’s other responsibilities
    • If we keep the sessions short with tight scope we can hold your attention
    • If your staff is checking email or playing Candy Crush on their phone, we’ve failed
  • Each session is tightly scoped
    • We provide handouts and homework to help guide the conversations
  • We provide the scope for each session and a recommended attendee list (RACI chart) in advance
  • We hold Review Sessions where we verbalize what we’ve heard so we can confirm assumptions
    • the Review Session is usually a day after the initial Discovery Session
    • in many cases customers will come to these review sessions and change their requirements or scope based on thoughtful analysis that has come from the initial session
  • We iterate over as many sessions as it takes.
    • We don’t hold needless sessions to pad our hours
    • We don’t need to have workshops every day.

We like to send two consultants to these Workshops. It’s easier if we have a facilitator/whiteboard-er and a notetaker.

The first 4 sessions (2 days) are very scripted for us. We need a baseline understanding of your need. After that we need to do just-in-time planning based on what we discover. Every engagement is different.

Between the Morning Session and Afternoon Session we are frantically transcribing our notes into documentation, turning whiteboard sketches into Visio diagrams, discussing what we heard, building our “experiments”, and preparing for the Review Sessions.

Let’s get specific:

Deep Dive: The Kickoff Meeting

You should assemble your entire team for this meeting. All stakeholders, business and technical. We want to hear the project perspectives from all parties. And we want to hear the bickering.

We will provide handouts and homework but here is the rough 2 hour agenda for most engagements:

  • Why are we here?
  • introductions
    • names
    • responsibilities
    • expectations (we generate a RACI out of this)
  • team roles
  • Key objectives
    • what are the decisions you are trying to make?
    • what are the types of questions you need to answer in support of those decisions?
    • sample data, reports, dashboards, and spreadsheets that you can share
  • what are potential data sources we should think about?
  • how do we measure success?
    • What are the KPIs and metrics?
    • We believe in Lean, therefore, what is the Minimum Viable Product (MVP)?
  • frank and honest conversations about organizational politics
    • who is the project sponsor?
    • has this project been attempted before?
    • are there any groups/teams that are hostile to the project?
    • are there things that are specifically out of scope?
    • are there technologies or ideas that are strictly verboten?

Deep Dive: The Business Stakeholder Meeting(s)

This is usually the second workshop. This workshop is for the Line of Business (LoB) stakeholders only. We focus on the problem being solved, without having the IT team present.

We call this the WHATs Meeting. We want to learn WHAT the business problem is, and we don’t yet interject HOW we would solve it. We LISTEN.

This particular meeting has a dynamic agenda. Sometimes customers are very savvy and already have an RFP with defined scope. In those cases we just need to “fill in the blanks”. In other cases the LoB has a business idea but needs help with execution and with eliminating the cognitive dissonance with IT.

We provide questionnaires in advance to help the business stakeholders think about what they want. We can hold as many or as few of these Business Stakeholder Workshops as we need. Possible topics include:

  • Is existing data quality sufficient?
  • What data sources are you currently missing?
  • Do you have staff that understands data discovery and SQL?
    • Would you like assistance to understand how to do Self-Service Analytics?
  • Do you have data scientists? Do they report to IT or to the LoB? How can they be more effective?

The Biggest Challenge We See: Getting Business Stakeholders to Envision the Art of the Possible

We often focus a significant amount of time on this. We call this Envisioning. Many times the LoBs see solutions solely through the lens of the under-utilized data they have in their data ecosystem today. This is a mistake. The reason we are here is to ask the question: “If you had access to all of the data you need, what questions would you ask?” There is a wealth of external data sources available that can disrupt your business. Here is a simple example:

  • How does weather impact your business? If I told you I could get you forecasted weather and historical weather, would that help you make better business decisions?
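
To make the weather question concrete, here is a minimal Python sketch of the kind of small-scale experiment it can lead to. It lines up daily sales with historical rainfall by date and compares average sales on wet days versus dry days. Everything here is an assumption for illustration: the sales figures, the rainfall readings, the 1 mm threshold, and the names DAILY_SALES, HISTORICAL_WEATHER, and sales_by_rain are all made up. A real experiment would use your own sales data and an actual weather source.

```python
from statistics import mean

# Made-up daily sales totals, keyed by date (illustrative only).
DAILY_SALES = {
    "2024-06-01": 1200,
    "2024-06-02": 900,
    "2024-06-03": 1100,
    "2024-06-04": 800,
    "2024-06-05": 1300,  # no weather reading for this day
}

# Made-up historical rainfall in millimetres, keyed by date (illustrative only).
HISTORICAL_WEATHER = {
    "2024-06-01": {"rain_mm": 0.0},
    "2024-06-02": {"rain_mm": 12.5},
    "2024-06-03": {"rain_mm": 0.2},
    "2024-06-04": {"rain_mm": 8.0},
}


def sales_by_rain(sales, weather, rain_threshold_mm=1.0):
    """Compare average sales on wet days versus dry days.

    Days without a weather reading are skipped rather than guessed.
    """
    wet, dry = [], []
    for day, amount in sales.items():
        reading = weather.get(day)
        if reading is None:
            continue
        if reading["rain_mm"] >= rain_threshold_mm:
            wet.append(amount)
        else:
            dry.append(amount)
    return {
        "wet_avg": mean(wet) if wet else None,
        "dry_avg": mean(dry) if dry else None,
    }


if __name__ == "__main__":
    print(sales_by_rain(DAILY_SALES, HISTORICAL_WEATHER))
```

A gap between the two averages proves nothing on its own, but it is a cheap way to decide whether weather deserves a closer look in the next iteration.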

Start small but think BIG.

BTW, weather is the Number One requested data element that forward-thinking companies want to leverage.

Deep Dive: The IT Stakeholder Meeting(s)

This is usually the third Workshop. This meeting is just the IT team and we focus on what the IT team views as its most important concerns. Here are some sample topics we like to cover:

  • how does IT handle data projects today?
  • the IT landscape
    • naming conventions
    • software and frameworks in use
    • source control, code review policies, and general SDLC processes
  • the propensity for change in the organization
  • what are the specific challenges with this project?
  • who will ultimately support this product? What are the concerns?
  • who are our dedicated IT SMEs for this project?
    • we need to ensure these resources have the time dedicated to helping us

These conversations guide us in determining how to succeed at our project. We won’t propose Hadoop if the IT department is not comfortable with that technology.

We may have multiple IT Stakeholder meetings, sometimes very abbreviated, as we dive deeper into the problem space. IT must be on-board with any solutions.

Deep Dive: Session 3+: The Envisioning Review Session(s)

This is usually our next workshop and we gather both the LoB and IT stakeholders together. We review what we learned from each team. Sometimes these meetings get heated. That’s OK.

After the review we drive to organizational alignment to fuel creative thinking about where and how you can leverage data and analytics to power Digital Disruption.

This is the start of us iterating on SOLUTIONS with YOU. Many organizations don’t truly know how to analyze the data they already collect or how to identify the data they need to start collecting. There are easy solutions to these problems and we’ve written about these before.

We present these solutions as Reference Architectures. Some practitioners think RefArch’s are merely Visio diagrams. That’s wrong. A Visio diagram is a good way to visualize a solution, but our RefArch’s are actually working code that we can deploy quickly (especially if you use the public cloud) to do mundane tasks that traditionally take WEEKS…like ingesting new data sources for an important project.

Customer Responsibilities During Discovery Workshops

Clients get as much value out of Discovery as they put into it.

We are very mindful of your stakeholders’ day-to-day tasks. We want to make these sessions valuable. We need:

  • a conference room with a whiteboard and a projector
  • a printer for handouts and documentation
  • good attendance at the sessions. Again, we will provide scope and recommended attendee lists in advance
  • wifi
  • if possible, some sample data sets or access to data sources
    • if you use the public cloud this is much easier. We just need access to a dev subscription or a Resource Group with Contributor RBAC access.

We are happy to do working lunches and dinners. These sessions are best done on-site, at least initially, but can be remote. When we are on-site we want to maximize our interaction time. Travel is expensive. The more we can do face-to-face, the better. Even if that means LONG days.

Why Our Discovery Workshops Are a Competitive Differentiator

If you’ve read this far, now we are going to divulge our secret sauce. This is why our Discovery Workshops actually work.

We work Top-Down instead of Bottom-Up

Almost every IT project that fails takes a bottom-up approach. The team starts with only a vague understanding of the true requirements. Then they start looking for data that fits their hypothesis. We call this data dredging and it should be avoided. Then they model the data and spend weeks massaging the data into the model. They build dashboards and paginated reports. After a few months they have something ready to show the business stakeholders… who invariably say the solution was NOT what they initially asked for.

Your project just failed.

Notice what happened there? The focus was on OUTPUTS. The data project was always concerned with the output of each project phase. Each OUTPUT moves toward a pre-conceived notional solution (a dashboard) that may not be what the LoB really wants. A dashboard is usually NOT what a business person wants. A business person wants to know “What do I do next?” That’s hard to express in a dashboard. Power BI is not the answer to every business problem.

Successful people focus on INPUTS, failures focus on OUTPUTS

There’s a simple reason why IT projects tend to be structured as Bottom-Up. It’s easy to model these projects on a Gantt chart (project plan). “Step 1 leads to Step 2 leads to … a finished product”. It’s all very serial (we call this waterfall). We then spend inordinate amounts of time “working to the plan” instead of “doing what’s right”. Don’t fall into this trap.

The Project Manager has as much influence on the success of the project as a weatherman has influence on the weather.

That may sound harsh, but it’s true. No one likes to be behind on their project, or over-budget, but it happens. The real metrics are:

  • are we moving in the right direction?
  • are we moving quickly?

That, by the way, is the definition of velocity. It’s hard to design a project plan to measure velocity.

The Data Science Approach: Top-Down

Sometimes it’s better to work backwards and think like a data scientist. Here’s how a typical Top Down project looks, in chronological order:

  • identify potential business use cases and recommendations that the organization could deliver to its “customers”
  • identify data, even if it doesn’t exist today, that supports the business case
  • what are the analytic requirements to make it a reality?
    • do we need data scientists?
    • Do we need to train our existing analysts so they can analyze their own data?
    • Do we need a dashboard solution, paginated reports, or merely a strategy statement that is supported by our data?

That’s it. There are obviously missing steps here like “how do we acquire the data and model it” but those questions are resolved in later iterations. Often, when you work Top Down you are blinded to the lower level steps initially. That’s OK. We can learn those as we move along. Contrast this with Bottom-Up where we know what the OUTPUT of every step is, but fail to see the Big Picture.

Said differently, this is the scientific method. We like to use deductive reasoning vs inductive reasoning. Our Discovery Workshops are geared this way.

More Secret Sauce: The Monetization Exercise

We are entrepreneurial data scientists at heart. We know that the most successful projects are the most profitable projects. So we start by thinking about how we can disrupt your ideas and monetize them. We seek to understand your organization’s products and services and how they are used today. Then we identify how we can take that data and identify new monetization opportunities. I’ve written many case studies on how we’ve done this. But here’s a quick story:

We did work for a company that provides TV schedule data for cable operators (known in the industry as MSOs). Traditionally this data was hand-curated by an army of Mechanical Turks. Think about how many local TV channels there are throughout just the US, with local programming 24x7, that is presented to you (hopefully accurately) on your set-top DVR. It’s a lot of important data. When Smart TVs were introduced, a major manufacturer asked my client to develop an API to stream the “guide data” directly to the units. They also wanted us to capture metrics around how the consumer interacted with the Smart TV, again using APIs we would develop.

Think about that for a second. In the past my client only PUSHED data OUT. They focused on OUTPUTS. Now, suddenly, we had the ability to RECEIVE valuable telemetry that could be used as an INPUT to… anything. It wasn’t long before stakeholders realized that this goldmine should be monetized.

The first try at the project FAILED. Why? The project team wanted to model the incoming data into a proper star schema. Then they needed to ETL the data into that format. Then dashboards, queries, and reports were devised to sell the data to advertisers. Did you notice what happened? It was BOTTOM-UP. The team focused on OUTPUTS instead of truly understanding what advertisers would buy.

So somebody asked me, “What reports should we sell to Madison Avenue?” My response: “I dunno, I’m not Madison Avenue. But I know advertising is data-driven. They are always looking for new DATA. Just give them the raw DATA (i.e., focus on the INPUTS) and let them figure out the OUTPUTS they need.”

How did I succeed at this project? Simple. The data is the product, not the dashboards. We simply landed the data into a well-defined data lake and sold monthly subscription access to Madison Avenue for a HUGE fee. We let the consumer have (controlled) access to the data and we watched them to determine what data and ideas they actually had. This became the INPUTS to future offerings. The entire project was delivered in a month. The Smart TV manufacturer now makes more money off of recurring revenue subscriptions to the data lake than they do from the one-time sale of the hardware.

That is a TOP DOWN solution that focuses on monetization. When a project is conceived in terms of monetization, it will almost always succeed.

That is Digital Disruption. When you think about Digital Transformation, think about that story.

The monetization exercise works by understanding product usage patterns and customer behaviors associated with your current business. The process then seeks to identify “what to do next” when creating complementary offerings that can be packaged and delivered, profitably.

Discovery Workshop Feedback: What customers tell us

“You got us to think like a data scientist.”

We are data scientists at heart. We run small-scale experiments, iterate, and try to learn from failure. And we do fail. Our customers like this and tell us this is a culture change for them. In some cases clients have brought us back in after a few months or years to help them when they regress into old behaviors.

And admittedly, we sometimes catch ourselves doing it the Old School Way. This is a journey, not a destination.

It’s not important to get it right the first time. It’s vitally important to get it right the last time.

“We love how you do everything with scores”

We do a lot of scoring during our Discovery Workshops.

Data scientists love to quantify things. We find that simple scoring mechanisms are great ways to evaluate “what to do next”. We score everything, usually on a range of 1-5. This allows us to prioritize initiatives. It’s surprising how easy it is to build consensus when scores are tallied. Numbers don’t lie (usually).
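
Here is a rough sketch of how such a tally might look in Python. The initiative names, the two scoring dimensions (value and ease), and every number are made up for illustration. It is not a description of any particular worksheet, just one simple way to total 1-5 scores from several stakeholders and rank the results.

```python
from statistics import mean

# Made-up example: three stakeholders score each initiative from 1 to 5
# on business value and on ease of delivery (both dimensions are assumptions).
SCORES = {
    "Customer 360": {"value": [5, 4, 5], "ease": [2, 3, 2]},
    "Weather-adjusted forecasting": {"value": [4, 4, 5], "ease": [4, 4, 4]},
    "Self-service analytics training": {"value": [3, 3, 4], "ease": [5, 4, 5]},
}


def tally(scores):
    """Return (initiative, avg_value, avg_ease, total) rows, highest total first."""
    rows = []
    for name, dims in scores.items():
        for score in dims["value"] + dims["ease"]:
            if not 1 <= score <= 5:
                raise ValueError(f"{name}: score {score} is outside the 1-5 range")
        avg_value = mean(dims["value"])
        avg_ease = mean(dims["ease"])
        rows.append(
            (name, round(avg_value, 2), round(avg_ease, 2), round(avg_value + avg_ease, 2))
        )
    # Sort by the combined score so the top priority comes first.
    return sorted(rows, key=lambda row: row[3], reverse=True)


if __name__ == "__main__":
    for name, avg_value, avg_ease, total in tally(SCORES):
        print(f"{total:>5}  {name} (value {avg_value}, ease {avg_ease})")
```

Adding the two averages is only one option. Weighting value more heavily than ease, or multiplying the two, would be just as easy to try once the group sees the first ranking.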

Common Objections to Discovery Workshops

“You said sometimes you do 2 weeks and sometimes 2 months. That sounds like you are padding your time. Let’s do this in 2 days.”

We’ve done 2 hour discovery sessions, 2 day discovery workshops, 2 week versions, and 2 month versions. There are some good reasons for the variance:

  • some projects are HUGE and really do need 2 months (but these are rare and we do NOT advocate a lengthy Discovery Phase)
  • 2 weeks is about the best. At the end of 2 weeks you should know if you want to fire us and, likewise, whether we are the right fit for your business problem.

The fact is, the longer the Discovery Phase, the more value you will likely get, to a point. We just do more experiments and iterations that will get you even closer to a finished product during any remaining time.

Any more than 2 weeks, for most projects, and I find we bump up against The Law of Diminishing Marginal Utility. Conversations begin to devolve into “why this project can’t succeed” or worse, conversations around HOW we’ve done the project in the past and WHY it failed. This isn’t productive. For truly HUGE projects we think it’s better to focus on a smaller MVP and therefore shrink the Discovery Phase.

Let’s be honest: after about 2 days we know the most important facets of your business problem. We spend any remaining time actually developing the solution. We don’t burn hours reformatting the Microsoft Word template we use for the final documentation. Ugh.

Ultimately, the duration of the Discovery Phase is your decision. Our goal is not to milk the engagement and burn hours. Our goal is to get you comfortable with us and win the business for any follow-on work (if we are a good fit). This is a great economy with lots of data projects. We are smart technologists that love solving problems. We want to waste our time even less than we want to waste YOUR time.

“Discovery Sessions sound waterfall. We are agile and won’t waste time on huge up-front requirements gathering exercises”

We agree. The goal of Discovery is not to gather all requirements and document them (although we can do that if it makes sense). The goal is to learn enough about your problem to design a thoughtful, honest project plan with mitigated risk that you are comfortable with.

Remember, we are also writing code and doing “experiments” with you. That’s not waterfall, that’s agile (it’s actually lean and kanban). We develop quickly and fail fast.

“I’m not going to pay you to learn my business, I already have a scope document and RFP. Just do the work as we outlined it.”

We hear this a lot. And we understand. But doing Discovery is really in your best interest. Even a well-written RFP will be interpreted incorrectly in most cases.

Think of Discovery as risk mitigation. If you engage with us for a month-long Discovery Engagement and you determine we aren’t a fit, you haven’t wasted much money or time. If you go with a 6-month project based on an RFP response, what will you do if the project is still at-risk after a year? You can certainly seek legal recourse, but your project is already delayed and failing.

To be honest, we don’t need more clients, we need more time

We employ smart technologists to solve thorny problems. And they love doing it. We don’t want to send our talent into organizations that we can’t evaluate during Discovery. Whenever we’ve done this in the past our projects have become Death Marches and our people have moved on to greener pastures.

“A Discovery Engagement will never get through our procurement department. We need defined deliverables”

This isn’t a problem. Our Discovery Engagement SOWs always minimally include:

  • a roadmap architecture document with an executive summary and project plan
  • a SOW with ROM costing estimates for any additional work (if needed)
    • you can always use this as a basis for an RFP to farm out the work to others. We don’t mind this.
  • working code from our collaborative experiments
    • We can tightly scope this with the necessary verbiage you want. Just let us know
  • anything else you need


In our experience, clients like to buy PROJECTS and not PRODUCTS.

…but not always. If you have a great idea for a new offering and a vendor tries to sell you their solution, be wary. The fact is, to truly monetize an idea requires a bespoke solution with a competitive differentiator. When you buy a PRODUCT you are most likely buying an antiquated approach with way too much vendor lock-in.

For example: we love to build Customer 360 solutions and have written about them extensively. CRM vendors will tell you to just use their tools. It never works. Clients end up spending too much time modeling the data to fit into the CRM system’s schema. And the result? You have the same solution as everyone else does. And the vendor has even more recurring revenue from the new data you are paying them to store for you.

Our solutions use Reference Architectures: solution frameworks with working code that is (hopefully) easily adaptable to your unique needs. Our Discovery Workshops are how we demonstrate and iterate on those solutions with you. We sometimes advocate using a COTS CRM solution. That’s OK, if it fits your needs.

Data acquisition is the biggest reason viable data projects fail.

“It’s hard to ingest new data sources.” That’s what we hear. The fact is, it isn’t hard. We follow Kappa Architecture principles and we can show you how we can ingest new data sources in HOURS and DAYS. With the data acquisition problem solved the risk profile of your project changes dramatically.

Half of the money you spend on data is wasted; the problem is no one knows which half.

–Heard during a Discovery Workshop, with our apologies to John Wanamaker.

Thanks for reading. If you found this interesting please subscribe to my blog.