I can’t provide you with a fixed scope, fixed fee project. I mean, I could, but it would be a complete guess. And, I would need to pad my estimates to ensure I can achieve the scope and get paid. And trust me, the scope will be written such that I’ll always be able to deliver and get paid. But there are so many unknowns right now (for instance, your marketing team can’t tell me exactly what the formula should be) that I can’t give a thoughtful estimate. It feels like I should be able to do this in 2 weeks but I can’t possibly estimate for such a short project and still make a profit. This is why the minimum estimate you received from the other bidders is 3 months.
“Analytics work expands so as to fill the time available for its completion”. This is my version of Parkinson’s Law. This means that if you give me 3 months to do the project it will take 3 months. If you only give me 2 weeks I’ll likely still get you the same business outcome.
I can tell you through experience that if I provide you with any estimate, whether it’s a week or a year, my team will end up spending a lot of time doing tasks that truly aren’t adding value or getting you closer to answering the question. We’ll spend a lot of time creating CI/CD processes, building ETL automation, and creating unit tests…we’ll only dedicate the final 5% of time to analyzing the data. Invariably my team would say, “if we only had more time, we ran out of budget.” I hated doing that, which is why I won’t do it anymore.
Hofstadter’s law: “It always takes longer than you expect, even when you take into account Hofstadter’s Law.”
But ETL, star schemas, unit tests, and automation…none of those things get you closer to an answer to your simple question. In fact, for this project, I’m not sure what data I would put into a data warehouse or what the dashboard would even look like. Maybe something like this?
You want a simple YES/NO answer…but really you want to know what the ROI would be for an Instagram campaign given existing campaigns. ROI is a simple calculation. Do you really need the numerics in a fact table? Do you even need dashboards? Probably not. What you really need is a simple visualization where we can turn the knobs on the input variables and see the effect on ROI. I can do this with Python and Jupyter notebooks in a couple of minutes.
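Here is a minimal sketch of what "turning the knobs" could look like in a notebook cell. The input names (`spend`, `cost_per_click`, `conversion_rate`, `avg_order_value`) and every number are hypothetical, chosen only for illustration. The real inputs are exactly what we would need to work out together with your marketing team.

```python
def campaign_roi(spend, cost_per_click, conversion_rate, avg_order_value):
    """Return ROI = (revenue - spend) / spend for one campaign.

    All four inputs are hypothetical knobs, not an agreed-upon formula.
    """
    if spend <= 0 or cost_per_click <= 0:
        raise ValueError("spend and cost_per_click must be positive")
    clicks = spend / cost_per_click
    customers = clicks * conversion_rate
    revenue = customers * avg_order_value
    return (revenue - spend) / spend


def sweep(knob, values, **baseline):
    """Vary one knob while holding the other baseline inputs fixed."""
    return [(v, campaign_roi(**{**baseline, knob: v})) for v in values]


# Assumed baseline for illustration only.
baseline = dict(spend=1000, cost_per_click=2.0,
                conversion_rate=0.05, avg_order_value=60)

for rate, roi in sweep("conversion_rate", [0.02, 0.05, 0.08], **baseline):
    print(f"conversion_rate={rate:.2f} -> ROI={roi:+.0%}")
```

In a working session, the printed table would usually be swapped for a chart or sliders, but the idea is the same: change one input, watch ROI move, and argue about whether the input belongs in the formula at all.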
You ask me: “I’ve heard nothing like this from our other proposals. I’m interested in hiring you. So, I know you won’t give me an estimate…but I really REALLY need one.”
I can’t give you that. There are so many input variables that we _could_ use. I need to sit with your marketing analysts and maybe even your CMO, and together we need to look at those input variables until we collectively come up with the right set of inputs that everyone can agree on. In my past projects this can sometimes take a while and get contentious. Everyone will want to add their opinions and inputs into the formulas. We won’t just talk about these formulas, we will actually look at the formulas using real data. We’ll tune the knobs together. We’ll have sessions where we learn about the data together.
Sometimes leaders will want to ingest more and more data, trying to add lift and get a better outcome…but that isn’t always needed. More data doesn’t always equal a better answer. So I’ll help your marketers understand where the cutoff is.
“Don’t let the perfect be the enemy of the good enough.” Getting an answer faster might be better than continuing to add more features.
Your marketing team also says they want a simple YES or NO answer. That’s likely not what they want. They’ll likely want a range of numbers where the Instagram campaign is showing positive ROI. But again, this is all very subjective. How will your marketing team know they didn’t miss an important variable in the formula? They will likely want to try a small A/B test to see if Instagram campaigns are bringing in new customers and the effect on CAC (customer acquisition costs). I can help them with DoE (Design of Experiments) and measurements. In my experience they’ll likely learn that the answer is gray. They may find that, in general, an Instagram campaign doesn’t provide a positive ROI given existing campaigns, but they may determine there is a segment of customers where it WILL provide lift. We will learn that together.
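To make the A/B test idea concrete, here is a rough sketch of how such a result could be read. It uses a standard pooled two-proportion z-test, which is my choice for illustration and only one of several reasonable ways to analyze the experiment. The function names and all figures are hypothetical.

```python
import math


def cac(spend, new_customers):
    """Customer acquisition cost: spend divided by customers acquired."""
    if new_customers <= 0:
        raise ValueError("new_customers must be positive")
    return spend / new_customers


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value).

    Group A is the control (no Instagram), group B sees the campaign.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in outcomes; test is undefined")
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


# Made-up numbers: 1,000 prospects per group.
z, p = two_proportion_z_test(conv_a=50, n_a=1000, conv_b=80, n_b=1000)
print(f"z={z:.2f}, p={p:.4f}")
print(f"CAC with campaign: ${cac(spend=4000, new_customers=80):.2f}")
```

A small p-value only says the campaign probably moved conversions. Whether the lift is worth the CAC, and for which customer segment, is still the gray area we would work through together.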
But the real issue is we will NEVER know the formulas or data inputs upfront. This can’t be learned from a static requirements document. We will only learn these as we experiment with the data and talk through ideas together, while turning the knobs on the formulas. We need to do some deductive reasoning and hypothesis-building together. These working sessions may be multiple hours for multiple days. I won’t monopolize the marketing team’s schedule, but they MUST realize that if they want to see success, they must take a vested interest in this project. This isn’t a typical IT project. Once I have enough information from your team, I will work independently until I’m ready to ask more questions or present my rapid prototype.
What I just described is what data science really is. This is how successful data science, data, and analytics projects are structured. These projects are NOT structured like typical IT projects and “data science” is NOT always about building predictive ML models.
You: “Wow, this is different from how we run IT projects. We always start with well documented requirements documents. Without those, how do you get started?”
What we’ll do is start the project with a 1 or 2 day Design Thinking session. These are wildly fun and are geared that way to ensure attendees recognize this is a “safe space” and they can express their ideas without judgment. We definitely need the marketing team but I suggest you send some IT folks so they can listen in and help us understand our data needs. I structure these sessions so that we all agree on the problem as well as various ways to solve the problem. By the end we sometimes even build a rapid prototype so attendees can visualize what we are doing. For a project like this what we are really doing is getting an initial estimate of the data we want to use, whether we have that data already in our analytics sandbox (data lake), and what everyone initially thinks the ROI formula should be.
You ask: “So, you can give me an estimate after that Design Thinking session?”
Maybe. We’ll definitely know a lot more about the project and any estimates I would give would be at least a little more accurate and thoughtful. But there will still be a lot of variability.
Here’s what I recommend: I’ll give you a thoughtful estimate after that DT session and if you decide to proceed I’ll begin work. I’ll allow you to back out of the contract if you think my approach won’t work. Then, at the end of every 2 weeks we’ll have a retrospective. I’ll show you and the marketing team what I’ve done and what I think still needs to be done…and, here’s the important part…I’ll let you know if I want to continue working with you. And, please, you tell me if you want me to continue working for you. I may determine that I don’t like working for you and your team, I don’t like your expectations, or maybe I don’t think I can ever provide value to you and answer your business question. Maybe we can’t get the data we need. I became a consultant because I don’t want to punch a clock anymore. Conversely, you may decide I’m a horrible data scientist or you may think I’m milking this cash cow engagement. Honestly, the biggest reason my clients end their relationship with me is I solved their business problem. The second biggest reason is we both agreed that maybe I can’t solve this business problem. If so, you can end the relationship. This incents me to show value quickly. This is why I structure these engagements as “open-ended staff aug” and not fixed fee/fixed scope.
You: “That seems like a lot of risk for you.”
It is. At any point you could end the engagement and now I have to find another client. So, I’m going to charge you a 25% premium over what those fixed fee, fixed scope consultancies will charge you. Listen, I’m not concerned about finding more clients or having my utilization rate above 80%. I know I’m a good analyst and my customers are always happy. I just want to consult for companies I admire, working with people I respect, doing interesting things with data that I love. Good work will always find me, regardless of economic conditions. Does that sound like someone you can trust?
Dave’s 4th Law of Consulting: To be able to say YES to yourself and your consulting business, you have to be able to say NO to any of your clients.
There’s another reason consultancies avoid “staff aug” engagements. Their employees hate them. A really good data scientist working on a long term contract has concerns about her future. Will she be pigeon-holed into working for the same company for years just because it is a profitable, lucrative client? Will she be stuck working for a manager she despises? Will her skills atrophy? Furthermore, isn’t staff aug for menial IT tasks, not for important data science work? We need our consultants to understand that at any point they can say NO to continuing a contract, regardless of how profitable and lucrative it is. We want our consultants to know they are in control. We value their skills and time and we understand their concerns.
I’m going to be honest. I left a lucrative role as the Chief Technology Officer for a consultancy because I was pigeon-holed working part-time on an engagement I hated. It was an extremely profitable engagement and I was offered bonuses and additional perks. I still quit.
You: “I understand, when can you start?”
Dave’s 5th Law of Consulting: How can you tell an experienced consultant from a new consultant? The new consultant says, “I need more clients”. The experienced consultant says, “I need more time.”
The above allegory is how all prescriptive analytics projects should be run, regardless of whether you outsource. These same Lean Principles can be used to run your internal analytics projects too. Prescriptive analytics is using potentially lots of data sources to drive strategic decision-making while answering broad questions like “what do we do next?”. The answers are not always quantifiably provable. There are often a bunch of qualitative features that need to be taken into consideration. Data scientists have tricks to turn these subjective attributes into something measurable that can be plugged into a formula. Prescriptive analytics is much harder than predictive analytics (building ML algorithms to forecast the future). Prescriptive analytics is where data scientists shine. We have the know-how to ask the right questions, perform deductive reasoning, design experiments, and present data in a compelling, “guided analytics” and storytelling approach such that executives can make data-driven decisions.
We aren’t building an e-commerce website or building clearly defined tabular reports (descriptive analytics). These are examples of IT projects that have a fixed scope with little risk and known outcomes. These projects can be run as standard scrum/agile/Kanban projects. Prescriptive analytics (and data science) projects don’t always have a known end-state. We have to run a lot of experiments to determine what to do next. These are risky projects and need to be run to control risk. This means they need to follow a more fail-fast, iterative, Lean paradigm.
Consultancies don’t like to run projects this way. It’s difficult to forecast headcount and revenue. Instead they will focus on fixed scope that has a lot of fluff and weasel words so they can always claim success at the end of the engagement. And you may be happy, but you are usually underwhelmed with the output. Instead, find a consultancy that will run a true iterative, Lean analytics project. Then pay them a little more for the risk they are taking. Design short “experimentation” sprints and listen carefully at each retrospective. Evaluate what they are saying:
If you are going to outsource, consider outsourcing the easy stuff that is easy to scope.
Never outsource your core competency. Only you know how to build the formulas that will answer the prescriptive analytics questions you have, like “should I invest in another marketing campaign?” You can’t outsource that knowledge to a data consultancy.
Know-How pays much less than Know-When, Know-What, or Know-Why
I no longer do independent consulting as a data scientist. I work for the Microsoft Technology Center as a Decision Architect, consulting with our customers solving difficult problems with data in short rapid prototype sprints (usually 3 days). I hate to admit this but a common statement I hear from my customers is: “I wish we would’ve talked to you BEFORE we outsourced this analytics project. We are learning so much and we can see that our current vendor partner is not focused on the right things.”
Listen, data projects are risky, and difficult. Prescriptive analytics (using data to answer the question: what do we do next?) is nothing new, but savvy business people are demanding this more than traditional dashboards and reports. Not every data professional is ready to build solutions where the scope and outcome are not clearly defined up-front. Analytics in 2021 requires thinking differently, asking questions, experimentation, and fail-fast mentalities.
The MTC is staffed with data professionals who understand this. We are viewed as Trusted Advisors by our customers. We are usually booked out for 45-60 days because we are in demand. We are also honest: we will tell you if (or when) a problem doesn’t seem solvable. It happens. We can also work with your data consultants if they need assistance with certain aspects of the project…especially Design Thinking. We’ve helped to rescue at-risk and failing analytics projects simply by asking different questions. We are Thought Leaders and present at industry conferences. We live and breathe this stuff.
Does that sound like someone your team can trust?
Contact me on LinkedIn, or your Microsoft account team, and we can start our journey together.
Are you convinced your data or cloud project will be a success?
Most companies aren’t. I have lots of experience with these projects. I speak at conferences, host hackathon events, and am a prolific open source contributor. I love helping companies with Data problems. If that sounds like someone you can trust, contact me.
Thanks for reading. If you found this interesting please subscribe to my blog.
Dave Wentzel