What questions do you ask when starting a data analysis project?

05-28-2025 12:54 PM
Bud
Esteemed Contributor

A general question about managing data analysis projects.

What questions do you ask the requester?

  1. Who is making this request (what level of management)? Who is the intended audience? Who will be on the project team?
  2. What do you want? Can you mock up a rough example in Excel or Illustrator, etc.? What filetype/platform?
  3. When do you need it by?
  4. Where is the data? Where will the final product live and for how long? Who needs access to the final product?
  5. Why are we doing this? Why now? What happens if you don’t get it in time?
  6. How are we going to do the work? What's the plan?
  7. Cleaning - How much data cleaning is required? How long will it take? Will we clean the source data or a copy of the data? How accurate do you need the analysis result/data to be? For example, if we were to guess that the data is 95% accurate, is that sufficient?

Those questions are just a stab in the dark.

What questions do you ask when taking on a data analysis project? I ask because I'm moving to a new team and am looking to develop some techniques for success to avoid the usual pitfalls of data analysis projects.

BillFox
MVP Notable Contributor

give them what they need; this may not be what they are asking for

show us what you are asking for

who is going to maintain it, how often?

is it private, internal, public?

do you have Department Head approval before doing the work?

AlfredBaldenweck
MVP Regular Contributor

What other data is needed to get the final product?

Not analysis related, but this bit me pretty bad the other week: What size do you need the final layout to be?

BarryNorthey
Frequent Contributor

You might consider it from a metadata standpoint: have the requester supply a preliminary Summary, Description, Credits, Use Limitations, etc., for whatever product they are requesting.

Bud
Esteemed Contributor

Here are my updated questions:

  1. Who
    1. Who is making this request (what level of management)? Do you have senior management buy-in/approval? Are there any conflicts? Has anyone else already done this or is planning to do it?
    2. Who will provide the data?
    3. Who will be on the project team?
    4. Who will maintain the product and how often?
  2. What
    1. What do you want? Can you show us? Can you mock up an example in Excel or Illustrator, etc.? Give them what they need; this may not be what they are asking for.
    2. What does "Done" look like? How will we know the work is completed to satisfaction? Consider getting a written answer to those questions to avoid wandering scope.
    3. What filetype/platform? Do you need a table, chart, print map, web map, or something else? What are the product dimensions/specs?
  3. When
    1. When will we have what we need to proceed?
    2. Where does it sit in our list of priorities?
    3. When do you need it by?
  4. Where
    1. Where is the data? Do the team members have access?
    2. Will the final product be sensitive, internal, or public? Who needs access?
    3. Where will the final product live and for how long?
  5. Why
    1. Why are we doing this?
    2. Why now? What happens if you don’t get it in time?
    3. What are you going to do with the final product?
    4. Does the product provide a good way to measure performance, such as measuring a team against a desired outcome?
  6. How
    1. How are we going to do the work? What's the plan?
  7. Cleaning
    1. Does the data exist? How reliable is it? Was the data entered correctly and is it easily query-able?
    2. How much data cleaning is required? How long will it take? Who will clean the data?
    3. Will we clean the source data or a copy of the data?
    4. How accurate do you need the analysis result/data to be? For example, if we were to estimate that the data is 95% accurate, is that sufficient?
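
One rough way to put a number behind the "is 95% good enough?" question is to run a few simple rule checks over a sample and report the share of records that pass each one. The sketch below is only an illustration: the field names (`asset_id`, `install_year`), the sample records, and the 1900 to 2025 year range are all made-up assumptions. Passing a rule shows a value is plausible, not that it is correct.

```python
from typing import Callable, Iterable, Mapping

Record = Mapping[str, object]
Check = Callable[[Record], bool]


def share_passing(records: Iterable[Record], checks: Mapping[str, Check]) -> dict[str, float]:
    """Return, for each named check, the fraction of records that pass it."""
    rows = list(records)
    if not rows:
        # No data at all: report 0.0 rather than dividing by zero.
        return {name: 0.0 for name in checks}
    return {
        name: sum(1 for row in rows if check(row)) / len(rows)
        for name, check in checks.items()
    }


def plausible_install_year(row: Record) -> bool:
    """Hypothetical rule: the year is an integer in an assumed 1900-2025 range."""
    year = row.get("install_year")
    return isinstance(year, int) and 1900 <= year <= 2025


# Hypothetical sample records, invented for illustration only.
sample = [
    {"asset_id": "P-101", "install_year": 1998},
    {"asset_id": "P-102", "install_year": None},
    {"asset_id": "", "install_year": 2005},
    {"asset_id": "P-104", "install_year": 2091},
]

checks = {
    "has_asset_id": lambda row: bool(row.get("asset_id")),
    "plausible_install_year": plausible_install_year,
}

report = share_passing(sample, checks)
for name, share in report.items():
    print(f"{name}: {share:.0%} of records pass")
```

Showing the requester a table like this early gives them something concrete to react to when deciding how much cleaning is worth paying for.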


Notes from a colleague:

#4 and 7 both sort of cover this, but it's critical to ask whether the data exists and how reliable it is. In Work Order Management software terms, failure analysis is pretty popular right now. Everyone wants to say they can help predict future failures. But many organizations aren't even capturing failure data, or are capturing it in free-form fields (like long descriptions) where the same problem can be reported dozens of different ways by different people, and that makes it really difficult. Even when the data is structured, such as using problem/cause/remedy, most companies don't have all the possible options. Things constantly end up in the wrong bucket because the fields are required but the correct option doesn't exist in the eyes of the technician. Even if you have the data, you can't necessarily get anything valuable out of it.
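
To make the free-form field problem concrete, here is a small sketch that counts distinct failure descriptions before and after a basic cleanup. The descriptions are invented for illustration, and `normalize` is a hypothetical helper, not part of any work order product. It only lowercases, strips punctuation, and collapses whitespace.

```python
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse runs of whitespace."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(cleaned.split())


# Hypothetical free-text entries; four of them describe the same failure.
descriptions = [
    "Pump leaking",
    "pump leaking.",
    "PUMP  LEAKING",
    "leak at pump",
    "Motor overheated",
]

raw_counts = Counter(descriptions)
normalized_counts = Counter(normalize(d) for d in descriptions)

print(f"distinct raw descriptions: {len(raw_counts)}")
print(f"distinct after cleanup:    {len(normalized_counts)}")
print(normalized_counts.most_common())
```

In this made-up sample the cleanup merges three of the four pump entries, but "leak at pump" still lands in its own bucket. Simple text cleanup only goes so far, which is why it helps to talk to the people entering the data.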

Most analysis requests come from managers/executives who don't really understand the problem they are trying to solve. You need the people who are capturing the data to explain how they capture it and the challenges they have with capturing it reliably.

#5: the first question, "why are we doing this," and the follow-up question (which I don't see here), "what are we going to do with this," are extremely critical to me. Managers love KPIs because they give a way to measure the team against some desired outcome. For example, if you want to focus on quality, you will set objectives for number of defects. If you want to focus on customer service, you'll set objectives for time to resolution. Neither of these is inherently bad.

But you have to be careful. KPIs often lead to undesirable outcomes, especially if people (managers and/or employees) are incentivized by the KPI. In software, if you measure teams on the number of bugs, genuine issues with the product get labeled "working as designed". Customer service will close cases as soon as they respond, even if the issue isn't resolved from the customer's perspective. You essentially change the behavior of employees to meet the KPI, but at the end of the day, your customer (internal or external) isn't benefiting from the changes.

Then in 6 months when customer satisfaction is down, the KPIs are changed to try and achieve that objective. And the cycle continues on and on.
