Why Data Science Projects Fail and How to Make Yours a Success
From improving internal operations to creating external marketing campaigns, data science plays a huge role in how modern organizations operate. However, many data science projects are experiments, and experiments often fail. Let's explore why data science projects fail and how to make yours successful.
In 2016, Gartner estimated that 60% of data-science projects fail; another study in 2017 put the figure at 85%. Understanding why data-science projects fail can help you avoid the common mistakes.
Failing to Establish Goals
Defining clear goals helps build a solid foundation for a data-science project. Asking the right questions is what will result in a worthwhile project.
Frequently, businesses will provide data scientists with big-picture goals, such as “predict our customer satisfaction ratings over the next 12 months.”
Instead, try a more granular, actionable approach. Explain the business problem and strategy in detail: for example, where to raise customer satisfaction rates to drive more sales most cost-effectively. From there:
- Have your team identify your current customer satisfaction ratings for different segments, based on clusters of attributes, to identify key segments and criteria.
- Ask them to build a model to predict customer satisfaction and resulting sales projections on these granular segments.
- Finally, match those segments to different channels to most cost-effectively reach these customers and drive sales.
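As a rough illustration of the first and third steps above, the sketch below averages satisfaction per segment and pairs each segment with its lowest-cost channel. The segment names, scores, channel costs, and function names are all made up for the example, and the predictive model from the second step is left out entirely.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical survey records: one row per customer (illustrative values only).
customers = [
    {"segment": "new_online", "satisfaction": 3.0},
    {"segment": "new_online", "satisfaction": 4.0},
    {"segment": "loyal_store", "satisfaction": 4.5},
    {"segment": "loyal_store", "satisfaction": 4.5},
    {"segment": "lapsed", "satisfaction": 2.0},
    {"segment": "lapsed", "satisfaction": 3.0},
]

# Hypothetical cost per customer reached, by segment and channel.
channel_cost = {
    "new_online": {"email": 0.10, "paid_social": 0.80},
    "loyal_store": {"email": 0.10, "in_store": 0.05},
    "lapsed": {"direct_mail": 1.20, "email": 0.10},
}


def satisfaction_by_segment(records):
    """Return the average satisfaction score for each segment."""
    scores = defaultdict(list)
    for record in records:
        scores[record["segment"]].append(record["satisfaction"])
    return {segment: mean(values) for segment, values in scores.items()}


def cheapest_channel(costs):
    """Return the lowest-cost channel for each segment."""
    return {segment: min(options, key=options.get) for segment, options in costs.items()}


def prioritize(records, costs):
    """List segments from least to most satisfied, each with its cheapest channel."""
    averages = satisfaction_by_segment(records)
    channels = cheapest_channel(costs)
    ordered = sorted(averages, key=averages.get)
    return [(segment, averages[segment], channels[segment]) for segment in ordered]


for segment, score, channel in prioritize(customers, channel_cost):
    print(f"{segment}: average satisfaction {score}, reach via {channel}")
```

Sorting by lowest satisfaction first is only one possible way to prioritize; a real project would weigh segment size and projected sales as well.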
This approach creates a clear set of goals and brings data science projects into the mainstream analytics processes to deliver results. Data scientists can engage with more analytics assets, collaborate with the broader analytics team, produce actionable results that are real to the business teams, and deliver real ROI.
Not Using the Right Data
Incomplete datasets result in blind spots and inaccurate insights from data science models. The inability to find the “right” data is another common issue that causes data science projects to fail.
Many organizations have a highly disparate data landscape, with data silos spread across various departments, in different locations (on-premises, cloud, SaaS applications, and external services), each with a different owner. Not only does this make it difficult for data scientists to access the right data, but often they don’t even know certain data assets exist!
Google Research Director Peter Norvig is famously quoted as saying “Simple models and a lot of data trump more elaborate models based on less data.” This tells the data scientist that the more data they can use, the better the results.
So how do you connect data scientists with more data assets and give them the ability to find the right data? There are two keys to this:
- Give the data science team a single point of access to the various analytics assets, including data assets, so they can find, explore, and combine them to create a deeper, wider dataset.
- Allow the wider analytics team to share what they know about the data assets to help the data scientists create the right dataset for the problem at hand.
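To make "combine various assets" concrete, here is a minimal sketch that merges rows from separate sources into one wider record per customer. The source names (crm, support), the field names, and the combine_assets helper are hypothetical, and the sketch assumes every source shares a customer_id key, which real silos often do not.

```python
def combine_assets(*sources, key="customer_id"):
    """Merge rows from several sources into one wider row per key value.

    If two sources use the same field name, the later source wins.
    """
    combined = {}
    for source in sources:
        for row in source:
            combined.setdefault(row[key], {}).update(row)
    return list(combined.values())


# Hypothetical extracts from two silos (illustrative values only).
crm = [
    {"customer_id": 1, "segment": "lapsed"},
    {"customer_id": 2, "segment": "loyal_store"},
]
support = [
    {"customer_id": 1, "open_tickets": 3},
]

wide = combine_assets(crm, support)
for row in wide:
    print(row)
```

Customers missing from one source simply keep fewer fields, which makes the gaps in the combined dataset visible.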
Through this, data scientists can become more familiar with the data aspects of their job and spend more time on what they are good at: the science. They create more accurate models, fed with more data, that are better at reaching the business goals.
Keeping Walls Between Teams
The different roles that perform analytics in an organization typically have different skills and bring different knowledge sets. Data analysts really know data. Business analysts really understand how to apply data to the business. And data scientists really know how to apply algorithmic science to the data.
But often these three groups work independently, focusing on their own tasks and problems. This not only creates process inefficiencies but also keeps walls between the teams and puts up barriers to sharing the knowledge each group and individual holds.
Successful data science requires an organization to break down these walls and unify the teams in terms of process and sharing. The single access point for analytics assets previously mentioned allows all the analytics professionals to:
- Share assets they have developed, facilitating reuse and eliminating the need to reinvent the wheel each time
- Offer knowledge about the assets so other team members can determine the fit and readiness to solve their problems
- Collaborate on projects with each role bringing to the table their skills and knowledge to get projects done faster and with greater accuracy
A major effect of this will be to bring data science into the mainstream of the analytics processes and deliver faster and greater value to the business.
Eliminating the major causes of failure means breaking down the barriers discussed above: creating well-defined goals, using more of the “right” data, and unifying your analytics teams around sharing, knowledge, and collaboration. These three key aspects can not only deliver faster, more successful individual projects, but also bring data science into the mainstream of your analytics processes to give even greater ROI to your overall analytics initiatives.
The Neebo Virtual Analytics Hub is a SaaS solution that allows analytics teams to find, create, collaborate on, and publish trusted analytics assets in complex hybrid landscapes. Neebo provides unified access across analytics silos, increases use of analytics assets, and furthers data knowledge to build trust and rapidly answer new business questions. To learn more, visit the Neebo website or test-drive Neebo by registering for a free 14-day trial.