Why build a Data Science Center of Excellence?


Data science is transforming businesses and generating new value across industries. IDC predicts worldwide AI investment of $77.6B in 2022. Gartner predicts AI systems will create $3.9T in business value in 2022. In recent years, this value has been recognized by leaders of old and new industries alike. As machine learning technologies have matured, they have become a new tool for telecoms, retailers, banks, healthcare providers, insurers, utilities, and many more. Having the ability to employ machine learning is no longer a cutting-edge advantage, but a competitive expectation.


Despite maturing tools, data science still often requires dedicated technology and sophisticated teams to execute successfully. Data science can improve marketing, sales, operations, customer service, and other departments, but investing in the teams and technology to realize those gains separately in each department is costly and duplicative. A center of excellence addresses this problem by creating a dedicated data science capability that delivers value across departments.


What is a Data Science Center of Excellence?


A center of excellence (CoE) is a single team that focuses the vision, strategy, and infrastructure for a discipline. This is particularly useful for data science because of the niche skills and technology the discipline requires. The CoE team typically acts as an internal consultant, working with multiple divisions in the organization to identify and exploit opportunities in data science. A center of excellence may not be the only team wielding data science in an organization, but it acts as a leader, innovator, and standards setter. Additionally, a successful CoE team will often serve as a learning resource for individuals practicing data science outside of the team.

What are the benefits of Data Science Centers of Excellence?


  • Consolidate expenses to reduce total cost
  • Standardize process, tools, and approach to common machine learning problems
  • Reduce time to delivery
  • Make data science accessible and affordable, even for niche use cases
  • Centralize a strategy for talent development and innovation

How do you build a Data Science Center of Excellence?


Determining where to start when building a machine learning and data science capability can be daunting. Naturally, one would like to see results as quickly as possible, without committing to major investments in staff and technology all at once. There is a common maturity model that can be used for building data science centers of excellence iteratively, while realizing gains across the organization. In this model, talent is pooled, a centralized consultancy is established, then that team develops standards and practices that spur innovation across the organization. This process often takes years to fully mature, but value is delivered quickly and with the flexibility to address changing opportunities efficiently.

The Data Science Center of Excellence Maturity Model

First, an executive strategy should be set to formalize where data science can be applied and how much investment should be made. This strategy should list the top opportunities (operational gains, improved conversion rates, customer experience enhancement, etc.) along with an estimate of how valuable each one is. This roadmap may change over time, but it provides a guide for how much to invest in team and infrastructure.


Next, talent needs to be assembled. Typically, there are already talented, knowledgeable analysts, developers, or data scientists in the organization. This is a huge asset. A great way to start a data science CoE is by seeding it with existing employees who bring to the table a depth of knowledge about the business. This helps keep the data science team connected to business stakeholders and can accelerate development by avoiding ramp-up time. Joining the CoE team can also be a big development opportunity and morale booster for employees who get a chance to build their skills and operate across the business. It is important to invest in these employees and help them sharpen their skills. Additionally, outside talent can be helpful for bringing in expertise in data science tools and practices that will become standards for the CoE. Many successful teams start with a core group of existing employees, assess their skills for development gaps, and then use training and strategic hires to level them up.


Lastly, a fully functional data science team requires infrastructure to access data, train models, and deploy solutions. There is an ever-expanding list of options. The best choices will depend on the existing infrastructure in the organization, the characteristics of the data, the data science use cases, and other operational factors. Different team members will bring in different experiences and opinions on what is best. As much as possible, it is helpful to establish some key tools and patterns that the team can adopt, while allowing flexibility to test new tools as needed. It is often best to start with the data infrastructure. Accessing and transforming data is one of the most time-consuming and repetitive aspects of data science. A flexible, scalable data store is crucial in the long run.


How do you measure success?


A key part of the CoE strategy is maximizing the value delivered while improving cost efficiency. Data science is a revenue stream, not a cost center. Data science has the ability to increase revenue, reduce costs, capture market share, reduce churn, and improve experiences.


It is important to track value from the beginning and tune growth accordingly. Although the CoE may exist outside of the departments it serves, it is critical to understand the impact it has on them and prioritize opportunities and investment accordingly. By recording the value of what is delivered, a strong case can be made to continue investment in a CoE that would be difficult to realize in scattered data science teams.
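As a rough illustration of recording delivered value, the sketch below keeps a simple ledger of CoE deliveries and summarizes value, cost, and return on investment for each department served. Everything in it is hypothetical: the `Delivery` record, the `summarize_by_department` function, the department names, and the dollar figures are assumptions made for the example, not part of any particular tool or of the maturity model described here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Delivery:
    """One delivered CoE project (hypothetical record layout)."""
    department: str   # department the project served
    project: str      # short project name
    value: float      # estimated value delivered, in dollars
    cost: float       # CoE cost attributed to the project, in dollars


def summarize_by_department(deliveries):
    """Total value and cost per department, plus ROI = (value - cost) / cost."""
    totals = defaultdict(lambda: {"value": 0.0, "cost": 0.0})
    for d in deliveries:
        totals[d.department]["value"] += d.value
        totals[d.department]["cost"] += d.cost

    summary = {}
    for department, t in totals.items():
        # ROI is undefined when no cost has been attributed yet.
        roi = (t["value"] - t["cost"]) / t["cost"] if t["cost"] else None
        summary[department] = {"value": t["value"], "cost": t["cost"], "roi": roi}
    return summary


# Illustrative figures only; real estimates would come from business stakeholders.
ledger = [
    Delivery("marketing", "churn model", 300_000.0, 100_000.0),
    Delivery("marketing", "lead scoring", 100_000.0, 100_000.0),
    Delivery("operations", "demand forecast", 50_000.0, 0.0),
]

for department, row in summarize_by_department(ledger).items():
    print(department, row)
```

Even a ledger this simple makes it possible to compare departments side by side and to see where further investment is most likely to pay off.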



Adam Cornille is Director of Advanced Analytics at Logic20/20. He is a data science manager and practitioner with over a decade of field experience, and has trained in development, statistics, and management practices. Adam currently heads the development of data science solutions and strategies for improving business maturity in the application of data.

