With the release of this issue of the Journal (December 2023), the 18th discussion will be opened. This discussion, ‘Are Data Scientists Going to Replace Statisticians?’, invites readers to give their opinion on the future ‘job’ profile of those working in, or contributing to, the production and dissemination of official statistics. The discussion at the 9th EMOS workshop (Prague, 26-27 October 2023) and the reflections in the editorial of Vol. 39, Nr. 4, together with the manuscript ‘Data science skills for the next generation of statisticians’ (Antonucci et al., Vol. 39, Nr. 4, pp. 773-782) published in the same issue, constitute the background for this 18th discussion.
Readers are invited to react to a series of statements on this topic, but are also free to give their overall opinion.
In speaking about statisticians, there appear to be two rather different approaches. The first is the approach of the more academically oriented scholars, who define statistics as a branch of mathematics and a discipline that involves collecting, analyzing, interpreting, presenting, and organizing data. In this view, statistics provides a systematic framework for dealing with data, summarizing information, and making informed decisions based on empirical evidence.
The second approach, which defines a statistician as a data scientist, sees statisticians as professionals who specialize in the field of statistics and who collect, organize, analyze, and interpret data to help make informed decisions, draw conclusions, and solve real-world problems in various fields such as science, business, economics, the social sciences, and more. The main difference between the two approaches lies in the wider engagement of the second, not only regarding the domain of application, but also the origin of the data and the conceptual starting points. However, many scholars, especially those working in ‘official statistics’ (roughly defined as statistics that are used in policymaking), consider their work of producing and disseminating statistics, triggered by societal and policy issues, as very close or equal to the work that data scientists ought to be doing.
The above-mentioned article notes that some scholars argue that statistics is not necessary for data science, but its findings emphasize the complementary relationship between the two. The authors state that statistics ‘provides a foundation for data science, ensuring reliability and validity, while data science extends statistics to Big Data. Data scientists should recognize the importance of statistics, and statisticians should embrace the capabilities of data science []. However, although it seems clear that the two disciplines complement and compensate for each other, it is equally evident that statisticians often consider data science a threat’.
The authors show that there are substantial differences in salary between statisticians and data scientists as defined above. Overall, statisticians earn less than data scientists.
Readers are invited to comment in general on the manuscripts, but they can also concentrate on the following statements:
- Considering the type of activities and engagement, current official statistics are equal to data science.
- The list of jobs under the job label ‘statistician’ should include ‘statisticians working as a branch of mathematics’, and the list of jobs under the label ‘data scientist’ should include ‘working in official statistics’.
- A salary difference between statisticians and data scientists, as described in Figure 3 of the above-mentioned manuscript (Antonucci et al., Vol. 39, Nr. 4), can only be justified when there is a societal/policy component in the job description of the data scientist and no such component in the work of the statistician.