Standing on the Shoulders of Giants
January 24, 2024
Let’s start in 2012! Thomas H. Davenport and DJ Patil, Harvard Business Review: Data Scientist: The Sexiest Job of the 21st Century
The article claims that the term “Data Scientist” was coined in 2008.
Data Science did not begin in 2012, nor did it start in 2008.
David Donoho: 50 Years of Data Science. Published in 2017, but first appeared as Version 1.00 in September 2015.
For a long time I have thought I was a statistician, interested in inferences from the particular to the general. But as I have watched mathematical statistics evolve, I have had cause to wonder and to doubt. … All in all I have come to feel that my central interest is in data analysis, which I take to include, among other things: procedures for analyzing data, techniques for interpreting the results of such procedures, ways of planning the gathering of data to make its analysis easier, more precise or more accurate, and all the machinery and results of (mathematical) statistics which apply to analyzing data.
Four major influences act on data analysis today:
- The formal theories of statistics
- Accelerating developments in computers and display devices
- The challenge, in many fields, of more and ever larger bodies of data
- The emphasis on quantification in an ever wider variety of disciplines
… data analysis is a very difficult field. It must adapt itself to what people can and need to do with data. In the sense that biology is more complex than physics, and the behavioral sciences are more complex than either, it is likely that the general problems of data analysis are more complex than those of all three. It is too much to ask for close and effective guidance for data analysis from any highly formalized structure, either now or in the near future. Data analysis can gain much from formal statistics, but only if the connection is kept adequately loose.
The statistics profession faces a choice in its future research between continuing concentration on traditional topics—based largely on data analysis supported by mathematical statistics—and a broader viewpoint—based on an inclusive concept of learning from data. The latter course presents severe challenges as well as exciting opportunities. The former risks seeing statistics become increasingly marginal …
The Data Modeling Culture
The Algorithmic Modeling Culture
Wikipedia: Florence Nightingale
Wikipedia: R (programming language)
Wikipedia: Attention Is All You Need
Looking back ten years in 2022.