Data professionals spend their time on a variety of tasks that require both technical and non-technical skills, according to the 2022 State of Data Science Report from Anaconda. Survey respondents said those tasks mainly include:
- Data preparation and cleansing (38% of their time)
- Working with models through selection, training, and deployment (26%)
- Demonstrating data’s value through reporting and presentation (16%)
- Data visualization (13%)
In this article, we’ll look at other key findings relating to data professionals’ skills and responsibilities.
Key Skills
The field of data science is growing rapidly, with the U.S. Bureau of Labor Statistics stating that “employment of computer and information research scientists is projected to grow 22 percent from 2020 to 2030, much faster than the average for all occupations.”
However, 62 percent of respondents to the Anaconda survey said their organizations are at least moderately concerned about the impact of a talent shortage. Respondents also noted a lack of skills within their organizations and cited a lack of tailored learning paths, hands-on projects, and mentorship opportunities for those who want to develop data science skills.
According to respondents, the top five skills missing in the data science areas of their organizations are:
- Engineering skills (38%)
- Probability and statistics (33%)
- Business knowledge (32%)
- Big Data management (31%)
- Communication skills (29%)
Using Open Source
Eighty-seven percent of survey respondents said their employers allow the use of open source software (OSS), with only eight percent saying its use was not allowed. However, only 52 percent of commercial respondents said their teams were encouraged to contribute to open source projects, which is down about 13 percent from 2021.
In terms of security, the survey asked how organizations secure their software supply chains and meet enterprise security standards. In response:
- 40 percent said they use vulnerability and security scanning tools
- 33 percent create and use custom and proprietary software
- 27 percent perform manual audits of models and applications
- 24 percent are not sure
Mitigating Bias
Respondents cited the social impact of bias in data and models (32%) as one of the biggest problems in the data science/AI/ML space today, along with:
- Impacts to individual privacy (18%)
- Advanced information warfare (16%)
- Lack of diversity and inclusion in the profession (14%)
When asked what specific steps organizations are taking to mitigate bias and ensure fairness, 31 percent of respondents said they evaluate data collection methods according to internally set standards, and 25 percent manually assess data sets for fairness and bias.
Unfortunately, 24 percent of respondents said their organizations “do not have standards surrounding/have not implemented measures or tools to address fairness and bias mitigation in data sets and models,” and 15 percent said they aren’t sure what steps are being taken.
To learn more about data science and related skills, check out the resources below.
Learn More
- 2021 Data/AI Salary Survey from O’Reilly
- 5 Key Data Science Skills from FOSSlife
- Data Wrangling and Exploratory Data Analysis Explained from InfoWorld
- How to Become a Data Analyst from CompTIA
- What is a Data Scientist? from FOSSlife
Ready to find a job? Check out the latest job listings at Open Source JobHub.