06 September 2018

The 5 books every data scientist must read

Approaching the world of Machine Learning and Big Data is certainly an interesting challenge. To make the process easier and faster, here are 5 must-read books.

1. Machine learning with Python. Building algorithms to generate knowledge

Processing the mass of data available today is a fascinating and essential challenge in a world where knowledge and information are the primary sources of value.
This book takes you into the world of machine learning and shows how Python is the ideal programming language for building sophisticated algorithms that can optimally query data and retrieve valuable insights.
This volume explains the use of dedicated Python libraries (including scikit-learn, Theano, and Keras) applied to areas such as data selection and compression, natural language analysis, prediction, and image recognition. The teaching approach is pragmatic: all concepts are accompanied by practical code examples.

This book is recommended for those who have some theoretical knowledge of machine learning and a good understanding of Python programming.

2. Big Data Analytics: The Data Scientist's Handbook

This book is a comprehensive guide both for those planning to enter this emerging profession and for experienced practitioners who wish to delve deeper into specific topics. The author illustrates the key concepts of data management and advanced data analysis; he describes big data and the tools and architectures that enable its management (Hadoop in particular); and he introduces data ingestion and processing with several analysis tools (Hive, Pig, Spark, and R), whose functionality is also illustrated through commented examples. One section is dedicated to predictive analytics and demonstrates techniques for creating predictive models, from data preparation to choosing the most suitable algorithm to performance evaluation. The text is a valuable tool for understanding the concepts of data analysis (big data or traditional data), including for company management, who can draw on advanced analytics to make decisions, assess risks, and design strategies.

3. Data Science: A Guide to the Basic Principles and Techniques of Data Science

This book is aimed at programmers who want to enter the world of data science by discovering how to combine skills ranging from mathematics to business analysis and, naturally, programming. The goal is to teach how to approach heterogeneous data and transform it into ideas and insights. Across its chapters, the book presents the elements a data scientist must master: defining the analysis domain, retrieving and cleaning raw data, calculating probabilities, building statistical models, and applying machine learning algorithms. There is also plenty of guidance on how to normalize and prepare data before analysis, as well as tips on how to present and communicate results effectively. All key steps are accompanied by pseudocode to illustrate the algorithms in use, while the code examples primarily use Python.

4. Data Scientists: Between Competitiveness and Innovation

Faced with the rise of the Algorithm Economy and Big Data, organizations increasingly need a professional capable of communicating and collaborating with cobots and intelligent machines: the data scientist. Responding to this need, this volume provides practical guidance both for those who want to start and develop a career as a data scientist to the highest level and for companies looking to employ data scientists to improve their decision-making and competitiveness. The text is also enriched by contributions from leading figures in the world of innovation, who offer an alternative and open perspective, and by testimonials and case studies that help clarify the content presented.

5. Artificial intelligence, personal data protection and regulation

This volume is, first and foremost, a challenge. As the new European General Data Protection Regulation (GDPR) comes into force, aiming to enhance protection and increase trust in the circulation of data and the digital economy, this book already attempts to go further. The GDPR represents a huge effort to move from the static concept of data as personal property to a dynamic one that views data, including personal data, as the lifeblood of the fourth industrial revolution. We are now in the age of artificial intelligence, intelligent machines, and the Internet of Things. Can the GDPR strengthen people's trust by protecting their data, even in the new world of AI?

Source: Amazon.it