Data Science Books

📘 10 Free Data Science Books (and what you’ll learn 👇) 🚀
A free collection of books to learn data science from theory to practice. Includes concepts, programming (Python and R), statistics, and useful tools.

📚 Books, downloads, and what each contains

  1. 🌟 Veridical Data Science - https://vdsbook.com/
    Contents: Introduction to the data science project lifecycle, data exploration, and prediction.

  2. 📊 Data Science: Theories, Models, Algorithms, and Analytics - https://srdas.github.io/MLBook/index.html
    Contents: Core concepts, visualization, data handling, statistics, machine learning, and advanced applications.

  3. 🐍 Think Python (3E) - https://allendowney.github.io/ThinkPython/
    Contents: Python programming fundamentals, control flow, data structures, and object-oriented programming.

  4. 🐍 Python Data Science Handbook - https://github.com/jakevdp/PythonDataScienceHandbook
    Contents: Key Python tools for data science: NumPy, pandas, plotting with Matplotlib, and basic machine learning.

  5. 📈 R for Data Science - https://r4ds.hadley.nz/
    Contents: Using R for analysis, visualization, data manipulation, and transformation.

  6. 📉 Think Stats (3E) - https://allendowney.github.io/ThinkStats/
    Contents: Practical statistics for data science: exploratory analysis, probability, regression, and statistical models.

  7. 📊 Statistics and Prediction Algorithms Through Case Studies - https://rafalab.dfci.harvard.edu/dsbook-part-2/
    Contents: Applied statistics and predictive algorithms with examples (useful even if you don’t use R).

  8. 📑 Probabilistic Programming & Bayesian Methods for Hackers - https://dataorigami.net/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/
    Contents: Bayesian methods and probabilistic programming with PyMC and practical examples.

  9. 🔢 Think Bayes (2E) - https://allendowney.github.io/ThinkBayes2/
    Contents: Practical approach to Bayesian statistics with Python code and real-world applications.

  10. 💻 Data Science at the Command Line - https://jeroenjanssens.com/dsatcl/
    Contents: How to use the command line (shell/UNIX) to manipulate, clean, explore, and automate data tasks.

💡 Quick summary

If you’re starting in data science:

  • 🧠 Concepts and theory: the first books explain what data science is and how models work.
  • 🐍 Python programming: learn to write code to analyze data, from basics to popular libraries.
  • 📊 Statistics: understanding numbers, probability, and predictive models is key for data-driven decisions.
  • 📈 R and visualization: some guides focus on R, another widely used language for analysis.
  • 💻 Command-line tools and workflows: learn tools that speed up daily data science work.

👉 All of these resources are free and accessible online.

More information at the link 👇

Also published on LinkedIn.
Author: Juan Pedro Bretti Mandarano