More Data Science Books

Here is a list of even more useful data science books.

  • Programming Collective Intelligence by Toby Segaran (O’Reilly, 2007), available at O’Reilly or Amazon

  • Stat Labs: Mathematical Statistics Through Applications by Deborah Nolan and Terry Speed (Springer, 2000), available at SpringerLink or Amazon

  • Version Control with Subversion, 2nd Edition, by C. Michael Pilato, Ben Collins-Sussman, and Brian W. Fitzpatrick (O’Reilly, 2008), available at O’Reilly or Amazon; the book can also be legally downloaded for free in PDF format.

  • Introduction to Data Technologies by Paul Murrell (CRC Press, 2009), available at O’Reilly or Amazon; the book can also be legally downloaded for free in PDF format.

  • Computational Statistics by James E. Gentle (Springer, 2009), available at SpringerLink or Amazon

  • Hadoop: The Definitive Guide, 3rd Edition, by Tom White (O’Reilly, 2012)

  • Thinking with Data by Max Shron (O’Reilly, 2014), available at O’Reilly or Amazon

  • Machine Learning for Hackers by Drew Conway and John Myles White (O’Reilly, 2012), available at O’Reilly or Amazon

  • Bad Data Handbook by Q. Ethan McCallum (O’Reilly, 2013), available at O’Reilly or Amazon

  • Agile Data Science 2.0 by Russell Jurney (O’Reilly, 2017), available at O’Reilly or Amazon

  • Doing Data Science by Cathy O’Neil and Rachel Schutt (O’Reilly, 2014), available at O’Reilly or Amazon

  • Statistical Methods in Bioinformatics: An Introduction, 2nd Edition by Warren J. Ewens and Gregory R. Grant (Springer, 2005), available at SpringerLink or Amazon

  • The Elements of Statistical Learning, Second Edition, by Trevor Hastie, Robert Tibshirani, and Jerome Friedman (Springer, 2009), available at SpringerLink or Amazon

  • Introductory Time Series with R by Paul S.P. Cowpertwait and Andrew V. Metcalfe (Springer, 2009), available at SpringerLink or Amazon

  • Forest Analytics with R by Andrew P. Robinson and Jeff D. Hamann (Springer, 2011), available at SpringerLink or Amazon

  • Applied Spatial Data Analysis with R, Second Edition, by Roger S. Bivand, Edzer Pebesma, and Virgilio Gómez-Rubio (Springer, 2013), available at SpringerLink or Amazon

  • An Introduction to Statistical Learning, Second Edition, by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani (Springer, 2021), available at SpringerLink or Amazon

  • Time Series Analysis and Its Applications, Fourth Edition by Robert H. Shumway and David S. Stoffer (Springer, 2017), available at SpringerLink or Amazon

  • An Approach to Providing Mathematical Annotation in Plots, by Paul Murrell and Ross Ihaka, Journal of Computational and Graphical Statistics 9(3):582-599, 2000, JSTOR

  • Creating More Effective Graphs by Naomi B. Robbins (Wiley-Interscience, 2005)

  • The Elements of Graphing Data, Revised Edition, by William S. Cleveland (Hobart Press, 1994)

  • The Grammar of Graphics, 2nd Edition, by Leland Wilkinson (Springer, 2005)

  • Graphics of Large Datasets, by Antony Unwin, Martin Theus, and Heike Hofmann (Springer, 2006)

  • How to Display Data Badly, by Howard Wainer, The American Statistician 38(2), 1984, JSTOR

  • Maps for Advocacy, by Tactical Technology Collective (2008) www.tacticaltech.org/maps-advocacy

  • S-PLUS Trellis Graphics User’s Manual, by Richard A. Becker and William S. Cleveland (1996), www.stat.purdue.edu/~wsc/papers/trellis.user.pdf

  • The Visual Display of Quantitative Information, 2nd Edition, by Edward R. Tufte (Graphics Press, 2001)

  • Beautiful Evidence by Edward R. Tufte (Graphics Press, 2006)

  • Visual Explanations by Edward R. Tufte (Graphics Press, 1997)

  • Envisioning Information by Edward R. Tufte (Graphics Press, 1990)

  • Seeing with Fresh Eyes by Edward R. Tufte (Graphics Press, 2020)

  • Visualizing Data, by William S. Cleveland (Hobart Press, 1993)

  • Visualizing Data, by Ben Fry (O’Reilly, 2008)

  • Visualizing Information for Advocacy, by Tactical Technology Collective (2008) visualisingadvocacy.org/

  • Learn Git in a Month of Lunches by Rick Umali (Manning, 2015)

  • Interactive Data Visualization for the Web, 2nd Edition, by Scott Murray (O’Reilly, 2017)

  • C++ Software Design by Klaus Iglberger (O’Reilly, 2022)

  • Sport Business Analytics by C. Keith Harrison and Scott Bukstein (CRC Press, 2017)

  • Introduction to DevOps with Chocolate, LEGO and Scrum Game by Dana Pylayeva (Apress, 2017)

  • The Atlas of the Real World by Daniel Dorling, Mark Newman, and Anna Barford (Thames & Hudson, 2008)

  • Tableau Desktop Cookbook by Lorna Brown (O’Reilly, 2021)

  • Innovative Tableau by Ryan Sleeper (O’Reilly, 2020)

  • Practical Tableau by Ryan Sleeper (O’Reilly, 2018)

  • Communicating Data with Tableau by Ben Jones (O’Reilly, 2014)

  • Tableau Strategies by Ann Jackson and Luke Stanke (O’Reilly, 2021)

  • Tableau Prep: Up & Running by Carl Allchin (O’Reilly, 2020)

  • 97 Things Every Engineering Manager Should Know by Camille Fournier (O’Reilly, 2020)

  • D3 for the Impatient by Philipp K. Janert (O’Reilly, 2019)

  • Building Tools with GitHub by Chris Dawson and Ben Straub (O’Reilly, 2016)

  • Git for Teams by Emma Jane Hogbin Westby (O’Reilly, 2015)

  • Version Control with Git, 2nd Edition, by Jon Loeliger and Matthew McCullough (O’Reilly, 2012)

  • Natural Language Processing with Transformers by Lewis Tunstall, Leandro von Werra, and Thomas Wolf (O’Reilly, 2022)

  • Practical Natural Language Processing by Sowmya Vajjala, Bodhisattwa Majumder, Anuj Gupta, and Harshit Surana (O’Reilly, 2020)

  • Natural Language Processing with PyTorch by Delip Rao and Brian McMahan (O’Reilly, 2019)

  • GPT-3 by Sandra Kublik and Shubham Saboo (O’Reilly, 2022)

  • Natural Language Processing with Spark NLP by Alex Thomas (O’Reilly, 2020)

  • Deep Learning Cookbook by Douwe Osinga (O’Reilly, 2018)

  • Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition, by Aurélien Géron (O’Reilly, 2019)

  • Machine Learning by Peter Flach (Cambridge, 2012)

  • Deep Learning for Coders with Fastai and PyTorch by Jeremy Howard and Sylvain Gugger (O’Reilly, 2020)

  • BLAST by Ian Korf, Mark Yandell, and Joseph Bedell (O’Reilly, 2003)

  • Developing Bioinformatics Computer Skills by Cynthia Gibas and Per Jambeck (O’Reilly, 2001)

  • Learning Microsoft Power BI: Transforming Data into Insights by Jeremey Arnold (O’Reilly, 2022)

  • Becoming a Data Head by Alex J. Gutman and Jordan Goldmeier (Wiley, 2021)

  • Computational Mathematics with SageMath by Paul Zimmermann (SIAM, 2018)

  • 97 Things Every Cloud Engineer Should Know by Emily Freeman and Nathen Harvey (O’Reilly, 2021)

  • Raspberry Pi Cookbook, 3rd Edition, by Simon Monk (O’Reilly, 2020)

  • 97 Things About Ethics Everyone in Data Science Should Know, by Bill Franks (O’Reilly, 2020)

  • 97 Things Every Data Engineer Should Know, by Tobias Macey (O’Reilly, 2021)

  • 97 Things Every Programmer Should Know, by Kevlin Henney (O’Reilly, 2010)

  • Sage for Undergraduates, 2nd Edition, by Gregory V. Bard (AMS, 2022)

  • Sage Beginner’s Guide, by Craig Finch (Packt, 2011)

  • Command-Line Rust, by Ken Youens-Clark (O’Reilly, 2022)

  • Programming Rust, by Jim Blandy, Jason Orendorff, and Leonora F. S. Tindall (O’Reilly, 2021)

  • Rust for Rustaceans by Jon Gjengset (No Starch Press, 2022)

  • Learning GNU Emacs, 3rd Edition, by Debra Cameron, James Elliott, Marc Loy, Eric Raymond, and Bill Rosenblatt (O’Reilly, 2005)

  • The Rust Programming Language, by Steve Klabnik and Carol Nichols (No Starch Press, 2019)

  • Introducing Data Science by Davy Cielen, Arno D. B. Meysman, and Mohamed Ali (Manning, 2016)

  • flex & bison by John Levine (O’Reilly, 2009)

  • Learning React, 2nd Edition, by Alex Banks and Eve Porcello (O’Reilly, 2020)

  • Practical Time Series Analysis by Aileen Nielsen (O’Reilly, 2020)

  • Practical Statistics for Data Scientists, 2nd Edition, by Peter Bruce, Andrew Bruce, and Peter Gedeck (O’Reilly, 2020)

  • Building Secure and Reliable Systems by Heather Adkins, Betsy Beyer, Paul Blankinship, Piotr Lewandowski, Ana Oprea, and Adam Stubblefield (O’Reilly, 2020)

  • JavaScript: The Good Parts by Douglas Crockford (O’Reilly, 2008)

  • Learning PHP, MySQL & JavaScript, 5th Edition, by Robin Nixon (O’Reilly, 2018)

  • Programming JavaScript Applications by Eric Elliott (O’Reilly, 2014)

  • Speaking JavaScript by Axel Rauschmayer (O’Reilly, 2014)

  • Data Science on AWS by Chris Fregly and Antje Barth (O’Reilly, 2021)

  • Data Science from Scratch, 2nd Edition, by Joel Grus (O’Reilly, 2019)

  • Think Like a Data Scientist by Brian Godsey (Manning, 2017)

  • Design Patterns by Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides (Addison-Wesley, 1995)

  • Mastering Kafka Streams and ksqlDB by Mitch Seymour (O’Reilly, 2021)

  • Understanding Compression by Colt McAnlis and Aleks Haecky (O’Reilly, 2016)

  • Kubernetes Operators by Jason Dobies and Joshua Wood (O’Reilly, 2020)

  • Production Kubernetes by Josh Rosso, Rich Lander, Alexander Brand, and John Harris (O’Reilly, 2021)

  • Spark: The Definitive Guide by Bill Chambers and Matei Zaharia (O’Reilly, 2018)

  • Foundations for Architecting Data Solutions by Ted Malaska and Jonathan Seidman (O’Reilly, 2018)

  • High Performance Spark by Holden Karau and Rachel Warren (O’Reilly, 2017)

  • Mastering Azure Analytics by Zoiner Tejada (O’Reilly, 2017)

  • Programming Hive by Edward Capriolo, Dean Wampler, and Jason Rutherglen (O’Reilly, 2012)

  • The Enterprise Big Data Lake by Alex Gorelik (O’Reilly, 2019)

  • Stream Processing with Apache Spark by Gerard Maas and Francois Garillot (O’Reilly, 2019)

  • Modern Statistics for Modern Biology by Susan Holmes and Wolfgang Huber (Cambridge, 2019)

  • Data Science at the Command Line by Jeroen Janssens (O’Reilly, 2015)

  • Fundamentals of Data Visualization by Claus O. Wilke (O’Reilly, 2019)

  • Presenting to Win by Jerry Weissman (Pearson, 2009)

  • JavaScript Patterns by Stoyan Stefanov (O’Reilly, 2010)

  • JavaScript Enlightenment by Cody Lindley (O’Reilly, 2013)

  • Mapping Experiences, 2nd Edition, by James Kalbach (O’Reilly, 2021)

  • Introduction to JavaScript Object Notation by Lindsay Bassett (O’Reilly, 2015)

  • JavaScript Cookbook, 2nd Edition, by Shelley Powers (O’Reilly, 2015)

  • Kubernetes Best Practices by Brendan Burns, Eddie Villalba, Dave Strebel, and Lachlan Evenson (O’Reilly, 2020)

  • Kubernetes Patterns by Bilgin Ibryam and Roland Huss (O’Reilly, 2019)

  • Data Analysis with Open Source Tools by Philipp K. Janert (O’Reilly, 2011)

  • Learning to Love Data Science by Mike Barlow (O’Reilly, 2015)

  • Statistical Modeling, 2nd Edition, by Daniel T. Kaplan (2012) dtkaplan.github.io/SM2-bookdown/

  • JavaScript: The Definitive Guide, 7th Edition, by David Flanagan (O’Reilly, 2020)

  • CSS: The Definitive Guide, 4th Edition, by Eric Meyer and Estelle Weyl (O’Reilly, 2018)

  • Code Complete, 2nd Edition, by Steve McConnell (Microsoft Press, 2004)

  • Software Engineering at Google by Titus Winters, Tom Manshreck, and Hyrum Wright (O’Reilly, 2020)

  • Asked and Answered by Pamela E. Harris and Aris Winger (2020)

  • Practices and Policies by Pamela E. Harris and Aris Winger (2021)

  • Read and Rectify by Pamela E. Harris and Aris Winger (2022)

  • Testimonios by Pamela E. Harris, Alicia Prieto-Langarica, Vanessa Rivera Quiñones, Luis Sordo Vieira, Rosaura Uscanga, and Andrés R. Vindas Meléndez

  • Unleash Different by Rich Donovan (2018)

  • Hadoop Application Architectures by Mark Grover, Ted Malaska, Jonathan Seidman, and Gwen Shapira (O’Reilly, 2015)

  • MapReduce Design Patterns by Donald Miner and Adam Shook (O’Reilly, 2013)

  • Advanced Analytics with Spark, 2nd Edition, by Sandy Ryza, Uri Laserson, Sean Owen, and Josh Wills (O’Reilly, 2017)

  • Low-Power Computer Vision by George K. Thiruvathukal, Yung-Hsiang Lu, Jaeyoun Kim, Yiran Chen, and Bo Chen (CRC Press, 2022)

  • Statistics Done Wrong by Alex Reinhart (No Starch Press, 2015)

  • Statistics in a Nutshell by Sarah Boslaugh (O’Reilly, 2008)

  • Learning Spark by Jules S. Damji, Brooke Wenig, Tathagata Das, and Denny Lee (O’Reilly, 2020)

  • Strengthening Deep Neural Networks by Katy Warr (O’Reilly, 2019)

  • Reinforcement Learning by Phil Winder (O’Reilly, 2021)

  • Machine Learning Design Patterns by Valliappa Lakshmanan, Sara Robinson, and Michael Munn (O’Reilly, 2021)

  • Fundamentals of Deep Learning by Nikhil Buduma (O’Reilly, 2017)

  • Deep Learning by Josh Patterson and Adam Gibson (O’Reilly, 2017)

  • AI and Machine Learning for Coders by Laurence Moroney (O’Reilly, 2021)

  • Building Machine Learning Powered Applications by Emmanuel Ameisen (O’Reilly, 2020)

  • Deep Learning for the Life Sciences by Bharath Ramsundar, Peter Eastman, Patrick Walters, and Vijay Pande (O’Reilly, 2019)

  • Generative Deep Learning by David Foster (O’Reilly, 2019)

  • Deep Learning from Scratch by Seth Weidman (O’Reilly, 2019)

  • Grokking Deep Learning by Andrew Trask (Manning, 2019)

  • Real-World Machine Learning by Henrik Brink, Joseph W. Richards, and Mark Fetherolf (Manning, 2017)

  • Deep Learning and the Game of Go by Max Pumperla and Kevin Ferguson (Manning, 2019)

  • TensorFlow for Deep Learning by Bharath Ramsundar and Reza Bosagh Zadeh (O’Reilly, 2018)

  • Learning TensorFlow by Tom Hope, Yehezkel S. Resheff, and Itay Lieder (O’Reilly, 2017)

  • Practical Deep Learning for Cloud, Mobile, and Edge by Anirudh Koul, Siddha Ganju, and Meher Kasam (O’Reilly, 2020)

  • Algorithms in a Nutshell, 2nd Edition, by George T. Heineman, Gary Pollice, and Stanley Selkow (O’Reilly, 2016)

  • Making Data Visual by Danyel Fisher and Miriah Meyer (O’Reilly, 2018)

  • Baseball Hacks by Joseph Adler (O’Reilly, 2006)

  • Programming PHP, 2nd Edition, by Rasmus Lerdorf, Kevin Tatroe, and Peter MacIntyre (O’Reilly, 2006)

  • Software Architecture: The Hard Parts by Neal Ford, Mark Richards, Pramod Sadalage, and Zhamak Dehghani (O’Reilly, 2022)

  • AWS Cookbook by John Culkin and Mike Zazon (O’Reilly, 2022)

  • Migrating to AWS: A Manager’s Guide by Jeff Armstrong (O’Reilly, 2020)

  • Building Machine Learning Pipelines by Hannes Hapke and Catherine Nelson (O’Reilly, 2020)

  • Kafka: The Definitive Guide, 2nd Edition, by Gwen Shapira, Todd Palino, Rajini Sivaram, and Krit Petty (O’Reilly, 2022)

  • Data Algorithms by Mahmoud Parsian (O’Reilly, 2015)

  • Mining the Social Web by Matthew A. Russell and Mikhail Klassen (O’Reilly, 2019)

  • Bioinformatics Data Skills by Vince Buffalo (O’Reilly, 2015)

  • Data Analytics with Hadoop by Benjamin Bengfort and Jenny Kim (O’Reilly, 2016)

  • Architecting Modern Data Platforms by Jan Kunigk, Ian Buss, Paul Wilkinson, and Lars George (O’Reilly, 2019)

  • Hadoop in Practice by Alex Holmes (Manning, 2015)

  • Agile for Everybody by Matt LeMay (O’Reilly, 2019)

  • 97 Things Every Scrum Practitioner Should Know by Gunther Verheyen (O’Reilly, 2020)

  • Learning Agile by Andrew Stellman and Jennifer Greene (O’Reilly, 2015)

  • Agile Project Management by Sam Ryan (2019)

  • Agile Practice Guide (Project Management Institute, 2017)