Doing Data Science: Straight Talk from the Frontline
by Rachel Schutt
Synopses & Reviews
Now that answering complex and compelling questions with data can make the difference in an election or a business model, data science is an attractive discipline. But how can you learn this wide-ranging, interdisciplinary field? With this book, you'll get material from Columbia University's "Introduction to Data Science" class in an easy-to-follow format.
Each chapter-long lecture features a guest data scientist from a prominent company such as Google, Microsoft, or eBay teaching new algorithms, methods, or models by sharing case studies and actual code they use. You'll learn what's involved in the lives of data scientists and be able to use the techniques they present.
Guest lectures focus on topics such as:
If you're familiar with linear algebra, probability, and statistics, and have some programming experience, this book will get you started with data science.
Doing Data Science is a collaboration between course instructor Rachel Schutt (also employed by Google) and data science consultant Cathy O'Neil (former quantitative analyst for D.E. Shaw), who attended and blogged about the course.
Now that people are aware that data can make the difference in an election or a business model, data science as an occupation is gaining ground. But how can you get started working in a wide-ranging, interdisciplinary field that's so clouded in hype? This insightful book, based on Columbia University's Introduction to Data Science class, tells you what you need to know.
In many of these chapter-long lectures, data scientists from companies such as Google, Microsoft, and eBay share new algorithms, methods, and models by presenting case studies and the code they use. If you're familiar with linear algebra, probability, and statistics, and have programming experience, this book is an ideal introduction to data science.
Doing Data Science is a collaboration between course instructor Rachel Schutt, Senior VP of Data Science at News Corp, and data science consultant Cathy O'Neil, a senior data scientist at Johnson Research Labs, who attended and blogged about the course.
About the Author
Cathy O'Neil earned a Ph.D. in math from Harvard, was a postdoc in the MIT math department, and was a professor at Barnard College, where she published a number of research papers in arithmetic algebraic geometry. She then chucked it and switched over to the private sector. She worked as a quant for the hedge fund D.E. Shaw in the middle of the credit crisis, and then for RiskMetrics, a risk software company that assesses risk for the holdings of hedge funds and banks. She is currently a data scientist on the New York start-up scene, writes a blog at mathbabe.org, and is involved with Occupy Wall Street.
Rachel Schutt is the Senior Vice President for Data Science at News Corp. She earned a PhD in Statistics from Columbia University, and was a statistician at Google Research for several years. She is an adjunct professor in Columbia's Department of Statistics and a founding member of the Education Committee for the Institute for Data Sciences and Engineering at Columbia. She holds several pending patents based on her work at Google, where she helped build user-facing products by prototyping algorithms and building models to understand user behavior. She has a master's degree in mathematics from NYU, and a master's degree in Engineering-Economic Systems and Operations Research from Stanford University. Her undergraduate degree is in Honors Mathematics from the University of Michigan.
Table of Contents
Dedication
Preface
Chapter 1: Introduction: What Is Data Science?
Chapter 2: Statistical Inference, Exploratory Data Analysis, and the Data Science Process
Chapter 3: Algorithms
Chapter 4: Spam Filters, Naive Bayes, and Wrangling
Chapter 5: Logistic Regression
Chapter 6: Time Stamps and Financial Modeling
Chapter 7: Extracting Meaning from Data
Chapter 8: Recommendation Engines: Building a User-Facing Data Product at Scale
Chapter 9: Data Visualization and Fraud Detection
Chapter 10: Social Networks and Data Journalism
Chapter 11: Causality
Chapter 12: Epidemiology
Chapter 13: Lessons Learned from Data Competitions: Data Leakage and Model Evaluation
Chapter 14: Data Engineering: MapReduce, Pregel, and Hadoop
Chapter 15: The Students Speak
Chapter 16: Next-Generation Data Scientists, Hubris, and Ethics
Index
Colophon