1. Section 1: Introduction

What is data science?

It's drawing useful conclusions from data using computation as our primary tool.

And data science, as a practice, has three core activities: exploration, inference, and prediction.

Exploration is figuring out what patterns exist in the data. When you have many observations about some phenomenon, what can you conclude about the phenomenon itself? Oftentimes, instead of just looking at large tables of numbers, we'll draw data visualizations, because it's much easier to interpret a lot of information at once if it's portrayed in some kind of visual way.

Once we've found a pattern, we need to perform statistical inference, and that's because some patterns are there just by chance and some are there because they're a reflection of some underlying process that's really interesting about the world. So the goal of statistical inference is to quantify whether the patterns that we observe during the exploration phase are reliable. If we collected more data, would we see this pattern again or not? The primary tool we have is randomization, because by simulating random processes, we can see what kinds of patterns appear just by chance. And if the pattern we observe is not the kind of thing that could just appear by chance, then we can conclude that it's because of some robust or reliable pattern in the underlying phenomenon we want to study.
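
To make the randomization idea a bit more concrete, here's a minimal sketch in Python. It assumes a made-up scenario that isn't from the lecture: we flipped a coin 100 times, saw some number of heads, and want to know how often a fair coin would do at least that well just by chance. The function names, the number of flips, and the number of trials are all illustrative assumptions.

```python
import random


def simulate_heads(num_flips, rng):
    """Count heads in num_flips flips of a simulated fair coin."""
    return sum(rng.random() < 0.5 for _ in range(num_flips))


def chance_fraction(observed_heads, num_flips=100, num_trials=10_000, seed=0):
    """Fraction of simulated fair-coin experiments with at least observed_heads heads.

    A small fraction suggests the observed count is not the kind of
    thing that tends to appear just by chance.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    extreme = 0
    for _ in range(num_trials):
        if simulate_heads(num_flips, rng) >= observed_heads:
            extreme += 1
    return extreme / num_trials


if __name__ == "__main__":
    # Hypothetical observations: 60 heads versus 52 heads out of 100 flips.
    print("60 heads:", chance_fraction(60))
    print("52 heads:", chance_fraction(52))
```

If the fraction for the observed count comes out small, chance alone rarely produces a pattern that strong. That's the kind of evidence we'd look for before calling a pattern reliable.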

And finally, we'll perform prediction. This is where we have partial information about something we want to know, and we want to guess about the things we don't know yet. Here we're making informed guesses, quantitative guesses, using a discipline called machine learning. Normally when we write programs, we just focus on the particular logic of what the computer should do, but machine learning is about not programming every detail, but instead using the data to make decisions or choices within that program. So when we write a program, for instance, to recognize speech or automatically translate languages or control a car or a robot, we don't actually write down all the details of what to do, but instead use examples from the world to help computers automatically learn how to behave. And that's a form of prediction, one that we'll talk about in this course.
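
As a rough illustration of using examples instead of hand-written rules, here's a small Python sketch of a nearest-neighbor guess. The lecture doesn't name a specific method at this point, so nearest neighbor is just one simple technique picked for illustration, and the data (hours studied, hours slept, and a pass/fail label) is entirely hypothetical.

```python
import math


def nearest_neighbor_predict(examples, query):
    """Return the label of the labeled example closest to query.

    examples: list of (features, label) pairs, where features is a
    tuple of numbers with the same length as query.
    """
    _, closest_label = min(
        examples, key=lambda pair: math.dist(pair[0], query)
    )
    return closest_label


# Hypothetical labeled examples: (hours studied, hours slept) -> outcome.
examples = [
    ((1.0, 4.0), "fail"),
    ((2.0, 5.0), "fail"),
    ((6.0, 7.0), "pass"),
    ((8.0, 8.0), "pass"),
]

if __name__ == "__main__":
    # Guess the outcome for a new case we have only partial information about.
    print(nearest_neighbor_predict(examples, (7.0, 7.0)))
```

Notice that nothing in the function says what counts as passing. The guess comes entirely from the examples, so changing the examples changes the behavior without touching the program's logic.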

And these three stages correspond to how we'll approach the material in this course. We'll first talk about how to identify patterns, then we'll talk about quantifying whether those patterns are reliable. And finally, based on the patterns we've discovered, the reliable ones can help us make informed guesses about the information that we wish we knew. Once you can do all that, you're well on your way to being a data scientist.

Now, in the process of doing all these things, it's important that you learn how to program a computer, because computing underlies each step of the way, and learning to program is just an essential part of participating in this discipline.

2. Section 2:  
