Google "top introduction to data science books" and you'll find posts like "6 Books Every Data Scientist Should Keep Nearby" and "80 Best Data Science Books That Are Worthy Reading." These are great, but who has time to read 80 data science books? Better yet, who wants to give up data coding time for data reading?
In his post "Don't teach students the hard way first," David Robinson explains why learning the tidyverse first is preferable to learning base R first. To me, it's all about doing real data work from the very start, which is why I recommend R for Data Science by Garrett Grolemund & Hadley Wickham. The book is an excellent resource for data manipulation, graphics, and EDA (exploratory data analysis) in the tidyverse; it's the ultimate data science 101 book and quickly gets the R user into the ins and outs of data science.

To illustrate why I prefer the tidyverse over base R, here's a simple example using the nba_draft_2015 dataset from the fivethirtyeight package, which contains information such as player name, position, and draft year.
The first 6 rows of the data
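The original output isn't reproduced here, but a preview like this can be generated with a couple of lines. This is a minimal sketch that assumes the fivethirtyeight package is already installed; the object name `first_rows` is just an illustrative choice.

```r
# Assumes the package is installed: install.packages("fivethirtyeight")
library(fivethirtyeight)

# head() returns the first 6 rows of a data frame by default
first_rows <- head(nba_draft_2015)
print(first_rows)
```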
Base R: Filter
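The original script isn't shown here, so what follows is only a sketch of how a base R filter for centers might look, not necessarily the code used in this post. It assumes the fivethirtyeight package is installed and that the column is named `position` with centers coded as `"C"`; the name `centers_base` is made up for illustration.

```r
library(fivethirtyeight)

# Logical test on the position column, then row-subset with [rows, ].
# which() turns the logical vector into row indices and drops any NA
# comparisons, so rows with a missing position are excluded.
is_center <- nba_draft_2015$position == "C"
centers_base <- nba_draft_2015[which(is_center), ]

# Number of rows kept by the filter
nrow(centers_base)
```

Note how the data frame name has to be repeated inside the brackets, and how easy it is to forget the trailing comma that means "keep all columns."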
WHAT I DON'T LIKE ABOUT BASE R
tidyverse: Filter
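Again, the original script isn't shown here, so this is a sketch of how the same filter could be written with dplyr, under the same assumptions: the fivethirtyeight package is installed, the column is named `position`, and centers are coded as `"C"`. The name `centers_tidy` is illustrative.

```r
library(fivethirtyeight)
library(dplyr)  # part of the tidyverse; library(tidyverse) would also work

# Pipe the data into filter(), which keeps rows where the condition is TRUE
# and drops rows where it is FALSE or NA
centers_tidy <- nba_draft_2015 %>%
  filter(position == "C")

# Number of rows kept by the filter
nrow(centers_tidy)
```

The pipeline reads left to right: take the data, then filter it, with the column referenced by its bare name.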
WHAT I LIKE ABOUT tidyverse
Which script do you prefer?
After running both the Base R: Filter and tidyverse: Filter scripts above, you'll find that they produce identical results! Both filters return a data frame with 128 rows of players whose position is Center (C).

In summary, I recommend R for Data Science by Garrett Grolemund & Hadley Wickham because it gets the R user doing data science fast. As an added bonus, tidyverse code is readable and easy to understand. Your coworkers will thank you when it is time for code review ☺ Don't be the base R guy.