Sods17

Ad Hoc Testing

TL;DR: testthat provides a convenient, easy-to-use unit testing framework for R. While traditionally used as a formal part of package development, it can also be used interactively. Ad hoc test suites can be run as functions within an R session to quickly test the impact of code changes. I use this workflow when writing parsing functions for HTML data. Introduction: Like all Hadley Wickham creations, testthat is a wonderful tool that generally improves the lives of R users.
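As a rough illustration of the ad hoc workflow described above, the sketch below wraps a few testthat expectations in an ordinary function that can be called from the console after every edit. The helper `parse_speaker()`, the suite `check_parse_speaker()`, and the sample script lines are hypothetical, invented for this example rather than taken from the post.

```r
library(testthat)

# Hypothetical parsing helper (not from the original post): return the
# speaker name from a script line of the form "Name: dialogue".
parse_speaker <- function(line) {
  trimws(sub(":.*$", "", line))
}

# Hypothetical ad hoc suite: a plain function wrapping a test_that() block,
# so it can be re-run interactively instead of living in tests/testthat/.
check_parse_speaker <- function() {
  test_that("speaker is extracted from a script line", {
    expect_equal(parse_speaker("Monica: There's nothing to tell!"), "Monica")
    expect_equal(parse_speaker("  Joey : Hey!"), "Joey")
  })
}

# Call this after each change to parse_speaker().
check_parse_speaker()
```

Keeping the suite in a function means the same checks can be repeated with one call while the parser is still changing, without setting up a package test directory.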

Scraping Friends

TL;DR: HTML data can be messy and difficult to work with. Tools from the tidyverse (like dplyr, purrr, and rvest) make this process much easier, although creating clean data from HTML takes time and patience. Ad hoc testing can be used to quickly evaluate the accuracy of an HTML parsing function. Clean data is well worth the time and effort required to obtain or create it. Getting Started: This post outlines the process of scraping and cleaning the scripts of every Friends TV episode.
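To make the parsing step above concrete, here is a minimal sketch using rvest on a small inline HTML fragment. The markup, the CSS selectors, and the variable names are assumptions made up for illustration; the real script pages are likely messier and structured differently.

```r
library(rvest)

# Invented HTML fragment standing in for part of a script page.
html <- paste0(
  "<html><body>",
  "<p><b>Monica:</b> There's nothing to tell!</p>",
  "<p><b>Joey:</b> Hey!</p>",
  "</body></html>"
)

page <- read_html(html)

# One element per paragraph; trim = TRUE strips surrounding whitespace.
full_lines <- html_text(html_elements(page, "p"), trim = TRUE)

# Speaker names sit in <b> tags in this made-up markup.
speakers <- html_text(html_elements(page, "p b"), trim = TRUE)
speakers <- sub(":$", "", speakers)

# Assemble a small tidy table: one row per spoken line.
script <- data.frame(
  speaker = speakers,
  line = trimws(sub("^[^:]*:", "", full_lines)),
  stringsAsFactors = FALSE
)
```

A table like `script` is a convenient target for ad hoc checks, for example confirming that the number of speakers matches the number of lines.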

Summer of Data Science 2017

I first learned of #SoDS17 through Mara Averick and was further enlightened by Data Science Renee’s tweet ("Here's how to participate in the Summer of Data Science #SoDS17", @BecomingDataSci, May 29, 2017) and accompanying blog post. As described, the basic premise is simple: set a goal to learn something new in the broad data science domain and make an effort to share the journey with others.