Learning Club 05-07: Starting to love rmarkdown (Naive Bayes, Clustering, Linear Regression)

When I took an R course at university, I was really not a fan of rmarkdown and knitr. But since I have been participating in a Learning Club, where people are encouraged to document and present their code, data and results, I have started to love it. Prior to that I had always documented my assignments at the university either …

Netflix Socks Part 1: Set up the Arduino

A few weeks ago Netflix posted a cool project on their website: Netflix socks that pause your show when you fall asleep. Since I often fall asleep while watching TV, I thought this would be a nice small project. In this post I’ll show you how to set up the Arduino for …

Data Analysis with Microsoft Excel: Tables

My article is also available in German on ambassadorbase.at. In my job, and in the studies I recently finished, I work with lots of different data sources, and you will also meet all of them throughout your career as a data scientist. Data can be given to you as an SQL dump, XML files and many other …

Finding data sets Part 2: TV, music, book ratings and sports data

The first part gave a general overview of where to get data. This part lists specific data sources for particular interests such as sports, movies and books. Over the next couple of weeks you’ll find these posts on my blog: General data sources; TV, music, book ratings and sports data; …

Finding data sets Part 1: General data sources

I often encounter interesting algorithms or R packages which I want to test. The nice ones provide data for testing but often it is only dummy data. To get a good understanding of the method and its limitations real data might be required. Sometimes I would also like to explore data I have not used …

Easy and efficient way to log overwriting of a directory in SQL Server Integration Services (SSIS)

Recently I had a File System Task that moved a file, but the package failed whenever the file was already there. So I set OverwriteDestination to TRUE. But now I had completely lost track of which files were simply moved and which overwrote an already existing directory. My desired result …

Workaround: Restore failed (MSSQL Server)

A database restore can fail for several reasons. I never actually figured out why it didn’t work for me, but I found a workaround. Since the process is very similar, I will also show you how to copy a database on the same server under a different name. Idea 1: …

Use R to connect to twitter and create a wordcloud of your tweets

Recently I wanted to create a wordcloud of my tweets and do some further analysis. In this post I am going to show you how to connect to Twitter from R and how to make a wordcloud from your tweets. To follow this tutorial, you need a Twitter account. First steps in R: Install required libraries …

Learning Club 01: Find and explore a dataset

The first activity of the data science learning club I am participating in is to find and explore a dataset. In my last post I already described the data I found and will use. You can follow all my learning club related activities here. The tasks of this activity are (quoted from the thread above): …

Learning Club 00: Set up your development environment (Getting started with R)

A few weeks ago I became aware that Renee (owner of the blog Becoming a data scientist) plans to start a data science learning club, and I thought it was a cool idea. In the learning club she will post activities, and the first one was about setting up your development environment: Activity 00: Set …