Module #7: Distribution Analysis using RStudio

For this week's assignment I used RStudio once again, this time for distribution analysis, and created a histogram from the mtcars data set.

A histogram is a visual representation of the distribution of a data set. Based on its shape, we can pick out characteristics of the data visually, such as where the values cluster and how many peaks there are.

In the histogram above I analyzed the distribution of the miles per gallon (mpg) of each vehicle in the mtcars data set. The data shows a double-peaked, or bimodal, distribution. Below is the R code used to create the histogram.

hist(mtcars$mpg, main="MPG of Mtcars Dataset", xlab="MPG", col="green", xlim=c(10,35), breaks=20)
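To check the shape of the histogram with numbers rather than by eye, the same bins can be computed without drawing anything. This is only a minimal sketch: the names `h` and `bins` are made up for illustration, and because `breaks=20` is treated by R as a suggestion, the actual number of bins may differ from 20.

```r
# Compute the histogram bins for mpg without drawing the plot.
# breaks = 20 is only a suggestion; R rounds the cut points to "pretty" values.
h <- hist(mtcars$mpg, breaks = 20, plot = FALSE)

# One row per bin: lower edge, upper edge, and number of cars in the bin.
bins <- data.frame(
  from  = head(h$breaks, -1),
  to    = tail(h$breaks, -1),
  count = h$counts
)
print(bins)
```

Bins with high counts separated by bins with low counts are what show up as separate peaks in the plot. Note that `xlim` only changes the drawn x-axis range, so it has no effect on these counts.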

For this visual I decided to create a box-and-whisker plot of the horsepower in the mtcars data set. The median horsepower is shown by the bold vertical line inside the red rectangle. The box, which holds the middle half of the vehicles, spans roughly 100 to 180 hp, and there is one outlier past the 300 hp mark, shown by a dot. Below is the R code for the box-and-whisker plot.

boxplot(mtcars$hp, horizontal=TRUE, main="Horse Power of Mtcars", xlab="HP", col="red")
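The numbers behind the box, whiskers, and dot can also be read directly with `boxplot.stats()`, which applies the same default rule as `boxplot()`. This is a small sketch, and the variable names `s` and `outlier_cars` are only for illustration.

```r
# Five-number summary used to draw the box plot of horsepower.
s <- boxplot.stats(mtcars$hp)

# Lower whisker, lower hinge, median, upper hinge, upper whisker.
print(s$stats)

# Values beyond the whiskers, drawn as dots on the plot.
print(s$out)

# Look up which car(s) those outlying values belong to.
outlier_cars <- rownames(mtcars)[mtcars$hp %in% s$out]
print(outlier_cars)
```

With the default settings, a point counts as an outlier when it lies more than 1.5 times the box length beyond either edge of the box.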

RStudio offers many more ways to show distributions, and there are also many more ways to customize the graphs and make them more descriptive.
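As one example of another way to show a distribution, a smoothed density curve can be drawn on top of the histogram. This is just a sketch of one option, not part of the original assignment, and the colors and line width are arbitrary choices.

```r
# Estimate a smoothed density for mpg using the default kernel settings.
d <- density(mtcars$mpg)

# Draw the histogram on a density scale (freq = FALSE) so the curve lines up with the bars.
hist(mtcars$mpg, freq = FALSE, main = "MPG of Mtcars Dataset", xlab = "MPG", col = "green")

# Overlay the density curve on the existing plot.
lines(d, col = "blue", lwd = 2)
```

The `freq = FALSE` argument matters here: without it the bars show counts while the curve shows density, and the two would sit on different vertical scales.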

Isaac Mendez

An Original Blog for my Visual Analytics Course