Basic R Programming | Data Manipulation in R with dplyr


dplyr is a powerful R package for transforming and summarizing tabular data organized in rows and columns. It is not part of the default R installation.

To install it, use the following command:
install.packages("dplyr")
To load it into memory, use the following command:
library(dplyr)
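
If a script may run on a machine where dplyr is not yet installed, the two commands above can be combined so that installation happens only when needed. This is a minimal sketch that assumes a CRAN mirror is reachable; it is one common pattern, not the only one.

```r
# Install dplyr only if it is not already available, then load it.
if (!requireNamespace("dplyr", quietly = TRUE)) {
  install.packages("dplyr")
}
library(dplyr)
```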

The package contains a set of functions (or "verbs") that perform common data manipulation operations, listed below. We will use the mtcars dataset, which ships with R itself in the built-in datasets package.
Select picks specific columns from a large data set.
Filter keeps specific rows based on a condition, which makes it easy to narrow the data down to the relevant records.
Arrange sorts the rows in ascending or descending order based on a column.
Mutate adds new variables to an existing data set.
Summarise condenses chunks of your data, reducing multiple values to a single value such as a mean or a count.
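
The five verbs can be tried on mtcars as follows. This is a small sketch rather than a complete tour; the column choices and the new column name hp_per_wt are made up for illustration.

```r
library(dplyr)

# select(): keep only the named columns
cars_small <- select(mtcars, mpg, cyl, hp)

# filter(): keep rows that satisfy a condition (here, 6-cylinder cars)
six_cyl <- filter(mtcars, cyl == 6)

# arrange(): sort rows by a column; desc() requests descending order
by_mpg <- arrange(mtcars, desc(mpg))

# mutate(): add a new column computed from existing ones
# (hp_per_wt is a hypothetical name chosen for this example)
with_ratio <- mutate(mtcars, hp_per_wt = hp / wt)

# summarise(): collapse all rows into a single summary row
avg <- summarise(mtcars, mean_mpg = mean(mpg), n_cars = n())

print(head(cars_small, 3))
print(avg)
```

Each verb takes a data frame as its first argument and returns a new data frame, so the original mtcars is never modified.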

Load dataset in R:
R can import data from Comma Separated Values (CSV) files, Excel files, and other tabular formats.
Let's load a data frame from a CSV file and view the first few records to understand the data.
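
A minimal sketch of that step is shown here. Since the original CSV file is not named, the example first writes mtcars to a temporary file (csv_path is a placeholder) and then reads it back; in practice you would point read.csv() at your own file.

```r
# Write a small CSV to a temporary file so the example is reproducible.
# csv_path is a stand-in; replace it with the path to your own file.
csv_path <- tempfile(fileext = ".csv")
write.csv(mtcars, csv_path, row.names = FALSE)

# read.csv() returns a data frame
cars_df <- read.csv(csv_path)

# head() shows the first few records (6 by default)
print(head(cars_df))

# str() lists each column with its type
str(cars_df)
```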

Understanding the remission dataset
General Functions
Statistical Functions
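
As a rough illustration of the kind of general and statistical functions typically used to get to know a data frame, here is a sketch that uses mtcars as a stand-in, since the remission data is not included here. The choice of functions is an assumption, not a record of the original examples.

```r
# General functions: size and structure of the data frame
print(dim(mtcars))      # number of rows and columns
print(names(mtcars))    # column names
str(mtcars)             # column types and a preview of values
print(head(mtcars, 3))  # first three records

# Statistical functions: summaries of a single column
mean_mpg   <- mean(mtcars$mpg)
median_mpg <- median(mtcars$mpg)
sd_mpg     <- sd(mtcars$mpg)
range_mpg  <- range(mtcars$mpg)   # minimum and maximum

# summary() reports min, quartiles, mean, and max for every column
print(summary(mtcars))
```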
