R Programming – Useful Functions

August 2, 2014

Description

The R programming language is very useful for anyone who does a lot of data science work. That work can range from reading in and analyzing data, to plotting, to manipulating data into the form you want.

The purpose of this blog entry is simply to present some of the most commonly used and most useful functions available in R.

 

Useful Functions

General

Comments
#Comments can be created with the '#' character
Set working Directory
setwd("/path/to/dir")
Summarize Data
#Summarize Entire Data Set
summary(data)

#Summarize Single Field in Data Set
summary(data$field)
Print to Console
print("Print String")
Install package
install.packages("library_name")
Import package
library(package_to_import_no_quotes)
Clear Console
Ctrl + L
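
To make the summary calls above concrete, here is a minimal sketch. The data frame `scores` and its columns `name` and `points` are made up for illustration and are not part of any real data set.

```r
# Hypothetical data frame used only for illustration
scores <- data.frame(
  name   = c("a", "b", "c", "d"),
  points = c(10, 20, 30, 40)
)

# Summarize every column at once
print(summary(scores))

# Summarize one numeric column; the result is a named vector
# with Min., 1st Qu., Median, Mean, 3rd Qu., and Max.
points_summary <- summary(scores$points)
print(points_summary)

# Individual statistics can be pulled out by name
mean_points <- points_summary[["Mean"]]   # 25
```

Pulling a single statistic out by name can be handy when only one number is needed later in a script.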

Reading in Data

CSV
data = read.csv("file.csv")

Writing out Data

CSV
write.csv(data, file = "fileName.csv")
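
As a rough sketch of a full write and read round trip, the example below writes a small data frame to a temporary file and reads it back. The data frame `people` is hypothetical, and the use of `tempfile`, `row.names = FALSE`, and `stringsAsFactors = FALSE` is an illustrative choice rather than something the snippets above require.

```r
# Hypothetical data frame used only for illustration
people <- data.frame(
  name = c("Ann", "Bob"),
  age  = c(31, 45)
)

# Write to a temporary file so nothing in the working directory is touched
csv_path <- tempfile(fileext = ".csv")

# row.names = FALSE stops write.csv from adding a column of row numbers
write.csv(people, file = csv_path, row.names = FALSE)

# stringsAsFactors = FALSE keeps text columns as character vectors
# (this is the default from R 4.0.0 onward; older versions default to TRUE)
people_again <- read.csv(csv_path, stringsAsFactors = FALSE)

print(people_again)
str(people_again)
```

Without `row.names = FALSE`, the file gets an extra unnamed first column, which `read.csv` then reads back as a column called `X`.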

String Operations

String Contains
grepl("regex", string)
String Replace
gsub('regex', 'value_to_replace_with', string)
String toLowerCase
tolower(string)
String Concatenate
paste("value1", "value2")
#Returns "value1 value2"

paste("value1", "value2", sep="-")
#Returns "value1-value2"
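
The string functions above can be tried out together in one short sketch. The input string `greeting` is made up for illustration; `sub`, `paste0`, and the `fixed` and `collapse` arguments are standard base R additions that are not covered in the list above.

```r
# Hypothetical input string used only for illustration
greeting <- "Hello World"

# grepl returns TRUE/FALSE depending on whether the pattern matches
has_world <- grepl("World", greeting)            # TRUE
has_digit <- grepl("[0-9]", greeting)            # FALSE

# fixed = TRUE treats the pattern as plain text instead of a regex,
# which matters for characters such as "." that are special in a regex
has_dot <- grepl(".", "abc", fixed = TRUE)       # FALSE

# gsub replaces every match; sub replaces only the first one
all_replaced   <- gsub("o", "0", greeting)       # "Hell0 W0rld"
first_replaced <- sub("o", "0", greeting)        # "Hell0 World"

# tolower changes the string to lower case
lowered <- tolower(greeting)                     # "hello world"

# paste0 is paste with sep = "" (no separator)
joined <- paste0("value1", "value2")             # "value1value2"

# collapse joins the elements of one vector into a single string
collapsed <- paste(c("a", "b", "c"), collapse = ",")   # "a,b,c"
```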

Convert Data Types

To String
output = as.character(value)
To Number
output = as.numeric(value)
To Integer
output = as.integer(value)
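
A short sketch of how numeric and integer conversion differ in base R, along with a common pitfall when converting factors. All values and variable names below are made up for illustration.

```r
# as.numeric keeps the fractional part (the result is a double)
as_double <- as.numeric("3.7")      # 3.7

# as.integer drops the fractional part (truncates toward zero)
as_int <- as.integer(3.7)           # 3

# Values that cannot be parsed become NA, with a warning
not_a_number <- suppressWarnings(as.numeric("abc"))   # NA

# Factors store integer codes, so convert through character first
f <- factor(c("10", "20", "10"))
codes  <- as.numeric(f)                  # 1 2 1 (the internal codes)
values <- as.numeric(as.character(f))    # 10 20 10 (the actual numbers)

# class() reports the type of a value
class(as_double)   # "numeric"
class(as_int)      # "integer"
```

The factor case is worth checking whenever a numeric column was read in as text, since calling `as.numeric` directly on a factor silently returns the level codes.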

Plotting

General Plot
plot(data$x, data$y)
Histogram
hist(data$x)
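
Both plotting calls accept extra arguments for titles and axis labels. The sketch below uses `mtcars`, a data set that ships with R, purely as stand-in data; the titles and labels are illustrative choices.

```r
# mtcars is a data set that ships with R, used here only for illustration

# Scatter plot with a title and axis labels
plot(mtcars$wt, mtcars$mpg,
     main = "Fuel economy vs. weight",
     xlab = "Weight (1000 lbs)",
     ylab = "Miles per gallon")

# Histogram; breaks suggests roughly how many bins to use
mpg_hist <- hist(mtcars$mpg,
                 breaks = 10,
                 main = "Distribution of mpg",
                 xlab = "Miles per gallon")

# hist also returns the bin edges and counts it computed
print(mpg_hist$breaks)
print(mpg_hist$counts)
```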

Data Manipulation

Filter
filteredData = data[data$test_field == "wanted_value", ]
Create New Field in Data Frame based on existing fields
data$newField = data$field1 + data$field2
Group By
groupedByData = aggregate(FIELD_TO_HAVE_FUN_EXECUTE_ON ~ GROUP_BY_FIELD, data=dataToGroupBy, FUN=sum)
Merge Data
mergedData = merge(xData, yData, by.x="xField", by.y="yField")
Combine 2 Vectors
#c() creates vectors; passing two vectors to c() joins them
vector1 = c(1,2,3)
vector2 = c(4,5,6)
newVector = c(vector1, vector2)
Combine by Column
cbind(1, 1:7)

     [,1] [,2]
[1,]    1    1
[2,]    1    2
[3,]    1    3
[4,]    1    4
[5,]    1    5
[6,]    1    6
[7,]    1    7

Combine by row
rbind(1, 1:7)

     [,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,]    1    1    1    1    1    1    1
[2,]    1    2    3    4    5    6    7

Rename Column
names(data)[names(data) == "old_field_name"] = "new_field_name"
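
The data manipulation steps above can be chained into one small worked example. The data frames `sales` and `managers` and all of their column names are hypothetical and exist only to show the calls working end to end.

```r
# Hypothetical data frames used only for illustration
sales <- data.frame(
  region = c("east", "west", "east", "west"),
  q1     = c(10, 20, 30, 40),
  q2     = c(1, 2, 3, 4)
)
managers <- data.frame(
  area    = c("east", "west"),
  manager = c("Ann", "Bob")
)

# Filter: keep only the rows where region is "east"
east_sales <- sales[sales$region == "east", ]

# New field computed from existing fields
sales$total <- sales$q1 + sales$q2

# Group by region and sum the totals within each group
totals_by_region <- aggregate(total ~ region, data = sales, FUN = sum)

# Merge on columns that have different names in each data frame
with_manager <- merge(totals_by_region, managers,
                      by.x = "region", by.y = "area")

# Rename a column
names(with_manager)[names(with_manager) == "total"] <- "total_sales"

print(with_manager)
```

The merged result has one row per region, with the summed total and the matching manager side by side.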
From → Data Science, Guide
