
March 31, 2022 by Phu Nguyen

Pulling Data in R for Smart Analysis


Different Ways to Handle Data in R

R can read data from:

Spreadsheets
Excel sheets
Databases
Images
Text files
Many other special formats


Get Data Into R

Whether data is local or available on the Web, with R programming you will be able to successfully import data in different formats.

Read Data From Files

In the simplest case, the data is in a file stored on the local system. All that is required to read or write this data is to know the directory in which the file is stored.

Setting Directories

The first step is to set up the working directory.
To identify the current directory (folder), use the command getwd().
On a Linux PC, the output shows the path as follows:

> getwd()
[1] "/home/test"

On Windows, the path looks like this:

c:\data\test

To set the working directory to the folder in which the data file is saved, use the command setwd("path"), where path is the full directory path, including subdirectories, to the data file. For example, if the data is in the file temp.txt and the file is in the folder /home/test/example/, then issue:

setwd("/home/test/example/")

On Windows, write the path with forward slashes or doubled backslashes, because a single backslash starts an escape sequence in an R string:

setwd("C:/mydata/test")

It is necessary to know the folder in which the file is saved.
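
The directory commands above can be sketched as follows. This is a minimal illustration, not a prescribed workflow: the temporary folder from tempdir() is an assumption standing in for the real data folder, and temp.txt is created on the spot so the example runs anywhere.

```r
# Remember the current working directory so it can be restored later.
old_dir <- getwd()

# Hypothetical data folder: a temporary directory stands in for /home/test/example/.
data_dir <- tempdir()

# Switch to the data folder; forward slashes work on Linux and Windows alike.
setwd(data_dir)

# Inside the working directory, files can be referred to by name alone.
writeLines("this is a sample file", "temp.txt")
found <- file.exists("temp.txt")
print(found)

# Return to the original directory.
setwd(old_dir)
```

Saving and restoring the old directory keeps the rest of the session unaffected by the change.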

Reading Text File

Data contained in a text file can be read into an R session using the scan command.
Remember to pass the option what="" to scan, which indicates that the input is to be read as character data.
For this session, I have created the file textsample.txt, which can be read into the R session.

> fdata <- scan("textsample.txt", what="")

Now fdata holds the words from the .txt file.
Let's review the first few entries with the command head(fdata):

> head(fdata)
[1] "this"      "is"        "a"         "sample"    "file"      "generated"

To convert the words to lower case, use tolower.

> fdata <- tolower(fdata)

The file contains many words, each stored as a separate element, and some of them are repeated.

To count the frequency of each word, use table:

> ft <- table(fdata)

To view a pie chart of ft, use the command:

> pie(ft)

[Figure: pie chart of the word frequencies in ft]

From the above graph, the words “file” and “the” have the highest frequency.

The maximum frequency of the words in ft can be found directly by using the max command.

> max(ft)
[1] 4

The first few entries of the frequency table can be viewed with head:

> head(ft)
fdata
        a        be        by       can character   command 
        1         3         1         2         1         1 

The dotchart command plots the words against their frequencies.

> dotchart(ft)

[Figure: dot chart of the word frequencies in ft]
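
The word-count steps above can be sketched end to end. In this illustration the inline string sample_text is an assumption standing in for textsample.txt, and the names words, freq and top_words are chosen for the example only.

```r
# Inline sample text stands in for textsample.txt (an assumption for illustration).
sample_text <- "This is a sample file the file is read by the scan command"

# scan() splits on whitespace; what = "" reads each token as a character string.
words <- tolower(scan(text = sample_text, what = "", quiet = TRUE))

# Count how often each word occurs, then sort from most to least frequent.
freq <- sort(table(words), decreasing = TRUE)

# Words that reach the maximum frequency.
top_words <- names(freq)[freq == max(freq)]
print(freq)
print(top_words)
```

Sorting the table first makes the most frequent words easy to spot without reading them off a chart.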

Commands to Read Data From File

Two of the most common data file formats are csv and xls: a csv file holds comma-separated values, and xls is the file extension of an Excel file.


Delimited text files such as csv files can be read with the commands read.csv and read.table:

> read.csv("test.csv", header=TRUE)
  Status   Age    V1    V2    V3    V4
1      P 23646 45190 50333 55166 56271
2     CC 26174 35535 38227 37911 41184
3     CC 27723 25691 25712 26144 26398
4     CC 27193 30949 29693 29754 30772
5     CC 24370 50542 51966 54341 54273
6     CC 28359 58591 58803 59435 61292
7     CC 25136 45801 45389 47197 47126

read.table splits fields on whitespace by default, so the comma separator must be given explicitly with sep=",":

> read.table("test.csv", header=TRUE, sep=",")
  Status   Age    V1    V2    V3    V4
1      P 23646 45190 50333 55166 56271
2     CC 26174 35535 38227 37911 41184
3     CC 27723 25691 25712 26144 26398
4     CC 27193 30949 29693 29754 30772
5     CC 24370 50542 51966 54341 54273
6     CC 28359 58591 58803 59435 61292
7     CC 25136 45801 45389 47197 47126
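
To try the two commands without a test.csv at hand, a small file can be written first. The sketch below is a hedged illustration: the temporary file path and the three-column excerpt are assumptions, with the values borrowed from the first two rows shown above.

```r
# Write a small comma-separated file to a temporary location (illustrative excerpt).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Status,Age,V1",
             "P,23646,45190",
             "CC,26174,35535"), csv_path)

# read.csv() assumes a header row and a comma separator.
d1 <- read.csv(csv_path)

# read.table() assumes whitespace separators and no header, so both are set here.
d2 <- read.table(csv_path, header = TRUE, sep = ",")

# Compare the two results and inspect the column types.
print(all.equal(d1, d2))
str(d1)
```

For simple files like this one, the two calls should produce the same data frame; read.csv is essentially read.table with csv-friendly defaults.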

Fetch Data Directly From the Web

Data can also be read directly from the Web: R fetches the contents of a URL straight into memory. The example dataset used here is hosted at http://lib.stat.cmu.edu/datasets/csb/ch3a.dat.

Read the data directly with the read.table or read.csv command, passing the URL in place of a file name.


> data1 <- read.table("http://lib.stat.cmu.edu/datasets/csb/ch3a.dat")
> head(data1)
        V1    V2    V3    V4    V5
1 07/08/91 47.33 52.82 19.58 17.78
2 07/09/91 42.58 53.25  9.42  6.06
3 07/10/91 59.55 56.32 19.83 14.81
4 07/11/91 52.92 50.06 15.08  9.75
5 07/12/91 55.25 59.50 28.75 27.21
6 07/13/91 54.75 56.80 27.83 20.84

> data2 <- read.csv("http://lib.stat.cmu.edu/datasets/csb/ch3a.dat")
> head(data2)
  X07.08.91....47.33....52.82....19.58....17.78
1              07/09/91 42.58 53.25 9.42 6.06
2              07/10/91 59.55 56.32 19.83 14.81
3              07/11/91 52.92 50.06 15.08 9.75
4              07/12/91 55.25 59.50 28.75 27.21
5              07/13/91 54.75 56.80 27.83 20.84
6              07/14/91 35.33 40.88 11.83 15.65

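data1 and data2 hold the same file parsed in two different ways. read.table splits each line on whitespace and assumes no header, so data1 has five columns named V1 to V5. read.csv assumes a header row and a comma separator, so data2 takes the first line of the file as its column name and stores every remaining line as a single column.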
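
The difference between the two commands can be reproduced without a network connection. In this sketch the vector raw is an assumption holding two lines copied from the output above, and the text argument is used in place of the URL.

```r
# Two lines in the same whitespace-separated layout as ch3a.dat.
raw <- c("07/08/91 47.33 52.82 19.58 17.78",
         "07/09/91 42.58 53.25  9.42  6.06")

# read.table(): whitespace separator and no header by default.
t1 <- read.table(text = raw)

# read.csv() with defaults: the first line becomes the header,
# and each remaining line is read as one comma-free field.
c1 <- read.csv(text = raw)

# read.csv() told about the actual layout of the data.
c2 <- read.csv(text = raw, header = FALSE, sep = "")

print(dim(t1))
print(dim(c1))
print(dim(c2))
```

Passing header = FALSE and sep = "" makes read.csv treat the file the way read.table does, which is usually the simpler fix when a file turns out not to be comma-separated.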

Reading Spreadsheets

One way to read spreadsheet data is the gdata package, which has to be installed and loaded first.

> install.packages("gdata")
> library(gdata)

The package provides the command read.xls.
The data file test.xls can then be read with read.xls("test.xls").

Enter Spreadsheet-Style Data Through the Editor in R

x <- edit(as.data.frame(NULL))


Datasets in R

The datasets that ship with R can be listed with data(). A specific dataset is loaded by passing its name; note that names are case-sensitive:

data(AirPassengers)

To see the description of the data, use the command:

help(AirPassengers)

To see the actual data, use the head command:

> head(AirPassengers)
[1] 112 118 132 129 121 135
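
Beyond head, a few base R functions show how a built-in dataset is structured. The sketch below is a small illustration using AirPassengers; the variable name first_year is chosen for the example only.

```r
# AirPassengers ships with R in the datasets package.
data(AirPassengers)

# It is a monthly time series (class "ts"), not a data frame.
print(class(AirPassengers))
print(start(AirPassengers))      # first year and period of the series
print(frequency(AirPassengers))  # observations per year

# window() extracts a sub-period, here the first calendar year.
first_year <- window(AirPassengers, start = c(1949, 1), end = c(1949, 12))
print(first_year)
```

Checking the class first is worthwhile because time series, matrices and data frames each need different functions for subsetting and plotting.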

More about data can be found in the R manual.
Here is the GitHub repo link for the code we have used in this post.

Featured image via Flickr Creative Commons.

Manjusha Joshi is a freelancer for free open source software in scientific computing. She is a mathematician and a member of the Pune Linux User group.

Source: InApps.net
