I am a beginner in R and I am working on a data set that has 2075260 rows and ten columns. The file has a .txt extension. When I read it with read.csv or read.table and run str on the resulting data frame, I get this output:

data <- read.csv("mydata.txt")

 str(data)

'data.frame':   2075260 obs. of  1 variable:
 $ V1: Factor w/ 2075260 levels "1/1/2007;00:00:00;2.580;0.136;241.970;10.600;0.000;0.000;0.000",..: 2075260 491041 491042 491043 491044 491045 491046 491047 491048 491049 ...

I want to store this data in a data frame that has 2075260 obs. and ten variables, but there appears to be a problem that I am not able to figure out. I searched similar questions but couldn't find the answer. Your answer will be very much appreciated!

Regards,

cderv
Shapoor Hamid
  • Does read.csv("mydata.txt",sep=";") work? – Florian Jul 15 '17 at 13:19
  • Use `read.csv2`, since your separator is `;`. You can also use `read.table` and set its parameters to match your file (dec = "." and sep = ";", it seems). Do not forget to read the help page for a function in R; you'll find plenty of information there. – cderv Jul 15 '17 at 13:20
  • Possible duplicate of [how to read text file into R](https://stackoverflow.com/questions/20650037/how-to-read-text-file-into-r) – cderv Jul 15 '17 at 13:21
  • The read.table function runs the data and produces the above result. – Shapoor Hamid Jul 15 '17 at 13:21
  • First open your file with a text editor, see if there are **header**, and what is the real **separator**. Then use `read.table(header = TRUE or FALSE, sep = "real separator")`. – xtluo Jul 15 '17 at 13:39
  • Hey! it worked! I found the part I was missing! Thanks for your assistance! It was very much helpful! – Shapoor Hamid Jul 15 '17 at 13:43

1 Answer

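The file extension does not matter here: read.csv is just read.table with different defaults (sep = "," and header = TRUE), so either one can read a .txt file. The problem is that the fields in your file are separated by semicolons, so each whole line is read as a single column. Tell read.table which separator to use:

data <- read.table("mydata.txt", sep = ";", header = TRUE)

With sep you specify the character that appears between two columns, for example a comma (","), a semicolon (";"), a space (" ") or a tab ("\t"). With header you specify whether the first line of the file contains the column names; use header = FALSE if it does not.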
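To see the effect of sep and header without loading the full file, here is a minimal sketch that writes a tiny stand-in file and reads it back. The header line and its column names are hypothetical, and the second data row is made up for illustration; only the first data row is taken from the str output in the question.

```r
# Write a small stand-in file (header names are hypothetical,
# the second data row is invented for illustration).
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "Date;Time;V3;V4;V5;V6;V7;V8;V9",
  "1/1/2007;00:00:00;2.580;0.136;241.970;10.600;0.000;0.000;0.000",
  "1/1/2007;00:01:00;2.552;0.100;241.750;10.400;0.000;0.000;0.000"
), tmp)

# Look at the raw text first to confirm the separator
# and whether the first line holds column names.
first_lines <- readLines(tmp, n = 2)
print(first_lines)

# Read the file using the separator seen in the raw lines.
data <- read.table(tmp, sep = ";", header = TRUE, stringsAsFactors = FALSE)
str(data)
```

Checking the first lines with readLines before calling read.table is cheap even for a file with millions of rows, because only the requested number of lines is read.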

Will