I have data that is saved to a text file with headers.

grade (X)     number of students (Y)
100                   7
99                    4
98                    9
97                    14
96                    11
95                    9
94                    15

and so on until

5                     2
4                     3
3                     1
2                     1
1                     1

I tried loading the data using this code:

> df <- read.table("data.txt",
+                  header = TRUE)

It gave me this error:

Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  : 
  line 1 did not have 7 elements

This is my first time using R. Is there a simple way, or a code sample, to load this data?
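One possible way to read the file as it stands is sketched below. It assumes the columns are separated by runs of spaces and that the header occupies only the first line. Because the header text itself contains spaces, it splits into more fields than the two found on each data line, so the sketch skips it and supplies column names instead. The temporary file, the sample rows, and the names `grade` and `count` are illustrative assumptions, not part of the original file.

```r
# Recreate a few lines of the file in a temporary location (illustrative sample only).
path <- tempfile(fileext = ".txt")
writeLines(c(
  "grade (X)     number of students (Y)",
  "100                   7",
  "99                    4",
  "98                    9"
), path)

# count.fields() reports how many whitespace-separated fields each line has:
# the header splits into more than two pieces, each data line into exactly two.
fields <- count.fields(path)
print(fields)

# Skip the header line and supply simple column names instead.
# `grade` and `count` are hypothetical names chosen for this sketch.
df <- read.table(path, header = FALSE, skip = 1,
                 col.names = c("grade", "count"))
print(df)
```

With the real file, `path` would simply be `"data.txt"`.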

Hussam Hallak
  • What is your separator? By default, `read.table()` splits on any whitespace (`sep = ""`), so you may want to specify the delimiter with `sep=` when importing the file. For tab-separated data, use `sep="\t"`. – Eric Lecoutre Feb 08 '17 at 20:32
  • This question doesn't really have anything to do with histograms yet. You need to focus on just reading in the data correctly. R expects well-formed "rectangular" data files: your columns should have a consistent delimiter between them. Having column names with spaces and punctuation can make things difficult. Where did this raw data come from? – MrFlick Feb 08 '17 at 21:17
  • To clarify, when reading in `grade (X) number of students (Y)`, it finds many column headings. Then, when it comes to the first line of the data (the second line of the file), it finds only two values, which is why it generates an error. Since you're using RStudio, I'd recommend using the import dataset functionality under the 'Environment' tab and paying close attention to the code that RStudio suggests. – Axeman Feb 09 '17 at 07:47
  • @MrFlick I generated this data with a Python script I wrote, so I can change how it is formatted. What format would make the data easy to read in R? – Hussam Hallak Feb 09 '17 at 13:15
  • Tab separated values would be best (or you can use spaces but take out leading spaces, and don't bother trying to align column values). Choose column names without spaces or punctuation. That will make things easier for R. Then use `read.table()` with `sep="\t"`. – MrFlick Feb 09 '17 at 15:42
  • So you mean something like `100<TAB>7`, then on the next line `99<TAB>4`, etc.? – Hussam Hallak Feb 09 '17 at 17:35
  • @HussamHallak yes. – MrFlick Feb 09 '17 at 18:43
  • @MrFlick Thanks. Then this code should work to generate a simple histogram, correct? Read values: `s_data <- read.table("C:/R/data.txt", header=T, sep="\t")`. Concatenate the vectors: `s <- c(s_data$grades, s_data$count)`. Plot: `hist(s_data, col=heat.colors, breaks=1)` – Hussam Hallak Feb 09 '17 at 19:37
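
A minimal sketch of the plotting step, following the tab-separated layout suggested in the comments above. It assumes the file has the simple column names `grade` and `count` (hypothetical names; the sample rows are taken from the question). Two things differ from the last snippet in the thread: `hist()` needs a numeric vector with one value per student, not a data frame, and `col` needs a vector of colours such as `heat.colors(10)`, not the function itself.

```r
# Tab-separated sample with simple column names (illustrative values from the question).
path <- tempfile(fileext = ".txt")
writeLines(c("grade\tcount", "100\t7", "99\t4", "98\t9", "97\t14"), path)

s_data <- read.table(path, header = TRUE, sep = "\t")

# hist() expects one numeric value per observation, so repeat each grade
# by its student count to rebuild the individual grades.
grades <- rep(s_data$grade, times = s_data$count)

# Ten-point bins from 0 to 100; heat.colors(10) returns a vector of 10 colours.
hist(grades, breaks = seq(0, 100, by = 10), col = heat.colors(10),
     main = "Distribution of grades", xlab = "Grade")
```

If one bar per grade is preferred over binned ranges, `barplot(s_data$count, names.arg = s_data$grade)` draws the counts directly without expanding them.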

0 Answers