
I have five CSV files of data like the one shown below. Each has "Category: All categories" in the first row, and the column names Name and Marks in the second row.

Category: All categories    

Name Marks
Mohit 100
Raman 71
Kaveri 45
William 42
Ram Pravesh 37

I want to delete the first row so that the data looks like this, and I want to do it for all five files at once.

Student Score
Mohit 100
Raman 71
Kaveri 45
William 42
Ram Pravesh 37

I am doing this manually at the moment, but I believe there must be a short piece of code for this problem.

djMohit

1 Answer


You can specify the column names and the number of lines to skip directly in read.csv.

For example:

read.csv(file = "yourfile.csv",
         skip = 3, # Number of lines to skip before the data (category line, blank line, header)
         header = FALSE, # The header line is among the skipped lines, so do not read one
         col.names = c("Student", "Score"), # Supply your own column names
         stringsAsFactors = FALSE
         )

For a full reproducible example:

# Generate data in text format
raw_file <-
  '
  Category: All categories    

Name, Marks
Mohit, 100
Raman, 71
Kaveri, 45
William, 42
Ram Pravesh, 37
'

# make a temp file to place data
temp_file <- tempfile(fileext = ".csv")

# write the temp file
writeLines(raw_file,con = temp_file)

read.csv(file = temp_file,
         skip = 4, # 4 here because the text above starts with an extra blank line
         header = FALSE, # The header line is among the skipped lines, so do not read one
         col.names = c("Student", "Score"), # Supply your own column names
         stringsAsFactors = FALSE
)

This will yield the following:

      Student Score
1       Mohit   100
2       Raman    71
3      Kaveri    45
4     William    42
5 Ram Pravesh    37
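
The two snippets above use different skip values (3 and 4) only because the number of lines before the data differs. If that count is not the same in every file, one option is to look for the header line and skip everything up to and including it. The following is a minimal sketch under assumptions: `read_after_header` is a hypothetical helper name, and it assumes the header line starts with "Name" and appears within the first 20 lines.

```r
# Hypothetical helper: find the header row, then skip everything through it
read_after_header <- function(path, header_pattern = "^[[:space:]]*Name") {
  first_lines <- readLines(path, n = 20)             # assumes a short preamble
  header_row <- grep(header_pattern, first_lines)[1] # index of the first match
  if (is.na(header_row)) stop("No header row found in ", path)
  read.csv(path,
           skip = header_row,                        # skips the preamble and the header
           header = FALSE,
           col.names = c("Student", "Score"),
           stringsAsFactors = FALSE)
}

# Small demo file with a leading blank line, like the example above
demo_file <- tempfile(fileext = ".csv")
writeLines(c("", "Category: All categories", "", "Name, Marks",
             "Mohit, 100", "Raman, 71"), demo_file)
marks <- read_after_header(demo_file)
```

The same call should work unchanged on a file without the leading blank line, since the skip count is derived from the file itself.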

You also mentioned reading in multiple files:

# Get all the files that end in .csv in a folder that you specify
# (pattern is a regular expression, so escape the dot and anchor the end)
files_to_read <- list.files(path = "some_path", pattern = "\\.csv$", full.names = TRUE)

# I like `purrr` here because it makes a few things easier
# (map_dfr() also needs dplyr installed for the row binding)
# Read in and row bind all csv files to return a single data frame
library(purrr)
out <- map_dfr(files_to_read, ~ read.csv(file = .x,
         skip = 3, # Lines before the data in each file: 3 for the layout in the question
         header = FALSE, # The header line is among the skipped lines, so do not read one
         col.names = c("Student", "Score"), # Supply your own column names
         stringsAsFactors = FALSE
))
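
If you would rather not load purrr, the same read-and-combine step can be sketched in base R with lapply and rbind. This is only an illustration: `read_marks` is a hypothetical helper name, and the demo folder and its two files are made up so the snippet runs on its own.

```r
# Demo setup: a temporary folder with two small files (made-up stand-ins)
demo_dir <- file.path(tempdir(), "marks_combine_demo")
unlink(demo_dir, recursive = TRUE)
dir.create(demo_dir)
raw <- c("Category: All categories", "", "Name,Marks", "Mohit,100", "Raman,71")
writeLines(raw, file.path(demo_dir, "class_a.csv"))
writeLines(raw, file.path(demo_dir, "class_b.csv"))

# Hypothetical helper: read one file without its preamble and header
read_marks <- function(path, skip = 3) {
  read.csv(path, skip = skip, header = FALSE,
           col.names = c("Student", "Score"),
           stringsAsFactors = FALSE)
}

files_to_read <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Read every file into a list, then stack the list into one data frame
out <- do.call(rbind, lapply(files_to_read, read_marks))
```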
MDEWITT
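  • How can I save these as separate files? The code above merges all the files into a single data frame, but I want each file saved separately. – djMohit Nov 27 '19 at 05:15
  • Instead of `map_dfr`, just use `map`. That will return a list of data frames. To save them, write each element to its own file, e.g. `walk2(out, output_paths, ~ write.csv(.x, file = .y))`, where `output_paths` is a vector of file names that you supply (a plain `map(out, write.csv)` would only print to the console). @djM – MDEWITT Nov 27 '19 at 12:45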
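
Following up on the comments, here is a minimal base R sketch of cleaning each file and saving it separately instead of merging. The folder names and demo files are assumptions made up for illustration; the cleaned copies go to a separate output folder so the originals are not overwritten.

```r
# Demo setup: an input folder with two small files and an empty output folder
in_dir <- file.path(tempdir(), "marks_in")
out_dir <- file.path(tempdir(), "marks_out")
dir.create(in_dir, showWarnings = FALSE)
dir.create(out_dir, showWarnings = FALSE)
raw <- c("Category: All categories", "", "Name,Marks", "Mohit,100", "Raman,71")
writeLines(raw, file.path(in_dir, "class_a.csv"))
writeLines(raw, file.path(in_dir, "class_b.csv"))

files_to_read <- list.files(in_dir, pattern = "\\.csv$", full.names = TRUE)

for (f in files_to_read) {
  # Read one file without its preamble and original header
  cleaned <- read.csv(f, skip = 3, header = FALSE,
                      col.names = c("Student", "Score"),
                      stringsAsFactors = FALSE)
  # Save it under the same file name in the output folder
  write.csv(cleaned, file.path(out_dir, basename(f)), row.names = FALSE)
}
```

Passing row.names = FALSE keeps write.csv from adding a column of row numbers to each saved file.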