I have this example dataset:
sub_id,age,country,score
{subID},{Age},{CountryOfOrigin},{Qscore}
1,23,UK,15
2,28,uk,19
3,40,United Kingdom,33
4,19,france,21
5,36,Italy,16
6,24,UK,18
7,26,greece,16
8,22,italy,15
I'd like to read this in and perform some computations/analyses. I want to keep the header row, but the second row (the one in braces) causes problems. I tried reading the file in and then dropping that row (it's a nonsense placeholder row), but because each column contained mixed datatypes when it was read in, R won't let me perform computations on anything: the data are no longer numeric.
This is a small example of a much larger data frame, so manually specifying which columns to convert to numeric isn't practical.
It seems like the best solution would be to read the CSV file in with the header, but skip the first row after it.
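A minimal sketch of the attempt described above, with the first few sample rows supplied inline through `text =` instead of a file. The names `csv_text`, `raw`, `dropped` and `fixed` are made up for illustration. The last step uses `type.convert()` as one possible way to re-infer the column types without naming any columns by hand; it is an assumption about what might help, not part of the original attempt.

```r
# First rows of the sample data, supplied inline (assumption: the real file looks like this)
csv_text <- "sub_id,age,country,score
{subID},{Age},{CountryOfOrigin},{Qscore}
1,23,UK,15
2,28,uk,19
3,40,United Kingdom,33"

# Read everything, then drop the placeholder row
raw <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE)
dropped <- raw[-1, ]

# Every column is still character, because the placeholder row forced that type on read
sapply(dropped, class)

# Re-guess the type of each column; as.is = TRUE keeps strings as character, not factor
fixed <- type.convert(dropped, as.is = TRUE)
sapply(fixed, class)
```

If this behaves as expected, `fixed$score` can be used in computations such as `mean(fixed$score)` without listing the numeric columns individually.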
df <- read.csv('scores.csv',
               header = TRUE,
               skip = 1)
This works, but it converts all of my column names! For example, df$Qscore becomes df$X.Qscore., which is obviously not ideal. I can at least perform the computations on that, but I don't know what I'm doing wrong.
I also tried reading in just the headers and then the data without the headers, and sticking them together, but there were lots of issues with that too. This has to be such a common issue...
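For reference, the two-step idea (read the headers, then the data without them) could be sketched roughly as follows. The sample data is supplied inline through `text =`; with a real file, the path would take its place. The names `csv_text` and `col_names` are hypothetical, and `skip = 2` assumes there is exactly one header line followed by exactly one placeholder row.

```r
# Sample data from the question, supplied inline
csv_text <- "sub_id,age,country,score
{subID},{Age},{CountryOfOrigin},{Qscore}
1,23,UK,15
2,28,uk,19
3,40,United Kingdom,33
4,19,france,21
5,36,Italy,16
6,24,UK,18
7,26,greece,16
8,22,italy,15"

# Step 1: read the header plus a single row, and keep only the column names
col_names <- names(read.csv(text = csv_text, header = TRUE, nrows = 1))

# Step 2: skip the header line and the placeholder row, then reattach the names
df <- read.csv(text = csv_text, header = FALSE, skip = 2,
               col.names = col_names)

str(df)          # column names and inferred types
mean(df$score)   # computations now run on a numeric column
```

Because the placeholder row is never read as data in step 2, the numeric columns should be inferred as numeric without any manual conversion.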
Note: I'm new to R and I have an issue that seems like it would be very common, but I'm unable to find the answer on here (probably because I don't know what to search for?), so apologies if this is a massive duplicate...