-1

How would I organize the data from https://users.stat.ufl.edu/~winner/data/sexlierel.dat to make an accurate analysis? I am having trouble plotting the different types of data in the form it is given to me.

description: https://users.stat.ufl.edu/~winner/data/sexlierel.txt

```{r}
data_set <- read.csv("project_data.csv", header = TRUE)

names(data_set)

summary(data_set)

summary(data_set$Gender)

data=data.frame("Gender","Count")

```

I am trying to use a scatterplot to show the relationship between the categories, using count as the number of people in each one. This feels difficult with the way the data is given. Should I rearrange my CSV file?

```{r}
scatter=ggplot(data=data, aes("Gender", "Count")) + geom_point()
```
Rachel
  • Hi Rachel, welcome to StackOverflow. Can you be more specific with your question? What kind of plot are you trying to make? Have you tried anything already to build a plot? If so, provide that code too. Also, it will be easier to help if we have the data; see this for some help: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Harrison Jones Apr 26 '22 at 17:50
  • Try to keep your questions specific and try to cut out unrelated distractions. This will help answerers find the key problem. For example, your question is unrelated to r-markdown. You should omit the tag and remove the Rmd syntax. Even though your final report will use R markdown, that is not where your current problem is. – Michael Dewar Apr 27 '22 at 00:49
  • @rachel meredith - what are you actually trying to plot? If you drew the graph by hand on a piece of paper, what would it look like? I haven't tested your data, but I expect it has 1 and 2 on the x axis and some dots directly above those at various points. What are you expecting? – CALUM Polwart Apr 28 '22 at 18:26

2 Answers

1

I don't think that data is a "true" CSV file. There are no commas or other delimiters.

You may need to look at read.delim (or readr::read_tsv), which reads tab-separated data.
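
If the values turn out to be separated by spaces rather than tabs, base R can handle that too. The following is only a minimal sketch: the three rows are made up for illustration (they are not the real data), the column names are placeholders, and it assumes the file has no header row.

```r
# Hypothetical sample rows standing in for the real file (values are made up).
sample_text <- "
1 1 1 1 1 25
1 1 1 1 2 10
2 1 1 1 1 30
"

# With the default sep = "", read.table() splits on any run of spaces or tabs.
# header = FALSE keeps the first line as data; col.names supplies placeholder names.
data_set <- read.table(
  text = sample_text,
  header = FALSE,
  col.names = c("gender", "scale", "psm", "ptl", "religiosity", "count")
)

str(data_set)
```

To read a real file, replace the `text = sample_text` argument with the path to the file.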

CALUM Polwart
  • I converted the data above into a CSV file so I could rearrange the data in Excel if I needed to. I am trying to do a scatterplot between the different categories using count as the number of people in each one. – Rachel Apr 27 '22 at 17:09
  • In your gender example, do you simply want the total for gender=1 and the total for gender=2? It is much more reproducible to do the rearranging in R than in Excel. – CALUM Polwart Apr 28 '22 at 18:28
0

As @CALUM Polwart said, this is not a comma-separated file. It is a fixed-width file. You can also treat spaces as the delimiter. There are many packages with functions that can help. For example, you could use

library(data.table)
data_set <- fread("so/sexlierel.txt")

or

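library(tidyverse)
# The file has no header row, so keep read_table() from using the first data row as column names
data_set <- readr::read_table("so/sexlierel.txt", col_names = FALSE)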

You may want to set the column names when you read it. You could use

library(tidyverse)
data_set <- readr::read_table("so/sexlierel.txt", col_names = c("gender", "scale", "psm", "ptl", "religiosity", "count"))

or

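library(tidyverse)
# col_names = FALSE keeps the first data row as data instead of consuming it as a header
data_set <- readr::read_table("so/sexlierel.txt", col_names = FALSE)
names(data_set) <- c("gender", "scale", "psm", "ptl", "religiosity", "count")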
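
Once the columns are named, the per-gender totals asked about in the comments can be computed in R rather than in Excel. This is only a rough sketch in base R: the four rows below are made up for illustration (not the real data), and only three of the columns are included to keep it short.

```r
# Hypothetical rows (values are made up): one row per category combination,
# with `count` giving the number of people in that combination.
data_set <- data.frame(
  gender      = c(1, 1, 2, 2),
  religiosity = c(1, 2, 1, 2),
  count       = c(25, 10, 30, 5)
)

# Sum the counts within each gender.
totals <- aggregate(count ~ gender, data = data_set, FUN = sum)
print(totals)

# One bar per gender; a bar chart is often easier to read than a
# scatterplot when the x variable is categorical.
barplot(totals$count, names.arg = totals$gender,
        xlab = "gender", ylab = "total count")
```

The same pattern should work for any other grouping column, for example `aggregate(count ~ religiosity, data = data_set, FUN = sum)`.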
Michael Dewar