
I don't need help doing the exercise itself, but I do need help with an error that I can't fix.

This is the subject:

In R, the more appropriate indicator for missing data is “NA” (not available). Therefore, replace each occurrence of “?” with “NA”.

a. For this exercise, create an R data frame for the mammographic data using only data points that have no missing values. This can be done using the complete.cases function, which takes a data frame and returns a Boolean vector v, where v[i] equals TRUE iff the i-th data-frame sample is complete (meaning it does not contain an NA). For example, if the data frame is stored in mammogram.frame, then mammogram2.frame = mammogram.frame[complete.cases(mammogram.frame),] creates a new data frame called mammogram2.frame that has all the complete mammogram data samples.
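To make the behaviour of complete.cases concrete, here is a minimal sketch on a toy data frame. The name `toy` and its values are made up for illustration and are not part of the exercise data.

```r
# Toy data frame (hypothetical values) in which rows 2 and 3 each contain an NA
toy <- data.frame(Age      = c(67, NA, 58, 43),
                  Shape    = c(3, 1, NA, 2),
                  Severity = c(1, 0, 1, 0))

# One logical value per row: TRUE when the row contains no NA
v <- complete.cases(toy)
v

# Index the rows with that vector to keep only the complete ones
toy2 <- toy[v, ]
toy2
```

Note that complete.cases only looks for real NA values, so a literal "?" string would still count as present.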

So I wrote this:

mammogram = read.table("https://archive.ics.uci.edu/ml/machine-learning-databases/mammographic-masses/mammographic_masses.data",  
        sep=",", 
        col.names=c("Birads","Age","Shape","Margin","Density","Severity"), 
        fill=TRUE, 
        strip.white=TRUE)

# Keep only the rows that have no missing values
mammogram2.frame = mammogram.frame[complete.cases(mammogram.frame),]

#Display data frame
mammogram2

However I get this error:

> mammogram2.frame = mammogram.frame[complete.cases(mammogram.frame),]
Error: object 'mammogram.frame' not found

I can't find a solution anywhere online. I have tried a lot of things, but the missing values are still '?'.

Thanks

MrFlick
Emixam23
  • You read your data into `mammogram` but are then trying to operate on `mammogram.frame`. It's a typo that R explicitly reported: `Error: object 'mammogram.frame' not found` – thelatemail Feb 08 '17 at 23:25
  • Also, you haven't converted your `"?"` cases to `NA` before attempting to do `complete.cases`, so your code would not work even if you were working on the right dataset. – thelatemail Feb 08 '17 at 23:31
  • Thank you, but then it prints code and not the values. Why is that? mammogram2 should be a list/table/array, right? – Emixam23 Feb 08 '17 at 23:32
  • I am completely lost. `mammogram.frame`, `mammogram2.frame` and `mammogram2` don't exist going by the code you have shown. It is impossible that they were created, because your code errors out when attempting to make them. Whoever wrote this exercise shouldn't really be naming `data.frame`s things like `name.frame`. It's confusing and redundant. You just want to do `mammogram2 = mammogram[complete.cases(mammogram),]`. But as noted above, you need to replace `"?"` with `NA` for `complete.cases` to work as intended. – thelatemail Feb 09 '17 at 00:07
  • I just did `mammogram = data.frame`. I'm trying to change those "?" but I can't find a way to do it. The value isn't NA, so any .na() function doesn't affect it. – Emixam23 Feb 09 '17 at 00:14
  • See http://stackoverflow.com/questions/3357743/replacing-character-values-with-na-in-a-data-frame – thelatemail Feb 09 '17 at 00:19
  • @CyrusMohammadian - `sum(mammogram == "?")` returns `162` for me. – thelatemail Feb 09 '17 at 00:19
  • You can't keep shifting the goal posts with your question. It was about `complete.cases`, then about replacing with `NA`, now it's about preparing data for some svm analysis. – thelatemail Feb 09 '17 at 00:34
  • Don't edit questions to ask new, different questions. Start a new post instead. – MrFlick Feb 09 '17 at 05:03
  • OK, I will. I'm sorry, I'm new and I'm lost between the documentation and the exercise. – Emixam23 Feb 09 '17 at 05:37
  • I asked another question that is clearer and easier to read and understand: http://stackoverflow.com/questions/42152063/r-remplacing-values-of-data-frame – Emixam23 Feb 10 '17 at 05:26
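
Putting the comments above together, the fix could look roughly like the sketch below. It is not a confirmed answer from the thread. To stay self-contained it reads a small inline sample (`sample_text`, illustrative values in the same comma-separated layout) instead of the UCI URL, and the names `cols`, `raw` and `raw2` are introduced here only for illustration. It shows two ways to get real `NA` values: the `na.strings` argument of `read.table`, or replacing `"?"` after reading.

```r
# Illustrative sample in the same layout as the exercise file (not the real data)
sample_text <- "5,67,3,5,3,1
4,43,1,1,?,1
5,58,4,5,3,1
4,28,1,1,3,0
5,?,1,?,3,0"

cols <- c("Birads", "Age", "Shape", "Margin", "Density", "Severity")

# Option 1: treat "?" as NA while reading, so the columns stay numeric
mammogram <- read.table(text = sample_text, sep = ",", col.names = cols,
                        na.strings = "?", strip.white = TRUE)

# Option 2: read the file as-is, then replace "?" afterwards
# (columns that contained "?" remain character columns in this variant)
raw <- read.table(text = sample_text, sep = ",", col.names = cols,
                  strip.white = TRUE, stringsAsFactors = FALSE)
raw[raw == "?"] <- NA

# Operate on the object that actually exists: mammogram, not mammogram.frame
mammogram2 <- mammogram[complete.cases(mammogram), ]
raw2 <- raw[complete.cases(raw), ]

mammogram2
```

With the real file, the same `na.strings = "?"` argument can be added to the original `read.table` call on the URL. Option 1 is usually the more convenient of the two because no type conversion is needed afterwards.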
