
I am parsing an XML-TEI file. Following this explanation for vectors of different lengths, I have this script:

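library(XML)  # provides getNodeSet()

# doc (the parsed TEI document) and ns (its namespace) are assumed to be defined earlier
cat_xmlID <- getNodeSet(doc, "//ns:category/@xml:id", ns)
cat_xmlID

# Pick the xml:id values belonging to each category
role    <- unlist(cat_xmlID[27:29])
context <- unlist(cat_xmlID[9:25])
sphere  <- unlist(cat_xmlID[4:7])
sex     <- unlist(cat_xmlID[49:50])

# Pad every vector with NA up to the length of the longest one
n <- max(length(context), length(role), length(sphere), length(sex))
length(context) <- n
length(role)    <- n
length(sphere)  <- n
length(sex)     <- n

# Bind the padded vectors as columns of a character matrix
catTab_ObjV <- cbind(role, context, sphere, sex)
catTab_ObjV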

Result:

    role          context        sphere      sex        
id "active"      "ritual"       "inside"    "male_Sx"  
id "passive"     "battle"       "outside"   "female_Sx"
id "both_active" "singleCombat" "unknown_S" NA         
   NA            "prayer"       "B_ctx_S"   NA         
   NA            "assembly"     NA          NA         
   NA            "feast"        NA          NA         
   NA            "wedding"      NA          NA         
   NA            "burial"       NA          NA         
   NA            "seduction"    NA          NA         
   NA            "meeting"      NA          NA         
   NA            "complaint"    NA          NA         
   NA            "lawsuit"      NA          NA         
   NA            "threat"       NA          NA         
   NA            "revenge"      NA          NA         
   NA            "visit"        NA          NA         
   NA            "unknown_C"    NA          NA         
   NA            "B_ctx_C"      NA          NA  

Of course, I have a lot of NAs. I did not understand the explanation of how to get rid of NAs in the above-mentioned post or in several other posts; none of them gives a relevant explanation of NAs for the cbind function. I must say I am a beginner in R...

Can you help me? Thank you in advance.

Vanessa
  • Can you post sample data? – Aurèle Mar 17 '17 at 23:37
  • I have added sample data. – Vanessa Mar 18 '17 at 00:25
  • What is your desired output? Is it a rectangular table, if so what do you want instead of NAs? Or is it a ragged array, in which case replace cbind(.....) with list(role = role, context = context, .....)? – Aurèle Mar 18 '17 at 00:42
  • I want no value at all, so no NA. The goal of this template is to list the attributes of each category (role, context, sphere, sex), so NA means nothing here. It will not be used for computation, only to display what I am going to use for my analytical investigation. – Vanessa Mar 18 '17 at 01:02
  • If my answer solved your issue, could you please mark it as accepted? – Aurèle Mar 29 '17 at 11:37

1 Answer


The short answer is that you cannot get rid of NAs in cbind(), because cbind() is meant to output a rectangular table (a matrix or a data.frame), and those require all columns to have the same length. NA is the closest you can get to "no value". There is also NULL, but NULL cannot populate a cell of a table.
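A small sketch of these two points, using made-up vectors rather than the data from the question: the shorter column has to be padded to fit the matrix, and NULL is dropped instead of occupying a slot.

```r
# Hypothetical vectors of unequal length, for illustration only
role <- c("active", "passive", "both_active")
sex  <- c("male_Sx", "female_Sx")

# Pad the shorter vector with NA so both columns have 3 rows
length(sex) <- length(role)
tab <- cbind(role, sex)
dim(tab)               # number of rows and columns of the matrix
is.na(tab[3, "sex"])   # the padded cell holds NA

# NULL cannot stand in for a missing cell: it disappears on concatenation
n_with_null <- length(c("male_Sx", "female_Sx", NULL))
n_with_null
```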

It seems a data.frame or a matrix is not the appropriate data structure here. A ragged array in the form of a list would fit better: list(role = role, context = context, sphere = sphere, sex = sex).
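As a rough sketch of the list approach, with shortened, hypothetical versions of the vectors from the question (the name `categories` is only illustrative):

```r
# Hypothetical, shortened versions of the vectors from the question
role   <- c("active", "passive", "both_active")
sphere <- c("inside", "outside", "unknown_S", "B_ctx_S")
sex    <- c("male_Sx", "female_Sx")

# A named list keeps each vector at its own length, so no padding is needed
categories <- list(role = role, sphere = sphere, sex = sex)

lengths(categories)        # number of values per category
anyNA(unlist(categories))  # checks whether any NA was introduced
str(categories)            # compact overview of the ragged structure
```

Each element can then be accessed by name, for example `categories$sex`.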

But your comment makes it clearer that the real question is about display and nothing else, so for this particular case only, since lists are not well suited for clean display, let's keep it a matrix.

So it really depends on how you intend to display the data in your final output (LaTeX? HTML? copy-paste from a print()?) For instance, you could go with a "hacky" replacement of NA with empty strings "" such as: catTab_ObjV[is.na(catTab_ObjV)] <- ""
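For printing at the console, one possible sketch on a toy matrix (not the asker's data) either hides NA at print time through the `na.print` argument of `print()`, or applies the replacement to a copy kept for display only:

```r
# Hypothetical toy matrix with one padded cell
foo <- c("a", "b", "c")
bar <- c("d", "e", NA)
tab <- cbind(foo, bar)

# Option 1: leave the data alone and only change how NA is printed
print(tab, quote = FALSE, na.print = "")

# Option 2: overwrite NA with empty strings in a copy used for display
display <- tab
display[is.na(display)] <- ""
print(display, quote = FALSE)
```

Option 1 leaves the matrix intact, which matters if it is reused later; option 2 changes the values themselves, so it is safer on a copy.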

Or use a dedicated package such as xtable, which doesn't even require you to "remove" NAs:

library(xtable)

foo <- c("a", "b", "c")
bar <- c("d", "e", NA)
tab <- cbind(foo, bar)
tab
#      foo bar
# [1,] "a" "d"
# [2,] "b" "e"
# [3,] "c" NA 

print(xtable(tab), type = "html")
# <!-- html table generated in R 3.3.2 by xtable 1.8-2 package -->
# <!-- Sat Mar 18 11:54:06 2017 -->
# <table border=1>
# <tr> <th>  </th> <th> foo </th> <th> bar </th>  </tr>
# <tr> <td align="right"> 1 </td> <td> a </td> <td> d </td> </tr>
# <tr> <td align="right"> 2 </td> <td> b </td> <td> e </td> </tr>
# <tr> <td align="right"> 3 </td> <td> c </td> <td>  </td> </tr>
# </table>


Aurèle
  • Thanks for your reply, and my apologies for my late answer. I want to print on screen, so I use `write.csv2`, but I still have the **NA**. I have tried your solution, but it doesn't work, probably because it's made for LaTeX. – Vanessa Apr 05 '17 at 14:24
  • `write.csv2` has a `na` argument, which defaults to `"NA"`. Use `write.csv2(catTab_ObjV, "path/to/csv", na = "")` – Aurèle Apr 05 '17 at 14:35
  • I followed your example: `write.csv2(catTab_ObjV,file="categories' template of objective variables taxonomy.csv", na = "")`, but I still have the `"NA"` – Vanessa Apr 05 '17 at 15:14
  • How do you open the resulting csv? – Aurèle Apr 05 '17 at 15:39
  • My apologies! I first looked at the data in R's global environment. But after reading your last question, I opened the file with Excel. It works!! Thanks so much. – Vanessa Apr 05 '17 at 15:58
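The `na` argument discussed in the comments above can be checked without Excel by reading the raw file back. This is a minimal sketch on a toy matrix written to a temporary file (the names `tab`, `csv_path` and `csv_lines` are illustrative):

```r
# Hypothetical toy matrix; the file goes to a temporary path
tab <- cbind(foo = c("a", "b", "c"), bar = c("d", "e", NA))
csv_path <- tempfile(fileext = ".csv")

# na = "" writes missing cells as empty fields instead of the string NA
write.csv2(tab, csv_path, na = "", row.names = FALSE)

# Inspect the raw file rather than the R object
csv_lines <- readLines(csv_path)
csv_lines

# The in-memory matrix is untouched and still holds NA
anyNA(tab)
```

This also explains the confusion in the thread: `write.csv2` only affects the file it writes, so the object shown in the R environment keeps its NA values.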