I need to find the total number of issues for each author. First, I used this to separate the Author, Volume, and Issue. They are in this format:

Edit: I was able to solve this. I appreciate the help! Here's how I got it done.

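# count the documents per author
meannation1 <- aggregate(Documents ~ Author, summation, length)
# join the counts onto the per-author means from my previous question
q7 <- merge(meannation, meannation1)
# bin the counts: 1 document -> '1', 2 or 3 -> '2', more than 3 -> '3'
q7$Publication_Productivity <- ifelse(q7$Documents <= 1, '1', ifelse(q7$Documents <= 3, '2', '3'))
names(q7) <- c("Authors", "tf-Mean", "tfidf-Mean", "Total Number of Publications", "Publication Productivity")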

I merged in the data frame of means that I created for my previous question, and once the merged data frame existed I just renamed the columns!
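
For readers who want to try the steps above, here is a minimal, self-contained sketch in base R. The toy data, the `tf` column, and the way `meannation` is built are assumptions made for illustration; only the names `summation`, `meannation`, `meannation1`, `q7`, `Author`, and `Documents` come from the post, and the real data may be shaped differently.

```r
# Toy stand-in for the asker's data (assumed): one row per author/document pair.
summation <- data.frame(
  Author    = c("A", "A", "B", "C", "C", "C", "C"),
  Documents = c("d1", "d2", "d3", "d4", "d5", "d6", "d7"),
  tf        = c(0.1, 0.3, 0.2, 0.4, 0.2, 0.1, 0.1),
  stringsAsFactors = FALSE
)

# Hypothetical version of `meannation`: one row per author with the mean tf.
meannation <- aggregate(tf ~ Author, summation, mean)

# One row per author with the number of documents.
meannation1 <- aggregate(Documents ~ Author, summation, length)

# Both inputs have exactly one row per author, so the merge does too.
q7 <- merge(meannation, meannation1, by = "Author")

# Bin the counts: 1 document -> "1", 2 or 3 -> "2", more than 3 -> "3".
q7$Publication_Productivity <- ifelse(q7$Documents <= 1, "1",
                               ifelse(q7$Documents <= 3, "2", "3"))

print(q7)
```

Because both inputs to `merge` are already aggregated to one row per author, the result stays at one row per author.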

darkpunk
  • Just merge the summary/aggregated info back onto the original dataframe: `paperdata <- merge(paperdata, aggreIssues, by="Author", all.x=TRUE)` – DanY Oct 06 '18 at 01:46
  • can i use this to add to the new dataframe since they will have new columns? – darkpunk Oct 06 '18 at 01:51
  • If I understand your question in the last comment correctly, then yes, that is exactly what merge is designed to do. Read up on it [here](https://stackoverflow.com/questions/1299871/how-to-join-merge-data-frames-inner-outer-left-right). – DanY Oct 06 '18 at 01:54
  • Yes, I tried that but then it just repeats all my authors like this: `Abbe.Lianne 0.015376045 0.062490348 2 3 2 Abbe.Lianne 0.015376045 0.062490348 2 3 3 Abbe.Lianne 0.015376045 0.062490348 2 3` doesn't seem to consolidate them. Any idea why? – darkpunk Oct 06 '18 at 02:05
  • Post a small version of your dataset(s) and the desired outcome, and I or someone else can help you. – DanY Oct 06 '18 at 02:13
  • Welcome to SO! Please provide a minimal, complete, and verifiable example, which may help others to understand and reproduce your problem. Thank you. – Uwe Oct 06 '18 at 05:14
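
One possible cause of the repeated authors described in the comments, not confirmed anywhere in the thread, is that one side of the merge still had several rows per author. `merge` returns one row for every matching pair of rows, so per-document data joined to per-author counts keeps one row per document. The sketch below uses made-up data and hypothetical names (`per_doc`, `counts`, `means`) to show the difference.

```r
# Assumed toy data: several rows per author.
per_doc <- data.frame(
  Author = c("A", "A", "A", "B"),
  tf     = c(0.1, 0.2, 0.3, 0.4),
  stringsAsFactors = FALSE
)

# Per-author document counts (one row per author).
counts <- aggregate(tf ~ Author, per_doc, length)
names(counts)[2] <- "n_docs"

# Merging onto the un-aggregated data keeps one row per document,
# so each author appears as many times as they have documents.
repeated <- merge(per_doc, counts, by = "Author")

# Aggregating both sides first gives one row per author.
means <- aggregate(tf ~ Author, per_doc, mean)
consolidated <- merge(means, counts, by = "Author")

print(nrow(repeated))      # one row per document
print(nrow(consolidated))  # one row per author
```

If the repeated rows are exact duplicates, `unique(repeated)` would also collapse them, but aggregating before the merge avoids creating them in the first place.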

1 Answer

I can't really follow your example above. However, here's an example with data.table that should help you figure out what you're trying to do:

#create example data
df <- data.frame(
    letters = c("a", "a", "a", "b", "b", "b"),
    ints    = c(1, 4, 1, 2, 2, 6),
    nums    = seq(from=1.1, length.out=6)
)

# convert to a data.table
library(data.table)
setDT(df)

# calculate and append "mean of ints" column by letter
df[ , mean_ints := mean(ints), by=letters]

# calculate and append "sum of nums" column by letter
df[ , sum_nums := sum(nums), by=letters]

# show result
df

#   letters ints nums mean_ints sum_nums
#1:       a    1  1.1  2.000000      6.3
#2:       a    4  2.1  2.000000      6.3
#3:       a    1  3.1  2.000000      6.3
#4:       b    2  4.1  3.333333     15.3
#5:       b    2  5.1  3.333333     15.3
#6:       b    6  6.1  3.333333     15.3
DanY
  • Well the merge method mentioned gives a good idea and the above one too. I might just have to play around and see how I can get it to work. But thank you so much! – darkpunk Oct 06 '18 at 03:49
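
For readers without data.table installed, the per-group columns in the answer can be approximated in base R with `ave`, and `unique` can then collapse the result to one row per group, which is closer to the consolidated table the question asks for. This is a sketch of an alternative, not the answerer's method; the name `per_letter` is hypothetical.

```r
# Same example data as in the answer.
df <- data.frame(
  letters = c("a", "a", "a", "b", "b", "b"),
  ints    = c(1, 4, 1, 2, 2, 6),
  nums    = seq(from = 1.1, length.out = 6)
)

# ave() applies FUN within each group and returns a vector as long as the input,
# so the group value is repeated on every row of that group.
df$mean_ints <- ave(df$ints, df$letters, FUN = mean)
df$sum_nums  <- ave(df$nums, df$letters, FUN = sum)

# Keep only the group-level columns and drop the repeats: one row per letter.
per_letter <- unique(df[, c("letters", "mean_ints", "sum_nums")])

print(per_letter)
```

Note that `FUN` has to be passed by name, because `ave` treats unnamed arguments after the first as grouping variables.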