I hope someone can help me, or at least give me some good advice. I have a large data frame that stores scientific papers (classified by Author/Year/Journal). Most papers contribute more than one record, so I am trying to write a function (so far without success) that returns a unique value (named n) identifying the paper to which each record belongs.
Stefano, welcome to SO. Please provide us with a reproducible example and try to explain (and show) what you expect your output to look like. You should also show us what you have tried so far. There are a bunch of really good examples of how to do this here: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Brandon Bertelsen Dec 28 '12 at 17:43
1 Answer
For calculating unique values, you could use the `digest` function from the digest package. For example,
library(digest)
digest(c("Granger", "1987", "Econometrica"))
returns an MD5 hash string that is, for practical purposes, unique to a publication. `digest` is not vectorized, i.e. you have to use `sapply` or similar to calculate the id for each row of your data frame.
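A minimal sketch of the row-wise step, assuming the data frame has columns named Author, Year and Journal (the data frame `papers`, its contents, and the `id` column are hypothetical, invented only for illustration). Each row is first collapsed into a single key string, and `digest` is then applied to each key separately:

```r
library(digest)

# Hypothetical toy data: two records from one paper, one record from another
papers <- data.frame(
  Author  = c("Granger", "Granger", "Engle"),
  Year    = c(1987, 1987, 1982),
  Journal = c("Econometrica", "Econometrica", "Econometrica"),
  stringsAsFactors = FALSE
)

# Collapse each row into one key string; the "|" separator is an arbitrary choice
key <- paste(papers$Author, papers$Year, papers$Journal, sep = "|")

# digest() hashes one object at a time, so call it once per key
papers$id <- vapply(key, digest, character(1), USE.NAMES = FALSE)
```

Records with the same Author/Year/Journal get the same hash, so `id` can serve as the paper identifier. `vapply` is used here instead of `sapply` only because it guarantees a character vector as the result.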

Karsten W.
or, less robustly, just `paste` together the authors/date/journal to get an ID string. – Ben Bolker Dec 28 '12 at 18:10
you could also use `interaction` to make a unique id for combinations of columns: `with(d, as.numeric(interaction(Author, Year, Journal, drop=TRUE)))` – Matthew Plourde Dec 28 '12 at 18:22
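For comparison, here is a rough sketch of the `paste` and `interaction` suggestions side by side, using base R only. The data frame `d` and the column names `n_paste` and `n_inter` are hypothetical and exist only for this illustration:

```r
# Hypothetical toy data: rows 1 and 2 belong to the same paper
d <- data.frame(
  Author  = c("Granger", "Granger", "Engle", "Granger"),
  Year    = c(1987, 1987, 1982, 1969),
  Journal = c("Econometrica", "Econometrica", "Econometrica", "Econometrica")
)

# paste approach: build a readable string key per row, then number the
# distinct keys in order of first appearance
key <- paste(d$Author, d$Year, d$Journal, sep = "|")
d$n_paste <- match(key, unique(key))

# interaction approach: integer codes of the combined factor;
# drop = TRUE discards combinations that never occur in the data
d$n_inter <- with(d, as.numeric(interaction(Author, Year, Journal, drop = TRUE)))
```

Both columns group the records identically, but the actual numbers can differ: `n_paste` follows the order in which papers first appear, while `n_inter` follows the ordering of the factor levels.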
Hi everybody. I tried the solution proposed by Matthew and it works very well! I thought my example was clear enough, but next time I will provide all the necessary details. I appreciate all the tips! – stefano Dec 28 '12 at 22:18