Questions tagged [top-n]

322 questions
196
votes
6 answers

Oracle SELECT TOP 10 records

I have a big problem with an SQL statement in Oracle. I want to select the top 10 records ordered by STORAGE_DB which aren't in a list from another SELECT statement. This one works fine for all records: SELECT DISTINCT APP_ID, NAME, …
opHASnoNAME
  • 20,224
  • 26
  • 98
  • 143
78
votes
2 answers

Evaluation & Calculate Top-N Accuracy: Top 1 and Top 5

I have come across a few journal papers (on machine-learning classification problems) that evaluate accuracy with a Top-N approach. The data showed that Top-1 accuracy = 42.5% and Top-5 accuracy = 72.5% under the same training and testing conditions. I…
D_9268
  • 1,039
  • 2
  • 9
  • 17
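
For readers unfamiliar with the metric named in this question, here is a minimal sketch of the usual definition: a prediction counts as correct under Top-N when the true label is among the N highest-scoring classes. The function name `top_n_accuracy`, the score matrix, and the labels below are made-up illustrations, not data from the question.

```python
import numpy as np

def top_n_accuracy(scores, labels, n):
    """Fraction of samples whose true label is among the n highest-scoring classes."""
    scores = np.asarray(scores)
    labels = np.asarray(labels)
    # Column indices of the n largest scores in each row
    # (their order within the top n does not matter here).
    top_n = np.argsort(scores, axis=1)[:, -n:]
    # A sample is a hit if its true label appears anywhere in its top-n indices.
    hits = (top_n == labels[:, None]).any(axis=1)
    return hits.mean()

# Hypothetical class scores for 4 samples over 3 classes (assumed values).
scores = [[0.10, 0.60, 0.30],
          [0.50, 0.20, 0.30],
          [0.20, 0.30, 0.50],
          [0.40, 0.35, 0.25]]
labels = [1, 2, 2, 1]

top1 = top_n_accuracy(scores, labels, 1)
top2 = top_n_accuracy(scores, labels, 2)
print(top1, top2)
```

Top-1 is the ordinary accuracy; increasing N can only keep the value the same or raise it, which is consistent with the Top-5 figure in the excerpt being higher than the Top-1 figure.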
41
votes
1 answer

How to see top n entries of term-document matrix after tfidf in scikit-learn

I am new to scikit-learn, and I was using TfidfVectorizer to find the tfidf values of terms in a set of documents. I used the following code to obtain the same. vectorizer = TfidfVectorizer(stop_words=u'english',ngram_range=(1,5),lowercase=True) X =…
Amrith Krishna
  • 2,768
  • 3
  • 31
  • 65
25
votes
2 answers

Oracle SQL query: Retrieve latest values per group based on time

I have the following table in an Oracle DB:

id  date              quantity
1   2010-01-04 11:00  152
2   2010-01-04 11:00  210
1   2010-01-04 10:45  132
2   2010-01-04 10:45  318
4   2010-01-04 10:45  122
1   2010-01-04 10:30  …
Tom
  • 1,713
  • 5
  • 19
  • 24
20
votes
2 answers

In MariaDB how do I select the top 10 rows from a table?

I just read online that MariaDB (which SQLZoo uses) is based on MySQL, so I thought that I could use the ROW_NUMBER() function. However, when I try this function in SQLZoo: SELECT * FROM ( SELECT * FROM route ) TEST7 WHERE ROW_NUMBER() < 10 then I…
Caffeinated
  • 11,982
  • 40
  • 122
  • 216
17
votes
5 answers

Oracle SQL - How to Retrieve highest 5 values of a column

How do you write a query that returns only a select number of rows with either the highest or lowest column value, e.g. a report of the 5 highest-salaried employees?
Trevor
  • 235
  • 1
  • 2
  • 6
13
votes
2 answers

SUM of only TOP 10 rows

I have a query where I am only selecting the TOP 10 rows, but I have a SUM function in there that is still taking the sum of all the rows (disregarding the TOP 10). How do I get the total of only the top 10 rows? Here is my SUM function: SUM(…
Cfw412
  • 131
  • 1
  • 1
  • 3
12
votes
6 answers

Tidyverse: filtering n largest groups in grouped dataframe

I want to filter the n largest groups based on count, and then do some calculations on the filtered dataframe. Here is some data: Brand <- c("A","B","C","A","A","B","A","A","B","C") Category <- c(1,2,1,1,2,1,2,1,2,1) Clicks <-…
Shinobi_Atobe
  • 1,793
  • 1
  • 18
  • 35
12
votes
2 answers

How to find column-index of top-n values within each row of huge dataframe

I have a dataframe of format: (example data)

    Metric1  Metric2  Metric3  Metric4  Metric5
ID
1   0.5      0.3      0.2      0.8      0.7
2   0.1      0.8      0.5      0.2      0.4
3   0.3      0.1      0.7      0.4      0.2
…
tfcoe
  • 391
  • 4
  • 13
12
votes
2 answers

Find names of top-n highest-value columns in each pandas dataframe row

I have the following dataframe:

id  p1  p2  p3  p4
1   0   9   1   4
2   0   2   3   4
3   1   3   10  7
4   1   5   3   1
5   2   3   7   10

I need to reshape the data frame in a way that for each id it will have the top 3 columns with…
chessosapiens
  • 3,159
  • 10
  • 36
  • 58
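
One possible approach to the task in this question, sketched under assumptions: the excerpt is cut off before it shows the desired output, so the result layout (one row per id, with hypothetical columns `top1`, `top2`, `top3` holding column names) is a guess. The data is the example frame from the excerpt. Ties are broken by original column order, which is also an assumption.

```python
import numpy as np
import pandas as pd

# Example data from the question excerpt.
df = pd.DataFrame(
    {
        "id": [1, 2, 3, 4, 5],
        "p1": [0, 0, 1, 1, 2],
        "p2": [9, 2, 3, 5, 3],
        "p3": [1, 3, 10, 3, 7],
        "p4": [4, 4, 7, 1, 10],
    }
).set_index("id")

n = 3
# Sorting the negated values ascending yields column positions from the
# largest to the smallest value in each row; a stable sort keeps the
# original column order among ties.
order = np.argsort(-df.to_numpy(), axis=1, kind="stable")[:, :n]

# Map the positions back to column names (output column names are hypothetical).
top = pd.DataFrame(
    df.columns.to_numpy()[order],
    index=df.index,
    columns=[f"top{i + 1}" for i in range(n)],
)
print(top)
```

Working on the underlying NumPy array avoids a Python-level loop over rows, which tends to matter for large frames.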
12
votes
4 answers

How to get top n companies from a data frame in decreasing order

I am trying to get the top 'n' companies from a data frame. Here is my code: data("Forbes2000", package = "HSAUR") sort(Forbes2000$profits, decreasing=TRUE) Now I would like to get the top 50 observations from this sorted vector.
Teja
  • 13,214
  • 36
  • 93
  • 155
10
votes
1 answer

Spark sql top n per group

How can I get the top-n (let's say top 10 or top 3) per group in spark-sql? http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/ provides a tutorial for general SQL. However, Spark does not implement subqueries…
Georg Heiler
  • 16,916
  • 36
  • 162
  • 292
8
votes
5 answers

Finding top N columns for each row in data frame

Given a data frame with one descriptive column and X numeric columns, for each row I'd like to identify the top N columns with the highest values and save them as rows in a new dataframe. For example, consider the following data frame: df =…
Diego
  • 34,802
  • 21
  • 91
  • 134
7
votes
1 answer

Is there a way to get the nlargest items per group in dask?

I have the following dataset: location category percent A 5 100.0 B 3 100.0 C 2 50.0 4 13.0 D 2 75.0 3 59.0 4 …
whisperstream
  • 1,897
  • 3
  • 20
  • 25
6
votes
1 answer

Finding the top k matches in Pytorch

I'm using the following code to find the topk matches using pytorch: def find_top(self, x, y, n_neighbors, unit_vectors=False, cuda=False): if not unit_vectors: x = __to_unit_torch__(x, cuda=cuda) y = __to_unit_torch__(y,…
user1241241
  • 664
  • 5
  • 20