Questions tagged [top-n]
322 questions
196
votes
6 answers
Oracle SELECT TOP 10 records
I have a big problem with an SQL statement in Oracle. I want to select the top 10 records ordered by STORAGE_DB that aren't in a list from another SELECT statement.
This one works fine for all records:
SELECT DISTINCT
APP_ID,
NAME,
…

opHASnoNAME
- 20,224
- 26
- 98
- 143
78
votes
2 answers
Evaluation & Calculate Top-N Accuracy: Top 1 and Top 5
I have come across a few journal papers (on machine learning classification problems) that evaluate accuracy with a Top-N approach. The data showed that Top-1 accuracy = 42.5% and Top-5 accuracy = 72.5% under the same training and testing conditions.
I…

D_9268
- 1,039
- 2
- 9
- 17
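For readers unfamiliar with the metric, here is a minimal sketch of how Top-N accuracy is commonly computed: a prediction counts as correct when the true label is among the N highest-scoring classes. The function name `top_k_accuracy` and the scores and labels below are made up for illustration and are not taken from the question.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by score, highest first, truncated to k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Hypothetical per-class scores for 4 samples over 3 classes.
scores = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.2, 0.7],
    [0.5, 0.4, 0.1],
]
labels = [0, 2, 2, 1]  # hypothetical true class indices

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(top1, top2)
```

Top-N accuracy can only stay equal or rise as N grows, which is why a Top-5 figure is typically higher than the Top-1 figure for the same model.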
41
votes
1 answer
How to see top n entries of term-document matrix after tfidf in scikit-learn
I am new to scikit-learn, and I was using TfidfVectorizer to find the tf-idf values of terms in a set of documents. I used the following code to obtain them.
vectorizer = TfidfVectorizer(stop_words=u'english',ngram_range=(1,5),lowercase=True)
X =…

Amrith Krishna
- 2,768
- 3
- 31
- 65
25
votes
2 answers
Oracle SQL query: Retrieve latest values per group based on time
I have the following table in an Oracle DB
id date quantity
1 2010-01-04 11:00 152
2 2010-01-04 11:00 210
1 2010-01-04 10:45 132
2 2010-01-04 10:45 318
4 2010-01-04 10:45 122
1 2010-01-04 10:30 …

Tom
- 1,713
- 5
- 19
- 24
20
votes
2 answers
In MariaDB how do I select the top 10 rows from a table?
I just read online that MariaDB (which SQLZoo uses) is based on MySQL, so I thought that I could use the ROW_NUMBER() function.
However, when I try this function in SQLZoo:
SELECT * FROM (
SELECT * FROM route
) TEST7
WHERE ROW_NUMBER() < 10
then I…

Caffeinated
- 11,982
- 40
- 122
- 216
17
votes
5 answers
Oracle SQL - How to Retrieve highest 5 values of a column
How do you write a query that returns only a select number of rows with either the highest or lowest column values?
For example, a report with the 5 highest-salaried employees.

Trevor
- 235
- 1
- 2
- 6
13
votes
2 answers
SUM of only TOP 10 rows
I have a query where I am only selecting the TOP 10 rows, but I have a SUM function in there that is still taking the sum of all the rows (disregarding the TOP 10). How do I get the total of only the top 10 rows?
Here is my SUM function:
SUM(…

Cfw412
- 131
- 1
- 1
- 3
12
votes
6 answers
Tidyverse: filtering n largest groups in grouped dataframe
I want to filter the n largest groups based on count, and then do some calculations on the filtered dataframe.
Here is some data:
Brand <- c("A","B","C","A","A","B","A","A","B","C")
Category <- c(1,2,1,1,2,1,2,1,2,1)
Clicks <-…

Shinobi_Atobe
- 1,793
- 1
- 18
- 35
12
votes
2 answers
How to find column-index of top-n values within each row of huge dataframe
I have a dataframe of format: (example data)
Metric1 Metric2 Metric3 Metric4 Metric5
ID
1 0.5 0.3 0.2 0.8 0.7
2 0.1 0.8 0.5 0.2 0.4
3 0.3 0.1 0.7 0.4 0.2 …

tfcoe
- 391
- 4
- 13
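One possible approach, sketched with NumPy on the three rows visible in the excerpt (the rest of the table is truncated and is not reconstructed here): sort each row in descending order and keep the first n column positions. The variable names and the choice of n = 2 are assumptions for illustration.

```python
import numpy as np

columns = ["Metric1", "Metric2", "Metric3", "Metric4", "Metric5"]

# Only the three rows shown in the excerpt (IDs 1, 2, 3).
values = np.array([
    [0.5, 0.3, 0.2, 0.8, 0.7],
    [0.1, 0.8, 0.5, 0.2, 0.4],
    [0.3, 0.1, 0.7, 0.4, 0.2],
])

n = 2  # assumed value of n for this sketch

# argsort on the negated values gives column positions from largest to smallest.
top_idx = np.argsort(-values, axis=1)[:, :n]

# Map the positions back to column names.
top_names = [[columns[j] for j in row] for row in top_idx]
print(top_idx.tolist())
print(top_names)
```

For a very large frame, `np.argpartition` avoids fully sorting each row, at the cost of returning the top n positions in no particular order.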
12
votes
2 answers
Find names of top-n highest-value columns in each pandas dataframe row
I have the following dataframe:
id p1 p2 p3 p4
1 0 9 1 4
2 0 2 3 4
3 1 3 10 7
4 1 5 3 1
5 2 3 7 10
I need to reshape the data frame in a way that for each id it will have the top 3 columns with…

chessosapiens
- 3,159
- 10
- 36
- 58
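A minimal pandas sketch of the idea in the title, using the five rows shown in the excerpt: for each id, take the names of the three highest-valued columns. The output column names `top1` to `top3` are hypothetical, and the exact output shape the asker wanted is cut off in the excerpt, so this is only one plausible reading.

```python
import pandas as pd

# Data copied from the excerpt.
df = pd.DataFrame(
    {
        "id": [1, 2, 3, 4, 5],
        "p1": [0, 0, 1, 1, 2],
        "p2": [9, 2, 3, 5, 3],
        "p3": [1, 3, 10, 3, 7],
        "p4": [4, 4, 7, 1, 10],
    }
).set_index("id")

n = 3

# For each row, nlargest(n) keeps the n biggest values; its index holds the column names.
top_cols = df.apply(lambda row: row.nlargest(n).index.tolist(), axis=1)

# Spread the lists into one column per rank (hypothetical names top1..top3).
result = pd.DataFrame(
    top_cols.tolist(),
    index=df.index,
    columns=[f"top{i + 1}" for i in range(n)],
)
print(result)
```

Note that id 4 has a tie (p1 and p4 are both 1), so which of the two lands in third place depends on how `nlargest` breaks ties.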
12
votes
4 answers
How to get top n companies from a data frame in decreasing order
I am trying to get the top 'n' companies from a data frame. Here is my code below.
data("Forbes2000", package = "HSAUR")
sort(Forbes2000$profits,decreasing=TRUE)
Now I would like to get the top 50 observations from this sorted vector.

Teja
- 13,214
- 36
- 93
- 155
10
votes
1 answer
Spark sql top n per group
How can I get the top-n (let's say top 10 or top 3) per group in spark-sql?
http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/ provides a tutorial for general SQL. However, Spark does not implement subqueries…

Georg Heiler
- 16,916
- 36
- 162
- 292
8
votes
5 answers
Finding top N columns for each row in data frame
Given a data frame with one descriptive column and X numeric columns, for each row I'd like to identify the top N columns with the highest values and save them as rows in a new dataframe.
For example, consider the following data frame:
df =…

Diego
- 34,802
- 21
- 91
- 134
7
votes
1 answer
Is there a way to get the nlargest items per group in dask?
I have the following dataset:
location category percent
A 5 100.0
B 3 100.0
C 2 50.0
4 13.0
D 2 75.0
3 59.0
4 …

whisperstream
- 1,897
- 3
- 20
- 25
6
votes
1 answer
Finding the top k matches in Pytorch
I'm using the following code to find the top-k matches using PyTorch:
def find_top(self, x, y, n_neighbors, unit_vectors=False, cuda=False):
    if not unit_vectors:
        x = __to_unit_torch__(x, cuda=cuda)
        y = __to_unit_torch__(y,…

user1241241
- 664
- 5
- 20