4

I have a data type with the following format:

Popularity:xx
Name:xx
Author:xx
Sales:xx
Date Published: xx

I am free to choose whatever way I can store my data in.

I will need to perform some queries on the data, for example

  1. What are the top 'N' Books for the year 'M'
  2. What are the average sales of the top 'N' songs for author 'X'?

It should be kept in mind that further queries may be added.

What are the different ways to represent the data so that these queries can be performed (in Java), and what are the merits of each?

Note: (Not looking for a DB solution)

CoderBC
  • Any idea about the volumetry of data you might have? – DamCx Nov 15 '16 at 15:34
  • From question 1, shouldn't it be something like 'N' Books for the year 'M'? – Sean83 Nov 15 '16 at 15:36
  • @DamCx it's not specified, but given it's a record of published books over the years, it can be a lot. (~300,000 books are published in the USA each year). However, for a start we can consider it to be at a smaller scale. – CoderBC Nov 15 '16 at 15:37
  • @Sean83 yes, edited. – CoderBC Nov 15 '16 at 15:37
  • 4
    Your question is not the right one. You probably don't want to store the millions of books in memory. You want them in a database, and use database queries to find the information you want. The few books found by the queries would just be instances of a Book class. If you're not looking for a DB solution, then you are probably writing a toy project with a few books. Just store them in a list, and use loops and streams to find the ones you want. – JB Nizet Nov 15 '16 at 15:38
  • @CoderBC well, some DB with a single table could do the trick, then, given there is no complexity in the data type you have. I think this could be easier to manipulate than a CSV file when you have to query it. – DamCx Nov 15 '16 at 15:39
  • Yes, like what @JBNizet said, I don't understand why you mentioned data structures in a DB query question. I am confused. Are you looking for something like a HashMap or Hashtable? – Sean83 Nov 15 '16 at 15:41
  • @JBNizet yes agreed. You can consider it a toy project, because honestly, I am just experimenting with the various approaches that can be taken with this. – CoderBC Nov 15 '16 at 15:43

4 Answers

1

The JDK comes bundled with Java DB, which seems perfectly fine for your use case.

Edit: Sorry, I misread the question as asking for a DB solution, because it seems you need one. That said, you should look for a DB solution that you simply query your books from.

If you actually do want to perform queries on in-memory data structures, you can use Apache Commons Collections, which supports filtering.

If you do want to use a data structure such as a Vector, which seems like a workable solution, you need to build indexes to improve performance: look the key up in an index, then fetch the matching books. If you know in advance which searches are needed, you can group the chosen indexes into a block that is easy to search, essentially creating your own cube data structure. Could be a nice project.
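As a rough illustration of the index idea, here is a minimal sketch that keeps one HashMap per queried field. The names Book and BookIndex, and the choice of HashMap-based indexes, are assumptions made for this example and are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical record type; the fields follow the question's data format.
class Book {
    final String name;
    final String author;
    final int popularity;
    final int sales;
    final int year;

    Book(String name, String author, int popularity, int sales, int year) {
        this.name = name;
        this.author = author;
        this.popularity = popularity;
        this.sales = sales;
        this.year = year;
    }
}

// Hypothetical in-memory store with one index per queried field.
class BookIndex {
    private final Map<Integer, List<Book>> byYear = new HashMap<>();
    private final Map<String, List<Book>> byAuthor = new HashMap<>();

    // Register the book in every index so lookups avoid a full scan.
    void add(Book book) {
        byYear.computeIfAbsent(book.year, k -> new ArrayList<>()).add(book);
        byAuthor.computeIfAbsent(book.author, k -> new ArrayList<>()).add(book);
    }

    // Top n books of a year, most popular first.
    List<Book> topByYear(int year, int n) {
        return byYear.getOrDefault(year, new ArrayList<>()).stream()
                .sorted(Comparator.comparingInt((Book b) -> b.popularity).reversed())
                .limit(n)
                .collect(Collectors.toList());
    }

    // Average sales of an author's n most popular books (0.0 if there are none).
    double averageSalesOfTop(String author, int n) {
        return byAuthor.getOrDefault(author, new ArrayList<>()).stream()
                .sorted(Comparator.comparingInt((Book b) -> b.popularity).reversed())
                .limit(n)
                .mapToInt(b -> b.sales)
                .average()
                .orElse(0.0);
    }
}
```

Each lookup only touches the books filed under one key, at the cost of extra memory and of keeping every index up to date on insert. A new query on another field would need its own index.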

Timmetje
  • yes sorry I did not mention that before more explicitly. – CoderBC Nov 15 '16 at 15:43
  • 1
    I edited my answer to cover querying/filtering a data structure. There are some ready-made solutions, but basic data structures like Hashtables and Dictionaries are hard to query very efficiently. – Timmetje Nov 15 '16 at 15:50
1

Use an ArrayList of a class. Create a class with those values as instance variables, then instantiate an ArrayList of that class. You can then search based on the values of particular variables. For example:

// Requires: import java.util.ArrayList; import java.util.List;
// Replace ClassName with the name of your class
List<ClassName> array = new ArrayList<>();   // all stored objects
List<ClassName> results = new ArrayList<>(); // objects matching the search

// Collect every object whose author matches
for (ClassName obj : array) {
    if (obj.getAuthor().equals("Author Name")) {
        results.add(obj);
    }
}

There are many ways to sort the results, including Collections.sort(), which seems to be the best way to go about it.

Sort ArrayList of custom Objects by property
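The filter-then-sort approach could be sketched as follows. The Book class, its getters, and the SortDemo helper are hypothetical names chosen for this illustration, and popularity is assumed to be an int.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical class standing in for "ClassName".
class Book {
    private final String name;
    private final String author;
    private final int popularity;

    Book(String name, String author, int popularity) {
        this.name = name;
        this.author = author;
        this.popularity = popularity;
    }

    String getName() { return name; }
    String getAuthor() { return author; }
    int getPopularity() { return popularity; }
}

class SortDemo {
    // Returns the books by the given author, most popular first.
    static List<Book> byAuthorSorted(List<Book> books, String author) {
        List<Book> results = new ArrayList<>();
        for (Book book : books) {
            if (book.getAuthor().equals(author)) {
                results.add(book);
            }
        }
        // Sort descending by popularity using a custom Comparator.
        Collections.sort(results, new Comparator<Book>() {
            @Override
            public int compare(Book a, Book b) {
                return Integer.compare(b.getPopularity(), a.getPopularity());
            }
        });
        return results;
    }
}
```

To keep only the top N, take the first N elements of the returned list, for example with results.subList(0, Math.min(n, results.size())).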

EDIT: I went ahead and gave you an example based on the specific case you outlined in the comments. This should help you out a lot. As stated before, I had a similar issue for a lab in University, and this was a way I did it.

Community
  • How would you find the top songs of an artist 'x'? – CoderBC Nov 15 '16 at 15:49
  • CoderBC- in the class, have a method that gets the author, then add the song to a "results" ArrayList. Then, sort the list according to the .getPopularity() value. I would recommend making that value a float on a scale from 1-10, 1 being unpopular and 10 being extremely popular. There are quite a few tutorials out there on how to sort an ArrayList. Here is one: http://stackoverflow.com/questions/2784514/sort-arraylist-of-custom-objects-by-property – Zachary Kirchens Nov 15 '16 at 15:58
  • CoderBC- After that, it's just a matter of going through the list. If you wanted to display only the top 5, you could have a counter as well (in this case, you would probably have to have a conventional "for" loop, IE for(int x=0; x – Zachary Kirchens Nov 15 '16 at 16:02
1

You could use a Bean to wrap your data:

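public class Record {

    int popularity;
    String name;
    String author;
    int sales;
    int yearPublished;

    public Record(int popularity, String name, String author, int sales, int yearPublished) {
        this.popularity = popularity;
        this.name = name;
        this.author = author;
        this.sales = sales;
        this.yearPublished = yearPublished;
    }

    // Getters (setters omitted for brevity)
    public int getPopularity() { return popularity; }
    public String getName() { return name; }
    public String getAuthor() { return author; }
    public int getSales() { return sales; }
    public int getYearPublished() { return yearPublished; }

    // Print a record by its title
    @Override
    public String toString() {
        return name;
    }
}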

A typical query using Java 8 streams looks like this:

// Requires: import java.util.Arrays; import java.util.List;
Record record1 = new Record(10, "Record 1 Title", "Author 1 Record", 10, 1990);
Record record2 = new Record(100, "Record 2 Title", "Author 2 Record", 100, 2010);
Record record3 = new Record(140, "Record 3 Title", "Author 3 Record", 120, 2000);
Record record4 = new Record(310, "Record 4 Title", "Author 1 Record", 130, 2010);
Record record5 = new Record(110, "Record 5 Title", "Author 5 Record", 140, 1987);
Record record6 = new Record(160, "Record 6 Title", "Author 1 Record", 15, 2010);
Record record7 = new Record(107, "Record 7 Title", "Author 1 Record", 4, 1980);
Record record8 = new Record(1440, "Record 8 Title", "Author 8 Record", 1220, 1970);
Record record9 = new Record(1120, "Record 9 Title", "Author 9 Record", 1123, 2010);

List<Record> records = Arrays.asList(record1, record2, record3, record4, record5, record6, record7, record8, record9);

// Top 2 records of the year 2010, most popular first
int m = 2;
int year = 2010;
System.out.println(Arrays.toString(records.stream()
        .filter(r -> r.getYearPublished() == year)
        .sorted((r1, r2) -> Integer.compare(r2.popularity, r1.popularity))
        .limit(m)
        .toArray()));

// Average sales of the top 2 records by "Author 1 Record"
String author = "Author 1 Record";
int n = 2;
System.out.println(records.stream()
        .filter(r -> author.equals(r.getAuthor()))
        .sorted((r1, r2) -> Integer.compare(r2.popularity, r1.popularity))
        .limit(n)
        .mapToInt(Record::getSales)
        .average()
        .getAsDouble());

This prints:

[Record 9 Title, Record 4 Title]
72.5
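Since the question notes that further queries may be added, the filter/sort/limit pattern above could be factored into one reusable method. This is only a sketch: the Queries class and the topN method are hypothetical names introduced here, not part of the original answer.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.ToIntFunction;
import java.util.stream.Collectors;

class Queries {
    // Generic "top n" query: keep matching items, order by score descending, take n.
    static <T> List<T> topN(List<T> items, Predicate<T> filter, ToIntFunction<T> score, int n) {
        return items.stream()
                .filter(filter)
                .sorted(Comparator.comparingInt(score).reversed())
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Demo on plain strings so the sketch runs on its own:
        // the two longest titles among those with more than two characters.
        List<String> titles = Arrays.asList("Dune", "It", "Emma", "Ulysses");
        System.out.println(topN(titles, t -> t.length() > 2, String::length, 2));
    }
}
```

With the Record bean, the first query would become Queries.topN(records, r -> r.getYearPublished() == 2010, Record::getPopularity, 2), and a new query only needs a different predicate or score function.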
user6904265
0

With a collection of objects you can use the Stream API to collect, filter, and reduce your results; there is not much more to it. The main problem is avoiding loading all of the objects into memory while still being able to retrieve them efficiently from whatever store holds them, using indexes and reverse indexes. One framework that comes to mind is Apache Spark.
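As one possible illustration of the collect/reduce step, the sketch below groups an in-memory list by author and averages the sales per group. BookSale and GroupingDemo are hypothetical names, and the example assumes the data already fits in memory.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical minimal type holding only the fields this query needs.
class BookSale {
    private final String author;
    private final int sales;

    BookSale(String author, int sales) {
        this.author = author;
        this.sales = sales;
    }

    String getAuthor() { return author; }
    int getSales() { return sales; }
}

class GroupingDemo {
    // Groups the items by author and reduces each group to its average sales.
    static Map<String, Double> averageSalesByAuthor(List<BookSale> items) {
        return items.stream()
                .collect(Collectors.groupingBy(
                        BookSale::getAuthor,
                        Collectors.averagingInt(BookSale::getSales)));
    }
}
```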

wmlynarski