I am working on a Java program and I have a `data.csv` file with 100 rows. I want to randomly select 10 of those rows. The data looks like this:

    T1  T2  T3  T4  T5
    1   1.0  1   0   1
    1   1.0  0   0   1
    0   0.0  1   1   0

I have managed to read in the CSV file using the following code:

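    import java.io.File;
    import java.io.FileNotFoundException;
    import java.util.Scanner;

    // Class name is arbitrary; the original snippet showed only main().
    public class ReadCsv {
        public static void main(String[] args) {
            // try-with-resources closes the Scanner even if reading fails
            try (Scanner readIn = new Scanner(new File("data.csv"))) {
                while (readIn.hasNextLine()) {
                    String line = readIn.nextLine();
                    // limit -1 keeps trailing empty fields
                    String[] str = line.split(",", -1);
                    // str is overwritten on every iteration; nothing is stored yet
                }
            } catch (FileNotFoundException e) {
                System.out.println("File not found...");
                e.printStackTrace();
            }
        }
    }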
clearlight
user3225573
  • have you considered adding all the rows to an ArrayList then removing 10 rows at random? maybe not the most efficient way but it's possibly the easiest to code. – Patrick Parker Mar 01 '17 at 17:04
  • what collections are you using for the data? or are you using a normal array? – Ousmane D. Mar 01 '17 at 17:05
  • What is your question? – tnw Mar 01 '17 at 17:05
  • P.S. why is this tagged "opencsv" if you are not using that library? – Patrick Parker Mar 01 '17 at 17:06
  • @tnw his question is to provide him some algorithm to randomly pick 10 rows from possible 100 rows. – Ousmane D. Mar 01 '17 at 17:06
  • @OusmaneDiaw That's a code request, not a question. I'd like to know OP's specific problem **and question**. – tnw Mar 01 '17 at 17:08
  • A good starting point would be to actually save the data you are reading in instead of just reading in a line and then overriding it in with the next iteration. – OH GOD SPIDERS Mar 01 '17 at 17:08
  • @tnw you're correct. – Ousmane D. Mar 01 '17 at 17:09
  • I have created an ArrayList of strings and I have added each item to the list. I also have `SecureRandom random = new SecureRandom(); int row = random.nextInt(list.size()); System.out.println(list.get(row));` This gives me only one item selected at random, but I need 10 random items out of 100. I can code it if given an algorithm to start with, please. – user3225573 Mar 01 '17 at 17:34
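
A minimal sketch of what the comments above suggest: keep every line in a list instead of overwriting it, then remove 10 entries at random so that no row can come up twice. The class name `RemoveAtRandom` and the helper `pick` are made up for illustration. It assumes one row per line in `data.csv` and does not treat a header line specially, so a header row, if the file has one, could be picked like any other line.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical class name, for illustration only.
class RemoveAtRandom {

    // Moves up to `count` randomly chosen rows out of a copy of `rows`.
    // Each pick is removed from the pool, so no row can be chosen twice.
    static List<String> pick(List<String> rows, int count, Random random) {
        List<String> pool = new ArrayList<>(rows); // copy, so the caller's list is untouched
        List<String> picked = new ArrayList<>();
        while (picked.size() < count && !pool.isEmpty()) {
            int index = random.nextInt(pool.size());
            picked.add(pool.remove(index)); // remove(int) returns the removed element
        }
        return picked;
    }

    public static void main(String[] args) throws IOException {
        Path csv = Paths.get("data.csv");
        if (!Files.exists(csv)) {
            System.out.println("File not found: " + csv);
            return;
        }
        // Store every line instead of overwriting one variable per iteration.
        List<String> rows = Files.readAllLines(csv);
        for (String row : pick(rows, 10, new SecureRandom())) {
            System.out.println(row);
        }
    }
}
```

Removing from the middle of an `ArrayList` shifts the remaining elements, which is negligible for 100 rows but is the reason the comment calls this approach "not the most efficient".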

1 Answer

Updated code with duplicate number check:

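    // Requires java.util.HashSet, java.util.Set and java.security.SecureRandom.
    // `list` is the List<String> of rows read from the CSV file.
    int sampleSize = Math.min(10, list.size()); // avoids an endless loop on short lists
    Set<Integer> chosen = new HashSet<>();
    SecureRandom random = new SecureRandom();   // created once, outside the loop
    while (chosen.size() < sampleSize)
    {
        int row = random.nextInt(list.size());
        // add() returns false if this row index was already picked
        if (chosen.add(row))
        {
            System.out.println("Row " + row + "=" + list.get(row));
        }
    }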
Atul
  • This is obviously not the perfect solution, as there is a slight chance that the random value could generate the same row more than once. – Ousmane D. Mar 01 '17 at 19:11
  • Yes, in that case we need to store the previous numbers in something, say a map, and check whether the new number already exists. – Atul Mar 01 '17 at 19:15
  • ***DON'T*** create a new `SecureRandom()` on each iteration of the loop!!! – pjs Mar 01 '17 at 19:20
  • This works if the sample size is a small proportion of the data. For larger proportions, it's faster and easier to use `Collections.shuffle()` and iterate through the shuffled result. – pjs Mar 01 '17 at 19:27
  • This user has a small data set; for a larger data set, he should follow http://stackoverflow.com/questions/4040001/creating-random-numbers-with-no-duplicates – Atul Mar 01 '17 at 19:28
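
For reference, the `Collections.shuffle()` approach mentioned in the comments above could look roughly like this. It is only a sketch: the class name `ShuffleSample` and the helper `sample` are hypothetical, and the rows are generated in memory as stand-ins for the lines of the CSV file.

```java
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical class name, for illustration only.
class ShuffleSample {

    // Shuffles a copy of `rows` and returns its first `count` elements
    // (or all of them if the list is shorter). No duplicate check is
    // needed because a shuffle only reorders the existing elements.
    static <T> List<T> sample(List<T> rows, int count, Random random) {
        List<T> copy = new ArrayList<>(rows); // keep the caller's order intact
        Collections.shuffle(copy, random);
        return new ArrayList<>(copy.subList(0, Math.min(count, copy.size())));
    }

    public static void main(String[] args) {
        // Stand-in data; in the question these would be the lines of data.csv.
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            rows.add("row " + i);
        }
        for (String row : sample(rows, 10, new SecureRandom())) {
            System.out.println(row);
        }
    }
}
```

`SecureRandom` is a subclass of `java.util.Random`, so it can be passed to `Collections.shuffle(List, Random)` directly; a plain `Random` would also do if cryptographic strength is not required.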