1

I have an assignment that requires us to read from a text file of COVID-19 codon sequences. I have read in the first line as a string, and I am able to split this one line into 3-character substrings. My issue now is doing the same for the rest of the file. When I add a hasNext check, it doesn't seem to work the same way as it does for my test line.

{
//Open the file
File file = new File("D://Downloads/covid19sequence.txt");
Scanner scan = new Scanner(file); String testLine = ""; String contents = ""; String codon2 = "";
double aTotal, lTotal, lPercentage; 
ArrayList<String> codonList = new ArrayList<String>();

//Read a line in from the file and assign codons via substring
testLine = scan.nextLine();
for (int i = 0; i < testLine.length(); i += 3)
{   

    String codon = testLine.substring(i, i + 3);
    codonList.add(codon);

}
while(scan.hasNext())

System.out.println(codonList); 

}

For reference, here is the output for the test line:

[AGA, TCT, GTT, CTC, TAA, ACG, AAC, TTT, AAA, ATC, TGT, GTG, GCT, GTC, ACT, CGG, CTG, CAT, GCT, TAG]

Ikefactor
  • [Please post the code as text](https://meta.stackoverflow.com/questions/285551/why-not-upload-images-of-code-errors-when-asking-a-question#:~:text=You%20should%20not%20post%20code,order%20to%20reproduce%20the%20problem.) – Yassin Hajaj Apr 08 '21 at 20:55
  • I tried that and it says that it is not formatted properly. I am not sure how to get around that? – Ikefactor Apr 08 '21 at 20:57
  • @YassinHajaj I was able to get it pasted correctly. – Ikefactor Apr 08 '21 at 21:06
  • How many lines are there in the input file? Did you try to read this file line by line? – Nowhere Man Apr 08 '21 at 21:12
  • That is what I am trying to do: convert each line into 3-character substrings like my test line. There are 497 lines in the input file, but it is my understanding that you might not always know how big a file is. – Ikefactor Apr 08 '21 at 21:14

2 Answers

1

Use while (scan.hasNextLine()) to go through the whole text file, and call scan.nextLine() inside that loop so each iteration reads a new line. You could do it like this:

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;
import java.util.ArrayList;

public class Codons {

    public static void main(String[] args) throws FileNotFoundException {
        File file = new File("D://Downloads/covid19sequence.txt");
        Scanner scan = new Scanner(file);
        String testLine = "";
        ArrayList<String> codonList = new ArrayList<String>();

        // Read every line from the file and split each one into codons via substring
        while (scan.hasNextLine()) {
            testLine = scan.nextLine();
            // Stop before an incomplete trailing codon so substring cannot run past the end of the line
            for (int i = 0; i + 3 <= testLine.length(); i += 3) {
                String codon = testLine.substring(i, i + 3);
                codonList.add(codon);
            }
        }
        scan.close();

        System.out.println(codonList); 

    }

}
Austin He
  • I've tried moving it. I will try this now. – Ikefactor Apr 08 '21 at 21:27
  • I am not getting any output with this. I am printing the array list to make sure they are all 3 character substrings. Once I know that it is converting correctly, I will remove the println as there is no reason other than to make sure it is converting correctly. – Ikefactor Apr 08 '21 at 21:29
  • @Ikefactor I tried to create a text file like the file you provided. After running my code, the output is like this: [AGA, TCT, GTT, CTC, TAA, ACG, AAC, TTT, AAA]. Isn't that what you expected? – Austin He Apr 08 '21 at 21:38
  • I can output my testline and get 20 codons from the first line. When I added the hasNext, output was only 7 codons and it didn't start at the second line. Maybe I don't need to output it, I just wanted to see the breakdown into substrings? The endgame is to then loop through the file to find specific codons and report on the % of the file each one represents. – Ikefactor Apr 08 '21 at 21:42
  • I changed it to this: { testLine = scan.nextLine(); for (int i = 0; i < testLine.length(); i += 3) { String codon = testLine.substring(i, i + 3); codonList.add(codon); } }while(scan.hasNext()) System.out.println(codonList); I now get more than one line, however, it's the first line repeating. [AGA, TCT, GTT, CTC, TAA, ACG, AAC, TTT, AAA, ATC, TGT, GTG, GCT, GTC, ACT, CGG, CTG, CAT, GCT, TAG] [AGA, TCT, GTT, CTC, TAA, ACG, AAC, etc – Ikefactor Apr 08 '21 at 21:48
  • @Ikefactor So my understanding is that after you convert the first line into 3 character substrings and add all of them to codonList, you will start to convert the second line into substrings, right? – Austin He Apr 08 '21 at 22:07
  • @Ikefactor your while loop should be like this: do { testLine = scan.nextLine(); for (int i = 0; i < testLine.length(); i += 3) { String codon = testLine.substring(i, i + 3); codonList.add(codon); } }while(scan.hasNext()); . You should add "do" at start. – Austin He Apr 08 '21 at 22:26
  • Correct. The entire file will be converted to 3 character substrings. Can't believe I didn't think about a do while loop. Thanks. I'll try it! – Ikefactor Apr 08 '21 at 22:35
  • I still don't seem to get any output with the do while. – Ikefactor Apr 08 '21 at 23:12
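
The two loop shapes discussed in the comments above differ in one respect that is easy to miss: a do-while calls nextLine() once before any check, while a plain while checks first. The sketch below is only an illustration and does not claim to explain the missing output reported above. The class and method names are hypothetical, the Scanner reads from an in-memory String instead of the file, and hasNextLine() is paired with nextLine() throughout.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Scanner;

class LoopShapeDemo {

    // Checks hasNextLine() before every read, so empty input is handled.
    static List<String> readWithWhile(Scanner scan) {
        List<String> lines = new ArrayList<>();
        while (scan.hasNextLine()) {
            lines.add(scan.nextLine());
        }
        return lines;
    }

    // Reads first and checks afterwards, so the first nextLine() is unguarded.
    static List<String> readWithDoWhile(Scanner scan) {
        List<String> lines = new ArrayList<>();
        do {
            lines.add(scan.nextLine());
        } while (scan.hasNextLine());
        return lines;
    }

    public static void main(String[] args) {
        // Both shapes read the same two lines from non-empty input.
        System.out.println(readWithWhile(new Scanner("AGATCT\nGTTCTC\n")));   // [AGATCT, GTTCTC]
        System.out.println(readWithDoWhile(new Scanner("AGATCT\nGTTCTC\n"))); // [AGATCT, GTTCTC]

        // On empty input only the while version returns normally.
        System.out.println(readWithWhile(new Scanner("")));                   // []
        try {
            readWithDoWhile(new Scanner(""));
        } catch (NoSuchElementException e) {
            System.out.println("do-while failed on empty input");
        }
    }
}
```

Either shape reads every line of a non-empty file; the while form is the safer default when the input might be empty.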
0

If a Scanner is used, it may be better to implement a separate method that reads the contents line by line and splits each line into 3-character chunks, as suggested here:


static List<String> readCodons(Scanner input) {
    List<String> codons = new ArrayList<>();
    while (input.hasNextLine()) {
        String line = input.nextLine();
        Collections.addAll(codons, line.split("(?<=\\G...)"));
    }
    return codons;
}
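
The split pattern used in readCodons can be tried in isolation to see how it behaves on lines whose length is not a multiple of 3. This is a minimal sketch; the class CodonSplitDemo and the helper splitCodons are hypothetical names introduced only for this illustration. The pattern relies on \G inside a lookbehind, which works with the standard java.util.regex engine.

```java
import java.util.Arrays;
import java.util.List;

class CodonSplitDemo {

    // Splits after every 3 characters, counted from the end of the previous match.
    static List<String> splitCodons(String line) {
        return Arrays.asList(line.split("(?<=\\G...)"));
    }

    public static void main(String[] args) {
        // Length is a multiple of 3: only complete codons.
        System.out.println(splitCodons("AGATCTGTT")); // [AGA, TCT, GTT]

        // One leftover character: it is kept as a shorter final chunk.
        System.out.println(splitCodons("AGATCTG"));   // [AGA, TCT, G]

        // An empty line yields a single empty string, not an empty array.
        System.out.println(splitCodons("").size());   // 1
    }
}
```

Unlike substring(i, i + 3), the split never throws on a short tail, but incomplete chunks and empty strings from blank lines end up in the list, so they may need to be filtered out before counting codons.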

Test (using Scanner on the basis of a multiline String):

// each line contains 20 codons
String contents = "AGATCTGTTCTCTAAACGAACTTTAAAATCTGTGTGGCTGTCACTCGGCTGCATGCTTAG\n"
                + "GATCTGTTCTCTAAACGAACTTTAAAATCTGTGTGGCTGTCACTCGGCTGCATGCTTAGA\n"
                + "ATCTGTTCTCTAAACGAACTTTAAAATCTGTGTGGCTGTCACTCGGCTGCATGCTTAGAG\n";
List<String> codons = readCodons(new Scanner(contents));
for (int i = 0; i < codons.size(); i++) {
  if (i > 0 && i % 10 == 0) {
      System.out.println();
  }
  System.out.print(codons.get(i) + " ");
}

Output

AGA TCT GTT CTC TAA ACG AAC TTT AAA ATC 
TGT GTG GCT GTC ACT CGG CTG CAT GCT TAG 
GAT CTG TTC TCT AAA CGA ACT TTA AAA TCT 
GTG TGG CTG TCA CTC GGC TGC ATG CTT AGA 
ATC TGT TCT CTA AAC GAA CTT TAA AAT CTG 
TGT GGC TGT CAC TCG GCT GCA TGC TTA GAG 

The results should be similar if the scanner is created on a text file:

try (Scanner input = new Scanner(new File("codons.data"))) {
    List<String> codons = readCodons(input);
    // print/process codons
}
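
As for what "process codons" might look like, the question's comments mention finding specific codons and reporting what percentage of the file each one represents. A minimal sketch of that step follows; the class CodonStats and the method percentageOf are hypothetical names, and the sample list is made up purely for illustration.

```java
import java.util.Arrays;
import java.util.List;

class CodonStats {

    // Returns the percentage of entries in codons that equal target.
    // An empty list gives 0.0 to avoid dividing by zero.
    static double percentageOf(List<String> codons, String target) {
        if (codons.isEmpty()) {
            return 0.0;
        }
        int matches = 0;
        for (String codon : codons) {
            if (codon.equals(target)) {
                matches++;
            }
        }
        return 100.0 * matches / codons.size();
    }

    public static void main(String[] args) {
        // Illustrative data only, not taken from the real sequence file.
        List<String> codons = Arrays.asList("AGA", "TCT", "AGA", "TAG");
        System.out.println(percentageOf(codons, "AGA")); // 50.0
        System.out.println(percentageOf(codons, "CTG")); // 0.0
    }
}
```

The list returned by readCodons could be passed straight to such a method once for each codon of interest.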
Nowhere Man