First of all, hi to everyone. I'm a beginner with C# and trying to do this homework. My problem is reading a specific part of a .pdb (Protein Data Bank) file and splitting those lines into an array or list. Then I will use it in a Forms app.
The contents of the .pdb file look like this:
HEADER ANTIFREEZE 17-SEP-97 7MSI
TITLE TYPE III ANTIFREEZE PROTEIN ISOFORM HPLC 12
COMPND MOL_ID: 1;
COMPND 2 MOLECULE: TYPE III ANTIFREEZE PROTEIN ISOFORM HPLC 12;
SOURCE MOL_ID: 1;
SOURCE 2 ORGANISM_SCIENTIFIC: MACROZOARCES AMERICANUS;
ATOM 1 N MET A 0 18.112 24.345 32.146 1.00 51.10 N
ATOM 2 CA MET A 0 18.302 23.436 31.020 1.00 49.06 C
ATOM 3 C MET A 0 18.079 24.312 29.799 1.00 46.75 C
ATOM 4 O MET A 0 16.928 24.678 29.560 1.00 48.24 O
ATOM 5 CB MET A 0 17.257 22.311 31.008 1.00 48.14 C
ATOM 6 N ALA A 1 19.106 24.757 29.076 1.00 43.47 N
HETATM 491 O HOH A 101 23.505 19.335 23.451 1.00 35.56 O
HETATM 492 O HOH A 102 19.193 19.013 25.418 1.00 12.73 O
HETATM 493 O HOH A 103 7.781 12.538 12.927 1.00 80.11 O
.... and goes on like this
I only need to read the lines that start with the "ATOM" keyword. Then I want to split their fields into variables and store them in an array or list. After that I want to print the maximum value of the X coordinate to a label.
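The goal described here could be sketched roughly as follows. This is only a minimal sketch under assumptions: the names `AtomLineFilter`, `AtomLines` and `MaxX` are made up for illustration, the fields are assumed to be separated by spaces or tabs, and the X coordinate is assumed to be the seventh token of an ATOM line (index 6, counting the "ATOM" keyword as index 0).

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical helper, not part of the original project.
static class AtomLineFilter
{
    // Keeps only the lines that begin with the "ATOM" keyword.
    // HEADER, TITLE, HETATM etc. are skipped because they start differently.
    public static IEnumerable<string> AtomLines(IEnumerable<string> lines)
    {
        return lines.Where(l => l.StartsWith("ATOM", StringComparison.Ordinal));
    }

    // Returns the largest X coordinate among the ATOM lines.
    // Assumption: X is the token at index 6 after splitting on whitespace.
    public static double MaxX(IEnumerable<string> lines)
    {
        return AtomLines(lines)
            .Select(l => l.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries))
            // InvariantCulture makes "18.112" parse with '.' as the decimal
            // separator regardless of the regional settings of the machine.
            .Max(parts => double.Parse(parts[6], CultureInfo.InvariantCulture));
    }
}
```

In a Forms handler this might be used as `label1.Text = AtomLineFilter.MaxX(File.ReadLines(filePath)).ToString(CultureInfo.InvariantCulture);`, where `label1` is a hypothetical label name. Note that `Max` throws an `InvalidOperationException` if the file contains no ATOM lines at all.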
For example:
ATOM 1 N MET A 0 18.112 24.345 32.146 1.00 51.10 N
1 stands for atom number
N stands for atom name
MET stands for amino acid name
18.112 stands for X coordinate etc.
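The field positions listed above can be illustrated with a small sketch that splits one ATOM line on whitespace and picks the tokens by index. `AtomRecord` and `AtomLineParser` are hypothetical names used only for this illustration, and only four of the fields are shown. The point to notice is that the "ATOM" keyword itself occupies index 0, so the atom number is at index 1.

```csharp
using System;
using System.Globalization;

// Hypothetical holder for a few fields of one ATOM line.
class AtomRecord
{
    public int AtomNo;
    public string AtomName;
    public string AminoName;
    public double X;
}

// Hypothetical parser, assuming whitespace-separated fields.
static class AtomLineParser
{
    public static AtomRecord Parse(string line)
    {
        // RemoveEmptyEntries collapses runs of spaces into a single separator.
        string[] p = line.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);

        if (p.Length < 7 || p[0] != "ATOM")
            throw new FormatException("Not an ATOM line: " + line);

        return new AtomRecord
        {
            AtomNo    = int.Parse(p[1], CultureInfo.InvariantCulture),   // "1"
            AtomName  = p[2],                                            // "N"
            AminoName = p[3],                                            // "MET"
            // p[4] is the chain and p[5] the amino acid number (not stored here).
            X         = double.Parse(p[6], CultureInfo.InvariantCulture) // "18.112"
        };
    }
}
```

Splitting on whitespace works for the sample shown here. Real PDB files are specified as fixed-column records, so a more robust version would probably read each field by column position with `Substring`; that is not shown in this sketch.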
WHAT I DID
I used code from a similar question that was asked here before, but I couldn't adapt it to my project. First I created a class for the fields:
class Atom
{
    public int atom_no;
    public string atom_name;
    public string amino_name;
    public char chain;
    public int amino_no;
    public float x_coordinate;
    public float y_coordinate;
    public float z_coordinate;
    public float ratio;
    public float temperature;
}
Then the main class. NOTE: I should mention that the fields are not separated by a single space. For example, between "MET" and "A" there are 3 or 4 extra spaces. I've tried to remove them while reading the file, but I don't know if that worked.
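To check whether the extra spaces are really removed, the two ways of splitting can be compared side by side. This is a minimal sketch with a hypothetical `WhitespaceSplitDemo` class: `Split(' ')` returns one empty string for each extra space, while the overload taking `StringSplitOptions.RemoveEmptyEntries` drops them.

```csharp
using System;

// Hypothetical demo class comparing the two Split calls.
static class WhitespaceSplitDemo
{
    // Plain split: every single space is a separator, so a run of
    // spaces produces empty strings between the real fields.
    public static string[] SplitKeepingEmpties(string line)
    {
        return line.Split(' ');
    }

    // Split that discards the empty strings caused by runs of spaces.
    public static string[] SplitDroppingEmpties(string line)
    {
        return line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

For the input `"MET    A"` (four spaces), the first method returns five elements (`"MET"`, three empty strings, `"A"`) and the second returns two. In the handler below, the result of the second kind of split is stored in `lineParts`, but the fields are then read from `strArray`, which comes from the plain `Split(' ')`.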
private void button1_Click(object sender, EventArgs e)
{
    string filePath = @"path_of_file";
    string stringToSearch = @"ATOM";
    List<Atom> Atoms = new List<Atom>();
    using (StreamReader sr = new StreamReader(filePath))
    {
        string[] lines = File.ReadAllLines(filePath);
        foreach (string line in lines)
        {
            if (line.Contains(stringToSearch)) // I tried to read only the lines that start with ATOM
            {
                while (sr.Peek() >= 0) // this while loop is from the earlier question
                {
                    string[] strArray;
                    string line1 = sr.ReadLine(); // I added these 2 lines to remove the extra whitespace
                    var lineParts = line1.Split(" ".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
                    strArray = line1.Split(' ');
                    Atom currentAtom = new Atom();
                    currentAtom.atom_no = int.Parse(strArray[0]);
                    currentAtom.atom_name = strArray[1];
                    currentAtom.amino_name = strArray[2];
                    currentAtom.chain = char.Parse(strArray[3]);
                    currentAtom.amino_no = int.Parse(strArray[4]);
                    currentAtom.x_coordinate = float.Parse(strArray[5]);
                    currentAtom.y_coordinate = float.Parse(strArray[6]);
                    currentAtom.z_coordinate = float.Parse(strArray[7]);
                    currentAtom.ratio = float.Parse(strArray[8]);
                    currentAtom.temperature = float.Parse(strArray[9]);
                    Atoms.Add(currentAtom);
                }
            }
        }
    }
    listBox1.DataSource = Atoms;
    listBox1.ValueMember = "atom_no";
    listBox1.DisplayMember = "atom_name";
}
I haven't added the part that prints the maximum X coordinate to a label yet; at this point I'm testing with a ListBox. When I run the code and press the button, I get an "Input string was not in a correct format" error at the line currentAtom.atom_no = int.Parse(strArray[0]);
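One possible reading of this error, offered as a sketch rather than a confirmed diagnosis: in the handler above, `sr.ReadLine()` starts from the top of the file regardless of which `line` matched, and index 0 of every split line is the record keyword (such as "HEADER" or "ATOM"), which `int.Parse` cannot convert to a number. The hypothetical helper below uses `int.TryParse` on index 1 instead, so a line that does not fit returns `false` rather than throwing.

```csharp
using System;
using System.Globalization;

// Hypothetical helper illustrating a non-throwing way to read the atom number.
static class AtomNumberReader
{
    // Returns true and sets atomNo only for lines whose first token is "ATOM"
    // and whose second token is a valid integer.
    public static bool TryGetAtomNumber(string line, out int atomNo)
    {
        atomNo = 0;
        string[] parts = line.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);

        // parts[0] is the keyword, so the atom number is expected at parts[1].
        if (parts.Length < 2 || parts[0] != "ATOM")
            return false;

        return int.TryParse(parts[1], NumberStyles.Integer,
                            CultureInfo.InvariantCulture, out atomNo);
    }
}
```

With this helper, a HEADER or HETATM line is simply skipped, and an ATOM line such as the sample above yields atom number 1.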
I know that my code looks like a mess, and sorry if I've taken up your time with this. I would much appreciate it if you could help me with this Forms app for my homework. If not, thank you for reading anyway. Have a nice and healthy day.