
I have a DNA sequence and I want to find the nucleotides at a position chosen by the user. Below is an example:

Enter the DNA sequence: ACTAAAAATACAAAAATTAGCCAGGCGTGGTGGCAC (the length of the sequence is 36)
Enter the position: 12

The expected result is that at position 12 the nucleotides are AAA.

I have no problem finding the amino acid of the position. Below is the current code I have.

print "ENTER THE FILENAME OF THE DNA SEQUENCE:= ";
$DNAfilename = <STDIN>;
chomp $DNAfilename;
unless ( open(DNAFILE, $DNAfilename) ) {
  print "Cannot open file \"$DNAfilename\"\n\n";
}
@DNA = <DNAFILE>;
close DNAFILE;
$DNA = join( '', @DNA);
print " \nThe original DNA file is:\n$DNA \n";
$DNA =~ s/\s//g;

print" enter the number ";
$po=<STDIN>;

@pos=$DNA;
if ($po>length($DNA)) 
{
  print" no data";
}

else 
{
  print " @pos\n\n";
}

Please advise how I can find the nucleotides at a given position in the DNA sequence.

knittl
Phan

2 Answers

my $nucleotide = substr $DNA, $po, 3;

This takes the 3 nucleotides from offsets $po up to $po+2 and assigns them to $nucleotide. Note that substr offsets are zero-based, so offset 0 is the first nucleotide.
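
As a quick check against the example sequence from the question, here is a minimal sketch of what substr returns. The variable names $zero_based and $one_based are illustrative only. If the user's position is meant to be 1-based, subtract 1 before calling substr.

```perl
use strict;
use warnings;

# Example sequence from the question.
my $DNA = 'ACTAAAAATACAAAAATTAGCCAGGCGTGGTGGCAC';
my $po  = 12;

# substr offsets are zero-based: offset 12 is the 13th nucleotide.
my $zero_based = substr $DNA, $po, 3;

# Treat $po as a 1-based position by subtracting 1.
my $one_based = substr $DNA, $po - 1, 3;

print "offset $po:   $zero_based\n";
print "position $po: $one_based\n";
```

For this particular sequence both lines print AAA, because positions 12 through 16 are all A; on other inputs the two interpretations will usually differ.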

hexcoder
  • Thank you very much. I had to find the amino acid, but with a long sequence (3272733 bp), when I enter a high position number (14488) this program did not run. Please show me the problem in this program. – Phan Aug 17 '11 at 10:18
  • Please try the proposal from [Xaerxess](http://stackoverflow.com/users/708434/xaerxess) below. It should extract the data you want. Otherwise, give us the output and/or error message you get. – hexcoder Aug 17 '11 at 11:16

That'll be something like this:

use strict;
use warnings;

print 'ENTER THE FILENAME OF THE DNA SEQUENCE:= ';
my $DNA_filename = <STDIN>;
chomp $DNA_filename;
unless (open(DNAFILE, $DNA_filename))
{
    die 'Cannot open file "' . $DNA_filename . '"' . "\n\n";
}

my @DNA = <DNAFILE>;
close DNAFILE;

my $DNA_string = join('', @DNA);
print "\n" . 'The original DNA file is:' . "\n" . $DNA_string . "\n";
$DNA_string =~ s/\s//g;

print ' enter the number ';
my $pos = <STDIN>;
chomp $pos;

# Offsets are zero-based, so the last valid offset is length - 1.
if ($pos >= length($DNA_string))
{
    print ' no data';
}
else
{
    print ' ' . substr($DNA_string, $pos, 3) . "\n\n";
}

Some comments:

  1. Always use strict and use warnings - they catch many common mistakes and help you write more reliable code.
  2. I personally don't like using interpolation in double-quoted strings, hence the concatenations.
  3. Positions in the result start at 0 - if you want 1-based positions, change the condition of the last if and the substr call in the else branch.

Edit: I misread the part of the question about nucleotides. As @hexcoder wrote, you want substr($DNA_string, $pos, 3).
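
Comment 3 can be made concrete with a small helper. This is only a minimal sketch, assuming the user enters a 1-based position; the name nucleotides_at is hypothetical and not part of the code above. It validates the input, converts it to substr's zero-based offset, and returns undef when the position falls outside the sequence.

```perl
use strict;
use warnings;

# Hypothetical helper: return up to $count nucleotides starting at a
# 1-based position, or undef if the position is invalid or out of range.
sub nucleotides_at {
    my ($dna, $position, $count) = @_;
    $count = 3 unless defined $count;

    # Accept only positive whole numbers.
    return unless defined $position && $position =~ /^[1-9][0-9]*$/;
    return if $position > length $dna;

    # substr is zero-based, so shift the 1-based position down by one.
    return substr $dna, $position - 1, $count;
}

my $dna = 'ACTAAAAATACAAAAATTAGCCAGGCGTGGTGGCAC';

for my $position (1, 12, 36, 37) {
    my $result = nucleotides_at($dna, $position);
    print "$position: ", (defined $result ? $result : 'no data'), "\n";
}
```

Near the end of the sequence fewer than three nucleotides remain, so the helper returns a shorter string there (a single C at position 36) rather than a full triplet.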

Grzegorz Rożniecki