Project 2 DNA

Project 2: DNA Analysis
Due Dates:
Checkpoint 1: 1/7/14 (10%)
Final Due Date: 1/12/14

Students will write a program that uses arrays and files to analyze DNA sequences and determine if they represent proteins.

Special thanks to Stuart Reges and Marty Stepp of UW for use of this assignment.
I. Background
Deoxyribonucleic acid (DNA) is a complex biochemical macromolecule that carries genetic information for cellular life forms and some viruses. DNA is also the mechanism through which genetic information from parents is passed on during reproduction. DNA consists of long chains of chemical compounds called nucleotides. Four nucleotides are present in DNA: Adenine (A), Cytosine (C), Guanine (G), and Thymine (T). Certain regions of the DNA are called genes. Most genes encode instructions for building proteins (they're called "protein-coding" genes). These proteins are responsible for carrying out most of the life processes of the organism. Nucleotides in a gene are organized into codons. Codons are groups of three nucleotides and are written as the first letters of their nucleotides (e.g., TAC or GGA). Each codon uniquely encodes a single amino acid, a building block of proteins.
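
As a small illustration of the codon grouping described above, here is a minimal Python sketch that splits a nucleotide string into groups of three. The function name `split_codons` is hypothetical, and dropping any leftover one or two trailing letters is an assumption made for the example, not something the assignment text specifies.

```python
def split_codons(sequence):
    """Split a nucleotide string into consecutive 3-letter codons.

    Assumption for this sketch: trailing letters that do not fill a
    complete codon are dropped, and input is treated case-insensitively.
    """
    seq = sequence.upper()
    usable = len(seq) - len(seq) % 3  # length rounded down to a multiple of 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


if __name__ == "__main__":
    # Three complete codons: ATG, GGA, TAC
    print(split_codons("ATGGGATAC"))
```

Returning a list keeps the codons in their original order, which matters because the first and last codons are the ones checked against start and stop markers.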

The sequences of DNA that encode proteins occur between a start codon (which we will assume to be ATG) and a stop codon (which is any of TAA, TAG, or TGA). Not all regions of DNA are genes; large portions that do not lie between a valid start and stop codon are called intergenic DNA and have other (possibly unknown) functions. Computational biologists examine large DNA data files to find patterns and important information, such as which regions are genes. Sometimes they are interested in the percentage of mass accounted for by each of the four nucleotide types. High percentages of Cytosine (C) and Guanine (G) are often indicators of important genetic data.
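
The two checks mentioned above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the requirement that the length be a multiple of three is an assumption of the sketch, and the C+G percentage is computed by nucleotide count rather than by mass, because no per-nucleotide mass values are given in this text.

```python
START_CODON = "ATG"                  # start codon assumed by the text
STOP_CODONS = {"TAA", "TAG", "TGA"}  # the three stop codons named in the text


def lies_between_start_and_stop(sequence):
    """Return True if the sequence begins with ATG and ends with a stop codon.

    Assumption for this sketch: the sequence must hold at least two whole
    codons and have a length that is a multiple of 3.
    """
    seq = sequence.upper()
    if len(seq) < 6 or len(seq) % 3 != 0:
        return False
    return seq[:3] == START_CODON and seq[-3:] in STOP_CODONS


def cg_percentage(sequence):
    """Return the percentage of letters that are C or G, counted by
    occurrence (a simplification: not weighted by mass)."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    cg = seq.count("C") + seq.count("G")
    return 100.0 * cg / len(seq)


if __name__ == "__main__":
    example = "ATGGGATAA"
    print(lies_between_start_and_stop(example), cg_percentage(example))
```

A mass-weighted version would replace the simple counts with a sum of per-nucleotide masses, once those values are available.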

In this assignment, you will write a program that reads named nucleotide sequences from an input file and performs analysis on the…
