Beginning Perl for Bioinformatics
James Tisdall
Publisher: O'Reilly
First Edition, October 2001
ISBN: 0-596-00080-4, 384 pages

This book shows biologists with little or no programming experience how to use Perl, the ideal language for biological data analysis. Each chapter focuses on solving a particular problem or class of problems, so you'll finish the book with a solid understanding of Perl basics, a collection of programs for such tasks as parsing BLAST and GenBank, and the skills to tackle more advanced bioinformatics programming.





Preface
  What Is Bioinformatics?
  About This Book
  Who This Book Is For
  Why Should I Learn to Program?
  Structure of This Book
  Conventions Used in This Book
  Comments and Questions
  Acknowledgments

1. Biology and Computer Science
  1.1 The Organization of DNA
  1.2 The Organization of Proteins
  1.3 In Silico
  1.4 Limits to Computation

2. Getting Started with Perl
  2.1 A Low and Long Learning Curve
  2.2 Perl's Benefits
  2.3 Installing Perl on Your Computer
  2.4 How to Run Perl Programs
  2.5 Text Editors
  2.6 Finding Help

3. The Art of Programming
  3.1 Individual Approaches to Programming
  3.2 Edit—Run—Revise (and Save)
  3.3 An Environment of Programs
  3.4 Programming Strategies
  3.5 The Programming Process

4. Sequences and Strings
  4.1 Representing Sequence Data
  4.2 A Program to Store a DNA Sequence
  4.3 Concatenating DNA Fragments
  4.4 Transcription: DNA to RNA
  4.5 Using the Perl Documentation
  4.6 Calculating the Reverse Complement in Perl
  4.7 Proteins, Files, and Arrays
  4.8 Reading Proteins in Files
  4.9 Arrays
  4.10 Scalar and List Context
  4.11 Exercises

5. Motifs and Loops
  5.1 Flow Control
  5.2 Code Layout
  5.3 Finding Motifs
  5.4 Counting Nucleotides
  5.5 Exploding Strings into Arrays
  5.6 Operating on Strings
  5.7 Writing to Files
  5.8 Exercises

6. Subroutines and Bugs
  6.1 Subroutines
  6.2 Scoping and Subroutines
  6.3 Command-Line Arguments and Arrays
  6.4 Passing Data to Subroutines
  6.5 Modules and Libraries of Subroutines
  6.6 Fixing Bugs in Your Code
  6.7 Exercises

7. Mutations and Randomization
  7.1 Random Number Generators
  7.2 A Program Using Randomization
  7.3 A Program to Simulate DNA Mutation
  7.4 Generating Random DNA
  7.5 Analyzing DNA
  7.6 Exercises

8. The Genetic Code
  8.1 Hashes
  8.2 Data Structures and Algorithms for Biology
  8.3 The Genetic Code
  8.4 Translating DNA into Proteins
  8.5 Reading DNA from Files in FASTA Format
  8.6 Reading Frames
  8.7 Exercises

9. Restriction Maps and Regular Expressions
  9.1 Regular Expressions
  9.2 Restriction Maps and Restriction Enzymes
  9.3 Perl Operations
  9.4 Exercises

10. GenBank
  10.1 GenBank Files
  10.2 GenBank Libraries
  10.3 Separating Sequence and Annotation
  10.4 Parsing Annotations
  10.5 Indexing GenBank with DBM
  10.6 Exercises

11. Protein Data Bank
  11.1 Overview of PDB
  11.2 Files and Folders
  11.3 PDB Files
  11.4 Parsing PDB Files
  11.5 Controlling Other Programs
  11.6 Exercises

12. BLAST
  12.1 Obtaining BLAST
  12.2 String Matching and Homology
  12.3 BLAST Output Files
  12.4 Parsing BLAST Output
  12.5 Presenting Data
  12.6 Bioperl
  12.7 Exercises

13. Further Topics
  13.1 The Art of Program Design
  13.2 Web Programming
  13.3 Algorithms and Sequence Alignment
  13.4 Object-Oriented Programming
  13.5 Perl Modules
  13.6 Complex Data Structures
  13.7 Relational Databases
  13.8 Microarrays and XML
  13.9 Graphics Programming
  13.10 Modeling Networks
  13.11 DNA Computers

A. Resources
  A.1 Perl
  A.2 Computer Science
  A.3 Linux
  A.4 Bioinformatics
  A.5 Molecular Biology

B. Perl Summary
  B.1 Command Interpretation
  B.2 Comments
  B.3 Scalar Values and Scalar Variables
  B.4 Assignment
  B.5 Statements and Blocks
  B.6 Arrays
  B.7 Hashes
  B.8 Operators
  B.9 Operator Precedence
  B.10 Basic Operators
  B.11 Conditionals and Logical Operators
  B.12 Binding Operators
  B.13 Loops
  B.14 Input/Output
  B.15 Regular Expressions
  B.16 Scalar and List Context
  B.17 Subroutines and Modules
  B.18 Built-in Functions



