Beginning Perl for bioinformatics
Personal Author:
Tisdall, James D.
First edition.
Publication Information:
Beijing ; Sebastopol, CA : O'Reilly, [2001]

Physical Description:
xiii, 368 pages : illustrations ; 24 cm
General Note:
"An introduction to Perl for biologists"--Cover.
Call Number:
QA76.73.P22 T57 2001
Material Type:
Adult Non-Fiction
Home Location:
Central Closed Stacks

With its highly developed capacity to detect patterns in data, Perl has become one of the most popular languages for biological data analysis. But if you're a biologist with little or no programming experience, starting out in Perl can be a challenge. Many biologists have a difficult time learning how to apply the language to bioinformatics. The most popular Perl programming books are often too theoretical and too focused on computer science for a non-programming biologist who needs to solve very specific problems.

Beginning Perl for Bioinformatics is designed to get you quickly over the Perl language barrier by approaching programming as an important new laboratory skill, revealing Perl programs and techniques that are immediately useful in the lab. Each chapter focuses on solving a particular bioinformatics problem or class of problems, starting with the simplest and increasing in complexity as the book progresses. Each chapter includes programming exercises and teaches bioinformatics by showing and modifying programs that deal with various kinds of practical biological problems. By the end of the book you'll have a solid understanding of Perl basics, a collection of programs for such tasks as parsing BLAST and GenBank, and the skills to take on more advanced bioinformatics programming. Some of the later chapters focus in greater detail on specific bioinformatics topics. This book is suitable for use as a classroom textbook, for self-study, and as a reference.

The book covers:

Programming basics and working with DNA sequences and strings
Debugging your code
Simulating gene mutations using random number generators
Regular expressions and finding motifs in data
Arrays, hashes, and relational databases
Regular expressions and restriction maps
Using Perl to parse PDB records, annotations in GenBank, and BLAST output

Author Notes

James Tisdall has worked as a musician, a programmer at Bell Labs (where he programmed for speech research and discovered a formal language for musical rhythm), and as a bioinformaticist at Mercator Genetics in Menlo Park, California, and at Fox Chase Cancer Center in Philadelphia. He has a B.A. in mathematics from the City College of New York and an M.S. in computer science from Columbia University; he is working towards a Ph.D. in computer science at the University of Pennsylvania. In his spare time, Jim teaches computer music at the Settlement Music School in Philadelphia. He is also the author of O'Reilly's Beginning Perl for Bioinformatics.

Table of Contents

1 Biology and Computer Science
The Organization of DNA
The Organization of Proteins
In Silico
Limits to Computation
2 Getting Started with Perl
A Low and Long Learning Curve
Perl's Benefits
Installing Perl on Your Computer
How to Run Perl Programs
Text Editors
Finding Help
3 The Art of Programming
Individual Approaches to Programming
Edit-Run-Revise (and Save)
An Environment of Programs
Programming Strategies
The Programming Process
4 Sequences and Strings
Representing Sequence Data
A Program to Store a DNA Sequence
Concatenating DNA Fragments
Transcription: DNA to RNA
Using the Perl Documentation
Calculating the Reverse Complement in Perl
Proteins, Files, and Arrays
Reading Proteins in Files
Scalar and List Context
5 Motifs and Loops
Flow Control
Code Layout
Finding Motifs
Counting Nucleotides
Exploding Strings into Arrays
Operating on Strings
Writing to Files
6 Subroutines and Bugs
Scoping and Subroutines
Command-Line Arguments and Arrays
Passing Data to Subroutines
Modules and Libraries of Subroutines
Fixing Bugs in Your Code
7 Mutations and Randomization
Random Number Generators
A Program Using Randomization
A Program to Simulate DNA Mutation
Generating Random DNA
Analyzing DNA
8 The Genetic Code
Data Structures and Algorithms for Biology
The Genetic Code
Translating DNA into Proteins
Reading DNA from Files in FASTA Format
Reading Frames
9 Restriction Maps and Regular Expressions
Regular Expressions
Restriction Maps and Restriction Enzymes
Perl Operations
10 GenBank
GenBank Files
GenBank Libraries
Separating Sequence and Annotation
Parsing Annotations
Indexing GenBank with DBM
11 Protein Data Bank
Files and Folders
PDB Files
Parsing PDB Files
Controlling Other Programs
12 BLAST
Obtaining BLAST
String Matching and Homology
BLAST Output Files
Parsing BLAST Output
Presenting Data
13 Further Topics
The Art of Program Design
Web Programming
Algorithms and Sequence Alignment
Object-Oriented Programming
Perl Modules
Complex Data Structures
Relational Databases
Microarrays and XML
Graphics Programming
Modeling Networks
DNA Computers
A Resources