XML processing with Python
Personal Author:
McGrath, Sean 1965-
Publication Information:
Upper Saddle River, NJ : Prentice Hall PTR, [2000]

Physical Description:
xxiv, 527 pages : illustrations ; 24 cm + 1 computer optical disc (4 3/4 in.).
General Note:
Includes index.

CD-ROM includes complete Python distribution for Windows and Linux, Pyxie open-source libraries, utility programs, sample source code.


Call Number:
QA76.76.H94 M3885 2000
Material Type:
Book and Software Set
Home Location:
Central Closed Stacks





Author Notes

SEAN McGRATH is a leading XML/SGML expert and active member of the XML developers' community. He served as an invited expert for the W3C special interest group that standardized XML. He is Chief Technology Officer of Propylon, developers of mobile portal software. His books include XML by Example: Building eCommerce Applications and ParseMe.1st: SGML for Software Developers (Prentice Hall PTR).

About the Series Editor

CHARLES F. GOLDFARB is the father of markup languages, a term that he coined in 1970. He is the inventor of SGML, the International Standard on which both XML and HTML are based. You can find him on the Web at

Table of Contents

Breakthrough techniques for building XML applications -- fast!
Includes a detailed Python tutorial
Learn about DOM and SAX application development with Python
Exclusive coverage of the new Pyxie XML processing library
CD-ROM includes Python and Pyxie distributions for Windows NT and Linux--plus powerful utilities and lots of working code
"XML processing is the newest required skill for webmasters and application developers. The Python language and Sean McGrath's book make it fun to learn and easy to do." --Charles F. Goldfarb
When it comes to XML processing, Python is in a league of its own.
If you're doing XML development without Python, you're wasting time!
Python offers outstanding productivity -- especially in the areas that matter most to XML developers, such as XML parsing, DOM/SAX implementations, string processing, and Internet APIs.
And now there's Pyxie -- the new open source library that makes Python XML processing even easier and more powerful.
In XML Processing with Python, top XML developer Sean McGrath delivers the hands-on explanations and examples you need to get results with Python and Pyxie fast -- even if you've never used them before!
Install Python and the Pyxie XML package
Learn the fundamentals of Python: control structures, classes, nested lists, dictionaries, and regular expressions
Process XML with regular expression-driven, event-driven, and tree-driven techniques
Understand Python's support for DOM and SAX APIs
Explore the power of Python/XML through worked examples of GUI development, database integration, and an XML query-by-example implementation.
Elegant, easy, powerful and fun, Python helps you build world-class XML applications in less time than you ever imagined.
If you know XML, one book has all the techniques, code, and tools you'll need to process it: XML Processing with Python.
CD-ROM Included
The accompanying CD-ROM contains everything you need to develop XML applications with Python, including:
Complete Python distributions for Windows and Linux
The Pyxie open-source libraries
Powerful utility programs
An extensive library of sample source code tested on both Windows NT and Linux
1 Introduction
Purpose of This Book
The Pyxie Open Source Project
How to Read This Book
A Note about Platforms
Structure of Code Samples
And Finally
2 Installing Python
Getting a Python Distribution
Installing the Software
Testing the Python Installation
Using a Python Program File
In Conclusion
3 Installing the XML Package
Testing the XML Package Installation
Testing the pyExpat Module
Testing SAX Support
4 Tools of the Trade
The xmln and xmlv Parsing Utilities
Simple XML-Processing Tasks with xmln and xmlv
The GetURL Utility: A Web Resource Retriever in Python
The PYX2XML Utility: Converting PYX to XML
The C3 Utility: An XML Document Editor/Viewer in Python
In Conclusion
5 Just Enough Python
Basic Control Structures
Data Structures
Object Orientation
Design Principles
In Conclusion
6 Some Important Details
Dealing with Long Lines
Using the dir Function
Working with Docstrings
Importing Modules
Executing Python Programs
Using the Special Object None
Memory Management
Copying Objects
Determining Object Identity
Handling Errors
The Dynamic Nature of Python
Named Parameters
The Pass Statement
7 Processing XML with Regular Expressions
Command-Line Arguments
A Module Test Harness for xgrep
What If There Are No Command-Line Parameters?
Adding Support for Wildcards
Parsing Command-Line Options
A Pattern-Matching Dry Run
Introducing Regular Expressions
Using Escape Sequences in Regular Expressions
Compiling Regular Expressions
Adding Regular Expressions to xgrep
xgrep in Action
Parsing XML with Regular Expressions
Cautionary Tales
Avoiding False Positive Matches
Shallow Parsing XML with Python Regular Expressions
Current Implementation of xgrep
8 Event-driven XML Processing
Making xgrep XML-Aware
Invoking xmln from xgrep
Adding PYX Support for xgrep
Adding XML Search Features to xgrep
Using Long Option Names in getopt
Using "Bit Twiddling" to Handle the Many Options Available
The Match Printing Function
Some Examples
Generalizing the Idea of Event-Based XML Processing
A Standardized Event-Driven Processing Model
Advantages and Disadvantages of Event-Driven Processing
In Conclusion
9 Tree-driven XML Processing
Modelling a Node
Navigating a Tree
Building xTree Structures
Building an xTree By Using PYX
A Test Harness for Pyxie
Handling Line Ends
A Syntax for Tree Processing with xgrep
Adding Support for Attributes
Some Utility Bits and Pieces
Implementing XMLGrepTree
A Standardized Tree-Driven XML Processing Model
Advantages and Disadvantages of Tree-Driven XML Processing
Some Examples
Bringing It All Together
10 Just Enough SAX
The Concept of an "Interface"
Overview of the SAX Specification
The HandlerBase Class
The DocumentHandler Interface
The AttributeList Interface
The ErrorHandler Interface
A SAX Inspection Application
SAX as a Source of PYX
Switching SAX Parsers
11 Just Enough DOM
DOM Support in Python
The DOM Architecture
Accessing an XML File with pyDOM
Navigating a DOM Tree
Walking a DOM Tree
Accessing Attributes
Manipulating Trees
Accessing an HTML File with pyDOM
Printing the Text of an HTML Document
Changing Data Content in a DOM Tree
Creating a Tree Programmatically
Converting HTML to PYX by Using DOM
Using PYX as a DOM Data Source
12 Pyxie: An Open Source XML-Processing Library for Python
What Is Pyxie?
Design Goals
PYX Notation Processing
Event-driven Processing
Tree-driven Processing
Tree Navigation
Tree Cut-and-Paste
Node Lists
Tree Walking