Used, New, and Out of Print Books - We Buy and Sell - Powell's Books

Perl & LWP: Fetching Web Pages, Parsing HTML, Writing Spiders & More

by Sean M. Burke


ISBN13: 9780596001780
ISBN10: 0596001789
Condition: Standard


Price: $10.50 (List Price: $39.99)
Used Trade Paperback
Ships in 1 to 3 days. Ships free on qualified orders.
Qty: 1, Store: Burnside

Synopses & Reviews

Publisher Comments

Perl soared to popularity as a language for creating and managing web content, but with LWP (Library for WWW in Perl), Perl is equally adept at consuming information on the Web. LWP is a suite of modules for fetching and processing web pages.

The Web is a vast data source that contains everything from stock prices to movie credits, and with LWP all that data is just a few lines of code away. Anything you do on the Web, whether it's buying or selling, reading or writing, uploading or downloading, news to e-commerce, can be controlled with Perl and LWP. You can automate Web-based purchase orders as easily as you can set up a program to download MP3 files from a web site.

Perl & LWP covers:

  • Understanding LWP and its design
  • Fetching and analyzing URLs
  • Extracting information from HTML using regular expressions and tokens
  • Working with the structure of HTML documents using trees
  • Setting and inspecting HTTP headers and response codes
  • Managing cookies
  • Accessing information that requires authentication
  • Extracting links
  • Cooperating with proxy caches
  • Writing web spiders (also known as robots) in a safe fashion

Perl & LWP includes many step-by-step examples that show how to apply the various techniques. Programs to extract information from the web sites of BBC News, AltaVista, ABEBooks.com, and the Weather Underground, to name just a few, are explained in detail, so that you understand how and why they work.

Perl programmers who want to automate and mine the web can pick up this book and be immediately productive. Written by a contributor to LWP, and with a foreword by one of LWP's creators, Perl & LWP is the authoritative guide to this powerful and popular toolkit.

Synopsis

This comprehensive guide to LWP and its applications comes with many practical examples. Topics include fetching Web pages, submitting forms, using various techniques for HTML parsing, handling cookies and authentication.


Table of Contents

Foreword; Preface; Audience for This Book; Structure of This Book; Order of Chapters; Important Standards Documents; Conventions Used in This Book; Comments and Questions; Acknowledgments

Chapter 1: Introduction to Web Automation; 1.1 The Web as Data Source; 1.2 History of LWP; 1.3 Installing LWP; 1.4 Words of Caution; 1.5 LWP in Action

Chapter 2: Web Basics; 2.1 URLs; 2.2 An HTTP Transaction; 2.3 LWP::Simple; 2.4 Fetching Documents Without LWP::Simple; 2.5 Example: AltaVista; 2.6 HTTP POST; 2.7 Example: Babelfish

Chapter 3: The LWP Class Model; 3.1 The Basic Classes; 3.2 Programming with LWP Classes; 3.3 Inside the do_GET and do_POST Functions; 3.4 User Agents; 3.5 HTTP::Response Objects; 3.6 LWP Classes: Behind the Scenes

Chapter 4: URLs; 4.1 Parsing URLs; 4.2 Relative URLs; 4.3 Converting Absolute URLs to Relative; 4.4 Converting Relative URLs to Absolute

Chapter 5: Forms; 5.1 Elements of an HTML Form; 5.2 LWP and GET Requests; 5.3 Automating Form Analysis; 5.4 Idiosyncrasies of HTML Forms; 5.5 POST Example: License Plates; 5.6 POST Example: ABEBooks.com; 5.7 File Uploads; 5.8 Limits on Forms

Chapter 6: Simple HTML Processing with Regular Expressions; 6.1 Automating Data Extraction; 6.2 Regular Expression Techniques; 6.3 Troubleshooting; 6.4 When Regular Expressions Aren't Enough; 6.5 Example: Extracting Links from a Bookmark File; 6.6 Example: Extracting Links from Arbitrary HTML; 6.7 Example: Extracting Temperatures from Weather Underground

Chapter 7: HTML Processing with Tokens; 7.1 HTML as Tokens; 7.2 Basic HTML::TokeParser Use; 7.3 Individual Tokens; 7.4 Token Sequences; 7.5 More HTML::TokeParser Methods; 7.6 Using Extracted Text

Chapter 8: Tokenizing Walkthrough; 8.1 The Problem; 8.2 Getting the Data; 8.3 Inspecting the HTML; 8.4 First Code; 8.5 Narrowing In; 8.6 Rewrite for Features; 8.7 Alternatives

Chapter 9: HTML Processing with Trees; 9.1 Introduction to Trees; 9.2 HTML::TreeBuilder; 9.3 Processing; 9.4 Example: BBC News; 9.5 Example: Fresh Air

Chapter 10: Modifying HTML with Trees; 10.1 Changing Attributes; 10.2 Deleting Images; 10.3 Detaching and Reattaching; 10.4 Attaching in Another Tree; 10.5 Creating New Elements

Chapter 11: Cookies, Authentication, and Advanced Requests; 11.1 Cookies; 11.2 Adding Extra Request Header Lines; 11.3 Authentication; 11.4 An HTTP Authentication Example: The Unicode Mailing Archive

Chapter 12: Spiders; 12.1 Types of Web-Querying Programs; 12.2 A User Agent for Robots; 12.3 Example: A Link-Checking Spider; 12.4 Ideas for Further Expansion

Appendix A: LWP Modules

Appendix B: HTTP Status Codes; B.1 100s: Informational; B.2 200s: Successful; B.3 300s: Redirection; B.4 400s: Client Errors; B.5 500s: Server Errors

Appendix C: Common MIME Types

Appendix D: Language Tags

Appendix E: Common Content Encodings

Appendix F: ASCII Table

Appendix G: User's View of Object-Oriented Modules; G.1 A User's View of Object-Oriented Modules; G.2 Modules and Their Functional Interfaces; G.3 Modules with Object-Oriented Interfaces; G.4 What Can You Do with Objects?; G.5 What's in an Object?; G.6 What Is an Object Value?; G.7 So Why Do Some Modules Use Objects?; G.8 The Gory Details

Colophon






Product Details

ISBN: 9780596001780
Binding: Trade Paperback
Publication date: 06/30/2002
Publisher: O'Reilly Media
Language: English
Pages: 262
Height: .68 in.
Width: 7.08 in.
Thickness: .68 in.
LCCN: 2002071540
Number of Units: 1
Illustration: Yes
Copyright Year: 2002
UPC Code: 2800596001782
Author: Sean M. Burke
Subject: World Wide Web; General-General; Application software; Perl (computer program language)


More copies of this ISBN

  • New, Trade Paperback, $39.99