Web Communities: Analysis and Construction

by Yanchun Zhang, Jeffrey Xu Yu, and Jingyu Hou

ISBN13: 9783642066115
ISBN10: 3642066119

$65.95

New Trade Paperback

Available at a Remote Warehouse. Ships separately from other items. Additional shipping charges may apply. Not available for In Store Pickup.

Qty	Store
20	Remote Warehouse

Synopses & Reviews

Publisher Comments

Due to the lack of a uniform schema for Web documents and the sheer amount and dynamics of Web data, both the effectiveness and the efficiency of information management and retrieval of Web data are often unsatisfactory when using conventional data management techniques. A Web community, defined as a set of Web-based documents with its own logical structure, is a flexible and efficient approach to supporting information retrieval and implementing various applications. Zhang and his co-authors explain how to construct and analyse Web communities based on information such as Web document contents, hyperlinks, or user access logs. Their approaches combine results from Web search algorithms, Web clustering methods, and Web usage mining. They also detail the preliminaries needed to understand the algorithms presented, and they discuss several successful existing applications. Researchers and students in information retrieval and Web search will find in this book all the necessary basics and methods to create and understand Web communities. Professionals developing Web applications will additionally benefit from the samples presented for their own designs and implementations.

Review

The book can be used by applied mathematicians, search industry professionals, and anyone who wants to learn more about how search engines work. I recommend it for any course on Web information retrieval. I firmly believe that this book and the book by Langville and Meyer are the top two books about the algorithmic aspects of modern search engines. (Yannis Manolopoulos, Aristotle University, Thessaloniki, Greece in ACM REVIEWS)

Synopsis

The lack of uniformity in Web documents and the sheer volume of Web data make conventional information retrieval techniques inadequate for managing Web data. Enter the Web community: essentially a set of Web-based documents with a consistent logical structure. The authors explain how to construct and analyze Web communities, introduce and explain the necessary algorithms, and examine successful existing applications. This book provides researchers and students in information retrieval and Web search with all the necessary basics and methods to create and understand Web communities, and to adapt the samples provided to new designs and implementations.

About the Author

Dr. Yanchun Zhang is Associate Professor and the Head of Computing Discipline in the Department of Mathematics and Computing at the University of Southern Queensland. He obtained his PhD degree in Computer Science from the University of Queensland in 1991. His research areas cover databases, electronic commerce, internet/web information systems, web data management, web search and web services. He has published over 100 research papers on these topics in international journals and conference proceedings, and edited over 10 books/proceedings and journal special issues. He is a co-founder and Co-Editor-In-Chief of World Wide Web: Internet and Web Information Systems and Co-Chairman of the International Web Information Systems Engineering Society.

Dr. Jeffrey Xu Yu received his B.E., M.E. and Ph.D. in computer science from the University of Tsukuba, Japan, in 1985, 1987 and 1990, respectively. He was a faculty member in the Institute of Information Sciences and Electronics, University of Tsukuba, Japan, and a Lecturer in the Department of Computer Science, The Australian National University. Currently, he is an Associate Professor in the Department of Systems Engineering and Engineering Management, the Chinese University of Hong Kong. His research areas cover databases, data warehousing and data mining. He has published over 100 research papers on these topics in international journals and conference proceedings. He is a member of ACM and a society affiliate of IEEE Computer Society.

Dr. Jingyu Hou received his BSc in Computational Mathematics from Shanghai University of Science and Technology (1985) and his PhD in Computational Mathematics from Shanghai University (1995). He has also completed a PhD in Computer Science in the Department of Mathematics and Computing at The University of Southern Queensland, Australia. He is now a Lecturer in the School of Information Technology at Deakin University, Australia. His research interests include Web-Based Data Management and Information Retrieval, Web Databases, Internet Computing and Electronic Commerce, and Semi-Structured Data Models. He has published extensively in the areas of Web information retrieval and Web Communities.

Chapter 1: Introduction (10 pages)

-- Web Search

-- Information Filtering

-- Web Community

Chapter 2: Preliminaries (30 pages)

-- Statistics

-- Similarity

-- Markov Model

-- Matrix Expression of Hyperlinks

-- Eigenvector, Principal Eigenvector, Secondary Eigenvector

-- Singular Value Decomposition (SVD) of Matrix

-- Graph Theory Basis (Random walk)

Chapter 3: HITS and Related Algorithms (50 pages)

-- The Original HITS

-- The Stability Issues

-- The Randomized HITS

-- The Subspace HITS

-- Weighted HITS

-- Vector Space Model (VSM)

-- Cover Density Ranking (CDR)

-- The In-depth Analysis of the HITS

-- HITS Improvement (a significant improvement to the Clever algorithm)

-- Noise Page Elimination Algorithm Based on SVD

-- The PHITS algorithm (probabilistic HITS)

-- SALSA (Stochastic algorithm)

-- Random Walks and the Kleinberg Algorithm

Chapter 4: PageRank Related Algorithms (50 pages)

-- The Original PageRank

-- Probability Combination of Link and Content Information in PageRank

-- Topic-Sensitive PageRank

-- Search-Order: Breadth-First, Backlink, Random

-- Quadratic Extrapolation

-- Exploiting the Block Structure of the Web for Computing PageRank

-- Second Eigenvalue of the Google Matrix

-- A Latent Linkage Information (LLI) Algorithm

-- Web Page Scoring Systems (WPSS)

-- Rank Aggregation

-- Random Surfer Method

-- Voting Model

-- SimRank (graph-based)

-- When Experts Agree: Using Non-Affiliated Experts to Rank Popular Topics

-- PageRank, HITS and a Unified Framework for Link Analysis

Chapter 5: Web Classification and Clustering (50 pages)

-- Web Document Similarity Measurement

-- Web Document Classification Based on Hyperlinks and Document Semantics

-- Clustering Hypertext with Applications to Web Search

-- Link-based Clustering to Improve Web Search Results

-- Measure Similarity of Interest for Clustering Web-Users

-- Clustering of Web Users Using Session-based Similarity Measures

-- Scalable Techniques for Clustering the Web

-- Clustering Web Surfers with Mixtures of Hidden Markov Models

-- Clustering User Queries of a Search Engine

-- Using Web Structure for Classifying and Describing Web Pages

-- Matrix-Based Hierarchical Clustering Algorithms

Chapter 6: Web Log/Content Mining for Web Community (50 pages)

-- Cut-and-Pick Transactions for Proxy Log Mining

-- Mining Web Logs to Improve Website Organization

-- Extracting Large-Scale Knowledge Bases from the Web

-- Mining the Space of Graph Properties

-- Discovering Test Set Regularities in Relational Domains (classification)

-- Enhanced Hypertext Categorization Using Hyperlinks

-- The Structure of Broad Topics on the Web

-- Discovering Unexpected Information from Your Competitors' Web Sites

-- On Integrating Catalogs

-- Web Community Mining and Web Log Mining: Commodity Cluster Based Execution

-- Building Cybercommunity Hierarchy

-- Automatic Topic Identification Using Webpage Clustering

-- A Special Method to Separate Disconnected and Nearly-Disconnected Web Graph Components

Chapter 7: Web Community Applications (50 pages)

-- Link Structure Analysis for Finding Authoritative Images

-- Integrating DOM with Hyperlinks for Enhanced Topic Distillation and Information Extraction

-- What is the Page Known for? Computing Web Page Reputations

-- Retrieving and Organizing Web Pages by Information Unit

-- Using Information Scent to Model User Information Needs and Actions on the Web

-- Designing Personalized Web Applications

-- Constructing Multi-Granular and Topic-Focused Web Site Map Based on Users' Interests

-- Guided Tours on the Web

-- Scaling Personalized Web Search

-- Computing Geographical Scopes of Web Resources

-- InCommonSense - Rethinking Web Search Results

-- Persona: A Contextualized and Personalized Web Search

-- Focused Web Crawling: A Generic Framework for Specifying the User Interest and for Adaptive Crawling Strategies

-- Focused Crawling Using Context Graphs

-- Intelligent Crawling on the Web with Arbitrary Predicates

Chapter 8: Concluding Remarks (10 pages)

-- Summary

-- Future Research

Product Details

ISBN: 9783642066115
Binding: Trade Paperback
Publication date: 02/12/2010
Publisher: Springer
Language: English
Pages: 187
Height: 0.42 in
Width: 6.14 in
Number of Units: 1
Author: Jeffrey Xu Yu
Author: Yanchun Zhang
Author: Jingyu Hou
Subject: Language, literature and biography
Subject: e-Commerce/e-business
Subject: Information Systems Applications (incl. Internet)
Subject: Web usage mining
Subject: Internet-Information
Subject: Web Clustering
Subject: Web Data Management
Subject: Web Search
Subject: Information technology
Subject: User Interfaces and Human Computer Interaction
Subject: Information storage and retrieval