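Details
Chinese title: Web数据挖掘:挖掘Web内容模式、结构和用途 (Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage)
Authors: Zdravko Markov, Daniel T. Larose
Book category: Networking
Resource format: PDF
Version: text edition
Publisher: Wiley Blackwell
ISBN: 0471666556
Publication date: April 1, 2007
Region: United States
Language: English
Summary:
Description: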
This book introduces the reader to methods of data mining on the web, including uncovering patterns in web content (classification, clustering, language processing), structure (graphs, hubs, metrics), and usage (modeling, sequence analysis, performance).
Table of contents:
PREFACE
PART I: WEB STRUCTURE MINING
1 INFORMATION RETRIEVAL AND WEB SEARCH
Web Challenges
Web Search Engines
Topic Directories
Semantic Web
Crawling the Web
Web Basics
Web Crawlers
Indexing and Keyword Search
Document Representation
Implementation Considerations
Relevance Ranking
Advanced Text Search
Using the HTML Structure in Keyword Search
Evaluating Search Quality
Similarity Search
Cosine Similarity
Jaccard Similarity
Document Resemblance
References
Exercises
2 HYPERLINK-BASED RANKING
Introduction
Social Networks Analysis
PageRank
Authorities and Hubs
Link-Based Similarity Search
Enhanced Techniques for Page Ranking
References
Exercises
PART II: WEB CONTENT MINING
3 CLUSTERING
Introduction
Hierarchical Agglomerative Clustering
k-Means Clustering
Probability-Based Clustering
Finite Mixture Problem
Classification Problem
Clustering Problem
Collaborative Filtering (Recommender Systems)
References
Exercises
4 EVALUATING CLUSTERING
Approaches to Evaluating Clustering
Similarity-Based Criterion Functions
Probabilistic Criterion Functions
MDL-Based Model and Feature Evaluation
Minimum Description Length Principle
MDL-Based Model Evaluation
Feature Selection
Classes-to-Clusters Evaluation
Precision, Recall, and F-Measure
Entropy
References
Exercises
5 CLASSIFICATION
General Setting and Evaluation Techniques
Nearest-Neighbor Algorithm
Feature Selection
Naive Bayes Algorithm
Numerical Approaches
Relational Learning
References
Exercises
PART III: WEB USAGE MINING
6 INTRODUCTION TO WEB USAGE MINING
Definition of Web Usage Mining
Cross-Industry Standard Process for Data Mining
Clickstream Analysis
Web Server Log Files
Remote Host Field
Date/Time Field
HTTP Request Field
Status Code Field
Transfer Volume (Bytes) Field
Common Log Format
Identification Field
Authuser Field
Extended Common Log Format
Referrer Field
User Agent Field
Example of a Web Log Record
Microsoft IIS Log Format
Auxiliary Information
References
Exercises
7 PREPROCESSING FOR WEB USAGE MINING
Need for Preprocessing the Data
Data Cleaning and Filtering
Page Extension Exploration and Filtering
De-Spidering the Web Log File
User Identification
Session Identification
Path Completion
Directories and the Basket Transformation
Further Data Preprocessing Steps
References
Exercises
8 EXPLORATORY DATA ANALYSIS FOR WEB USAGE MINING
Introduction
Number of Visit Actions
Session Duration
Relationship between Visit Actions and Session Duration
Average Time per Page
Duration for Individual Pages
References
Exercises
9 MODELING FOR WEB USAGE MINING: CLUSTERING, ASSOCIATION, AND CLASSIFICATION
Introduction
Modeling Methodology
Definition of Clustering
The BIRCH Clustering Algorithm
Affinity Analysis and the A Priori Algorithm
Discretizing the Numerical Variables: Binning
Applying the A Priori Algorithm to the CCSU Web Log Data
Classification and Regression Trees
The C4.5 Algorithm
References
Exercises
INDEX
![Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage, English PDF edition [7M]](http://img.jbzj.com/do/uploads/litimg/120924/144P11BK7.jpg)