Association rules analysis for objects hierarchy

Association rules are one of the most popular methods of data mining. The technique makes it possible to discover interesting dependencies between objects. This thesis concerns association rules for a hierarchy of objects. The DBLP database, which contains bibliographic descriptions of scientific papers, conferences, and journals in computer science, is used as the multi-level structure. The main goal of the thesis is to investigate interesting patterns of co-authorship with respect to different levels of the hierarchy. To reach this goal, a custom extraction method is proposed.

Contents

1 Introduction
1.1 The goals of the work
1.2 Aims and objectives
1.3 Research questions
1.4 The contents of the chapters
2 Data mining
2.1 Data mining methods
2.2 Data mining process
2.3 Data mining goals
3 Association rules
3.1 Introduction
3.2 Measures: support, confidence, lift
3.3 Types of association rules
3.3.1 Spatial – geographical association rules
3.3.2 Quantitative association rules
3.4 Algorithms
3.4.1 A-priori algorithm
3.4.2 Partition algorithm
3.4.3 Effective Hash-based algorithm – DHP
3.4.4 Algorithm with an informative rule set
4 Hierarchical association rules
4.1 Introduction
4.1.1 Association rules with uniform minimum support for all levels
4.1.2 Association rules with reduced minimum support at lower levels
4.1.3 Cross-level association rules
4.2 Algorithms for hierarchical association rules
4.2.1 Algorithm ML_T2L1
4.2.2 Algorithm for mining spatial association rules
5 DBLP database
5.1 Introduction
5.2 DBLP database profile
5.3 Structure of DBLP
5.4 Hierarchies in DBLP – introduction
6 Association rules analysis for DBLP
6.1 Plan of tests
6.1.1 Hierarchies in DBLP database
6.1.2 DBLP – grouping algorithm
6.1.3 Tests for DBLP database
6.2 Description of test
6.2.1 Test environment
6.2.1.1 Tool for mining association rules
6.2.1.2 Data for mining association rules
6.2.2 Process of deriving data from DBLP database
6.2.2.1 Mapping data from XML DBLP database to SAS software
6.2.2.2 Cleaning
6.2.2.3 Distinguishing the journals and conferences
6.2.2.4 Merging tables together and cleaning process
6.2.2.5 SAS problems with association rules
6.2.2.6 Mining association rules using SAS enterprise manager
6.2.2.7 Presentation of the results
6.3 General characteristic of DBLP
6.3.1 Journals and conferences
6.3.1.1 Journals
6.3.1.2 Conferences
6.3.2 Journals and conferences in years
6.3.2.1 Journals in years
6.3.2.2 Conferences in years
6.3.3 Authors for journals and articles
6.3.3.1 Journals authors
6.3.3.2 Conference authors
6.3.3.3 Journal and conference authors
6.3.4 Articles and papers in years
6.3.4.1 Articles in years
6.3.4.2 Papers in years
6.4 Hierarchical analysis and discussion
6.4.1 Progression of co-authorship in years
6.4.2 Progression of co-authorship
6.4.3 Progression of certain journals and conferences in years
6.4.4 Dependences between years and names
6.4.5 Dependences between journals and conferences
6.4.6 Progression of journals and conferences
6.4.7 General conclusions of executed tests
7 Conclusions and future work
8 References
Appendix A Map File
Appendix B Abstract

Author: Przemyslaw Pietruszewski

Source: Blekinge Institute of Technology
