Association-based similarity testing and its applications

Tao Li, Mitsunori Ogihara, Shenghuo Zhu

Research output: Contribution to journal > Article > peer-review

8 Scopus citations


This paper proposes a new association-based similarity measure between basket datasets. The measure is calculated from support counts using a formula inspired by information entropy. Experiments on both real and synthetic datasets show the effectiveness of the measure. The paper then investigates applications of the similarity measure. It first studies the problem of finding a mapping between categorical database attribute sets using similarity measures and proposes a generic approach for identifying such a mapping. The approach is implemented using the proposed similarity measure, and its performance has been evaluated and validated. The paper also explores the use of the similarity measure for mining distributed datasets.
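
As a rough illustration of the ingredients the abstract names (support counts over itemsets in basket datasets, combined through an entropy-style formula), the following is a minimal sketch. It is not the measure from the paper, whose formula the abstract does not state. The function names `support_counts` and `toy_similarity`, the `max_size` cutoff, and the use of a Jensen-Shannon divergence between relative supports are all assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations
import math


def support_counts(baskets, max_size=2):
    """Count how many baskets contain each itemset of up to max_size items."""
    counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # ignore duplicates within one basket
        for k in range(1, max_size + 1):
            for itemset in combinations(items, k):
                counts[itemset] += 1
    return counts


def binary_entropy(p):
    """Entropy (in bits) of a Bernoulli variable with success probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def toy_similarity(baskets_a, baskets_b, max_size=2):
    """Hypothetical stand-in score in [0, 1]; NOT the paper's formula.

    For each itemset seen in either dataset, compare its relative support
    in A and in B with the Jensen-Shannon divergence of the two Bernoulli
    distributions, then return 1 minus the mean divergence.
    """
    if not baskets_a or not baskets_b:
        raise ValueError("both datasets must contain at least one basket")
    counts_a = support_counts(baskets_a, max_size)
    counts_b = support_counts(baskets_b, max_size)
    itemsets = set(counts_a) | set(counts_b)
    if not itemsets:
        return 1.0  # only empty baskets on both sides
    total = 0.0
    for itemset in itemsets:
        p = counts_a[itemset] / len(baskets_a)
        q = counts_b[itemset] / len(baskets_b)
        mid = (p + q) / 2.0
        total += binary_entropy(mid) - (binary_entropy(p) + binary_entropy(q)) / 2.0
    return 1.0 - total / len(itemsets)
```

Under these assumptions, two identical datasets score 1.0 and two datasets with no items in common score 0.0; intermediate values depend on how far the relative supports of shared itemsets diverge.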

Original language: English (US)
Pages (from-to): 209-232
Number of pages: 24
Journal: Intelligent Data Analysis
Issue number: 3
State: Published - 2003
Externally published: Yes

Keywords

  • association
  • distributed data mining
  • heterogeneous
  • maximal frequent itemset
  • similarity measure

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence

