Abstract
Numerous applications of topical interest call for knowledge discovery and classification from information that may be inaccurate and/or incomplete. For example, in an airport threat classification scenario, data from heterogeneous sensors are used to extract features for classifying potential threats. This requires a training set that utilizes non-traditional information sources (e.g., domain experts) to assign a threat level to each training set instance. Sensor reliability, accuracy, noise, etc., all contribute to feature level ambiguities; conflicting expert opinions generate class label ambiguities that may nevertheless carry important clues. To accommodate these ambiguities, a belief-theoretic approach is proposed. It utilizes a data structure that facilitates belief/plausibility queries regarding "ambiguous" itemsets. An efficient Apriori-like algorithm is then developed to extract such frequent itemsets and to generate the corresponding association rules. These rules are then used to classify an incoming "ambiguous" data instance into a class label (which may be "hard" or "soft"). To test its performance, the proposed algorithm is compared with C4.5 on several databases from the UCI repository and on a threat assessment application scenario.
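To make the belief/plausibility queries mentioned in the abstract concrete, the following is a minimal sketch of the standard Dempster-Shafer definitions, not the data structure or algorithm of the paper. The function names, the threat labels, and the mass values are hypothetical and chosen only for illustration. Belief in a set sums the mass of every focal element contained in it; plausibility sums the mass of every focal element that intersects it.

```python
from typing import Dict, FrozenSet

# A mass function maps focal elements (sets of class labels) to masses
# that sum to 1. This representation is an assumption for illustration.
Mass = Dict[FrozenSet[str], float]


def belief(mass: Mass, query: FrozenSet[str]) -> float:
    """Bel(A): total mass of non-empty focal elements that are subsets of A."""
    return sum(m for focal, m in mass.items() if focal and focal <= query)


def plausibility(mass: Mass, query: FrozenSet[str]) -> float:
    """Pl(A): total mass of focal elements that intersect A."""
    return sum(m for focal, m in mass.items() if focal & query)


# Hypothetical "soft" class label for one training instance: experts
# agree the threat is not low, but disagree between medium and high.
mass: Mass = {
    frozenset({"high"}): 0.5,                   # committed to "high"
    frozenset({"medium", "high"}): 0.25,        # ambiguous label
    frozenset({"low", "medium", "high"}): 0.25, # complete ignorance
}

query = frozenset({"medium", "high"})
print(belief(mass, query))        # 0.75
print(plausibility(mass, query))  # 1.0
```

The interval between belief and plausibility (here 0.75 to 1.0) expresses how much of the evidence remains uncommitted with respect to the queried set, which is what lets an ambiguous label be kept rather than forced into a single class.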
Original language | English (US) |
---|---|
Article number | 13 |
Pages (from-to) | 98-107 |
Number of pages | 10 |
Journal | Proceedings of SPIE - The International Society for Optical Engineering |
Volume | 5803 |
State | Published - 2005 |
Event | Intelligent Computing: Theory and Applications III - Orlando, FL, United States; Duration: Mar 28 2005 → Mar 29 2005 |
Keywords
- Association rules
- Classification
- Data ambiguities
- Data mining
- Dempster-Shafer belief theory
- Imperfect data
- Missing data
ASJC Scopus subject areas
- Electronic, Optical and Magnetic Materials
- Condensed Matter Physics
- Computer Science Applications
- Applied Mathematics
- Electrical and Electronic Engineering