Fast Cluster-learning with Prior Probability from Big Dataset

Date Issued
2018
Author(s)
Li, Tengyue
Fong, Simon
Lobo Marques, Joao Alexandre 
Faculty of Business and Law 
Wong, Raymond K.
DOI
10.1109/ISCMI.2018.8703219
Abstract
Association rule mining with the Apriori method has been a popular data mining technique for decades; it harvests knowledge in the form of item-association rules from a dataset. The quality of these rules nevertheless depends on how concentrated the frequent items are in the input dataset. When the dataset becomes large, the items are scattered far apart. Previous literature has shown that clustering helps produce data groups in which frequent items are concentrated. Among all the clusters generated by a clustering algorithm, there must be one or more that contain suitable, frequent items. In turn, the association rules mined from such clusters would be assured of better quality, in terms of high confidence, than those mined from the whole dataset. However, it is not known in advance which cluster is the suitable one until association rule mining has been tried on all of them, and testing every cluster by brute force is time-consuming. In this paper, a statistical property called prior probability is investigated as a means of selecting the best of the many clusters produced by a clustering algorithm, as a pre-processing step before association rule mining. Experimental results indicate that there is a correlation between the prior probability of the best cluster and the relatively high quality of the association rules generated from that cluster. The results are significant because they make it possible to know which cluster is best used for association rule mining, instead of testing them all exhaustively.
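The idea in the abstract can be illustrated with a small sketch. The abstract does not say how the prior probability of a cluster is computed, so the code below assumes it is simply the fraction of transactions assigned to that cluster. The toy transactions, the hand-assigned cluster labels, and the function names `rule_confidence` and `cluster_priors` are all hypothetical, and the data is contrived so that the highest-prior cluster also gives the highest rule confidence. It is not the authors' method and does not reproduce their experiments.

```python
from collections import Counter


def rule_confidence(transactions, antecedent, consequent):
    """Confidence of antecedent -> consequent: count(A and C) / count(A)."""
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    n_a = sum(1 for t in transactions if a <= t)
    n_both = sum(1 for t in transactions if both <= t)
    return n_both / n_a if n_a else 0.0


def cluster_priors(labels):
    """ASSUMPTION: prior of a cluster = share of transactions assigned to it."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}


# Hypothetical toy data; the labels stand in for the output of a clustering step.
transactions = [frozenset(t) for t in [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "tea"},
    {"milk", "tea"},
]]
labels = [0, 0, 0, 0, 1, 1]

priors = cluster_priors(labels)

# Confidence of {bread} -> {butter} on the whole dataset and inside each cluster.
whole_conf = rule_confidence(transactions, {"bread"}, {"butter"})
per_cluster_conf = {
    c: rule_confidence(
        [t for t, l in zip(transactions, labels) if l == c],
        {"bread"},
        {"butter"},
    )
    for c in priors
}

# Pick the cluster by prior alone, without mining every cluster first.
best_by_prior = max(priors, key=priors.get)

print("priors:", priors)
print("confidence on whole dataset:", whole_conf)
print("confidence per cluster:", per_cluster_conf)
print("cluster chosen by prior:", best_by_prior)
```

In this toy case the rule is more confident inside the cluster chosen by prior than on the whole dataset. Whether that relationship holds in general is the question the paper examines experimentally.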
Subjects

Data mining

Big Data

Association Rule Mini...

Clustering

Clustering algorithms...

Itemsets

Preprocessing

Prior Probability

Probabilistic logic

Probability distribut...

File(s)
Name

Waiting for Repository Version.pdf

Size

37.66 KB

Format

Adobe PDF

Checksum

(MD5):70439f9ac5a8bde2f366653765cefe3c

