
Hiding Sensitive XML Association Rules via Bayesian Network


Abstract: Privacy Preserving Data Mining (PPDM) is attracting attention from researchers in several domains, especially in Association Rule Mining. The purpose of hiding association rules is to minimize the risk of disclosing shared information to external parties. In this paper, we propose a PPDM model for XML Association Rules (XARs). The proposed model identifies the most probable item, termed the sensitive item, so that the original data source can be modified with greater accuracy and reliability. Such reliability has not been addressed before in the PPDM literature under any methodology, and especially not in XML association rule mining. The proposed model therefore opens a new direction for research on controlling sensitive information more rigorously.
Keywords: XARs, PPDM, K2 algorithm, Bayesian Network, Association Rules
I. INTRODUCTION
In data mining, trends and patterns are identified in a large set of data to discover knowledge. A variety of techniques exist for extracting this knowledge, such as clustering, classification and association rule mining. Association rule mining is one such technique for deriving knowledge from complex data. Discovered association rules are usually required to meet a minimum support s% and a minimum confidence c% over the transactions in a database D. A rule takes the form of an implication A → B, where A is the antecedent and B is the consequent. The problem with presenting rules in this way is that sensitive information may be disclosed to external parties when the data is shared. Hence Privacy Preserving Data Mining (PPDM) for association rules has emerged.
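The support and confidence thresholds mentioned above can be illustrated with a minimal sketch. The function names `support` and `confidence` and the toy database `D` are assumptions made for illustration; they do not come from the paper. The sketch uses the standard definitions: support is the fraction of transactions containing an itemset, and the confidence of a rule with antecedent A and consequent B is the support of A and B together divided by the support of A.

```python
from typing import Iterable, List, FrozenSet


def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(
    transactions: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> float:
    """support(A and B) / support(A); 0.0 if the antecedent never occurs."""
    a = frozenset(antecedent)
    ab = a | frozenset(consequent)
    sup_a = support(transactions, a)
    return support(transactions, ab) / sup_a if sup_a > 0 else 0.0


# Hypothetical toy database D with four transactions.
D = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

rule_support = support(D, {"bread", "milk"})          # 2 of 4 transactions
rule_confidence = confidence(D, {"bread"}, {"milk"})  # 2 of the 3 with bread
```

A rule would be reported only if `rule_support` and `rule_confidence` both meet the chosen thresholds s% and c%.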
In PPDM, sensitive information is controlled by identifying sensitive item(s) or sensitive rules. The question is how to select or identify the sensitive item(s). Various methodologies have been proposed in the literature, such as those in [2, 3, 4, 5, 6]. Some of these techniques avoid generating sensitive rules by working on the antecedent and consequent, while others identify the sensitive item(s) on the basis of assumptions. This raises a further question: is the assumed or identified item estimated reliably enough to justify modifying the original data source? All these questions need to be answered to obtain more accurate results for this NP-hard problem [7].
To answer these questions in a reliable manner, we use a Bayesian network [8, 10]. A further difficulty is that a great deal of work has already been proposed in most areas, especially in rule mining. We address this by turning to XML, which is used for the interchange of information over the web and may disclose information in the form of association rules. XML association rules can be found in the literature, as in [1], but the security of XML Association Rules (XARs) has been ignored....
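To make the idea of modifying the data source concrete, the following is a minimal sketch of a generic item-suppression heuristic. It is not the Bayesian-network model proposed in this paper: the function `hide_rule`, its parameters, and the toy data are all hypothetical, and the sensitive item (`victim`) is simply passed in rather than identified by any model. The sketch removes the sensitive item from transactions that support the rule until the rule's confidence falls below the threshold.

```python
from typing import Iterable, List, Set


def hide_rule(
    transactions: Iterable[Iterable[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
    victim: str,
    min_conf: float,
) -> List[Set[str]]:
    """Return a sanitized copy in which the rule's confidence is below min_conf.

    `victim` is the (assumed already identified) sensitive item; it must
    belong to the rule's antecedent or consequent.
    """
    db = [set(t) for t in transactions]  # work on a copy, not the original
    a = set(antecedent)
    ab = a | set(consequent)
    if victim not in ab:
        raise ValueError("victim must be an item of the rule")

    def rule_confidence() -> float:
        n_a = sum(1 for t in db if a <= t)
        n_ab = sum(1 for t in db if ab <= t)
        return n_ab / n_a if n_a else 0.0

    for t in db:
        if rule_confidence() < min_conf:
            break  # rule is already hidden; stop modifying data
        if ab <= t:
            t.discard(victim)  # this transaction no longer supports the rule
    return db


# Hypothetical toy data: the rule bread -> milk has confidence 2/3 here.
original = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
sanitized = hide_rule(original, {"bread"}, {"milk"}, victim="milk", min_conf=0.5)
```

In this toy run only the first transaction is altered, which lowers the confidence of the rule to 1/3. How many transactions are touched depends entirely on which item is chosen as the victim, which is why the reliability of that choice matters.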
