IJAECS

International Journal of Advances in Electronics and Computer Science ( IJAECS )
A highly rated peer reviewed monthly International Journal

Editor-in-Chief	:	Dr. P. Suresh
Current Issue	:	Volume-11, Issue-2 (Feb, 2024)
Journal Impact Factor	:	2.68

Journal Info

Publisher:IRAJ

ISSN (p): 2394-2835

Issues /Year :12

Paper Detail

Paper Title
An Enhancing XML Big Data Mining Approach on Spark System

Abstract
With the development of cloud computing, intelligent mobile applications, and the IoT, XML data has grown into large-volume data sets, since XML has emerged as a popular standard for data exchange among them. XML is a kind of semi-structured data and can be modeled as a tree. As data sharing becomes popular, XML features such as parent-child and ancestor-descendant relationships are widely used to share information in XML big data. Through these relationships, XML big data exhibits massive tree structures, which leaves mining on XML big data less constrained: users can query data in the tree-structured XML big data through multiple access paths. However, this makes it more difficult to mine frequent patterns. Therefore, how to improve the performance of finding frequent patterns in tree-structured XML big data has become an important issue. Several XML pattern mining studies have focused on improving XML mining performance. However, these studies model XML data as a tree and thus cannot improve the mining performance on big XML data. They also do not apply the inclusion-exclusion principle from combinatorial mathematics to reduce the mining time and I/O costs of generating candidate XML patterns. Thus, the mining performance on tree-structured XML big data cannot be enhanced effectively. In addition, existing studies do not design their algorithms to mine XML big data on a cloud computing framework, which hurts system performance. This research therefore proposes a new approach to mine XML frequent patterns effectively on the Spark system. Based on Spark, higher mining and query performance can be achieved for XML big data.

Index Terms - Cloud computing, XML frequent patterns, Spark, Hadoop, XML mining.

Author - Tsui-Ping Chang, Chih-Hung Chang, Mao-Lun Chiang
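
To make the tree model in the abstract concrete, the following is a minimal sketch of counting frequent parent-child patterns across a few XML documents. It is not the authors' algorithm: the function names (`parent_child_pairs`, `frequent_pairs`), the sample documents, and the `min_support` threshold are all hypothetical, and the sketch runs on a single machine with the Python standard library rather than on Spark.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def parent_child_pairs(xml_text):
    """Return the set of (parent_tag, child_tag) pairs found in one XML document."""
    root = ET.fromstring(xml_text)
    pairs = set()
    # root.iter() walks every element; iterating an element yields its direct children.
    for parent in root.iter():
        for child in parent:
            pairs.add((parent.tag, child.tag))
    return pairs


def frequent_pairs(documents, min_support):
    """Keep the pairs that occur in at least min_support documents (hypothetical threshold)."""
    counts = Counter()
    for doc in documents:
        # Each document contributes at most 1 to a pair's support, since pairs are a set.
        counts.update(parent_child_pairs(doc))
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Illustrative documents only; they are not taken from the paper.
docs = [
    "<order><item><price/></item><customer/></order>",
    "<order><item><name/></item></order>",
    "<order><customer/></order>",
]
result = frequent_pairs(docs, min_support=2)
print(sorted(result.items()))
```

In a distributed setting, the per-document extraction would plausibly run as a map step and the counting as a reduce step, but how the proposed approach partitions the work on Spark is not described in the abstract.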