IJAECS

International Journal of Advances in Electronics and Computer Science ( IJAECS )
A highly rated peer reviewed monthly International Journal

Editor-in-Chief	:	Dr. P. Suresh
Current Issue	:	Volume-11, Issue-2 (Feb, 2024)
Journal Impact Factor	:	2.68

Journal Info

Publisher: IRAJ

ISSN (p): 2394-2835

Issues/Year: 12

Paper Detail

Paper Title
Parallel Implementation Of K-Means Algorithm Using Hadoop

Abstract
Clustering is regarded as one of the most important tasks in data mining; it deals primarily with grouping similar data. Clustering large datasets is a point of concern. In recent years, data clustering has been studied extensively, and many methods and theories have been developed. Hadoop is a software framework for the distributed processing of vast amounts of data across clusters of computers using the Map-Reduce programming model. The Map-Reduce computing model has two phases: a map phase and a reduce phase. The map phase calculates the distance between each point and each cluster center and assigns each point to its nearest cluster. All the points that belong to the same cluster are sent to a single reduce task. The reduce phase calculates the new cluster centers for the next Map-Reduce job. Map-Reduce thus offers a form of parallelization for problems that involve large datasets on computing clusters, which makes it a compelling fit for clustering large datasets. This paper studies the parallel implementation of the K-Means clustering algorithm using the Map-Reduce computing model of Hadoop on different datasets.

Keywords: Data Mining, Data Clustering, Parallel Computing, Map-Reduce, K-Means algorithm, Hadoop, HDFS, Machine Learning.
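The map and reduce phases described in the abstract can be illustrated with a minimal single-process sketch. This is not the authors' Hadoop implementation: it only simulates one Map-Reduce job per K-Means iteration in plain Python. The function names (`map_phase`, `reduce_phase`, `kmeans_iteration`, `kmeans`), the use of Euclidean distance, the rule that an empty cluster keeps its old center, and the stopping condition are all assumptions made for illustration.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def map_phase(points, centers):
    """Emit (index of nearest center, point) pairs, one per input point."""
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
        yield nearest, p


def reduce_phase(members):
    """Average all points assigned to one cluster to get its new center."""
    n = len(members)
    return tuple(sum(coords) / n for coords in zip(*members))


def kmeans_iteration(points, centers):
    """Simulate one Map-Reduce job: map, group by key, then reduce."""
    groups = defaultdict(list)
    for cluster_id, p in map_phase(points, centers):
        groups[cluster_id].append(p)  # stands in for the shuffle step
    # Assumption: a cluster that received no points keeps its old center.
    return [
        reduce_phase(groups[i]) if groups[i] else centers[i]
        for i in range(len(centers))
    ]


def kmeans(points, centers, max_iter=20):
    """Chain jobs until the centers stop changing or max_iter is reached."""
    for _ in range(max_iter):
        new_centers = kmeans_iteration(points, centers)
        if new_centers == centers:
            break
        centers = new_centers
    return centers


if __name__ == "__main__":
    data = [(0, 0), (0, 2), (10, 10), (10, 12)]
    print(kmeans(data, [(0, 0), (10, 10)]))
```

In an actual Hadoop job, the grouping done here with a dictionary would be performed by the framework's shuffle between mappers and reducers, and the updated centers would be passed to the next job.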

Author - Jerril Mathson Mathew, Jyothis Joseph