Implementation of A Machine Learning Technique: Store and Process Big Data in Distributed Environment over the Array of Computers

Nandini Madineni; Ch. Sivasankar

Abstract
Machine learning (ML) is a branch of computer science concerned with programming systems so that they learn automatically and improve with experience. Clustering is one of several machine learning methods; it arranges the items of a given collection into groups based on the similarity between them. For example, online news applications use clustering to group related articles. We live in an era in which information is available in abundance from sources such as the internet, intranets, and the web. The information load has grown to the point that it is sometimes difficult to manage even our own small mailboxes, let alone to estimate the volume of data and records that popular websites must keep up to date. The same holds for lesser-known websites that receive and maintain bulk information. Analyzing such large volumes of data across multiple networked computers normally relies on classical mining algorithms to identify trends and draw conclusions. However, traditional machine learning techniques implemented on a legacy, single-system framework are productive only for limited datasets; they cannot deliver results quickly on large data unless the computational tasks are distributed across many machines in a cluster of commodity computers. We propose an algorithm, tested on very large data, that is processed with a framework called Mahout, which allows us to break a computation task into multiple segments and run each segment on a different machine. The experimental results show that system performance is not degraded as the number of records increases and that the approach gives good cluster quality.
Key Words: Machine Learning, Clustering, K-Means, Hadoop, Mahout
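
The abstract describes breaking a clustering computation into segments that run on different machines. The following is a minimal single-process sketch of how one K-Means iteration can be decomposed in that way, assuming the data is already split into partitions. It is not the authors' algorithm and does not use Mahout's or Hadoop's API; the function names (assign_partition, merge_partials, kmeans) are hypothetical and chosen only for illustration.

```python
from math import dist


def assign_partition(points, centroids):
    """Map-style step: process one data partition independently.

    Returns, for each centroid index, the coordinate-wise sum and the
    count of the points in this partition that are nearest to it.
    """
    dims = len(centroids[0])
    partial = {i: ([0.0] * dims, 0) for i in range(len(centroids))}
    for p in points:
        # Index of the closest centroid by Euclidean distance.
        nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        sums, count = partial[nearest]
        partial[nearest] = ([s + x for s, x in zip(sums, p)], count + 1)
    return partial


def merge_partials(partials, centroids):
    """Reduce-style step: combine per-partition sums into new centroids."""
    new_centroids = []
    for i, old in enumerate(centroids):
        total = [0.0] * len(old)
        count = 0
        for partial in partials:
            sums, n = partial[i]
            total = [t + s for t, s in zip(total, sums)]
            count += n
        # Keep the old centroid if no point was assigned to it.
        new_centroids.append(tuple(t / count for t in total) if count else tuple(old))
    return new_centroids


def kmeans(partitions, centroids, iterations=10):
    """Run a fixed number of iterations over pre-partitioned data."""
    for _ in range(iterations):
        # In a real cluster each call below would run on a different machine;
        # here the partitions are simply processed one after another.
        partials = [assign_partition(part, centroids) for part in partitions]
        centroids = merge_partials(partials, centroids)
    return centroids
```

Because each partition contributes only sums and counts, the merged centroids do not depend on how the points are divided among partitions. This property is what allows the assignment step to be spread over many machines.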

Author Information

Nandini Madineni

Issue No

11

Volume No

3

Issue Publish Date

05 Nov 2017

Issue Pages

121-126

