Monday 25 April 2016

Get value out of your data with Hadoop Today

What is Hadoop?

Hadoop is a free, open-source, Java-based programming framework that supports the processing of large data sets in a distributed computing environment. It is part of the Apache project and is sponsored by the Apache Software Foundation.

How does it give value to your data?

With Hadoop, you can get value from your data using big data technology. Big data means a huge collection of data sets, typically generated in sectors such as banking, railways, e-commerce, social media, the stock market, and manufacturing.

Big data technologies are important in providing more accurate analysis, which may lead to more concrete decision-making resulting in greater operational efficiencies, cost reductions, and reduced risks for the business.

To harness the power of big data, you need an infrastructure that can manage and process huge volumes of structured and unstructured data in real time and can protect data privacy and security.

Various vendors, including Amazon, IBM, and Microsoft, offer technologies for handling big data. These technologies fall into two classes: operational and analytical.

Operational big data systems, such as NoSQL databases, are designed to take advantage of the cloud computing architectures that have emerged over the past decade, so that large computations can run efficiently.

Analytical big data systems include Massively Parallel Processing (MPP) database systems and MapReduce, which provide analytical capabilities for retrospective and complex analysis that may touch most or all of the data. MapReduce provides a method of analysing data that is complementary to the capabilities provided by SQL, and a system based on MapReduce can be scaled from a single server to thousands of high-end and low-end machines.
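To make the MapReduce idea concrete, here is a minimal sketch of the model in plain Python. It does not use Hadoop itself: the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample input are illustrative assumptions, and everything runs in a single process, whereas a real cluster would spread the map and reduce work across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data needs big tools", "Hadoop processes big data"]
    print(word_count(sample))
```

Because each call to map_phase looks at only one line, and each call to reduce_phase looks at only one key, both steps can in principle be run on separate machines at the same time, which is what lets this style of processing scale out.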

[Figure: Hadoop basic components (miriinfotech.com - Big data solution)]


Hadoop Architecture

The Hadoop framework includes the following four modules:
· Hadoop Common: These are the Java libraries and utilities required by the other Hadoop modules. These libraries provide file system and OS-level abstractions and contain the Java files and scripts required to start Hadoop.
· Hadoop YARN: This is a framework for job scheduling and cluster resource management.
· Hadoop Distributed File System (HDFS): A distributed file system that provides high-throughput access to application data.
· Hadoop MapReduce: This is a YARN-based system for parallel processing of large data sets.
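As a rough illustration of how a distributed file system such as HDFS can offer high-throughput access, the sketch below splits a file into fixed-size blocks and assigns each block to several nodes, so that different blocks can be read from different machines in parallel. The 128 MiB block size and the replication factor of 3 are the usual configurable defaults in recent Hadoop releases. The function plan_blocks, the node names, and the round-robin placement are hypothetical simplifications and do not reproduce HDFS's actual placement policy.

```python
import math

BLOCK_SIZE = 128 * 1024 * 1024  # 128 MiB: usual default, configurable
REPLICATION = 3                 # usual default replication factor, configurable


def plan_blocks(file_size, nodes, block_size=BLOCK_SIZE, replication=REPLICATION):
    """Return a list of (block_index, block_length, nodes_holding_a_replica)."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    block_count = math.ceil(file_size / block_size)
    plan = []
    for i in range(block_count):
        # The last block may be shorter than the configured block size.
        length = min(block_size, file_size - i * block_size)
        # Round-robin placement: an assumption for illustration only.
        holders = [nodes[(i + r) % len(nodes)] for r in range(replication)]
        plan.append((i, length, holders))
    return plan


if __name__ == "__main__":
    # Hypothetical 300 MiB file on a hypothetical four-node cluster.
    for block in plan_blocks(300 * 1024 * 1024, ["n1", "n2", "n3", "n4"]):
        print(block)
```

With these assumptions a 300 MiB file becomes three blocks (two full blocks and one of 44 MiB), and each block is held by three of the four nodes, so losing a single node does not lose any data.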