Latest Posts

Kick-start Your Career in Big Data and Hadoop!

We are sure that, like everybody else, you have heard how Big Data is taking the world by storm and how a career in Hadoop can really take your future places. But we are also sure that, like everybody else, you…
Read more

6 Reasons You Must Switch Career to Big Data Now

Big Data has many young professionals excited about its sterling career prospects, and rightly so, given the sheer promise that this new domain holds. Getting a foothold in this exciting arena can take your career places,…
Read more

Internals of HDFS

Here we discuss HDFS in more depth. Important points: 1. The client writes data directly to the DataNodes. Many people think the NameNode writes the data; in fact, the client asks the NameNode for a free block and more…
Read more

5 Ways To Get More Out Of Hadoop

As organisations increasingly look to speed time to market, anticipate and respond to customers’ needs, and introduce new products and services, they need to have peace of mind in knowing that their decisions are based on information that’s fresh and…
Read more

Difference between Hadoop 1.0 and Hadoop 2.0

In Hadoop 1.x, “Namenode” is the single point of failure. In Hadoop 2.x, we have Active and Passive “Namenodes”. If the active “Namenode” fails, the passive “Namenode” takes charge. Because of this, high availability can be achieved in Hadoop 2.x….
Read more

Various Hadoop daemons and their roles in a Hadoop cluster

Namenode: It is the master node, which is responsible for storing the metadata for all the files and directories. It has information about the blocks that make up a file and where those blocks are located in the cluster. Datanode: It…
Read more

SOME UNKNOWN FACTS ABOUT APACHE HADOOP

1. What is the Hadoop classpath and what is its role? The classpath consists of a list of directories containing the jar files needed to stop or start daemons. 2. What is the distributed cache in Hadoop? Distributed cache in Hadoop is a facility…
Read more

BIG DATA IMPORTANT FACTS

1. What is the difference between structured and unstructured data? STRUCTURED DATA: If we can store the data in traditional database systems in the form of rows and columns, for example the transactions done online, then this type of data is…
Read more

BIG DATA IMPORTANT POINTS

What is HDFS? The Hadoop Distributed File System (HDFS) is the primary storage system used by Hadoop applications. HDFS is based on Google's own file system, the Google File System (GFS). What is a block and block scanner in HDFS? Block –…
Read more

BIG DATA ECOSYSTEM FRAMEWORKS

Apache Hadoop: framework for distributed processing. Integrates MapReduce (parallel processing), YARN (job scheduling) and HDFS (distributed file system). Apache Ignite: high-performance, integrated and distributed in-memory platform for computing and transacting on large-scale data sets in real time. Apache MapReduce: programming model for…
Read more