Big Data Analytics

What is Big Data Analytics?


Big data analytics describes the process of uncovering trends, patterns, and correlations in large amounts of raw data to help make data-informed decisions.

These processes use familiar statistical analysis techniques—like clustering and regression—and apply them to more extensive datasets with the help of newer tools.

How big data analytics works

1. Collect Data

Data collection looks different for every organization. With today’s technology, organizations can gather both structured and unstructured data from a variety of sources — from cloud storage to mobile applications to in-store IoT sensors and beyond. Some data will be stored in data warehouses where business intelligence tools and solutions can access it easily. Raw or unstructured data that is too diverse or complex for a warehouse may be assigned metadata and stored in a data lake.


2. Process Data

Once data is collected and stored, it must be organized properly to get accurate results on analytical queries, especially when it’s large and unstructured. Available data is growing exponentially, making data processing a challenge for organizations. One processing option is batch processing, which looks at large data blocks over time. Batch processing is useful when there is a longer turnaround time between collecting and analyzing data. Stream processing looks at small batches of data at once, shortening the delay time between collection and analysis for quicker decision-making. Stream processing is more complex and often more expensive.
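The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function names and the sensor readings are made up for the example, and real systems would use a dedicated batch or streaming platform rather than plain lists.

```python
from typing import Iterable, Iterator, List


def batch_average(readings: List[float]) -> float:
    """Batch style: wait until the whole block is collected, then compute once."""
    return sum(readings) / len(readings)


def stream_average(readings: Iterable[float]) -> Iterator[float]:
    """Stream style: update the result as each reading arrives."""
    count = 0
    total = 0.0
    for value in readings:
        count += 1
        total += value
        yield total / count  # an up-to-date answer after every record


# Hypothetical sensor readings, invented for this example.
sensor_readings = [20.0, 22.0, 21.0, 25.0]

batch_result = batch_average(sensor_readings)
stream_results = list(stream_average(sensor_readings))

print("Batch result:", batch_result)
print("Stream results:", stream_results)
```

The batch function gives one answer after all the data is available, while the stream function gives a fresh answer after every record, which is why streaming shortens the delay between collection and analysis.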


3. Clean Data

Data, big or small, requires scrubbing to improve its quality and produce stronger results. All data must be formatted correctly, and any duplicate or irrelevant data must be eliminated or accounted for. Dirty data can obscure and mislead, creating flawed insights.
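A minimal cleaning sketch in Python might look like the following. The record layout (a "name" and an "email" field), the sample rows, and the rule that an email address identifies a duplicate are all assumptions made for this example.

```python
from typing import Dict, List, Optional


def clean_records(records: List[Dict[str, Optional[str]]]) -> List[Dict[str, str]]:
    """Normalize formatting, drop incomplete rows, and remove duplicates."""
    seen_emails = set()
    cleaned = []
    for record in records:
        # Format correctly: trim whitespace and use consistent letter case.
        name = (record.get("name") or "").strip().title()
        email = (record.get("email") or "").strip().lower()
        if not name or not email:
            continue  # skip rows with missing values
        if email in seen_emails:
            continue  # skip duplicates (same email already kept)
        seen_emails.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned


# Hypothetical raw data, invented for this example.
raw_records = [
    {"name": "  alice smith ", "email": "Alice@Example.com"},
    {"name": "Alice Smith", "email": "alice@example.com "},
    {"name": "bob jones", "email": None},
    {"name": "Carol Diaz", "email": "carol@example.com"},
]

cleaned_records = clean_records(raw_records)
print(cleaned_records)
```

Here the second row is dropped as a duplicate of the first, and the third row is dropped because its email is missing.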


4. Analyze Data

Getting big data into a usable state takes time. Once it’s ready, advanced analytics processes can turn big data into big insights. Some of these big data analysis methods include:


Data mining sorts through large datasets to identify patterns and relationships by identifying anomalies and creating data clusters.

Predictive analytics uses an organization’s historical data to make predictions about the future, identifying upcoming risks and opportunities.

Deep learning imitates human learning patterns by using artificial intelligence and machine learning to layer algorithms and find patterns in the most complex and abstract data.
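As a small illustration of predictive analytics, the sketch below fits a straight line to past values and uses it to estimate the next one. The monthly sales figures are invented for the example, and a simple least-squares line is only one of many possible prediction methods; it is used here because it is short enough to read in full.

```python
from typing import List, Tuple


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical historical data: sales for months 1 to 4.
months = [1.0, 2.0, 3.0, 4.0]
sales = [110.0, 120.0, 130.0, 140.0]

slope, intercept = fit_line(months, sales)

# Use the fitted line to predict month 5.
forecast_month_5 = slope * 5 + intercept
print("Forecast for month 5:", forecast_month_5)
```

Because the sample sales rise by the same amount every month, the fitted line passes through every point and the forecast simply continues that trend. Real historical data is noisier, so the prediction would carry some error.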


Big data is a term used to refer to data sets that are too large or complex to be handled by traditional data-processing software. Processing such data sets requires specialized application software. Big data was originally associated with three key concepts: Volume, Variety, and Velocity.

Characteristics

Big data can be described by the following characteristics:

Volume

Volume is the quantity of data generated and stored. The size of the data determines its potential value and whether it can be considered big data at all.

Variety

Variety defines the type and nature of the data. Knowing this helps users put the data to effective use. Big data is a combination of text, images, audio, and video.

Velocity

Velocity is the speed at which data is generated and processed to meet demand. Big data is often available in real time. Compared to small data, big data is produced more continuously. Two kinds of velocity apply to big data: the frequency of generation and the frequency of handling, recording, and publishing.

Big Data Types

Mainly, there are three types of Big Data, as given below:

Structured Data:- Structured data can be stored in tabular form, as rows and columns. Relational databases are an example of structured data.

Unstructured Data:- Unstructured data has no predefined format and cannot be stored in tabular form. Examples of unstructured data are audio and video files.

Semi-structured Data:- Semi-structured data combines elements of both: it does not fit a fixed table, but it carries tags or keys that give it some organization. Examples of semi-structured data are XML data and JSON files.
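The three types can be compared with a short Python sketch. All of the sample data below is invented for the example: the CSV text stands in for a relational table, the JSON text for a semi-structured file, and the few raw bytes for the start of an audio file.

```python
import csv
import io
import json

# Structured: every row has exactly the same columns, like a database table.
csv_text = "id,name,city\n1,Asha,Pune\n2,Ravi,Delhi\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Semi-structured: each record has keys, but records may differ in shape.
json_text = (
    '[{"id": 1, "name": "Asha", "tags": ["new"]},'
    ' {"id": 2, "name": "Ravi", "address": {"city": "Delhi"}}]'
)
documents = json.loads(json_text)

# Unstructured: just raw bytes, with no fields to query.
audio_bytes = bytes([0x52, 0x49, 0x46, 0x46])

csv_columns = [sorted(row.keys()) for row in rows]
json_fields = [sorted(doc.keys()) for doc in documents]

print("CSV columns per row:", csv_columns)
print("JSON fields per record:", json_fields)
print("Audio is only bytes:", len(audio_bytes), "bytes")
```

Every CSV row reports the same columns, while the two JSON records report different fields. The audio bytes have no fields at all, which is why unstructured data needs extra processing before it can be analyzed.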

Qus. 1 : Before you analyse data, which of the following must you do?

  1. Spell check it
  2. Review it
  3. Verify and validate it
  4. Organise and simplify it
Qus. 2 : Which size of data is called Big Data?

  1. Giga byte
  2. Mega byte
  3. Peta byte
  4. Kilo byte
Qus. 3 : What is big data analysis?

  1. Analyzing data with a small number of variables
  2. Analyzing data that exceeds the processing capacity of traditional database systems
  3. Analyzing data stored in physical files
  4. Analyzing data using handwritten calculations
Qus. 4 : Which of the following is NOT a characteristic of big data?

  1. Volume
  2. Velocity
  3. Variety
  4. Validity
Qus. 5 : What is the primary goal of big data analysis?

  1. To collect as much data as possible
  2. To analyze data quickly without considering accuracy
  3. To derive meaningful insights and make data-driven decisions
  4. To ignore data variety and focus only on volume
Qus. 6 : Which technology is commonly used for storing and processing big data?

  1. Mainframe computers
  2. Traditional relational databases
  3. Cloud computing and distributed computing platforms
  4. Personal computers
Qus. 7 : What is the purpose of data preprocessing in big data analysis?

  1. To reduce the size of the dataset
  2. To increase data complexity
  3. To clean, transform, and prepare data for analysis
  4. To randomly sample the dataset
Qus. 8 : What is the role of machine learning in big data analysis?

  1. To replace human analysts with automated systems
  2. To generate random data samples
  3. To identify meaningful patterns and insights in large datasets
  4. To ignore data variety and focus only on volume
Qus. 9 : Which of the following is NOT a challenge of big data analysis?

  1. Scalability
  2. Data security and privacy
  3. Limited storage capacity
  4. Data quality and integrity
Qus. 10 : Who was the first to use the term Big Data?

  1. Steve Jobs
  2. Bill Gates
  3. John Mashey
  4. John Bredi
