
Big Data Testing

Big data testing is the process of testing a big data application to make sure that all of its functionality works as expected. Its main aim is to ensure that the big data system runs smoothly and error-free while maintaining performance. Big data is a collection of large datasets that cannot be processed using traditional computing techniques, so testing these datasets involves a range of tools, techniques and frameworks. Big data covers data creation, storage, retrieval and analysis, and is remarkable in terms of volume, variety and velocity.

What is the strategy of big data testing?

Testing a big data application is more about verifying its data processing than testing individual features of the software. In big data testing, QA engineers verify that terabytes of data are processed successfully using a commodity cluster and other supporting components. This demands a high level of testing skill because the processing is very fast. Processing may be of three types: batch, real-time, or interactive.

Data quality is also a very important factor in Hadoop testing. Before testing the application, it is necessary to check the quality of the data, and this check should be considered part of database testing. It involves checking characteristics such as conformity, accuracy, duplication, consistency, validity and data completeness.
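As a rough illustration, a few of these checks (completeness, duplication, conformity and validity) might look like the following on a small in-memory sample. This is only a sketch: the record layout, the field names, and the rules (two-letter upper-case country codes, non-negative amounts) are assumptions made for the example, and a real Hadoop job would run equivalent checks at scale.

```python
from collections import Counter

# Hypothetical sample records; field names are illustrative only.
records = [
    {"id": 1, "country": "US", "amount": 120.0},
    {"id": 2, "country": "IN", "amount": None},    # incomplete: missing amount
    {"id": 2, "country": "IN", "amount": 75.5},    # duplicate id
    {"id": 3, "country": "usa", "amount": -10.0},  # non-conforming code, invalid amount
]

REQUIRED_FIELDS = ("id", "country", "amount")


def quality_report(rows):
    """Count completeness, duplication, conformity and validity problems."""
    report = {"incomplete": 0, "duplicate_ids": 0, "nonconforming": 0, "invalid": 0}
    for row in rows:
        # Completeness: every required field must be present and not None.
        if any(row.get(field) is None for field in REQUIRED_FIELDS):
            report["incomplete"] += 1
        # Conformity (assumed rule): country is a two-letter upper-case code.
        country = row.get("country")
        if not (isinstance(country, str) and len(country) == 2 and country.isupper()):
            report["nonconforming"] += 1
        # Validity (assumed rule): amounts must not be negative.
        amount = row.get("amount")
        if amount is not None and amount < 0:
            report["invalid"] += 1
    # Duplication: number of ids that occur more than once.
    id_counts = Counter(row.get("id") for row in rows)
    report["duplicate_ids"] = sum(1 for count in id_counts.values() if count > 1)
    return report


print(quality_report(records))
```

Accuracy and consistency usually need a trusted reference dataset to compare against, so they are left out of this sketch.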

Performance testing approach

Performance testing for a big data application involves testing huge volumes of structured and unstructured data, and it requires a specific testing approach to handle such massive data.

Performance testing is executed in the following order:

  1. Set up the big data cluster that is to be tested for performance.
  2. Identify and design the corresponding workloads.
  3. Prepare the individual clients.
  4. Execute the test and analyse the results.
  5. Determine the optimum configuration.
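The order above can be sketched as a very small test harness. This is a minimal sketch under stated assumptions, not a real benchmarking tool: `fake_write` is a hypothetical stand-in for an operation against the cluster under test, and the candidate client counts are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_write(payload):
    """Hypothetical stand-in for one write against the cluster under test."""
    time.sleep(0.001)  # simulate I/O latency
    return len(payload)


def run_workload(num_clients, ops_per_client, operation=fake_write):
    """Run the workload from num_clients concurrent clients; return ops/second."""
    def client(_client_id):
        for _ in range(ops_per_client):
            operation(b"x" * 128)
        return ops_per_client

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_clients) as pool:
        completed = sum(pool.map(client, range(num_clients)))
    elapsed = time.perf_counter() - start
    return completed / elapsed


def find_best_client_count(candidates, ops_per_client=50):
    """Execute the test for each candidate and keep the highest throughput."""
    results = {n: run_workload(n, ops_per_client) for n in candidates}
    best = max(results, key=results.get)
    return best, results


best, results = find_best_client_count([1, 2, 4])
for n, throughput in results.items():
    print(f"{n} clients: {throughput:.0f} ops/s")
print("best client count:", best)
```

The measured numbers depend entirely on the machine and on the simulated latency, so they only show the shape of the procedure: prepare the clients, execute the workload, analyse the results, and pick a configuration.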

Parameters for performance testing:

Several parameters need to be verified for performance testing:

  • Data storage: How data is stored across the different nodes.
  • Commit logs: How large the commit log is allowed to grow.
  • Concurrency: How many threads can perform read and write operations.
  • Caching: Tuning of cache settings such as “row cache” and “key cache”.
  • Timeouts: Values for the connection timeout and the query timeout.
  • JVM parameters: Heap size and garbage collection (GC) algorithms.
  • Message queues: Message rate or size.
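One way to organise such parameters for a test run is to enumerate the combinations to be tried. The sketch below assumes a small, hypothetical set of tunables; the field names and candidate values are invented for illustration and do not correspond to any real product's settings.

```python
from dataclasses import asdict, dataclass
from itertools import product


@dataclass(frozen=True)
class PerfConfig:
    """Hypothetical tunables; names are illustrative, not real settings."""
    concurrent_threads: int
    row_cache_mb: int
    request_timeout_ms: int
    jvm_heap_gb: int


def config_grid(threads, cache_sizes, timeouts, heaps):
    """Enumerate every combination of the candidate values."""
    return [
        PerfConfig(t, c, timeout, heap)
        for t, c, timeout, heap in product(threads, cache_sizes, timeouts, heaps)
    ]


grid = config_grid(threads=[16, 32], cache_sizes=[0, 256], timeouts=[2000], heaps=[4, 8])
print(len(grid), "configurations to test")
print(asdict(grid[0]))
```

A full grid grows quickly as parameters are added, so in practice testers often vary one or two parameters at a time.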

Big data testing vs. traditional database testing:

  1. Data
     • Traditional database testing: Testers work with structured data.
     • Big data testing: Testers work with both structured and unstructured data.

  2. Testing approach
     • Traditional database testing: The testing approach is well defined and time-tested.
     • Big data testing: The testing approach requires focused R&D efforts.

  3. Testing strategy
     • Traditional database testing: Testers can choose between a “sampling” strategy performed manually and an “exhaustive verification” strategy performed by an automation tool.
     • Big data testing: A sampling strategy is a challenge in big data.

  4. Infrastructure
     • Traditional database testing: No special test environment is needed because file sizes are limited.
     • Big data testing: A special test environment is needed because of the large data and file sizes.

  5. Validation tools
     • Traditional database testing: Testing tools can be used with basic operating knowledge and little training.
     • Big data testing: Operating a testing tool requires a particular set of skills and training. The tools are still at a nascent stage and may gain new features over time.
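To make the sampling strategy concrete, the following sketch draws a uniform random sample from a data stream of unknown length using reservoir sampling, then compares the sampled rows with the loaded target data. Reservoir sampling is a technique chosen here for illustration and is not something the comparison above prescribes; the data, the `(key, value)` row layout, and the function names are assumptions.

```python
import random


def reservoir_sample(stream, k, seed=None):
    """Return up to k items chosen uniformly at random from an iterable."""
    rng = random.Random(seed)
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            sample.append(item)       # fill the reservoir first
        else:
            j = rng.randint(0, i)     # inclusive on both ends
            if j < k:
                sample[j] = item      # replace with probability k / (i + 1)
    return sample


def verify_sample(source_rows, target_lookup, k, seed=None):
    """Compare a random sample of source rows with the loaded target data."""
    mismatches = []
    for key, value in reservoir_sample(source_rows, k, seed):
        if target_lookup.get(key) != value:
            mismatches.append(key)
    return mismatches


# Hypothetical data: the target copy has one corrupted record.
source = ((i, i * 2) for i in range(10_000))
target = {i: i * 2 for i in range(10_000)}
target[1234] = -1

print(verify_sample(source, target, k=500, seed=42))
```

With 500 rows sampled out of 10,000, the single corrupted record is likely to be missed, which illustrates why sampling is a challenge for big data and why exhaustive, automated verification is often preferred where it is feasible.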

