Why Big Data Security Matters
Big Data refers to data sets so large and fast-moving that they exceed the processing capacity of traditional database systems. It is commonly characterized by four Vs: volume, velocity, variety, and veracity. The productive use of big data analytics has transformed fields such as business and healthcare. Data-driven organizations like Google, Amazon, and Facebook use it frequently to discover business insights that support sound decision making and prediction.
Big Data also raises serious security concerns: even a minor lapse can undermine business goals or, in the worst case, wipe out the whole investment.
Big Data is currently considered a major source of business success, but the data can be compromised, leaked, or even misused by the organizations that hold it. For example, according to (Torre, García-Zapirain, and López-Coronado 2017), healthcare organizations could blackmail people using data about their health conditions or sell that data to other companies for marketing. In a big data processing framework, data is also scattered across different locations in the network, which makes the system more complex and more exposed to attack.
5 Big Data Security Risks
Security problems in big data can degrade the performance of big data analytics and can seriously harm the business. The following are five major Big Data security risks (Divya, Bhargavi, and Jyothi 2018).
1. Insecure Computation
An untrusted computational program can pose a serious risk, because attackers use such programs to steal important data. They can also corrupt data, which leads to incorrect analysis and predictions. In some cases, untrusted programs expose the system to Denial of Service (DoS) attacks that can disrupt parallel big data processing (Divya et al. 2018). The risk that untrusted computational programs pose to sensitive data is illustrated in Figure 1 (Jitendra Chauhan 2015).
2. Input Validation and Filtering
A small number of records can easily be validated and filtered to avoid acquiring incorrect data, but this becomes difficult when the data runs to terabytes (TB) or petabytes (PB). Data validation and filtering is a critical phase in which data is sorted by quality, capability, relevance, and the trustworthiness of its source. Signature-based validation and filtering cannot catch malicious or rogue data whose problems show up only in its behavior, so custom algorithms are designed to filter such data and avoid these risks.
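As a rough illustration of the difference between a fixed, rule-based check and a behavior-based one, the sketch below first rejects records that do not match an expected shape and then flags readings that lie far from the rest of the batch. Everything in it is an assumption made for illustration: the `SCHEMA` fields, the `reading` attribute, and the choice of the median absolute deviation (MAD) as the outlier measure are hypothetical and do not come from the cited sources.

```python
from statistics import median

# Hypothetical schema: field name -> expected Python type.
SCHEMA = {"sensor_id": str, "reading": float}


def passes_schema(record):
    """Rule-based check: every expected field is present with the right type."""
    return all(isinstance(record.get(field), kind) for field, kind in SCHEMA.items())


def split_outliers(records, threshold=5.0):
    """Behavior-based check: separate readings that lie far from the median.

    Distance is measured in units of the median absolute deviation (MAD).
    Returns (kept, flagged).
    """
    if not records:
        return [], []
    values = [r["reading"] for r in records]
    centre = median(values)
    mad = median(abs(v - centre) for v in values)
    if mad == 0:
        # No spread to scale by, so this sketch flags nothing.
        return list(records), []
    kept, flagged = [], []
    for r in records:
        if abs(r["reading"] - centre) / mad > threshold:
            flagged.append(r)
        else:
            kept.append(r)
    return kept, flagged


def validate(records):
    """Return (kept, malformed, suspicious) for one batch of records."""
    well_formed = [r for r in records if passes_schema(r)]
    malformed = [r for r in records if not passes_schema(r)]
    kept, suspicious = split_outliers(well_formed)
    return kept, malformed, suspicious
```

A record such as `{"sensor_id": "s5", "reading": 950.0}` passes the schema check but would be flagged as suspicious when the rest of the batch sits near 20.0. At terabyte scale the same two-stage idea would have to run per partition or on a sample, which this sketch does not attempt.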
3. Granular Access Controls
Unlike traditional databases, Big Data platforms focus on scalability and analytical performance rather than on security. As a result, they typically lack a comprehensive access control system to prevent unauthorized access. Without proper security measures during big data processing, personnel who have no legitimate need can retrieve sensitive information, which may lead to unexpected leakage.
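To make "granular" concrete, the sketch below filters a record field by field according to the caller's role and denies everything that is not explicitly allowed. The roles, field names, and the `POLICY` table are hypothetical; a real platform would enforce this inside the storage or query layer rather than in application code.

```python
# Hypothetical policy: role -> set of fields that role may read.
POLICY = {
    "analyst": {"age_band", "diagnosis_code"},
    "billing": {"patient_id", "invoice_total"},
}


def visible_fields(role, record):
    """Return only the fields the role may read; unknown roles see nothing."""
    allowed = POLICY.get(role, set())
    return {key: value for key, value in record.items() if key in allowed}


record = {
    "patient_id": "p-001",
    "age_band": "40-49",
    "diagnosis_code": "J10",
    "invoice_total": 120.0,
}
analyst_view = visible_fields("analyst", record)  # no patient_id, no invoice_total
```

The deny-by-default lookup (`POLICY.get(role, set())`) is the design choice that matters here: a role missing from the policy gets an empty view instead of the full record.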
4. Insecure Data Storage
Dispersed data is another security risk: data is stored on thousands of nodes, where authorization, authentication, and encryption are difficult to apply consistently. In this situation, sensitive data can be mishandled. Much of the data is sometimes loaded into the cloud for fast processing, but securing it there is a further challenge (Anon 2015).
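One piece of this problem, detecting that a block stored on a remote node has been altered, can be sketched with a keyed hash (HMAC) from the Python standard library. This is only an integrity check under the assumption that the key is kept away from the storage nodes; it is not encryption, and the function names are hypothetical.

```python
import hashlib
import hmac


def tag_block(key: bytes, block: bytes) -> str:
    """Compute an HMAC-SHA256 tag to store alongside a data block."""
    return hmac.new(key, block, hashlib.sha256).hexdigest()


def verify_block(key: bytes, block: bytes, tag: str) -> bool:
    """Check a block read back from a node against its stored tag."""
    return hmac.compare_digest(tag_block(key, block), tag)


key = b"demo-key-kept-off-the-storage-nodes"  # assumption: managed elsewhere
block = b"patient_id,diagnosis\np-001,J10\n"
stored_tag = tag_block(key, block)
```

Calling `verify_block(key, block, stored_tag)` succeeds for the untouched block and fails if any byte of the block has changed. Confidentiality would additionally require encrypting the block with a vetted cryptographic library.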
5. Privacy Concerns in Data Mining and Analytics
Privacy violations involve the unintentional disclosure of sensitive information when Big Data analytics is monetized for marketing, research, and development.
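One common way to check for this kind of unintentional disclosure before a data set is shared is a k-anonymity test: count how many records share each combination of indirectly identifying attributes and report the combinations that are too rare. This technique is not described in the cited sources; it is offered here only as an example, and the attribute names are hypothetical.

```python
from collections import Counter


def small_groups(records, quasi_identifiers, k):
    """Return quasi-identifier combinations shared by fewer than k records."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return {combo: n for combo, n in counts.items() if n < k}


records = [
    {"age_band": "40-49", "zip3": "941", "diagnosis": "J10"},
    {"age_band": "40-49", "zip3": "941", "diagnosis": "E11"},
    {"age_band": "70-79", "zip3": "100", "diagnosis": "I10"},
]
risky = small_groups(records, ["age_band", "zip3"], k=2)
```

Here the single record in the `("70-79", "100")` group would be reported, because anyone who knows a person's age band and area could tie that diagnosis to them.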
Data security has been violated in different ways. As data has grown, so has the number of attackers and hackers. Intrusion from outside the organization is not the only source of attack; those who provide big data processing services can also violate security by stealing or damaging data.
The following two examples of big data security breaches are taken from past research (Torre et al. 2017).
- In 2004, data belonging to Home Depot Medicine was stolen in a security breach that affected 53 million patients/customers.
- In 2015, data on 80 million customers was stolen from a US insurance company.
Anon. 2015. “Big Data Security: Risks and Challenges.” Retrieved May 17, 2020 (https://business-iq.net/articles/983-en-big-data-security-risks-and-challenges?v=cloudtech).
Divya, K. Sree, P. Bhargavi, and S. Jyothi. 2018. “Machine Learning Algorithms in Big Data Analytics.” International Journal of Computer Sciences and Engineering (October):157–66.
Jitendra Chauhan. 2015. “Big Data Security Challenges and Recommendations!” Retrieved May 17, 2020 (https://www.slideshare.net/cisoplatform/big-data-security-challenges-and-recommendations).
Norris, Brian, Alec Mcgail, Matt Gilliam, Bob Boehnlein, John Springer, and Wei Kao. 2015. “Watch Flu Spread Big Data for Social Good Challenge.” 1–5.
Torre, Isabel, Begoña García-Zapirain, and Miguel López-Coronado. 2017. “Analysis of Security in Big Data Related to Healthcare.” The Journal of Digital Forensics, Security and Law 12(3).