Big Data Definition
Big data is a term for data sets so large that they are beyond the scope of traditional data mining techniques or manual handling. It describes a volume of data, structured or unstructured, that has grown so massive that traditional techniques are no longer adequate.
Big data also refers to a situation where data are created and collected from several sources at very high speed. The data become so voluminous that they cannot be analyzed by humans or processed using traditional mining techniques.
A Little More on What is Big Data
More broadly, big data is a field that offers solutions for computing and analyzing data that are too large to be handled by traditional techniques and human analysts. It covers the rapid creation, extraction, and collation of information from multiple sources. Big data is most often associated with large businesses whose operations generate data on a large scale.
The field gives companies with massive data sets a way to collate and analyze those data effectively. Because data can be collected through many different channels, big data techniques make it easier to assemble the data in one place and analyze them.
Challenges of Using Big Data
Although using big data has many advantages, it also comes with challenges. The most common ones are:
- Big data can lead to overwork because relevant data must be sorted from irrelevant data.
- Determining which data are relevant is itself difficult, and some relevant data can be omitted by mistake.
- Because big data contains both structured and unstructured data, sorting and analysis are often cumbersome.
- Handling large volumes of data can also introduce noise, that is, irrelevant or erroneous records that obscure the useful ones.
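The sorting problem described in the list above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: JSON objects stand in for structured data, plain text stands in for unstructured data, and the helper names `split_records` and `is_relevant`, the sample records, and the field names are all hypothetical. Real pipelines are far larger and use dedicated tools.

```python
import json


def split_records(raw_records):
    """Split raw inputs into structured records (JSON objects) and unstructured text."""
    structured, unstructured = [], []
    for raw in raw_records:
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            # Not valid JSON, so treat it as free text.
            unstructured.append(raw)
            continue
        if isinstance(parsed, dict):
            structured.append(parsed)
        else:
            # Valid JSON but not an object (e.g. a bare number): keep as text.
            unstructured.append(raw)
    return structured, unstructured


def is_relevant(record, required_fields):
    """Hypothetical relevance rule: every required field is present and non-empty."""
    return all(record.get(field) not in (None, "") for field in required_fields)


# Made-up sample data mixing structured and unstructured inputs.
raw_records = [
    '{"customer": "A17", "amount": 42.5}',
    "Called support about a late delivery",
    '{"customer": "", "amount": 10}',
    '{"customer": "B03", "amount": 7.0}',
]

structured, unstructured = split_records(raw_records)
relevant = [r for r in structured if is_relevant(r, ["customer", "amount"])]

print(len(structured), "structured,", len(unstructured), "unstructured,",
      len(relevant), "relevant")
```

Even in this toy case the relevance rule is a judgment call: the record with an empty customer field is discarded, although it might still carry useful information. That is the omission risk the list mentions.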
References for “Big Data”
Academic Research on “Big Data”
Wu, X., Zhu, X., Wu, G. Q., & Ding, W. (2014). Data mining with big data. IEEE Transactions on Knowledge and Data Engineering, 26(1), 97-107. Big Data concern large-volume, complex, growing data sets with multiple, autonomous sources. With the fast development of networking, data storage, and the data collection capacity, Big Data are now rapidly expanding in all science and engineering domains, including physical, biological and biomedical sciences. This paper presents a HACE theorem that characterizes the features of the Big Data revolution, and proposes a Big Data processing model, from the data mining perspective. This data-driven model involves demand-driven aggregation of information sources, mining and analysis, user interest modeling, and security and privacy considerations. We analyze the challenging issues in the data-driven model and also in the Big Data revolution.
Chen, M., Mao, S., & Liu, Y. (2014). Big data: A survey. Mobile Networks and Applications, 19(2), 171-209. In this paper, we review the background and state-of-the-art of big data. We first introduce the general background of big data and review related technologies, such as cloud computing, Internet of Things, data centers, and Hadoop. We then focus on the four phases of the value chain of big data, i.e., data generation, data acquisition, data storage, and data analysis. For each phase, we introduce the general background, discuss the technical challenges, and review the latest advances. We finally examine the several representative applications of big data, including enterprise management, Internet of Things, online social networks, medical applications, collective intelligence, and smart grid. These discussions aim to provide a comprehensive overview and big-picture to readers of this exciting area. This survey is concluded with a discussion of open problems and future directions.
Boyd, D., & Crawford, K. (2012). Critical questions for big data: Provocations for a cultural, technological, and scholarly phenomenon. Information, Communication & Society, 15(5), 662-679. The era of Big Data has begun. Computer scientists, physicists, economists, mathematicians, political scientists, bio-informaticists, sociologists, and other scholars are clamoring for access to the massive quantities of information produced by and about people, things, and their interactions. Diverse groups argue about the potential benefits and costs of analyzing genetic sequences, social media interactions, health records, phone logs, government records, and other digital traces left by people. Significant questions emerge. Will large-scale search data help us create better tools, services, and public goods? Or will it usher in a new wave of privacy incursions and invasive marketing? Will data analytics help us understand online communities and political movements? Or will it be used to track protesters and suppress speech? Will it transform how we study human communication and culture, or narrow the palette of research options and alter what ‘research’ means? Given the rise of Big Data as a socio-technical phenomenon, we argue that it is necessary to critically interrogate its assumptions and biases. In this article, we offer six provocations to spark conversations about the issues of Big Data: a cultural, technological, and scholarly phenomenon that rests on the interplay of technology, analysis, and mythology that provokes extensive utopian and dystopian rhetoric.
Lazer, D., Kennedy, R., King, G., & Vespignani, A. (2014). The parable of Google Flu: traps in big data analysis. Science, 343(6176), 1203-1205.
Sagiroglu, S., & Sinanc, D. (2013, May). Big data: A review. In 2013 International Conference on Collaboration Technologies and Systems (CTS) (pp. 42-47). IEEE. Big data is a term for massive data sets that have a large, varied, and complex structure and are difficult to store, analyze, and visualize for further processing or results. The process of examining massive amounts of data to reveal hidden patterns and secret correlations is called big data analytics. This information is useful to companies and organizations because it helps them gain richer and deeper insights and an advantage over the competition. For this reason, big data implementations need to be analyzed and executed as accurately as possible. This paper presents an overview of big data’s content, scope, samples, methods, advantages, and challenges, and discusses the privacy concerns it raises.