Big data as a value generator in decision support systems: a literature review.

Author: Grander, Gustavo
  1. Introduction

    While big data can be used as a powerful tool to treat various social ills, offering the potential for new insights into areas such as medical research, counterterrorism and climate change, its use also allows invasions of privacy, diminished civil liberties and increased state and corporate control (Boyd & Crawford, 2012). Managing big data is a major challenge because storage keeps getting cheaper and the volume of stored data keeps growing (Demirkan & Delen, 2013).

    Most data sets from which scientists and researchers have been able to extract real meaning are still very small compared to the proportion of data that can be captured (Dobre & Xhafa, 2014). Big data analysis should include the phases of data generation, acquisition, storage and analysis and can provide useful values at each stage through judgments, suggestions, support or decisions (Chen, Mao & Liu, 2014a). Some authors point out that advances in analytical techniques, especially machine learning, have been a major facilitator for dealing with large data set analysis (Murdoch & Detsky, 2013).

    Big data are defined as a large set of data that are difficult to store, process, analyze and understand using traditional database processing tools (Huang & Chaovalitwongse, 2015). Big data emerge as a paradigm shift in how organizations make decisions (Mortenson, Doherty & Robinson, 2015). Therefore, through decision support systems (DSS), it is possible to process large volumes of data using output models with accessible interfaces (Constantiou & Kallinikos, 2014). Research in big data and DSS has presented technological aspects and big data design challenges as the main focus (Chen, Mao, Zhang & Leung, 2014b).

    We, therefore, have the opportunity to propose an in-depth analysis of how DSS manage big data to obtain value. In this context, we performed a systematic literature review (SLR) of the use of big data in DSS. To achieve this goal, we propose to answer two questions: With which techniques and technologies do DSS manage big data to obtain value? And what types of problems have DSS been applied to solve? To answer the research questions, we conducted an SLR with searches in the academic databases Scopus and Web of Science (WoS) in August 2019.

    As a contribution of our study, we show that DSS is used for the management of big data mainly through techniques such as big data analytics (BDA), machine learning algorithms and technologies such as cloud computing. The main areas in which these techniques and technologies are applied are logistics, traffic, health, organization and market.

    This article is structured as follows. In addition to this Introduction, Section 2 presents the methodological procedures applied in the research, Section 3 presents the results, Section 4 the discussion and Section 5 the conclusion of our study.

  2. Methodology

    We performed the SLR based on the guidelines suggested by Petticrew & Roberts (2006) to understand the use of big data in DSS. We chose to develop an SLR because of the rigor of the research methodology, which involves systematic data collection procedures and descriptive and qualitative data analysis techniques.

    2.1 Data collection

    First, we started with the identification phase by applying a search string initially based on preliminary searches. Then, we applied the string to the database search tools in order to identify other ways in which the searched terms are referenced (Petticrew & Roberts, 2006). Thus, after a few rounds of searching, new terms were identified and added to the initial string to obtain a corpus of analysis. Therefore, the first stage of the SLR consisted of searching the Scopus and WoS databases, using the following Boolean terms: "Big Data" AND ("Decision Theory" or "decision support system*" or "decision-support system*"). The use of the asterisk (*) in the search string serves to obtain variations of words in their plural form.
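
    The logic of the search string can be illustrated with a minimal sketch. It is not how Scopus or WoS evaluate queries internally; the helper names `wildcard_to_regex` and `matches_search_string` are hypothetical, and the trailing asterisk is approximated here as "any run of word characters".

```python
import re

# Illustrative only: real database search engines apply their own
# tokenization and field rules, which this sketch does not reproduce.
def wildcard_to_regex(term: str) -> re.Pattern:
    """Turn a phrase with an optional trailing * into a case-insensitive regex."""
    escaped = re.escape(term).replace(r"\*", r"\w*")
    return re.compile(r"\b" + escaped + r"\b", re.IGNORECASE)

# Left-hand side of the AND.
REQUIRED = wildcard_to_regex("Big Data")

# Right-hand side: at least one of these phrases must appear.
ANY_OF = [
    wildcard_to_regex("Decision Theory"),
    wildcard_to_regex("decision support system*"),
    wildcard_to_regex("decision-support system*"),
]

def matches_search_string(text: str) -> bool:
    # Required phrase AND (any one of the alternative phrases).
    return bool(REQUIRED.search(text)) and any(p.search(text) for p in ANY_OF)

if __name__ == "__main__":
    # Hypothetical titles, invented for illustration.
    for title in [
        "Big data analytics in decision support systems",
        "Big data in healthcare",
    ]:
        print(title, "->", matches_search_string(title))
```

    The wildcard matters for the first title: without it, "decision support system" followed by a word boundary would not match the plural "systems".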

    The second stage of the SLR had as criteria the filter by areas. The Scopus database was restricted to the following areas: business, management and accounting; decision sciences; social sciences. The WoS base was restricted to the following areas: business, management and social science interdisciplinary.

    The third stage of the SLR consisted of filtering the databases to keep only articles, thereby excluding books, reviews, etc. The fourth stage of the SLR consisted of checking the availability of downloadable articles. The fifth stage of the SLR was reading the articles and checking their alignment with the research objective; articles whose objective did not deal closely with the subject were considered inappropriate for the analysis and were therefore excluded from the sample. The sixth stage of the SLR consisted of identifying repeated articles and removing the duplicates. Finally, the seventh stage of the SLR was the consolidation of the two bases for an in-depth analysis of the articles. The exclusion flow of articles is presented in Table 1.

    The initial search totaled 1,427 documents contained in both databases, and at the end of the seven steps, 5% of the sample, 72 articles, were considered adequate for full reading and analysis.
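
    The duplicate removal and consolidation stages, together with the retained share reported above, can be sketched as follows. This is a minimal illustration under stated assumptions: the records are invented, matching on a normalized title is only one possible duplicate criterion, and the function names are hypothetical rather than the procedure the authors used.

```python
def normalize_title(title: str) -> str:
    # Lowercase and keep only letters and digits so that punctuation or
    # spacing differences between databases do not hide a duplicate.
    return "".join(ch for ch in title.lower() if ch.isalnum())

def consolidate(scopus: list[dict], wos: list[dict]) -> list[dict]:
    """Merge two result lists, keeping the first copy of each title."""
    seen = set()
    merged = []
    for record in scopus + wos:
        key = normalize_title(record["title"])
        if key not in seen:
            seen.add(key)
            merged.append(record)
    return merged

# Hypothetical records, invented for illustration only.
scopus_hits = [{"title": "Big Data and DSS: A Case"}, {"title": "Cloud DSS"}]
wos_hits = [{"title": "Big data and DSS - a case"}, {"title": "Traffic DSS"}]
corpus = consolidate(scopus_hits, wos_hits)

# Figures reported in the text: 1,427 initial documents, 72 retained.
retained_share = 72 / 1427

if __name__ == "__main__":
    print(len(corpus), "unique records in the toy example")
    print(f"retained share: {retained_share:.1%}")
```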

    2.2 Data analysis

    We performed the SLR of the 72 selected articles, of which 63 were empirical and 9 were theoretical. The articles were read in full and classified in a spreadsheet according to relationships that were identified throughout the analysis and that addressed the research questions. We note that the first record dates from 2012 and that the number of publications grew until 2018. The year 2019 had eight publications, but because the databases were searched in August, it was not possible to account for the total number of publications in that year (Table 2).

    Regarding the analysis process, after selecting and collecting the database, the researchers started categorizing the contents from a qualitative perspective (Petticrew & Roberts, 2008). A recursive process was applied based on reflective critical reading and classification of the content according to its adherence to the themes proposed in this research. The process of an SLR must be transparent and replicable, as pointed out by Tranfield, Denyer & Smart (2003). The following are the results of this study and their respective analyses.

  3. Results

    To answer the first research question, we classified the articles into two groups: the first group deals with techniques, and the second group with technologies. There is a wide variety of techniques and technologies for capturing, selecting, analyzing and visualizing big data, and these tools fall into three classes: batch processing, stream processing and interactive analysis tools (Chen & Zhang, 2014).

    3.1 Techniques applied in big data

    The items presented here are related to the application of data analysis techniques to obtain value from big data.

    3.1.1 Machine learning technique with supervised learning algorithm. Classification algorithms from this group were used, for example, as a management tool for the creation of a DSS that enabled credit risk assessment in the financial market (Hayashi, 2016), for building modernization assessment, allowing the decision-maker to select the best alternatives in terms of energy consumption and installation cost (Rasiulis, Ustinovichius, Vilutiene & Popov, 2016), and to deal with business data heterogeneity and multidimensionality (Nimmagadda, Reiners & Wood, 2018). We also found applications in route optimization models: a DSS for viewing traffic volume and weather interactions, applicable to transport planners, traffic control rooms and urban infrastructure DSS (Sathiaraj, Punkasem, Wang & Seedah, 2018), and an online route generation system with the Dijkstra algorithm, which resulted in changes of route paradigms that were determined together (Ronnqvist, Svenson, Flisberg & Jo). Finally, we found an application for identifying and predicting social issues through online news analytics (Suh, 2019).
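
    Since the Dijkstra algorithm is named above, a minimal textbook sketch of it may help readers unfamiliar with shortest-path routing. The road network, node names and weights below are hypothetical and are not taken from the cited route generation system.

```python
import heapq

def dijkstra(graph: dict, source: str) -> dict:
    """Return the shortest distance from source to every reachable node.

    graph maps each node to a dict of {neighbor: non-negative edge weight}.
    """
    dist = {source: 0.0}
    queue = [(0.0, source)]  # min-heap of (distance so far, node)
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale entry: a shorter path to node was already found
        for neighbor, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return dist

# Hypothetical road network with travel costs, invented for illustration.
roads = {
    "depot": {"a": 4, "b": 1},
    "b": {"a": 2, "mill": 7},
    "a": {"mill": 3},
    "mill": {},
}

if __name__ == "__main__":
    print(dijkstra(roads, "depot"))
```

    In this toy network the cheapest way from the depot to the mill goes through b and then a, which is shorter than either direct-looking alternative.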

    Other findings involved support vector machines, used for decision-making optimization through patient similarity (Tashkandi, Wiese & Wiese, 2018) and for optimizing data collection for cancer classification and online critic sentiments (Ghaddar & Naoum-Sawaya, 2018); fuzzy rules to determine the health status of cattle and predict nutritional intake (Sivamani, Choi & Cho, 2018); the Mehrabian-Russell model for forecasting consumer purchases using climate parameters (Tian, Zhang & Zhang, 2018); the use of a set of attribute reduction and data set analysis in information systems (Li, Yang, Jin & Guo, 2017); and rapid safety feedback used to identify child maltreatment (Gillingham, 2019a).

    We identified that the techniques applied in the above studies served predictive data analysis, which helps anticipate changes based on understanding patterns and anomalies within a database (Hurwitz & Kirsch, 2018).

    The second group of articles in our technique analysis used linear regression, which was observed in stock market index prediction studies (Khan et al., 2018), in DSS for collaborative logistics networks (Ilie-Zudor et al., 2015) and in predicting fuel consumption based on driving behaviors (Hsu, Lim & Yang, 2017). Least absolute shrinkage and selection operator (LASSO) regression was used to estimate risk-adjusted performance in hospitals (Feuerriegel, 2016), to forecast electricity prices based on historical data and analysis of weather conditions (Ludwig, Feuerriegel & Neumann, 2015) and to predict population health indices from social media data with improved predictive performance (Nguyen et al., 2017).
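
    To make the difference between plain linear regression and LASSO regression concrete, the sketch below fits both with coordinate descent on a toy data set. It is an assumption-laden illustration, not the implementation used in any cited study: there is no intercept, the data are invented, and the objective is the common form (1/(2n))·||y − Xw||² + alpha·||w||₁, where alpha = 0 reduces to ordinary least squares.

```python
def soft_threshold(value: float, threshold: float) -> float:
    # Shrink value toward zero by threshold; values inside the band become 0.
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimize (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 without an intercept."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, alpha) / (z / n)
    return w

# Hypothetical data: y depends only on the first feature (y = 2 * x1).
X = [[1, 1], [2, -1], [3, 1], [4, -1]]
y = [2, 4, 6, 8]

w_ols = lasso_coordinate_descent(X, y, alpha=0.0)    # plain least squares
w_lasso = lasso_coordinate_descent(X, y, alpha=0.5)  # with L1 penalty

if __name__ == "__main__":
    print("alpha=0.0:", w_ols)
    print("alpha=0.5:", w_lasso)
```

    With the penalty switched on, the coefficient of the relevant feature is shrunk slightly and the irrelevant one stays at exactly zero, which is the variable-selection behavior that motivates LASSO in high-dimensional prediction tasks.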

    Neural network regression was applied to estimate rail integrity conditions to assist in maintenance control (Jamshidi et al., 2018); through regression for multipurpose utility analysis, a methodology was developed to analyze resource scarcity problems in rural transport management (Chen, Achtari, Majkut & Sheu, 2017), and a...
