EFFICIENT BIG DATA ANALYSIS ON COVID-19 MEDICAL DATA USING MACHINE LEARNING TECHNIQUES
Keywords:
Big Data Analytics, Regression analysis models, Machine learning, Health care, COVID-19, Feature selection, Medical data, Classification.
Abstract
In the present technological era, massive amounts of big data are generated and gathered from a wide variety of rich data sources. These data vary in veracity: some are precise, while others are imprecise or uncertain. Embedded in these big data are valuable information and knowledge. Examples include healthcare and epidemiological data, such as data on patients suffering from epidemic illnesses like coronavirus disease 2019 (COVID-19). Because COVID-19 spread quickly to almost every part of the world, massive volumes of data and case investigations have been produced, giving investigators an unprecedented opportunity to discover trends and draw findings from such big data. This paper studies the growing COVID-19 dataset for India from March 1, 2020, to April 11, 2020, and forecasts the number of COVID-19 patients for the following days. Several regression analysis models are applied to the COVID-19 data for India, using data hosted on Kaggle for the period from March 1, 2020, to April 11, 2020. In this analysis, we fit three regression models to the COVID-19 dataset: a quadratic polynomial, a sixth-degree polynomial, and an exponential model. We calculate the root mean square error (RMSE) of each of the three models. Among them, the RMSE of the sixth-degree polynomial is considerably lower than that of the quadratic and exponential models. Thus, of the three models, the sixth-degree polynomial regression model is the best suited to predicting COVID-19 cases in India over the coming days.
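The model comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names (`rmse`, `fit_polynomial`, `fit_exponential`, `compare_models`) are hypothetical, the exponential model is assumed to be fitted by linear regression on the logarithm of the case counts, and the input series is a synthetic placeholder rather than the Kaggle data.

```python
import numpy as np
from numpy.polynomial import Polynomial


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))


def fit_polynomial(days, cases, degree):
    """Least-squares polynomial fit; returns a callable model."""
    return Polynomial.fit(days, cases, degree)


def fit_exponential(days, cases):
    """Fit cases ~ a * exp(b * day) by linear regression on log(cases).

    Assumption of this sketch: every case count is strictly positive.
    """
    b, log_a = np.polyfit(days, np.log(cases), 1)
    return lambda d: np.exp(log_a + b * np.asarray(d, dtype=float))


def compare_models(days, cases):
    """Return {model name: RMSE measured on the fitting data}."""
    models = {
        "quadratic": fit_polynomial(days, cases, 2),
        "sixth_degree": fit_polynomial(days, cases, 6),
        "exponential": fit_exponential(days, cases),
    }
    return {name: rmse(cases, model(days)) for name, model in models.items()}


# Synthetic placeholder series (NOT the Kaggle data): 42 daily values,
# matching the length of the March 1 to April 11, 2020 window.
days = np.arange(42, dtype=float)
cases = 5.0 + 0.5 * days + 0.2 * days ** 2 + 10.0 * np.sin(days / 3.0)

scores = compare_models(days, cases)
for name, value in scores.items():
    print(f"{name}: RMSE = {value:.2f}")
```

Note that when RMSE is measured on the same data used for fitting, as in this sketch, a sixth-degree least-squares polynomial can never score worse than a quadratic one, because the quadratic is a special case of it. Evaluating on held-out days would give a stricter check of forecasting quality.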