Leveraging Open Data with Machine Learning Algorithms

Amirah; Fitrah Karimah

doi:10.70356/jafotik.v1i2.19

Authors

Amirah, Lentera Ilmu Publisher
Fitrah Karimah, Lentera Ilmu Publisher

Keywords:

Open Data, Machine Learning, Predictive Policing, Support Vector Machine, Synergy

Abstract

The combination of open data and machine learning is a strong driver of innovation. This study explores the synergy between the two domains: open data contributes accessibility and transparency, while machine learning contributes pattern recognition and predictive capability. The combination holds promise across diverse sectors, including healthcare, finance, urban planning, and environmental science. By applying advanced algorithms to openly available information, organizations can gain insight into trends, correlations, and anomalies. The methodology comprises a literature review, knowledge enrichment, case studies, and a conclusion, providing a systematic approach to understanding the intersection of open data and machine learning. The results showcase practical applications in predictive policing, healthcare resource allocation, smart traffic management, and more. Each application is paired with relevant machine learning algorithms, emphasizing their role in addressing complex challenges. The study culminates in a simplified example of predictive policing using a Support Vector Machine (SVM) algorithm, presented with its pseudocode and decision function equation. This example illustrates how machine learning can predict crime occurrences from patrol data and historical crime rates. Overall, the study presents this combination as an important step in technological progress and societal advancement.
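
The abstract mentions an SVM decision function applied to patrol data and historical crime rates, but the paper's own pseudocode and equation are not reproduced on this page. As a rough illustration only, the sketch below uses the standard linear SVM decision function f(x) = w · x + b and classifies by the sign of f(x). Everything specific here is an assumption and not taken from the paper: the two feature names, the synthetic data values, the hyperparameters, and the training routine (plain subgradient descent on the regularized hinge loss, chosen to keep the example dependency-free).

```python
# Minimal linear SVM sketch in pure Python (no external libraries).
# ASSUMPTIONS: feature names, data values, and hyperparameters are
# hypothetical and synthetic; they do not come from the paper.

def decision_function(w, b, x):
    """Linear SVM decision function f(x) = w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w, b, x):
    """Return +1 (crime predicted) or -1 (no crime predicted)."""
    return 1 if decision_function(w, b, x) >= 0 else -1


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Subgradient descent on the L2-regularized hinge loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * decision_function(w, b, xi)
            # Shrink the weights (gradient of the regularization term).
            w = [wj - lr * lam * wj for wj in w]
            # Points inside the margin also contribute a hinge-loss step.
            if margin < 1:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


# Each row: (patrol_coverage, historical_crime_rate), both scaled to [0, 1].
X = [(0.9, 0.1), (0.8, 0.2), (0.7, 0.3),
     (0.2, 0.8), (0.1, 0.9), (0.3, 0.7)]
# Label +1: a crime occurred in the area; label -1: no crime occurred.
y = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(X, y)

new_area = (0.15, 0.85)  # low patrol coverage, high historical crime rate
print("decision value:", decision_function(w, b, new_area))
print("predicted label:", predict(w, b, new_area))
```

On real open data, a maintained library implementation and proper validation would be preferable to this hand-rolled loop, and the features would need scaling and careful review for bias before any operational use.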
