Architecting Enterprise-Scale Data Products: A Framework for Advanced Data Science and AI/ML Operations DOI Open Access

S Venkata

International Journal of Scientific Research in Computer Science Engineering and Information Technology, Journal Year: 2024, Volume and Issue: 10(6), P. 1724 - 1734

Published: Dec. 15, 2024

This article presents a comprehensive framework for building enterprise-scale data products that power modern Customer & Product Analytics, Data Science, artificial intelligence, and machine learning initiatives. The article examines the foundational architecture patterns, data pipeline engineering strategies, and advanced distributed computing approaches in both on-prem and cloud environments. These are essential for developing robust data infrastructure capable of handling complex AI/ML workflows. The article explores critical aspects of feature engineering at scale, real-time processing capabilities, and the implementation of feature stores, while addressing the challenges of data quality, governance, legal, and security in regulated environments. The article introduces a systematic approach to integrating data products with MLOps pipelines, emphasizing the importance of automated workflows, monitoring systems, and feedback loops in production environments. The findings demonstrate that the successful implementation of scalable data products requires a careful balance of architectural decisions, technology selection, and operational practices. The article contributes to the field by providing actionable insights and patterns that organizations can adopt to build resilient, scalable, and efficient data products for their AI/ML use cases. The framework it establishes bridges the gap between theoretical principles and practical implementation in enterprise settings.

Language: English

Optimizing healthcare big data performance through regional computing DOI Creative Commons
Tariq Alsahfi, Afzal Badshah, Omar Aboulola, et al.

Scientific Reports, Journal Year: 2025, Volume and Issue: 15(1)

Published: Jan. 24, 2025

The healthcare sector is experiencing a digital transformation propelled by the Internet of Medical Things (IOMT), real-time patient monitoring, robotic surgery, Electronic Health Records (EHR), medical imaging, and wearable technologies. This proliferation of digital tools generates vast quantities of healthcare data. Efficient and timely analysis of this data is critical for enhancing patient outcomes and optimizing care delivery. Real-time processing of Healthcare Big Data (HBD) offers significant potential for improved diagnostics, continuous monitoring, and effective surgical interventions. However, conventional cloud-based systems face challenges due to the sheer volume and time-sensitive nature of healthcare data. The migration of large datasets to centralized cloud infrastructures often results in latency, which impedes real-time applications. Furthermore, network congestion exacerbates these challenges, delaying access to the vital insights necessary for informed decision-making. Such limitations hinder healthcare professionals from fully leveraging the capabilities of emerging technologies and big data analytics. To mitigate these issues, this paper proposes a Regional Computing (RC) paradigm for the management of HBD. The RC framework establishes strategically positioned regional servers capable of regionally collecting, processing, and storing healthcare data, thereby reducing dependence on centralized cloud resources, especially during peak usage periods. This innovative approach effectively addresses the constraints of traditional cloud processing, facilitating real-time data analysis at the regional level. Ultimately, it empowers healthcare providers with the timely information required to deliver data-driven, personalized care and optimize treatment strategies.

Language: English

Citations

0

Stakeholder Interactions and Ethical Imperatives in Big Data and AI Development DOI Creative Commons
Jarosław Brodny, Magdalena Tutak

Journal of Open Innovation Technology Market and Complexity, Journal Year: 2025, Volume and Issue: unknown, P. 100491 - 100491

Published: Feb. 1, 2025

Language: English

Citations

0

Exploring Big Data Applications in Sustainable Urban Infrastructure: A Review DOI Creative Commons
David Victor Ogunkan, Stella Kehinde Ogunkan

Urban Governance, Journal Year: 2025, Volume and Issue: unknown

Published: Feb. 1, 2025

Language: English

Citations

0

Regional computing approach for educational big data DOI Creative Commons

Bader Alshemaimri, Afzal Badshah, Ali Daud, et al.

Scientific Reports, Journal Year: 2025, Volume and Issue: 15(1)

Published: March 4, 2025

The educational landscape is witnessing a transformation with the integration of Educational Technology (Edutech). As educational institutions adopt digital platforms and tools, the generation of Educational Big Data (EBD) has significantly increased. Research indicates that educational institutions produce massive data, including student enrollment records, academic performance metrics, attendance records, learning activities, and interactions within digital learning environments. This influx of data needs efficient processing to derive actionable insights and enhance the learning experience. Real-time data processing plays a critical part in educational environments to support various functions such as personalized learning, adaptive assessment, and administrative decision-making. However, there may be challenges in sending large amounts of data to cloud servers, i.e., latency, cost, and network congestion. These challenges make it more difficult to provide educators and students with timely services, which reduces the efficiency of educational activities. This paper proposes a Regional Computing (RC) paradigm designed specifically for big data management in education to address these issues. In this case, RC is established within educational regions and is intended to decentralize data processing. To reduce dependency on cloud infrastructure, regional servers are strategically located to collect, process, and store education-related data regionally. Our investigation results show that RC achieves a latency of 203.11 ms for 2,000 devices, compared with 707.1 ms for Cloud Computing (CC). It is also cost-efficient, with a total cost of just 1.14 USD versus 5.36 USD for the cloud. Furthermore, it avoids the 600% congestion surges seen in cloud setups and maintains consistent throughput under high workloads, establishing RC as an optimal solution for managing EBD.
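
The latency and cost figures quoted in this abstract can be turned into relative reductions with a few lines of arithmetic. The following is a minimal sketch, not part of the paper: the helper name `relative_reduction` is hypothetical, and it assumes the 1.14 and 5.36 figures are both total costs in USD, as the abstract suggests.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Return the drop from baseline to improved as a percentage of baseline."""
    return (baseline - improved) / baseline * 100.0


# Figures as quoted in the abstract (2,000 devices).
latency_cut = relative_reduction(707.1, 203.11)  # cloud vs. regional latency, ms
cost_cut = relative_reduction(5.36, 1.14)        # cloud vs. regional cost, USD (assumed unit)

print(f"latency reduction: {latency_cut:.1f}%")  # prints "latency reduction: 71.3%"
print(f"cost reduction: {cost_cut:.1f}%")        # prints "cost reduction: 78.7%"
```

On these numbers, the regional setup reports roughly 71% lower latency and 79% lower cost than the cloud baseline.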

Language: English

Citations

0

Statistical Reliability of Data‐Driven Science and Technology DOI Open Access
Ichiro Takeuchi

IEEJ Transactions on Electrical and Electronic Engineering, Journal Year: 2025, Volume and Issue: unknown

Published: Jan. 21, 2025

With the rapid development of AI and machine learning, the use of data‐driven approaches has been expanding across various fields of science and technology. In data‐driven approaches, unlike traditional scientific research and technological development, hypotheses are generated based on data, requiring consideration of data dependency when evaluating the reliability of hypotheses. As a result, conventional statistical tests, which have served as the foundation for reliability assessments in science and technology, are inadequate for properly evaluating data‐driven hypotheses. In this paper, we introduce a framework known as selective inference, which has gained attention as a reliability evaluation method for data‐driven approaches. We provide an overview of recent trends in selective inference and present our studies on statistical tests for deep learning models based on selective inference. © 2025 Institute of Electrical Engineers of Japan and Wiley Periodicals LLC.
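
The problem this abstract describes, that a conventional test becomes unreliable once the hypothesis has been chosen by looking at the data, can be illustrated with a small simulation. The sketch below is an assumption-laden illustration and not the author's method: it does not implement selective inference, only the failure mode that motivates it. The function name `naive_false_positive_rate` and all parameter values are hypothetical.

```python
import random
from statistics import NormalDist


def naive_false_positive_rate(n_hypotheses: int, n_trials: int,
                              alpha: float = 0.05, seed: int = 0) -> float:
    """Estimate how often a naive z-test rejects after data-driven selection.

    Every hypothesis is truly null, so any rejection is a false positive.
    """
    rng = random.Random(seed)
    # Two-sided cutoff for a standard normal statistic (about 1.96 at alpha = 0.05).
    threshold = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(n_trials):
        # Pure noise: each test statistic is drawn from N(0, 1).
        stats = [rng.gauss(0.0, 1.0) for _ in range(n_hypotheses)]
        # Data-driven step: keep the hypothesis that looks most extreme.
        selected = max(stats, key=abs)
        # Naive step: test it as though it had been fixed in advance.
        if abs(selected) > threshold:
            rejections += 1
    return rejections / n_trials


if __name__ == "__main__":
    print("fixed hypothesis:", naive_false_positive_rate(1, 5000))
    print("best of 20:      ", naive_false_positive_rate(20, 5000))
```

With a single pre-specified hypothesis the rejection rate stays near the nominal 5%. When the most extreme of 20 independent null statistics is selected first, the theoretical rate rises to 1 - 0.95^20, about 64%, which is the kind of selection effect a valid post-selection test has to correct for.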

Language: English

Citations

0

Combining Similarity-Based Correlation and Hierarchical Ascending Clustering for Small Files Problem in HDFS DOI

Hanène Chettaoui, Farah Hkiri

Lecture notes on data engineering and communications technologies, Journal Year: 2025, Volume and Issue: unknown, P. 234 - 244

Published: Jan. 1, 2025

Language: English

Citations

0

Architecting Enterprise-Scale Data Products: A Framework for Advanced Data Science and AI/ML Operations DOI Open Access

S Venkata

International Journal of Scientific Research in Computer Science Engineering and Information Technology, Journal Year: 2024, Volume and Issue: 10(6), P. 1724 - 1734

Published: Dec. 15, 2024

Language: English

Citations

1