https://read.qxmd.com/read/37115783/a-distributed-computing-model-for-big-data-anonymization-in-the-networks
#21
JOURNAL ARTICLE
Farough Ashkouti, Keyhan Khamforoosh
Recently, big data and its applications have grown sharply in various fields such as IoT, bioinformatics, eCommerce, and social media. The huge volume of data has posed enormous challenges to the architecture, infrastructure, and computing capacity of IT systems. Therefore, the scientific and industrial community has a compelling need for large-scale and robust computing systems. Since one of the characteristics of big data is value, data should be published for analysts to extract useful patterns from them. However, data publishing may lead to the disclosure of individuals' private information...
2023: PloS One
https://read.qxmd.com/read/36991663/big-data-analytics-using-cloud-computing-based-frameworks-for-power-management-systems-status-constraints-and-future-recommendations
#22
REVIEW
Ahmed Hadi Ali Al-Jumaili, Ravie Chandren Muniyandi, Mohammad Kamrul Hasan, Johnny Koh Siaw Paw, Mandeep Jit Singh
Traditional parallel computing for power management systems has prime challenges such as execution time, computational complexity, and efficiency like process time and delays in power system condition monitoring, particularly consumer power consumption, weather data, and power generation for detecting and predicting data mining in the centralized parallel processing and diagnosis. Due to these constraints, data management has become a critical research consideration and bottleneck. To cope with these constraints, cloud computing-based methodologies have been introduced for managing data efficiently in power management systems...
March 8, 2023: Sensors
https://read.qxmd.com/read/36793705/design-and-development-of-a-big-data-platform-for-disease-burden-based-on-the-spark-engine
#23
JOURNAL ARTICLE
Chengcheng Li, Jing Gao, Qingwei Pan, Zhihua Zhou, Yue Yang, Shangcheng Zhou
OBJECTIVE: This study attempts to build a big data platform for disease burden that can realize the deep coupling of artificial intelligence and public health. This is a highly open and shared intelligent platform, including big data collection, analysis, and result visualization. METHODS: Based on data mining theory and technology, the current situation of multisource data on disease burden was analyzed. Putting forward the disease burden big data management model, functional modules, and technical framework, Kafka technology is used to optimize the transmission efficiency of the underlying data...
2023: Computational Intelligence and Neuroscience
https://read.qxmd.com/read/36714440/data-science-technology-course-the-design-assessment-and-computing-environment-perspectives
#24
JOURNAL ARTICLE
Azlan Ismail, Sofianita Mutalib, Haryani Haron
This article discusses the key elements of the Data Science Technology course offered to postgraduate students enrolled in the Master of Data Science program. This course complements the existing curriculum by providing the skills to handle the Big Data platform and tools, in addition to data science activities. We tackle the discussion about this course based on three main requirements, which are related to the need to exploit the key skills from two dimensions, namely, Data Science and Big Data, and the need for a cluster-based computing platform and its accessibility...
January 24, 2023: Education and Information Technologies
https://read.qxmd.com/read/36656558/prediction-and-big-data-impact-analysis-of-telecom-churn-by-backpropagation-neural-network-algorithm-from-the-perspective-of-business-model
#25
JOURNAL ARTICLE
Jiabing Xu, Jiarui Liu, Tianen Yao, Yang Li
This study aims to transform the existing telecom operators from traditional Internet operators to digital-driven services, and improve the overall competitiveness of telecom enterprises. Data mining is applied to telecom user classification to process the existing telecom user data through data integration, cleaning, standardization, and transformation. Although the existing algorithms ensure the accuracy of the algorithm on the telecom user analysis platform under big data, they do not solve the limitations of single machine computing and cannot effectively improve the training efficiency of the model...
January 19, 2023: Big Data
https://read.qxmd.com/read/36616956/a-distributed-big-data-analytics-architecture-for-vehicle-sensor-data
#26
JOURNAL ARTICLE
Theodoros Alexakis, Nikolaos Peppes, Konstantinos Demestichas, Evgenia Adamopoulou
The unceasingly increasing needs for data acquisition, storage and analysis in transportation systems have led to the adoption of new technologies and methods in order to provide efficient and reliable solutions. Both highways and vehicles, nowadays, host a vast variety of sensors collecting different types of highly fluctuating data such as speed, acceleration, direction, and so on. From the vast volume and variety of these data emerges the need for the employment of big data techniques and analytics in the context of state-of-the-art intelligent transportation systems (ITS)...
December 29, 2022: Sensors
https://read.qxmd.com/read/36579056/disease-specific-data-processing-an-intelligent-digital-platform-for-diabetes-based-on-model-prediction-and-data-analysis-utilizing-big-data-technology
#27
JOURNAL ARTICLE
Xiangyong Kong, Ruiyang Peng, Huajie Dai, Yichi Li, Yanzhuan Lu, Xiaohan Sun, Bozhong Zheng, Yuze Wang, Zhiyun Zhao, Shaolin Liang, Min Xu
BACKGROUND: Artificial intelligence technology has become a mainstream trend in the development of medical informatization. Because of the complex structure and a large amount of medical data generated in the current medical informatization process, big data technology to assist doctors in scientific research and analysis and obtain high-value information has become indispensable for medical and scientific research. METHODS: This study aims to discuss the architecture of diabetes intelligent digital platform by analyzing existing data mining methods and platform building experience in the medical field, using a large data platform building technology utilizing the Hadoop system, model prediction, and data processing analysis methods based on the principles of statistics and machine learning...
2022: Frontiers in Public Health
https://read.qxmd.com/read/36515465/cloud-native-distributed-genomic-pileup-operations
#28
JOURNAL ARTICLE
Marek Wiewiórka, Agnieszka Szmurło, Paweł Stankiewicz, Tomasz Gambin
MOTIVATION: Pileup analysis is a building block of many bioinformatics pipelines, including variant calling and genotyping. This step tends to become a bottleneck of the entire assay since the straightforward pileup implementations involve processing of all base calls from all alignments sequentially. On the other hand, a distributed version of the algorithm faces the intrinsic challenge of splitting reads-oriented file formats into self-contained partitions to avoid costly data exchange between computational nodes...
December 14, 2022: Bioinformatics
https://read.qxmd.com/read/36477435/large-scale-digital-forensic-investigation-for-windows-registry-on-apache-spark
#29
JOURNAL ARTICLE
Jun-Ha Lee, Hyuk-Yoon Kwon
In this study, we investigate large-scale digital forensic investigation on Apache Spark using a Windows registry. Because the Windows registry depends on the system on which it operates, the existing forensic methods on the Windows registry have been targeted on the Windows registry in a single system. However, it is a critical issue to analyze large-scale registry data collected from several Windows systems because it allows us to detect suspiciously changed data by comparing the Windows registry in multiple systems...
2022: PloS One
https://read.qxmd.com/read/36389224/a-survey-of-data-element-perspective-application-of-artificial-intelligence-in-health-big-data
#30
JOURNAL ARTICLE
Honglin Xiong, Hongmin Chen, Li Xu, Hong Liu, Lumin Fan, Qifeng Tang, Hsunfang Cho
Artificial intelligence (AI) based on the perspective of data elements is widely used in the healthcare informatics domain. Large amounts of clinical data from electronic medical records (EMRs), electronic health records (EHRs), and electroencephalography records (EEGs) have been generated and collected at an unprecedented speed and scale. For instance, the new generation of wearable technologies enables easy-collecting peoples' daily health data such as blood pressure, blood glucose, and physiological data, as well as the application of EHRs documenting large amounts of patient data...
2022: Frontiers in Neuroscience
https://read.qxmd.com/read/36298077/towards-developing-a-robust-intrusion-detection-model-using-hadoop-spark-and-data-augmentation-for-iot-networks
#31
JOURNAL ARTICLE
Ricardo Alejandro Manzano Sanchez, Marzia Zaman, Nishith Goel, Kshirasagar Naik, Rohit Joshi
In recent years, anomaly detection and machine learning for intrusion detection systems have been used to detect anomalies on Internet of Things networks. These systems rely on machine and deep learning to improve the detection accuracy. However, the robustness of the model depends on the number of data samples available, the quality of the data, and the distribution of the data classes. In the present paper, we focused specifically on the amount of data and class imbalance, since both parameters are key in IoT due to the fact that network traffic is increasing exponentially...
October 12, 2022: Sensors
https://read.qxmd.com/read/36262137/fdup-a-framework-for-general-purpose-and-efficient-entity-deduplication-of-record-collections
#32
JOURNAL ARTICLE
Michele De Bonis, Paolo Manghi, Claudio Atzori
Deduplication is a technique aiming at identifying and resolving duplicate metadata records in a collection. This article describes FDup (Flat Collections Deduper), a general-purpose software framework supporting a complete deduplication workflow to manage big data record collections: metadata record data model definition, identification of candidate duplicates, identification of duplicates. FDup brings two main innovations: first, it delivers a full deduplication framework in a single easy-to-use software package based on Apache Spark Hadoop framework, where developers can customize the optimal and parallel workflow steps of blocking, sliding windows, and similarity matching function via an intuitive configuration file; second, it introduces a novel approach to improve performance, beyond the known techniques of "blocking" and "sliding window", by introducing a smart similarity matching function T-match...
2022: PeerJ. Computer Science
https://read.qxmd.com/read/36248928/load-balancing-algorithms-for-hadoop-cluster-in-unbalanced-environment
#33
JOURNAL ARTICLE
Weiyu Fu, Lixia Wang
Because the cluster load should be prebalanced during job scheduling rather than remedied once it is seriously unbalanced, this paper analyzes the task scheduling flow of the Hadoop cluster in depth. On the Hadoop platform, a self-dividing algorithm is proposed for load balancing. An intelligent optimization algorithm is used to solve load balance. A dynamic feedback load balancing scheduling method is proposed from the point of view of task scheduling. In order to solve the shortcoming of the fair scheduling algorithm, this paper proposes two ways to improve the resource utilization and overall performance of Hadoop...
2022: Computational Intelligence and Neuroscience
https://read.qxmd.com/read/36203727/design-of-cross-platform-information-retrieval-system-of-library-based-on-digital-twins
#34
JOURNAL ARTICLE
Shanshan Shang, Zikai Yu, Kun Jiao, Yingshi Huang, Hua Guo, Guozhong Wang
In order to improve the library's ability of cross-platform information retrieval and data scheduling and distribution, a library cross-platform information retrieval system based on digital twin technology is designed. Using data warehouse decision support and data source structured query methods, the spectral characteristics of Library cross-platform information resources are extracted. Using the method of Hadoop data parallel loading, the library cross-platform operation data is divided into decision-making data, computing resource pool data, and Hadoop parallel loading data...
2022: Computational Intelligence and Neuroscience
https://read.qxmd.com/read/36200085/analysis-of-the-correlation-between-football-education-environment-and-students-psychology-health-based-on-gauss-characteristics
#35
JOURNAL ARTICLE
Shu Qiao, Gaosong Huang
Campus football has become a core content of school physical education. Through football education, we can cultivate students' sound personality and promote students' all-round physical and mental development. At the same time, through psychological skills training methods, we can enrich the educational methods of football skills and provide theoretical reference for promoting educational reform. On the basis of Gaussian features, this paper combines the mixed Gaussian feature model to further describe the relationship between football education and students' psychology...
2022: Journal of Environmental and Public Health
https://read.qxmd.com/read/36148405/an-analysis-of-the-effects-of-the-english-language-and-literature-on-students-language-ability-from-a-multidimensional-environment
#36
JOURNAL ARTICLE
Weifang Chen
One of the most crucial components of a student's language proficiency is basic language proficiency, which is also its fundamental component. The development of students' language skills is greatly aided by ELL (English Language and Literature). It can not only foster the growth of students' language thinking but also widen their perspectives and enhance their capacity for language comprehension. In this essay, the rules of English are examined from the multifaceted ELL viewpoint. This study extracts personality characteristic data from practical texts and incorporates it into a modelling process of students' knowledge changes based on DM- (data mining-) related technology and multidisciplinary expertise...
2022: Journal of Environmental and Public Health
https://read.qxmd.com/read/36124116/cloud-based-english-multimedia-for-universities-test-questions-modeling-and-applications
#37
JOURNAL ARTICLE
Yanping Wu, Changlong Zheng, Lele Xie, Meihui Hao
This study constructs a cloud computing-based college English multimedia test question modeling and application through an in-depth study of cloud computing and college English multimedia test questions. The emergence of cloud computing technology undoubtedly provides a new and ideal method to solve test data and paper management problems. This study analyzes the advantages of the Hadoop computing platform and MapReduce computing model and builds a distributed computing platform based on Hadoop using universities' existing hardware and software resources...
2022: Computational Intelligence and Neuroscience
https://read.qxmd.com/read/36120695/individual-online-learning-behavior-analysis-based-on-hadoop
#38
JOURNAL ARTICLE
Ning Xiang
Online individual behavior analysis is an important means of mining user interests. User retweeting behavior prediction is a typical problem in online individual behavior analysis. In order to make the online learning behavior prediction method more suitable for large-scale datasets, the improved condensed K nearest neighbor (ICKNN) method is proposed in this paper. Inspired by the idea of compressing samples in the condensed nearest neighbor (CNN) algorithm, the proposed method adopts the Hadoop platform to parallelize the traditional CNN algorithm...
2022: Computational Intelligence and Neuroscience
https://read.qxmd.com/read/36111065/comparative-analysis-of-chinese-culture-and-hong-kong-macao-and-taiwan-culture-in-the-field-of-public-health-based-on-the-cnn-model
#39
JOURNAL ARTICLE
Hui Xiong
In view of the defect of a large amount of information on cultural resources and poor recommendation effect on a standalone platform, a cultural recommendation system based on the Hadoop platform was proposed, combined with the convolutional neural network (CNN). It aims to improve the adaptability of Chinese culture and Hong Kong, Macao, and Taiwan culture. Firstly, the CNN is used to encode the collected information deeply and map it to the deep feature space. Secondly, the attention mechanism is used to focus the coded features in the deep feature space to improve the classification ability of features...
2022: Journal of Environmental and Public Health
https://read.qxmd.com/read/36081755/tcm-constitution-analysis-method-based-on-parallel-fp-growth-algorithm-in-hadoop-framework
#40
JOURNAL ARTICLE
Mingzheng Li, Xiaojuan Lv, Ye Liu, Lin Wang, Jianqiang Song
This work is devoted to establishing a comparatively accurate classification model between symptoms, constitutions, and regimens for traditional Chinese medicine (TCM) constitution analysis to provide preliminary screening and decision support for clinical diagnosis. However, for the analysis of massive distributed medical data in a cloud platform, traditional data mining methods suffer from low mining efficiency, large memory consumption, and long tuning time. An association rules method for TCM constitution analysis (ARA-TCM) is therefore proposed, based on the FP-growth algorithm and the open-source distributed file system in the Hadoop framework (HDFS), to make full use of its powerful parallel processing capability...
2022: Journal of Healthcare Engineering