
Statistical Applications in Genetics and Molecular Biology

https://read.qxmd.com/read/38366619/a-global-test-of-hybrid-ancestry-from-genome-scale-data
#1
JOURNAL ARTICLE
Md Rejuan Haque, Laura Kubatko
Methods based on the multi-species coalescent have been widely used in phylogenetic tree estimation using genome-scale DNA sequence data to understand the underlying evolutionary relationship between the sampled species. Evolutionary processes such as hybridization, which creates new species through interbreeding between two different species, necessitate inferring a species network instead of a species tree. A species tree is strictly bifurcating and thus fails to incorporate hybridization events which require an internal node of degree three...
January 1, 2024: Statistical Applications in Genetics and Molecular Biology
https://read.qxmd.com/read/38363177/integrative-pathway-analysis-with-gene-expression-mirna-methylation-and-copy-number-variation-for-breast-cancer-subtypes
#2
JOURNAL ARTICLE
Henry Linder, Yuping Zhang, Yunqi Wang, Zhengqing Ouyang
Developments in biotechnologies enable multi-platform data collection for functional genomic units apart from the gene. Profiling of non-coding microRNAs (miRNAs) is a valuable tool for understanding the molecular profile of the cell, both for canonical functions and malignant behavior due to complex diseases. We propose a graphical mixed-effects statistical model incorporating miRNA-gene target relationships. We implement an integrative pathway analysis that leverages measurements of miRNA activity for joint analysis with multimodal observations of gene activity including gene expression, methylation, and copy number variation...
January 1, 2024: Statistical Applications in Genetics and Molecular Biology
https://read.qxmd.com/read/38235525/bayesian-lasso-for-population-stratification-correction-in-rare-haplotype-association-studies
#3
JOURNAL ARTICLE
Zilu Liu, Asuman Seda Turkmen, Shili Lin
Population stratification (PS) is one major source of confounding in both single nucleotide polymorphism (SNP) and haplotype association studies. To address PS, principal component regression (PCR) and linear mixed model (LMM) are the current standards for SNP associations, which are also commonly borrowed for haplotype studies. However, the underfitting and overfitting problems introduced by PCR and LMM, respectively, have yet to be addressed. Furthermore, there have been only a few theoretical approaches proposed to address PS specifically for haplotypes...
January 1, 2024: Statistical Applications in Genetics and Molecular Biology
https://read.qxmd.com/read/38073574/when-is-the-allele-sharing-dissimilarity-between-two-populations-exceeded-by-the-allele-sharing-dissimilarity-of-a-population-with-itself
#4
JOURNAL ARTICLE
Xiran Liu, Zarif Ahsan, Tarun K Martheswaran, Noah A Rosenberg
Allele-sharing statistics for a genetic locus measure the dissimilarity between two populations as a mean of the dissimilarity between random pairs of individuals, one from each population. Owing to within-population variation in genotype, allele-sharing dissimilarities can have the property that they have a nonzero value when computed between a population and itself. We consider the mathematical properties of allele-sharing dissimilarities in a pair of populations, treating the allele frequencies in the two populations parametrically...
January 1, 2023: Statistical Applications in Genetics and Molecular Biology
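The property described in this abstract, that a population can have a nonzero allele-sharing dissimilarity with itself, can be illustrated with a small sketch. The per-pair measure used here (one minus half the number of alleles two diploid genotypes share) and the choice to average over all ordered pairs, including an individual paired with itself, are assumptions made for illustration; the article's exact definitions may differ. The function names are hypothetical.

```python
from collections import Counter
from itertools import product


def pair_dissimilarity(genotype_1, genotype_2):
    """Assumed per-pair measure: 1 - (shared alleles) / 2 for diploid genotypes."""
    # Multiset intersection counts how many alleles the two genotypes share (0, 1, or 2).
    shared = sum((Counter(genotype_1) & Counter(genotype_2)).values())
    return 1 - shared / 2


def allele_sharing_dissimilarity(population_a, population_b):
    """Mean per-pair dissimilarity over all pairs, one individual from each population."""
    pairs = list(product(population_a, population_b))
    return sum(pair_dissimilarity(g1, g2) for g1, g2 in pairs) / len(pairs)


# A polymorphic population compared with itself gives a positive value,
# because different individuals carry different genotypes.
polymorphic = [("A", "A"), ("A", "a"), ("a", "a")]
self_dissimilarity = allele_sharing_dissimilarity(polymorphic, polymorphic)
```

Under these assumptions, a population in which every individual has the same homozygous genotype has dissimilarity zero with itself, while any within-population genotype variation makes the value positive.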
https://read.qxmd.com/read/38015771/mediation-analysis-method-review-of-high-throughput-data
#5
REVIEW
Qiang Han, Yu Wang, Na Sun, Jiadong Chu, Wei Hu, Yueping Shen
High-throughput technologies have made high-dimensional settings increasingly common, providing opportunities for the development of high-dimensional mediation methods. We aimed to provide useful guidance for researchers using high-dimensional mediation analysis and ideas for biostatisticians to develop it by summarizing and discussing recent advances in high-dimensional mediation analysis. The method still faces many challenges when single and multiple mediation analyses are extended to high-dimensional settings...
January 1, 2023: Statistical Applications in Genetics and Molecular Biology
https://read.qxmd.com/read/37991399/patterns-of-differential-expression-by-association-in-omic-data-using-a-new-measure-based-on-ensemble-learning
#6
JOURNAL ARTICLE
Jorge M Arevalillo, Raquel Martin-Arevalillo
The ongoing development of high-throughput technologies is allowing the simultaneous monitoring of the expression levels for hundreds or thousands of biological inputs with the proliferation of what has been coined as omic data sources. One relevant issue when analyzing such data sources is concerned with the detection of differential expression across two experimental conditions, clinical status or two classes of a biological outcome. While a great deal of univariate data analysis approaches have been developed to address the issue, strategies for assessing interaction patterns of differential expression are scarce in the literature and have been limited to ad hoc solutions...
January 1, 2023: Statistical Applications in Genetics and Molecular Biology
https://read.qxmd.com/read/37988745/integrated-regulatory-and-metabolic-networks-of-the-tumor-microenvironment-for-therapeutic-target-prioritization
#7
JOURNAL ARTICLE
Tiange Shi, Han Yu, Rachael Hageman Blair
Translation of genomic discovery, such as single-cell sequencing data, to clinical decisions remains a longstanding bottleneck in the field. Meanwhile, computational systems biological models, such as cellular metabolism models and cell signaling pathways, have emerged as powerful approaches to provide efficient predictions in metabolites and gene expression levels, respectively. However, there has been limited research on the integration between these two models. This work develops a methodology for integrating computational models of probabilistic gene regulatory networks with a constraint-based metabolism model...
January 1, 2023: Statistical Applications in Genetics and Molecular Biology
https://read.qxmd.com/read/37937887/randomized-singular-value-decomposition-for-integrative-subtype-analysis-of-omics-data-using-non-negative-matrix-factorization
#8
JOURNAL ARTICLE
Yonghui Ni, Jianghua He, Prabhakar Chalise
Integration of multiple 'omics datasets for differentiating cancer subtypes is a powerful technique that leverages the consistent and complementary information across multi-omics data. Matrix factorization is a common technique used in integrative clustering for identifying latent subtype structure across multi-omics data. High dimensionality of the omics data and long computation time have been common challenges of clustering methods. In order to address the challenges, we propose randomized singular value decomposition (RSVD) for integrative clustering using Non-negative Matrix Factorization: intNMF-rsvd...
January 1, 2023: Statistical Applications in Genetics and Molecular Biology
https://read.qxmd.com/read/37658681/a-novel-hybrid-cnn-and-bigru-attention-based-deep-learning-model-for-protein-function-prediction
#9
JOURNAL ARTICLE
Lavkush Sharma, Akshay Deepak, Ashish Ranjan, Gopalakrishnan Krishnasamy
Proteins are the building blocks of all living things. Protein function must be ascertained if the molecular mechanism of life is to be understood. While CNN is good at capturing short-term relationships, GRU and LSTM can capture long-term dependencies. A hybrid approach that combines the complementary benefits of these deep-learning models motivates our work. Protein Language models, which use attention networks to gather meaningful data and build representations for proteins, have seen tremendous success in recent years processing the protein sequences...
January 1, 2023: Statistical Applications in Genetics and Molecular Biology
https://read.qxmd.com/read/37622330/accurate-and-fast-small-p-value-estimation-for-permutation-tests-in-high-throughput-genomic-data-analysis-with-the-cross-entropy-method
#10
JOURNAL ARTICLE
Yang Shi, Weiping Shi, Mengqiao Wang, Ji-Hyun Lee, Huining Kang, Hui Jiang
Permutation tests are widely used for statistical hypothesis testing when the sampling distribution of the test statistic under the null hypothesis is analytically intractable or unreliable due to finite sample sizes. One critical challenge in the application of permutation tests in genomic studies is that an enormous number of permutations are often needed to obtain reliable estimates of very small p-values, leading to intensive computational effort. To address this issue, we develop algorithms for the accurate and efficient estimation of small p-values in permutation tests for paired and independent two-group genomic data, and our approaches leverage a novel framework for parameterizing the permutation sample spaces of those two types of data respectively using the Bernoulli and conditional Bernoulli distributions, combined with the cross-entropy method...
January 1, 2023: Statistical Applications in Genetics and Molecular Biology
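To make the computational challenge in this abstract concrete, here is a minimal sketch of a plain Monte Carlo permutation test for two independent groups. It is the naive baseline, not the cross-entropy method the article develops. The function name, the test statistic (absolute difference in group means), and the defaults are assumptions chosen for illustration.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Naive Monte Carlo permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)

    def statistic(values):
        # Absolute difference between the mean of the first n_a values and the rest.
        first, rest = values[:n_a], values[n_a:]
        return abs(sum(first) / len(first) - sum(rest) / len(rest))

    observed = statistic(pooled)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random relabelling of the pooled observations
        if statistic(pooled) >= observed:
            at_least_as_extreme += 1

    # Adding 1 to numerator and denominator keeps the estimate strictly positive.
    return (at_least_as_extreme + 1) / (n_permutations + 1)
```

With this estimator the smallest reportable p-value is 1 / (n_permutations + 1), so resolving very small p-values requires a correspondingly large number of permutations. That cost is the problem the abstract sets out to address.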
https://read.qxmd.com/read/37592851/cat-petr-a-graphical-user-interface-for-differential-analysis-of-phosphorylation-and-expression-data
#11
JOURNAL ARTICLE
Keegan Flanagan, Steven Pelech, Yossef Av-Gay, Khanh Dao Duc
Antibody microarray data provides a powerful and high-throughput tool to monitor global changes in cellular response to perturbation or genetic manipulation. However, while collecting such data has become increasingly accessible, a lack of specific computational tools has made their analysis limited. Here we present CAT PETR, a user friendly web application for the differential analysis of expression and phosphorylation data collected via antibody microarrays. Our application addresses the limitations of other GUI based tools by providing various data input options and visualizations...
January 1, 2023: Statistical Applications in Genetics and Molecular Biology
https://read.qxmd.com/read/37489035/improving-the-accuracy-and-internal-consistency-of-regression-based-clustering-of-high-dimensional-datasets
#12
JOURNAL ARTICLE
Bo Zhang, Jianghua He, Jinxiang Hu, Prabhakar Chalise, Devin C Koestler
Component-wise Sparse Mixture Regression (CSMR) is a recently proposed regression-based clustering method that shows promise in detecting heterogeneous relationships between molecular markers and a continuous phenotype of interest. However, CSMR can yield inconsistent results when applied to high-dimensional molecular data, which we hypothesize is in part due to inherent limitations associated with the feature selection method used in the CSMR algorithm. To assess this hypothesis, we explored whether substituting different regularized regression methods (i...
January 1, 2023: Statistical Applications in Genetics and Molecular Biology
https://read.qxmd.com/read/37082815/a-bayesian-model-to-identify-multiple-expression-patterns-with-simultaneous-fdr-control-for-a-multi-factor-rna-seq-experiment
#13
JOURNAL ARTICLE
Yuanyuan Bian, Chong He, Jing Qiu
It is often of research interest to identify genes that satisfy a particular expression pattern across different conditions such as tissues, genotypes, etc. One common practice is to perform differential expression analysis for each condition separately and then take the intersection of differentially expressed (DE) genes or non-DE genes under each condition to obtain genes that satisfy a particular pattern. Such a method can lead to many false positives, especially when the desired gene expression pattern involves equivalent expression under one condition...
January 1, 2023: Statistical Applications in Genetics and Molecular Biology
https://read.qxmd.com/read/36724206/a-fast-and-efficient-approach-for-gene-based-association-studies-of-ordinal-phenotypes
#14
JOURNAL ARTICLE
Nanxing Li, Lili Chen, Yajing Zhou, Qianran Wei
Many human disease conditions need to be measured by ordinal phenotypes, so analysis of ordinal phenotypes is valuable in genome-wide association studies (GWAS). However, existing association methods for dichotomous or quantitative phenotypes are not appropriate to ordinal phenotypes. Therefore, based on an aggregated Cauchy association test, we propose a fast and efficient association method to test the association between genetic variants and an ordinal phenotype. To enrich association signals of rare variants, we first use the burden method to aggregate rare variants...
January 1, 2023: Statistical Applications in Genetics and Molecular Biology
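The abstract builds on an aggregated Cauchy association test. As background, the sketch below shows the generic Cauchy combination rule for merging p-values: each p-value is transformed to a standard Cauchy variate, the variates are averaged with non-negative weights, and the result is mapped back to a p-value. This is the general combination rule only, not the article's ordinal-phenotype procedure, and the function name and equal default weights are assumptions.

```python
import math


def cauchy_combine(p_values, weights=None):
    """Combine p-values with the Cauchy combination rule (weights are normalised to sum to 1)."""
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not 0.0 < p < 1.0 for p in p_values):
        raise ValueError("p-values must lie strictly between 0 and 1")
    if weights is None:
        weights = [1.0] * len(p_values)  # assumed default: equal weights
    total_weight = sum(weights)

    # T = sum_i w_i * tan((0.5 - p_i) * pi), with the w_i summing to 1.
    statistic = sum(
        (w / total_weight) * math.tan((0.5 - p) * math.pi)
        for w, p in zip(weights, p_values)
    )
    # Upper tail of the standard Cauchy distribution evaluated at T.
    return 0.5 - math.atan(statistic) / math.pi
```

A single p-value is returned unchanged by this rule, which is a quick sanity check on the transformation.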
https://read.qxmd.com/read/35918809/distinct-characteristics-of-correlation-analysis-at-the-single-cell-and-the-population-level
#15
JOURNAL ARTICLE
Guoyu Wu, Yuchao Li
Correlation analysis is widely used in biological studies to infer molecular relationships within biological networks. Recently, single-cell analysis has drawn tremendous interest for its ability to obtain high-resolution molecular phenotypes. It turns out that there is little overlap between the co-expressed genes identified in single-cell level investigations and those identified in population level investigations. However, the nature of the relationship of correlations between single-cell and population levels remains unclear...
August 2, 2022: Statistical Applications in Genetics and Molecular Biology
https://read.qxmd.com/read/35848211/use-of-svm-based-ensemble-feature-selection-method-for-gene-expression-data-analysis
#16
JOURNAL ARTICLE
Shizhi Zhang, Mingjin Zhang
Gene selection is one of the key steps for gene expression data analysis. An SVM-based ensemble feature selection method is proposed in this paper. First, the method builds many subsets by using Monte Carlo sampling. Second, it ranks all the features on each of the subsets and integrates the rankings to obtain a final ranking list. Finally, the optimum feature set is determined by a backward feature elimination strategy. This method is applied to the analysis of 4 public datasets: the Leukemia, Prostate, Colorectal, and SMK_CAN, resulting in 7, 10, 13, and 32 features...
July 14, 2022: Statistical Applications in Genetics and Molecular Biology
https://read.qxmd.com/read/35848210/a-robust-association-test-with-multiple-genetic-variants-and-covariates
#17
JOURNAL ARTICLE
Jen-Yu Lee, Pao-Sheng Shen, Kuang-Fu Cheng
Due to the advancement of genome sequencing techniques, a great stride has been made in exome sequencing such that the association study between disease and genetic variants has become feasible. Some powerful and well-known association tests have been proposed to test the association between a group of genes and the disease of interest. However, some challenges still remain, in particular, many factors can affect the performance of testing power, e.g., the sample size, the number of causal and non-causal variants, and direction of the effect of causal variants...
June 6, 2022: Statistical Applications in Genetics and Molecular Biology
https://read.qxmd.com/read/35634906/estimation-of-the-covariance-structure-from-snp-allele-frequencies
#18
JOURNAL ARTICLE
Jan van Waaij, Zilong Li, Carsten Wiuf
No abstract text is available yet for this article.
May 26, 2022: Statistical Applications in Genetics and Molecular Biology
https://read.qxmd.com/read/35073469/challenges-for-machine-learning-in-rna-protein-interaction-prediction
#19
REVIEW
Viplove Arora, Guido Sanguinetti
RNA-protein interactions have long been recognised as crucial regulators of gene expression. Recently, the development of scalable experimental techniques to measure these interactions has revolutionised the field, leading to the production of large-scale datasets which offer both opportunities and challenges for machine learning techniques. In this brief note, we will discuss some of the major stumbling blocks towards the use of machine learning in computational RNA biology, focusing specifically on the problem of predicting RNA-protein interactions from next-generation sequencing data...
May 2, 2022: Statistical Applications in Genetics and Molecular Biology
https://read.qxmd.com/read/35266368/gmeps-a-fast-and-efficient-likelihood-approach-for-genome-wide-mediation-analysis-under-extreme-phenotype-sequencing
#20
JOURNAL ARTICLE
Janaka S S Liyanage, Jeremie H Estepp, Kumar Srivastava, Yun Li, Motomi Mori, Guolian Kang
Due to many advantages such as higher statistical power of detecting the association of genetic variants in human disorders and cost saving, extreme phenotype sequencing (EPS) is a rapidly emerging study design in epidemiological and clinical studies investigating how genetic variations associate with complex phenotypes. However, the investigation of the mediation effect of genetic variants on phenotypes is strictly restrictive under the EPS design because existing methods cannot well accommodate the non-random extreme tails sampling process incurred by the EPS design...
March 11, 2022: Statistical Applications in Genetics and Molecular Biology