Locality sensitive hashing

https://www.readbyqxmd.com/read/29897871/deep-constrained-siamese-hash-coding-network-and-load-balanced-locality-sensitive-hashing-for-near-duplicate-image-detection
#1
Weiming Hu, Yabo Fan, Junliang Xing, Liang Sun, Zhaoquan Cai, Stephen Maybank
We construct a new efficient near duplicate image detection method using a hierarchical hash code learning neural network and load-balanced locality-sensitive hashing (LSH) indexing. We propose a deep constrained siamese hash coding neural network combined with deep feature learning. Our neural network is able to extract effective features for near duplicate image detection. The extracted features are used to construct an LSH-based index. We propose a load-balanced LSH method to produce load-balanced buckets in the hashing process...
September 2018: IEEE Transactions on Image Processing: a Publication of the IEEE Signal Processing Society
https://www.readbyqxmd.com/read/29788498/cellatlassearch-a-scalable-search-engine-for-single-cells
#2
Divyanshu Srivastava, Arvind Iyer, Vibhor Kumar, Debarka Sengupta
Owing to the advent of high throughput single cell transcriptomics, the past few years have seen exponential growth in the production of gene expression data. Recently, efforts have been made by various research groups to homogenize and store single cell expression data from a large number of studies. The true value of this ever increasing data deluge can be unlocked by making it searchable. To this end, we propose CellAtlasSearch, a novel search architecture for high dimensional expression data, which is massively parallel as well as light-weight, thus infinitely scalable...
May 21, 2018: Nucleic Acids Research
https://www.readbyqxmd.com/read/29692912/new-approaches-in-the-systematics-of-rickettsiae
#3
S N Shpynov, P-E Fournier, N N Pozdnichenko, A S Gumenuk, A A Skiba
The development of a formal order analysis (FOA) allowed constructing a classification of 49 genomes of Rickettsiaceae family representatives. Recently FOA has been extended with new tools ('Map of genes,' 'Matrix of similarity' and 'Locality-sensitive hashing') for a more in-depth study of the structure of rickettsial genomes. The new classification confirmed and supplemented the previously constructed one by determining the position of Rickettsia africae str. ESF-5, R. heilongjiangensis 054, R. monacensis str...
May 2018: New Microbes and New Infections
https://www.readbyqxmd.com/read/29679685/identifying-and-characterizing-highly-similar-notes-in-big-clinical-note-datasets
#4
Rodney A Gabriel, Tsung-Ting Kuo, Julian McAuley, Chun-Nan Hsu
BACKGROUND: Big clinical note datasets found in electronic health records (EHR) present substantial opportunities to train accurate statistical models that identify patterns in patient diagnosis and outcomes. However, near-to-exact duplication in note texts is a common issue in many clinical note datasets. We aimed to use a scalable algorithm to de-duplicate notes and further characterize the sources of duplication. METHODS: We use an approximation algorithm to minimize pairwise comparisons consisting of three phases: (1) Minhashing with Locality Sensitive Hashing; (2) a clustering method using tree-structured disjoint sets; and (3) classification of near-duplicates (exact copies, common machine output notes, or similar notes) via pairwise comparison of notes in each cluster...
June 2018: Journal of Biomedical Informatics
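The first two phases described in the abstract above (MinHash with LSH, then clustering with disjoint sets) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 5-word shingle size, the 64 hash functions, the 16 bands, and the use of salted BLAKE2b as the hash family are all assumptions chosen for illustration, and the third phase (pairwise classification within each cluster) is omitted.

```python
import hashlib
from collections import defaultdict


def shingles(text, k=5):
    """Return the set of overlapping k-word shingles of a text (k is an assumed value)."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def minhash_signature(items, num_hashes=64):
    """Keep the minimum hash value under each of num_hashes salted hash functions."""
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "little")  # a different salt gives a different hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(s.encode(), digest_size=8, salt=salt).digest(), "big")
            for s in items))
    return tuple(signature)


class DisjointSet:
    """Union-find with path halving, used to merge notes that share an LSH bucket."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def cluster_near_duplicates(notes, num_hashes=64, bands=16):
    """Group note indices whose MinHash signatures agree on at least one band."""
    rows = num_hashes // bands
    buckets = defaultdict(list)
    for idx, note in enumerate(notes):
        sig = minhash_signature(shingles(note), num_hashes)
        for b in range(bands):
            # Notes land in the same bucket only if the whole band matches.
            buckets[(b, sig[b * rows:(b + 1) * rows])].append(idx)

    sets = DisjointSet(len(notes))
    for members in buckets.values():
        for other in members[1:]:
            sets.union(members[0], other)

    clusters = defaultdict(list)
    for idx in range(len(notes)):
        clusters[sets.find(idx)].append(idx)
    return sorted(clusters.values())
```

Because candidates are only generated inside shared buckets, the expensive pairwise comparison is confined to each cluster rather than run across every pair of notes. More bands with fewer rows each make the filter more permissive; fewer bands with more rows make it stricter.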
https://www.readbyqxmd.com/read/29442040/lidocaine-sensitizes-the-cytotoxicity-of-5-fluorouacil-in-melanoma-cells-via-upregulation-of-microrna-493
#5
Yingbin Wang, Jianqin Xie, Wei Liu, Rongzhi Zhang, Shenghui Huang, Yanhong Xing
Lidocaine is a well-documented local anesthetic that has been reported to sensitize the cytotoxicity of cisplatin in cancer cells. However, little information is available concerning whether lidocaine sensitizes the cytotoxicity of 5-fluorouracil (5-FU) in melanoma cells. The study was aimed to explore the effects and mechanisms of lidocaine on the sensitivity to 5-FU in the melanoma cell line SK-MEL-2. Cell viability and apoptosis were analyzed after administration of different concentrations of lidocaine, 5-FU, or the combinations...
November 1, 2017: Die Pharmazie
https://www.readbyqxmd.com/read/29361178/dropclust-efficient-clustering-of-ultra-large-scrna-seq-data
#6
Debajyoti Sinha, Akhilesh Kumar, Himanshu Kumar, Sanghamitra Bandyopadhyay, Debarka Sengupta
Droplet based single cell transcriptomics has recently enabled parallel screening of tens of thousands of single cells. Clustering methods that scale for such high dimensional data without compromising accuracy are scarce. We exploit Locality Sensitive Hashing, an approximate nearest neighbour search technique to develop a de novo clustering algorithm for large-scale single cell data. On a number of real datasets, dropClust outperformed the existing best practice methods in terms of execution time, clustering accuracy and detectability of minor cell sub-types...
April 6, 2018: Nucleic Acids Research
https://www.readbyqxmd.com/read/29346410/an-evaluation-of-multi-probe-locality-sensitive-hashing-for-computing-similarities-over-web-scale-query-logs
#7
Graham Cormode, Anirban Dasgupta, Amit Goyal, Chi Hoon Lee
Many modern applications of AI such as web search, mobile browsing, image processing, and natural language processing rely on finding similar items from a large database of complex objects. Due to the very large scale of data involved (e.g., users' queries from commercial search engines), computing such near or nearest neighbors is a non-trivial task, as the computational cost grows significantly with the number of items. To address this challenge, we adopt Locality Sensitive Hashing (a.k.a. LSH) methods and evaluate four variants in a distributed computing environment (specifically, Hadoop)...
2018: PloS One
https://www.readbyqxmd.com/read/29342158/real-time-community-detection-in-full-social-networks-on-a-laptop
#8
Benjamin Paul Chamberlain, Josh Levy-Kramer, Clive Humby, Marc Peter Deisenroth
For a broad range of research and practical applications it is important to understand the allegiances, communities and structure of key players in society. One promising direction towards extracting this information is to exploit the rich relational data in digital social networks (the social graph). As global social networks (e.g., Facebook and Twitter) are very large, most approaches make use of distributed computing systems for this purpose. Distributing graph processing requires solving many difficult engineering problems, which has led some researchers to look at single-machine solutions that are faster and easier to maintain...
2018: PloS One
https://www.readbyqxmd.com/read/29260348/medical-image-retrieval-with-compact-binary-codes-generated-in-frequency-domain-using-highly-reactive-convolutional-features
#9
Jamil Ahmad, Khan Muhammad, Sung Wook Baik
Efficient retrieval of relevant medical cases using semantically similar medical images from large scale repositories can assist medical experts in timely decision making and diagnosis. However, the ever-increasing volume of images hinders the performance of image retrieval systems. Recently, features from deep convolutional neural networks (CNN) have yielded state-of-the-art performance in image retrieval. Further, locality sensitive hashing based approaches have become popular for their ability to allow efficient retrieval in large scale datasets...
December 19, 2017: Journal of Medical Systems
https://www.readbyqxmd.com/read/29123069/a-neural-algorithm-for-a-fundamental-computing-problem
#10
Sanjoy Dasgupta, Charles F Stevens, Saket Navlakha
Similarity search (for example, identifying similar images in a database or similar documents on the web) is a fundamental computing problem faced by large-scale information retrieval systems. We discovered that the fruit fly olfactory circuit solves this problem with a variant of a computer science algorithm (called locality-sensitive hashing). The fly circuit assigns similar neural activity patterns to similar odors, so that behaviors learned from one odor can be applied when a similar odor is experienced. The fly algorithm, however, uses three computational strategies that depart from traditional approaches...
November 10, 2017: Science
https://www.readbyqxmd.com/read/29060551/collision-frequency-locality-sensitive-hashing-for-prediction-of-critical-events
#11
Y Bryce Kim, Erik Hemberg, Una-May O'Reilly
We present a fast, efficient method to predict future critical events for a patient. The prediction method is based on retrieving and leveraging similar waveform trajectories from a large medical database. Locality-sensitive hashing (LSH), our theoretical foundation, is a model-free, sub-linear time, approximate search method enabling a fast retrieval of a nearest neighbor set for a given query. We propose a new variant of LSH, namely Collision Frequency LSH (CFLSH), to further improve the prediction accuracy without sacrificing any speed...
July 2017: Conference Proceedings: Annual International Conference of the IEEE Engineering in Medicine and Biology Society
https://www.readbyqxmd.com/read/29018478/ultrafast-comparison-of-personal-genomes-via-precomputed-genome-fingerprints
#12
Gustavo Glusman, Denise E Mauldin, Leroy E Hood, Max Robinson
We present an ultrafast method for comparing personal genomes. We transform the standard genome representation (lists of variants relative to a reference) into "genome fingerprints" via locality sensitive hashing. The resulting genome fingerprints can be meaningfully compared even when the input data were obtained using different sequencing technologies, processed using different pipelines, represented in different data formats and relative to different reference versions. Furthermore, genome fingerprints are robust to up to 30% missing data...
2017: Frontiers in Genetics
https://www.readbyqxmd.com/read/28981422/phenotype-prediction-from-metagenomic-data-using-clustering-and-assembly-with-multiple-instance-learning-camil
#13
Mohammad Arifur Rahman, Nathan LaPierre, Huzefa Rangwala
The recent advent of Metagenome Wide Association Studies (MGWAS) provides insight into the role of microbes on human health and disease. However, the studies present several computational challenges. In this paper we demonstrate a novel, efficient, and effective Multiple Instance Learning (MIL) based computational pipeline to predict patient phenotype from metagenomic data. MIL methods have the advantage that besides predicting the clinical phenotype, we can infer the instance level label or role of microbial sequence reads in the specific disease...
October 4, 2017: IEEE/ACM Transactions on Computational Biology and Bioinformatics
https://www.readbyqxmd.com/read/28953425/shared-nearest-neighbor-clustering-in-a-locality-sensitive-hashing-framework
#14
Sawsan Kanj, Thomas Brüls, Stéphane Gazut
We present a new algorithm to cluster high-dimensional sequence data and its application to the field of metagenomics, which aims at reconstructing individual genomes from a mixture of genomes sampled from an environmental site, without any prior knowledge of reference data (genomes) or the shape of clusters. Such problems typically cannot be solved directly with classical approaches seeking to estimate the density of clusters, for example, using the shared nearest neighbors (SNN) rule, due to the prohibitive size of contemporary sequence datasets...
February 2018: Journal of Computational Biology: a Journal of Computational Molecular Cell Biology
https://www.readbyqxmd.com/read/28771497/sinc-saliency-injected-neural-codes-for-representation-and-efficient-retrieval-of-medical-radiographs
#15
Jamil Ahmad, Muhammad Sajjad, Irfan Mehmood, Sung Wook Baik
Medical image collections contain a wealth of information which can assist radiologists and medical experts in diagnosis and disease detection for making well-informed decisions. However, this objective can only be realized if efficient access is provided to semantically relevant cases from the ever-growing medical image repositories. In this paper, we present an efficient method for representing medical images by incorporating visual saliency and deep features obtained from a fine-tuned convolutional neural network (CNN) pre-trained on natural images...
2017: PloS One
https://www.readbyqxmd.com/read/28508884/a-hybrid-cloud-read-aligner-based-on-minhash-and-kmer-voting-that-preserves-privacy
#16
Victoria Popic, Serafim Batzoglou
Low-cost clouds can alleviate the compute and storage burden of the genome sequencing data explosion. However, moving personal genome data analysis to the cloud can raise serious privacy concerns. Here, we devise a method named Balaur, a privacy preserving read mapper for hybrid clouds based on locality sensitive hashing and kmer voting. Balaur can securely outsource a substantial fraction of the computation to the public cloud, while being highly competitive in accuracy and speed with non-private state-of-the-art read aligners on short read data...
May 16, 2017: Nature Communications
https://www.readbyqxmd.com/read/28268827/stratified-locality-sensitive-hashing-for-accelerated-physiological-time-series-retrieval
#17
Yongwook Bryce Kim, Erik Hemberg, Una-May O'Reilly
We introduce stratified locality-sensitive hashing (SLSH) for retrieving similar physiological waveform time series. SLSH further accelerates the sublinear retrieval time obtained by the standard locality-sensitive hashing (LSH) method. The standard family of locality-sensitive hash functions is limited to provide only a single perspective on the data due to its one-to-one relationship to a distinct distance function for measuring similarity. SLSH incorporates multiple locality-sensitive hash families with various distance functions enabling it to examine the data with more diverse and refined perspectives...
August 2016: Conference Proceedings: Annual International Conference of the IEEE Engineering in Medicine and Biology Society
https://www.readbyqxmd.com/read/28268443/analysis-of-locality-sensitive-hashing-for-fast-critical-event-prediction-on-physiological-time-series
#18
Yongwook Bryce Kim, Una-May O'Reilly
We apply the sublinear time, scalable locality-sensitive hashing (LSH) and majority discrimination to the problem of predicting critical events based on physiological waveform time series. Compared to using the linear exhaustive k-nearest neighbor search, our proposed method vastly speeds up prediction time up to 25 times while sacrificing only 1% of accuracy when demonstrated on an arterial blood pressure dataset extracted from the MIMIC2 database. We compare two widely used variants of LSH, the bit sampling based (L1LSH) and the random projection based (E2LSH) methods to measure their direct impact on retrieval and prediction accuracy...
August 2016: Conference Proceedings: Annual International Conference of the IEEE Engineering in Medicine and Biology Society
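The random projection (E2LSH-style) variant named in the abstract above, combined with a majority vote over the retrieved neighbors, can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' code: the class name RandomProjectionLSH, its parameters (k hash functions per table, number of tables, bucket width w), and the toy labels are hypothetical. Each hash has the standard form h(x) = floor((a . x + b) / w), with a drawn from a Gaussian and b drawn uniformly from [0, w).

```python
import math
import random
from collections import Counter, defaultdict


class RandomProjectionLSH:
    """Several hash tables, each keyed by k concatenated random projection hashes."""

    def __init__(self, dim, k=4, tables=8, w=4.0, seed=0):
        rng = random.Random(seed)
        self.w = w
        # One list of (a, b) pairs per table: a is a Gaussian vector, b an offset in [0, w).
        self.funcs = [
            [([rng.gauss(0, 1) for _ in range(dim)], rng.uniform(0, w))
             for _ in range(k)]
            for _ in range(tables)
        ]
        self.tables = [defaultdict(list) for _ in range(tables)]

    def _key(self, t, x):
        # Nearby points tend to fall into the same width-w slot of each projection.
        return tuple(
            math.floor((sum(ai * xi for ai, xi in zip(a, x)) + b) / self.w)
            for a, b in self.funcs[t]
        )

    def add(self, x, label):
        for t, table in enumerate(self.tables):
            table[self._key(t, x)].append((tuple(x), label))

    def predict(self, x, n_neighbors=3):
        """Majority label among the nearest candidates sharing a bucket with x."""
        candidates = {}
        for t, table in enumerate(self.tables):
            for point, label in table.get(self._key(t, x), []):
                candidates[point] = label
        if not candidates:
            return None  # no bucket collision in any table
        nearest = sorted(candidates, key=lambda p: math.dist(p, x))[:n_neighbors]
        return Counter(candidates[p] for p in nearest).most_common(1)[0][0]
```

Only the points that collide with the query in at least one table are ranked by exact distance, which is where the speed-up over an exhaustive k-nearest neighbor scan comes from. The price is that a true neighbor can be missed when it collides in no table.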
https://www.readbyqxmd.com/read/28227023/stratified-locality-sensitive-hashing-for-accelerated-physiological-time-series-retrieval
#19
Yongwook Bryce Kim, Erik Hemberg, Una-May O'Reilly
We introduce stratified locality-sensitive hashing (SLSH) for retrieving similar physiological waveform time series. SLSH further accelerates the sublinear retrieval time obtained by the standard locality-sensitive hashing (LSH) method. The standard family of locality-sensitive hash functions is limited to provide only a single perspective on the data due to its one-to-one relationship to a distinct distance function for measuring similarity. SLSH incorporates multiple locality-sensitive hash families with various distance functions enabling it to examine the data with more diverse and refined perspectives...
August 2016: Conference Proceedings: Annual International Conference of the IEEE Engineering in Medicine and Biology Society
https://www.readbyqxmd.com/read/28226614/analysis-of-locality-sensitive-hashing-for-fast-critical-event-prediction-on-physiological-time-series
#20
Yongwook Bryce Kim, Una-May O'Reilly
We apply the sublinear time, scalable locality-sensitive hashing (LSH) and majority discrimination to the problem of predicting critical events based on physiological waveform time series. Compared to using the linear exhaustive k-nearest neighbor search, our proposed method vastly speeds up prediction time up to 25 times while sacrificing only 1% of accuracy when demonstrated on an arterial blood pressure dataset extracted from the MIMIC2 database. We compare two widely used variants of LSH, the bit sampling based (L1LSH) and the random projection based (E2LSH) methods to measure their direct impact on retrieval and prediction accuracy...
August 2016: Conference Proceedings: Annual International Conference of the IEEE Engineering in Medicine and Biology Society