IEEE Transactions on Pattern Analysis and Machine Intelligence

Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, Dimitris N Metaxas
Although Generative Adversarial Networks (GANs) have shown remarkable success in various tasks, they still face challenges in generating high-quality images. In this paper, we propose Stacked Generative Adversarial Networks (StackGANs) aimed at generating high-resolution photo-realistic images. First, we propose a two-stage generative adversarial network architecture, StackGAN-v1, for text-to-image synthesis. The Stage-I GAN sketches the primitive shape and colors of a scene based on a given text description, yielding low-resolution images...
July 16, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
Saeed Anwar, Cong Phuoc Huynh, Fatih Porikli
A fundamental problem in image deblurring is to reliably recover distinct spatial frequencies that have been suppressed by the blur kernel. To tackle this issue, existing image deblurring techniques often rely on generic image priors such as the sparsity of salient features including image gradients and edges. However, these priors only help recover part of the frequency spectrum, such as the frequencies near the high end. To this end, we pose the following specific questions: (i) Does any image class information offer an advantage over existing generic priors for image quality restoration? (ii) If a class-specific prior exists, how should it be encoded into a deblurring framework to recover attenuated image frequencies? Throughout this work, we devise a class-specific prior based on the band-pass filter responses and incorporate it into a deblurring strategy...
July 11, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
Tianfan Xue, Jiajun Wu, Katherine Bouman, William Freeman
We study the problem of synthesizing a number of likely future frames from a single input image. In contrast to traditional methods that have tackled this problem in a deterministic or non-parametric way, we propose to model future frames in a probabilistic manner. Our probabilistic model makes it possible for us to sample and synthesize many possible future frames from a single input image. To synthesize realistic movement of objects, we propose a novel network structure, namely a Cross Convolutional Network; this network encodes image and motion information as feature maps and convolutional kernels, respectively...
July 10, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
Jae-Pil Heo, Zhe Lin, Sung-Eui Yoon
Approximate K-nearest neighbor search is a fundamental problem in computer science. The problem is especially important for high-dimensional and large-scale data. Recently, many techniques encoding high-dimensional data to compact codes have been proposed. The product quantization and its variations that encode the cluster index in each subspace have been shown to provide impressive accuracy. In this paper, we explore a simple question: is it best to use all the bit-budget for encoding a cluster index? We have found that as data points are located farther away from the cluster centers, the error of estimated distance becomes larger...
July 5, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
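The observation in the Heo, Lin and Yoon abstract above, that distance estimates degrade for points far from their cluster centers, can be illustrated with a minimal sketch of product-quantization encoding and asymmetric distance estimation. The codebooks, the sample vectors, and the function names (`sq_dist`, `pq_encode`, `pq_estimated_sq_dist`) are hypothetical values chosen for illustration; this is plain product quantization, not the authors' proposed encoding.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def pq_encode(x, codebooks):
    """Return, for each subspace, the index of the centroid nearest to x's subvector."""
    d = len(x) // len(codebooks)  # dimensions per subspace
    code = []
    for s, centroids in enumerate(codebooks):
        sub = x[s * d:(s + 1) * d]
        dists = [sq_dist(sub, c) for c in centroids]
        code.append(dists.index(min(dists)))
    return code


def pq_estimated_sq_dist(query, code, codebooks):
    """Asymmetric estimate: compare the raw query against the stored centroids."""
    d = len(query) // len(codebooks)
    return sum(
        sq_dist(query[s * d:(s + 1) * d], codebooks[s][code[s]])
        for s in range(len(codebooks))
    )


# Hypothetical codebooks: two subspaces, two 2-D centroids each.
codebooks = [
    [[0.0, 0.0], [4.0, 4.0]],
    [[0.0, 0.0], [4.0, 4.0]],
]
x = [0.5, 0.0, 3.0, 4.0]      # second subvector lies 1 unit from its centroid
query = [0.0, 0.0, 0.0, 0.0]

code = pq_encode(x, codebooks)
estimated = pq_estimated_sq_dist(query, code, codebooks)
exact = sq_dist(query, x)
print(code, estimated, exact)
```

In this toy case the estimate and the exact squared distance differ, and the gap comes from the residual between each subvector and its assigned centroid, which is the quantity that a cluster index alone does not record.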
Zechao Li, Jinhui Tang, Tao Mei
In this work, we investigate the problem of learning knowledge from the massive community-contributed images with rich weakly-supervised context information, which can benefit multiple image understanding tasks simultaneously, such as social image tag refinement and assignment, content-based image retrieval, tag-based image retrieval and tag expansion. Towards this end, we propose a Deep Collaborative Embedding (DCE) model to uncover a unified latent space for images and tags. The proposed method incorporates the end-to-end learning and collaborative factor analysis in one unified framework for the optimal compatibility of representation learning and latent space discovery...
July 4, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
Leonid Blouvshtein, Daniel Cohen-Or
Multi-dimensional scaling (MDS) plays a central role in data-exploration, dimensionality reduction and visualization. State-of-the-art MDS algorithms are not robust to outliers, yielding significant errors in the embedding even when only a handful of outliers are present. In this paper, we introduce a technique to detect and filter outliers based on geometric reasoning. We test the validity of triangles formed by three points, and mark a triangle as broken if its triangle inequality does not hold. The premise of our work is that unlike inliers, outlier distances tend to break many triangles...
June 29, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
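The broken-triangle test described in the Blouvshtein and Cohen-Or abstract above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `broken_triangle_counts`, the tolerance `tol`, and the toy distance matrix are hypothetical, and the abstract does not say how the counts are thresholded or filtered, so that step is left out.

```python
from itertools import combinations


def broken_triangle_counts(dist, tol=1e-9):
    """For each pair (i, j) with i < j, count the triangles containing that
    edge whose triangle inequality fails."""
    n = len(dist)
    counts = {pair: 0 for pair in combinations(range(n), 2)}
    for i, j, k in combinations(range(n), 3):
        sides = (dist[i][j], dist[j][k], dist[i][k])
        longest = max(sides)
        # A triangle is broken if its longest side exceeds the sum of the other two.
        if longest > sum(sides) - longest + tol:
            for pair in ((i, j), (j, k), (i, k)):
                counts[pair] += 1
    return counts


# Toy data (assumption): four collinear points at positions 0, 1, 2, 3,
# with the distance between points 0 and 3 corrupted from 3 to 10.
positions = [0.0, 1.0, 2.0, 3.0]
dist = [[abs(a - b) for b in positions] for a in positions]
dist[0][3] = dist[3][0] = 10.0

counts = broken_triangle_counts(dist)
suspect = max(counts, key=counts.get)
print(suspect, counts[suspect])
```

The corrupted pair takes part in more broken triangles than any inlier pair, which matches the premise stated in the abstract.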
Brandon RichardWebster, Samuel Anthony, Walter Scheirer
By providing substantial amounts of data and standardized evaluation protocols, datasets in computer vision have helped fuel advances across all areas of visual recognition. But even in light of breakthrough results on recent benchmarks, it is still fair to ask if our recognition algorithms are doing as well as we think they are. The vision sciences at large make use of a very different evaluation regime known as Visual Psychophysics to study visual perception. Psychophysics is the quantitative examination of the relationships between controlled stimuli and the behavioral responses they elicit in experimental test subjects...
June 25, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
Ludovic Magerand, Alessio Del Bue
This paper presents a solution to the Projective Structure from Motion (PSfM) problem able to deal efficiently with missing data, outliers and, for the first time, large scale 3D reconstruction scenarios. By embedding the projective depths into the projective parameters of the points and views, we decrease the number of unknowns to estimate and improve computational speed by optimizing standard linear Least Squares systems instead of homogeneous ones. In order to do so, we show that an extension of the linear constraints from the Generalized Projective Reconstruction Theorem can be transferred to the projective parameters, ensuring also a valid projective reconstruction in the process...
June 25, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
Baoguang Shi, Mingkun Yang, Xinggang Wang, Pengyuan Lyu, Cong Yao, Xiang Bai
A challenging aspect of scene text recognition is to handle text with distortions or irregular layout. In particular, perspective text and curved text are common in natural scenes and are difficult to recognize. In this work, we introduce ASTER, an end-to-end neural network model that comprises a rectification network and a recognition network. The rectification network adaptively transforms an input image into a new one, rectifying the text in it. It is powered by a flexible Thin-Plate Spline transformation which handles a variety of text irregularities and is trained without human annotations...
June 25, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
Michael Opitz, Georg Waltner, Horst Possegger, Horst Bischof
Learning similarity functions between image pairs with deep neural networks yields highly correlated activations of embeddings. In this work, we show how to improve the robustness of such embeddings by exploiting the independence within ensembles. To this end, we divide the last embedding layer of a deep network into an embedding ensemble and formulate the task of training this ensemble as an online gradient boosting problem. Each learner receives a reweighted training sample from the previous learners. Further, we propose two loss functions which increase the diversity in our ensemble...
June 25, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
Chenping Hou, Ling-Li Zeng, Dewen Hu
With the evolution of data collection methods, it is possible to produce abundant data described by multiple feature sets. Previous studies show that including more features does not necessarily bring positive effects. How to prevent the augmented features from worsening classification performance is crucial but rarely studied. In this paper, we study this challenging problem by proposing a safe classification approach, whose accuracy never degrades when exploiting augmented features. We propose two ways to achieve the safeness of our method, named SAfe Classification (SAC)...
June 21, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
Yicong Tian, Afshin Dehghan, Mubarak Shah
In this work, we propose a tracker that differs from most existing multi-target trackers in two major ways. Firstly, our tracker does not rely on a pre-trained object detector to get the initial object hypotheses. Secondly, our tracker's final output is the fine contours of the targets rather than traditional bounding boxes. Therefore, our tracker simultaneously solves three main problems: detection, data association and segmentation. This is especially important because the outputs of those three problems are highly correlated and the solution of one can greatly help improve the others...
June 21, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
Dylan John Campbell, Lars Petersson, Laurent Kneip, Hongdong Li
Estimating the 6-DoF pose of a camera from a single image relative to a 3D point-set is an important task for many computer vision applications. Perspective-n-point solvers are routinely used for camera pose estimation, but are contingent on the provision of good quality 2D-3D correspondences. However, finding cross-modality correspondences between 2D image points and a 3D point-set is non-trivial, particularly when only geometric information is known. Existing approaches to the simultaneous pose and correspondence problem use local optimisation, and are therefore unlikely to find the optimal solution without a good pose initialisation, or introduce restrictive assumptions...
June 19, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
Federica Arrigoni, Andrea Fusiello
This paper provides a unifying view and offers new insights on bearing-based network localizability, that is, the problem of establishing whether a set of directions between pairs of nodes uniquely determines (up to translation and scale) the position of the nodes in d-space. If nodes represent cameras, then we are in the context of global structure from motion. The contribution of the paper is theoretical: first, we rewrite and link in a coherent structure several results that have been presented in different communities using disparate formalisms; second, we derive some new localizability results within the edge-based formulation...
June 18, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
Zheng Zhang, Li Liu, Fumin Shen, Heng Tao Shen, Ling Shao
Clustering is a long-standing and important research problem; however, it remains challenging when handling large-scale image data from diverse sources. In this paper, we present a novel Binary Multi-View Clustering (BMVC) framework, which can dexterously manipulate multi-view image data and easily scale to large data. To achieve this goal, we formulate BMVC with two key components: compact collaborative discrete representation learning and binary clustering structure learning, in a joint learning framework. Specifically, BMVC collaboratively encodes the multi-view image descriptors into a compact common binary code space by considering their complementary information; the collaborative binary representations are meanwhile clustered by a binary matrix factorization model, such that the cluster structures are optimized in the Hamming space by pure, extremely fast bit-operations...
June 18, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
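The bit-operation clustering in Hamming space mentioned in the Zhang et al. abstract above can be illustrated with a minimal sketch: codes are stored as integers, the Hamming distance is an XOR followed by a population count, and each code is assigned to its nearest binary center. The 4-bit codes, the two centers, and the function names are assumptions for illustration; the sketch does not implement BMVC's joint learning or its binary matrix factorization.

```python
def hamming(a, b):
    """Hamming distance between two binary codes stored as non-negative ints."""
    return bin(a ^ b).count("1")  # XOR marks differing bits, then count them


def assign_to_centers(codes, centers):
    """Assign each code to the index of its nearest center in Hamming space."""
    assignments = []
    for code in codes:
        dists = [hamming(code, c) for c in centers]
        assignments.append(dists.index(min(dists)))
    return assignments


# Hypothetical 4-bit codes and two binary cluster centers.
codes = [0b0000, 0b0001, 0b1110, 0b1111]
centers = [0b0000, 0b1111]

assignments = assign_to_centers(codes, centers)
print(assignments)
```

Because each comparison is an XOR plus a bit count, the assignment step needs no floating-point arithmetic, which is the property the abstract refers to as fast bit-operations.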
Gianluigi Pillonetto, Luca Schenato, Damiano Varagnolo
We consider the problem of distributedly estimating Gaussian processes in multi-agent frameworks. Each agent collects few measurements and aims to collaboratively reconstruct a common estimate based on all data. Agents are assumed with limited computational and communication capabilities and to gather noisy measurements in total on input locations independently drawn from a known common probability density. The optimal solution would require agents to exchange all the input locations and measurements and then invert an matrix, a non-scalable task...
June 18, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
Qin Zhang, Jia Wu, Peng Zhang, Guodong Long, Chengqi Zhang
Time series has been a popular research topic over the past decade. Salient subsequences of time series that can benefit the learning task, e.g. classification or clustering, are called shapelets. Shapelet-based time series learning extracts these types of salient subsequences with highly informative features from a time series. Most existing methods for shapelet discovery must scan a large pool of candidate subsequences, which is a time-consuming process. A recent work (Grabocka et al., KDD 2014) uses regression learning to discover shapelets in a time series; however, it only considers learning shapelets from labeled time series data...
June 15, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
Danhang Tang, Qi Ye, Jonathan Taylor, Shanxin Yuan, Pushmeet Kohli, Cem Keskin, Tae-Kyun Kim, Jamie Shotton
Hand pose estimation, formulated as an inverse problem, is typically optimized by an energy function over pose parameters using a 'black box' image generation procedure, knowing little about either the relationships between the parameters or the form of the energy function. In this paper, we show significant improvement upon the black box optimization by exploiting high-level knowledge of the parameter structure and using a local surrogate energy function. Our new framework, hierarchical sampling optimization (HSO), consists of a sequence of discriminative predictors organized into a kinematic hierarchy...
June 15, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
Linzhao Wang, Lijun Wang, Huchuan Lu, Pingping Zhang, Xiang Ruan
Deep networks have been shown to encode high-level features with semantic meaning and have delivered superior performance in salient object detection. In this paper, we take one step further by developing a new saliency detection method based on recurrent fully convolutional networks (RFCNs). Compared with existing deep network based methods, the proposed network is able to incorporate saliency prior knowledge for more accurate inference. In addition, the recurrent architecture enables our method to automatically learn to refine the saliency map by iteratively correcting its previous errors, yielding more reliable final predictions...
June 12, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
Filip Radenovic, Giorgos Tolias, Ondrej Chum
Image descriptors based on activations of Convolutional Neural Networks (CNNs) have become dominant in image retrieval due to their discriminative power, compactness of representation, and search efficiency. Training of CNNs, either from scratch or fine-tuning, requires a large amount of annotated data, where a high quality of annotation is often crucial. In this work, we propose to fine-tune CNNs for image retrieval on a large collection of unordered images in a fully automated manner. Reconstructed 3D models obtained by the state-of-the-art retrieval and structure-from-motion methods guide the selection of the training data...
June 12, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence