IEEE Transactions on Pattern Analysis and Machine Intelligence

Suren Vagharshakyan, Robert Bregovic, Atanas Gotchev
In this article we develop an image-based rendering technique based on light field reconstruction from a limited set of perspective views acquired by cameras. Our approach uses a sparse representation of epipolar-plane images (EPIs) in the shearlet transform domain. The shearlet transform has been specifically modified to handle the straight lines characteristic of EPIs. The devised iterative regularization algorithm based on adaptive thresholding provides high-quality reconstruction results for relatively large disparities between neighboring views...
January 16, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Wen-Yan Lin, Fan Wang, Ming-Ming Cheng, Sai-Kit Yeung, Philip H S Torr, Minh N Do, Jiangbo Lu
A key challenge in feature correspondence is the difficulty in differentiating true and false matches at a local descriptor level. This forces adoption of strict similarity thresholds that discard many true matches. However, if analyzed at a global level, false matches are usually randomly scattered while true matches tend to be coherent (clustered around a few dominant motions), thus creating a coherence based separability constraint. This paper proposes a non-linear regression technique that can discover such a coherence based separability constraint from highly noisy matches and embed it into a correspondence likelihood model...
January 16, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Lacey Best-Rowden, Anil K Jain
The two underlying premises of automatic face recognition are uniqueness and permanence. This paper investigates the permanence property by addressing the following: Does face recognition ability of state-of-the-art systems degrade with elapsed time between enrolled and query face images? If so, what is the rate of decline w.r.t. the elapsed time? While previous studies have reported degradations in accuracy, no formal statistical analysis of large-scale longitudinal data has been conducted. We conduct such an analysis on two mugshot databases, which are the largest facial aging databases studied to date in terms of number of subjects, images per subject, and elapsed times...
January 16, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Liang Lin, Keze Wang, Deyu Meng, Wangmeng Zuo, Lei Zhang
This paper aims to develop a novel cost-effective framework for face identification, which progressively maintains a batch of classifiers with the increasing face images of different individuals. By naturally combining two recently rising techniques: active learning (AL) and self-paced learning (SPL), our framework is capable of automatically annotating new instances and incorporating them into training under weak expert recertification. We first initialize the classifier using a few annotated samples for each individual, and extract image features using the convolutional neural nets...
January 16, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Shangwen Li, Sanjay Purushotham, Chen Chen, Yuzhuo Ren, C-C Jay Kuo
Textual data, such as tags and sentence descriptions, are combined with visual cues to reduce the semantic gap for image retrieval applications in today's Multimodal Image Retrieval (MIR) systems. However, all tags are treated as equally important in these systems, which may result in misalignment between visual and textual modalities during MIR training. This in turn degrades retrieval performance at query time. To address this issue, we investigate the problem of tag importance prediction, where the goal is to automatically predict the tag importance and use it in image retrieval...
January 11, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Zhouchen Lin, Chen Xu, Hongbin Zha
L1-norm based low-rank matrix factorization in the presence of missing data and outliers remains a hot topic in computer vision. Due to non-convexity and non-smoothness, all the existing methods either lack scalability or robustness, or have no theoretical guarantee on convergence. In this paper, we apply the Majorization Minimization technique to solve this problem. At each iteration, we upper bound the original function with a strongly convex surrogate. By minimizing the surrogate and updating the iterates accordingly, the objective function achieves sufficient decrease, a stronger guarantee than the mere non-increase that other methods offer...
January 11, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
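The Majorization Minimization idea named in the abstract above can be illustrated on a much smaller problem. The sketch below is not the authors' matrix factorization algorithm: it assumes a scalar toy objective, sum_i |x - a_i|, and uses the standard quadratic majorizer |r| <= r^2 / (2|r_k|) + |r_k| / 2, whose minimizer is a weighted mean. The function names and the `eps` safeguard are hypothetical choices made for this illustration.

```python
def l1_objective(x, data):
    """Sum of absolute deviations of x from the data points."""
    return sum(abs(x - a) for a in data)


def mm_l1_location(data, iters=50, eps=1e-9):
    """Minimize sum_i |x - a_i| by majorization-minimization (toy example).

    At the current iterate x_k, each |x - a_i| is upper bounded by a
    quadratic that touches it at x_k. Minimizing the sum of those
    quadratics gives a weighted mean with weights 1 / |x_k - a_i|.
    """
    x = sum(data) / len(data)           # start from the mean
    history = [l1_objective(x, data)]   # objective value per iterate
    for _ in range(iters):
        # eps keeps the weights finite when x lands on a data point
        weights = [1.0 / max(abs(x - a), eps) for a in data]
        x = sum(w * a for w, a in zip(weights, data)) / sum(weights)
        history.append(l1_objective(x, data))
    return x, history


if __name__ == "__main__":
    estimate, values = mm_l1_location([1.0, 2.0, 3.0, 4.0, 100.0])
    print(estimate, values[0], values[-1])
```

On this data the iterates move from the mean toward the median, which is the minimizer of the absolute-deviation objective, and the recorded objective values do not increase from one iterate to the next.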
Xiao-Tong Yuan, Qingshan Liu
We introduce a family of Newton-type greedy selection methods for ℓ0-constrained minimization problems. The basic idea is to construct a quadratic function to approximate the original objective function around the current iterate and solve the constructed quadratic program over the cardinality constraint. The next iterate is then estimated via a line search operation between the current iterate and the solution of the sparse quadratic program. This iterative procedure can be interpreted as an extension of the constrained Newton methods from convex minimization to non-convex ℓ0-constrained minimization...
January 11, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
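The quadratic subproblem described in the abstract above has a closed-form solution in one special case, which makes for a compact sketch. The code below assumes a diagonal, positive Hessian (an assumption made here for illustration, not something the abstract states) and omits the line search step entirely. Under that assumption, zeroing coordinate i raises the quadratic model by 0.5 * h_i * z_i^2 relative to its unconstrained per-coordinate minimizer z_i, so the best k-sparse solution keeps the k coordinates with the largest h_i * z_i^2. The function name `sparse_newton_step` is hypothetical.

```python
def sparse_newton_step(x, grad, hess_diag, k):
    """Solve the k-sparse quadratic model exactly for a diagonal Hessian.

    Model: q(y) = sum_i grad[i] * (y[i] - x[i])
                  + 0.5 * hess_diag[i] * (y[i] - x[i]) ** 2
    subject to y having at most k nonzero entries.
    Assumes every entry of hess_diag is strictly positive.
    """
    # Unconstrained Newton update for each coordinate
    z = [xi - gi / hi for xi, gi, hi in zip(x, grad, hess_diag)]
    # Extra model cost of forcing coordinate i to zero is 0.5 * h_i * z_i^2
    scores = [hi * zi * zi for hi, zi in zip(hess_diag, z)]
    # Keep the k coordinates that would be most costly to zero out
    order = sorted(range(len(z)), key=lambda i: scores[i], reverse=True)
    support = set(order[:k])
    return [z[i] if i in support else 0.0 for i in range(len(z))]


if __name__ == "__main__":
    # Separable objective f(x) = 0.5 * sum_i d_i * (x_i - c_i)^2 at x = 0,
    # with d = [1, 100, 1, 1] and c = [3, -0.5, 2, 0.1]
    d = [1.0, 100.0, 1.0, 1.0]
    c = [3.0, -0.5, 2.0, 0.1]
    x0 = [0.0, 0.0, 0.0, 0.0]
    g0 = [di * (xi - ci) for di, xi, ci in zip(d, x0, c)]
    print(sparse_newton_step(x0, g0, d, k=2))
```

Note that the selection is curvature-aware: in the usage example the second coordinate is kept even though its value is small in magnitude, because its large Hessian entry makes zeroing it expensive. With a general (non-diagonal) Hessian the cardinality-constrained quadratic program no longer decouples this way.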
Oren Freifeld, Soren Hauberg, Kayhan Batmanghelich, John W Fisher
We propose novel finite-dimensional spaces of well-behaved R^n → R^n transformations. The latter are obtained by (fast and highly-accurate) integration of continuous piecewise-affine velocity fields. The proposed method is simple yet highly expressive, effortlessly handles optional constraints (e.g., volume preservation and/or boundary conditions), and supports convenient modeling choices such as smoothing priors and coarse-to-fine analysis. Importantly, the proposed approach, partly due to its rapid likelihood evaluations and partly due to its other properties, facilitates tractable inference over rich transformation spaces, including using Markov-Chain Monte-Carlo methods...
January 11, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Lingqiao Liu, Peng Wang, Chunhua Shen, Lei Wang, Anton van den Hengel, Chao Wang, Heng Tao Shen
Deriving from the gradient vector of a generative model of local features, Fisher vector coding (FVC) has been identified as an effective coding method for image classification. Most, if not all, FVC implementations employ the Gaussian mixture model (GMM) as the generative model for local features. However, the representative power of a GMM can be limited because it essentially assumes that local features can be characterized by a fixed number of feature prototypes, and the number of prototypes is usually small in FVC...
January 10, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Vijay Badrinarayanan, Alex Kendall, Roberto Cipolla
We present a novel and practical deep fully convolutional neural network architecture for semantic pixel-wise segmentation termed SegNet. This core trainable segmentation engine consists of an encoder network, a corresponding decoder network followed by a pixel-wise classification layer. The architecture of the encoder network is topologically identical to the 13 convolutional layers in the VGG16 network [1]. The role of the decoder network is to map the low resolution encoder feature maps to full input resolution feature maps for pixel-wise classification...
January 2, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Baoguang Shi, Xiang Bai, Cong Yao
Image-based sequence recognition has been a long-standing research topic in computer vision. In this paper, we investigate the problem of scene text recognition, which is among the most important and challenging tasks in image-based sequence recognition. A novel neural network architecture, which integrates feature extraction, sequence modeling and transcription into a unified framework, is proposed. Compared with previous systems for scene text recognition, the proposed architecture possesses four distinctive properties: (1) It is end-to-end trainable, in contrast to most of the existing algorithms whose components are separately trained and tuned...
December 29, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence
Mohamed Elhoseiny, Ahmed Elgammal, Babak Saleh
People typically learn through exposure to visual concepts associated with linguistic descriptions. For instance, teaching visual object categories to children is often accompanied by descriptions in text or speech. In a machine learning context, these observations motivate us to ask whether this learning process could be computationally modeled to learn visual classifiers. More specifically, the main question of this work is how to utilize purely textual descriptions of visual classes, with no training images, to learn explicit visual classifiers for them...
December 29, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence
Feng Zheng, Yi Tang, Ling Shao
Recently, cross-modal search has attracted considerable attention but remains a very challenging task because of the integration complexity and heterogeneity of the multi-modal data. To address both challenges, in this paper, we propose a novel method termed hetero-manifold regularisation (HMR) to supervise the learning of hash functions for efficient cross-modal search. A hetero-manifold integrates multiple sub-manifolds defined by homogeneous data with the help of cross-modal supervision information. Taking advantage of the hetero-manifold, the similarity between each pair of heterogeneous data could be naturally measured by three order random walks on this hetero-manifold...
December 28, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence
Zhiyuan Shi, Yongxin Yang, Timothy Hospedales, Tao Xiang
We propose to model complex visual scenes using a non-parametric Bayesian model learned from weakly labelled images abundant on media sharing sites such as Flickr. Given weak image-level annotations of objects and attributes without locations or associations between them, our model aims to learn the appearance of object and attribute classes as well as their association on each object instance. Once learned, given an image, our model can be deployed to tackle a number of vision problems in a joint and coherent manner, including recognising objects in the scene (automatic object annotation), describing objects using their attributes (attribute prediction and association), and localising and delineating the objects (object detection and semantic segmentation)...
December 26, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence
Michael Clement, Adrien Poulenard, Camille Kurtz, Laurent Wendling
The analysis of spatial relations between objects in digital images plays a crucial role in various application domains related to pattern recognition and computer vision. Classical models for the evaluation of such relations are usually sufficient for the handling of simple objects, but can lead to ambiguous results in more complex situations. In this article, we investigate the modeling of spatial configurations where the objects can be imbricated in each other. We formalize this notion with the term enlacement, from which we also derive the term interlacement, denoting a mutual enlacement of two objects...
December 26, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence
Tianfu Wu, Yang Lu, Song-Chun Zhu
This paper presents a method, called AOGTracker, for simultaneously tracking, learning and parsing (TLP) of unknown objects in video sequences with a hierarchical and compositional And-Or graph (AOG) representation. The TLP method is formulated in the Bayesian framework with spatial and temporal dynamic programming (DP) algorithms inferring object bounding boxes on-the-fly. During online learning, the AOG is discriminatively learned using latent SVM [1] to account for appearance (e.g., lighting and partial occlusion) and structural (e...
December 23, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence
Kun Fu, Junqi Jin, Runpeng Cui, Fei Sha, Changshui Zhang
Recent progress on automatic generation of image captions has shown that it is possible to describe the most salient information conveyed by images with accurate and meaningful sentences. In this paper, we propose an image captioning system that exploits the parallel structures between images and sentences. In our model, the process of generating the next word, given the previously generated ones, is aligned with the visual perception experience where the attention shifts among the visual regions - such transitions impose a thread of ordering in visual perception...
December 21, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence
Jian-Fang Hu, Wei-Shi Zheng, Jian-Huang Lai, Jianguo Zhang
In this paper, we focus on heterogeneous features learning for RGB-D activity recognition. We find that features from different channels (RGB, depth) could share some similar hidden structures, and then propose a joint learning model to simultaneously explore the shared and feature-specific components as an instance of heterogeneous multi-task learning. The proposed model formed in a unified framework is capable of: 1) jointly mining a set of subspaces with the same dimensionality to exploit latent shared features across different feature channels, 2) meanwhile, quantifying the shared and feature-specific components of features in the subspaces, and 3) transferring feature-specific intermediate transforms (i-transforms) for learning fusion of heterogeneous features across datasets...
December 15, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence
Tomas Simon, Jack Valmadre, Iain Matthews, Yaser Sheikh
Recovering dynamic 3D structures from 2D image observations is highly under-constrained because of projection and missing data, motivating the use of strong priors to constrain shape deformation. In this paper, we empirically show that the spatiotemporal covariance of natural deformations is dominated by a Kronecker pattern. We demonstrate that this pattern arises as the limit of a spatiotemporal autoregressive process, and derive a Kronecker Markov Random Field as a prior distribution over dynamic structures...
December 13, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence
Lingqiao Liu, Chunhua Shen, Anton van den Hengel
Recent studies have shown that a Deep Convolutional Neural Network (DCNN) trained on a large image dataset can be used as a universal image descriptor and that doing so leads to impressive performance for a variety of image recognition tasks. Most of these studies adopt activations from a single DCNN layer, usually the fully-connected layer, as the image representation. In this paper, we propose a novel way to extract image representations from two consecutive convolutional layers: one layer is utilized for local feature extraction and the other serves as guidance to pool the extracted features...
December 9, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence
