Read by QxMD icon Read

IEEE Transactions on Pattern Analysis and Machine Intelligence

Dimitrios Georgios Evangelidis, Radu Horaud
This paper addresses the problem of registering multiple point sets. Solutions to this problem are often approximated by repeatedly solving for pairwise registration, which results in an uneven treatment of the sets forming a pair: a model set and a data set. The main drawback of this strategy is that the model set may contain noise and outliers, which negatively affects the estimation of the registration parameters. In contrast, the proposed formulation treats all the point sets on an equal footing. Indeed, all the points are drawn from a central Gaussian mixture, hence the registration is cast into a clustering problem...
June 21, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Mingliang Chen, Xing Wei, Qingxiong Yang, Qing Li, Gang Wang, Ming-Hsuan Yang
We propose a background subtraction algorithm using hierarchical superpixel segmentation, spanning trees and optical flow. First, we generate superpixel segmentation trees using a number of Gaussian Mixture Models (GMMs) by treating each GMM as one vertex to construct spanning trees. Next, we use the M-smoother to enhance the spatial consistency on the spanning trees and estimate optical flow to extend the M-smoother to the temporal domain. Experimental results on synthetic and real-world benchmark datasets show that the proposed algorithm performs favorably for background subtraction in videos against the state-of-the-art methods in spite of frequent and sudden changes of pixel values...
June 21, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Tao Wang, Haibin Ling
Matching-based algorithms have been commonly used in planar object tracking. They often model a planar object as a set of keypoints, and then find correspondences between keypoint sets via descriptor matching. In previous work, unary constraints on appearances or locations are usually used to guide the matching. However, these approaches rarely utilize structure information of the object, and are thus suffering from various perturbation factors. In this paper, we proposed a graph-based tracker, named Gracker, which is able to fully explore the structure information of the object to enhance tracking performance...
June 16, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Tingting Mu, Yannis John Goulermas, Sophia Ananiadou
A typical objective of data visualization is to generate low-dimensional plots that maximally convey the information within the data. The visualization output should help the user to not only identify the local neighborhood structure of individual samples, but also obtain a global view of the relative positioning and separation between cohorts. Here, we propose a very novel visualization framework designed to satisfy these needs. By incorporating additional cohort positioning and discriminative constraints into local neighbor preservation models through the use of computed cohort prototypes, effective control over the arrangements and proximities of data cohorts can be obtained...
June 15, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Fatemehsadat Saleh, Mohammad Sadegh Aliakbarian, Mathieu Salzmann, Lars Petersson, Jose M Alvarez, Stephen Gould
Pixel-level annotations are expensive and time consuming to obtain. Hence, weak supervision using only image tags could have a significant impact in semantic segmentation. Recently, CNN-based methods have proposed to fine-tune pre-trained networks using image tags. Without additional information, this leads to poor localization accuracy. This problem, however, was alleviated by making use of objectness priors to generate foreground/background masks. Unfortunately these priors either require pixel-level annotations/bounding boxes, or still yield inaccurate object boundaries...
June 8, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
George Trigeorgis, Mihalis Nicolaou, Stefanos Zafeiriou, Bjoern Schuller
Machine learning algorithms for the analysis of time-series often depend on the assumption that utilised data are temporally aligned. Any temporal discrepancies arising in the data is certain to lead to ill-generalisable models, which in turn fail to correctly capture properties of the task at hand. The temporal alignment of time-series is thus a crucial challenge manifesting in a multitude of applications. Nevertheless, the vast majority of algorithms oriented towards temporal alignment are either applied directly on the observation space or simply utilise linear projections - thus failing to capture complex, hierarchical non-linear representations that may prove beneficial, especially when dealing with multi-modal data (e...
June 8, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Bing Shuai, Zhen Zup, Bing Wang, Gang Wang
In this paper, we address the challenging task of scene segmentation. In order to capture the rich contextual dependencies over image regions, we propose Directed Acyclic Graph - Recurrent Neural Networks (DAG-RNN) to perform context aggregation over locally connected feature maps. More specifically, DAG-RNN is placed on top of pre-trained CNN (feature extractor) to embed context into local features so that their representative capability can be enhanced. In comparison with plain CNN (as in Fully Convolutional Networks - FCN), DAG-RNN is empirically found to be significantly more effective at aggregating context...
June 6, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Gul Varol, Ivan Laptev, Cordelia Schmid
Typical human actions last several seconds and exhibit characteristic spatio-temporal structure. Recent methods attempt to capture this structure and learn action representations with convolutional neural networks. Such representations, however, are typically learned at the level of a few video frames failing to model actions at their full temporal extent. In this work we learn video representations using neural networks with long-term temporal convolutions (LTC). We demonstrate that LTC-CNN models with increased temporal extents improve the accuracy of action recognition...
June 6, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Gyorgy Kovacs
In this paper, a novel dissimilarity measure called Matching by Monotonic Tone Mapping (MMTM) is proposed. The MMTM technique allows matching under non-linear monotonic tone mappings and can be computed efficiently when the tone mappings are approximated by piecewise constant or piecewise linear functions. The proposed method is evaluated in various template matching scenarios involving simulated and real images, and compared to other measures developed to be invariant to monotonic intensity transformations...
June 2, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Girum G Demisse, Djamila Aouada, Bjorn Ottersten
In this paper, we introduce a deformation based representation space for curved shapes in Rn. Given an ordered set of points sampled from a curved shape, the proposed method represents the set as an element of a finite dimensional matrix Lie group. Variation due to scale and location are filtered in a preprocessing stage, while shapes that vary only in rotation are identified by an equivalence relationship. The use of a finite dimensional matrix Lie group leads to a similarity metric with an explicit geodesic solution...
June 2, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Adrian Sosic, Abdelhak M Zoubir, Heinz Koeppl
Learning from demonstration (LfD) is the process of building behavioral models of a task from demonstrations provided by an expert. These models can be used e.g. for system control by generalizing the expert demonstrations to previously unencountered situations. Most LfD methods, however, make strong assumptions about the expert behavior, e.g. they assume the existence of a deterministic optimal ground truth policy or require direct monitoring of the expert's controls, which limits their practical use as part of a general system identification framework...
June 1, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Relja Arandjelovic, Petr Gronat, Akihiko Torii, Tomas Pajdla, Josef Sivic
We tackle the problem of large scale visual place recognition, where the task is to quickly and accurately recognize the location of a given query photograph. We present the following four principal contributions. First, we develop a convolutional neural network (CNN) architecture that is trainable in an end-to-end manner directly for the place recognition task. The main component of this architecture, NetVLAD, is a new generalized VLAD layer, inspired by the "Vector of Locally Aggregated Descriptors" image representation commonly used in image retrieval...
June 1, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Guosheng Lin, Chunhua Shen, Anton van den Hengel, Ian Reid
We propose an approach for exploiting contextual information in semantic image segmentation, and particularly investigate the use of patch-patch context and patch-background context in deep CNNs. We formulate deep structured models by combining CNNs and Conditional Random Fields (CRFs) for learning the patch-patch context between image regions. Specifically, we formulate CNN-based pairwise potential functions to capture semantic correlations between neighboring patches. Efficient piecewise training of the proposed deep structured model is then applied in order to avoid repeated expensive CRF inference during the course of back propagation...
May 26, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Alberto Crivellaro, Mahdi Rad, Yannick Verdie, Kwang Moo Yi, Pascal Fua, Vincent Lepetit
We present an algorithm for estimating the pose of a rigid object in real-time under challenging conditions. Our method effectively handles poorly textured objects in cluttered, changing environments, even when their appearance is corrupted by large occlusions, and it relies on grayscale images to handle metallic environments on which depth cameras would fail. As a result, our method is suitable for practical Augmented Reality applications including industrial environments. At the core of our approach is a novel representation for the 3D pose of object parts: We predict the 3D pose of each part in the form of the 2D projections of a few control points...
May 26, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Qi Wu, Chunhua Shen, Peng Wang, Anthony Dick, Anton van den Hengel
Much of the recent progress in Vision-to-Language problems has been achieved through a combination of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs). This approach does not explicitly represent high-level semantic concepts, but rather seeks to progress directly from image features to text. In this paper we first propose a method of incorporating high-level concepts into the successful CNN-RNN approach, and show that it achieves a significant improvement on the state-of-the-art in both image captioning and visual question answering...
May 26, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Ethan M Rudd, Lalit P Jain, Walter J Scheirer, Terrance E Boult
It is often desirable to be able to recognize when inputs to a recognition function learned in a supervised manner correspond to classes unseen at training time. With this ability, new class labels could be assigned to these inputs by a human operator, allowing them to be incorporated into the recognition function - ideally under an efficient incremental update mechanism. While good algorithms that assume inputs from a fixed set of classes exist, e.g., artificial neural networks and kernel machines, it is not immediately obvious how to extend them to perform incremental learning in the presence of unknown query classes...
May 23, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Ziming Zhang, Yun Liu, Xi Chen, Yanjun Zhu, Ming-Ming Cheng, Venkatesh Saligrama, Philip H S Torr
We are motivated by the need for a generic object proposal generation algorithm which achieves good balance between object detection recall, proposal localization quality and computational efficiency. We propose a novel object proposal algorithm, BING++, which inherits the virtue of good computational efficiency of BING [1] but significantly improves its proposal localization quality. At high level we formulate the problem of object proposal generation from a novel probabilistic perspective, based on which our BING++ manages to improve the localization quality by employing edges and segments to estimate object boundaries and update the proposals sequentially...
May 23, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Xiaozhi Chen, Kaustav Kundu, Yukun Zhu, Huimin Ma, Sanja Fidler, Raquel Urtasun
The goal of this paper is to perform 3D object detection in the context of autonomous driving. Our method aims at generating a set of high-quality 3D object proposals by exploiting stereo imagery. We formulate the problem as minimizing an energy function that encodes object size priors, placement of objects on the ground plane as well as several depth informed features that reason about free space, point cloud densities and distance to the ground. We then exploit a CNN on top of these proposals to perform object detection...
May 19, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Xiangbo Shu, Jinhui Tang, Zechao Li, Hanjiang Lai, Liyan Zhang, Shuicheng Yan
Age progression is defined as aesthetically re-rendering the aging face at any future age for an individual face. In this work, we aim to automatically render aging faces in a personalized way. Basically, for each age group, we learn an aging dictionary to reveal its aging characteristics (e.g., wrinkles), where the dictionary bases corresponding to the same index yet from two neighboring aging dictionaries form a particular aging pattern cross these two age groups, and a linear combination of all these patterns expresses a particular personalized aging process...
May 17, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Wen Li, Zheng Xu, Dong Xu, Dengxin Dai, Luc Van Gool
Domain adaptation between diverse source and target domains is a challenging research problem, especially in the real-world visual recognition tasks where the images and videos consist of significant variations in viewpoints, illuminations, qualities, etc. In this paper, we propose a new approach for domain generalization and domain adaptation based on exemplar SVMs. Specifically, we decompose the source domain into many subdomains, each of which contains only one positive training sample and all negative samples...
May 16, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Fetch more papers »
Fetching more papers... Fetching...
Read by QxMD. Sign in or create an account to discover new knowledge that matter to you.
Remove bar
Read by QxMD icon Read

Search Tips

Use Boolean operators: AND/OR

diabetic AND foot
diabetes OR diabetic

Exclude a word using the 'minus' sign

Virchow -triad

Use Parentheses

water AND (cup OR glass)

Add an asterisk (*) at end of a word to include word stems

Neuro* will search for Neurology, Neuroscientist, Neurological, and so on

Use quotes to search for an exact phrase

"primary prevention of cancer"
(heart or cardiac or cardio*) AND arrest -"American Heart Association"