IEEE Transactions on Pattern Analysis and Machine Intelligence

Bing Li, Chunfeng Yuan, Weihua Xiong, Weiming Hu, Houwen Peng, Xinmiao Ding, Steve Maybank
In multi-instance learning (MIL), the relations among instances in a bag convey important contextual information in many applications. Previous studies on MIL either ignore such relations or simply model them with a fixed graph structure so that the overall performance inevitably degrades in complex environments. To address this problem, this paper proposes a novel multi-view multi-instance learning algorithm (M²IL) that combines multiple context structures in a bag into a unified framework. The novel aspects are: (i) we propose a sparse ε-graph model that can generate different graphs with different parameters to represent various context relations in a bag, (ii) we propose a multi-view joint sparse representation that integrates these graphs into a unified framework for bag classification, and (iii) we propose a multi-view dictionary learning algorithm to obtain a multi-view graph dictionary that considers cues from all views simultaneously to improve the discrimination of the M²IL...
December 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Cewu Lu, Di Lin, Jiaya Jia, Chi-Keung Tang
Given a single outdoor image, we propose a collaborative learning approach using novel weather features to label the image as either sunny or cloudy. Though limited, this two-class classification problem is by no means trivial given the great variety of outdoor images captured by different cameras where the images may have been edited after capture. Our overall weather feature combines the data-driven convolutional neural network (CNN) feature and well-chosen weather-specific features. They work collaboratively within a unified optimization framework that is aware of the presence (or absence) of a given weather cue during learning and classification...
December 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Damien Lefloch, Markus Kluge, Hamed Sarbolandi, Tim Weyrich, Andreas Kolb
Interactive real-time scene acquisition from hand-held depth cameras has recently developed much momentum, enabling applications in ad-hoc object acquisition, augmented reality and other fields. A key challenge to online reconstruction remains error accumulation in the reconstructed camera trajectory, due to drift-inducing instabilities in the range scan alignments of the underlying iterative-closest-point (ICP) algorithm. Various strategies have been proposed to mitigate that drift, including SIFT-based pre-alignment, color-based weighting of ICP pairs, stronger weighting of edge features, and so on...
December 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Luca Puggini, Sean McLoone
Principal Component Analysis (PCA) is a powerful and widely used tool for dimensionality reduction. However, the principal components generated are linear combinations of all the original variables and this often makes interpreting results and root-cause analysis difficult. Forward Selection Component Analysis (FSCA) is a recent technique that overcomes this difficulty by performing variable selection and dimensionality reduction at the same time. This paper provides, for the first time, a detailed presentation of the FSCA algorithm, and introduces a number of new variants of FSCA that incorporate a refinement step to improve performance...
December 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
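The forward-selection idea in the FSCA entry above can be illustrated with a small sketch. It assumes the greedy criterion is "pick the variable that explains the most variance across all variables, then remove what it explains"; the function names and the toy data are hypothetical, and this is not the authors' implementation or any of their refined variants.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def center(columns):
    """Subtract each column's mean (data is stored as a list of columns)."""
    centered = []
    for col in columns:
        mean = sum(col) / len(col)
        centered.append([v - mean for v in col])
    return centered


def forward_select(columns, k):
    """Greedily choose up to k column indices (assumed FSCA-style criterion)."""
    resid = center(columns)
    chosen = []
    for _ in range(k):
        best_j, best_score = None, 0.0
        for j, xj in enumerate(resid):
            norm = dot(xj, xj)
            if j in chosen or norm < 1e-12:
                continue
            # Variance of every column explained by regressing it on column j.
            score = sum(dot(xj, xi) ** 2 for xi in resid) / norm
            if score > best_score:
                best_j, best_score = j, score
        if best_j is None:
            break
        chosen.append(best_j)
        # Deflate: remove the component along the chosen column from all columns.
        xj, norm = resid[best_j], dot(resid[best_j], resid[best_j])
        deflated = []
        for xi in resid:
            coef = dot(xj, xi) / norm
            deflated.append([v - coef * w for v, w in zip(xi, xj)])
        resid = deflated
    return chosen


# Hypothetical toy data: columns 0 and 1 are nearly collinear, column 2 is not.
COLUMNS = [
    [1, 2, 3, 4, 5],
    [2, 4, 6, 8, 11],
    [1, -1, 1, -1, 0],
]
print(forward_select(COLUMNS, 2))
```

Unlike PCA, the output is a list of original variable indices rather than linear combinations of all variables, which is what makes the result easier to interpret.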
Filipe Rodrigues, Mariana Lourenco, Bernardete Ribeiro, Francisco C Pereira
The growing need to analyze large collections of documents has led to great developments in topic modeling. Since documents are frequently associated with other related variables, such as labels or ratings, much interest has been placed on supervised topic models. However, the nature of most annotation tasks, prone to ambiguity and noise, often with high volumes of documents, deems learning under a single-annotator assumption unrealistic or impractical for most real-world applications. In this article, we propose two supervised topic models, one for classification and another for regression problems, which account for the heterogeneity and biases among different annotators that are encountered in practice when learning from crowds...
December 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Anh T Pham, Raviv Raich, Xiaoli Z Fern
Labeling data for classification requires significant human effort. To reduce labeling cost, instead of labeling every instance, a group of instances (bag) is labeled by a single bag label. Computer algorithms are then used to infer the label for each instance in a bag, a process referred to as instance annotation. This task is challenging due to the ambiguity regarding the instance labels. We propose a discriminative probabilistic model for the instance annotation problem and introduce an expectation maximization framework for inference, based on the maximum likelihood approach...
December 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Hamid Laga, Qian Xie, Ian H Jermyn, Anuj Srivastava
Recent developments in elastic shape analysis (ESA) are motivated by the fact that it provides a comprehensive framework for simultaneous registration, deformation, and comparison of shapes. These methods achieve computational efficiency using certain square-root representations that transform invariant elastic metrics into Euclidean metrics, allowing for the application of standard algorithms and statistical tools. For analyzing shapes of embeddings of S² in R³, Jermyn et al. [1] introduced square-root normal fields (SRNFs), which transform an elastic metric, with desirable invariant properties, into the L² metric...
December 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Shangwen Li, Sanjay Purushotham, Chen Chen, Yuzhuo Ren, C-C Jay Kuo
Textual data such as tags and sentence descriptions are combined with visual cues to reduce the semantic gap for image retrieval applications in today's Multimodal Image Retrieval (MIR) systems. However, all tags are treated as equally important in these systems, which may result in misalignment between visual and textual modalities during MIR training. This in turn degrades retrieval performance at query time. To address this issue, we investigate the problem of tag importance prediction, where the goal is to automatically predict the tag importance and use it in image retrieval...
December 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Xiao-Tong Yuan, Qingshan Liu
We introduce a family of Newton-type greedy selection methods for ℓ₀-constrained minimization problems. The basic idea is to construct a quadratic function to approximate the original objective function around the current iterate and solve the constructed quadratic program over the cardinality constraint. The next iterate is then estimated via a line search operation between the current iterate and the solution of the sparse quadratic program. This iterative procedure can be interpreted as an extension of the constrained Newton methods from convex minimization to non-convex ℓ₀-constrained minimization...
December 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
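The cardinality constraint in the entry above limits how many coordinates of the solution may be nonzero. As a minimal sketch of that constraint only, the snippet below projects a vector onto the set of vectors with at most k nonzeros by keeping the k largest-magnitude entries (often called hard thresholding). The function name is hypothetical, and this is a generic building block, not the Newton-type method the abstract describes.

```python
def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x and zero out the rest.

    This is the Euclidean projection onto {z : number of nonzeros <= k}.
    Ties are broken in favour of the earlier index.
    """
    if k <= 0:
        return [0.0] * len(x)
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(order[:k])
    return [x[i] if i in keep else 0.0 for i in range(len(x))]


print(hard_threshold([0.5, -3.0, 2.0, 0.1], 2))
```

A greedy or Newton-type scheme would use a step like this (or a more careful sparse quadratic solve) to stay inside the constraint set between updates.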
Lingqiao Liu, Peng Wang, Chunhua Shen, Lei Wang, Anton van den Hengel, Chao Wang, Heng Tao Shen
Deriving from the gradient vector of a generative model of local features, Fisher vector coding (FVC) has been identified as an effective coding method for image classification. Most, if not all, FVC implementations employ the Gaussian mixture model (GMM) as the generative model for local features. However, the representative power of a GMM can be limited because it essentially assumes that local features can be characterized by a fixed number of feature prototypes, and the number of prototypes is usually small in FVC...
December 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Oren Freifeld, Soren Hauberg, Kayhan Batmanghelich, Jonn W Fisher
We propose novel finite-dimensional spaces of well-behaved transformations. The latter are obtained by (fast and highly-accurate) integration of continuous piecewise-affine velocity fields. The proposed method is simple yet highly expressive, effortlessly handles optional constraints (e.g., volume preservation and/or boundary conditions), and supports convenient modeling choices such as smoothing priors and coarse-to-fine analysis. Importantly, the proposed approach, partly due to its rapid likelihood evaluations and partly due to its other properties, facilitates tractable inference over rich transformation spaces, including using Markov-Chain Monte-Carlo methods...
December 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Vijay Badrinarayanan, Alex Kendall, Roberto Cipolla
We present a novel and practical deep fully convolutional neural network architecture for semantic pixel-wise segmentation termed SegNet. This core trainable segmentation engine consists of an encoder network and a corresponding decoder network, followed by a pixel-wise classification layer. The architecture of the encoder network is topologically identical to the 13 convolutional layers in the VGG16 network [1]. The role of the decoder network is to map the low resolution encoder feature maps to full input resolution feature maps for pixel-wise classification...
December 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Mohamed Elhoseiny, Ahmed Elgammal, Babak Saleh
People typically learn through exposure to visual concepts associated with linguistic descriptions. For instance, teaching visual object categories to children is often accompanied by descriptions in text or speech. In a machine learning context, these observations motivate us to ask whether this learning process could be computationally modeled to learn visual classifiers. More specifically, the main question of this work is how to utilize purely textual descriptions of visual classes, with no training images, to learn explicit visual classifiers for them...
December 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Zhiyuan Shi, Yongxin Yang, Timothy M Hospedales, Tao Xiang
We propose to model complex visual scenes using a non-parametric Bayesian model learned from weakly labelled images abundant on media sharing sites such as Flickr. Given weak image-level annotations of objects and attributes without locations or associations between them, our model aims to learn the appearance of object and attribute classes as well as their association on each object instance. Once learned, given an image, our model can be deployed to tackle a number of vision problems in a joint and coherent manner, including recognising objects in the scene (automatic object annotation), describing objects using their attributes (attribute prediction and association), and localising and delineating the objects (object detection and semantic segmentation)...
December 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Michael Clement, Adrien Poulenard, Camille Kurtz, Laurent Wendling
The analysis of spatial relations between objects in digital images plays a crucial role in various application domains related to pattern recognition and computer vision. Classical models for the evaluation of such relations are usually sufficient for the handling of simple objects, but can lead to ambiguous results in more complex situations. In this article, we investigate the modeling of spatial configurations where the objects can be imbricated in each other. We formalize this notion with the term enlacement, from which we also derive the term interlacement, denoting a mutual enlacement of two objects...
December 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Tianfu Wu, Yang Lu, Song-Chun Zhu
This paper presents a method, called AOGTracker, for simultaneously tracking, learning and parsing (TLP) of unknown objects in video sequences with a hierarchical and compositional And-Or graph (AOG) representation. The TLP method is formulated in the Bayesian framework with spatial and temporal dynamic programming (DP) algorithms inferring object bounding boxes on-the-fly. During online learning, the AOG is discriminatively learned using latent SVM [1] to account for appearance (e.g., lighting and partial occlusion) and structural (e...
December 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Kun Fu, Junqi Jin, Runpeng Cui, Fei Sha, Changshui Zhang
Recent progress on automatic generation of image captions has shown that it is possible to describe the most salient information conveyed by images with accurate and meaningful sentences. In this paper, we propose an image captioning system that exploits the parallel structures between images and sentences. In our model, the process of generating the next word, given the previously generated ones, is aligned with the visual perception experience where the attention shifts among the visual regions; such transitions impose a thread of ordering in visual perception...
December 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Hongfu Liu, Zhiqiang Tao, Yun Fu
Constrained clustering uses pre-given knowledge to improve the clustering performance. Here we use a new constraint called partition level side information and propose the Partition Level Constrained Clustering (PLCC) framework, where only a small proportion of the data is given labels to guide the procedure of clustering. Our goal is to find a partition which captures the intrinsic structure from the data itself, and also agrees with the partition level side information. Then we derive the algorithm of partition level side information based on K-means and give its corresponding solution...
October 16, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Miguel Angel Bautista, Oriol Pujol, Fernando de la Torre, Sergio Escalera
Error Correcting Output Codes (ECOC) is a successful technique in multi-class classification, which is a core problem in Pattern Recognition and Machine Learning. A major advantage of ECOC over other methods is that the multi-class problem is decoupled into a set of binary problems that are solved independently. However, the literature defines a general error-correcting capability for ECOCs without analyzing how it is distributed among classes, hindering a deeper analysis of pairwise error-correction. To address these limitations, this paper proposes an Error-Correcting Factorization (ECF) method...
October 16, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
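The ECOC entry above describes decoupling a multi-class problem into binary problems and raises the question of how error correction is spread across class pairs. The sketch below shows standard Hamming-distance decoding and a per-pair distance table for a hypothetical 4-class, 7-bit code matrix; the class names and codewords are illustrative assumptions, and this is not the ECF method proposed in the paper.

```python
from itertools import combinations

# Hypothetical code matrix: one 7-bit codeword per class.
# Each bit position corresponds to one independently trained binary classifier.
CODES = {
    "c1": (1, 1, 1, 1, 1, 1, 1),
    "c2": (0, 0, 0, 0, 1, 1, 1),
    "c3": (0, 0, 1, 1, 0, 0, 1),
    "c4": (0, 1, 0, 1, 0, 1, 0),
}


def hamming(a, b):
    """Number of positions where two codewords differ."""
    return sum(x != y for x, y in zip(a, b))


def ecoc_decode(codes, bits):
    """Return the class whose codeword is closest to the predicted bits."""
    return min(codes, key=lambda label: hamming(codes[label], bits))


def pairwise_distances(codes):
    """Hamming distance for every pair of classes."""
    return {
        (a, b): hamming(codes[a], codes[b])
        for a, b in combinations(codes, 2)
    }


# The binary classifiers output c3's codeword with the first bit flipped.
print(ecoc_decode(CODES, (1, 0, 1, 1, 0, 0, 1)))
print(pairwise_distances(CODES))
```

A pair of classes at distance d can tolerate up to (d - 1) // 2 wrong binary outputs before being confused with each other, so a per-pair table like this shows where a code is strong or weak instead of reporting a single global minimum.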
Ajad Chhatkuli, Daniel Pizarro, Toby Collins, Adrien Bartoli
We present a global and convex formulation for the templateless 3D reconstruction of a deforming object with the perspective camera. We show for the first time how to construct a Second-Order Cone Programming (SOCP) problem for Non-Rigid Structure-from-Motion (NRSfM) using the Maximum-Depth Heuristic (MDH). In this regard, we deviate strongly from the general trend of using affine cameras and factorization-based methods to solve NRSfM, which do not perform well with complex nonlinear deformations. In MDH, the points' depths are maximized so that the distances between neighbouring points in camera space are upper bounded by the geodesic distance...
October 13, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence