IEEE Transactions on Pattern Analysis and Machine Intelligence

Arash Akbarinia, C Alejandro Parraga
The problem of removing illuminant variations to preserve the colours of objects (colour constancy) has already been solved by the human brain using mechanisms that rely largely on centre-surround computations of local contrast. In this paper we adopt some of these biological solutions, described by long-known physiological findings, into a simple, fully automatic, functional model (termed Adaptive Surround Modulation or ASM). In ASM, the size of a visual neuron's receptive field (RF) as well as the relationship with its surround varies according to the local contrast within the stimulus, which in turn determines the nature of the centre-surround normalisation of cortical neurons higher up in the processing chain...
September 18, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Yusuf Aytar, Lluis Castrejon, Carl Vondrick, Hamed Pirsiavash, Antonio Torralba
People can recognize scenes across many different modalities beyond natural images. In this paper, we investigate how to learn cross-modal scene representations that transfer across modalities. To study this problem, we introduce a new cross-modal scene dataset. While convolutional neural networks can categorize scenes well, they also learn an intermediate representation not aligned across modalities, which is undesirable for cross-modal transfer applications. We present methods to regularize cross-modal convolutional neural networks so that they have a shared representation that is agnostic of the modality...
September 18, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Antonio Agudo, Francesc Moreno-Noguer
This paper addresses the problem of simultaneously recovering 3D shape, pose and the elastic model of a deformable object from only 2D point tracks in a monocular video. This is a severely under-constrained problem that has been typically addressed by enforcing the shape or the point trajectories to lie on low-rank dimensional spaces. We show that formulating the problem in terms of a low-rank force space that induces the deformation and introducing the elastic model as an additional unknown, allows for a better physical interpretation of the resulting priors and a more accurate representation of the actual object's behavior...
September 15, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Maksim Lapin, Matthias Hein, Bernt Schiele
Top-k error is currently a popular performance measure on large scale image classification benchmarks such as ImageNet and Places. Despite its wide acceptance, our understanding of this metric is limited as most of the previous research is focused on its special case, the top-1 error. In this work, we explore two directions that shed light on the top-k error. First, we provide an in-depth analysis of established and recently proposed single-label multiclass methods along with a detailed account of efficient optimization algorithms for them...
September 13, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
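As a minimal illustration of the metric named in the abstract by Lapin et al. above, the sketch below computes the top-k error in plain Python: the fraction of examples whose true label is not among the k highest-scoring classes. The function name `top_k_error`, the toy `scores` and `labels`, and the tie-breaking rule (lower class index wins) are assumptions made for illustration and are not taken from the paper.

```python
def top_k_error(scores, labels, k):
    """Fraction of examples whose true label is not among the k highest scores."""
    if k < 1:
        raise ValueError("k must be at least 1")
    misses = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score; sorted() is stable,
        # so ties are broken in favour of the lower class index (an assumption).
        top_k = sorted(range(len(row)), key=lambda j: -row[j])[:k]
        if label not in top_k:
            misses += 1
    return misses / len(labels)


# Hypothetical toy data: 4 examples, 3 classes.
scores = [
    [0.1, 0.7, 0.2],  # true class 1: found at top-1
    [0.5, 0.3, 0.2],  # true class 1: missed at top-1, found at top-2
    [0.6, 0.3, 0.1],  # true class 2: missed at top-1 and top-2
    [0.2, 0.3, 0.5],  # true class 2: found at top-1
]
labels = [1, 1, 2, 2]

print(top_k_error(scores, labels, 1))  # top-1 error
print(top_k_error(scores, labels, 2))  # top-2 error
```

With k = 1 this reduces to the ordinary classification error, which is the special case the abstract refers to as the top-1 error.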
Sergey Tulyakov, Laszlo A Jeni, Jeffrey F Cohn, Nicu Sebe
Most approaches to face alignment treat the face as a 2D object, which fails to represent depth variation and is vulnerable to loss of shape consistency when the face rotates along a 3D axis. Because faces commonly rotate three-dimensionally, 2D approaches are vulnerable to significant error. 3D morphable models, employed as a second step in 2D+3D approaches, are robust to face rotation but are computationally too expensive for many applications, yet their ability to maintain viewpoint consistency is unknown...
September 11, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Jiangye Yuan
Extracting buildings from aerial scene images is an important task with many applications. However, this task is highly difficult to automate due to extremely large variations of building appearances, and still heavily relies on manual work. To attack this problem, we design a deep convolutional network with a simple structure that integrates activation from multiple layers for pixel-wise prediction, and introduce the signed distance function of building boundaries as the output representation, which has an enhanced representation power...
September 11, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
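The signed distance representation mentioned in the abstract by Yuan above can be sketched on a toy binary mask. This is a brute-force illustration, not the paper's method: the sign convention (positive inside a building, negative outside), the use of distance to the nearest pixel of the opposite class as a stand-in for distance to the boundary, and the names `signed_distance` and `mask` are all assumptions.

```python
import math


def signed_distance(mask):
    """Per-pixel signed distance to the nearest pixel of the opposite class.

    Assumed convention: positive inside the building (mask == 1),
    negative outside (mask == 0).
    """
    rows, cols = len(mask), len(mask[0])
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    inside = [(r, c) for r, c in cells if mask[r][c]]
    outside = [(r, c) for r, c in cells if not mask[r][c]]
    if not inside or not outside:
        raise ValueError("mask must contain both building and background pixels")

    sdf = [[0.0] * cols for _ in range(rows)]
    for r, c in cells:
        # Search the opposite class for the closest pixel (brute force).
        targets = outside if mask[r][c] else inside
        d = min(math.hypot(r - tr, c - tc) for tr, tc in targets)
        sdf[r][c] = d if mask[r][c] else -d
    return sdf


# Hypothetical 5x5 mask with a 3x3 "building" in the centre.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
sdf = signed_distance(mask)
for row in sdf:
    print(" ".join(f"{v:+.2f}" for v in row))
```

Unlike a binary mask, each value here also encodes how far a pixel lies from the building outline, which is one way to read the abstract's remark about enhanced representation power.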
Qianggong Zhang, Tat-Jun Chin
Multiple-view triangulation by l∞ minimisation has become established in computer vision. State-of-the-art l∞ triangulation algorithms exploit the quasiconvexity of the cost function to derive iterative update rules that deliver the global minimum. Such algorithms, however, can be computationally costly for large problem instances that contain many image measurements, e.g., from web-based photo sharing sites or long-term video recordings. In this paper, we prove that l∞ triangulation admits a coreset approximation scheme, which seeks small representative subsets of the input data called coresets...
September 11, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Ahmad Mheich, Mahmoud Hassan, Mohamad Khalil, Vincent Gripon, Olivier Dufor, Fabrice Wendling
Quantifying the similarity between two networks is critical in many applications. A number of algorithms have been proposed to compute graph similarity, mainly based on the properties of nodes and edges. Interestingly, most of these algorithms ignore the physical location of the nodes, which is a key factor in the context of brain networks involving spatially defined functional areas. In this paper, we present a novel algorithm called "SimiNet" for measuring similarity between two graphs whose nodes are defined a priori within a 3D coordinate system...
September 8, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Junlin Hu, Jiwen Lu, Yap-Peng Tan
This paper presents a sharable and individual multi-view metric learning (MvML) approach for visual recognition. Unlike conventional metric learning methods which learn a distance metric on either a single type of feature representation or a concatenated representation of multiple types of features, the proposed MvML jointly learns an optimal combination of multiple distance metrics on multi-view representations, in which it not only learns an individual distance metric for each view to retain its specific property but also learns a shared representation for different views in a unified latent subspace to preserve their common properties...
September 7, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Christian Knoll, Dhagash Mehta, Tianran Chen, Franz Pernkopf
Belief propagation (BP) is an iterative method to perform approximate inference on arbitrary graphical models. Whether BP converges and whether the solution is a unique fixed point depend on both the structure and the parametrization of the model. To understand this dependence, it is interesting to find all fixed points.
September 7, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Jean-Baptiste Alayrac, Piotr Bojanowski, Nishant Agrawal, Josef Sivic, Ivan Laptev, Simon Lacoste-Julien
Automatic assistants could guide a person or a robot in performing new tasks, such as changing a car tire or repotting a plant. Creating such assistants, however, is non-trivial and requires understanding of visual and verbal content of a video. Towards this goal, we here address the problem of automatically learning the main steps of a task from a set of narrated instruction videos. We develop a new unsupervised learning approach that takes advantage of the complementary nature of the input video and the associated narration...
September 5, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Fanhua Shang, James Cheng, Yuanyuan Liu, Zhi-Quan Luo, Zhouchen Lin
The heavy-tailed distributions of corrupted outliers and singular values of all channels in low-level vision have proven to be effective priors for many applications such as background modeling, photometric stereo and image alignment, and they can be well modeled by a hyper-Laplacian. However, the use of such distributions generally leads to challenging non-convex, non-smooth and non-Lipschitz problems, and makes existing algorithms very slow for large-scale applications. Together with the analytic solutions to Lp-norm minimization with two specific values of p, i...
September 4, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Hyung Jin Chang, Yiannis Demiris
In this paper, we present a novel framework for unsupervised kinematic structure learning of complex articulated objects from a single-view 2D image sequence. In contrast to prior motion-based methods, which estimate relatively simple articulations, our method can generate arbitrarily complex kinematic structures with skeletal topology via a successive iterative merging strategy. The iterative merge process is guided by a density weighted skeleton map which is generated from a novel object boundary generation method from sparse 2D feature points...
September 4, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Yeqing Li, Chen Chen, Fei Yang, Junzhou Huang
A similarity measure is an essential component in image registration. In this article, we propose a novel similarity measure for registration of two or more images. The proposed method is motivated by the fact that optimally registered images can be sparsified hierarchically in the gradient domain and frequency domain with the separation of sparse errors. One of the key advantages of the proposed similarity measure is its robustness in dealing with severe intensity distortions, which widely exist in medical images, remotely sensed images and natural photos due to differences of acquisition modalities or illumination conditions...
September 1, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Williem, In Kyu Park, Kyoung Mu Lee
Depth estimation is essential in many light field applications. Numerous algorithms have been developed using a range of light field properties. However, conventional data costs fail when handling noisy scenes in which occlusion is present. To address this problem, we introduce a light field depth estimation method that is more robust against occlusion and less sensitive to noise. Two novel data costs are proposed, which are measured using the angular patch and refocus image, respectively. The constrained angular entropy cost (CAE) reduces the effects of the dominant occluder and noise in the angular patch, resulting in a low cost...
August 31, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Shaojing Fan, Tian-Tsong Ng, Bryan L Koenig, Jonathan S Herberg, Ming Jiang, Zhiqi Shen, Qi Zhao
Visual realism is defined as the extent to which an image appears to people as a photo rather than computer generated. Assessing visual realism is important in applications like computer graphics rendering and photo retouching. However, current realism evaluation approaches use either labor-intensive human judgments or automated algorithms largely dependent on comparing renderings to reference images. We develop a reference-free computational framework for visual realism prediction to overcome these constraints...
August 30, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Umar Asif, Mohammed Bennamoun, Ferdous Sohel
While deep convolutional neural networks have shown remarkable success in image classification, the problems of inter-class similarities, intra-class variances, the effective combination of multimodal data, and the spatial variability in images of objects remain major challenges. To address these problems, this paper proposes a novel framework to learn a discriminative and spatially invariant classification model for object and indoor scene recognition using multimodal RGB-D imagery. This is achieved through three postulates: 1) spatial invariance - this is achieved by combining a spatial transformer network with a deep convolutional neural network to learn features which are invariant to spatial translations, rotations, and scale changes, 2) high discriminative capability - this is achieved by introducing Fisher encoding within the CNN architecture to learn features which have small inter-class similarities and large intra-class compactness, and 3) multimodal hierarchical fusion - this is achieved through the regularization of semantic segmentation to a multi-modal CNN architecture, where class probabilities are estimated at different hierarchical levels (i...
August 30, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Antoine Lejeune, Jacques G Verly, Marc Van Droogenbroeck
We develop a powerful probabilistic framework for the local characterization of surfaces and edges in range images, which is useful in many applications of computer vision, such as filtering, edge detection, feature extraction, and classification. We use the geometrical nature of the data to derive an analytic expression for the joint probability density function (pdf) for the random variables used to model the ranges of a set of pixels in a local neighborhood of an image. We decompose this joint pdf by considering independently the cases where two real world points corresponding to two neighboring pixels are locally on the same real world surface or not...
August 29, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Salehe Erfanian Ebadi, Ebroul Izquierdo
Background subtraction is a fundamental video analysis technique that consists of creating a background model that allows foreground pixels to be distinguished. We present a new method in which the image sequence is assumed to be made up of the sum of a low-rank background matrix and a dynamic tree-structured sparse matrix. The decomposition task is then solved using our approximated Robust Principal Component Analysis (ARPCA) method, which is an extension of RPCA that can handle camera motion and noise. Our model dynamically estimates the support of the foreground regions via a superpixel generation step, so that spatial coherence can be imposed on these regions...
August 29, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence
Enrique Sanchez-Lozano, Georgios Tzimiropoulos, Brais Martinez, Fernando De la Torre, Michel Valstar
Linear regression is a fundamental building block in many face detection and tracking algorithms, typically used to predict shape displacements from image features through a linear mapping. This paper presents a Functional Regression solution to the least squares problem, which we coin Continuous Regression, resulting in the first real-time incremental face tracker. Contrary to prior work in Functional Regression, in which B-splines or Fourier series were used, we propose to approximate the input space by its first-order Taylor expansion, yielding a closed-form solution for the continuous domain of displacements...
August 29, 2017: IEEE Transactions on Pattern Analysis and Machine Intelligence