IEEE Transactions on Pattern Analysis and Machine Intelligence

Jiangbo Lu, Yu Li, Hongsheng Yang, Dongbo Min, Weiyong Eng, Minh Do
Though many tasks in computer vision can be formulated elegantly as pixel-labeling problems, a typical challenge discouraging such a discrete formulation is often due to computational efficiency. Recent studies on fast cost volume filtering based on efficient edge-aware filters provide a fast alternative to solve discrete labeling problems, with the complexity independent of the support window size. However, these methods still have to step through the entire cost volume exhaustively, which makes the solution speed scale linearly with the label space size...
October 11, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence
Xiaoyang Wang, Qiang Ji
Current video event recognition research remains largely target-centered. For real-world surveillance videos, target-centered event recognition faces great challenges due to large intra-class target variation, limited image resolution, and poor detection and tracking results. To mitigate these challenges, we introduced a context-augmented video event recognition approach. Specifically, we explicitly capture different types of contexts from three levels including image level, semantic level, and prior level. At the image level, we introduce two types of contextual features including the appearance context features and interaction context features to capture the appearance of context objects and their interactions with the target objects...
October 11, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence
Nicolas Courty, Remi Flamary, Devis Tuia, Alain Rakotomamonjy
Domain adaptation is one of the most challenging tasks of modern data analytics. If the adaptation is done correctly, models built on a specific data representation become more robust when confronted with data depicting the same classes, but described by another observation system. Among the many strategies proposed, finding domain-invariant representations has shown excellent properties, in particular since it allows training a single classifier that is effective in all domains. In this paper, we propose a regularized unsupervised optimal transportation model to perform the alignment of the representations in the source and target domains...
October 7, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence
Naila Murray, Herve Jegou, Florent Perronnin, Andrew Zisserman
We consider the design of an image representation that embeds and aggregates a set of local descriptors into a single vector. Popular representations of this kind include the bag-of-visual-words, the Fisher vector and the VLAD. When two such image representations are compared with the dot-product, the image-to-image similarity can be interpreted as a match kernel. In match kernels, one has to deal with interference, i.e. with the fact that even if two descriptors are unrelated, their matching score may contribute to the overall similarity...
October 6, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence
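The match-kernel interpretation in the abstract above can be illustrated with a minimal sketch. It assumes the simplest possible aggregation, plain sum pooling of raw descriptors, which is not necessarily the embedding the paper studies; the function names and toy descriptors are hypothetical. Under that assumption, the dot product of two aggregated vectors equals the sum of dot products over all descriptor pairs, so unrelated pairs also contribute to the similarity (the interference the abstract mentions).

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def aggregate(descriptors):
    """Sum-pool a set of equal-length local descriptors into one vector.

    Assumption: plain sum pooling, chosen only for illustration.
    """
    return [sum(column) for column in zip(*descriptors)]


def match_kernel(set_x, set_y):
    """Sum of dot products over every descriptor pair (x, y)."""
    return sum(dot(x, y) for x in set_x for y in set_y)


# Hypothetical toy descriptors for two images.
image_a = [[1.0, 0.0], [0.0, 2.0]]
image_b = [[3.0, 1.0], [1.0, 1.0], [0.0, -1.0]]

# Comparing the aggregated vectors ...
similarity_from_vectors = dot(aggregate(image_a), aggregate(image_b))
# ... gives the same value as summing over all pairwise matches.
similarity_from_pairs = match_kernel(image_a, image_b)

print(similarity_from_vectors, similarity_from_pairs)
```

Every pair enters the sum, including pairs that do not correspond to the same visual content, which is why the overall score can be polluted by unrelated descriptors.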
Seungryong Kim, Dongbo Min, Bumsub Ham, Minh Do, Kwanghoon Sohn
Establishing dense correspondences between multiple images is a fundamental task in many applications. However, finding a reliable correspondence in multi-modal or multi-spectral images still remains unsolved due to their challenging photometric and geometric variations. In this paper, we propose a novel dense descriptor, called dense adaptive self-correlation (DASC), to estimate multi-modal and multi-spectral dense correspondences. Based on the observation that self-similarity within images is robust to imaging modality variations, we define the descriptor as a series of adaptive self-correlation similarity measures between patches sampled by randomized receptive field pooling, in which the sampling pattern is obtained using discriminative learning...
October 6, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence
Torsten Sattler, Bastian Leibe, Leif Kobbelt
Accurately determining the position and orientation from which an image was taken, i.e., computing the camera pose, is a fundamental step in many Computer Vision applications. The pose can be recovered from 2D-3D matches between 2D image positions and points in a 3D model of the scene. Recent advances in Structure-from-Motion allow us to reconstruct large scenes and thus create the need for image-based localization methods that efficiently handle large-scale 3D models while still being effective, i.e., while localizing as many images as possible...
September 20, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence
Concetto Spampinato, Simone Palazzo, Daniela Giordano
Video object segmentation can be considered as one of the most challenging computer vision problems. Indeed, so far, no existing solution is able to effectively deal with the peculiarities of real-world videos, especially in cases of articulated motion and object occlusions; limitations that appear more evident when we compare the performance of automated methods with the human one. However, manually segmenting objects in videos is largely impractical as it requires a lot of time and concentration. To address this problem, in this paper we propose an interactive video object segmentation method, which exploits, on one hand, the capability of humans to identify correctly objects in visual scenes, and on the other hand, the collective human brainpower to solve challenging and large-scale tasks...
September 19, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence
Kai Li, Guojun Qi, Jun Ye, Kien Hua
Hashing has attracted a great deal of research in recent years due to its effectiveness for the retrieval and indexing of large-scale high-dimensional multimedia data. In this paper, we propose a novel ranking-based hashing framework that maps data from different modalities into a common Hamming space where the cross-modal similarity can be measured using Hamming distance. Unlike existing cross-modal hashing algorithms where the learned hash functions are binary space partitioning functions, such as the sign and threshold function, the proposed hashing scheme takes advantage of a new class of hash functions closely related to rank correlation measures which are known to be scale-invariant, numerically stable, and highly nonlinear...
September 19, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence
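Measuring cross-modal similarity in a common Hamming space, as described in the abstract above, can be sketched as follows. The sketch covers only the retrieval step: the learned ranking-based hash functions are not shown, and the 8-bit codes, item names, and function names are hypothetical values chosen for illustration.

```python
def hamming_distance(code_a, code_b):
    """Number of bit positions in which two binary codes differ."""
    return bin(code_a ^ code_b).count("1")


def rank_by_hamming(query_code, database):
    """Return database keys ordered from nearest to farthest code."""
    return sorted(
        database,
        key=lambda name: hamming_distance(query_code, database[name]),
    )


# Hypothetical 8-bit codes: an image query and text items that are
# assumed to have been hashed into the same Hamming space.
image_query = 0b10110010
text_database = {
    "caption_1": 0b10110011,  # differs from the query in 1 bit
    "caption_2": 0b01001101,  # differs from the query in 8 bits
    "caption_3": 0b10100000,  # differs from the query in 2 bits
}

ranking = rank_by_hamming(image_query, text_database)
print(ranking)
```

Because the distance reduces to an XOR followed by a bit count, comparing a query against many stored codes is cheap, which is part of what makes hashing attractive for large-scale retrieval.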
Nianyi Li, Jinwei Ye, Yu Ji, Haibin Ling, Jingyi Yu
Existing saliency detection approaches use images as inputs and are sensitive to foreground/background similarities, complex background textures, and occlusions. We explore the problem of using light fields as input for saliency detection. Our technique is enabled by the availability of commercial plenoptic cameras that capture the light field of a scene in a single shot. We show that the unique refocusing capability of light fields provides useful focusness, depths, and objectness cues. We further develop a new saliency detection algorithm tailored for light fields...
September 16, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence
David Hofmeyr
Minimum normalised graph cuts are highly effective ways of partitioning unlabeled data, having been made popular by the success of spectral clustering. This work presents a novel method for learning hyperplane separators which minimise this graph cut objective, when data are embedded in Euclidean space. The optimisation problem associated with the proposed method can be formulated as a sequence of univariate subproblems, in which the optimal hyperplane orthogonal to a given vector is determined. These subproblems can be solved in log-linear time, by exploiting the trivial factorisation of the exponential function...
September 15, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence
Martin Danelljan, Gustav Hager, Fahad Shahbaz Khan, Michael Felsberg
Accurate scale estimation of a target is a challenging research problem in visual object tracking. Most state-of-the-art methods employ an exhaustive scale search to estimate the target size. The exhaustive search strategy is computationally expensive and struggles when faced with large scale variations. This paper investigates the problem of accurate and robust scale estimation in a tracking-by-detection framework. We propose a novel scale adaptive tracking approach by learning separate discriminative correlation filters for translation and scale estimation...
September 15, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence
Lihe Zhang, Chuan Yang, Huchuan Lu, Xiang Ruan, Ming-Hsuan Yang
Most existing bottom-up algorithms measure the foreground saliency of a pixel or region based on its contrast within a local context or the entire image, whereas a few methods focus on segmenting out background regions and thereby salient objects. Instead of only considering the contrast between salient objects and their surrounding regions, we consider both foreground and background cues in this work. We rank the similarity of image elements with foreground or background cues via graph-based manifold ranking...
September 14, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence
Xiaojun Chang, Yao-Liang Yu, Yi Yang, Eric P Xing
Pooling plays an important role in generating a discriminative video representation. In this paper, we propose a new semantic pooling approach for challenging event analysis tasks (e.g. event detection, recognition, and recounting) in long untrimmed Internet videos, especially when only a few shots/segments are relevant to the event of interest while many other shots are irrelevant or even misleading. The commonly adopted pooling strategies aggregate the shots indifferently in one way or another, resulting in a great loss of information...
September 13, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence
Donghyeon Cho, Sunyeong Kim, Yu-Wing Tai, In So Kweon
In this paper, we introduce an automatic approach to generate trimaps and consistent alpha mattes of foreground objects in a light-field image. Our method first performs binary segmentation to roughly segment a light-field image into foreground and background based on depth and color. Next, we estimate accurate trimaps through analyzing color distribution along the boundary of the segmentation using guided image filter and KL-divergence. In order to estimate consistent alpha mattes across sub-images, we utilize the epipolar plane image (EPI) where colors and alphas along the same epipolar line must be consistent...
September 7, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence
Xiaowei Zhou, Menglong Zhu, Spyridon Leonardos, Kostas Daniilidis
We investigate the problem of estimating the 3D shape of an object defined by a set of 3D landmarks, given their 2D correspondences in a single image. A successful approach to alleviating the reconstruction ambiguity is the 3D deformable shape model, and a sparse representation is often used to capture complex shape variability. But the model inference is still a challenge due to the nonconvexity in optimization resulting from the joint estimation of shape and viewpoint. In contrast to prior work that relies on an alternating scheme with solutions depending on initialization, we propose a convex approach to addressing this challenge and develop an efficient algorithm to solve the proposed convex program...
September 1, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence
Bo Yang, Hongbin Pei, Hechang Chen, Jiming Liu, Shang Xia
During an epidemic, the spatial, temporal and demographic patterns of disease transmission are determined by multiple factors. In addition to the physiological properties of the pathogens and hosts, the social contact of the host population, which characterizes the reciprocal exposures of individuals to infection according to their demographic structure and various social activities, is also pivotal to understanding and predicting the prevalence of infectious diseases. How social contact is measured will affect the extent to which we can forecast the dynamics of infections in the real world...
September 1, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence
Bo Yang, Yu Lei, Jiming Liu, Wenjie Li
Recommender systems are used to accurately and actively provide users with potentially interesting information or services. Collaborative filtering is a widely adopted approach to recommendation, but sparse data and cold-start users are often barriers to providing high quality recommendations. To address such issues, we propose a novel method that works to improve the performance of collaborative filtering recommendations by integrating sparse rating data given by users and sparse social trust network among these same users...
September 1, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence
Jeff Donahue, Lisa Anne Hendricks, Marcus Rohrbach, Subhashini Venugopalan, Sergio Guadarrama, Kate Saenko, Trevor Darrell
Models based on deep convolutional networks have dominated recent image interpretation tasks; we investigate whether models which are also recurrent are effective for tasks involving sequences, visual and otherwise. We describe a class of recurrent convolutional architectures which is end-to-end trainable and suitable for large-scale visual understanding tasks, and demonstrate the value of these models for activity recognition, image captioning, and video description. In contrast to previous models which assume a fixed visual representation or perform simple temporal averaging for sequential processing, recurrent convolutional models are "doubly deep" in that they learn compositional representations in space and time...
September 1, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence
Fengyuan Zhu, Guangyong Chen, Jianye Hao, Pheng-Ann Heng
Most existing image denoising approaches assume the noise to be homogeneous white Gaussian distributed with known intensity. However, in real noisy images, the noise models are usually unknown beforehand and can be much more complex. This paper addresses this problem and proposes a novel blind image denoising algorithm to recover the clean image from a noisy one with an unknown noise model. To model the empirical noise of an image, our method introduces the mixture of Gaussians distribution, which is flexible enough to approximate different continuous distributions...
August 31, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence
Zhenyue Zhang, Zheng Zhai, Limin Li
Multi-view learning aims to integrate multiple data information from different views to improve the learning performance. The key problem is to handle the unconformities or distortions among view-specific samples or measurements of similarity or dissimilarity. This paper models the view-specific samples as a nonlinear mapping of uniform but latent intact samples for all the views, and the view-specific dissimilarity matrices or similarity matrices are estimated in terms of the uniform latent one. Two methods are then developed for multi-view clustering...
August 19, 2016: IEEE Transactions on Pattern Analysis and Machine Intelligence