IEEE Transactions on Pattern Analysis and Machine Intelligence

https://www.readbyqxmd.com/read/30235118/capturing-the-geometry-of-object-categories-from-video-supervision
#1
David Novotny, Diane Larlus, Andrea Vedaldi
In this article, we are interested in capturing the 3D geometry of object categories simply by looking around them. Our unsupervised method fundamentally departs from traditional approaches that require either CAD models or manual supervision. It only uses video sequences capturing a handful of instances of an object category to train a deep architecture tailored for extracting 3D geometry predictions. Our deep architecture has three components. First, a Siamese viewpoint factorization network robustly aligns the input videos and, as a consequence, learns to predict the absolute category-specific viewpoint from a single image depicting any previously unseen instance of that category...
September 19, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://www.readbyqxmd.com/read/30235117/order-preserving-optimal-transport-for-distances-between-sequences
#2
Bing Su, Gang Hua
We present new distance measures between sequences that can tackle local temporal distortion and periodic sequences with arbitrary starting points. Through viewing the instances of each sequence as empirical samples of an unknown distribution, we cast the calculations of distances between sequences as optimal transport problems. To preserve the inherent temporal relationships of the instances in sequences, we propose two methods through incorporating the temporal information into the spatial ground metric and concentrating the transport with two novel temporal regularization terms, respectively...
September 14, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://www.readbyqxmd.com/read/30222552/material-classification-from-time-of-flight-distortions
#3
Kenichiro Tanaka, Yasuhiro Mukaigawa, Takuya Funatomi, Hiruyuki Kubo, Yasuyuki Matsushita, Yasushi Yagi
This paper presents a material classification method using an off-the-shelf Time-of-Flight (ToF) camera. The proposed method is built upon a key observation that the depth measurement by a ToF camera is distorted for objects with certain materials, especially with translucent materials. We show that this distortion is due to the variation of time domain impulse responses across materials and also due to the measurement mechanism of the ToF cameras. Specifically, we reveal that the amount of distortion varies according to the modulation frequency of the ToF camera, the object material, and the distance between the camera and object...
September 12, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://www.readbyqxmd.com/read/30222551/segmentation-of-laser-point-clouds-in-urban-areas-by-a-modified-normalized-cut-method
#4
Avishek Dutta, Johannes Engels, Michael Hahn
Normalized Cut is a well-established divisive image segmentation method, which we adapt in this paper for the segmentation of laser point clouds in urban areas. Our focus is on polyhedral objects with planar surfaces. Due to its target function, Normalized Cut favours cuts with "short cut lines" or "small cut surfaces", which is a drawback for our application. We therefore modify the target function, weighting the similarity measures with distance-dependent weights. We call the induced minimization problem "Distance-weighted Cut" (DWCut)...
September 12, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
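As a rough illustration of the target function mentioned in the abstract above, the following sketch evaluates the standard Normalized Cut value of a two-way partition of a small weighted graph. It does not implement the paper's distance-weighted variant (DWCut), whose weights are not given in the abstract. The function name `normalized_cut` and the toy similarity matrix are assumptions made only for this example.

```python
def normalized_cut(weights, part_a):
    """Standard two-way Normalized Cut value for a symmetric similarity matrix.

    weights: square list of lists of non-negative similarities (hypothetical data).
    part_a:  indices of the nodes in the first segment; the rest form the second.
    """
    n = len(weights)
    a = set(part_a)
    b = set(range(n)) - a
    # Total similarity removed by separating the two segments.
    cut = sum(weights[i][j] for i in a for j in b)
    # Total similarity from each segment to every node in the graph.
    assoc_a = sum(weights[i][j] for i in a for j in range(n))
    assoc_b = sum(weights[i][j] for i in b for j in range(n))
    # Assumes both segments have non-zero association (no isolated segment).
    return cut / assoc_a + cut / assoc_b


# Toy graph: nodes 0-1 and 2-3 are strongly linked, 1-2 only weakly.
W = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.1, 0.0],
    [0.0, 0.1, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]

balanced = normalized_cut(W, [0, 1])   # cuts only the weak 1-2 link
lopsided = normalized_cut(W, [0])      # cuts the strong 0-1 link
print(balanced, lopsided)
```

In this toy graph the partition that severs only the weak link gets the lower value, which is the behaviour a minimizer of the target function would prefer.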
https://www.readbyqxmd.com/read/30222550/motion-segmentation-via-generalized-curvatures
#5
Robert Arn, Pradyumna Narayana, Bruce Draper, Tegan Emerson, Michael Kirby, Chris Peterson
New depth sensors, like the Microsoft Kinect, produce streams of human pose data. These discrete pose streams can be viewed as noisy samples of an underlying continuous ideal curve that describes a trajectory through high-dimensional pose space. This paper introduces a technique for generalized curvature analysis (GCA) that determines features along the trajectory which can be used to characterize change and segment motion. Tools are developed for approximating generalized curvatures at mean points along a curve in terms of the singular values of local mean-centered data balls...
September 12, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://www.readbyqxmd.com/read/30207950/multivariate-mixture-model-for-myocardial-segmentation-combining-multi-source-images
#6
Xiahai Zhuang
The author proposes a method for simultaneous registration and segmentation of multi-source images, using the multivariate mixture model (MvMM) and maximum of log-likelihood (LL) framework. Specifically, the method is applied to the myocardial segmentation combining the complementary information from multi-sequence (MS) cardiac magnetic resonance (CMR) images. For the image misalignment and incongruent data, the MvMM is formulated with transformations and is further generalized for dealing with the hetero-coverage multi-modality images (HC-MMIs)...
September 10, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://www.readbyqxmd.com/read/30207949/estimating-the-number-of-correct-matches-using-only-spatial-order
#7
Lior Talker, Yael Moses, Ilan Shimshoni
Correctly matching feature points in a pair of images is an important preprocessing step for many computer vision applications. In this paper we propose an efficient method for estimating the number of correct matches without explicitly computing them. To this end, we propose to analyze the set of matches using the spatial order of the features, as projected to the x-axis of the image. The set of features in each image is thus represented by a sequence, and analyzed using the Kendall and Spearman Footrule distance metrics between permutations...
September 10, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
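The two permutation distances named in the abstract above can be sketched in a few lines. This is only an illustration of the Kendall and Spearman Footrule distances themselves, not of the paper's estimator for the number of correct matches. The function names and the sample orderings are assumptions made for the example; each list is read as the left-to-right order of the same matched features in one image.

```python
from itertools import combinations


def kendall_distance(p, q):
    """Number of item pairs that appear in opposite order in p and q."""
    pos_q = {item: rank for rank, item in enumerate(q)}
    mapped = [pos_q[item] for item in p]
    return sum(
        1
        for i, j in combinations(range(len(mapped)), 2)
        if mapped[i] > mapped[j]
    )


def footrule_distance(p, q):
    """Sum over items of the absolute difference between their ranks in p and q."""
    pos_p = {item: rank for rank, item in enumerate(p)}
    pos_q = {item: rank for rank, item in enumerate(q)}
    return sum(abs(pos_p[item] - pos_q[item]) for item in pos_p)


# Hypothetical match ids sorted by x-coordinate in each image.
order_image_1 = [0, 1, 2, 3, 4]
order_image_2 = [4, 0, 1, 2, 3]   # match 4 lands far from where it should

print(kendall_distance(order_image_1, order_image_2))   # 4
print(footrule_distance(order_image_1, order_image_2))  # 8
```

Both distances are zero when the spatial order is identical in the two images and grow as more matches fall out of order.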
https://www.readbyqxmd.com/read/30207948/anthropomorphic-features-for-on-line-signatures
#8
Moises Diaz, Miguel A A Ferrer, Jose J Quintana Hernandez
Many features have been proposed in on-line signature verification. Generally, these features rely on the position of the on-line signature samples and their dynamic properties, as recorded by a tablet. This paper proposes a novel feature space to efficiently describe on-line signatures. Since producing a signature requires a skeletal arm system and its associated muscles, the new feature space is based on characterizing the movement of the shoulder, the elbow and the wrist joints when signing. As this motion is not directly obtained from a digital tablet, the new features are calculated by means of a virtual skeletal arm (VSA) model, which simulates the architecture of a real arm and forearm...
September 10, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://www.readbyqxmd.com/read/30188812/height-from-polarisation-with-unknown-lighting-or-albedo
#9
William Smith, Ravi Ramamoorthi, Silvia Tozza
We present a method for estimating surface height directly from a single polarisation image simply by solving a large, sparse system of linear equations. To do so, we show how to express polarisation constraints as equations that are linear in the unknown height. The local ambiguity in the surface normal azimuth angle is resolved globally when the optimal surface height is reconstructed. Our method is applicable to dielectric objects exhibiting diffuse and specular reflectance, though lighting and albedo must be known...
September 6, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://www.readbyqxmd.com/read/30188814/discriminant-functional-learning-of-color-features-for-the-recognition-of-facial-action-units-and-their-intensities
#10
Fabian Benitez-Quiroz, Ramprakash Srinivasan, Aleix M Martinez
Color is a fundamental image feature of facial expressions. For example, when we furrow our eyebrows in anger, blood rushes in and a reddish color becomes apparent around that area of the face. Surprisingly, these image properties have not been exploited to recognize the facial action units (AUs) associated with these expressions. Herein, we present the first system to do recognition of AUs and their intensities using these functional color changes. These color features are shown to be robust to changes in identity, gender, race, ethnicity and skin color...
September 5, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://www.readbyqxmd.com/read/30188813/transferable-representation-learning-with-deep-adaptation-networks
#11
Mingsheng Long, Yue Cao, Zhangjie Cao, Jianmin Wang, Michael I Jordan
Domain adaptation generalizes a learning machine across source domain and target domain under different distributions. Recent studies reveal that deep neural networks can learn transferable features generalizing well to similar novel tasks for domain adaptation. However, as deep features eventually transition from general to specific along the network, feature transferability drops significantly in higher task-specific layers with increasing domain discrepancy. To formally reduce the dataset shift and enhance the feature transferability in task-specific layers, this paper presents a novel framework for deep adaptation networks, which generalizes deep convolutional neural networks to domain adaptation...
September 5, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://www.readbyqxmd.com/read/30183621/temporal-segment-networks-for-action-recognition-in-videos
#12
Limin Wang, Yuanjun Xiong, Zhe Wang, Yu Qiao, Dahua Lin, Xiaoou Tang, Luc Van Gool
Deep convolutional networks have achieved great success for image recognition. However, for action recognition in videos, their advantage over traditional methods is not so evident. We present a general and flexible video-level framework for learning action models in videos. This method, called temporal segment network (TSN), aims to model long-range temporal structures with a new segment-based sampling and aggregation module. This unique design enables our TSN to efficiently learn action models by using the whole action videos...
September 3, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
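The segment-based sampling and aggregation idea in the abstract above can be sketched roughly as follows: split the video into equal-length segments, draw one frame from each, and combine the per-segment predictions. The abstract does not specify the details, so the random choice within each segment, the averaging used for aggregation, and the names `sample_segment_indices` and `aggregate_scores` are all assumptions made for this illustration.

```python
import random


def sample_segment_indices(num_frames, num_segments, rng=None):
    """Pick one frame index from each of num_segments equal-length segments."""
    if num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    rng = rng or random.Random()
    # Segment k covers frame indices bounds[k] .. bounds[k + 1] - 1.
    bounds = [num_frames * k // num_segments for k in range(num_segments + 1)]
    return [rng.randrange(bounds[k], bounds[k + 1]) for k in range(num_segments)]


def aggregate_scores(segment_scores):
    """Average per-segment class scores into one video-level score vector.

    Averaging is an assumed aggregation function, chosen for simplicity.
    """
    num_classes = len(segment_scores[0])
    return [
        sum(scores[c] for scores in segment_scores) / len(segment_scores)
        for c in range(num_classes)
    ]


# A 30-frame video split into 3 segments: one index from 0-9, 10-19, 20-29.
indices = sample_segment_indices(30, 3, random.Random(0))
print(indices)

# Hypothetical two-class scores for two sampled frames.
print(aggregate_scores([[1.0, 0.0], [0.0, 1.0]]))  # [0.5, 0.5]
```

Because every segment contributes one frame, the sampled frames span the whole video regardless of its length, while the cost stays fixed at `num_segments` frames.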
https://www.readbyqxmd.com/read/30183620/representation-learning-by-rotating-your-faces
#13
Luan Quoc Tran, Xi Yin, Xiaoming Liu
The large pose discrepancy between two face images is one of the fundamental challenges in automatic face recognition. Conventional approaches to pose-invariant face recognition either perform face frontalization on, or learn a pose-invariant representation from, a non-frontal face image. We argue that it is more desirable to perform both tasks jointly to allow them to leverage each other. To this end, this paper proposes a Disentangled Representation learning-Generative Adversarial Network (DR-GAN) with three distinct novelties...
September 3, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://www.readbyqxmd.com/read/30183619/dense-3d-object-reconstruction-from-a-single-depth-view
#14
Bo Yang, Stefano Rosa, Andrew Markham, Niki Trigoni, Hongkai Wen
In this paper, we propose a novel approach, 3D-RecGAN++, which reconstructs the complete 3D structure of a given object from a single arbitrary depth view using generative adversarial networks. Unlike existing work which typically requires multiple views of the same object or class labels to recover the full 3D geometry, the proposed 3D-RecGAN++ only takes the voxel grid representation of a depth view of the object as input, and is able to generate the complete 3D occupancy grid with a high resolution of $256^3$ by recovering the occluded/missing regions...
September 3, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://www.readbyqxmd.com/read/30176581/generative-zero-shot-learning-via-low-rank-embedded-semantic-dictionary
#15
Zhengming Ding, Ming Shao, Yun Fu
Zero-shot learning for visual recognition, which recognizes unseen categories through a shared visual-semantic function learned on the seen categories and expected to adapt well to unseen categories, has received considerable research attention in recent years. In this paper, we propose a two-stage generative adversarial network to enhance the generalizability of the semantic dictionary through low-rank embedding for zero-shot learning. Specifically, we formulate a novel framework to jointly seek a two-stage generative model and a semantic dictionary to link visual features with their semantic representations under low-rank embedding, which manages to capture shared features across different observed classes...
August 30, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://www.readbyqxmd.com/read/30176580/nonlinear-asymmetric-multi-valued-hashing
#16
Cheng Da, Gaofeng Meng, Shiming Xiang, Kun Ding, Shibiao Xu, Qing Yang, Chunhong Pan
Most existing hashing methods resort to binary codes for large scale similarity search, owing to the high efficiency of computation and storage. However, binary codes lack enough capability in similarity preservation, resulting in less desirable performance. To address this issue, we propose Nonlinear Asymmetric Multi-Valued Hashing (NAMVH) supported by two distinct non-binary embeddings. Specifically, a real-valued embedding is used for representing the newly-coming query by an ideally nonlinear transformation...
August 30, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://www.readbyqxmd.com/read/30136933/semi-supervised-domain-adaptation-by-covariance-matching
#17
Limin Li, Zhenyue Zhang
Transferring knowledge from a source domain to a target domain by domain adaptation has been an interesting and challenging problem in many machine learning applications. The key problem is how to match the data distributions of the two heterogeneous domains in a proper way such that they can be treated indifferently for learning. We propose a covariance matching approach, DACoM, for semi-supervised domain adaptation. DACoM embeds the original samples into a common latent space linearly such that the covariance mismatch of the two mapped distributions is minimized, and the local geometric structure and discriminative information are preserved simultaneously...
August 23, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://www.readbyqxmd.com/read/30136932/personalized-saliency-and-its-prediction
#18
Yanyu Xu, Shenghua Gao, Junru Wu, Nianyi Li, Jingyi Yu
Almost all existing visual saliency models focus on predicting a universal saliency map across all observers. Yet psychology studies suggest that visual attention of different observers can vary a lot under some circumstances. In this paper, we set out to study this heterogeneous visual attention pattern between different observers and build the first dataset for personalized saliency detection. Further, we propose to decompose a personalized saliency map (PSM) into a universal saliency map (USM), which can be predicted by any existing saliency detection model, and a discrepancy between them...
August 23, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://www.readbyqxmd.com/read/30136931/hierarchical-binary-cnns-for-landmark-localization-with-limited-resources
#19
Adrian Bulat, Yorgos Tzimiropoulos
Our goal is to design architectures that retain the groundbreaking performance of Convolutional Neural Networks (CNNs) for landmark localization and at the same time are lightweight, compact and suitable for applications with limited computational resources. To this end, we make the following contributions: (a) we are the first to study the effect of neural network binarization on localization tasks, namely human pose estimation and face alignment. We exhaustively evaluate various design choices, identify performance bottlenecks, and more importantly propose multiple orthogonal ways to boost performance...
August 23, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
https://www.readbyqxmd.com/read/30130178/video-imprint
#20
Zhanning Gao, Le Wang, Nebojsa Jojic, Zhenxing Niu, Nanning Zheng, Gang Hua
A new unified video analytics framework (ER3) is proposed for complex event retrieval, recognition and recounting, based on the proposed video imprint representation, which exploits temporal correlations among image features across video frames. With the video imprint representation, it is convenient to reverse map back to both temporal and spatial locations in video frames, allowing for both key frame identification and key areas localization within each frame. In the proposed framework, a dedicated feature alignment module is incorporated for redundancy removal across frames to produce the tensor representation, i...
August 20, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence