Read by QxMD

image captioning

Feiran Huang, Xiaoming Zhang, Zhoujun Li, Zhonghua Zhao
Image-text matching with deep models has recently achieved remarkable results in many tasks, such as image captioning and image search. A major challenge in matching images and text is that they usually have complicated underlying relations, and modeling these relations simplistically may lead to suboptimal performance. In this paper, we develop a novel approach, Bi-directional Spatial-Semantic Attention Networks (BSSAN), which leverages both the word-to-regions (W2R) relation and the image-object-to-words (O2W) relation in a holistic deep framework for more effective matching...
November 19, 2018: IEEE Transactions on Image Processing: a Publication of the IEEE Signal Processing Society
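The word-to-regions attention at the heart of models like BSSAN can be illustrated with a minimal sketch in plain NumPy. The dimensions and random features below are made up for illustration; the actual BSSAN architecture is bidirectional and considerably more involved:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def word_to_region_attention(words, regions):
    """For each word embedding, attend over all image-region features
    and return one attended region context vector per word."""
    d = words.shape[-1]
    scores = words @ regions.T / np.sqrt(d)   # (n_words, n_regions)
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ regions                  # (n_words, d)

rng = np.random.default_rng(0)
words = rng.normal(size=(5, 16))    # 5 word embeddings, dim 16
regions = rng.normal(size=(9, 16))  # 9 region features, dim 16
ctx = word_to_region_attention(words, regions)
```

Each word's context vector is a convex combination of region features, so region-word relevance is learned rather than hand-specified.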
Konda Reddy Mopuri, Utsav Garg, R Venkatesh Babu
Deep convolutional neural networks (CNN) have revolutionized various fields of vision research and have seen unprecedented adoption for multiple tasks such as classification, detection, captioning, etc. However, they offer little transparency into their inner workings and are often treated as black boxes that deliver excellent performance. In this work, we aim at alleviating this opaqueness of CNNs by providing visual explanations for the network's predictions. Our approach can analyze a variety of CNN based models trained for vision applications such as object recognition and caption generation...
November 16, 2018: IEEE Transactions on Image Processing: a Publication of the IEEE Signal Processing Society
Tharindu Fernando, Simon Denman, Sridha Sridharan, Clinton Fookes
As humans, we possess an intuitive ability for navigation, which we master through years of practice; however, existing approaches to modeling this trait for diverse tasks, including monitoring pedestrian flow and detecting abnormal events, have been limited to a variety of hand-crafted features. Recent research in deep learning has demonstrated the power of learning features directly from the data, and related research in recurrent neural networks has shown exemplary results in sequence-to-sequence problems such as neural machine translation and neural image caption generation...
September 20, 2018: Neural Networks: the Official Journal of the International Neural Network Society
Harry A Atwater, Artur R Davoyan, Ognjen Ilic, Deep Jariwala, Michelle C Sherrott, Cora M Went, William S Whitney, Joeson Wong
In the version of this Perspective originally published, Fig. 1 was missing the following credit line from the caption: 'Background image from ESA/Hubble (A. Fujii).' This has now been corrected in the online versions of the Perspective.
October 12, 2018: Nature Materials
Marcus A Badgeley, Manway Liu, Benjamin S Glicksberg, Mark Shervey, John Zech, Khader Shameer, Joseph Lehar, Eric K Oermann, Michael V McConnell, Thomas M Snyder, Joel T Dudley
Motivation: Radiologists have used algorithms for Computer-Aided Diagnosis (CAD) for decades. These algorithms use machine learning with engineered features, and there have been mixed findings on whether they improve radiologists' interpretations. Deep learning offers superior performance, but requires more training data and has not been evaluated in joint algorithm-radiologist decision systems. Results: We developed the Computer-Aided Note and Diagnosis Interface (CANDI) for collaboratively annotating radiographs and evaluating how algorithms alter human interpretation...
October 10, 2018: Bioinformatics
Mattia Gentile, Emanuele Agolini, Dario Cocciadiferro, Romina Ficarella, Emanuela Ponzi, Emanuele Bellacchio, Maria F Antonucci, Antonio Novelli
Biallelic exostosin-2 (EXT2) pathogenic variants have been described as the cause of the Seizures-Scoliosis-Macrocephaly syndrome (OMIM 616682) characterized by intellectual disability, facial dysmorphisms and seizures. More recently, it has been proposed to rename this disorder with the acronym AREXT2 (autosomal recessive EXT2-related syndrome). Here, we report the third family affected by AREXT2 syndrome, harboring compound missense variants in EXT2, p.Asp227Asn, and p.Tyr608Cys. In addition, our patients developed multiple exostoses, which were not observed in the previously described families...
October 4, 2018: Clinical Genetics
Maryam Ghanbarian, Mohammad Hossein Nicknam, Alireza Mesdaghinia, Masud Yunesian, Mohammad Sadegh Hassanvand, Narjes Soleimanifar, Soheila Rezaei, Zahra Atafar, Marjan Ghanbarian, Maryam Faraji, Mohammad Ghanbari Ghozikali, Kazem Naddafi
The original version of this article unfortunately contained a mistake. The Figure 6 caption should read: "The light microscopic image (a) and transmission electron microscopic image (b) of A549 cell after 24 h of exposure to PM10 (150 μg/ml)."
September 4, 2018: Biological Trace Element Research
Pierre Antherieu, R Levy, T De Saint Denis, L Lohkamp, G Paternoster, F Di Rocco, N Boddaert, M Zerah
The recently published article contained errors: the figures and figure captions were interchanged during the publication process of the paper.
August 22, 2018: Child's Nervous System: ChNS: Official Journal of the International Society for Pediatric Neurosurgery
Mingxing Zhang, Yang Yang, Hanwang Zhang, Yanli Ji, Heng Tao Shen, Tat-Seng Chua
Recently, great progress in automatic image captioning has been achieved by using semantic concepts detected from the image. However, we argue that the existing concepts-to-caption framework, in which the concept detector is trained on image-caption pairs to minimize the vocabulary discrepancy, suffers from a deficiency of insufficient concepts. The reasons are two-fold: 1) the extreme imbalance between the numbers of positive and negative samples for each concept, and 2) the incomplete labeling in training captions caused by biased annotation and the usage of synonyms...
January 2019: IEEE Transactions on Image Processing: a Publication of the IEEE Signal Processing Society
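The class-imbalance issue named above (point 1) is commonly mitigated by upweighting the rare positive labels in the concept detector's loss. The following is a generic sketch of that remedy with made-up numbers, not the authors' actual method:

```python
import numpy as np

def weighted_bce(y_true, y_pred, pos_weight):
    """Binary cross-entropy with a positive-class weight, a common
    remedy for extreme positive/negative label imbalance."""
    eps = 1e-7
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    per_label = -(pos_weight * y_true * np.log(y_pred)
                  + (1.0 - y_true) * np.log(1.0 - y_pred))
    return float(per_label.mean())

# A rare positive concept: upweight positives so they are not
# drowned out by the many negatives.
y_true = np.array([1.0, 0.0, 0.0, 0.0])
y_pred = np.array([0.6, 0.1, 0.2, 0.1])
loss = weighted_bce(y_true, y_pred, pos_weight=3.0)
```

With `pos_weight > 1`, a missed positive costs more than a missed negative, which shifts the detector toward recalling rare concepts.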
Senmao Ye, Nian Liu, Junwei Han
We propose a novel attention framework called attentive linear transformation (ALT). Instead of learning the spatial or channel-wise attention of existing models, ALT learns to attend to the high-dimensional transformation matrix from the image feature space to the context vector space. Thus ALT can learn various relevant feature abstractions, including spatial attention, channel-wise attention, and visual dependence. In addition, we propose a soft threshold regression to predict the attention probabilities of local regions...
July 12, 2018: IEEE Transactions on Image Processing: a Publication of the IEEE Signal Processing Society
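Conventional spatial soft attention, the kind of mechanism ALT generalizes, can be sketched as follows. The weights here are randomly initialized purely for illustration; in a real captioner they are learned end to end:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def spatial_soft_attention(features, hidden, W_f, W_h, v):
    """Classic spatial soft attention: score each of L image regions
    against the decoder hidden state, then take a weighted average."""
    # features: (L, D); hidden: (H,)
    scores = np.tanh(features @ W_f + hidden @ W_h) @ v  # (L,)
    alpha = softmax(scores)                              # attention over regions
    return alpha @ features, alpha                       # context: (D,)

rng = np.random.default_rng(1)
L, D, H, K = 49, 32, 64, 16   # 7x7 grid of regions, toy sizes
features = rng.normal(size=(L, D))
hidden = rng.normal(size=(H,))
W_f = rng.normal(size=(D, K))
W_h = rng.normal(size=(H, K))
v = rng.normal(size=(K,))
context, alpha = spatial_soft_attention(features, hidden, W_f, W_h, v)
```

ALT's contribution is to attend over the feature-to-context transformation itself rather than only over the spatial grid, subsuming this spatial form as a special case.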
Cesc Chunseong Park, Byeongchang Kim, Gunhee Kim
We address personalized image captioning, which generates a descriptive sentence for a user's image, accounting for prior knowledge such as her active vocabularies or writing style in her previous documents. As applications of personalized image captioning, we solve two post automation tasks in social networks: hashtag prediction and post generation. The hashtag prediction predicts a list of hashtags for an image, while the post generation creates a natural post text consisting of normal words, emojis, and even hashtags...
April 10, 2018: IEEE Transactions on Pattern Analysis and Machine Intelligence
Xiao Xie, Xiwen Cai, Junpei Zhou, Nan Cao, Yingcai Wu
Interactive visualization of large image collections is important and useful in many applications, such as personal album management and user profiling on images. However, most prior studies focus on using low-level visual features of images, such as texture and color histogram, to create visualizations without considering the more important semantic information embedded in images. This paper proposes a novel visual analytic system to analyze images in a semantic-aware manner. The system mainly comprises two components: a semantic information extractor and a visual layout generator...
May 15, 2018: IEEE Transactions on Visualization and Computer Graphics
Kun Fu, Jin Li, Junqi Jin, Changshui Zhang
Image captioning aims to generate natural language sentences that describe the salient parts of a given image. Although neural networks have recently achieved promising results, a key problem is that they can only describe concepts seen in the training image-sentence pairs. Efficient learning of novel concepts has thus become a topic of recent interest, as it alleviates the expensive manual labor of data labeling. In this paper, we propose a novel method, Image-Text Surgery, to synthesize pseudo image-sentence pairs. The pseudo pairs are generated under the guidance of a knowledge base, with syntax from a seed data set (i...
December 2018: IEEE Transactions on Neural Networks and Learning Systems
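The caption side of such pseudo-pair synthesis can be sketched with a toy example. The `seed_pairs` and `knowledge_base` contents and the filenames below are entirely hypothetical; the actual Image-Text Surgery method also manipulates the image side under knowledge-base guidance:

```python
# Hypothetical seed data and knowledge base mapping a seen concept
# to visually analogous novel concepts.
seed_pairs = [("dog.jpg", "a dog is running on the grass")]
knowledge_base = {"dog": ["zebra", "okapi"]}

def synthesize_pseudo_pairs(seed_pairs, kb):
    """Swap a seed concept in each caption for each novel concept,
    yielding pseudo image-sentence pairs for the novel concepts."""
    pseudo = []
    for image, sentence in seed_pairs:
        for seed_word, novel_words in kb.items():
            if seed_word in sentence.split():
                for novel in novel_words:
                    pseudo.append((f"{novel}.jpg",
                                   sentence.replace(seed_word, novel)))
    return pseudo

pairs = synthesize_pseudo_pairs(seed_pairs, knowledge_base)
```

Each synthesized caption keeps the seed sentence's syntax while introducing a concept absent from the original training pairs.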
Sachin Muralidhara, Michael J Paul
BACKGROUND: Social media provides a complementary source of information for public health surveillance. The dominant data source for this type of monitoring is the microblogging platform Twitter, which is convenient due to the free availability of public data. Less is known about the utility of other social media platforms, despite their popularity. OBJECTIVE: This work aims to characterize the health topics that are prominently discussed on the image-sharing platform Instagram, as a step toward understanding how these data might be used for public health research...
June 29, 2018: JMIR Public Health and Surveillance
M Kaiser, M Jacobson, P H Andersen, P Bækbo, J J Cerón, J Dahl, D Escribano, S Jacobsen
The original article [1] contains an error whereby the caption of Figure 8 is incorrect; the correct caption can be seen alongside its respective image.
June 1, 2018: BMC Veterinary Research
Toshiya Miyatsu, Reshma Gouravajhala, Robert M Nosofsky, Mark A McDaniel
Learning naturalistic categories, which tend to have fuzzy boundaries and vary on many dimensions, can often be harder than learning well-defined categories. One method for facilitating category learning of naturalistic stimuli may be to provide explicit feature descriptions that highlight the characteristic features of each category. Although this method is commonly used in textbooks and classrooms, it remains theoretically uncertain whether feature descriptions benefit the learning of complex natural-science categories...
April 26, 2018: Journal of Experimental Psychology. Learning, Memory, and Cognition
Caio Rodrigues-Silva, Ricardo A R Monteiro, Márcia Dezotti, Adrián M T Silva, Eugénia Pinto, Rui A R Boaventura, Vítor J P Vilar
In the present work, a facile method to prepare translucent anatase thin films on cellulose acetate monolithic (CAM) structures was developed. A simple sol-gel method was applied to synthesize photoactive TiO2 anatase nanoparticles using tetra-n-butyl titanium as precursor. The immobilization of the photocatalyst on CAM structures was performed by a simple dip-coating method. The translucent anatase thin films allow the UV light penetration through the CAM internal walls. The photocatalytic activity was tested on the degradation of n-decane (model volatile organic compound-VOC) in gas phase, using a tubular lab-scale (irradiated by simulated solar light) and pilot-scale (irradiated by natural solar light or UVA light) reactors packed with TiO2 -CAM structures, both equipped with compound parabolic collectors (CPCs)...
April 26, 2018: Environmental Science and Pollution Research International
An Tang, Roger Tam, Alexandre Cadrin-Chênevert, Will Guest, Jaron Chong, Joseph Barfett, Leonid Chepelev, Robyn Cairns, J Ross Mitchell, Mark D Cicero, Manuel Gaudreau Poudrette, Jacob L Jaremko, Caroline Reinhold, Benoit Gallix, Bruce Gray, Raym Geis
Artificial intelligence (AI) is rapidly moving from an experimental phase to an implementation phase in many fields, including medicine. The combination of improved availability of large datasets, increasing computing power, and advances in learning algorithms has created major performance breakthroughs in the development of AI applications. In the last 5 years, AI techniques known as deep learning have delivered rapidly improving performance in image recognition, caption generation, and speech recognition...
May 2018: Canadian Association of Radiologists Journal, Journal L'Association Canadienne des Radiologistes
Ruth C Fong, Walter J Scheirer, David D Cox
Machine learning is a field of computer science that builds algorithms that learn. In many cases, machine learning algorithms are used to recreate a human ability like adding a caption to a photo, driving a car, or playing a game. While the human brain has long served as a source of inspiration for machine learning, little effort has been made to directly use data collected from working brains as a guide for machine learning algorithms. Here we demonstrate a new paradigm of "neurally-weighted" machine learning, which takes fMRI measurements of human brain activity from subjects viewing images, and infuses these data into the training process of an object recognition learning algorithm to make it more consistent with the human brain...
March 29, 2018: Scientific Reports
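One simple way to infuse per-example neural measurements into training, in the spirit of this "neurally-weighted" paradigm, is to reweight each example's loss by a score derived from brain activity. The numbers below are made up for illustration and do not reflect the paper's actual fMRI pipeline:

```python
import numpy as np

def neurally_weighted_loss(losses, neural_scores):
    """Reweight per-example losses by a normalized measure derived
    from brain activity, so 'human-consistent' examples count more."""
    w = neural_scores / neural_scores.sum()
    return float((w * losses).sum())

losses = np.array([0.5, 1.0, 2.0])       # per-example model losses
neural = np.array([1.0, 1.0, 2.0])       # hypothetical fMRI-derived weights
print(neurally_weighted_loss(losses, neural))  # → 1.375
```

The third example, with the strongest neural score, contributes half of the total loss, biasing training toward examples the human brain treats as salient.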
Airi Nishimi, Takeo Isozaki, Shinichiro Nishimi, Sho Ishii, Takahiro Tokunaga, Hidekazu Furuya, Kuninobu Wakabayashi, Tsuyoshi Kasama
The original version of this article, unfortunately, contained errors. Figure citation, caption, image and updated sentence in the Result section are now presented correctly in this article.
April 2018: Clinical Rheumatology