NORMA eResearch @NCI Library

Image Captioning: Capsule Network vs CNN approach

Deka, Jaydeep (2020) Image Captioning: Capsule Network vs CNN approach. Masters thesis, Dublin, National College of Ireland.

Abstract

Generating a relevant sentence that describes the objects and activities in a given image is an active research area known as "image captioning". The problem requires integrating computer vision with natural language processing. Over the last decade, several approaches based on neural networks have achieved state-of-the-art results. Most recent work uses an encoder-decoder framework in which a Convolutional Neural Network (CNN) extracts the image features. Although CNN-based solutions have performed remarkably well, a CNN fails to retain information on spatial hierarchy and lacks rotational invariance. These drawbacks are addressed by a comparatively new neural network, the Capsule Network (CapsNet). This research takes a novel approach by implementing an image captioning solution that uses CapsNet as the image feature extractor. Six different models were trained using both CapsNet and CNN on the Flickr8k dataset and evaluated using BLEU-(n) scores. The experiments show convincing results for the CapsNet-based solution, given that it is much smaller than the CNN-based one. The BLEU-1 score of the CapsNet-based solution is 0.536, compared to 0.534 for the CNN-based solution.
Keywords: Image Captioning, Encoder-decoder, CNN, Capsule Network, BLEU-(n), Flickr8k
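
For readers unfamiliar with the BLEU-(n) metric named in the abstract, the following is a minimal sketch of a sentence-level BLEU-n score in plain Python: clipped n-gram precision combined with a brevity penalty. It is an illustration only, not the evaluation code used in the thesis, which may have relied on a library implementation, corpus-level scoring, or smoothing. The function names and the example captions are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of one candidate against several references."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    # For each n-gram, the highest count seen in any single reference.
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def brevity_penalty(candidate, references):
    """Penalise candidates shorter than the closest reference length."""
    c = len(candidate)
    if c == 0:
        return 0.0
    # Closest reference length; ties are broken in favour of the shorter one.
    r = min((len(ref) for ref in references),
            key=lambda length: (abs(length - c), length))
    return 1.0 if c > r else math.exp(1 - r / c)


def bleu(candidate, references, max_n=1):
    """Unsmoothed sentence-level BLEU with uniform weights over 1..max_n."""
    precisions = [modified_precision(candidate, references, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0  # no smoothing in this sketch
    log_mean = sum(math.log(p) for p in precisions) / max_n
    return brevity_penalty(candidate, references) * math.exp(log_mean)


# Hypothetical captions, invented for illustration (not taken from Flickr8k).
candidate = "a dog runs on the grass".split()
references = [
    "a dog is running on the grass".split(),
    "a brown dog runs across a field".split(),
]
print("BLEU-1:", round(bleu(candidate, references, max_n=1), 3))
print("BLEU-2:", round(bleu(candidate, references, max_n=2), 3))
```

BLEU-1 only measures unigram overlap, so it rewards captions that mention the right words regardless of order; higher-order scores such as BLEU-2 to BLEU-4 also reward correct word sequences.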

Item Type: Thesis (Masters)
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
Q Science > QA Mathematics > Computer software
T Technology > T Technology (General) > Information Technology > Computer software
Divisions: School of Computing > Master of Science in Data Analytics
Depositing User: Dan English
Date Deposited: 17 Jun 2020 11:01
Last Modified: 17 Jun 2020 11:01
URI: http://trap.ncirl.ie/id/eprint/4298
