Project: Missouri River Fish Classification

9 classes of fish in Missouri River

Over the past few years, deep learning has been widely used in image recognition and has obtained very good results. In this project, several state-of-the-art deep learning models and their combinations have been applied to fish recognition in images, in particular 9 common species of fish in Missouri rivers. Four different data processing and machine learning pipelines have been developed, and extensive experiments have been conducted to evaluate their performance. The deep convolutional neural network (CNN) models used in these pipelines include SSD, VGG16, ResNet50, etc.

The four pipelines are image-based, instance-based, instance rotation based, and ensemble, in order of increasing complexity. The image-based pipeline does no preprocessing: it takes an entire image as input and classifies it into one of the target classes using a deep CNN. This pipeline achieved up to 75.57% classification accuracy on our test dataset. The instance-based pipeline consists of object detection by one deep CNN followed by classification by another deep CNN. This method achieved up to 80.03% accuracy on our test dataset. The instance rotation based pipeline adds a deep CNN that does pose estimation between object detection and classification. The posture-adjusted fish image is used as the input to the classification model, which helps the pipeline achieve up to 82.83% accuracy on the same dataset. Finally, the ensemble pipeline is a combination of two instance rotation based pipelines that differ only in the classification model: one uses VGG16 and the other ResNet50. The ensemble pipeline achieved up to 87.22% accuracy, the highest of all four pipelines.
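The first three pipelines differ only in which stages sit in front of the classifier, and a small sketch makes that concrete. Everything here is hypothetical: `detect`, `estimate_pose`, `adjust_pose` and `classify` are stand-in callables for the CNNs described above, not the project's actual code.

```python
def image_based(image, classify):
    """Image-based: classify the whole image with no preprocessing."""
    return classify(image)


def instance_based(image, detect, classify):
    """Instance-based: one CNN finds the fish, another classifies the crop."""
    fish = detect(image)
    return classify(fish)


def instance_rotation_based(image, detect, estimate_pose, adjust_pose, classify):
    """Instance rotation based: normalize the fish's pose before classifying."""
    fish = detect(image)
    pose = estimate_pose(fish)
    return classify(adjust_pose(fish, pose))


if __name__ == "__main__":
    # Toy stand-ins (assumptions) that only record which stages ran, in order.
    detect = lambda img: img + " > crop"
    estimate_pose = lambda fish: "pose"
    adjust_pose = lambda fish, pose: fish + " > rotate"
    classify = lambda x: x + " > classify"
    print(instance_rotation_based("image", detect, estimate_pose, adjust_pose, classify))
```

Passing the stages in as arguments keeps each pipeline a thin wrapper, so swapping VGG16 for ResNet50 only means passing a different `classify`.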

 



Dataset

Screen Shot 2018-04-11 at 9.32.03 PM.png

Pipeline:

Screen Shot 2018-03-18 at 10.47.07 AM.png

Detection

For detection, we used SSD (the Caffe implementation) and YOLO v2.

SSD (Single Shot MultiBox Detector)

  • Framework: Caffe
  • Input size: 512×512
  • Base net: VGG pretrained on ImageNet

With SSD, you will usually need to train a new model on your own dataset. For convenience, I recommend generating the training data in the required format with Kitti-SSD: https://github.com/jinfagang/kitti-ssd
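As a rough illustration of what a KITTI-style annotation looks like, here is a minimal sketch that writes one label line for a 2D bounding box. The standard KITTI object label has 15 whitespace-separated fields (type, truncated, occluded, alpha, the four box corners, then 3D dimensions, location and rotation). Filling the unused 3D fields with zeros is my assumption, and so is the class name `catfish`; check the repository's README for what it actually expects.

```python
def kitti_label_line(class_name, xmin, ymin, xmax, ymax):
    """Format one 2D box as a 15-field KITTI-style label line.

    Only the class and the 2D box carry information here; the 3D fields
    are zero placeholders (an assumption, not a requirement of the tool).
    """
    if not class_name or any(ch.isspace() for ch in class_name):
        raise ValueError("class name must be a single whitespace-free token")
    if xmax <= xmin or ymax <= ymin:
        raise ValueError("box must have positive width and height")
    fields = [
        class_name,
        "0.00",            # truncated
        "0",               # occluded
        "0.00",            # alpha (observation angle)
        f"{xmin:.2f}", f"{ymin:.2f}", f"{xmax:.2f}", f"{ymax:.2f}",  # 2D box
        "0.00", "0.00", "0.00",   # 3D height, width, length
        "0.00", "0.00", "0.00",   # 3D location x, y, z
        "0.00",            # rotation_y
    ]
    return " ".join(fields)


if __name__ == "__main__":
    print(kitti_label_line("catfish", 10, 20.5, 110, 220))
```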

YOLO v2 (You Only Look Once)

  • Framework: Darknet
  • Input size: 544×544
  • Real-time object detection
  • Base net: Darknet19 pretrained on ImageNet
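When adapting YOLO v2 to a new dataset, two numbers follow directly from the settings above: the size of the output grid and the number of filters in the last convolutional layer. The sketch below works them out for a 544×544 input and our 9 classes. The stride of 32 is the total downsampling of the YOLO v2 detection network; the 5 anchor boxes are an assumption taken from the reference configs, so adjust `num_anchors` if your config differs.

```python
def yolo_v2_output_shape(input_size, num_classes, num_anchors=5, stride=32):
    """Return (grid_h, grid_w, channels) of the YOLO v2 output tensor.

    Each grid cell predicts, per anchor, 4 box offsets + 1 objectness
    score + one score per class.
    """
    if input_size % stride != 0:
        raise ValueError("input size must be a multiple of the stride")
    grid = input_size // stride
    channels = num_anchors * (5 + num_classes)
    return grid, grid, channels


if __name__ == "__main__":
    print(yolo_v2_output_shape(544, 9))  # -> (17, 17, 70)
```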

Detection is still a work in progress. YOLO v2 may be dropped later.

Classification:

For the training phase, we have:

Screen Shot 2018-04-11 at 9.34.17 PM.png

Screen Shot 2018-04-11 at 9.34.44 PM.png

Image Pipeline

Screen Shot 2018-04-11 at 9.49.00 PM.png

SSD Detector

Screen Shot 2018-04-11 at 9.49.44 PM.png

 

VGG16 Instance Pipeline

Screen Shot 2018-04-11 at 9.50.52 PM.png
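The step that connects the detector to the classifier in the instance pipeline is cropping the detected fish out of the image. Below is a minimal sketch of that step, assuming the detector reports a box as normalized `(xmin, ymin, xmax, ymax)` corners in [0, 1] and that the image is a plain list of pixel rows. Both are illustrative assumptions, not the project's actual data structures.

```python
def box_to_pixels(box, width, height):
    """Convert a normalized (xmin, ymin, xmax, ymax) box to clamped pixel bounds."""
    xmin, ymin, xmax, ymax = box

    def clamp(value, upper):
        return max(0, min(upper, int(round(value))))

    left, right = clamp(xmin * width, width), clamp(xmax * width, width)
    top, bottom = clamp(ymin * height, height), clamp(ymax * height, height)
    return left, top, right, bottom


def crop_instance(image, box):
    """Cut the detected region out of an image stored as a list of rows."""
    height, width = len(image), len(image[0])
    left, top, right, bottom = box_to_pixels(box, width, height)
    return [row[left:right] for row in image[top:bottom]]


if __name__ == "__main__":
    # 4x4 toy "image" whose pixel value encodes its position.
    image = [[r * 4 + c for c in range(4)] for r in range(4)]
    print(crop_instance(image, (0.25, 0.5, 0.75, 1.0)))  # -> [[9, 10], [13, 14]]
```

Clamping matters because detectors can predict boxes that extend slightly past the image border.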

Pose Estimator

Screen Shot 2018-04-11 at 9.51.17 PM.png
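To give an idea of how a predicted pose can be used to adjust the crop, here is a deliberately simplified sketch. It assumes the pose estimator outputs the number of clockwise quarter turns (0 to 3) separating the fish from a canonical orientation, and undoes them. That discrete representation is my assumption for illustration only; an estimator that predicts a continuous angle would need an interpolating rotation from an image library instead.

```python
def rotate_ccw(image):
    """Rotate an image (list of rows) by 90 degrees counterclockwise."""
    return [list(row) for row in zip(*image)][::-1]


def undo_rotation(image, quarter_turns_cw):
    """Undo `quarter_turns_cw` clockwise quarter turns reported by the pose model."""
    result = [list(row) for row in image]
    for _ in range(quarter_turns_cw % 4):
        result = rotate_ccw(result)
    return result


if __name__ == "__main__":
    print(undo_rotation([[1, 2], [3, 4]], 1))  # -> [[2, 4], [1, 3]]
```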

VGG16 Instance Rotation Pipeline

Screen Shot 2018-04-11 at 9.51.34 PM.png

ResNet50 Instance Rotation Pipeline

 

Screen Shot 2018-04-11 at 9.53.49 PM.png

Ensemble Model

Screen Shot 2018-04-11 at 9.52.57 PM.png

Screen Shot 2018-04-11 at 9.54.51 PM.png

 

Screen Shot 2018-04-11 at 9.55.30 PM.png
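One common way to combine two classifiers is to average their per-class probabilities and pick the class with the highest average. The sketch below shows that rule for the VGG16 and ResNet50 outputs. Note that averaging is an assumption made for illustration; the ensemble in this project may combine the two models differently.

```python
def ensemble_predict(probs_vgg16, probs_resnet50):
    """Average two probability vectors and return (best_class_index, averaged)."""
    if len(probs_vgg16) != len(probs_resnet50):
        raise ValueError("both models must score the same set of classes")
    if not probs_vgg16:
        raise ValueError("probability vectors must not be empty")
    averaged = [(a + b) / 2 for a, b in zip(probs_vgg16, probs_resnet50)]
    best = max(range(len(averaged)), key=averaged.__getitem__)
    return best, averaged


if __name__ == "__main__":
    # Three classes for brevity; the project itself has nine.
    best, averaged = ensemble_predict([0.5, 0.3, 0.2], [0.1, 0.2, 0.7])
    print(best, averaged)
```

In this toy example VGG16 alone would pick class 0, but the confident ResNet50 vote moves the averaged decision to class 2.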

Fish pixel only:

Tool: Mask R-CNN

I only show some preliminary results here because I am still working on this part.

Trainval: 200

Testing: 30

Left: ground truth; Right: prediction

Screen Shot 2018-02-03 at 1.35.14 PM.png

Screen Shot 2018-02-03 at 1.36.10 PM.png

Screen Shot 2018-02-03 at 1.36.29 PM.png
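Once a segmentation mask is available, keeping only the fish pixels and scoring a prediction against the ground truth are both simple per-pixel operations. Here is a minimal sketch, assuming images and masks are plain lists of rows and masks hold 0/1 values. The function names are hypothetical and this is not Mask R-CNN's own API.

```python
def keep_fish_pixels(image, mask, background=0):
    """Replace every pixel outside the mask with a background value."""
    if len(image) != len(mask) or any(len(r) != len(m) for r, m in zip(image, mask)):
        raise ValueError("image and mask must have the same shape")
    return [
        [pixel if inside else background for pixel, inside in zip(row, mask_row)]
        for row, mask_row in zip(image, mask)
    ]


def mask_iou(truth, predicted):
    """Intersection over union of two binary masks of the same shape."""
    pairs = [(bool(t), bool(p))
             for t_row, p_row in zip(truth, predicted)
             for t, p in zip(t_row, p_row)]
    intersection = sum(1 for t, p in pairs if t and p)
    union = sum(1 for t, p in pairs if t or p)
    # Two empty masks agree perfectly, so treat that case as IoU 1.0.
    return intersection / union if union else 1.0


if __name__ == "__main__":
    print(keep_fish_pixels([[1, 2], [3, 4]], [[1, 0], [0, 1]]))  # -> [[1, 0], [0, 4]]
    print(mask_iou([[1, 1], [0, 0]], [[1, 0], [1, 0]]))
```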

Still working…
