
Introduction

 

In this study, we propose a two-stream pipeline model for classifying the dynamic patterns of videos using both static frames and optical flow. Traditional hand-crafted features are known to be insufficient for classifying complex video information, so, inspired by the remarkable success of deep learning, our architecture classifies dynamic video patterns from appearance and motion features. The goal of this study is to investigate the performance of single-stream and two-stream networks built with the proposed virtual communication architecture, and we validate the proposed models on our firework dataset. Prior successful studies of video classification treated the two streams as standalone pipelines. In contrast, the proposed virtual communication long short-term memory (VC-LSTM) architecture interconnects the two streams at each time step, generating newly learned information and feeding it to the following time steps. The VC-LSTM extends the standard LSTM by virtually communicating the previous cell states of the appearance and optical-flow streams. The experimental results demonstrate that the proposed two-stream Dual-CNNVCLSTM architecture significantly outperforms the single-stream and two-stream baseline architectures, reaching a training accuracy of 81.76%.


Models

Virtual Communication Long Short-Term Memory (VC-LSTM)

In the VC-LSTM, we focus on updating and resetting the cell-state information rather than relying on the standard cell-state update process alone.

The first step of this process extracts the previous time step's cell states from both streams' LSTM units. The extracted cell states are then averaged, and the next time step's cell states are updated by additionally feeding in this newly learned information, as sketched below.
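The following is a minimal PyTorch-style sketch of this averaging step, assuming one standard LSTM cell per stream; the class name VCLSTMSketch, the mixing weight vc_rate, and the tensor shapes are illustrative assumptions rather than the authors' implementation.

```python
import torch.nn as nn

class VCLSTMSketch(nn.Module):
    """Minimal sketch of the virtual communication idea: two stream-specific
    LSTM cells exchange an averaged cell state before each new time step.
    The names and the mixing weight vc_rate are assumptions."""

    def __init__(self, appearance_dim, flow_dim, hidden_dim, vc_rate=0.5):
        super().__init__()
        self.appearance_cell = nn.LSTMCell(appearance_dim, hidden_dim)
        self.flow_cell = nn.LSTMCell(flow_dim, hidden_dim)
        self.vc_rate = vc_rate  # how strongly the shared state resets each stream

    def forward(self, appearance_seq, flow_seq):
        # appearance_seq, flow_seq: (batch, time, feature) tensors
        batch, steps, _ = appearance_seq.shape
        hidden = self.appearance_cell.hidden_size
        h_a = appearance_seq.new_zeros(batch, hidden)
        c_a = appearance_seq.new_zeros(batch, hidden)
        h_f = flow_seq.new_zeros(batch, hidden)
        c_f = flow_seq.new_zeros(batch, hidden)

        for t in range(steps):
            # 1) extract the previous time step's cell states from both streams
            #    and average them into one shared, "virtually communicated" state
            shared_c = 0.5 * (c_a + c_f)

            # 2) update/reset each stream's cell state with the shared information
            c_a = (1 - self.vc_rate) * c_a + self.vc_rate * shared_c
            c_f = (1 - self.vc_rate) * c_f + self.vc_rate * shared_c

            # 3) run the standard LSTM update for the current time step
            h_a, c_a = self.appearance_cell(appearance_seq[:, t], (h_a, c_a))
            h_f, c_f = self.flow_cell(flow_seq[:, t], (h_f, c_f))

        return h_a, h_f  # final hidden states of the two streams
```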

Baseline Models

In this study, we examine the performance of single-stream and two-stream networks with the proposed virtual communication LSTM architecture. As single-stream baselines, we design three models: a sequence of LSTMs (Single-LSTM), a combination of a CNN and an LSTM in which the LSTM is placed after the fully connected layer of the CNN (Single-CNNLSTM), and a 3D-CNN architecture (Single-3DCNNLSTM). The two-stream networks Dual-LSTM, Dual-CNNLSTM, and Dual-3DCNNLSTM are built by pipelining these single-stream networks in parallel to classify firework patterns; a sketch of the CNN-LSTM variants follows the figure below.

[Figure: baseline and proposed model architectures]
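The sketch below shows how the Single-CNNLSTM and Dual-CNNLSTM baselines could be composed; the backbone, layer sizes, flow encoding, and score-averaging fusion are assumptions for illustration, not the exact configurations used in the study.

```python
import torch.nn as nn

class SingleCNNLSTM(nn.Module):
    """Sketch of the Single-CNNLSTM baseline: a small CNN encodes each frame,
    an LSTM runs over the per-frame features, and a linear head classifies.
    The backbone and layer sizes are illustrative assumptions."""

    def __init__(self, num_classes=8, in_channels=3, feat_dim=128, hidden_dim=64):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(in_channels, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim), nn.ReLU(),   # the CNN's fully connected layer
        )
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_classes)

    def forward(self, clip):                       # clip: (batch, time, channels, H, W)
        b, t = clip.shape[:2]
        feats = self.cnn(clip.flatten(0, 1)).view(b, t, -1)  # per-frame features
        _, (h, _) = self.lstm(feats)               # LSTM placed after the FC layer
        return self.head(h[-1])                    # class scores for this stream


class DualCNNLSTM(nn.Module):
    """Sketch of the two-stream Dual-CNNLSTM baseline: the appearance (RGB) and
    optical-flow streams run in parallel; averaging their class scores is an
    assumed fusion choice for illustration."""

    def __init__(self, num_classes=8):
        super().__init__()
        self.appearance = SingleCNNLSTM(num_classes, in_channels=3)
        self.flow = SingleCNNLSTM(num_classes, in_channels=2)  # assumes 2-channel flow

    def forward(self, rgb_clip, flow_clip):
        return 0.5 * (self.appearance(rgb_clip) + self.flow(flow_clip))
```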

Dataset

We manually categorized the dataset into eight classes by inspecting each video clip: Chrysanthemum, Crosette, Desi, Dot, Drop, Fish, Palm, and WaterFlower. Sample clips for each class are shown below, followed by a sketch of how the clips can be indexed by class.

[Sample clips: Chrysanthemum, Crosette, Desi, Dot, Drop, Fish, Palm, WaterFlower]
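As a minimal sketch, the eight classes can be mapped to integer labels for training; the folder-per-class layout and the .mp4 extension below are assumptions about how the clips are stored, not details given in the study.

```python
from pathlib import Path

# The eight firework pattern classes in the dataset.
CLASSES = ["Chrysanthemum", "Crosette", "Desi", "Dot",
           "Drop", "Fish", "Palm", "WaterFlower"]
LABELS = {name: idx for idx, name in enumerate(CLASSES)}

def index_clips(root):
    """Pair each video clip with its integer class label, assuming one folder
    per class (e.g. root/Chrysanthemum/clip_001.mp4); the directory layout and
    file extension are assumptions about how the dataset is stored."""
    samples = []
    for name in CLASSES:
        for clip in sorted(Path(root, name).glob("*.mp4")):
            samples.append((clip, LABELS[name]))
    return samples
```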

Experiments

In this section, we compare the proposed virtual communication models against their counterparts without communication. Further experiments study the effectiveness of the single-stream and two-stream networks and evaluate model skill across different dataset sizes and fusion methods.

VC-LSTM Performance Over Two-stream Baseline Models

For a better comparison, we report the performance of the two-stream networks under two dataset sizes: 500 and 1,000 video clips.

[Figure: performance of the two-stream baseline and VC-LSTM models]
The Dual-CNNVCLSTM model achieved the best results, with a training accuracy of 81.76%, a validation accuracy of 72.78%, and a testing accuracy of 79.87%.
