Paper Title
Smart Gallery and Video Captioning Using Deep Learning

Abstract
Video understanding has become crucial because most of the data generated today is video. Surveillance footage, social media clips, and informational videos are now part of everyday life. Video captioning offers a simple way to summarize this data and reuse it for purposes such as indexing and search. In deep learning, video captioning aims to automatically generate descriptions of the events in a video from its visual information. We present a method that combines existing deep learning models as an ensemble for captioning, to obtain more accurate results in structuring and retrieving video data. Specifically, we automatically generate scene-based captions that summarize each video and can be referred to later. The outcome of this work is an end-to-end framework for an envisioned smart gallery system: users can upload as many videos as they like and then retrieve any event from the uploaded videos by entering a description of that event as text.

Keywords - Video Captioning, Object Recognition, Image Processing, Understanding and Structuring Video Data, Shot Boundary Detection, Smart Video Gallery