Non-Separable Filtering With Side-information and Contextually-Designed Filters For Next Generation Video Codecs
Onur Guleryuz, Debargha Mukherjee, Yue Chen, Keng-Shih Lu, Urvang Joshi
IEEE Signal Processing Society (SPS). Video length: 00:15:19.
Even though CNNs can classify objects in images very accurately, it is well known that the network's attention is not always on the semantically important regions of the scene. Networks often learn background textures that are irrelevant to the object, which makes them susceptible to variations in the background that can degrade performance. We propose a new training procedure, called split training, to reduce this bias in CNNs for object recognition in infrared and color imagery. First, a model is trained to recognize objects in images without background, and the activations produced by its higher layers are recorded. Next, a second network is trained (with an MSE loss) to produce the same activations, but in response to the objects embedded in background; this forces the second network to ignore the background and focus on the object. Finally, with the activation-producing layers frozen, the rest of the second network is trained to classify the objects. Our training method outperforms the traditional procedure both in a simple CNN and in deep CNNs such as VGG, and it learns to mimic human vision, which relies more on shape and structure.
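The three-stage procedure described above can be sketched in miniature. The following NumPy example is an illustrative assumption, not the authors' implementation: a two-layer network with synthetic vectors stands in for a CNN on imagery, all sizes and learning rates are arbitrary choices, and the student's single linear feature layer is fitted by closed-form least squares, which exactly minimizes the MSE activation-matching objective in this linear case.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: each class has a distinctive "object" pattern; the "background"
# is additive noise over all coordinates. (Stand-in for real imagery;
# every dimension here is an arbitrary assumption.)
n, d, k, h = 300, 20, 3, 16            # samples, input dim, classes, hidden units
y = rng.integers(0, k, size=n)
x_clean = 0.05 * rng.standard_normal((n, d))
for c in range(k):
    x_clean[y == c, c * 3:c * 3 + 3] += 1.0       # class-specific object pattern
x_bg = x_clean + 0.3 * rng.standard_normal((n, d))  # objects embedded in background
Y = np.eye(k)[y]                       # one-hot labels

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def accuracy(W1, W2, x):
    return np.mean(np.argmax(np.maximum(x @ W1, 0.0) @ W2, axis=1) == y)

# Stage 1: train the first network on background-free images.
W1_t = 0.1 * rng.standard_normal((d, h))
W2_t = 0.1 * rng.standard_normal((h, k))
for _ in range(500):
    a = x_clean @ W1_t                 # hidden pre-activations
    hid = np.maximum(a, 0.0)           # ReLU activations to be mimicked later
    p = softmax(hid @ W2_t)
    dz = (p - Y) / n                   # softmax cross-entropy gradient
    W2_t -= 0.5 * (hid.T @ dz)
    da = (dz @ W2_t.T) * (a > 0)
    W1_t -= 0.5 * (x_clean.T @ da)

# Stage 2: fit the student feature layer so its activations on backgrounded
# inputs match the teacher's activations on the corresponding clean inputs.
# For one linear layer the MSE minimizer is ordinary least squares.
A_target = x_clean @ W1_t              # teacher pre-activations on clean images
W1_s = np.linalg.lstsq(x_bg, A_target, rcond=None)[0]

# Stage 3: freeze the student's feature layer; train only its classifier head.
hid_s = np.maximum(x_bg @ W1_s, 0.0)
W2_s = 0.1 * rng.standard_normal((h, k))
for _ in range(500):
    p = softmax(hid_s @ W2_s)
    W2_s -= 0.5 * (hid_s.T @ ((p - Y) / n))

print(f"student accuracy on backgrounded images: {accuracy(W1_s, W2_s, x_bg):.2f}")
```

Because the student's features are fitted only to reproduce the teacher's clean-image activations, the least-squares solution attenuates the background directions of the input, which is the mechanism the abstract attributes to split training.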