GAZE PRE-TRAIN FOR IMPROVING DISPARITY ESTIMATION NETWORKS
Ron M Hecht (General Motors); Ohad Rahamim (General Motors); Shaul Oron (General Motors); Andrea Forgacs (General Motors); Gershon Celniker (General Motors); Dan Levi (General Motors); Omer Tsimhoni (General Motors)
In neural network training, pre-training is an unsupervised stage that uses automatically generated labels for the inputs of the real end-goal task. It typically precedes supervised training and can improve network performance and reduce training loss. In this work, we applied pre-training in the automotive domain, with a setup composed of a camera aimed outside the vehicle and an eye-tracking system observing the driver. Our pre-training process used images from the camera as input and the driver's eye-gaze direction as the automatically generated label. The goal of this gaze pre-training was to initialize, and thereby improve, disparity estimation networks. Selecting eye gaze as the label offers the best of both worlds. On one hand, it resembles supervised training: the labels are generated by humans, drivers with a deep understanding of the scene and the driving situation. On the other hand, it resembles unsupervised training: the labels are generated automatically, so large quantities of data can be collected easily. Overall, gaze pre-training reduced the L1 loss on the validation set from 0.65 without pre-training to 0.45 with it.
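To make the two-stage scheme concrete, below is a minimal PyTorch-style sketch of gaze pre-training followed by disparity fine-tuning with an L1 loss. The network shapes, module names, and data loaders (Encoder, GazeHead, DisparityHead, gaze_loader, disp_loader) are illustrative assumptions, not the paper's actual architecture or training setup.

```python
# Hedged sketch: two-stage training as described in the abstract.
# All module names, layer shapes, and loaders are hypothetical.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Shared backbone; weights learned during gaze pre-training
    are reused to initialize disparity training."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )

    def forward(self, x):
        return self.features(x)

class GazeHead(nn.Module):
    """Regresses a 2-D gaze direction (e.g., yaw, pitch) from features."""
    def __init__(self):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(64, 2)

    def forward(self, feats):
        return self.fc(self.pool(feats).flatten(1))

class DisparityHead(nn.Module):
    """Decodes encoder features back to a dense disparity map."""
    def __init__(self):
        super().__init__()
        self.decode = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, feats):
        return self.decode(feats)

def pretrain_on_gaze(encoder, gaze_loader, epochs=1):
    """Stage 1: labels come from the in-cabin eye tracker,
    so no manual annotation is needed."""
    head = GazeHead()
    opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()))
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for image, gaze in gaze_loader:
            opt.zero_grad()
            loss = loss_fn(head(encoder(image)), gaze)
            loss.backward()
            opt.step()

def train_disparity(encoder, disp_loader, epochs=1):
    """Stage 2: disparity training, evaluated with the L1 loss
    quoted in the abstract (0.65 without pre-training vs. 0.45 with)."""
    head = DisparityHead()
    opt = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()))
    loss_fn = nn.L1Loss()
    for _ in range(epochs):
        for image, disparity in disp_loader:
            opt.zero_grad()
            loss = loss_fn(head(encoder(image)), disparity)
            loss.backward()
            opt.step()
```

The key design point illustrated here is that only the encoder is shared between the two stages: the gaze head is discarded after pre-training, and the encoder's weights serve as the initialization for the disparity network.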