1. For the optical flow approach, split the flow field into four channels (+x, -x, +y and -y), then apply Gaussian smoothing to each channel separately. Each channel contains only non-negative values, so opposite flow directions no longer cancel each other out during smoothing.
2. Build a similarity matrix between every frame of the input video and every frame of each video in the data set. Searching for the maximum in this matrix overcomes the uncertainty about where an action begins. For example, walking could start with either the right leg or the left leg.
3. Add time information to synchronize the movie script with the video scenes, using the subtitles and their timestamps.
              subtitle    movie script
   time t:    ABC...      ABC...
   Time t would then be the time tag of "ABC" in the movie script.
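
A minimal sketch of item 1, assuming the flow field is given as two NumPy arrays `fx` and `fy` of the same shape. The function names, the zero padding at the borders and the 3-sigma kernel truncation are illustrative choices, not part of the original notes.

```python
import numpy as np

def gaussian_kernel_1d(sigma: float) -> np.ndarray:
    """Normalized 1-D Gaussian kernel, truncated at about 3 sigma (assumption)."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def _blur_axis(img: np.ndarray, k: np.ndarray, axis: int) -> np.ndarray:
    """Convolve every 1-D slice along `axis` with k, zero-padding the borders."""
    r = len(k) // 2
    pad = [(0, 0), (0, 0)]
    pad[axis] = (r, r)
    padded = np.pad(img, pad)  # constant (zero) padding by default
    return np.apply_along_axis(
        lambda v: np.convolve(v, k, mode="valid"), axis, padded
    )

def blur(img: np.ndarray, sigma: float) -> np.ndarray:
    """Separable Gaussian blur: along rows first, then along columns."""
    k = gaussian_kernel_1d(sigma)
    return _blur_axis(_blur_axis(img, k, axis=1), k, axis=0)

def rectified_flow_channels(fx: np.ndarray, fy: np.ndarray,
                            sigma: float = 1.0) -> np.ndarray:
    """Return the blurred +x, -x, +y, -y channels, stacked as shape (4, H, W)."""
    channels = [
        np.maximum(fx, 0),   # +x: rightward motion only
        np.maximum(-fx, 0),  # -x: leftward motion only
        np.maximum(fy, 0),   # +y
        np.maximum(-fy, 0),  # -y
    ]
    return np.stack([blur(c, sigma) for c in channels])
```

If neighbouring pixels move in opposite x directions, blurring `fx` directly averages them towards zero, whereas the +x and -x channels each keep a clearly positive response.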
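
A minimal sketch of item 2, assuming each frame has already been reduced to a fixed-length descriptor vector. The use of cosine similarity and the names `similarity_matrix` and `best_match` are assumptions for illustration; the notes do not specify the similarity measure.

```python
import numpy as np

def similarity_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Cosine similarity between every frame of a (n, d) and every frame of b (m, d)."""
    a_n = a / (np.linalg.norm(a, axis=1, keepdims=True) + 1e-12)
    b_n = b / (np.linalg.norm(b, axis=1, keepdims=True) + 1e-12)
    return a_n @ b_n.T  # shape (n, m)

def best_match(sim: np.ndarray) -> tuple[int, int]:
    """Indices (i, j) of the most similar frame pair in the matrix."""
    i, j = np.unravel_index(np.argmax(sim), sim.shape)
    return int(i), int(j)
```

The offset `j - i` of the best pair indicates how far the two clips are shifted in time, so an input clip that starts mid-cycle (for example on the other leg) can still be matched against a data set clip.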
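
A minimal sketch of item 3, assuming the subtitles are available as (time, text) pairs and the script as a list of untimed lines. The text matching via the standard library's `difflib.SequenceMatcher`, the function name `tag_script_with_times` and the `min_ratio` threshold are hypothetical choices; the notes only say that the subtitle's time tag is transferred to the matching script text.

```python
from difflib import SequenceMatcher

def normalize(text: str) -> str:
    """Lower-case and collapse whitespace so small formatting differences are ignored."""
    return " ".join(text.lower().split())

def tag_script_with_times(subtitles, script_lines, min_ratio=0.6):
    """Give each script line the time of its most similar subtitle.

    subtitles:    list of (time_in_seconds, text) pairs
    script_lines: list of strings without time information
    Lines with no subtitle above min_ratio (e.g. scene descriptions) get None.
    """
    tagged = []
    for line in script_lines:
        best_time, best_ratio = None, 0.0
        for time, sub_text in subtitles:
            ratio = SequenceMatcher(
                None, normalize(line), normalize(sub_text)
            ).ratio()
            if ratio > best_ratio:
                best_time, best_ratio = time, ratio
        tagged.append((best_time if best_ratio >= min_ratio else None, line))
    return tagged
```

Script lines that receive a time tag can then act as anchors for locating the corresponding video scene; untimed lines between two anchors fall somewhere between their times.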