Dense Captioning of Video Demonstrating the Upgraded Boston Dynamics Atlas Robot

Artist and programmer Gene Kogan ran the Boston Dynamics video demonstrating its upgraded Atlas robot through the Densecap captioning system, which tries to identify and describe objects in each frame of the video. The system is impressive but at times wildly inaccurate, labeling the robot in the resulting video as a variety of incorrect things, such as a person skiing, a motorcycle, or a fire hydrant.

Captions are generated by densecap on individual video frames. The video is made by a Python script which merges matching captions along sequences of consecutive frames with a set of (mostly greedy) heuristics. Presumably, it would be possible to caption sequences of regions directly rather than using a naive merging algorithm, but I’m not sure how 🙂
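
To make the merging idea concrete, here is a minimal sketch of one way a greedy merge of per-frame captions could work. It is not Kogan's actual script: the `Detection` and `Track` types, the `merge_captions` function, the exact-text caption match, and the `min_iou` overlap threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Detection:
    """One captioned region in one frame (hypothetical structure)."""
    frame: int
    caption: str
    box: tuple  # (x1, y1, x2, y2) in pixels


@dataclass
class Track:
    """A run of matching captions across consecutive frames."""
    caption: str
    detections: list = field(default_factory=list)


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def merge_captions(frames, min_iou=0.5):
    """Greedily link detections with the same caption text.

    `frames` is a list with one list of Detection objects per
    consecutive frame. A track is extended when the next frame has a
    detection with identical caption text whose box overlaps the
    track's last box by at least `min_iou` (an assumed threshold).
    """
    finished, active = [], []
    for detections in frames:
        unmatched = list(detections)
        next_active = []
        for track in active:
            last_box = track.detections[-1].box
            best, best_iou = None, min_iou
            for det in unmatched:
                if det.caption != track.caption:
                    continue
                overlap = iou(last_box, det.box)
                if overlap >= best_iou:
                    best, best_iou = det, overlap
            if best is None:
                finished.append(track)  # track ends at the previous frame
            else:
                unmatched.remove(best)
                track.detections.append(best)
                next_active.append(track)
        # Anything left over starts a new track.
        for det in unmatched:
            next_active.append(Track(det.caption, [det]))
        active = next_active
    return finished + active
```

Each track then yields a single caption that stays on screen from its first frame to its last, rather than flickering frame by frame. The choice is greedy because every track takes the best-overlapping candidate as soon as it sees it and never revisits the decision, which is simple but can mislink regions when several similar captions crowd together.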

via Prosthetic Knowledge
