Benchmark from NVIDIA Creates Rigorous New AI Test


The new benchmark could drive advances in computer vision tasks, allowing companies to rely more heavily on machines for mundane or dangerous labor.

Computer vision tests up to this point showed machines falling far behind human evaluators in basic visual recognition. For AI, success meant correct object identification less than 70% of the time, compared to human subjects at 99%. Now, a new benchmark from NVIDIA, based on a 50-year-old Russian concept, could launch a new era of artificial intelligence.

Russian computer scientist M.M. Bongard invented a collection of 100 human-designed tasks intended to tease out artificial intelligence capability by providing a foundation for assessing cognition. These problems, known as Bongard Problems (BP), have been a standard for years.

BP isn’t designed for state-of-the-art machine learning because of its small size and reliance on natural language. NVIDIA’s work aims to overcome these limitations and provide a new benchmark that better addresses state-of-the-art machine learning and deep learning.

The new benchmark, called Bongard-LOGO, expands the test set to 12,000 problem instances spanning three types of problems:

  • Free-form shape problems: The machine must induce the underlying programs that generate the shapes to determine whether test images match them.
  • Basic-shape problems: Tests analogy-making for features that cause problems for machines in basic shapes but aren’t an issue with free-form shapes.
  • Abstract-shape problems: Assesses the machine’s ability to discover shapes and reason. It prevents the machine from memorizing and instead forces some understanding of the underlying concept.

These three areas help us better understand what a machine is capable of recognizing beyond simple memorization. Humans can draw conclusions and make interpretations that go beyond simple lines and shapes, finding patterns and making analogies. These tasks provide a stronger benchmark for testing whether machines can do the same.

What the benchmark means for business

Computer vision is the next great capability for various business models that still rely heavily on human labor and intervention. A prime example is manufacturing, which still requires humans to perform rote tasks because of machines’ inability to assess basic shapes.

The new benchmark could drive advances in computer vision tasks, allowing companies to rely more heavily on machines for mundane or dangerous labor. We’ll be watching what new advances arise from NVIDIA’s testing.

Elizabeth Wallace

About Elizabeth Wallace

Elizabeth Wallace is a Nashville-based freelance writer with a soft spot for data science and AI and a background in linguistics. She spent 13 years teaching language in higher ed and now helps startups and other organizations explain - clearly - what it is they do.
