Citizen Data Scientists Needed to Save the Planet


The Earth Challenge 2020 initiative overcomes AI model training challenges using citizen data scientists to collect data for environment and healthcare apps.

A small army of citizen data scientists is being mobilized to collect data to help train machine learning algorithms that will be embedded in a range of environment and healthcare applications.

As part of an Earth Challenge 2020 initiative sponsored by the Earth Day Network, the Wilson Center, and the U.S. State Department, applications that tackle everything from food safety and the tracking of insect populations to plastics pollution and air quality are now being made available.


Earth Challenge 2020 is an arm of an Earth School coalition spearheaded by the United Nations Environment Programme and TED-Ed, which is committed to providing free educational science content to students, parents, and teachers.

The goal of the Earth Challenge 2020 initiative is to enable citizen data scientists to use mobile applications to collect, label, and tag data, which is then fed into an analytics database from Kinetica that runs on graphics processing units (GPUs). Among other applications, that approach will enable schoolchildren to take photos of insects using the Picture Pile application from the International Institute for Applied Systems Analysis (IIASA), training machine learning algorithms to recognize not just different types of insects, but also where they are found at different times of the year.

The labeled insect images are then added to a data set that the European Space Agency is creating to better understand how insects such as bees affect food production. Once enough images are labeled, the machine learning algorithms start to recognize the insects in new images, which allows them to label and tag those images automatically, without further human assistance.
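To make that labeling loop concrete, here is a minimal sketch in Python. It is not the Earth Challenge 2020 or Picture Pile implementation: the nearest-centroid classifier, the two-number feature vectors, the `max_distance` cutoff, and the function names are all assumptions chosen for illustration. The idea is that volunteer-labeled examples train a simple model, which then labels new images on its own when it is confident and hands uncertain ones back to a person.

```python
from collections import defaultdict
from math import dist

# Hypothetical feature vectors: two numbers standing in for an image.
# A real system would learn its features from the photos themselves.

def train_centroids(labeled):
    """Average the feature vectors that volunteers assigned to each label."""
    groups = defaultdict(list)
    for features, label in labeled:
        groups[label].append(features)
    return {
        label: tuple(sum(axis) / len(rows) for axis in zip(*rows))
        for label, rows in groups.items()
    }

def auto_label(features, centroids, max_distance):
    """Return the nearest label, or None to route the image to a volunteer."""
    label, centroid = min(
        centroids.items(), key=lambda item: dist(features, item[1])
    )
    # Only trust the model when the image sits close to a known class.
    return label if dist(features, centroid) <= max_distance else None

volunteer_labeled = [
    ((1.0, 1.0), "bee"),
    ((1.2, 0.8), "bee"),
    ((5.0, 5.0), "butterfly"),
    ((4.8, 5.2), "butterfly"),
]
centroids = train_centroids(volunteer_labeled)

confident = auto_label((1.1, 1.0), centroids, max_distance=1.0)
uncertain = auto_label((3.0, 3.0), centroids, max_distance=1.0)
print(confident, uncertain)
```

In this toy data, the first new image sits close to the bee examples and is labeled automatically, while the second falls between the two classes and is returned as `None` so a volunteer can label it. Each image a volunteer labels can be added to the training set, which is how the share of automatic labels grows over time.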

All the data sets being collected by the mobile applications created as part of the Earth Challenge 2020 initiative will be made available for free to data scientists via a Citizen Science Cloud service or Kinetica REST application programming interfaces (APIs), says Daniel Raskin, chief marketing officer for Kinetica.

“It will all be in the public domain,” says Raskin.

Kinetica is participating to spur adoption of its analytics database, which runs natively on the same GPUs that are widely employed to train artificial intelligence (AI) models. Machine learning algorithms run considerably faster on GPUs, which reduces the time and cost of training AI models.

The challenge many organizations building AI models face is collecting all the data needed to train an AI model. The Earth Challenge 2020 initiative helps address that issue by enlisting what will hopefully become an army of citizen data scientists to help collect data for what will become a broad portfolio of environment and healthcare-related applications, says Raskin.

It’s too early to tell just what impact individuals armed with smartphones capable of capturing high-quality images might have on data science. With more people of all ages spending more time at home to combat the COVID-19 pandemic, the opportunity to motivate people around the world to participate in such initiatives has never been greater. The challenge, of course, is finding a way to let all those potential citizen data scientists know that the opportunity to participate exists in the first place.
