Please note that the details of this Challenge are no longer open. This challenge is withdrawn and is no longer accepting new submissions.
EFSA Challenge: Automated Data Extraction

Award: $28,000 USD
Deadline: Withdrawn
Active Solvers: 256
Posted: Apr 11 2018
Challenge ID: 9933871
Team Project Rooms are available on this Challenge.

Data extraction from text and images is a fundamental human action and is key to gathering information in any field. With the exponential increase in material being published on any given subject, it is becoming increasingly hard for humans to read and extract data from every relevant source of information. Automating data extraction would save a tremendous amount of human resources and possibly result in more accurate and more extensive extraction. The Seeker is therefore looking for a general algorithm for automated data extraction from electronic documents, including graphs and images.

This is a Reduction-to-Practice Challenge that requires written documentation, output from the data extraction algorithm, and submission of source code and executable.


Data extraction from texts and images is a fundamental human action. Anytime we read a book or newspaper we’re extracting data whether we know it or not. This extraction may be non-targeted, such as when reading an article on a new topic and placing various key points into memory, or targeted, such as when searching for the score of a particular sporting event. Beyond everyday reading, data extraction is a key part of gathering information for almost any endeavor and spans all fields of work. Investors scan news items for companies of interest and stock prices, scientists read publications to extract data relevant to their own studies, auto mechanics look up torque specifications for tightening bolts, and so on. These are targeted data extractions: the person is looking for and extracting specific information from the content, and the data elements to be extracted can be defined beforehand. Automation of this type of targeted data extraction would save a tremendous amount of human resources for organizations that depend on extracting data from published material, particularly considering the ever-increasing amount of such material available. The Seeker is interested in gathering and comparing the performance of different algorithms and methods for automated data extraction and will provide specific datasets and data elements to extract for this Challenge. The ideal solution will be a tool that can extract all meaningful and relevant information and data from texts, graphs, and images.
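To make the idea of targeted extraction with predefined data elements concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the Seeker's method and not a candidate solution: the element names (`torque_nm`, `price_usd`), the regular expressions, and the `extract_elements` helper are all hypothetical, and a real solution would also need to handle graphs and images.

```python
import re

# Hypothetical data elements, defined before extraction starts.
# Each maps an element name to a pattern with one numeric capture group.
DATA_ELEMENTS = {
    "torque_nm": re.compile(r"(\d+(?:\.\d+)?)\s*Nm\b"),
    "price_usd": re.compile(r"\$(\d+(?:\.\d{2})?)"),
}


def extract_elements(text, elements=DATA_ELEMENTS):
    """Return, for each predefined element, every numeric value found in text."""
    results = {}
    for name, pattern in elements.items():
        # group(1) is the captured number; convert it to float for comparison.
        results[name] = [float(m.group(1)) for m in pattern.finditer(text)]
    return results


sample = "Tighten the head bolts to 85 Nm. Shares closed at $41.50 on Friday."
print(extract_elements(sample))
```

Because the data elements are declared up front, the same function can be run against new documents or extended with new elements without changing the extraction loop.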

A submission to the Challenge should include the following:

  1. A detailed description of the proposed Solution addressing specific Technical Requirements presented in the Detailed Description of the Challenge. This should also include a thorough description of the algorithm used in the Solution accompanied by a well-articulated rationale for the method employed.
  2. Output from the proposed algorithm applied to the provided dataset presented in the Challenge.
  3. A freely licensed software/algorithm/package, including source code and an executable, with sufficient documentation to enable the Seeker to compile and execute the algorithm and to validate the method using additional test datasets.

The Challenge award is contingent upon theoretical evaluation of the method/algorithm by the Seeker, and validation by the Seeker of the submitted software/algorithm/package.

To receive an award, the Solvers will not have to transfer their exclusive IP rights to the Seeker. Instead, Solvers will grant to the Seeker a non-exclusive license to practice their solutions. The award(s) will be paid by ADAS under the procurement contract referenced in the ABOUT THE SEEKER section.

Submissions to this Challenge must be received by 11:59 PM on 10 July 2018 US Eastern Time (05:59 AM on 11 July 2018 European Central Time).

Late submissions will not be considered.



In line with the rights and obligations laid down in the Staff Regulations and CEOS deriving from their contract of employment with EFSA, EFSA staff shall seek permission prior to engaging in the Challenge (an outside activity), since receiving the award will be equivalent to accepting from sources outside EFSA an honor, decoration, favor, gift or payment of any kind.



EFSA is a decentralized agency of the European Union (EU) funded by the European Union that operates independently of the European legislative and executive institutions (European Commission, Council, and European Parliament) and EU Member States. EFSA contributes to the safety of the EU food chain by providing scientific advice to risk managers, by communicating on risks to the public, and by cooperating with Member States and other parties to deliver a coherent, trusted food safety system in the EU.

EFSA commissioned a two-year project in 2016 titled “OC/EFSA/AMU/2015/03: Crowdsourcing: engaging communities effectively in food and feed risk assessment”. This project was awarded to ADAS, and the current Challenge, run through the InnoCentive platform, is being conducted by ADAS on behalf of EFSA as part of this project. ADAS are therefore an intermediary in this process, and the ultimate Seeker remains EFSA.

What is InnoCentive?
InnoCentive is the global innovation marketplace where creative minds solve some of the world's most important problems for cash awards up to $1 million. Commercial, governmental and humanitarian organizations engage with InnoCentive to solve problems that can impact humankind in areas ranging from the environment to medical advancements.

What is an RTP Challenge?

An InnoCentive RTP (Reduction to Practice) Challenge calls for a prototype that proves an idea, and is similar to an InnoCentive Theoretical Challenge in its high level of detail. However, an RTP requires the Solver to submit a validated solution, either in the form of original data or a physical sample. The Seeker is also allowed to test the proposed solution. For details about treatment of IP rights, please see the Challenge-Specific Agreement.
