Challenge Methods for Determining Similar Sequences Across Genomes

Award: $50,000 USD
Deadline: Nov 23 2019 23:59 EST
Active Solvers: 356
Posted: Jul 26 2019
Challenge ID: 9934185
Team Project Rooms are available on this Challenge.

Individuals of a species differ from one another at the genetic level to various degrees. These differences represent different genotypes, or genetic constitutions, within a species. To better understand the genetic content of each individual genome, it is important to understand similarities and differences of gene sequences and their sub-components when compared across genomes. Therefore, the Seeker is looking for a methodology to accurately identify similar gene sequences across genomes from individuals of a single species.

This is a Reduction-to-Practice Challenge that requires written documentation and output from the data analysis algorithm, and submission of source code and executable if requested by the Seeker.


Individuals of a species differ from one another at the genetic level to various degrees. To deeply characterize the genetic content for each individual genome, it is important to understand which sequences of common ancestry have been inherited, possibly in a modified form, across the genomes. Existing knowledge about a gene variant from a well-characterized genome can be applied to better understand other variants, or alleles, of the same gene in different, uncharacterized genomes. Knowledge of which sequences represent the same genes in different individuals is necessary to understand the impact of similarities or any differences that may exist in the gene sequences of individuals from different genetic backgrounds.

The difficulty lies in determining which gene-derived sequences in the genomes are allelic. Transcription of a gene may produce many alternative transcripts that differ in sequence composition, so finding the best mapping between the transcripts of different genomes is a difficult and time-consuming task. Current methods rely on a combination of common software and proprietary techniques, but the reliability and accuracy of the processed results could be improved. Therefore, the Seeker is interested in a better methodology, whether based on new algorithms, a well-chosen combination of existing software/programs, or both, that can relate the transcript sets of two genotypes within a species quickly and accurately to identify the allelic relationships.
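To make the task concrete, the sketch below pairs transcripts from two genotypes using reciprocal best hits over k-mer Jaccard similarity. This is a generic baseline chosen only for illustration; it is not the Seeker's current method and not a proposed Solution. The function names, the parameters `k` and `min_score`, and the toy sequences are all hypothetical. A realistic approach would likely use a sequence aligner and would need to handle alternative transcripts of the same gene, which this sketch ignores.

```python
from typing import Dict, List, Set, Tuple


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of distinct k-mers in a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two k-mer sets (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_hits(queries: Dict[str, str], targets: Dict[str, str],
              k: int) -> Dict[str, Tuple[str, float]]:
    """For each query, find the target with the highest non-zero similarity."""
    target_kmers = {name: kmers(seq, k) for name, seq in targets.items()}
    hits: Dict[str, Tuple[str, float]] = {}
    for qname, qseq in queries.items():
        qk = kmers(qseq, k)
        best_name, best_score = None, 0.0
        for tname in sorted(target_kmers):  # sorted for deterministic ties
            score = jaccard(qk, target_kmers[tname])
            if score > best_score:
                best_name, best_score = tname, score
        if best_name is not None:
            hits[qname] = (best_name, best_score)
    return hits


def reciprocal_best_hits(genome_a: Dict[str, str], genome_b: Dict[str, str],
                         k: int = 5,
                         min_score: float = 0.5) -> List[Tuple[str, str, float]]:
    """Keep pairs that are each other's best hit and score >= min_score."""
    a_to_b = best_hits(genome_a, genome_b, k)
    b_to_a = best_hits(genome_b, genome_a, k)
    pairs = []
    for a_name, (b_name, score) in sorted(a_to_b.items()):
        reciprocal = b_to_a.get(b_name, (None, 0.0))[0] == a_name
        if reciprocal and score >= min_score:
            pairs.append((a_name, b_name, score))
    return pairs


# Toy transcript sets (hypothetical sequences, not Challenge data).
genome_a = {
    "A_t1": "ATGGCGTACGTTAGCCTAGGCTAA",
    "A_t2": "ATGTTTCCCGGGAAATTTCCCTGA",
}
genome_b = {
    "B_t1": "ATGGCGTACGTTAGCCTAGGATAA",  # A_t1 with one substitution
    "B_t2": "ATGTTTCCCGGGAAATTTCCCTGA",  # identical to A_t2
    "B_t3": "ATGCACACACACACACACACATAG",  # no counterpart in genome_a
}

pairs = reciprocal_best_hits(genome_a, genome_b)
for a_name, b_name, score in pairs:
    print(f"{a_name}\t{b_name}\t{score:.3f}")
```

Requiring the hit to be reciprocal is a simple way to reduce false pairings when one genome carries a sequence with no counterpart in the other (here `B_t3`), but it can also discard true allelic pairs when several transcripts of a gene are nearly identical.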

The submitted proposal should include the following:

  1. A detailed description of the proposed Solution addressing specific Technical Requirements presented in the Detailed Description of the Challenge. This should also include a thorough description of the method used in the Solution accompanied by a well-articulated rationale for the software or programs employed and/or the algorithms developed.
  2. Output from the proposed method applied to the test sets presented in the Challenge, in the required format described in DATA-Expected Output-Format.
  3. Upon request, a software/algorithm/package including source code and executable(s) with sufficient documentation to enable the Seeker to compile, execute the algorithm, and validate the method using additional validation data sets.

The Challenge award is contingent upon theoretical evaluation of the method/algorithm by the Seeker, and validation by the Seeker of the submitted software/algorithm/package.

To receive an award, the Solvers will not have to transfer their exclusive IP rights to the Seeker. Instead, Solvers will grant to the Seeker a non-exclusive license to practice their solutions. 

Submissions to this Challenge must be received by 11:59 PM (US Eastern Time) on November 23, 2019. Late submissions will not be considered.

What is InnoCentive?
InnoCentive is the global innovation marketplace where creative minds solve some of the world's most important problems for cash awards up to $1 million. Commercial, governmental and humanitarian organizations engage with InnoCentive to solve problems that can impact humankind in areas ranging from the environment to medical advancements.

What is an RTP Challenge?

An InnoCentive RTP (Reduction to Practice) Challenge calls for a prototype that proves an idea, and is similar to an InnoCentive Theoretical Challenge in its high level of detail. However, an RTP requires the Solver to submit a validated solution, either in the form of original data or a physical sample. The Seeker is also allowed to test the proposed solution. For details about treatment of IP rights, please see the Challenge-Specific Agreement.
