Please note that this Challenge is no longer open. It has been awarded and is no longer accepting new submissions.
Cluster Analysis

STATUS: Awarded
Active Solvers: 485
Posted: Jun 08 2017
Challenge ID: 9933979
Team Project Rooms are available on this Challenge.

The Seeker is searching for a method that can determine whether an automated data clustering system has properly clustered a scatterplot. The desired method is a classification algorithm that examines a scatterplot previously clustered by the automated data clustering system and classifies the scatterplot as either pass (well clustered) or fail (poorly clustered).

This is an RTP Challenge that requires written documentation, output from the classification algorithm, and submission of source code for validation upon request.


The Seeker employs a process in which an assay is applied to many samples in parallel to determine the state of each sample. Typically, a sample can exist in one of three states. The assay produces two signals whose magnitudes depend on the state of the sample. When the two signals produced by an assay applied to a set of samples are plotted against each other as an XY scatterplot, three distinct clusters are observed corresponding to the three possible states present among the samples. To infer the state of each sample, the clusters must be identified and labeled correctly.  The state of each sample is then inferred from the cluster in which its datapoint appears.
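One way to put a number on how "distinct" the labeled clusters in such a scatterplot are is the mean silhouette coefficient. The sketch below is illustrative only and is not the Seeker's method: the function name `mean_silhouette`, the sample coordinates, and the choice of silhouette as a quality measure are all assumptions made for this example.

```python
import math


def mean_silhouette(points, labels):
    """Mean silhouette coefficient for labeled 2-D points (pure Python, O(n^2)).

    Values near 1 suggest compact, well-separated clusters; values near 0 or
    below suggest overlapping or mislabeled clusters.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        return 0.0  # silhouette is undefined for a single cluster

    total = 0.0
    for i, own in enumerate(labels):
        members = clusters[own]
        if len(members) == 1:
            continue  # singleton clusters contribute 0 by convention
        # a: mean distance to the other points in the same cluster
        a = sum(math.dist(points[i], points[j]) for j in members if j != i)
        a /= len(members) - 1
        # b: smallest mean distance to the points of any other cluster
        b = min(
            sum(math.dist(points[i], points[j]) for j in other) / len(other)
            for lab, other in clusters.items()
            if lab != own
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Hypothetical data: three tight groups standing in for the three sample states.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (0, 10), (0, 11), (1, 10)]
good_labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]  # labels follow the visible groups
bad_labels = [0, 1, 2, 0, 1, 2, 0, 1, 2]   # labels cut across the groups

good_score = mean_silhouette(points, good_labels)
bad_score = mean_silhouette(points, bad_labels)
print(f"good labeling: {good_score:.2f}, bad labeling: {bad_score:.2f}")
```

A score like this only measures geometric separation. It cannot tell whether a well-separated cluster has been given the wrong state label, so a real solution would likely need additional features.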

Clustering may be performed by humans, termed manual clustering, or by computers, termed auto-clustering. Clustering is challenging due to several factors. Manual clustering has proven to be somewhat more accurate than auto-clustering, but is much more labor intensive. To use auto-clustering effectively, it is necessary to identify scatterplots from auto-clustering that have been clustered poorly and submit them for manual review. This Challenge is seeking an algorithm that can automatically determine whether a scatterplot from auto-clustering has been clustered poorly and classify it as such. The desired algorithm must be heavily weighted toward eliminating false negatives (poorly clustered scatterplots that are incorrectly classified as well clustered) so that all scatterplots that are clustered poorly receive manual review. Solvers will be provided with numerous example scatterplots (a training set) to use in developing and testing their algorithm.
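The weighting toward eliminating false negatives can be illustrated with a simple decision rule. The sketch below assumes each scatterplot has already been reduced to a single quality score where higher means better clustered. That score, the function names, the `margin` parameter, and the training values are hypothetical and do not come from the Challenge data. The threshold is set just above the best-scoring poorly clustered training example, so every known-poor plot is sent to manual review, at the cost of also failing some well clustered plots.

```python
def choose_pass_threshold(scores, is_poor, margin=0.0):
    """Return a threshold that fails every known-poor training scatterplot.

    A plot passes only if its score is strictly greater than the threshold,
    so the threshold is the highest score among poor plots plus a safety margin.
    """
    poor_scores = [s for s, poor in zip(scores, is_poor) if poor]
    if not poor_scores:
        raise ValueError("need at least one poorly clustered training example")
    return max(poor_scores) + margin


def classify(score, threshold):
    """Classify one scatterplot as 'pass' (well clustered) or 'fail'."""
    return "pass" if score > threshold else "fail"


def false_negative_rate(scores, is_poor, threshold):
    """Fraction of poorly clustered plots that were wrongly classified as 'pass'."""
    poor_scores = [s for s, poor in zip(scores, is_poor) if poor]
    missed = sum(1 for s in poor_scores if classify(s, threshold) == "pass")
    return missed / len(poor_scores)


# Hypothetical training set: one quality score per scatterplot, plus ground truth.
train_scores = [0.91, 0.85, 0.40, 0.62, 0.88, 0.30, 0.64]
train_is_poor = [False, False, True, True, False, True, False]

threshold = choose_pass_threshold(train_scores, train_is_poor, margin=0.05)
results = [classify(s, threshold) for s in train_scores]
fnr = false_negative_rate(train_scores, train_is_poor, threshold)
print(results, fnr)
```

In this made-up data the last plot is well clustered but scores 0.64, so it fails and goes to manual review. That is the intended trade-off: extra review work is accepted in exchange for not letting a poorly clustered plot through. A zero false negative rate on the training set does not guarantee zero on unseen data, which is why a margin is included.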

A submission to the Challenge should include the following:

  1. A detailed description of the proposed Solution addressing specific Technical Requirements presented in the Detailed Description of the Challenge. This should also include a thorough description of the algorithm used in the Solution accompanied by a well-articulated rationale for the method employed.
  2. Output from the proposed algorithm applied once, and only once, to a test data set. Output must be in the form described in the Detailed Description of the Challenge.  Submissions will be ranked by the Seeker based on accuracy against the test data set.
  3. For the top ranked submissions, the Seeker may request source code and/or an executable with sufficient documentation to enable the Seeker to compile, execute the algorithm, and validate the method using additional test data sets.

The Challenge award is contingent upon theoretical evaluation of the method/algorithm by the Seeker, and validation by the Seeker of the submitted RTP Solution.

To receive an award, the Solvers will not have to transfer their exclusive IP rights to the Seeker. Instead, Solvers will grant to the Seeker a non-exclusive license to practice their solutions.  

Submissions to this Challenge must be received by 11:59 PM (US Eastern Time) on August 7, 2017. 

Late submissions will not be considered.

What is InnoCentive?
InnoCentive is the global innovation marketplace where creative minds solve some of the world's most important problems for cash awards up to $1 million. Commercial, governmental and humanitarian organizations engage with InnoCentive to solve problems that can impact humankind in areas ranging from the environment to medical advancements.

What is an RTP Challenge?

An InnoCentive RTP (Reduction to Practice) Challenge calls for a prototype that proves an idea, and is similar to an InnoCentive Theoretical Challenge in its high level of detail. However, an RTP requires the Solver to submit a validated solution, in the form of either original data or a physical sample. The Seeker is also allowed to test the proposed solution. For details about the treatment of IP rights, please see the Challenge-Specific Agreement.
