The Seeker is searching for a method that can determine whether an automated data clustering system has properly clustered a scatterplot. The desired method is a classification algorithm that examines a scatterplot previously clustered by the automated data clustering system and classifies the scatterplot as either pass (well clustered) or fail (poorly clustered).
This is an RTP Challenge that requires written documentation, output from the classification algorithm, and submission of source code for validation upon request.
The Seeker employs a process in which an assay is applied to many samples in parallel to determine the state of each sample. Typically, a sample can exist in one of three states. The assay produces two signals whose magnitudes depend on the state of the sample. When the two signals produced by an assay applied to a set of samples are plotted against each other as an XY scatterplot, three distinct clusters are observed corresponding to the three possible states present among the samples. To infer the state of each sample, the clusters must be identified and labeled correctly. The state of each sample is then inferred from the cluster in which its datapoint appears.
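To make the data layout concrete, the following is a minimal sketch of how one clustered scatterplot might be represented and how a state is read off from a cluster label. Everything specific here is an assumption made for illustration: the state names, the cluster centres, the spread, and the function names are hypothetical, and the data is synthetic rather than taken from the Seeker's assay.

```python
import numpy as np

# Hypothetical state names; the Challenge text does not name the three states.
STATES = ("state_1", "state_2", "state_3")


def make_synthetic_scatterplot(n_per_state=30, spread=0.05, seed=0):
    """Return (points, labels) for an illustrative three-cluster scatterplot.

    points has shape (3 * n_per_state, 2): one (signal X, signal Y) pair per sample.
    labels holds the cluster index (0, 1, or 2) assigned to each sample.
    """
    rng = np.random.default_rng(seed)
    # Assumed cluster centres, chosen only so that the clusters are well separated.
    centres = np.array([[0.9, 0.1], [0.5, 0.5], [0.1, 0.9]])
    points = np.vstack(
        [c + spread * rng.standard_normal((n_per_state, 2)) for c in centres]
    )
    labels = np.repeat(np.arange(3), n_per_state)
    return points, labels


def infer_states(labels):
    """Map each datapoint's cluster label to the state that cluster represents."""
    return [STATES[k] for k in labels]


points, labels = make_synthetic_scatterplot()
states = infer_states(labels)
```

In this representation, a clustering error shows up as a wrong entry in `labels`, which in turn yields a wrong inferred state for that sample.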
Clustering may be performed by humans, termed manual clustering, or by computers, termed auto-clustering. Clustering is challenging due to several factors. Manual clustering has proven to be somewhat more accurate than auto-clustering, but is much more labor intensive. To use auto-clustering effectively, it is necessary to identify scatterplots from auto-clustering that have been clustered poorly and submit them for manual review. This Challenge is seeking an algorithm that can automatically determine whether a scatterplot from auto-clustering has been clustered poorly and classify it as such. The desired algorithm must be heavily weighted toward eliminating false negatives (poorly clustered scatterplots that are incorrectly classified as well clustered) so that all scatterplots that are clustered poorly receive manual review. Solvers will be provided with numerous example scatterplots (a training set) to use in developing and testing their algorithm.
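One way to picture the weighting toward false negatives is a single quality score with a deliberately conservative pass threshold. The sketch below is not the Seeker's method or a proposed solution: it swaps in the mean silhouette coefficient as a stand-in quality score, and the function names and the `margin` parameter are hypothetical. The threshold is set just above the highest score seen among the training plots known to be poorly clustered, so every such training plot is classified as fail, at the cost of sending more well-clustered plots to manual review.

```python
import numpy as np


def mean_silhouette(points, labels):
    """Mean silhouette coefficient of a labelled 2-D scatterplot.

    Values near 1 indicate tight, well-separated clusters; values near 0
    or below indicate overlapping or badly assigned clusters.
    """
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    clusters = np.unique(labels)
    if len(clusters) < 2:
        return -1.0  # assumption: treat a single-cluster result as the worst case
    # Pairwise Euclidean distances between all datapoints.
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    scores = np.zeros(len(points))
    for i in range(len(points)):
        own = labels == labels[i]
        n_own = own.sum()
        if n_own == 1:
            continue  # the silhouette of a singleton cluster is defined as 0
        a = dist[i, own].sum() / (n_own - 1)  # mean distance within own cluster
        b = min(dist[i, labels == k].mean() for k in clusters if k != labels[i])
        denom = max(a, b)
        scores[i] = (b - a) / denom if denom > 0 else 0.0
    return float(scores.mean())


def calibrate_threshold(scores, is_poor, margin=0.05):
    """Choose a pass threshold above every known-poor training score."""
    scores = np.asarray(scores, dtype=float)
    is_poor = np.asarray(is_poor, dtype=bool)
    if not is_poor.any():
        raise ValueError("need at least one poorly clustered training example")
    return float(scores[is_poor].max() + margin)


def classify(points, labels, threshold):
    """Return 'pass' only when the quality score clears the threshold."""
    return "pass" if mean_silhouette(points, labels) > threshold else "fail"
```

A high silhouette score only says that the clusters are compact and separated; it does not confirm that each cluster carries the correct state label, so a real solution would likely need additional features. Zero false negatives on the training set also does not guarantee zero false negatives on unseen scatterplots.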
A submission to the Challenge should include the following:
The Challenge award is contingent upon theoretical evaluation of the method/algorithm by the Seeker, and validation by the Seeker of the submitted RTP Solution.
To receive an award, the Solvers will not have to transfer their exclusive IP rights to the Seeker. Instead, Solvers will grant to the Seeker a non-exclusive license to practice their solutions.
Submissions to this Challenge must be received by 11:59 PM (US Eastern Time) on August 7, 2017.
Late submissions will not be considered.
What is an RTP Challenge?
An InnoCentive RTP (Reduction to Practice) Challenge calls for a prototype that proves an idea, and is similar to an InnoCentive Theoretical Challenge in its high level of detail. However, an RTP requires the Solver to submit a validated solution, either in the form of original data or a physical sample. In addition, the Seeker is allowed to test the proposed solution. For details about treatment of IP rights, please see the Challenge-Specific Agreement.