java.lang.Object
  crowdUser.Refiner
public class Refiner
Parses raw results in csv format and lets them be refined by 'get another label', a program designed by P. Ipeirotis (a computer science researcher specializing in crowdsourcing).
Doing so reduces the noise and makes the comparisons more accurate, which yields a better sort.
More generally, this class parses results so that a CrowdManager instance can use them easily, for example by transforming a csv file into a list of String arrays.
Field Summary | |
---|---|
private java.lang.Integer |
columnAnswer
The column in which the answer to the question asked of the worker is stored. |
private java.lang.Integer |
columnAxis
The column in which the axis considered for the comparison is stored. |
private java.lang.Integer |
columnMedia1
The column in which the identifier of the first media is stored. |
private java.lang.Integer |
columnMedia2
The column in which the identifier of the second media is stored. |
private java.lang.String[] |
columnsCSV
The labels of the columns in the csv file. |
private java.lang.Integer |
columnWorkerId
The column in which the identifier of the worker who performed the comparison is stored. |
private java.util.Map<java.lang.String,java.lang.Integer> |
compScores
A Map associating the signature of each comparison with its score. |
private java.util.List<java.lang.String[]> |
data
A List of String arrays containing the data currently under study: raw data or processed data, depending on the stage. |
Constructor Summary | |
---|---|
Refiner(java.util.List<java.lang.String> fields)
Creates a Refiner instance and sets the fields considered in the formatted results. |
Method Summary | |
---|---|
void |
csv2List(java.lang.String csvFile)
Puts the data contained in a csv file in the data attribute. |
private java.lang.String |
generateCorrectFile()
Creates the "correct file" used by get another label (it is returned as a String). |
private java.lang.String |
generateInputFile()
Creates the "input file" used by 'get another label' (it is returned as a String) and puts the comparisons in the compScores and compNumber Map. |
void |
getanotherlabel()
Calls 'get another label', written by Panos Ipeirotis and his students, to process the results of the HIT. |
java.util.List<java.lang.String[]> |
getData()
Returns the data attribute. |
private void |
getFields(java.util.List<java.lang.String> csvFields)
Initializes columnsCSV and column* using the content of the 'csvFields' parameter. |
private java.lang.String |
signature(java.lang.String axis,
java.lang.String idMedia1,
java.lang.String idMedia2)
Returns the "signature" of a comparison, a String identifying it without ambiguity. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
private java.lang.String[] columnsCSV
private java.lang.Integer columnAxis
private java.lang.Integer columnMedia1
private java.lang.Integer columnMedia2
private java.lang.Integer columnWorkerId
private java.lang.Integer columnAnswer
private java.util.Map<java.lang.String,java.lang.Integer> compScores
private java.util.List<java.lang.String[]> data
Constructor Detail |
---|
public Refiner(java.util.List<java.lang.String> fields)
Uses the fields contained in the 'fields' parameter to initialize the columnsCSV attribute as well as all the attributes holding a column number (i.e., column[Axis | Media[1|2] | Answer | WorkerId]).
fields
- A list containing the fields of the CSV file containing the HIT results.
Method Detail |
---|
private java.lang.String signature(java.lang.String axis, java.lang.String idMedia1, java.lang.String idMedia2)
The signature of a comparison is defined as follows: if a comparison has been performed between idMedia1 and idMedia2 along the axis axis, its signature is axis__idMedia1__idMedia2. It is the key under which the result of the comparison is stored in the compScores attribute.
axis
- the axis along which the studied comparison is performed
idMedia1
- the identifier of the first media
idMedia2
- the identifier of the second media
compScores
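The signature format described above can be sketched as follows. This is a minimal illustration, not the class's actual source: the class name SignatureSketch and the sample values ("size", "img12", "img47") are hypothetical, and the "__" separator is taken from the axis__idMedia1__idMedia2 format given in the description.

```java
import java.util.HashMap;
import java.util.Map;

final class SignatureSketch {
    // Separator assumed from the documented format axis__idMedia1__idMedia2.
    private static final String SEPARATOR = "__";

    // Builds the key under which a comparison's result would be stored.
    static String signature(String axis, String idMedia1, String idMedia2) {
        return axis + SEPARATOR + idMedia1 + SEPARATOR + idMedia2;
    }

    public static void main(String[] args) {
        // Hypothetical sample values, used only for illustration.
        Map<String, Integer> compScores = new HashMap<>();
        compScores.put(signature("size", "img12", "img47"), 0);
        System.out.println(compScores); // prints {size__img12__img47=0}
    }
}
```

Note that the order of the two identifiers matters in this sketch: swapping idMedia1 and idMedia2 produces a different key.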
private void getFields(java.util.List<java.lang.String> csvFields)
Each String in csvFields is checked against the compulsory field names (i.e., idMedia1, idMedia2, WorkerId, axis, and a last, much longer one corresponding to the question asked); on a match, its column number is saved in the corresponding attribute.
Although MiscData[1/2] are also compulsory fields, their columns are not saved here, to avoid using space unnecessarily.
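The lookup described above might look roughly like the sketch below. It rests on several assumptions: the class name FieldIndexSketch is hypothetical, the exact spelling of the labels in the csv header is taken from the description and may differ in practice, and ANSWER_LABEL is a placeholder because the real label of the answer column (the full question text) is not given here.

```java
import java.util.List;

final class FieldIndexSketch {
    // Placeholder only: the real answer label is the full text of the question.
    static final String ANSWER_LABEL = "answer";

    Integer columnAxis;
    Integer columnMedia1;
    Integer columnMedia2;
    Integer columnWorkerId;
    Integer columnAnswer;

    // Records the index of each compulsory field; other fields are ignored.
    void getFields(List<String> csvFields) {
        for (int i = 0; i < csvFields.size(); i++) {
            String field = csvFields.get(i);
            if (field.equals("axis")) {
                columnAxis = i;
            } else if (field.equals("idMedia1")) {
                columnMedia1 = i;
            } else if (field.equals("idMedia2")) {
                columnMedia2 = i;
            } else if (field.equals("WorkerId")) {
                columnWorkerId = i;
            } else if (field.equals(ANSWER_LABEL)) {
                columnAnswer = i;
            }
        }
    }
}
```

In this sketch, an attribute stays null when its label is missing from the header, so a caller can detect an incomplete file before reading any row.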
csvFields
- A list containing the fields of the CSV file containing the HIT results.
public void csv2List(java.lang.String csvFile)
The data attribute is a List of String arrays. Each array holds the content of one row of the csv file, and each of its cells holds the content of one column in that row.
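The row layout described above can be illustrated with the following sketch. It is not the class's actual parsing code: the names CsvRowsSketch, toRows and readCsv are hypothetical, and splitting on every comma is a simplifying assumption that does not handle quoted fields containing commas.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class CsvRowsSketch {
    // Turns each line into one String array, one cell per column.
    // The limit -1 keeps trailing empty columns instead of dropping them.
    static List<String[]> toRows(List<String> lines) {
        List<String[]> data = new ArrayList<>();
        for (String line : lines) {
            data.add(line.split(",", -1));
        }
        return data;
    }

    // Reads the whole file (UTF-8) and converts it with toRows.
    static List<String[]> readCsv(Path csvPath) throws IOException {
        return toRows(Files.readAllLines(csvPath));
    }
}
```

With this layout, a cell is reached as data.get(row)[column], which is how a column number such as columnAxis would be used.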
csvFile
- The path to the csv file within the '../data/HITresults/' directory, minus the extension (for instance, results$JOB_ID instead of results$JOB_ID.csv).
public void getanotherlabel()
This method has three steps:
For more details on how get another label works, feel free to read its partially commented source files or to browse its absence of documentation.
private java.lang.String generateCorrectFile()
"Get another label" needs a "correct file" in its input. For more details, see this (correct file section).
private java.lang.String generateInputFile()
"Get another label" needs an "input file" in its input. For more details, see this (input file section).
This String, formatted like a file (it contains "\n" at the end of each "line"), contains the raw results of the HITs. The String is actually generated at the end of the "try" block. In this String, results are recorded the way they are in the csv file, i.e. "1" means that the first of the two media is the greater one, and "2" means the contrary.
However, the score of each comparison is also calculated in this function, for better performance. The score is an integer. It starts at 0; then 1 is added each time media1 is judged greater than media2, and 1 is subtracted each time the contrary is judged. All the raw results are read only once, and the results are added to the compScores Map attribute as they are read.
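The scoring rule described above can be sketched as a single pass over the rows. This is an illustration under stated assumptions, not the method's actual body: the class name ScoreSketch is hypothetical, each row is assumed to be laid out as {axis, idMedia1, idMedia2, answer} (the real class looks columns up through its column* attributes), and rows whose answer is neither "1" nor "2" are simply skipped.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ScoreSketch {
    // Adds +1 for each "1" answer and -1 for each "2" answer, per comparison.
    static Map<String, Integer> scores(List<String[]> rows) {
        Map<String, Integer> compScores = new HashMap<>();
        for (String[] row : rows) {
            // Key in the documented axis__idMedia1__idMedia2 format.
            String key = row[0] + "__" + row[1] + "__" + row[2];
            int delta;
            if ("1".equals(row[3])) {
                delta = 1;   // first media judged greater
            } else if ("2".equals(row[3])) {
                delta = -1;  // second media judged greater
            } else {
                continue;    // assumption: ignore any other answer
            }
            // Starts from delta on first sight, then accumulates.
            compScores.merge(key, delta, Integer::sum);
        }
        return compScores;
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"size", "a", "b", "1"},
            new String[] {"size", "a", "b", "1"},
            new String[] {"size", "a", "b", "2"});
        System.out.println(scores(rows)); // prints {size__a__b=1}
    }
}
```

A positive score then means that more workers judged the first media greater, a negative score the contrary, and 0 an even split.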
public java.util.List<java.lang.String[]> getData()