0.8
Sorting media using crowdsourcing.   

myShell.Interpreter Class Reference

Deals with several xml files, either reading or writing them. More...

Public Member Functions

 Interpreter (MediaBase database)
 Creates an Interpreter instance.
List< Integer[]> getCurrentState (LogWriter log, String axis)
 Returns the current state of the SplitSort along the axis axis if there is a file containing it in ../data/previously, otherwise returns the result of getAllMedia().
void generateIntermediaryXMLs (String xmlFile, List< Integer[]> content)
 Generates an xml file containing the current state of the splitsort.
String generateFinalXML (String xmlFile, Map< String, Integer[]> sortResults)
 Writes an xml file containing the position of each media along each axis as well as all their characteristics stored in the mediaBase; also returns the content of this file as a String for further processing.

Static Public Member Functions

static Map< String, String > getConfig ()
 Reads the value of several parameters in the "../data/config.xml" file.
static List< String > getAxes ()
 Returns a List of String containing the axes considered during the comparisons.
static List< String > getFields (String fields)
 Returns a List of String containing the names of the fields used to retrieve hard data from the HIT results file.
static Boolean checkFile (LogWriter log)
 Checks if the config.xml file contains all the fields necessary for CPS to work.
static Integer getLastJob ()
 Returns the Integer saved in the '../data/previously/lastJob' file.
static void setLastJob (Integer id)
 Saves a job identifier in the '../data/previously/lastJob' file.

Private Member Functions

List< Integer[]> parsePrevious (String axis)
 Parses the xml file containing the current state of the sort along the axis axis.
Double rank2value (Integer rank)
 Transforms a rank among the other media into a real value along an axis.
List< Map< String, Integer > > preformating (Map< String, Integer[]> sortResults)
 Transforms the map sortResults into a List so that it can be used by the generateFinalXML() method.

Private Attributes

MediaBase mediaBase
 The myDataBases.MediaBase in which the names and other characteristics of the media sorted are stored.
Integer numberOfMedia
 The number of media sorted.

Detailed Description

Deals with several xml files, either reading or writing them.

This class reads the configuration file config.xml to retrieve the constants used in this program, as well as the axes along which comparisons must be performed and the structure of the information used (i.e., the fields of the hit table and their matching elements in the csv file retrieved from the Antechamber).

It also writes intermediary results in order for the splitsort to continue where it stopped as well as final results (those you want if you are using this software).

Author:
Leo Perrin (perrin.leo@gmail.com)

Definition at line 34 of file Interpreter.java.


Constructor & Destructor Documentation

myShell.Interpreter.Interpreter ( MediaBase  database)

Creates an Interpreter instance.

This instance will use the myDataBases.MediaBase MediaBase to get (among other things) the names of the media. It also counts the media returned by getAllMedia() in order to know how many media are to be sorted this time.

Parameters:
database: The MediaBase instance to be aggregated to the Interpreter.

Definition at line 62 of file Interpreter.java.

      {
            // media
            this.mediaBase = database;
            this.numberOfMedia = 0;
            for (Integer[] array : this.mediaBase.getAllMedia())
                  this.numberOfMedia += array.length;
      }

Member Function Documentation

static Boolean myShell.Interpreter.checkFile ( LogWriter  log) [static]

Checks if the config.xml file contains all the fields necessary for CPS to work.

This function verifies that the configuration file contains:

  • mediaPath: the path to the file containing information on the media
  • sqlBase: the name of the MySQL database used
  • sqlUser: the user of the MySQL database
  • antechamberUrl: the URL of the Antechamber, a website in charge of receiving results from CF and parsing them (among other things)
  • cfKey: the key given by CrowdFlower (available to CrowdFlower account holders; an account is required for CPS to work)
  • cfJobNumber: the identifier of the CrowdFlower job used
  • At least one axis along which to perform comparisons
  • At least the media1, media2, greater and axis fields in the hardDataFields node
Parameters:
log: The LogWriter instance where the results are saved.
Returns:
true if everything is fine in the config.xml file, false otherwise.

Definition at line 222 of file Interpreter.java.

      {
            Boolean fine = true;
            log.append("------- Checking config.xml");

            // checking the constants
            Map<String,String> configContent = getConfig() ;
            String[] necessaryConstants = { "mediaPath", "sqlBase", "sqlUser", "antechamberUrl", "cfKey", "cfJobNumber" };
            for (String constant : necessaryConstants)
                  if (configContent.keySet().contains(constant))
                        log.append(constant+" constant found");
                  else
                  {
                        log.append("ERROR : " + constant + " NOT FOUND. Interrupting.");
                        fine = false;
                  }
            
            // checking the presence of axes
            List<String> axes = getAxes() ;
            if (axes == null)
            {
                  fine = false;
                  log.append("ERROR : no axis found. Interrupting.");
            }
            else
                  log.append(axes.size() + " axes found.");
            
            // checking the content of the hard data fields
            List<String> fields = getFields("SQLfield");
            String[] necessaryFields = { "media1", "media2", "greater", "axis" };
            if (fields == null)
            {
                  // a missing section is an error, so the check must fail
                  log.append("ERROR : no 'hardDataFields' section in the config.xml. Interrupting.");
                  fine = false;
            }
            else
                  for (String field : necessaryFields)
                        if (fields.contains(field))
                              log.append(field +" found in the hardDataFields section");
                        else
                        {
                              log.append("ERROR : " + field + " NOT FOUND in the hardDataFields section. Interrupting.");
                              fine = false;
                        }
            
            log.append("------- END check config.xml\n");
            return fine;
      }
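
The constants check in the listing above boils down to comparing a fixed list of required keys against the parsed configuration. A minimal sketch of that idea follows; the class name, the method name, and the sample value "cps" are hypothetical and not part of CPS, and the sketch returns the missing keys instead of writing to a LogWriter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative helper, not part of CPS: lists the required constants that are absent.
final class ConfigCheckSketch {

    // The six constants that checkFile() looks for in the "constants" node.
    static final List<String> REQUIRED_CONSTANTS = Arrays.asList(
            "mediaPath", "sqlBase", "sqlUser", "antechamberUrl", "cfKey", "cfJobNumber");

    /** Returns the required constants that are not keys of the given configuration map. */
    static List<String> missingConstants(Map<String, String> config) {
        List<String> missing = new ArrayList<>();
        for (String constant : REQUIRED_CONSTANTS) {
            if (!config.containsKey(constant)) {
                missing.add(constant);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        config.put("mediaPath", "../data/media.xml");
        config.put("sqlBase", "cps"); // hypothetical database name
        // Prints: Missing constants: [sqlUser, antechamberUrl, cfKey, cfJobNumber]
        System.out.println("Missing constants: " + missingConstants(config));
    }
}
```

Returning the list of missing keys, rather than a single Boolean, lets a caller report every problem at once; the configuration is valid with respect to constants when the list is empty.
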

String myShell.Interpreter.generateFinalXML ( String  xmlFile, Map< String, Integer[]>  sortResults )

Writes an xml file containing the position of each media along each axis as well as all their characteristics stored in the mediaBase; also returns the content of this file as a String for further processing.

The results.xml file has the following structure:

      <results>
        <media id="identifier of the media">
          <characteristic>a first characteristic</characteristic>
          <othercharac>a second characteristic</othercharac>
          <axis>abscissa along this axis</axis>
          <otherAxis>abscissa along this other axis</otherAxis>
        </media>
        ...
      </results>

Parameters:
xmlFile: The path where the xml file will be written.
sortResults: The results to be written in the xml file. It must have the following structure: [ "axis considered", "order of the media along this axis" ].
Returns:
An XML formatted String which is identical to what is written in the file.

Definition at line 516 of file Interpreter.java.

      {
            Element rootResults = new Element("results");
            org.jdom.Document documentResults = new Document(rootResults);
            Map<Integer, Map<String,String>> mediaCharac= this.mediaBase.getContent();
            Map<String,String> media ;
            List< Map<String,Integer> > results = this.preformating(sortResults) ;
            Element charac;
            String xmlString = "A problem was encountered!";
            for (Integer mediaID : mediaCharac.keySet())
            {
                  // retrieving information on this media
                  media = mediaCharac.get(mediaID);
                  // creation of a new node corresponding to a new media
                  Element mediaElement = new Element("media");
                  Attribute identifier = new Attribute("id",String.valueOf(mediaID));
                  mediaElement.setAttribute(identifier);
                  for (String entry : media.keySet())
                  {
                        charac = new Element(entry);
                        charac.setText(media.get(entry));
                        mediaElement.addContent(charac);
                  }
                  // addition of its values along each axis
                  Map<String,Integer> resultsForThisMedia = results.get(mediaID); 
                  for ( Iterator<String> emotion = resultsForThisMedia.keySet().iterator(); emotion.hasNext(); )
                  {
                        String axis = emotion.next();
                        Integer rank = resultsForThisMedia.get(axis);
                        Element coordinate = new Element(axis);
                        coordinate.setText(this.rank2value(rank).toString());
                        mediaElement.addContent(coordinate);
                  }
                  rootResults.addContent(mediaElement);
            }
            // Writing the xml Document in a hard file
            try
            {
                  XMLOutputter sortie = new XMLOutputter(Format.getPrettyFormat());
                  sortie.output(documentResults, new FileOutputStream(xmlFile));
                  xmlString = sortie.outputString(documentResults);
            }
            catch (Exception e){ e.printStackTrace(); }
            return xmlString;
      }

void myShell.Interpreter.generateIntermediaryXMLs ( String  xmlFile, List< Integer[]>  content )

Generates an xml file containing the current state of the splitsort.

Generates a file formatted so as to store the content of a list of arrays. The aim is for this file to be used by the getCurrentState() method as the starting point of the next splitsort iteration.

Parameters:
xmlFile: The file in which the current state of an axis is being saved.
content: The organization of the media along a given axis during the current iteration. For instance, if there are three arrays at the end of this iteration along the axis, it must contain the following entry: { [2,1,3],[4],[7,6,5] }.

Definition at line 411 of file Interpreter.java.

      {

            Element rootResults = new Element("results");
            org.jdom.Document documentResults = new Document(rootResults);
            
            Map<Integer, Map<String,String>> mediaCharac= this.mediaBase.getContent();
            Map<String,String> media ;
            Element charac;
            
            for (Integer[] array : content)
            {
                  Element arrayElement = new Element("array");
                  for (Integer mediaID : array)
                  {
                        Element identifier = new Element("id");
                        identifier.setText(String.valueOf(mediaID));
                        arrayElement.addContent(identifier);
                  }
                  rootResults.addContent(arrayElement);
            }
            try
            {
                  XMLOutputter sortie = new XMLOutputter(Format.getPrettyFormat());
                  sortie.output(documentResults, new FileOutputStream(xmlFile));
            }
            catch (Exception e){ e.printStackTrace(); }
      }
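
To make the file layout concrete, here is a minimal sketch that renders the example entry { [2,1,3],[4],[7,6,5] } with the element names used in the listing above (results, array, id). It builds the text by hand to stay free of dependencies, whereas the listing relies on JDOM; the class and method names are hypothetical, and the exact whitespace written by JDOM's pretty format may differ.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: renders a splitsort state as <results><array><id>..</id></array></results>.
final class IntermediaryXmlSketch {

    /** Builds the XML text for a list of arrays of media identifiers. */
    static String toXml(List<Integer[]> content) {
        StringBuilder xml = new StringBuilder("<results>\n");
        for (Integer[] array : content) {
            xml.append("  <array>\n");
            for (Integer mediaId : array) {
                xml.append("    <id>").append(mediaId).append("</id>\n");
            }
            xml.append("  </array>\n");
        }
        return xml.append("</results>\n").toString();
    }

    public static void main(String[] args) {
        List<Integer[]> state = Arrays.asList(
                new Integer[] {2, 1, 3},
                new Integer[] {4},
                new Integer[] {7, 6, 5});
        System.out.print(toXml(state));
    }
}
```

Each array element corresponds to one group of media still awaiting (or no longer needing) a splitsort iteration, and the order of the id children inside it is preserved.
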
static List<String> myShell.Interpreter.getAxes ( ) [static]

Returns a List of String containing the axes considered during the comparisons.

Returns all the children of the "axes" node of the config.xml file whose name is "axis". They correspond to the axes along which the sorting is to be performed.

Returns:
A list of String, each String being an axis.

Definition at line 120 of file Interpreter.java.

      {

            Element rootConfig;
            org.jdom.Document documentConfig;
            SAXBuilder sxb = new SAXBuilder();
            List<String> listAxes = new ArrayList <String>() ;
            try
            {
                  documentConfig= sxb.build(new File("../data/config.xml"));
                  rootConfig = documentConfig.getRootElement();
                  Element axes = rootConfig.getChild("axes");
                  if (axes != null)
                  {
                        Iterator it = axes.getChildren().iterator();
                        while (it.hasNext())
                        {
                              Element current = (Element)it.next();
                              listAxes.add(current.getValue());
                        }
                  }
                  else
                        listAxes = null;
            }
            catch(Exception e){ e.printStackTrace(); }
            return listAxes;
      }
static Map<String,String> myShell.Interpreter.getConfig ( ) [static]

Reads the value of several parameters in the "../data/config.xml" file.

This method reads the content of the "constants" node of the config.xml file. It should contain all the constants used by CPS, such as the path to the file where media are stored or the login used to connect to the database.

The results are stored in a map: each parameter is stored under the key that is its identifier (for example, result["mediaPath"] = "../data/media.xml" by default).

Definition at line 89 of file Interpreter.java.

      {
            Element rootConfig;
            org.jdom.Document documentConfig;
            Map <String,String> conf = new Hashtable <String,String>() ;
            SAXBuilder sxb = new SAXBuilder();
            try
            {
                  documentConfig= sxb.build(new File("../data/config.xml"));
                  rootConfig = documentConfig.getRootElement();
                  Element constants = rootConfig.getChild("constants");
                  Iterator it = constants.getChildren().iterator();
                  while (it.hasNext())
                  {
                        Element current = (Element)it.next();
                        conf.put(current.getName(), current.getValue());
                  }
            }
            catch(Exception e){ e.printStackTrace(); }
            return conf ;
      }
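
As an illustration of the mapping from the "constants" node to a map, the following sketch parses a small configuration held in a String. It uses the JDK's built-in DOM parser rather than JDOM so that it runs without extra libraries. The root element name "config", the value "cps", and the class and method names are assumptions made for this example; only the "constants" node and the key names come from the documentation above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Illustrative only: reads every child element of <constants> into a name -> text map.
final class ConfigConstantsSketch {

    static Map<String, String> readConstants(String xml) {
        Map<String, String> constants = new LinkedHashMap<>();
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = document.getDocumentElement().getElementsByTagName("constants");
            if (nodes.getLength() == 0) {
                return constants; // no "constants" node: return an empty map
            }
            NodeList children = nodes.item(0).getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Skip whitespace text nodes; keep only element children.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    constants.put(child.getNodeName(), child.getTextContent().trim());
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the configuration", e);
        }
        return constants;
    }

    public static void main(String[] args) {
        String xml = "<config><constants>"
                + "<mediaPath>../data/media.xml</mediaPath>"
                + "<sqlBase>cps</sqlBase>" // hypothetical value
                + "</constants></config>";
        System.out.println(readConstants(xml));
    }
}
```

Unlike the listing above, this sketch handles a missing "constants" node explicitly instead of relying on the catch block.
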

List<Integer[]> myShell.Interpreter.getCurrentState ( LogWriter  log, String  axis )

Returns the current state of the SplitSort along the axis axis if there is a file containing it in ../data/previously, otherwise returns the result of getAllMedia().

This function checks if an axis.xml file in the ../data/previously/ directory exists. If so, it parses its content and returns it in a list of arrays. Each element of this list is an array waiting for a splitsort iteration (or not, if its size is too small).

If there is no such file, it calls mediaBase.getAllMedia() in order to get a new list of the media to sort.

Parameters:
log: The LogWriter instance in which progress is recorded.
axis: The axis from which we want to retrieve the results.
Returns:
The state of the media along the axis at the end of the previous iteration or, in its absence, the IDs of all media.

Definition at line 367 of file Interpreter.java.

      {
            List<Integer[]> lst;
            // checking if a saved state exists for this axis
            // (also covers a missing ../data/previously/ directory).
            File previousState = new File("../data/previously/" + axis + ".xml");
            if (!previousState.exists())
            {
                  lst = this.mediaBase.getAllMedia();
                  log.append("[DONE] Reading media to sort.\n   ... Note: no previous results found for " + axis +
                              ": starting from the beginning.");
            }
            else
            {
                  lst = parsePrevious(axis);
                  log.append("[DONE] Reading media to sort.\n   ... Note: previous results found for " + axis +
                              ": starting where we left.");
            }
            return lst;
      }
static List<String> myShell.Interpreter.getFields ( String  fields) [static]

Returns a List of String containing the names of the fields used to retrieve hard data from the HIT results file.

What is returned depends on the value of the parameter fields. If it equals:

  • "SQLfield": it returns the fields of the SQL table HITresults in which hard results are stored
  • "CSVfield": it returns the fields of the CSV file containing HIT results returned by CrowdFlower
  • "SQLtype": it returns the types of the SQL fields, such as "INT" or "VARCHAR(40)"
  • anything else: it returns null.
Parameters:
fields: The name of the field to retrieve. It must have one of these values: "SQLfield", "CSVfield" or "SQLtype".
Returns:
The list of the fields wanted.

Definition at line 168 of file Interpreter.java.

      {
            Element rootConfig;
            org.jdom.Document documentConfig;
            List<String> listFields = new ArrayList <String>() ;
            if ( (fields.equals("SQLfield")) || (fields.equals("SQLtype")) || (fields.equals("CSVfield")) )
            {
                  SAXBuilder sxb = new SAXBuilder();
                  try
                  {
                        documentConfig= sxb.build(new File("../data/config.xml"));
                        rootConfig = documentConfig.getRootElement();
                        Element HDfields = rootConfig.getChild("hardDataFields");
                        Iterator it = HDfields.getChildren().iterator();
                        while (it.hasNext())
                        {
                              Element current = (Element)it.next();
                              String entry = current.getChild(fields).getValue() ;
                              if (entry.length() > 1)
                                    listFields.add(entry);
                        }
                  }
                  catch(Exception e){ e.printStackTrace(); }
            }
            else
                  listFields = null;
            return listFields;
      }
static Integer myShell.Interpreter.getLastJob ( ) [static]

Returns the Integer saved in the '../data/previously/lastJob' file.

Returns:
The identifier of the last job used.

Definition at line 278 of file Interpreter.java.

      {
            Integer id = 0;
            try{
                  InputStream ips = new FileInputStream("../data/previously/lastJob"); 
                  InputStreamReader ipsr = new InputStreamReader(ips);
                  BufferedReader br = new BufferedReader(ipsr);
                  id = Integer.parseInt(br.readLine());
                  br.close(); 
            }           
            catch (Exception e){ e.printStackTrace(); }
            return id;
      }
List<Integer[]> myShell.Interpreter.parsePrevious ( String  axis) [private]

Parses the xml file containing the current state of the sort along the axis axis.

Reads the '../data/previously/$AXIS.xml' file and returns its content. This file should be formatted so as to contain a list of arrays of Integer.

Parameters:
axis: The axis from which we want to retrieve the results.
Returns:
The state of the splitsort along this axis.

Definition at line 321 of file Interpreter.java.

      {
            Element rootPrevious;
            org.jdom.Document documentPrevious;
            List<Integer[]> result = new ArrayList<Integer[]>();
            SAXBuilder sxb = new SAXBuilder();
            try
            {
                  documentPrevious = sxb.build(new File("../data/previously/" + axis + ".xml"));
                  rootPrevious = documentPrevious.getRootElement();
                  Iterator it = rootPrevious.getChildren().iterator();
                  while (it.hasNext())
                  {
                        Element array = (Element)it.next();
                        List<Integer> arrayContent = new ArrayList<Integer>();
                        Iterator jt = array.getChildren().iterator();
                        while (jt.hasNext())
                        {
                              Element entry = (Element)jt.next();
                              arrayContent.add (Integer.parseInt(entry.getValue()) );
                        }
                        result.add(arrayContent.toArray(new Integer[0]));
                  }
            }
            catch(Exception e){ e.printStackTrace(); }
            
            return result;
      }
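
The parsing step can be sketched without JDOM by using the DOM parser that ships with the JDK. The example below reads the same results/array/id layout from a String and rebuilds the list of arrays. The class and method names are hypothetical, and reading from a String rather than from '../data/previously/$AXIS.xml' is a simplification made for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Illustrative only: turns <results><array><id>..</id></array>..</results> into a List<Integer[]>.
final class ParsePreviousSketch {

    static List<Integer[]> parse(String xml) {
        List<Integer[]> result = new ArrayList<>();
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList arrays = document.getDocumentElement().getElementsByTagName("array");
            for (int i = 0; i < arrays.getLength(); i++) {
                NodeList ids = ((Element) arrays.item(i)).getElementsByTagName("id");
                Integer[] mediaIds = new Integer[ids.getLength()];
                for (int j = 0; j < ids.getLength(); j++) {
                    mediaIds[j] = Integer.valueOf(ids.item(j).getTextContent().trim());
                }
                result.add(mediaIds);
            }
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the saved state", e);
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<results>"
                + "<array><id>2</id><id>1</id><id>3</id></array>"
                + "<array><id>4</id></array>"
                + "</results>";
        for (Integer[] array : parse(xml)) {
            System.out.println(Arrays.toString(array));
        }
    }
}
```
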
List< Map<String,Integer> > myShell.Interpreter.preformating ( Map< String, Integer[]>  sortResults) [private]

Transforms the map sortResults into a List so that it can be used by the generateFinalXML() method.

The Map of Arrays returned by the last loop of the main function has a structure that is not suited for writing with a correct formatting in the results.xml file. Therefore, this method returns a List of Map that is suited for such a writing. The structure of the input and output is explained below.

Parameters:
sortResults: The map to modify. Its structure is: {axis considered, [order of the media along this axis]}.
Returns:
A list of maps corresponding to the input map of arrays. Its structure is: [ {axis, rank along this axis of the media whose ID is the index of this map in the list} ... ]
See also:
Interpreter.generateFinalXML(String, Map)

Definition at line 475 of file Interpreter.java.

      {
            List< Map<String,Integer> > newMap = new ArrayList< Map<String,Integer> >();
            // loop on the ID of the media considered
            for (int media=0; media<this.numberOfMedia; media++)
            {
                  Map<String,Integer> mediaRanks = new Hashtable<String,Integer>();
                  // loop on the label of the axis considered
                  for ( String axis : sortResults.keySet() )
                  {
                        int rank = 0;
                        // loop to find the rank of the media along the "axis" axis
                        while (rank<sortResults.get(axis).length)
                              if (sortResults.get(axis)[rank] == media)
                                    break;
                              else
                                    rank++ ;
                        mediaRanks.put(axis,rank);
                  }
                  newMap.add(mediaRanks);
            }
            return newMap;
      }
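
A small worked example may help with the input and output structures. The sketch below performs the same inversion in a single pass over each axis instead of searching for every media in turn. It assumes that media IDs run from 0 to numberOfMedia - 1, as the loop in the listing above does; the axis names "valence" and "arousal" and the class and method names are hypothetical. Unlike the listing, a media absent from an axis simply gets no entry for that axis.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: inverts {axis -> ordered media IDs} into [mediaId -> {axis -> rank}].
final class PreformatSketch {

    static List<Map<String, Integer>> preformat(Map<String, Integer[]> sortResults, int numberOfMedia) {
        List<Map<String, Integer>> ranksByMedia = new ArrayList<>();
        for (int mediaId = 0; mediaId < numberOfMedia; mediaId++) {
            ranksByMedia.add(new LinkedHashMap<>());
        }
        for (Map.Entry<String, Integer[]> entry : sortResults.entrySet()) {
            Integer[] order = entry.getValue();
            for (int rank = 0; rank < order.length; rank++) {
                int mediaId = order[rank];
                // Assumption: IDs outside 0..numberOfMedia-1 are ignored.
                if (mediaId >= 0 && mediaId < numberOfMedia) {
                    ranksByMedia.get(mediaId).put(entry.getKey(), rank);
                }
            }
        }
        return ranksByMedia;
    }

    public static void main(String[] args) {
        Map<String, Integer[]> sortResults = new LinkedHashMap<>();
        sortResults.put("valence", new Integer[] {2, 0, 1}); // media 2 first, then 0, then 1
        sortResults.put("arousal", new Integer[] {1, 2, 0});
        // Prints: [{valence=1, arousal=2}, {valence=2, arousal=0}, {valence=0, arousal=1}]
        System.out.println(preformat(sortResults, 3));
    }
}
```
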
Double myShell.Interpreter.rank2value ( Integer  rank) [private]

Transforms a rank among the other media into a real value along an axis.

In this case, we assume that the media follow a uniform distribution in [-1,1]; the abscissa is therefore given by $\frac{2 \times rank}{numberOfMedia - 1} - 1$. The first media, at rank 0, will have -1 as abscissa and the last one, at rank (numberOfMedia - 1), will have 1.

Parameters:
rank: The rank of the media along an axis.
Returns:
The value corresponding to this rank.

Definition at line 452 of file Interpreter.java.

      {
            Double value =  ( (2.0*rank)/(this.numberOfMedia-1) ) - 1.0;
            return value;
      }
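
As a worked example of the formula, the sketch below maps the ranks of five media onto [-1, 1]. The class and method names are hypothetical, and the number of media is passed as a parameter rather than read from an instance field. The guard for fewer than two media is an assumption added for this example, since the formula divides by zero in that case.

```java
// Illustrative only: linear mapping of a 0-based rank onto the interval [-1, 1].
final class RankToValueSketch {

    /** Returns 2 * rank / (numberOfMedia - 1) - 1. */
    static double rankToValue(int rank, int numberOfMedia) {
        if (numberOfMedia < 2) {
            // Assumption: a single media is placed at the centre of the axis.
            return 0.0;
        }
        return (2.0 * rank) / (numberOfMedia - 1) - 1.0;
    }

    public static void main(String[] args) {
        int numberOfMedia = 5;
        // Prints -1.0, -0.5, 0.0, 0.5 and 1.0 for ranks 0 to 4.
        for (int rank = 0; rank < numberOfMedia; rank++) {
            System.out.println(rank + " -> " + rankToValue(rank, numberOfMedia));
        }
    }
}
```
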
static void myShell.Interpreter.setLastJob ( Integer  id) [static]

Saves a job identifier in the '../data/previously/lastJob' file.

Parameters:
id: The identifier of the job to save.

Definition at line 298 of file Interpreter.java.

      {
            try
            {
                  FileWriter fw = new FileWriter("../data/previously/lastJob", false);
                  BufferedWriter output = new BufferedWriter(fw);
                  output.write(id.toString());
                  output.flush();
                  output.close();
            } catch(Exception e) { e.printStackTrace(); }
      }
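
The pair getLastJob() / setLastJob() amounts to a round trip of a single integer through a text file. A possible sketch of that round trip with java.nio.file follows. It writes to a temporary file rather than '../data/previously/lastJob', and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative only: saves a job identifier as text and reads it back.
final class LastJobSketch {

    static void save(Path file, int id) throws IOException {
        // Overwrites the file, like FileWriter(path, false) in the listing above.
        Files.write(file, Integer.toString(id).getBytes(StandardCharsets.UTF_8));
    }

    static int load(Path file) throws IOException {
        String firstLine = Files.readAllLines(file, StandardCharsets.UTF_8).get(0);
        return Integer.parseInt(firstLine.trim());
    }

    /** Saves the identifier to a temporary file, reads it back, and cleans up. */
    static int roundTrip(int id) {
        try {
            Path file = Files.createTempFile("lastJob", ".txt");
            try {
                save(file, id);
                return load(file);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42)); // prints 42
    }
}
```
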

Member Data Documentation

MediaBase myShell.Interpreter.mediaBase [private]

The myDataBases.MediaBase in which the names and other characteristics of the media sorted are stored.

Definition at line 43 of file Interpreter.java.

Integer myShell.Interpreter.numberOfMedia [private]

The number of media sorted.

Definition at line 47 of file Interpreter.java.


The documentation for this class was generated from the following file:
  • Interpreter.java