Class PrincipalComponents

  • All Implemented Interfaces:
    java.io.Serializable, AttributeEvaluator, AttributeTransformer, CapabilitiesHandler, OptionHandler, RevisionHandler

    public class PrincipalComponents
    extends UnsupervisedAttributeEvaluator
    implements AttributeTransformer, OptionHandler
    Performs a principal components analysis and transformation of the data. Use in conjunction with a Ranker search. Dimensionality reduction is accomplished by choosing enough eigenvectors to account for some percentage of the variance in the original data (default 0.95, i.e. 95%). Attribute noise can be filtered by transforming to the PC space, eliminating some of the worst eigenvectors, and then transforming back to the original space.

    Valid options are:

     -D
      Don't normalize input data.
     -R
      Retain enough PC attributes to account 
      for this proportion of variance in the original data.
      (default = 0.95)
     -O
      Transform through the PC space and 
      back to the original space.
     -A
      Maximum number of attributes to include in 
      transformed attribute names. (-1 = include all)
    Version:
    $Revision: 6690 $
    Author:
    Mark Hall (mhall@cs.waikato.ac.nz), Gabi Schmidberger (gabi@cs.waikato.ac.nz)
    See Also:
    Serialized Form
    • Constructor Detail

      • PrincipalComponents

        public PrincipalComponents()
    • Method Detail

      • globalInfo

        public java.lang.String globalInfo()
        Returns a string describing this attribute transformer
        Returns:
        a description of the evaluator suitable for displaying in the explorer/experimenter gui
      • listOptions

        public java.util.Enumeration listOptions()
        Returns an enumeration describing the available options.

        Specified by:
        listOptions in interface OptionHandler
        Returns:
        an enumeration of all the available options.
      • setOptions

        public void setOptions(java.lang.String[] options)
                        throws java.lang.Exception
        Parses a given list of options.

        Valid options are:

         -D
          Don't normalize input data.
         -R
          Retain enough PC attributes to account 
          for this proportion of variance in the original data.
          (default = 0.95)
         -O
          Transform through the PC space and 
          back to the original space.
         -A
          Maximum number of attributes to include in 
          transformed attribute names. (-1 = include all)
        Specified by:
        setOptions in interface OptionHandler
        Parameters:
        options - the list of options as an array of strings
        Throws:
        java.lang.Exception - if an option is not supported
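The option list above can be illustrated with a small stand-alone parser. This is only a sketch of how such flags might be read from a string array. The class `PcaOptionsSketch`, its field names, and the `-A` default of -1 are hypothetical and are not part of Weka, which handles option parsing internally.

```java
final class PcaOptionsSketch {
    boolean skipNormalization = false; // -D
    double varianceCovered = 0.95;     // -R (default taken from the option list)
    boolean transformBack = false;     // -O
    int maxAttributeNames = -1;        // -A (-1 = include all; default assumed here)

    // Walks the array once; flags that take a value consume the next element.
    static PcaOptionsSketch parse(String[] options) {
        PcaOptionsSketch parsed = new PcaOptionsSketch();
        for (int i = 0; i < options.length; i++) {
            switch (options[i]) {
                case "-D":
                    parsed.skipNormalization = true;
                    break;
                case "-O":
                    parsed.transformBack = true;
                    break;
                case "-R":
                    parsed.varianceCovered = Double.parseDouble(valueAfter(options, i));
                    i++;
                    break;
                case "-A":
                    parsed.maxAttributeNames = Integer.parseInt(valueAfter(options, i));
                    i++;
                    break;
                default:
                    throw new IllegalArgumentException("Unsupported option: " + options[i]);
            }
        }
        return parsed;
    }

    private static String valueAfter(String[] options, int flagIndex) {
        if (flagIndex + 1 >= options.length) {
            throw new IllegalArgumentException(options[flagIndex] + " requires a value");
        }
        return options[flagIndex + 1];
    }

    public static void main(String[] args) {
        PcaOptionsSketch o = parse(new String[] {"-R", "0.9", "-O", "-A", "3"});
        System.out.println(o.varianceCovered + " " + o.transformBack + " " + o.maxAttributeNames);
    }
}
```

With the real class, the same array of strings would be passed to setOptions() instead.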
      • centerDataTipText

        public java.lang.String centerDataTipText()
        Returns the tip text for this property
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • setCenterData

        public void setCenterData(boolean center)
        Set whether to center (rather than standardize) the data. If set to true then PCA is computed from the covariance rather than correlation matrix.
        Parameters:
        center - true if the data is to be centered rather than standardized
      • getCenterData

        public boolean getCenterData()
        Get whether to center (rather than standardize) the data. If true then PCA is computed from the covariance rather than correlation matrix.
        Returns:
        true if the data is to be centered rather than standardized.
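The difference between centering and standardizing can be seen on two small columns. The sketch below is independent of Weka and assumes sample statistics (division by n - 1). The class name and data are made up for illustration. It shows that the covariance depends on the scale of the columns, while the covariance of standardized columns equals their correlation.

```java
final class CenterVsStandardizeSketch {
    static double mean(double[] v) {
        double sum = 0.0;
        for (double x : v) sum += x;
        return sum / v.length;
    }

    // Sample covariance: center both columns, then average the products over n - 1.
    static double covariance(double[] a, double[] b) {
        double meanA = mean(a), meanB = mean(b), sum = 0.0;
        for (int i = 0; i < a.length; i++) sum += (a[i] - meanA) * (b[i] - meanB);
        return sum / (a.length - 1);
    }

    static double stdDev(double[] v) {
        return Math.sqrt(covariance(v, v));
    }

    static double correlation(double[] a, double[] b) {
        return covariance(a, b) / (stdDev(a) * stdDev(b));
    }

    // Standardizing = centering, then dividing by the standard deviation.
    static double[] standardize(double[] v) {
        double m = mean(v), s = stdDev(v);
        double[] out = new double[v.length];
        for (int i = 0; i < v.length; i++) out[i] = (v[i] - m) / s;
        return out;
    }

    public static void main(String[] args) {
        double[] a = {1, 2, 3, 4};
        double[] b = {200, 100, 400, 300}; // same shape as {2, 1, 4, 3}, larger scale
        System.out.println("covariance:  " + covariance(a, b));
        System.out.println("correlation: " + correlation(a, b));
        System.out.println("covariance of standardized columns: "
            + covariance(standardize(a), standardize(b)));
    }
}
```

When attributes are measured on very different scales, a covariance-based PCA tends to be dominated by the large-scale attributes, which is the usual reason to prefer standardizing.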
      • varianceCoveredTipText

        public java.lang.String varianceCoveredTipText()
        Returns the tip text for this property
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • setVarianceCovered

        public void setVarianceCovered(double vc)
        Sets the amount of variance to account for when retaining principal components
        Parameters:
        vc - the proportion of total variance to account for
      • getVarianceCovered

        public double getVarianceCovered()
        Gets the proportion of total variance to account for when retaining principal components
        Returns:
        the proportion of variance to account for
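How a variance proportion translates into a number of retained components can be sketched as follows. The code assumes the eigenvalues are already sorted in descending order. The class name and the eigenvalues are invented for illustration, and this is not Weka's own implementation.

```java
final class VarianceCoveredSketch {
    // Returns how many leading components are needed to reach the requested
    // proportion of total variance.
    static int componentsToRetain(double[] eigenvaluesDescending, double varianceCovered) {
        double total = 0.0;
        for (double ev : eigenvaluesDescending) total += ev;

        double cumulative = 0.0;
        for (int i = 0; i < eigenvaluesDescending.length; i++) {
            cumulative += eigenvaluesDescending[i];
            if (cumulative / total >= varianceCovered) return i + 1;
        }
        return eigenvaluesDescending.length;
    }

    public static void main(String[] args) {
        double[] eigenvalues = {2.0, 1.0, 0.5, 0.5}; // made-up values, total = 4.0
        System.out.println(componentsToRetain(eigenvalues, 0.95)); // 4
        System.out.println(componentsToRetain(eigenvalues, 0.75)); // 2
    }
}
```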
      • maximumAttributeNamesTipText

        public java.lang.String maximumAttributeNamesTipText()
        Returns the tip text for this property
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • setMaximumAttributeNames

        public void setMaximumAttributeNames(int m)
        Sets maximum number of attributes to include in transformed attribute names.
        Parameters:
        m - the maximum number of attributes
      • getMaximumAttributeNames

        public int getMaximumAttributeNames()
        Gets maximum number of attributes to include in transformed attribute names.
        Returns:
        the maximum number of attributes
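To illustrate what limiting the attributes in a transformed attribute name could look like, the sketch below builds a name from the original attribute names and their loadings, keeping only the largest terms. The naming format, the ordering by absolute loading, and the trailing "..." are assumptions made for this example. This page does not specify the exact format that PrincipalComponents produces.

```java
import java.util.Arrays;
import java.util.Locale;

final class ComponentNameSketch {
    // Builds a name such as "-0.700b+0.600c..." from loadings and attribute names.
    // maxNames < 0 means "include all", mirroring the -A option's -1 convention.
    static String componentName(String[] names, double[] loadings, int maxNames) {
        Integer[] order = new Integer[names.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        // Largest absolute loading first.
        Arrays.sort(order, (x, y) -> Double.compare(Math.abs(loadings[y]), Math.abs(loadings[x])));

        int limit = (maxNames < 0) ? order.length : Math.min(maxNames, order.length);
        StringBuilder name = new StringBuilder();
        for (int k = 0; k < limit; k++) {
            name.append(String.format(Locale.ROOT, "%+.3f%s", loadings[order[k]], names[order[k]]));
        }
        if (limit < order.length) name.append("..."); // mark the truncation
        return name.toString();
    }

    public static void main(String[] args) {
        String[] names = {"a", "b", "c"};
        double[] loadings = {0.2, -0.7, 0.6};
        System.out.println(componentName(names, loadings, 2));
        System.out.println(componentName(names, loadings, -1));
    }
}
```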
      • transformBackToOriginalTipText

        public java.lang.String transformBackToOriginalTipText()
        Returns the tip text for this property
        Returns:
        tip text for this property suitable for displaying in the explorer/experimenter gui
      • setTransformBackToOriginal

        public void setTransformBackToOriginal(boolean b)
        Sets whether the data should be transformed back to the original space
        Parameters:
        b - true if the data should be transformed back to the original space
      • getTransformBackToOriginal

        public boolean getTransformBackToOriginal()
        Gets whether the data is to be transformed back to the original space.
        Returns:
        true if the data is to be transformed back to the original space
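Transforming through the PC space and back can be sketched in two dimensions. The example assumes the input vector is already centered (or standardized) and that the eigenvectors are orthonormal and stored as rows. The class name, eigenvectors, and data are illustrative only. Keeping every eigenvector reproduces the input, while dropping the last one removes the part of the signal along it.

```java
final class TransformBackSketch {
    // Scores in PC space: dot product of x with each retained eigenvector.
    static double[] toPcSpace(double[] x, double[][] retainedEigenvectors) {
        double[] scores = new double[retainedEigenvectors.length];
        for (int k = 0; k < retainedEigenvectors.length; k++) {
            for (int j = 0; j < x.length; j++) {
                scores[k] += retainedEigenvectors[k][j] * x[j];
            }
        }
        return scores;
    }

    // Back to the original space: weighted sum of the retained eigenvectors.
    static double[] backToOriginalSpace(double[] scores, double[][] retainedEigenvectors, int dim) {
        double[] x = new double[dim];
        for (int k = 0; k < retainedEigenvectors.length; k++) {
            for (int j = 0; j < dim; j++) {
                x[j] += scores[k] * retainedEigenvectors[k][j];
            }
        }
        return x;
    }

    public static void main(String[] args) {
        double s = Math.sqrt(0.5);
        double[][] all = {{s, s}, {s, -s}}; // two orthonormal eigenvectors
        double[][] top = {{s, s}};          // only the first one retained
        double[] x = {3.0, 1.0};

        double[] exact = backToOriginalSpace(toPcSpace(x, all), all, 2);
        double[] filtered = backToOriginalSpace(toPcSpace(x, top), top, 2);
        System.out.println(java.util.Arrays.toString(exact));    // approximately [3.0, 1.0]
        System.out.println(java.util.Arrays.toString(filtered)); // approximately [2.0, 2.0]
    }
}
```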
      • getOptions

        public java.lang.String[] getOptions()
        Gets the current settings of PrincipalComponents
        Specified by:
        getOptions in interface OptionHandler
        Returns:
        an array of strings suitable for passing to setOptions()
      • buildEvaluator

        public void buildEvaluator(Instances data)
                            throws java.lang.Exception
        Initializes principal components and performs the analysis
        Specified by:
        buildEvaluator in class ASEvaluation
        Parameters:
        data - the instances to analyse/transform
        Throws:
        java.lang.Exception - if analysis fails
      • transformedHeader

        public Instances transformedHeader()
                                    throws java.lang.Exception
        Returns just the header for the transformed data (i.e. an empty set of instances). This lets AttributeSelection determine the structure of the transformed data without having to obtain all of the transformed data through transformedData().
        Specified by:
        transformedHeader in interface AttributeTransformer
        Returns:
        the header of the transformed data.
        Throws:
        java.lang.Exception - if the header of the transformed data can't be determined.
      • transformedData

        public Instances transformedData(Instances data)
                                  throws java.lang.Exception
        Gets the transformed training data.
        Specified by:
        transformedData in interface AttributeTransformer
        Returns:
        the transformed training data
        Throws:
        java.lang.Exception - if transformed data can't be returned
      • evaluateAttribute

        public double evaluateAttribute(int att)
                                 throws java.lang.Exception
        Evaluates the merit of a transformed attribute. This is defined to be 1 minus the cumulative variance explained. Merit can't be meaningfully evaluated if the data is to be transformed back to the original space.
        Specified by:
        evaluateAttribute in interface AttributeEvaluator
        Parameters:
        att - the attribute to be evaluated
        Returns:
        the merit of a transformed attribute
        Throws:
        java.lang.Exception - if attribute can't be evaluated
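The merit definition above (1 minus the cumulative variance explained) can be worked through on a small example. The sketch assumes eigenvalues sorted in descending order. The class name and values are invented, and the code is not taken from Weka.

```java
final class MeritSketch {
    // merit[i] = 1 - (variance explained by components 0..i) / (total variance)
    static double[] merits(double[] eigenvaluesDescending) {
        double total = 0.0;
        for (double ev : eigenvaluesDescending) total += ev;

        double[] merit = new double[eigenvaluesDescending.length];
        double cumulative = 0.0;
        for (int i = 0; i < eigenvaluesDescending.length; i++) {
            cumulative += eigenvaluesDescending[i];
            merit[i] = 1.0 - cumulative / total;
        }
        return merit;
    }

    public static void main(String[] args) {
        double[] eigenvalues = {2.0, 1.0, 0.5, 0.5}; // made-up values, total = 4.0
        System.out.println(java.util.Arrays.toString(merits(eigenvalues)));
        // [0.5, 0.25, 0.125, 0.0]
    }
}
```

Under this definition the merit decreases from one component to the next, so the first component always receives the highest value.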
      • toString

        public java.lang.String toString()
        Returns a description of this attribute transformer
        Overrides:
        toString in class java.lang.Object
        Returns:
        a String describing this attribute transformer
      • convertInstance

        public Instance convertInstance(Instance instance)
                                 throws java.lang.Exception
        Transform an instance in original (unnormalized) format. Convert back to the original space if requested.
        Specified by:
        convertInstance in interface AttributeTransformer
        Parameters:
        instance - an instance in the original (unnormalized) format
        Returns:
        a transformed instance
        Throws:
        java.lang.Exception - if instance can't be transformed
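Converting a single unnormalized instance can be pictured as two steps: rescale the raw values with statistics learned from the training data, then project onto the eigenvectors. The sketch below assumes standardization with stored means and standard deviations and orthonormal eigenvectors stored as rows. All names and numbers are hypothetical, and the optional conversion back to the original space is left out.

```java
final class ConvertInstanceSketch {
    static double[] convert(double[] raw, double[] means, double[] stdDevs, double[][] eigenvectors) {
        // Step 1: standardize the raw values using training-set statistics.
        double[] standardized = new double[raw.length];
        for (int j = 0; j < raw.length; j++) {
            standardized[j] = (raw[j] - means[j]) / stdDevs[j];
        }
        // Step 2: project onto each eigenvector to obtain the component scores.
        double[] scores = new double[eigenvectors.length];
        for (int k = 0; k < eigenvectors.length; k++) {
            for (int j = 0; j < raw.length; j++) {
                scores[k] += eigenvectors[k][j] * standardized[j];
            }
        }
        return scores;
    }

    public static void main(String[] args) {
        double s = Math.sqrt(0.5);
        double[][] eigenvectors = {{s, s}, {s, -s}};
        double[] means = {3.0, 10.0};
        double[] stdDevs = {2.0, 10.0};
        double[] raw = {5.0, 30.0}; // standardizes to {1.0, 2.0}
        System.out.println(java.util.Arrays.toString(convert(raw, means, stdDevs, eigenvectors)));
    }
}
```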
      • main

        public static void main(java.lang.String[] argv)
        Main method for testing this class
        Parameters:
        argv - should contain the command line arguments to the evaluator/transformer (see AttributeSelection)