Package weka.classifiers.meta
Class Bagging
- java.lang.Object
-
- All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, AdditionalMeasureProducer, CapabilitiesHandler, OptionHandler, Randomizable, RevisionHandler, TechnicalInformationHandler, WeightedInstancesHandler
public class Bagging extends RandomizableIteratedSingleClassifierEnhancer implements WeightedInstancesHandler, AdditionalMeasureProducer, TechnicalInformationHandler
Class for bagging a classifier to reduce variance. Can do classification and regression depending on the base learner.
For more information, see
Leo Breiman (1996). Bagging predictors. Machine Learning. 24(2):123-140.
BibTeX:
@article{Breiman1996,
   author = {Leo Breiman},
   journal = {Machine Learning},
   number = {2},
   pages = {123-140},
   title = {Bagging predictors},
   volume = {24},
   year = {1996}
}
Valid options are:
-P Size of each bag, as a percentage of the training set size. (default 100)
-O Calculate the out of bag error.
-S <num> Random number seed. (default 1)
-I <num> Number of iterations. (default 10)
-D If set, classifier is run in debug mode and may output additional info to the console
-W Full name of base classifier. (default: weka.classifiers.trees.REPTree)
Options specific to classifier weka.classifiers.trees.REPTree:
-M <minimum number of instances> Set minimum number of instances per leaf (default 2).
-V <minimum variance for split> Set minimum numeric class variance proportion of train variance for split (default 1e-3).
-N <number of folds> Number of folds for reduced error pruning (default 3).
-S <seed> Seed for random data shuffling (default 1).
-P No pruning.
-L Maximum tree depth (default -1, no maximum)
Options after -- are passed to the designated classifier.
- Version:
- $Revision: 11572 $
- Author:
- Eibe Frank (eibe@cs.waikato.ac.nz), Len Trigg (len@reeltwo.com), Richard Kirkby (rkirkby@cs.waikato.ac.nz)
- See Also:
- Serialized Form
-
-
Constructor Summary
Bagging()
Constructor.
-
Method Summary
Modifier and Type, Method, Description
java.lang.String bagSizePercentTipText()
Returns the tip text for this property.
void buildClassifier(Instances data)
Bagging method.
java.lang.String calcOutOfBagTipText()
Returns the tip text for this property.
double[] distributionForInstance(Instance instance)
Calculates the class membership probabilities for the given test instance.
java.util.Enumeration enumerateMeasures()
Returns an enumeration of the additional measure names.
int getBagSizePercent()
Gets the size of each bag, as a percentage of the training set size.
boolean getCalcOutOfBag()
Get whether the out of bag error is calculated.
double getMeasure(java.lang.String additionalMeasureName)
Returns the value of the named measure.
java.lang.String[] getOptions()
Gets the current settings of the Classifier.
java.lang.String getRevision()
Returns the revision string.
TechnicalInformation getTechnicalInformation()
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
java.lang.String globalInfo()
Returns a string describing the classifier.
java.util.Enumeration listOptions()
Returns an enumeration describing the available options.
static void main(java.lang.String[] argv)
Main method for testing this class.
double measureOutOfBagError()
Gets the out of bag error that was calculated as the classifier was built.
void setBagSizePercent(int newBagSizePercent)
Sets the size of each bag, as a percentage of the training set size.
void setCalcOutOfBag(boolean calcOutOfBag)
Set whether the out of bag error is calculated.
void setOptions(java.lang.String[] options)
Parses a given list of options.
java.lang.String toString()
Returns description of the bagged classifier.
-
Methods inherited from class weka.classifiers.RandomizableIteratedSingleClassifierEnhancer
getSeed, seedTipText, setSeed
-
Methods inherited from class weka.classifiers.IteratedSingleClassifierEnhancer
getNumIterations, numIterationsTipText, setNumIterations
-
Methods inherited from class weka.classifiers.SingleClassifierEnhancer
classifierTipText, getCapabilities, getClassifier, setClassifier
-
Methods inherited from class weka.classifiers.Classifier
classifyInstance, debugTipText, forName, getDebug, makeCopies, makeCopy, setDebug
-
-
-
-
Method Detail
-
globalInfo
public java.lang.String globalInfo()
Returns a string describing the classifier.
- Returns:
- a description suitable for displaying in the explorer/experimenter GUI
-
getTechnicalInformation
public TechnicalInformation getTechnicalInformation()
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
- Specified by:
getTechnicalInformation
in interface TechnicalInformationHandler
- Returns:
- the technical information about this class
-
listOptions
public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.
- Specified by:
listOptions
in interface OptionHandler
- Overrides:
listOptions
in class RandomizableIteratedSingleClassifierEnhancer
- Returns:
- an enumeration of all the available options.
-
setOptions
public void setOptions(java.lang.String[] options) throws java.lang.Exception
Parses a given list of options. Valid options are:
-P Size of each bag, as a percentage of the training set size. (default 100)
-O Calculate the out of bag error.
-S <num> Random number seed. (default 1)
-I <num> Number of iterations. (default 10)
-D If set, classifier is run in debug mode and may output additional info to the console
-W Full name of base classifier. (default: weka.classifiers.trees.REPTree)
Options specific to classifier weka.classifiers.trees.REPTree:
-M <minimum number of instances> Set minimum number of instances per leaf (default 2).
-V <minimum variance for split> Set minimum numeric class variance proportion of train variance for split (default 1e-3).
-N <number of folds> Number of folds for reduced error pruning (default 3).
-S <seed> Seed for random data shuffling (default 1).
-P No pruning.
-L Maximum tree depth (default -1, no maximum)
Options after -- are passed to the designated classifier.
- Specified by:
setOptions
in interface OptionHandler
- Overrides:
setOptions
in class RandomizableIteratedSingleClassifierEnhancer
- Parameters:
options
- the list of options as an array of strings
- Throws:
java.lang.Exception
- if an option is not supported
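To illustrate how the -- separator divides an option array, here is a minimal sketch in plain Java. It does not call Weka: the class OptionSplitSketch and the method splitAtSeparator are hypothetical names introduced only for this example, the option values (-P 80, -I 25, -M 5) are made up, and Weka's own parser may treat edge cases differently.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Weka: shows how "--" divides an option array.
final class OptionSplitSketch {

    // Returns {optionsBeforeSeparator, optionsAfterSeparator}.
    // If "--" is absent, the second array is empty.
    static String[][] splitAtSeparator(String[] options) {
        for (int i = 0; i < options.length; i++) {
            if (options[i].equals("--")) {
                return new String[][] {
                    Arrays.copyOfRange(options, 0, i),
                    Arrays.copyOfRange(options, i + 1, options.length)
                };
            }
        }
        return new String[][] { options.clone(), new String[0] };
    }

    public static void main(String[] args) {
        // Illustrative values only; the flags are the ones listed above.
        String[] options = {
            "-P", "80", "-I", "25", "-W", "weka.classifiers.trees.REPTree",
            "--", "-M", "5"
        };
        String[][] parts = splitAtSeparator(options);
        System.out.println("Bagging options:      " + String.join(" ", parts[0]));
        System.out.println("Base learner options: " + String.join(" ", parts[1]));
    }
}
```

In this sketch everything before -- would configure the bagging wrapper, and everything after it would be handed to the base classifier named by -W.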
-
getOptions
public java.lang.String[] getOptions()
Gets the current settings of the Classifier.
- Specified by:
getOptions
in interface OptionHandler
- Overrides:
getOptions
in class RandomizableIteratedSingleClassifierEnhancer
- Returns:
- an array of strings suitable for passing to setOptions
-
bagSizePercentTipText
public java.lang.String bagSizePercentTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
getBagSizePercent
public int getBagSizePercent()
Gets the size of each bag, as a percentage of the training set size.
- Returns:
- the bag size, as a percentage.
-
setBagSizePercent
public void setBagSizePercent(int newBagSizePercent)
Sets the size of each bag, as a percentage of the training set size.
- Parameters:
newBagSizePercent
- the bag size, as a percentage.
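As a rough illustration of what a bag size percentage means, the following plain-Java sketch converts the percentage to an instance count and draws that many items with replacement. It is not Weka's implementation: BootstrapBagSketch, bagSize and drawBag are hypothetical names, and the truncating integer arithmetic is an assumption, since the rounding rule is not stated here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical sketch, not Weka code: bag size as a percentage of the training set.
final class BootstrapBagSketch {

    // Assumption: the count is truncated toward zero.
    static int bagSize(int numInstances, int bagSizePercent) {
        return (int) ((long) numInstances * bagSizePercent / 100);
    }

    // Draws a bag by sampling with replacement, so items may repeat.
    static <T> List<T> drawBag(List<T> data, int bagSizePercent, long seed) {
        Random random = new Random(seed);
        int size = bagSize(data.size(), bagSizePercent);
        List<T> bag = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            bag.add(data.get(random.nextInt(data.size())));
        }
        return bag;
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("a", "b", "c", "d", "e", "f", "g", "h", "i", "j");
        // A 50 percent bag over 10 items holds 5 draws.
        System.out.println(drawBag(data, 50, 1L));
    }
}
```

With sampling with replacement, a 100 percent bag has as many entries as the training set but usually omits some instances, and those omitted instances are what an out-of-bag estimate relies on.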
-
calcOutOfBagTipText
public java.lang.String calcOutOfBagTipText()
Returns the tip text for this property.
- Returns:
- tip text for this property suitable for displaying in the explorer/experimenter GUI
-
setCalcOutOfBag
public void setCalcOutOfBag(boolean calcOutOfBag)
Set whether the out of bag error is calculated.
- Parameters:
calcOutOfBag
- whether to calculate the out of bag error
-
getCalcOutOfBag
public boolean getCalcOutOfBag()
Get whether the out of bag error is calculated.
- Returns:
- whether the out of bag error is calculated
-
measureOutOfBagError
public double measureOutOfBagError()
Gets the out of bag error that was calculated as the classifier was built.
- Returns:
- the out of bag error
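For readers unfamiliar with the term, an out-of-bag error can be sketched as follows: each training instance is scored only by the ensemble members whose bag did not contain it. This plain-Java example is a conceptual illustration for classification with majority voting, not Weka's code. OutOfBagSketch and outOfBagError are hypothetical names, and details such as instance weighting, tie-breaking, and the handling of numeric classes are assumptions that may differ from the library.

```java
// Hypothetical sketch, not Weka code: out-of-bag misclassification rate.
final class OutOfBagSketch {

    // inBag[m][i]     : true if model m was trained on instance i
    // predicted[m][i] : class index model m predicts for instance i
    // actual[i]       : true class index of instance i
    static double outOfBagError(boolean[][] inBag, int[][] predicted,
                                int[] actual, int numClasses) {
        int counted = 0;
        int errors = 0;
        for (int i = 0; i < actual.length; i++) {
            int[] votes = new int[numClasses];
            int voters = 0;
            for (int m = 0; m < inBag.length; m++) {
                if (!inBag[m][i]) {          // only models that never saw instance i vote
                    votes[predicted[m][i]]++;
                    voters++;
                }
            }
            if (voters == 0) {
                continue;                    // instance was in every bag; skip it
            }
            int best = 0;                    // ties go to the lowest class index
            for (int c = 1; c < numClasses; c++) {
                if (votes[c] > votes[best]) {
                    best = c;
                }
            }
            counted++;
            if (best != actual[i]) {
                errors++;
            }
        }
        return counted == 0 ? 0.0 : (double) errors / counted;
    }

    public static void main(String[] args) {
        boolean[][] inBag = { { true, false, false }, { false, true, false } };
        int[][] predicted = { { 0, 1, 0 }, { 1, 1, 1 } };
        int[] actual = { 0, 1, 0 };
        System.out.println(outOfBagError(inBag, predicted, actual, 2));
    }
}
```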
-
enumerateMeasures
public java.util.Enumeration enumerateMeasures()
Returns an enumeration of the additional measure names.
- Specified by:
enumerateMeasures
in interface AdditionalMeasureProducer
- Returns:
- an enumeration of the measure names
-
getMeasure
public double getMeasure(java.lang.String additionalMeasureName)
Returns the value of the named measure.
- Specified by:
getMeasure
in interface AdditionalMeasureProducer
- Parameters:
additionalMeasureName
- the name of the measure to query for its value
- Returns:
- the value of the named measure
- Throws:
java.lang.IllegalArgumentException
- if the named measure is not supported
-
buildClassifier
public void buildClassifier(Instances data) throws java.lang.Exception
Bagging method: builds the bagged classifier from the given training data.
- Overrides:
buildClassifier
in class IteratedSingleClassifierEnhancer
- Parameters:
data
- the training data to be used for generating the bagged classifier.
- Throws:
java.lang.Exception
- if the classifier could not be built successfully
-
distributionForInstance
public double[] distributionForInstance(Instance instance) throws java.lang.Exception
Calculates the class membership probabilities for the given test instance.
- Overrides:
distributionForInstance
in class Classifier
- Parameters:
instance
- the instance to be classified
- Returns:
- predicted class probability distribution
- Throws:
java.lang.Exception
- if distribution can't be computed successfully
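One common way for a bagged ensemble to produce class membership probabilities is to sum the distributions returned by its members and normalize the result. The plain-Java sketch below shows that idea only. It does not use Weka, AverageDistributionSketch and combine are hypothetical names, and whether this class combines its members in exactly this way is an assumption.

```java
// Hypothetical sketch, not Weka code: combine per-model class distributions.
final class AverageDistributionSketch {

    // perModel[m][c] is model m's probability for class c.
    // Returns the summed distribution, normalized to total 1 when possible.
    static double[] combine(double[][] perModel) {
        if (perModel.length == 0) {
            throw new IllegalArgumentException("need at least one model");
        }
        int numClasses = perModel[0].length;
        double[] sums = new double[numClasses];
        for (double[] dist : perModel) {
            for (int c = 0; c < numClasses; c++) {
                sums[c] += dist[c];
            }
        }
        double total = 0.0;
        for (double s : sums) {
            total += s;
        }
        if (total > 0.0) {
            for (int c = 0; c < numClasses; c++) {
                sums[c] /= total;
            }
        }
        return sums;
    }

    public static void main(String[] args) {
        double[][] perModel = { { 0.8, 0.2 }, { 0.4, 0.6 } };
        double[] combined = combine(perModel);
        System.out.println(combined[0] + " " + combined[1]);
    }
}
```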
-
toString
public java.lang.String toString()
Returns description of the bagged classifier.
- Overrides:
toString
in class java.lang.Object
- Returns:
- description of the bagged classifier as a string
-
getRevision
public java.lang.String getRevision()
Returns the revision string.
- Specified by:
getRevision
in interface RevisionHandler
- Overrides:
getRevision
in class Classifier
- Returns:
- the revision
-
main
public static void main(java.lang.String[] argv)
Main method for testing this class.
- Parameters:
argv
- the options
-
-