java.lang.Object
  classifier.WekaSVM
public class WekaSVM
This class implements the Classifier interface for use with the SMO (SVM) implementation of the WEKA(TM) machine learning library.
Nested Class Summary |
---|
Nested classes/interfaces inherited from interface classifier.Classifier |
---|
Classifier.DATA_TYPE |
Constructor Summary | |
---|---|
WekaSVM(org.apache.log4j.Logger logger,
ClassificationAction ca)
Constructor; only initializes the necessary variables. |
Method Summary | |
---|---|
void |
applyAttributeFilter(java.util.List<java.lang.Integer> attributeFilter,
int maxNumFeatures,
java.io.File toBeFilteredFile)
After feature selection has produced a filter, this filter can be applied to the feature files in order to optimize the SVMs. |
boolean |
buildClassifier()
This method builds an SVM model file from a file with training examples. |
java.lang.Double |
classify_single_instance_fast(double[] features)
Uses the trained classifier to classify a single instance of data quickly, without resorting to string parsing (the recommended method for these classifications). |
java.lang.String |
classify_single_instance(java.lang.String instance)
Use the trained classifier to classify a single instance of data, defined by the instance parameter. |
void |
classify(java.lang.String testFile,
java.lang.String outputFile)
Uses the SVM to classify data and writes the output to an output file. If the SVM has not been trained, the model file must be set beforehand. |
CrossValidationResult |
crossValidate(int n,
int maxPosTrain,
int maxNegTrain)
Performs a cross-validation on the training file. |
java.lang.String |
generateFeatureString(java.util.List<java.lang.Double> data,
Classifier.DATA_TYPE dataType)
Creates a string of features for a single feature vector. |
java.lang.String |
getFileExtension()
|
java.lang.String |
getModelFile()
|
int[] |
getPosNegExamplesInFile(java.io.File file)
Returns the number of positive and negative examples in a training file. |
double |
getSigmoid_A()
|
double |
getSigmoid_B()
|
weka.core.Instances |
getTrainingFileInstances()
Returns the instances used for training the SMO. |
boolean |
loadClassifier()
Sets the model file and, depending on the implementation, may attempt to build the SVM from this model file. |
boolean |
loadClassifier(java.lang.String svmFile)
Sets the model file and, depending on the implementation, may attempt to build the SVM from this model file. |
SMO |
loadModel(java.lang.String fileIn)
This method loads the SMO Java object from a file. |
java.io.File |
mergeFeatureFiles(java.io.File tempFilePositive,
java.io.File tempFileNegative)
Merges the feature files (one with positive training features, one with negative training features) to make the actual training file. |
java.util.List<ValPosCombination> |
performAttributeEvaluation(boolean sort,
weka.attributeSelection.AttributeEvaluator evaluator)
Performs feature selection by evaluating different attributes. |
java.lang.String[] |
prepareCrossvalidationCommand(int fold,
java.lang.String fileIn,
java.lang.String fileOut)
Creates an array with string values, to be parsed by the implementation of the classifier. |
java.lang.String[] |
prepareTrainingCommand(java.lang.String fileIn,
java.lang.String fileOut)
Creates an array with string values, to be parsed by the classifier. |
void |
saveModel(SMO smo,
java.lang.String fileOut)
This method saves the SMO Java object to a file in the filesystem. |
void |
setModelFile(java.lang.String svmModelFile)
Changes the model for the classifier by changing the name of the model file. |
void |
setOptions(ClassifierOptions options)
Changes the various options of this classifier. |
void |
setSigmoid_A(double sigmoid_A)
Changes the sigmoid variable A (see the documentation on restructuring the output by means of sigmoid curves). |
void |
setSigmoid_B(double sigmoid_B)
Changes the sigmoid variable B (see the documentation on restructuring the output by means of sigmoid curves). |
java.lang.String |
to_genomeview_output(int id,
java.lang.Double distance,
int funsite_start,
int funsite_stop,
java.lang.String classification_name)
Method which produces a string that can be used by the GenomeView program. |
java.lang.String |
to_splice_machine_output(java.lang.Double distance,
int funsite,
int increase,
java.lang.String classification_name)
Method which produces a string that is similar to the output provided by Splicemachine (with the provided results). |
java.lang.String |
to_splice_machine_output(java.lang.String classification_result,
int funsite,
int increase,
java.lang.String classification_name)
Method which produces a string that is similar to that of the Splicemachine program, according to the provided results. |
java.io.File |
writeTemporaryFeatureData(java.lang.String tempFileName,
boolean forward_strand,
java.util.List<java.util.List<java.lang.Double>> data,
Classifier.DATA_TYPE dataType)
This method writes the temporary feature data (all the features extracted from one sequence, each feature in a different list) to a file. |
java.io.File |
writeTemporaryFeatureData(java.lang.String tempFileName,
java.util.List<java.util.List<java.lang.Double>> data,
Classifier.DATA_TYPE dataType)
This method writes the temporary feature data (all the features extracted from one sequence, each feature in a different list) to a file. |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public WekaSVM(org.apache.log4j.Logger logger, ClassificationAction ca)
logger
- The logging facility.
ca
- The ClassificationAction to which this classifier belongs (every
classifier belongs to a classification action).
Method Detail |
---|
public java.lang.String getFileExtension()
getFileExtension
in interface Classifier
public CrossValidationResult crossValidate(int n, int maxPosTrain, int maxNegTrain)
crossValidate
in interface Classifier
n
- The number of folds in the cross-validation. Common values are 2, 5 and 10.
maxPosTrain
- The maximum number of positive training examples during the
training phase of the cross-validation.
maxNegTrain
- The maximum number of negative training examples during the
training phase of the cross-validation.
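To make the role of n concrete, the sketch below splits example indices into n folds; each fold serves once as the test set while the remaining examples form the training set. It illustrates n-fold cross-validation in general and is not taken from WekaSVM. The class and method names are hypothetical, and the maxPosTrain and maxNegTrain caps are left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of the documented API. */
final class FoldSplitSketch {

    /** Returns, for each of the n folds, the indices of the examples held out for testing. */
    static List<List<Integer>> testFolds(int numExamples, int n) {
        if (n < 2 || n > numExamples) {
            throw new IllegalArgumentException("need 2 <= n <= numExamples");
        }
        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < n; f++) {
            folds.add(new ArrayList<>());
        }
        // Round-robin assignment keeps fold sizes within one example of each other.
        for (int i = 0; i < numExamples; i++) {
            folds.get(i % n).add(i);
        }
        return folds;
    }

    /** Returns the training indices for one fold: every example not in that fold's test set. */
    static List<Integer> trainingIndices(int numExamples, int n, int fold) {
        List<Integer> train = new ArrayList<>();
        for (int i = 0; i < numExamples; i++) {
            if (i % n != fold) {
                train.add(i);
            }
        }
        return train;
    }

    public static void main(String[] args) {
        List<List<Integer>> folds = testFolds(10, 5);
        for (int f = 0; f < folds.size(); f++) {
            System.out.println("fold " + f + " test=" + folds.get(f)
                    + " train=" + trainingIndices(10, 5, f));
        }
    }
}
```

In practice the examples would usually be shuffled before the split so that each fold contains a mix of positive and negative examples.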
public java.lang.String[] prepareCrossvalidationCommand(int fold, java.lang.String fileIn, java.lang.String fileOut)
prepareCrossvalidationCommand
in interface Classifier
fold
- The number of folds in the cross-validation.
fileIn
- The file containing the training features.
fileOut
- The file for output (if applicable).
public java.lang.String[] prepareTrainingCommand(java.lang.String fileIn, java.lang.String fileOut)
prepareTrainingCommand
in interface Classifier
fileIn
- The name of the file containing the extracted features.
fileOut
- The name of the file to which the output should be written (if applicable).
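The command arrays returned by prepareTrainingCommand and prepareCrossvalidationCommand are not shown in this documentation. As a rough sketch only, arrays in the style of WEKA's general command-line options could look like the following, where `-t` names the training file, `-d` the file the model is written to, and `-x` the number of cross-validation folds. The class name and the file names are hypothetical, and the real implementation may pass different or additional options.

```java
import java.util.Arrays;

/** Hypothetical illustration; the arrays WekaSVM actually builds are not documented here. */
final class CommandSketch {

    /** Training: read examples from fileIn (-t) and save the model to fileOut (-d). */
    static String[] trainingCommand(String fileIn, String fileOut) {
        return new String[] {"-t", fileIn, "-d", fileOut};
    }

    /** Cross-validation: read examples from fileIn (-t) and use the given number of folds (-x). */
    static String[] crossvalidationCommand(int fold, String fileIn) {
        return new String[] {"-t", fileIn, "-x", Integer.toString(fold)};
    }

    public static void main(String[] args) {
        // File names below are placeholders.
        System.out.println(Arrays.toString(trainingCommand("train.arff", "model.smo")));
        System.out.println(Arrays.toString(crossvalidationCommand(10, "train.arff")));
    }
}
```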
public boolean buildClassifier()
buildClassifier
in interface Classifier
public void saveModel(SMO smo, java.lang.String fileOut) throws java.lang.Exception
smo
- The SMO to be stored.
fileOut
- The name of the file in which the SMO should be stored.
java.lang.Exception
- Exceptions might occur, but the caller should handle them.
public SMO loadModel(java.lang.String fileIn) throws java.lang.Exception
fileIn
- The name of the file containing the SMO object.
java.lang.Exception
- Should be handled by the caller.
public java.io.File mergeFeatureFiles(java.io.File tempFilePositive, java.io.File tempFileNegative)
mergeFeatureFiles
in interface Classifier
tempFilePositive
- The name of the file with features for positive training.
tempFileNegative
- The name of the file with features for negative training.
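How the two files are combined is not described here. A minimal sketch, assuming both files are plain line-oriented text with one example per line and no header section, is to write the positive lines followed by the negative lines into a new file. The class name, method name, and explicit target path are hypothetical; a format with a header (such as ARFF) would need the header written only once.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of the documented API. */
final class MergeSketch {

    /** Writes all lines of the positive file, then all lines of the negative file, to target. */
    static Path merge(Path positive, Path negative, Path target) throws IOException {
        List<String> lines = new ArrayList<>(Files.readAllLines(positive, StandardCharsets.UTF_8));
        lines.addAll(Files.readAllLines(negative, StandardCharsets.UTF_8));
        return Files.write(target, lines, StandardCharsets.UTF_8);
    }
}
```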
public java.io.File writeTemporaryFeatureData(java.lang.String tempFileName, boolean forward_strand, java.util.List<java.util.List<java.lang.Double>> data, Classifier.DATA_TYPE dataType)
writeTemporaryFeatureData
in interface Classifier
tempFileName
- The name of the file to which the data should be written.
forward_strand
- Indicates whether or not the data is located on the forward strand.
data
- The feature data, stored in a nested linked list.
dataType
- The type of data (see the Classifier.DATA_TYPE enum).
public java.io.File writeTemporaryFeatureData(java.lang.String tempFileName, java.util.List<java.util.List<java.lang.Double>> data, Classifier.DATA_TYPE dataType)
writeTemporaryFeatureData
in interface Classifier
tempFileName
- The name of the file to which the data should be written.
data
- The feature data.
dataType
- The type of data (see the Classifier.DATA_TYPE enum).
public java.lang.String generateFeatureString(java.util.List<java.lang.Double> data, Classifier.DATA_TYPE dataType)
generateFeatureString
in interface Classifier
data
- The feature data.
dataType
- The data type (positive, negative, unclassified).
public boolean loadClassifier()
loadClassifier
in interface Classifier
public boolean loadClassifier(java.lang.String svmFile)
loadClassifier
in interface Classifier
svmFile
- The name of the model file.
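saveModel and loadModel are described as writing and reading the SMO Java object itself, which suggests standard Java object serialization. The following round-trip sketch assumes that mechanism. ToyModel is a hypothetical stand-in for SMO so that the example runs without WEKA, and the class name ModelIoSketch is likewise invented for illustration.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical helper, not part of the documented API. */
final class ModelIoSketch {

    /** Stand-in for a trained model; used only so the sketch runs without WEKA. */
    static final class ToyModel implements Serializable {
        private static final long serialVersionUID = 1L;
        final double[] weights;
        final double bias;

        ToyModel(double[] weights, double bias) {
            this.weights = weights;
            this.bias = bias;
        }
    }

    /** Serializes the model object to the named file. */
    static void saveModel(Serializable model, String fileOut) throws IOException {
        try (ObjectOutputStream out =
                new ObjectOutputStream(Files.newOutputStream(Paths.get(fileOut)))) {
            out.writeObject(model);
        }
    }

    /** Reads a previously serialized object back from the named file. */
    static Object loadModel(String fileIn) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                new ObjectInputStream(Files.newInputStream(Paths.get(fileIn)))) {
            return in.readObject();
        }
    }
}
```

As with the documented methods, exceptions are left to the caller, who has to cast the loaded object back to the expected type.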
public void classify(java.lang.String testFile, java.lang.String outputFile)
classify
in interface Classifier
testFile
- The name of the file that contains the extracted features;
the output directory is expected to be part of the filename.
outputFile
- The name of the output file; the output directory is
expected to be part of the filename.
public java.lang.String classify_single_instance(java.lang.String instance)
classify_single_instance
in interface Classifier
instance
- The instance (consisting of extracted features) to be classified.
public java.lang.Double classify_single_instance_fast(double[] features)
classify_single_instance_fast
in interface Classifier
features
- The features that make up the instance that needs to be classified.
public java.lang.String to_splice_machine_output(java.lang.String classification_result, int funsite, int increase, java.lang.String classification_name)
to_splice_machine_output
in interface Classifier
classification_result
- The result of the classification, in string format.
funsite
- The location of the functional site in the sequence.
increase
- An extra increase for the output (see documentation).
classification_name
- The name for this type of functional site.
public java.lang.String to_splice_machine_output(java.lang.Double distance, int funsite, int increase, java.lang.String classification_name)
to_splice_machine_output
in interface Classifier
distance
- A value (distance to the hyperplane for SVMs) that is used to give a score to
a certain functional site.
funsite
- The location of the functional site in the sequence.
increase
- An extra increase for the location of the functional site
in the output (see documentation).
classification_name
- The name for this type of evaluated functional site.
public java.lang.String to_genomeview_output(int id, java.lang.Double distance, int funsite_start, int funsite_stop, java.lang.String classification_name)
to_genomeview_output
in interface Classifier
id
- A unique id for the functional site in the sequence.
distance
- A value (distance to the hyperplane for SVMs) that is used to give a score to
a certain functional site.
funsite_start
- The start of the functional site in the sequence.
funsite_stop
- The stop of the functional site in the sequence.
classification_name
- The name for this type of evaluated functional site.
public int[] getPosNegExamplesInFile(java.io.File file)
getPosNegExamplesInFile
in interface Classifier
file
- The training file.
public java.lang.String getModelFile()
getModelFile
in interface Classifier
public void setModelFile(java.lang.String svmModelFile)
setModelFile
in interface Classifier
svmModelFile
- The name of the file containing the new model.
public void setOptions(ClassifierOptions options)
setOptions
in interface Classifier
options
- The new set of options for this classifier.
public java.util.List<ValPosCombination> performAttributeEvaluation(boolean sort, weka.attributeSelection.AttributeEvaluator evaluator)
performAttributeEvaluation
in interface Classifier
sort
- Whether to sort the resulting ValPosCombinations according to their values.
evaluator
- The evaluator used for performing the evaluation of the attributes.
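The internals of ValPosCombination are not documented here; the name suggests a pairing of an evaluation value with an attribute position. Under that assumption, the sketch below pairs each attribute score with its position, optionally sorts the pairs, and keeps the positions of the best few attributes, which is the kind of list applyAttributeFilter takes as its filter. ValPos, the class name, and the choice of descending order are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper, not part of the documented API. */
final class AttributeRankingSketch {

    /** Hypothetical value/position pair, loosely modelled on the name ValPosCombination. */
    static final class ValPos {
        final double value;
        final int position;

        ValPos(double value, int position) {
            this.value = value;
            this.position = position;
        }
    }

    /** Pairs every attribute score with its position; optionally sorts by descending score. */
    static List<ValPos> rank(double[] scores, boolean sort) {
        List<ValPos> result = new ArrayList<>();
        for (int i = 0; i < scores.length; i++) {
            result.add(new ValPos(scores[i], i));
        }
        if (sort) {
            result.sort(Comparator.comparingDouble((ValPos vp) -> vp.value).reversed());
        }
        return result;
    }

    /** Positions of the first maxNumFeatures entries, usable as a list of attributes to preserve. */
    static List<Integer> topPositions(List<ValPos> ranked, int maxNumFeatures) {
        List<Integer> keep = new ArrayList<>();
        for (int i = 0; i < ranked.size() && i < maxNumFeatures; i++) {
            keep.add(ranked.get(i).position);
        }
        return keep;
    }
}
```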
public weka.core.Instances getTrainingFileInstances()
getTrainingFileInstances
in interface Classifier
public void applyAttributeFilter(java.util.List<java.lang.Integer> attributeFilter, int maxNumFeatures, java.io.File toBeFilteredFile)
applyAttributeFilter
in interface Classifier
attributeFilter
- The filter: a list with the numbers of the attributes
that MUST be preserved.
maxNumFeatures
- The maximum number of features to be used by the classifier.
toBeFilteredFile
- The file containing the various features (stored in a classifier-dependent
way) which should be filtered by the given attribute filter.
public double getSigmoid_A()
getSigmoid_A
in interface Classifier
public void setSigmoid_A(double sigmoid_A)
setSigmoid_A
in interface Classifier
sigmoid_A
- The new sigmoid variable A.
public double getSigmoid_B()
getSigmoid_B
in interface Classifier
public void setSigmoid_B(double sigmoid_B)
setSigmoid_B
in interface Classifier
sigmoid_B
- The new sigmoid variable B
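This page does not state the formula behind the sigmoid variables A and B. A common way to map an SVM output (distance to the hyperplane) onto a probability with two such parameters is Platt scaling, P = 1 / (1 + exp(A * f + B)). The sketch below assumes that form; whether WekaSVM uses exactly this formula is not confirmed here, and the class and method names are hypothetical.

```java
/** Hypothetical helper, not part of the documented API. */
final class SigmoidSketch {

    /**
     * Maps a decision value onto (0, 1) with a two-parameter sigmoid.
     * Assumes the Platt-scaling form 1 / (1 + exp(A * f + B)).
     */
    static double toProbability(double distance, double sigmoidA, double sigmoidB) {
        return 1.0 / (1.0 + Math.exp(sigmoidA * distance + sigmoidB));
    }

    public static void main(String[] args) {
        // With a negative A, larger positive distances give probabilities closer to 1.
        System.out.println(toProbability(2.0, -2.0, 0.0));
        System.out.println(toProbability(-2.0, -2.0, 0.0));
    }
}
```

In this form A is normally negative, so that examples far on the positive side of the hyperplane receive probabilities near 1.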