Supervised Learning: Practical Decision Tree Classifier - Apple / Orange Classifier
Supervised Learning Recipe:
1. Collect training data: these are examples of the problem we want to solve.
For our problem, we are going to write a function that classifies a piece of fruit.
It will take a description of the fruit as input and predict whether it is an apple or an orange as output, based on features such as texture (bumpy or smooth) and weight.
To collect our training data, imagine we head out to an orchard.
We will look at different apples and oranges and write down measurements that
describe them in a table.
In machine learning these measurements are called features. To keep things simple,
we use just two:
how much each fruit weighs in grams, and its texture, which can be bumpy or smooth.
A good feature makes it easy to discriminate between different types of fruit.
Each row in our training data is an example; it describes one piece of fruit.
The last column is called the label. It identifies the type of fruit in each row, and there are just two possibilities: apples and oranges.
The whole table is our training data. Think of it as all the examples we want the classifier to learn from. Generally, the more training data you have, the better a classifier you can create.
import sklearn
# We will use two variables, features and labels: features contains the first two columns and labels contains the last column.
features = [[140, 'smooth'], [150, 'smooth'], [170, 'bumpy'], [180, 'bumpy']]
labels = ['Apple', 'Apple', 'Orange', 'Orange']
# You can think of features as the input and labels as the output.
# Now we replace the strings with numbers: 1 for smooth and 0 for bumpy,
# 0 for Apple and 1 for Orange.
features = [[140, 1], [150, 1], [170, 0], [180, 0]]
labels = [0, 0, 1, 1]
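The replacement of strings by numbers can also be done in code rather than by hand. The following is a minimal sketch; the names TEXTURE_CODES, LABEL_CODES, raw_features and raw_labels are made up for illustration, and the codes follow the convention in the comments above (1 for smooth, 0 for bumpy, 0 for Apple, 1 for Orange).

```python
# Hypothetical lookup tables; the codes follow the convention stated above.
TEXTURE_CODES = {"smooth": 1, "bumpy": 0}
LABEL_CODES = {"Apple": 0, "Orange": 1}

raw_features = [[140, "smooth"], [150, "smooth"], [170, "bumpy"], [180, "bumpy"]]
raw_labels = ["Apple", "Apple", "Orange", "Orange"]

# Keep the weight as it is and swap the texture string for its numeric code.
features = [[weight, TEXTURE_CODES[texture]] for weight, texture in raw_features]
# Swap each fruit name for its numeric code.
labels = [LABEL_CODES[name] for name in raw_labels]

print(features)  # [[140, 1], [150, 1], [170, 0], [180, 0]]
print(labels)    # [0, 0, 1, 1]
```

An unknown texture or fruit name raises a KeyError here, which is usually preferable to silently encoding it wrong.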
Now we use this data (features and labels) to train a classifier. There are a number of classifiers;
we start with a decision tree classifier.
To use the decision tree classifier, first import tree:
from sklearn import tree
clf = tree.DecisionTreeClassifier()
At this point it is just an empty box of rules: the classifier has not been trained on any data yet, so it doesn't know anything about apples and oranges.
To train it we need a learning algorithm. If a classifier is a box of rules, then you can think of the learning algorithm as the procedure that creates them.
It does that by finding patterns in your training data.
For example, it might notice that oranges tend to weigh more, so it will create a rule saying that a heavier fruit is more likely to be an orange.
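To make the idea of a learning algorithm concrete, here is a toy sketch that learns a single weight rule from the example data. It is not what scikit-learn does internally; the functions learn_weight_rule and apply_rule are hypothetical, and the sketch assumes every apple in the training data is lighter than every orange.

```python
def learn_weight_rule(weights, labels):
    """Toy learner: find a weight threshold separating apples (0) from oranges (1)."""
    heaviest_apple = max(w for w, y in zip(weights, labels) if y == 0)
    lightest_orange = min(w for w, y in zip(weights, labels) if y == 1)
    if heaviest_apple >= lightest_orange:
        raise ValueError("weights overlap; a single threshold cannot separate them")
    # Put the rule halfway between the two closest fruits of different types.
    return (heaviest_apple + lightest_orange) / 2


def apply_rule(threshold, weight):
    """Apply the learned rule: heavier than the threshold means orange (1)."""
    return 1 if weight > threshold else 0


weights = [140, 150, 170, 180]
labels = [0, 0, 1, 1]  # 0 = Apple, 1 = Orange

threshold = learn_weight_rule(weights, labels)
print(threshold)                   # 160.0
print(apply_rule(threshold, 175))  # 1 (Orange)
print(apply_rule(threshold, 145))  # 0 (Apple)
```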
Now we pass the training data to the classifier object by calling its fit method.
You can think of fit as a synonym for "find patterns in data".
clf = clf.fit(features,labels)
At this point we have a trained classifier.
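To see which rules the learning algorithm actually created, the trained tree can be printed as text. The sketch below assumes a scikit-learn version that provides sklearn.tree.export_text; the feature names weight_g and texture are labels chosen here for readability. Both features separate this tiny data set perfectly, so the rule you see may use either one.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

features = [[140, 1], [150, 1], [170, 0], [180, 0]]  # [weight in grams, texture: 1 = smooth, 0 = bumpy]
labels = [0, 0, 1, 1]                                # 0 = Apple, 1 = Orange

# A fixed seed makes repeated runs choose the same split when features tie.
clf = DecisionTreeClassifier(random_state=0)
clf = clf.fit(features, labels)

# Render the learned decision rules as indented plain text.
rules = export_text(clf, feature_names=["weight_g", "texture"])
print(rules)
```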
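Test the machine:
The input to the classifier is the features of a new example, for instance:
print(clf.predict([[160, 1]]))
Output: [0], i.e. Apple. With the training data above, a smooth 160 g fruit falls on the apple side of the tree.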
Complete Example:
from sklearn.tree import DecisionTreeClassifier
# Each example is [weight in grams, texture], with 1 for smooth and 0 for bumpy.
features = [[140, 1], [130, 1], [150, 0], [170, 0]]
# 0 for Apple and 1 for Orange.
labels = [0, 0, 1, 1]
clf = DecisionTreeClassifier()
clf.fit(features, labels)
print(clf.predict([[160, 0]]))
print(clf.predict([[140, 1]]))
160 g, bumpy: Orange
140 g, smooth: Apple
so we will get [1] and [0] as output.
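As a side note, scikit-learn classifiers also accept string labels, so the numeric codes for Apple and Orange are a choice rather than a requirement (the features still have to be numeric). A small sketch under that assumption, reusing the data from the complete example:

```python
from sklearn.tree import DecisionTreeClassifier

features = [[140, 1], [130, 1], [150, 0], [170, 0]]  # [weight in grams, texture: 1 = smooth, 0 = bumpy]
labels = ["Apple", "Apple", "Orange", "Orange"]      # string labels instead of 0/1

clf = DecisionTreeClassifier()
clf.fit(features, labels)

# Predict both test fruits in one call; the results come back as fruit names.
predictions = [str(p) for p in clf.predict([[160, 0], [140, 1]])]
print(predictions)  # ['Orange', 'Apple']
```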