I'll walk through the analysis step by step, so that you (or anyone else hitting the same problem) can follow how to work through it.
First, I'm going to generate two vectors, target and predicted, which will simulate the result of your classification. These vectors were built from the numbers you posted. The classification_report says you have 56000 samples of class 0 and 119341 of class 1 in your data, so I'm going to generate a vector with 56000 zeros followed by 119341 ones.
import numpy as np
class0 = 56000
class1 = 119341
total = class0 + class1
target = np.zeros(total, dtype=int)
target[class0:] = np.ones(class1, dtype=int)
# to prove that the values are right
sum(target == 0) == class0, sum(target == 1) == class1
With this, you have the target vector, containing the labels your classifier should have predicted. Now let's generate predicted, which holds what your classifier actually reported. These numbers were taken from your confusion matrix.
class0_hit = 52624   # how many of class 0 it got right
class0_miss = 3376   # how many of class 0 it got wrong
class1_miss = 45307  # how many of class 1 it got wrong
class1_hit = 74034   # how many of class 1 it got right
predicted = np.zeros(total, dtype=int)
predicted[class0_hit:class0_hit + class0_miss + class1_hit] = np.ones(class0_miss + class1_hit, dtype=int)
# to prove that the values are right
sum(predicted == 0) == class0_hit + class1_miss, sum(predicted == 1) == class0_miss + class1_hit
Now we can look at sklearn's classification report and see what it tells us about these values:
from sklearn.metrics import classification_report
print(classification_report(target, predicted))
             precision    recall  f1-score   support

          0       0.54      0.94      0.68     56000
          1       0.96      0.62      0.75    119341

avg / total       0.82      0.72      0.73    175341
This is exactly the classification report you pasted, so we have reached the same point as you.
Looking at the confusion matrix:
from sklearn.metrics import confusion_matrix
print(confusion_matrix(target, predicted))
[[52624  3376]
 [45307 74034]]
Still the same. Let's look at what the accuracy says:
from sklearn.metrics import accuracy_score
accuracy_score(target, predicted)
> 0.7223524446649672
It returns 72%, the same as the classification report. So why does your calculation give 51% accuracy? In your calculation you have this:
(TP + TN)/total
(74034 + 52624)/(52624 + 74034 + 45307 + 74034)*100 = 51%
If you look closely, the value 74034 appears twice: the second occurrence in the denominator should be 3376, so the denominator does not add up to the total number of samples. Doing the math with the values defined in the code, it looks like this:
acc = (class0_hit + class1_hit) / total
> 0.7223524446649672
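The same number can be obtained straight from the confusion matrix itself, which is a less error-prone way of doing the calculation by hand. A minimal sketch, using the tn, fp, fn, tp unpacking that sklearn documents for binary confusion matrices:
from sklearn.metrics import confusion_matrix
tn, fp, fn, tp = confusion_matrix(target, predicted).ravel()
# each count appears exactly once in the denominator
acc = (tp + tn) / (tp + tn + fp + fn)
print(acc)
> 0.7223524446649672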
That matches the value of accuracy_score. The calculations of precision and recall are right:
from sklearn.metrics import precision_score
precision_score(target, predicted)
> 0.9563880635576799
from sklearn.metrics import recall_score
recall_score(target, predicted)
> 0.6203567927200208
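For reference, both numbers can also be reproduced by hand from the counts defined earlier. This is just a sanity-check sketch, with the F1 formula added to close the loop:
precision_by_hand = class1_hit / (class1_hit + class0_miss)  # 74034 / 77410
recall_by_hand = class1_hit / (class1_hit + class1_miss)     # 74034 / 119341
f1_by_hand = 2 * precision_by_hand * recall_by_hand / (precision_by_hand + recall_by_hand)
# roughly (0.9564, 0.6204, 0.7526): the same precision and recall as above,
# and the f1-score shown for class 1 in the report
print(precision_by_hand, recall_by_hand, f1_by_hand)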
But why, then, is classification_report returning those odd values at the end? The answer is simple, and it is in its documentation:
The reported averages are the prevalence-weighted macro-average
across classes (equivalent to precision_recall_fscore_support
with average = 'weighted').
That is, it does not do a simple average; it takes the number of samples of each class into account when computing the average.
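To make that concrete, here is a small sketch reproducing the 0.82 precision of the avg / total row by weighting each class's precision by its support (the class counts defined at the top):
prec_class0 = class0_hit / (class0_hit + class1_miss)  # ~0.54, precision of class 0
prec_class1 = class1_hit / (class1_hit + class0_miss)  # ~0.96, precision of class 1
weighted_prec = (prec_class0 * class0 + prec_class1 * class1) / total
print(weighted_prec)  # ~0.8226, the weighted precision reported above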
Let's take a look at this precision_recall_fscore_support method. It has a parameter called average, which controls how the per-class metrics are aggregated. Running it with the same setting classification_report uses, we get the same result:
from sklearn.metrics import precision_recall_fscore_support
precision_recall_fscore_support(target, predicted, average='weighted')
> (0.8225591977440773, 0.7223524446649672, 0.7305824989909749, None)
Now, since your classification has only two classes, the right thing is to ask for the binary average. Changing the average parameter to 'binary', we have:
precision_recall_fscore_support(target, predicted, average='binary')
> (0.9563880635576799, 0.6203567927200208, 0.75256542533456, None)
Which is exactly the result we get using sklearn's own functions or doing the calculation by hand.
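If you want to see the per-class numbers that feed the report, the same function also accepts average=None and then returns one value per class. This is just another way to double-check the table above, using the same data:
precision_recall_fscore_support(target, predicted, average=None)
# returns four arrays (precision, recall, f1-score, support), one entry per class:
# roughly (0.54, 0.96), (0.94, 0.62), (0.68, 0.75) and (56000, 119341),
# i.e. exactly the rows of classification_report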