Confusion Matrix

Harshitakumari
4 min readJun 4, 2021

What is confusion matrix?

šŸ‘‰A Confusion matrix is an N x N matrix used for evaluating the performance of a classification model, where N is the number of target classes. The matrix compares the actual target values with those predicted by the machine learning model. This gives us a holistic view of how well our classification model is performing and what kinds of errors it is making.

For a binary classification problem, we would have a 2 x 2 matrix as shown below with 4 values:

True Positive (TP)

  • The predicted value matches the actual value
  • The actual value was positive and the model predicted a positive value

True Negative (TN)

  • The predicted value matches the actual value
  • The actual value was negative and the model predicted a negative value

False Positive (FP) ā€” Type 1 error

  • The predicted value was falsely predicted
  • The actual value was negative but the model predicted a positive value
  • Also known as the Type 1 error

False Negative (FN) ā€” Type 2 error

  • The predicted value was falsely predicted
  • The actual value was positive but the model predicted a negative value
  • Also known as the Type 2 error

The different values of the Confusion matrix would be as follows:

  • True Positive (TP) = 560; meaning 560 positive class data points were correctly classified by the model
  • True Negative (TN) = 330; meaning 330 negative class data points were correctly classified by the model
  • False Positive (FP) = 60; meaning 60 negative class data points were incorrectly classified as belonging to the positive class by the model

The different values of the Confusion matrix would be as follows:

  • True Positive (TP) = 560; meaning 560 positive class data points were correctly classified by the model
  • True Negative (TN) = 330; meaning 330 negative class data points were correctly classified by the model
  • False Positive (FP) = 60; meaning 60 negative class data points were incorrectly classified as belonging to the positive class by the model

Cyber crime using confusion matrix:

Cyber-attacks have become one of the biggest problems of the world. They cause serious financial damages to countries and people every day. The increase in cyber-attacks also brings along cyber-crime.

The key factors in the fight against crime and criminals are identifying the perpetrators of cyber-crime and understanding the methods of attack. Detecting and avoiding cyber-attacks are difficult tasks.

There are three main objectives in our study. The first is to use actual cyber-crime data as input to predict a cyber-crime method and compare the accuracy results. The second is to measure whether cyber-crime perpetrators can be predicted based on the available data. The third objective is to understand the effect of victim profiles on cyber-attacks.

The number of crimes, damages, attacks and methods of attack in the dataset.

However, researchers have recently been solving these problems by developing security models and making predictions through artificial intelligence and Machine learning methods. A high number of methods of crime prediction are available in the literature. On the other hand, they suffer from a deficiency in predicting cyber-crime and cyber-attack methods.

This problem can be tackled by identifying an attack and the perpetrator of such attack, using actual data. The data include the type of crime, gender of perpetrator, damage and methods of attack. The data can be acquired from the applications of the persons who were exposed to cyber-attacks to the forensic units.

Here, we analyze cyber-crimes in two different models with machine-learning methods and predict the effect of the defined features on the detection of the cyber-attack method and the perpetrator. We used some machine-learning methods in our approach and concluded that their accuracy ratios were close.

The Logistic Regression was the leading method in detecting attackers with an accuracy rate of 65.42%.

In other method, we predicted whether the perpetrators could be identified by comparing their characteristics.

Where 0 is ā€œPerpetrator Knownā€, 1 is ā€œPerpetrator Unknownā€.

Thanks for reading this Articleā€¦ā€¦šŸ˜ŠšŸ˜Š

By Harshita Kumariā€¦!

--

--