KNN ALGORITHM IN REAL WORLD

KNN ALGORITHM IN REAL WORLD
Image by Joshua J. Cotten, from Unsplash

Introduction

In the initial period When Machine Learning was introduced, there were very few ways to make the machine learn and what we wanted them to learn. There we started to get an idea of machines and what a machine can do for us, in this process, we started generating the new algorithms of machine learning. 

In this process, we got one special ML algorithm, that was designed for the sake of its application, i.e., the “K-Nearest Neighbour” algorithm. In this blog, we will learn about what KNN is and why it is used.

What is KNN

  • The k-nearest neighbors' classifier (kNN) is a non-parametric supervised machine learning algorithm. It is a distance-based type. It classifies objects based on their near neighbors’ classes. kNN is mostly used for classification but it can apply to regression problems as well. The parameter k in kNN refers to the number of labeled points (neighbors) considered for classification. And The value of k shows the number of these points used to determine the result. Our main task is to calculate the distance and identify which categories are closest to our unknown entity.

How does KNN work

  • The KNN method is mostly employed as a classifier, as previously stated. So, Let's have a look here at how KNN classifies data points that aren't visible. Unlike artificial neural network classification, k-nearest neighbor classification is straightforward to understand and implement. It's suitable for situations that have well-defined or non-linear data points. As to show it another way, KNN uses a voting method to decide the class of an unknown observation. This shows that the data point's class will be determined by the class with the most votes. 

see a Real-Life Example

Source - Google Images

in the above figure, we have two classes. Class A belongs to the yellow family and Class B is belonged to the purple class according to the above figure. Then train the dataset in the kNN model which we discuss later but focus on just example here k=3 is three nearest neighbors k=6 six nearest neighbors. so when we take k=3 then what happens and when k=6 then what happens? When k=3 then two belong to a purple class and one belongs to the yellow class majority vote of purple so here the purple class is considered similarly when k=6 then four belong to a yellow class and two belong to the purple class so majority votes are yellow so consider the yellow class. So in this way, kNN works.

In the classification setting, the k-Nearest neighbor algorithm essentially boils down to forming a majority vote between the k most similar instances to the given ‘unseen’ observation.

As KNN is a distance-based classifier, the more closely the two points are, the greater the similarities in behavior and therefore selection choice. The different methods used to measure the distance are

1. Manhattan   2. Euclidean  

 A popular choice is Euclidean distance Very often, especially when measuring the distance in the plane, we use the formula for the Euclidean distance. According to the Euclidean distance formula, the distance between two points in the plane with coordinates (x, y) and (a, b) is given by

Source - Google Images

Why it is used?

KNN is a simple algorithm, which is used to learn an unknown function of desired precision and accuracy. The algorithm also finds the neighborhood of an unknown input, its range or distance from it, and other parameters. It’s based on the principle of “information gain”. And there the algorithm finds out which is most suitable to predict an unknown value.

How k-Nearest Neighbors algorithm used a lot in real life?

KNN has a lot of applications in machine learning because of the nature of the problem which is solved by a k-nearest neighbor. In other words, the issue of the k-nearest neighbor is fundamental and it is used in a lot of solutions. For example, in data representation such as tSNE, to run the algorithm we need to compute the k-nearest neighbor of each point base on the predefined perplexity.

The KNN algorithm is one of the most popular algorithms for text categorization. 

Another interesting application is the evaluation of forest inventories and for estimating forest variables. In these applications, satellite imagery is used, to map the land cover and land use with few discrete classes. The other applications of the k-NN method in agriculture include climate forecasting and estimating soil water parameters.

  Some of the other applications of KNN in finance are mentioned below:

  • Forecasting stock market: Predict the price of a stock, based on company performance measures and economic data.
  • Currency exchange rate
  • Bank bankruptcies
  • Understanding and managing financial risk
  • Trading futures
  • Credit rating
  • Loan management
  • Bank customer profiling
  • Money laundering analyses

See Real-world applications of KNN

Preprocessing of data: In the preprocessing of data many missing values can be found in datasets. Missing data imputation is a procedure that uses the KNN algorithm to estimate missing values.

Recognizing patterns: In recognizing patterns the KNN algorithm's capacity to recognize patterns and can see a vast range of possibilities. And It can assist, detect and spotting suspicious patterns in credit card usage. for example, Pattern detection can also be used to spot patterns in client purchasing habits.

Recommendation systems: Recommendation systems are a type of system that is used to make recommendations. in which KNN can be used in recommendation systems since it can help locate people with comparable traits. It can be used in an online video streaming platform, for example, to propose content that a user is more likely to view based on what other users watch.

Computer Vision: here for picture classification, the KNN algorithm is used. It's important in various computer vision applications since it can group comparable data points, such as cats and dogs in separate classes.

In Medical field

  • The knn can Predict whether a patient is hospitalized due to a heart attack, and also it will analyze to have a second heart attack. As The prediction is based on demographic, diet, and clinical measurements for that patient.
  • And knn can Estimate the amount of glucose in the blood of a diabetic person, from the infrared absorption spectrum of that person’s blood.
  • And it will also Identifies the risk factors for prostate cancer, based on clinical and demographic variables.


The KNN algorithm has been also applied for analyzing microarray gene expression data, where the KNN algorithm has been coupled with genetic algorithms, which are used as a search tool. And Other applications such as the prediction of solvent accessibility in protein molecules.

KNN method for сar manufacturing: for example, an automaker has designed prototypes of a new truck and sedan. To determine their chances of success, the company has to find out which current vehicles on the market are most similar to the prototypes. Their competitors are their "nearest neighbors.” To identify them, the car manufacturer needs to input data such as price, horsepower, engine size, wheelbase, curb weight, fuel tank capacity, etc., and compare the existing models. The kNN algorithm classifies complicated multi-featured prototypes according to their closeness to similar competitors’ products.

KNN in E-commerce: K-nearest neighbors is an excellent solution for cold-starting an online store recommendation system, but with the growth of the dataset more advanced techniques are usually needed. The algorithm can select the items that specific customers would like or predict their actions based on customer behavior data. For example, kNN will quickly tell whether or not a new visitor will likely carry out a transaction.

kNN application for education: Another kNN application is classifying groups of students based on their behavior and class attendance. With the help of the k-nearest neighbors' analysis, it is possible to identify students who are likely to drop out or fail early. These insights would allow educators and course managers to take timely measures to motivate and help them master the material.

I will recommend one exciting company which works in the area of artificial intelligence, which is AiEnsured. There are so many technical experts and tech guides which will boost your knowledge in different ai areas. So once check the site aiensured and visit their blog also blog aiensured   

References

stackoverflow.comk-nearest-neighbors-algorithm-used-a-lot-in-real-life

towardsdatascience.com

analytics in algorithms

neptune.ai/blog/knn-algorithm-explanation-opportunities-limitations

serokell.io/blog

machine-learning-researcher/k-nearest-neighbors-in-machine-learning

By Krishna Ravilla