A Simple k-Nearest-Neighbour (K-NN) Algorithm Implemented in C++

阿新 • Published: 2019-01-11

#include<iostream>
#include<map>
#include<vector>
#include<stdio.h>
#include<cmath>
#include<cstdlib>
#include<algorithm>
#include<fstream>

using namespace std;

typedef char tLabel;
typedef double tData;
typedef pair<int, double>  PAIR;
const int colLen = 2;
const int rowLen = 12;
ifstream fin;
ofstream fout;

class KNN
{
private:
	tData dataSet[rowLen][colLen];
	tLabel labels[rowLen];
	tData testData[colLen];
	int k;
	map<int, double> map_index_dis;
	map<tLabel, int> map_label_freq;
	double get_distance(tData *d1, tData *d2);
public:

	KNN(int k);

	void get_all_distance();

	void get_max_freq_label();

	struct CmpByValue
	{
		// const so the comparator can be called on const/temporary objects by std::sort
		bool operator() (const PAIR& lhs, const PAIR& rhs) const
		{
			return lhs.second < rhs.second;
		}
	};

};

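KNN::KNN(int k)
{
	/* k must lie in [1, rowLen]; otherwise the vote would index past the sorted distances */
	if (k < 1 || k > rowLen)
	{
		cout << "k must be between 1 and " << rowLen << endl;
		exit(1);
	}
	this->k = k;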

	fin.open("C:\\Users\\zws\\Desktop\\K近鄰\\data.txt");

	if (!fin)
	{
		cout << "can not open the file data.txt" << endl;
		exit(1);
	}

	/* read the training set: colLen feature values followed by one label per row */
	for (int i = 0; i<rowLen; i++)
	{
		for (int j = 0; j<colLen; j++)
		{
			fin >> dataSet[i][j];
		}
		fin >> labels[i];
	}

	cout << "please input the test data :" << endl;
	/* input the test data */
	for (int i = 0; i<colLen; i++)
		cin >> testData[i];

}

/*
* calculate the Euclidean distance between two points d1 and d2,
* each holding colLen feature values
*/
double KNN::get_distance(tData *d1, tData *d2)
{
	double sum = 0;
	for (int i = 0; i<colLen; i++)
	{
		sum += pow((d1[i] - d2[i]), 2);
	}

	//	cout<<"the sum is = "<<sum<<endl;
	return sqrt(sum);
}

/*
* calculate the distance between the test data and every training sample
*/
void KNN::get_all_distance()
{
	double distance;
	int i;
	for (i = 0; i<rowLen; i++)
	{
		distance = get_distance(dataSet[i], testData);
		//<key,value> => <i,distance>
		map_index_dis[i] = distance;
	}

	//traverse the map to print the index and distance
	map<int, double>::const_iterator it = map_index_dis.begin();
	while (it != map_index_dis.end())
	{
		cout << "index = " << it->first << " distance = " << it->second << endl;
		it++;
	}
}

/*
* classify the test data by majority vote among its k nearest training samples
*/
void KNN::get_max_freq_label()
{
	//transform the map_index_dis to vec_index_dis
	vector<PAIR> vec_index_dis(map_index_dis.begin(), map_index_dis.end());
	//sort the vec_index_dis by distance from low to high to get the nearest data
	sort(vec_index_dis.begin(), vec_index_dis.end(), CmpByValue());

	for (int i = 0; i<k; i++)
	{
		const int idx = vec_index_dis[i].first;
		cout << "the index = " << idx
			<< " the distance = " << vec_index_dis[i].second
			<< " the label = " << labels[idx]
			<< " the coordinate ( " << dataSet[idx][0] << "," << dataSet[idx][1] << " )" << endl;
		//calculate the count of each label
		map_label_freq[labels[vec_index_dis[i].first]]++;
	}

	map<tLabel, int>::const_iterator map_it = map_label_freq.begin();
	tLabel label = '?';	// placeholder; overwritten as soon as one vote has been counted
	int max_freq = 0;
	//find the most frequent label
	while (map_it != map_label_freq.end())
	{
		if (map_it->second > max_freq)
		{
			max_freq = map_it->second;
			label = map_it->first;
		}
		map_it++;
	}
	cout << "The test data belongs to the " << label << " label" << endl;
}

int main()
{
	int k;
	cout << "please input the k value : " << endl;
	cin >> k;
	KNN knn(k);
	knn.get_all_distance();
	knn.get_max_freq_label();
	system("pause");
	return 0;
}

Test results of the program above: (the screenshot from the original post was not preserved)
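
The distance, sort and vote steps of the program above can also be sketched as a free function that works on in-memory data, which is easier to test than a class that reads data.txt and cin. This is only a minimal sketch under stated assumptions, not the author's code: Sample, euclidean and classify are hypothetical names, points are assumed to be two-dimensional as in the listing (colLen = 2), and std::partial_sort is swapped in for the full std::sort because only the k nearest entries need to be ordered.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical sample type: a 2-D point plus a one-character class label.
struct Sample {
    double x;
    double y;
    char label;
};

// Euclidean distance between a training sample and the query point (qx, qy).
double euclidean(const Sample& s, double qx, double qy) {
    return std::sqrt((s.x - qx) * (s.x - qx) + (s.y - qy) * (s.y - qy));
}

// Return the majority label among the k training samples nearest to (qx, qy).
// Ties go to the label that compares lowest, because std::map iterates in key order.
char classify(const std::vector<Sample>& train, double qx, double qy, std::size_t k) {
    assert(!train.empty() && k >= 1 && k <= train.size());

    // (distance, index) pairs; std::pair compares by distance first.
    std::vector<std::pair<double, std::size_t> > dist;
    for (std::size_t i = 0; i < train.size(); ++i)
        dist.push_back(std::make_pair(euclidean(train[i], qx, qy), i));

    // Order only the k smallest distances; the rest stay unsorted.
    std::partial_sort(dist.begin(), dist.begin() + k, dist.end());

    // Count how often each label occurs among the k nearest samples.
    std::map<char, int> freq;
    for (std::size_t i = 0; i < k; ++i)
        ++freq[train[dist[i].second].label];

    // Pick the label with the highest count.
    char best = train[dist[0].second].label;
    int best_count = 0;
    for (std::map<char, int>::const_iterator it = freq.begin(); it != freq.end(); ++it) {
        if (it->second > best_count) {
            best_count = it->second;
            best = it->first;
        }
    }
    return best;
}
```

Calling classify(train, qx, qy, k) with a vector of labelled points plays the role of get_all_distance() followed by get_max_freq_label() in the listing, except that the label is returned instead of printed. An odd k is a common choice with two classes, since it avoids tied votes.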
