I need guidance with the KNN algorithm

Hello everyone,

I’m trying to apply the K-Nearest Neighbours (KNN) method on a dataset in Python, but I’m getting an error that I can’t figure out. Could you please assist me in locating the problem?

Here’s the pertinent section of my code:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Custom dataset
X = np.array([[2, 4], [3, 6], [5, 8], [7, 10], [9, 12]])
y = np.array([0, 1, 0, 1, 0])

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Min-max scaling the features
scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Implement KNN
k = 3
knn = KNeighborsClassifier(n_neighbors=k)
knn.fit(X_train, y_train)

# Make predictions on the test set
y_pred = knn.predict(X_test)

# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy}")

I’m getting the following error when I run the code:

ValueError: Expected 2D array, got 1D array instead:
array=[0.14285714 0.57142857 1.        ]. Reshape your data either using array.reshape(-1, 1) 
if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

I’m not sure what’s causing this error. I read this post to get a better understanding, but I couldn’t get past the “reshape” suggestion. Could someone kindly help me figure out how to fix it and get my code to run properly? Thank you in advance for your assistance!

This is a tough one…

The error message says:

ValueError: Expected 2D array, got 1D array instead

But in the code you posted, there is no place where that error should occur. Your X_train and X_test are still 2D arrays after the MinMaxScaler step.

Can you post your full code? I think the error might be in another section.
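
For what it's worth, here is a minimal sketch of how this kind of error usually shows up and how the two reshape options from the message differ. It is only a guess at what the missing part of the code might be doing: `sample` and `column` are made-up arrays, not taken from the original post.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Training rows with two features each (illustrative values only).
X_train = np.array([[2, 4], [3, 6], [5, 8], [7, 10]], dtype=float)
scaler = MinMaxScaler().fit(X_train)

# Hypothetical single sample stored as a 1D array, shape (2,).
sample = np.array([3.0, 6.0])

# Passing the 1D array directly is what triggers the ValueError.
try:
    scaler.transform(sample)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# One sample with several features: reshape to a single row, shape (1, 2).
scaled = scaler.transform(sample.reshape(1, -1))

# Several samples of one feature: reshape to a single column, shape (3, 1).
# Hypothetical 1D feature column, reusing the values shown in the error.
column = np.array([0.14285714, 0.57142857, 1.0])
column_2d = column.reshape(-1, 1)

print(error_message)
print(scaled.shape, column_2d.shape)
```

Printing `.shape` on whatever is passed to `fit`, `transform`, or `predict` in the full code should show which array is 1D.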


Try this:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

# Custom dataset with overlapping data points
X = np.array(
    [[2, 4], [3, 6], [4, 8], [6, 9], [7, 10], [9, 12], [8, 11], [5, 7], [6, 5], [4, 6]]
)
y = np.array([0, 1, 1, 1, 1, 0, 0, 1, 1, 1])

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Min-max scaling the features
scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Implement KNN
k = 3
knn = KNeighborsClassifier(n_neighbors=k)
knn.fit(X_train, y_train)

# Make predictions on the test set
y_pred = knn.predict(X_test)

# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy}")

Does this work? I mean… it shows an accuracy of 1, which is obviously a fail. At least it’s not 0 lol. If this doesn’t solve the problem, then I can conclude that I am trash at Python. Oh yes, it must work, because @QwertyQwerty88 liked it. Hope this helps, ty.
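
On the accuracy of 1: with 10 samples and test_size=0.2 the test set has only 2 rows, so the score can only be 0, 0.5, or 1. A rough sketch of a slightly steadier check is cross-validation on the same data. The pipeline and cv=3 here are my own choices for illustration, not something from the original code.

```python
import numpy as np
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Same toy dataset as in the snippet above.
X = np.array(
    [[2, 4], [3, 6], [4, 8], [6, 9], [7, 10], [9, 12], [8, 11], [5, 7], [6, 5], [4, 6]]
)
y = np.array([0, 1, 1, 1, 1, 0, 0, 1, 1, 1])

# How many rows actually end up in the test set with test_size=0.2.
_, X_test, _, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
n_test = len(y_test)

# Scaling lives inside the pipeline, so it is refit on each training fold.
model = make_pipeline(MinMaxScaler(), KNeighborsClassifier(n_neighbors=3))

# cv=3 is an assumption: class 0 has only 3 samples, so 3 folds is the most
# that stratified splitting can do without a warning.
scores = cross_val_score(model, X, y, cv=3)

print(f"Test set size: {n_test}")
print(f"Fold accuracies: {scores}, mean: {scores.mean():.2f}")
```

With a dataset this small the numbers are still noisy, but the mean over folds says more than a single 2-row test set.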

