One error you may encounter in R is:

Error in do_one(nmeth) : NA/NaN/Inf in foreign function call (arg 1)

This error occurs when you attempt to perform k-means clustering in R but the data frame you pass to **kmeans()** contains one or more missing (NA) values. As the message itself indicates, NaN and infinite values trigger it as well.

This tutorial shows how to reproduce and fix this error.

**How to Reproduce the Error**

Suppose we have the following data frame in R with a missing value in the second row:

```r
#create data frame
df <- data.frame(var1=c(2, 4, 4, 6, 7, 8, 8, 9, 9, 12),
                 var2=c(12, 14, 14, 8, 8, 15, 16, 9, 9, 11),
                 var3=c(22, NA, 23, 24, 28, 23, 19, 16, 12, 15))

row.names(df) <- LETTERS[1:10]

#view data frame
df

  var1 var2 var3
A    2   12   22
B    4   14   NA
C    4   14   23
D    6    8   24
E    7    8   28
F    8   15   23
G    8   16   19
H    9    9   16
I    9    9   12
J   12   11   15
```

If we attempt to use the **kmeans()** function to perform k-means clustering on this data frame, we’ll receive an error:

```r
#attempt to perform k-means clustering with k = 3 clusters
km <- kmeans(df, centers = 3)

Error in do_one(nmeth) : NA/NaN/Inf in foreign function call (arg 1)
```
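Before fixing the error, it can help to find out where the problem values are. The following is a minimal sketch that rebuilds the example data frame and counts non-finite values per column and per row. The names `bad_per_column` and `bad_rows` are made up for illustration.

```r
#recreate the example data frame (one NA in var3, row B)
df <- data.frame(var1=c(2, 4, 4, 6, 7, 8, 8, 9, 9, 12),
                 var2=c(12, 14, 14, 8, 8, 15, 16, 9, 9, 11),
                 var3=c(22, NA, 23, 24, 28, 23, 19, 16, 12, 15))
row.names(df) <- LETTERS[1:10]

#count non-finite values (NA, NaN, Inf, -Inf) in each column
bad_per_column <- colSums(!is.finite(as.matrix(df)))
bad_per_column

#names of the rows that contain at least one non-finite value
bad_rows <- rownames(df)[rowSums(!is.finite(as.matrix(df))) > 0]
bad_rows
```

Using `is.finite()` rather than `is.na()` flags infinite values as well as missing ones, which matches the three cases named in the error message.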

**How to Fix the Error**

The easiest way to fix this error is to use the **na.omit()** function to remove every row that contains a missing value from the data frame:

```r
#remove rows with NA values
df <- na.omit(df)

#perform k-means clustering with k = 3 clusters
km <- kmeans(df, centers = 3)

#view results
km

K-means clustering with 3 clusters of sizes 4, 3, 2

Cluster means:
  var1      var2     var3
1  5.5 14.250000 21.75000
2 10.0  9.666667 14.33333
3  6.5  8.000000 26.00000

Clustering vector:
A C D E F G H I J 
1 1 3 3 1 1 2 2 2 

Within cluster sum of squares by cluster:
[1] 46.50000 17.33333  8.50000
 (between_SS / total_SS =  79.5 %)

Available components:

[1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss"
[6] "betweenss"    "size"         "iter"         "ifault"      
```

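Notice that the k-means clustering algorithm runs successfully once we remove the rows with missing values from the data frame. Row B, which contained the NA, no longer appears in the clustering vector. Because **kmeans()** chooses its starting centers at random, your cluster labels and assignments may differ from the ones shown here unless you call **set.seed()** first.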
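One caveat: `na.omit()` only drops rows with NA or NaN values, so a data frame that contains `Inf` or `-Inf` would still be affected. The sketch below uses a hypothetical data frame (`df2`, not part of the original example) and keeps only the rows in which every value is finite. The choice of `k = 2` and the seed are arbitrary.

```r
#hypothetical data frame with both an NA and an Inf value
df2 <- data.frame(var1=c(2, 4, 4, 6, 7, 8),
                  var2=c(12, 14, Inf, 8, 8, 15),
                  var3=c(22, NA, 23, 24, 28, 23))

#na.omit() drops the NA row but keeps the row containing Inf
nrow(na.omit(df2))

#keep only rows in which every value is finite
keep <- rowSums(!is.finite(as.matrix(df2))) == 0
df2_clean <- df2[keep, ]

#perform k-means clustering on the cleaned data (k = 2 chosen arbitrarily)
set.seed(1)
km2 <- kmeans(df2_clean, centers = 2)
km2$size
```

Filtering on `is.finite()` is stricter than `na.omit()`, so it is a reasonable default when the source of the data is not fully known.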

**Bonus:** A complete step-by-step guide to k-means clustering in R

