How to Fix: Error in do_one(nmeth) : NA/NaN/Inf in foreign function call (arg 1)


One error you may encounter in R is:

Error in do_one(nmeth) : NA/NaN/Inf in foreign function call (arg 1)

This error occurs when you attempt to perform k-means clustering in R but the data frame you’re using contains one or more missing (NA), NaN, or infinite (Inf) values.

This tutorial shares exactly how to fix this error.

How to Reproduce the Error

Suppose we have the following data frame in R with a missing value in the second row:

#create data frame
df <- data.frame(var1=c(2, 4, 4, 6, 7, 8, 8, 9, 9, 12),
                 var2=c(12, 14, 14, 8, 8, 15, 16, 9, 9, 11),
                 var3=c(22, NA, 23, 24, 28, 23, 19, 16, 12, 15))

row.names(df) <- LETTERS[1:10]

#view data frame
df

  var1 var2 var3
A    2   12   22
B    4   14   NA
C    4   14   23
D    6    8   24
E    7    8   28
F    8   15   23
G    8   16   19
H    9    9   16
I    9    9   12
J   12   11   15

If we attempt to use the kmeans() function to perform k-means clustering on this data frame, we’ll receive an error:

#attempt to perform k-means clustering with k = 3 clusters
km <- kmeans(df, centers = 3)

Error in do_one(nmeth) : NA/NaN/Inf in foreign function call (arg 1)
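Before choosing a fix, it can help to confirm which rows and columns hold the problem values. The following is a minimal sketch using base R functions only, assuming every column of the data frame is numeric. The names na_counts, bad_rows and nonfinite_counts are illustrative and not part of the original example.

```r
#recreate the data frame from above
df <- data.frame(var1=c(2, 4, 4, 6, 7, 8, 8, 9, 9, 12),
                 var2=c(12, 14, 14, 8, 8, 15, 16, 9, 9, 11),
                 var3=c(22, NA, 23, 24, 28, 23, 19, 16, 12, 15))

row.names(df) <- LETTERS[1:10]

#count NA values in each column
na_counts <- colSums(is.na(df))

#list the rows that contain at least one NA
bad_rows <- row.names(df)[!complete.cases(df)]

#count values in each column that are not finite (NA, NaN, Inf or -Inf)
nonfinite_counts <- colSums(!is.finite(as.matrix(df)))
```

For this data frame, na_counts reports a single NA in var3 and bad_rows returns "B". The is.finite() check is included because the error message also covers NaN and Inf values, which is.na() alone does not fully detect.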

How to Fix the Error

The easiest way to fix this error is to simply use the na.omit() function to remove rows with missing values from the data frame:

#remove rows with NA values
df <- na.omit(df)

#perform k-means clustering with k = 3 clusters
km <- kmeans(df, centers = 3)

#view results
km

K-means clustering with 3 clusters of sizes 4, 3, 2

Cluster means:
  var1      var2     var3
1  5.5 14.250000 21.75000
2 10.0  9.666667 14.33333
3  6.5  8.000000 26.00000

Clustering vector:
A C D E F G H I J 
1 1 3 3 1 1 2 2 2 

Within cluster sum of squares by cluster:
[1] 46.50000 17.33333  8.50000
 (between_SS / total_SS =  79.5 %)

Available components:

[1] "cluster"      "centers"      "totss"        "withinss"     "tot.withinss"
[6] "betweenss"    "size"         "iter"         "ifault"  

Notice that the k-means clustering algorithm runs successfully once we remove the rows with missing values from the data frame. Row B was dropped by na.omit(), so only nine observations appear in the clustering vector.
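If losing observations is a concern, one alternative is to replace each missing value with the mean of its column before clustering. This is a different technique from the na.omit() fix above, shown only as a sketch: mean imputation can distort the distances that k-means relies on, so it may not suit every data set. The names df_imp and km_imp are illustrative, and the seed value is arbitrary. It is set only because kmeans() chooses random starting centers.

```r
#recreate the original data frame (with the NA in row B)
df <- data.frame(var1=c(2, 4, 4, 6, 7, 8, 8, 9, 9, 12),
                 var2=c(12, 14, 14, 8, 8, 15, 16, 9, 9, 11),
                 var3=c(22, NA, 23, 24, 28, 23, 19, 16, 12, 15))

row.names(df) <- LETTERS[1:10]

#replace each NA with the mean of its column
df_imp <- df
for (col in names(df_imp)) {
  missing <- is.na(df_imp[[col]])
  df_imp[[col]][missing] <- mean(df_imp[[col]], na.rm = TRUE)
}

#set a seed (arbitrary value) so the random starting centers are reproducible
set.seed(1)

#perform k-means clustering with k = 3 clusters on all 10 rows
km_imp <- kmeans(df_imp, centers = 3)

#view the cluster assigned to each row
km_imp$cluster
```

With this approach all ten rows, including row B, receive a cluster assignment.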

Bonus: A complete step-by-step guide to k-means clustering in R

Additional Resources

How to Fix in R: NAs Introduced by Coercion
How to Fix in R: Subscript out of bounds
How to Fix in R: longer object length is not a multiple of shorter object length
