1 PREPARING FOR THE ANALYSIS


1.1 Install and load the package ppclust

This vignette is designed to be used with the ‘ppclust’ package. You can install the latest version of the package from CRAN with the following command:

install.packages("ppclust")

If ‘ppclust’ is already installed, you can load it into the R working environment with the following command:

library(ppclust)

1.2 Load the required packages

To visualize the clustering results, some examples in this vignette use functions from other cluster analysis packages such as ‘cluster’, ‘fclust’ and ‘factoextra’. Load these packages into the R working environment with the following commands:

library(factoextra)
library(cluster)
library(fclust)
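
If any of the library() calls above fails, the package is probably not installed yet. As a minimal sketch (the variable names `required`, `is_installed` and `not_installed` are illustrative and not part of ‘ppclust’), the following checks which of the four packages are available without installing anything:

```r
# Packages used in this vignette
required <- c("ppclust", "factoextra", "cluster", "fclust")

# requireNamespace() returns TRUE if a package can be loaded, FALSE otherwise
is_installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)

# Names of the packages that still need to be installed
not_installed <- required[!is_installed]

if (length(not_installed) > 0) {
  message("Install these packages first: ",
          paste(not_installed, collapse = ", "))
}
```

Any package reported here can then be installed with install.packages(), in the same way as ‘ppclust’.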

1.3 Load the data set

We demonstrate PFCM on the Iris data set (Anderson, 1935). It is a real data set with four features (Sepal.Length, Sepal.Width, Petal.Length and Petal.Width) in the first four columns and a class variable giving the iris species in the last column. This four-dimensional data set contains 150 observations in total, 50 from each of the three iris species. One of these three natural clusters (Class 1) is linearly separable from the other two, while Classes 2 and 3 overlap somewhat, as seen in the plot below.

data(iris)
# Keep the four numeric features; drop the class column (column 5)
x <- iris[, -5]
x
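
Printing all 150 rows is not always convenient. As a short sketch using only base R functions, the size of the feature matrix and the balance of the classes described above can be checked directly:

```r
data(iris)
x <- iris[, -5]

# Number of observations and features: 150 rows, 4 columns
dim(x)

# Names of the four features
names(x)

# Number of observations per species: 50 in each of the three classes
table(iris$Species)
```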

Plot the data, colored by iris species:

pairs(x, col = iris[, 5])
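
The separation visible in the pairs plot can also be checked numerically. The sketch below is an illustration rather than part of the PFCM workflow; it assumes that Class 1 corresponds to setosa, the first level of iris$Species, and the name `petal_range` is illustrative. It compares the range of Petal.Length across species:

```r
data(iris)

# Minimum and maximum Petal.Length for each species (one column per species)
petal_range <- sapply(split(iris$Petal.Length, iris$Species), range)
rownames(petal_range) <- c("min", "max")
petal_range

# setosa lies entirely below versicolor on this feature
petal_range["max", "setosa"] < petal_range["min", "versicolor"]

# versicolor and virginica overlap on this feature
petal_range["max", "versicolor"] > petal_range["min", "virginica"]
```

Both comparisons return TRUE: a single threshold on Petal.Length isolates setosa, whereas no such threshold separates versicolor from virginica.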