SVM Implementation step by step with R: Data Preparation

In this post, we will implement SVM with the e1071 package for an ice-cream shop which has recorded the following attributes on sales:

  • The temperature in the city
  • Sales on a particular day
  • A label marking the day’s sales as “Good” or “Bad”

Steps:

  1. Let’s install the necessary package using the command:

    install.packages('e1071', dependencies=TRUE)

  2. It will ask for the CRAN mirror; choose the one nearest to your country. You will then see a message that the binary packages have been installed in a specific path. Please note that I’m using RGui (32-bit) on Windows XP.

3. To start using this library, issue the following command. I got a warning message that the package was built for R version 2.15.3; you can upgrade R to avoid this message:

    library(e1071)

4. I have the ice-cream parlor sales data in an Excel workbook. You can refer to my earlier post on importing an Excel workbook with R, or you can convert the workbook to CSV format and read it using the read.csv() function.

5. I will use the latter approach, read.csv().

6. Now we have the necessary data and you can see the columns read as “SalesRating”, “CityTemperature”, “IceCreamSales”.

7. I’m assigning the data from the CSV file to a dataset as follows:

    dataset<-read.csv("data.csv")

8. We will use 70% of the data as the training dataset and 30% as the testing dataset, so we need to subset the larger dataset. The first step is creating an index, like the one below, that runs from the 1st to the nth row of the dataset:

    index<-1:nrow(dataset)

9. If you would like to see what the index contains, just print it at the console. Next we are going to create testindex, which samples 30% of the dataset’s row numbers, using the following command:

    testindex<-sample(index,trunc(length(index)*30/100))

10. Now we need to separate the test dataset and the training dataset using the testindex we have created: the rows selected by testindex form the test set, and all remaining rows form the training set.

11. Now we will output a summary of testset and trainingset, which gives you an idea of how the data has been split.
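
The screenshots for steps 10 and 11 are not available, so here is a minimal sketch of how the split and the summaries could look. It is not the original code: the names testset and trainingset follow the wording of step 11, the set.seed() call is my addition so the sample is repeatable, and the small made-up data frame only stands in for data.csv so the snippet runs on its own.

```r
# Stand-in for dataset <- read.csv("data.csv"); the values are made up.
set.seed(42)  # assumption: fixed seed so the random split is repeatable
temperature <- round(runif(149, min = 15, max = 40), 1)
dataset <- data.frame(
  SalesRating     = factor(ifelse(temperature > 27, "Good", "Bad")),
  CityTemperature = temperature,
  IceCreamSales   = round(20 * temperature)
)

# Steps 8 and 9: build the row index and sample 30% of it.
index     <- 1:nrow(dataset)
testindex <- sample(index, trunc(length(index) * 30 / 100))

# Step 10: rows in testindex go to the test set, the rest to the training set.
testset     <- dataset[testindex, ]
trainingset <- dataset[-testindex, ]

# Step 11: inspect how the data has been split.
summary(testset)
summary(trainingset)
nrow(testset)      # 44 for a 149-row dataset
nrow(trainingset)  # 105 for a 149-row dataset
```

The negative index dataset[-testindex, ] drops the sampled rows, so no record ends up in both sets.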

So far we have seen the steps for preparing the data for analysis using SVM, which leaves us with 44 test set records and 105 training set records. In the next post we will see the SVM process.


9 thoughts on “SVM Implementation step by step with R: Data Preparation”

  1. Pingback: SVM Implementation Step by Step with R: Ice-cream sales prediction | My exploration into data analytics

  2. Sriram,

    Sorry, I don’t have the dataset as my system crashed. It would be easy to prepare your own: have three columns (Temperature, Sales and a SalesRating label) and use the random function in Excel to generate the data. Have some criteria to determine the SalesRating based on the sales. That should help you prepare the Excel or CSV file, and you can start from there. Do let me know if you need any help.

    regards
    Siva.
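
If you would rather prepare the data directly in R instead of Excel, something along these lines should work. This is only a sketch: the temperature range, the sales formula and the 500-sales cutoff for “Good” are all made-up assumptions, so adjust them to whatever criteria you prefer. The column names are the ones used in the post.

```r
# Sketch: generate a synthetic ice-cream sales file (all numbers are made up).
set.seed(1)  # assumption: any seed will do, it only makes the data repeatable
n <- 149     # 44 test + 105 training records, as in the post

# Hypothetical daily temperature in degrees Celsius.
CityTemperature <- round(runif(n, min = 15, max = 40), 1)

# Hypothetical relationship: sales rise with temperature, plus some noise.
IceCreamSales <- pmax(0, round(20 * CityTemperature + rnorm(n, sd = 60)))

# Hypothetical criterion: 500 or more sales counts as a "Good" day.
SalesRating <- ifelse(IceCreamSales >= 500, "Good", "Bad")

dataset <- data.frame(SalesRating, CityTemperature, IceCreamSales)

# Save it so that read.csv("data.csv") from the post picks it up.
write.csv(dataset, "data.csv", row.names = FALSE)
```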
