Step by Step Sentiment analysis on Twitter data using R with Airtel Tweets: Part – II

In the previous post we saw what sentiment analysis is and what steps are involved in it. In this post we will go through step-by-step instructions for doing sentiment analysis on the microblogging site Twitter, with a specific objective in mind. I came across an interesting post by Chetan S on DTH operators' involvement in using social media for providing customer support. It gave me the idea for this post.

Goal: To do sentiment analysis on Airtel Customer support via Twitter in India.

In this Post: We will retrieve the tweets, look at how to access the Twitter API and make the best use of the twitteR R package, and write these tweets to a file.

Important Note:

1. When you want to use searchTwitter, go to dev.twitter.com, open your application, go to the “Settings” tab and select “Read, Write and Access direct messages”. Make sure to click the save button after doing this.

Refer to this link http://stackoverflow.com/questions/15713073/twitter-help-unable-to-authorize-even-with-registering

2. If you get an SSL problem when searching with searchTwitter after the above step, make sure you have RCurl enabled and follow the steps outlined here: http://stackoverflow.com/questions/15347233/ssl-certificate-failed-for-twitter-in-r.

options(RCurlOptions = list(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")))

Also make sure you have loaded the necessary packages, such as ROAuth.

Step 1: Make sure you have done the OAuth authentication with Twitter using the previous post and the steps outlined above. You can also check the loaded libraries with sessionInfo().

Step 2: Load the tweets for the Twitter handle:

> airtel.tweets = searchTwitter("@airtel_presence", n=1500)

The Twitter API responded with 1499 tweets, which are now loaded into airtel.tweets. Next we will save them to a file for future processing.

Step 3: Before we write these tweets to a file, for better understanding we will look at some of the tweets and the data collected so far. head(airtel.tweets) provides the top 6 tweets. Further, we get the length of the list of tweets, check what class it is, and see how we can access the individual tweets. Look at the screenshot given below.

Step 4: We will look at some examples of how to access the Twitter data through the twitteR library by picking one tweet from the 1499 available. In the example given above we selected the 3rd item from the list and drilled down to the user information: how many friends he has, how many followers he has, etc. These things are vital to understand, as these factors affect whether a tweet can go viral and impact the image of a particular brand. Now we will move on to storing these tweets for further analysis.

Step 5: We will store the tweets we collected in airtel.tweets to a file for future analysis and reference. We are going to convert the list of tweets to separate data using apply functions and write it to a file, using the plyr library. plyr allows the user to split a data set apart into smaller subsets, apply methods to the subsets, and combine the results. Please click here for a detailed introduction to plyr. So we are converting the list to a data frame to prepare it to be written to a file.
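As a rough illustration of Steps 3 to 5, the sketch below inspects a list of tweets and then converts it to a data frame. It is a minimal sketch, not the code used in this post: mock.tweets and tweet.to.row are hypothetical names, the three tweets are made-up stand-ins for real twitteR status objects, and base R's lapply and rbind stand in for plyr so that it runs without any packages or Twitter credentials.

```r
# Hypothetical stand-ins for tweets; real twitteR status objects carry more fields.
mock.tweets <- list(
  list(text = "@airtel_presence no network since morning",
       screenName = "user_a", followersCount = 120, friendsCount = 80),
  list(text = "@airtel_presence thanks for the quick fix",
       screenName = "user_b", followersCount = 45, friendsCount = 60),
  list(text = "@airtel_presence my bill is wrong again",
       screenName = "user_c", followersCount = 900, friendsCount = 310)
)

# Step 3: basic inspection of the collection.
print(length(mock.tweets))   # how many tweets we hold
print(class(mock.tweets))    # what kind of object the collection is
print(head(mock.tweets, 2))  # first few entries

# Step 4: pick the 3rd item and look at its user-related fields.
third <- mock.tweets[[3]]
print(third$screenName)
print(third$followersCount)
print(third$friendsCount)

# Step 5: split-apply-combine. Turn each tweet into a one-row data frame,
# then stack the rows into a single data frame.
tweet.to.row <- function(t) {
  data.frame(text = t$text,
             screenName = t$screenName,
             followersCount = t$followersCount,
             friendsCount = t$friendsCount,
             stringsAsFactors = FALSE)
}
tweets.df <- do.call(rbind, lapply(mock.tweets, tweet.to.row))

str(tweets.df)
```

With plyr loaded, ldply(mock.tweets, tweet.to.row) should perform the same split, apply and combine in a single call.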
Now the tweets and all the necessary information are available in the tweets.df data frame. You can look at the screenshots below for its summary.

Step 6: Set up the working directory and write the tweets.df data frame to the file airteltweets.csv. You can verify the data in this file using Notepad++ or Excel.

In the next post we will look at how to do sentiment analysis with the data in this file.
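For Step 6, writing the data frame out and checking the result could look roughly like the sketch below. The two rows are made-up placeholder data, and the file is written to a temporary directory (an assumption made here so the example is safe to run anywhere) rather than to a working directory chosen with setwd().

```r
# Small hypothetical data frame standing in for tweets.df.
tweets.df <- data.frame(
  text = c("no network since morning, please help", "thanks, issue resolved"),
  screenName = c("user_a", "user_b"),
  stringsAsFactors = FALSE
)

# Build the output path; replace tempdir() with your own working directory.
out.file <- file.path(tempdir(), "airteltweets.csv")

# row.names = FALSE avoids an extra unnamed index column in the CSV.
write.csv(tweets.df, file = out.file, row.names = FALSE)

# Read the file back to verify what was written.
check.df <- read.csv(out.file, stringsAsFactors = FALSE)
print(dim(check.df))
print(check.df)
```

Reading the file back in R is an alternative to opening it in Notepad++ or Excel, and it also confirms that commas inside the tweet text were quoted correctly.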


10 thoughts on “Step by Step Sentiment analysis on Twitter data using R with Airtel Tweets: Part – II”

  1. Excellent post. I look forward to the next one, since I tried to implement SVM on some tweets but failed. I hope to see your solution.
    Thanks

    • Thanks for your appreciation. I will be doing the next post based on Jeffrey sentiment score formula after which I’m planning to expand upon Naive Bayes. The next post will be done this weekend.

  2. Pingback: Step by Step Sentiment analysis on Twitter data using R with Airtel Tweets: Part – III | My exploration in data analytics

  3. Hi Siva,
    Pretty interesting post. I am trying to get something done with Twitter using R but couldn't figure out a few basic things. I'm not a programmer, and R is probably where I have coded the most in my life (i.e. over the last 2 months), so excuse me if the queries are a bit too basic.

    cred <- OAuthFactory$new(consumerKey="Twitter user name",
    consumerSecret="twitter password",
    requestURL='https://api.twitter.com/oauth/request_token',
    accessURL='https://api.twitter.com/oauth/access_token',
    authURL='https://api.twitter.com/oauth/authorize')
    cred$handshake(cainfo="cacert.pem")

    I'm using the above to handshake with Twitter on my Windows 7 machine, but it returns unauthorized.
    One basic error could be that I'm using my general Twitter user name and password for consumerKey and consumerSecret. Is that correct, or should I go to a developer account on Twitter and ask for a token? I thought tokens are available only for developed applications, and I don't have any.

    would appreciate any help on this.

    Murali

    • Dear Murali,

      Thanks for your comments. You certainly need to get the tokens and credentials; your Twitter username and password will not work. Request tokens from Twitter and then you should be good to go. Follow these steps:
      a. Go to https://dev.twitter.com/
      b. Settings -> My Applications
      c. Create a new application

      Then you will get the details, such as consumerKey and consumerSecret, outlined in my blog post.

  4. Hello Siva, thanks for your excellent post. It is very helpful.

    I have one question about the searchTwitter() function. I feel that this function returns fewer tweets than actually exist. I checked on topsy.com, which returned more tweets than the searchTwitter function. Moreover, most of the time the number of tweets returned by this function is ‘some 99’, say 199 or 299 or 399 or … 1499, which looks suspicious to me.

    Thanks in advance

  5. Hi Guru,
    I have this sample data that I am working on using R. I want to load it into a data frame and perform k-means and a word cloud. How can I go about it? Are there any recommended packages to use?
    Thanks

    {“created_at”:”Wed Feb 27 14:24:12 +0000 2013″,”id”:306771719996186625,”id_str”:”306771719996186625″,”text”:”@Joeypearce we’ve got another bellend coming to see the car I’m having too help clean :-\/ I’ll see you when work ends ! X”,”source”:”\u003ca href=\”http:\/\/twitter.com\/download\/iphone\” rel=\”nofollow\”\u003eTwitter for iPhone\u003c\/a\u003e”,”truncated”:false,”in_reply_to_status_id”:306763650054627328,”in_reply_to_status_id_str”:”306763650054627328″,”in_reply_to_user_id”:127665137,”in_reply_to_user_id_str”:”127665137″,”in_reply_to_screen_name”:”Joeypearce”,”user”:{“id”:274997668,”id_str”:”274997668″,”name”:”Ell Beaton \u00a9″,”screen_name”:”Ell_Beaton”,”location”:””,”url”:null,”description”:”Go Glen, Or Go Home.”,”protected”:false,”followers_count”:147,”friends_count”:85,”listed_count”:0,”created_at”:”Thu Mar 31 12:44:39 +0000 2011″,”favourites_count”:132,”utc_offset”:0,”time_zone”:”London”,”geo_enabled”:true,”verified”:false,”statuses_count”:1087,”lang”:”en”,”contributors_enabled”:false,”is_translator”:false,”profile_background_color”:”1A1B1F”,”profile_background_image_url”:”http:\/\/a0.twimg.com\/profile_background_images\/768018009\/7a0b3fe303f234e8d6a5429bb9ede9a9.jpeg”,”profile_background_image_url_https”:”https:\/\/si0.twimg.com\/profile_background_images\/768018009\/7a0b3fe303f234e8d6a5429bb9ede9a9.jpeg”,”profile_background_tile”:true,”profile_image_url”:”http:\/\/a0.twimg.com\/profile_images\/3304123896\/606a7413bce208a1a38b1eb41fd017c9_normal.jpeg”,”profile_image_url_https”:”https:\/\/si0.twimg.com\/profile_images\/3304123896\/606a7413bce208a1a38b1eb41fd017c9_normal.jpeg”,”profile_banner_url”:”https:\/\/si0.twimg.com\/profile_banners\/274997668\/1361751912″,”profile_link_color”:”F50E0E”,”profile_sidebar_border_color”:”000000″,”profile_sidebar_fill_color”:”252429″,”profile_text_color”:”666666″,”profile_use_background_image”:true,”default_profile”:false,”default_profile_image”:false,”following”:null,”follow_request_sent”:null,”notifications”:n
ull},”geo”:{“type”:”Point”,”coordinates”:[52.43718380,-2.14324244]},”coordinates”:{“type”:”Point”,”coordinates”:[-2.14324244,52.43718380]},”place”:{“id”:”ddeec3dc241e5b6a”,”url”:”http:\/\/api.twitter.com\/1\/geo\/id\/ddeec3dc241e5b6a.json”,”place_type”:”city”,”name”:”Dudley”,”full_name”:”Dudley, Dudley”,”country_code”:”GB”,”country”:”United Kingdom”,”bounding_box”:{“type”:”Polygon”,”coordinates”:[[[-2.191947,52.426012],[-2.191947,52.558221],[-2.011849,52.558221],[-2.011849,52.426012]]]},”attributes”:{}},”contributors”:null,”retweet_count”:0,”entities”:{“hashtags”:[],”urls”:[],”user_mentions”:[{“screen_name”:”Joeypearce”,”name”:”Joey Pearce”,”id”:127665137,”id_str”:”127665137″,”indices”:[0,11]}]},”favorited”:false,”retweeted”:false,”filter_level”:”medium”}

    {“created_at”:”Wed Feb 27 14:24:12 +0000 2013″,”id”:306771720080064512,”id_str”:”306771720080064512″,”text”:”Broomfield passes ban on marijuana businesses: The Broomfield City Council has voted to temporarily ban… http:\/\/t.co\/VqgmuHVGW8″,”source”:”\u003ca href=\”http:\/\/dlvr.it\” rel=\”nofollow\”\u003edlvr.it\u003c\/a\u003e”,”truncated”:false,”in_reply_to_status_id”:null,”in_reply_to_status_id_str”:null,”in_reply_to_user_id”:null,”in_reply_to_user_id_str”:null,”in_reply_to_screen_name”:null,”user”:{“id”:70381121,”id_str”:”70381121″,”name”:”Denver CP”,”screen_name”:”DenverCP”,”location”:”Denver, CO”,”url”:”http:\/\/denver.cityandpress.com\/”,”description”:”Denver City And Press”,”protected”:false,”followers_count”:2069,”friends_count”:2,”listed_count”:80,”created_at”:”Mon Aug 31 12:32:37 +0000 2009″,”favourites_count”:0,”utc_offset”:-25200,”time_zone”:”Mountain Time (US & Canada)”,”geo_enabled”:true,”verified”:false,”statuses_count”:122230,”lang”:”en”,”contributors_enabled”:false,”is_translator”:false,”profile_background_color”:”FFFFFF”,”profile_background_image_url”:”http:\/\/a0.twimg.com\/images\/themes\/theme1\/bg.png”,”profile_background_image_url_https”:”https:\/\/si0.twimg.com\/images\/themes\/theme1\/bg.png”,”profile_background_tile”:false,”profile_image_url”:”http:\/\/a0.twimg.com\/profile_images\/391014951\/twitter-d_normal.gif”,”profile_image_url_https”:”https:\/\/si0.twimg.com\/profile_images\/391014951\/twitter-d_normal.gif”,”profile_link_color”:”57113B”,”profile_sidebar_border_color”:”FFFFFF”,”profile_sidebar_fill_color”:”FFFFFF”,”profile_text_color”:”2A2C31″,”profile_use_background_image”:false,”default_profile”:false,”default_profile_image”:false,”following”:null,”follow_request_sent”:null,”notifications”:null},”geo”:{“type”:”Point”,”coordinates”:[39.74601199,-104.99459343]},”coordinates”:{“type”:”Point”,”coordinates”:[-104.99459343,39.74601199]},”place”:{“id”:”b49b3053b5c25bf5″,”url”:”http:\/\/api.twitter.com\/1\/geo\/id\/b49b3053b5c25bf5.json
”,”place_type”:”city”,”name”:”Denver”,”full_name”:”Denver, CO”,”country_code”:”US”,”country”:”United States”,”bounding_box”:{“type”:”Polygon”,”coordinates”:[[[-105.109927,39.614337],[-105.109927,39.914247],[-104.600302,39.914247],[-104.600302,39.614337]]]},”attributes”:{}},”contributors”:null,”retweet_count”:0,”entities”:{“hashtags”:[],”urls”:[{“url”:”http:\/\/t.co\/VqgmuHVGW8″,”expanded_url”:”http:\/\/dlvr.it\/312b6y”,”display_url”:”dlvr.it\/312b6y”,”indices”:[107,129]}],”user_mentions”:[]},”favorited”:false,”retweeted”:false,”possibly_sensitive”:false,”filter_level”:”medium”}

    {“created_at”:”Wed Feb 27 14:24:12 +0000 2013″,”id”:306771720193339393,”id_str”:”306771720193339393″,”text”:”Geeee”,”source”:”\u003ca href=\”http:\/\/twitter.com\/download\/iphone\” rel=\”nofollow\”\u003eTwitter for iPhone\u003c\/a\u003e”,”truncated”:false,”in_reply_to_status_id”:null,”in_reply_to_status_id_str”:null,”in_reply_to_user_id”:null,”in_reply_to_user_id_str”:null,”in_reply_to_screen_name”:null,”user”:{“id”:274990882,”id_str”:”274990882″,”name”:”Anne Curtis-Smith “,”screen_name”:”Pawlavableeee”,”location”:”Sa ref”,”url”:”http:\/\/johannapawla.tumblr.com\/”,”description”:”Fil- Aussie Actress\/Host and Pursuer of dreams”,”protected”:false,”followers_count”:423,”friends_count”:145,”listed_count”:2,”created_at”:”Thu Mar 31 12:26:48 +0000 2011″,”favourites_count”:5063,”utc_offset”:-32400,”time_zone”:”Alaska”,”geo_enabled”:true,”verified”:false,”statuses_count”:23038,”lang”:”en”,”contributors_enabled”:false,”is_translator”:false,”profile_background_color”:”FFFFFF”,”profile_background_image_url”:”http:\/\/a0.twimg.com\/profile_background_images\/779953914\/44cd05b6ca02282caecf5656ed80dd16.png”,”profile_background_image_url_https”:”https:\/\/si0.twimg.com\/profile_background_images\/779953914\/44cd05b6ca02282caecf5656ed80dd16.png”,”profile_background_tile”:true,”profile_image_url”:”http:\/\/a0.twimg.com\/profile_images\/3300953457\/fce4b8629cf7c31aa7547c83723f8097_normal.jpeg”,”profile_image_url_https”:”https:\/\/si0.twimg.com\/profile_images\/3300953457\/fce4b8629cf7c31aa7547c83723f8097_normal.jpeg”,”profile_banner_url”:”https:\/\/si0.twimg.com\/profile_banners\/274990882\/1361611207″,”profile_link_color”:”5950FA”,”profile_sidebar_border_color”:”FFFFFF”,”profile_sidebar_fill_color”:”FFFFFF”,”profile_text_color”:”000000″,”profile_use_background_image”:true,”default_profile”:false,”default_profile_image”:false,”following”:null,”follow_request_sent”:null,”notifications”:null},”geo”:{“type”:”Point”,”coordinates”:[14.77285192,120.88028117]},”coordinates”:{“type”:”
Point”,”coordinates”:[120.88028117,14.77285192]},”place”:null,”contributors”:null,”retweet_count”:0,”entities”:{“hashtags”:[],”urls”:[],”user_mentions”:[]},”favorited”:false,”retweeted”:false,”filter_level”:”medium”}

    {“created_at”:”Wed Feb 27 14:24:12 +0000 2013″,”id”:306771720247853057,”id_str”:”306771720247853057″,”text”:”@cabiedas @Cuckisalam @MarinaBlancas jajajajajajajajjajajaja dios le han discriminado.. quiero.mi mural D:”,”source”:”\u003ca href=\”http:\/\/twitter.com\/download\/android\” rel=\”nofollow\”\u003eTwitter for Android\u003c\/a\u003e”,”truncated”:false,”in_reply_to_status_id”:306771396174950400,”in_reply_to_status_id_str”:”306771396174950400″,”in_reply_to_user_id”:557292576,”in_reply_to_user_id_str”:”557292576″,”in_reply_to_screen_name”:”cabiedas”,”user”:{“id”:570212330,”id_str”:”570212330″,”name”:”Iurdana Banana.”,”screen_name”:”IurdanaS”,”location”:””,”url”:null,”description”:”Sus almas estallaron en su primer beso. Podr\u00eda vivir en Pas de la Casa para siempre. ‘Iurdana the banana jelly bean’ Muy fan de @Iansomerhalder. “,”protected”:false,”followers_count”:183,”friends_count”:288,”listed_count”:0,”created_at”:”Thu May 03 19:12:03 +0000 2012″,”favourites_count”:23,”utc_offset”:null,”time_zone”:null,”geo_enabled”:true,”verified”:false,”statuses_count”:4324,”lang”:”es”,”contributors_enabled”:false,”is_translator”:false,”profile_background_color”:”C0DEED”,”profile_background_image_url”:”http:\/\/a0.twimg.com\/profile_background_images\/764631051\/e6d5e2e3770c9961b7cdbc9a0d32e85b.jpeg”,”profile_background_image_url_https”:”https:\/\/si0.twimg.com\/profile_background_images\/764631051\/e6d5e2e3770c9961b7cdbc9a0d32e85b.jpeg”,”profile_background_tile”:false,”profile_image_url”:”http:\/\/a0.twimg.com\/profile_images\/3234453149\/acee84d78f14de281914b928dfb8fc88_normal.jpeg”,”profile_image_url_https”:”https:\/\/si0.twimg.com\/profile_images\/3234453149\/acee84d78f14de281914b928dfb8fc88_normal.jpeg”,”profile_banner_url”:”https:\/\/si0.twimg.com\/profile_banners\/570212330\/1358272071″,”profile_link_color”:”0084B4″,”profile_sidebar_border_color”:”000000″,”profile_sidebar_fill_color”:”DDEEF6″,”profile_text_color”:”333333″,”profile_use_background_image”:true,”defa
ult_profile”:false,”default_profile_image”:false,”following”:null,”follow_request_sent”:null,”notifications”:null},”geo”:{“type”:”Point”,”coordinates”:[40.0317527,-3.5862202]},”coordinates”:{“type”:”Point”,”coordinates”:[-3.5862202,40.0317527]},”place”:{“id”:”55066be1721898d3″,”url”:”http:\/\/api.twitter.com\/1\/geo\/id\/55066be1721898d3.json”,”place_type”:”city”,”name”:”Aranjuez”,”full_name”:”Aranjuez, Madrid”,”country_code”:”ES”,”country”:”Espa\u00f1a”,”bounding_box”:{“type”:”Polygon”,”coordinates”:[[[-3.881487,39.884537],[-3.881487,40.130601],[-3.513540,40.130601],[-3.513540,39.884537]]]},”attributes”:{}},”contributors”:null,”retweet_count”:0,”entities”:{“hashtags”:[],”urls”:[],”user_mentions”:[{“screen_name”:”cabiedas”,”name”:”nerea “,”id”:557292576,”id_str”:”557292576″,”indices”:[0,9]},{“screen_name”:”Cuckisalam”,”name”:”Ipene”,”id”:374222901,”id_str”:”374222901″,”indices”:[10,21]},{“screen_name”:”MarinaBlancas”,”name”:”Marina Mandarina”,”id”:444668780,”id_str”:”444668780″,”indices”:[22,36]}]},”favorited”:false,”retweeted”:false,”filter_level”:”medium”}

    {“created_at”:”Wed Feb 27 14:24:12 +0000 2013″,”id”:306771720226881536,”id_str”:”306771720226881536″,”text”:”Mon CPE veut que je monte sur scene au spectacle de mon coll\u00e8ge avec ma gratte j’h\u00e9site :p”,”source”:”web”,”truncated”:false,”in_reply_to_status_id”:null,”in_reply_to_status_id_str”:null,”in_reply_to_user_id”:null,”in_reply_to_user_id_str”:null,”in_reply_to_screen_name”:null,”user”:{“id”:494426511,”id_str”:”494426511″,”name”:”MandyX \u2665 “,”screen_name”:”Amandiine81″,”location”:”France”,”url”:”http:\/\/www.youtube.com\/user\/Amandaaa81?feature=mhee”,”description”:”Salut moi c’est Amandine,Bernadette pour les intimes,j’ai 15 ans et je joue de la guitare depuis quelques ann\u00e9es =) \u2665 (FollowBackQueSurDemande)”,”protected”:false,”followers_count”:353,”friends_count”:352,”listed_count”:0,”created_at”:”Thu Feb 16 22:01:34 +0000 2012″,”favourites_count”:35,”utc_offset”:7200,”time_zone”:”Athens”,”geo_enabled”:true,”verified”:false,”statuses_count”:3977,”lang”:”fr”,”contributors_enabled”:false,”is_translator”:false,”profile_background_color”:”C0DEED”,”profile_background_image_url”:”http:\/\/a0.twimg.com\/profile_background_images\/793402427\/7d31005e211397910873672b680ed0a7.jpeg”,”profile_background_image_url_https”:”https:\/\/si0.twimg.com\/profile_background_images\/793402427\/7d31005e211397910873672b680ed0a7.jpeg”,”profile_background_tile”:true,”profile_image_url”:”http:\/\/a0.twimg.com\/profile_images\/3306268838\/f2c894ca14bea5f91950d5dad98eaa58_normal.jpeg”,”profile_image_url_https”:”https:\/\/si0.twimg.com\/profile_images\/3306268838\/f2c894ca14bea5f91950d5dad98eaa58_normal.jpeg”,”profile_banner_url”:”https:\/\/si0.twimg.com\/profile_banners\/494426511\/1361121861″,”profile_link_color”:”0084B4″,”profile_sidebar_border_color”:”FFFFFF”,”profile_sidebar_fill_color”:”DDEEF6″,”profile_text_color”:”333333″,”profile_use_background_image”:true,”default_profile”:false,”default_profile_image”:false,”following”:null,”follow_request_sent”:null,”not
ifications”:null},”geo”:null,”coordinates”:null,”place”:{“id”:”a3d6ae94e2cd1ecf”,”url”:”http:\/\/api.twitter.com\/1\/geo\/id\/a3d6ae94e2cd1ecf.json”,”place_type”:”city”,”name”:”Nav\u00e8s”,”full_name”:”Nav\u00e8s, Tarn”,”country_code”:”FR”,”country”:”France”,”bounding_box”:{“type”:”Polygon”,”coordinates”:[[[2.190748,43.544830],[2.190748,43.583220],[2.243599,43.583220],[2.243599,43.544830]]]},”attributes”:{}},”contributors”:null,”retweet_count”:0,”entities”:{“hashtags”:[],”urls”:[],”user_mentions”:[]},”favorited”:false,”retweeted”:false,”filter_level”:”medium”}
