I became interested when the lecture mentioned Social Media Analytics on Twitter data. It described how Twitter data referring to cruise travel were collected and analysed. I therefore decided to collect Twitter data about the game Hearthstone to see what I could find.
1.) I started by searching for "mining Twitter data". Luckily, I found a website that teaches users to mine Twitter data step by step, and I followed its instructions.
2.) The first step was to register an application with Twitter. Then, from the Python console, I installed Tweepy, a Python library for the Twitter API, in order to collect data from Twitter.
3.) Then I collected the statuses on my own account's timeline. As my account only follows Hearthstone, the timeline only shows tweets about it. Of the two figures below, one shows the tweets as displayed on Twitter and the other shows the output of the Python program.
4.) After that, I started streaming tweets with the hashtag “#Hearthstone” and stored them in a file called “hearthstone.json”. The collection ran from Feb 22 19:31:44 +0000 2017 to Feb 23 00:19:50 +0000 2017, just under 5 hours (about 4 hours 48 minutes).
5.) Next I performed text pre-processing, that is, tokenising the input. Then I counted the five most frequent words in the collected Twitter data.
6.) However, without filtering out stop words (very common words such as "the" or "to"), the result was meaningless.
7.) After filtering out stop words, I obtained the result shown below.
“#Hearthstone” is the hashtag we were capturing, so its count, 241, can be read as the number of tweets collected. “7.1” is the version number of a game update. “Arena” is an in-game mode. “Https” suggests that many of the tweets contain a hyperlink to a website or photo. “@hearthstone_exp” may indicate that there are people who help others to gain exp in Hearthstone. This result is much more meaningful than the one above.
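The tokenising, stop-word filtering and counting in steps 5 to 7 can be sketched roughly as follows. This is a minimal illustration and not the exact program I ran: it assumes that “hearthstone.json” holds one JSON tweet per line with a "text" field, the names `tokenize`, `top_terms` and `load_texts` are hypothetical, the stop-word list is deliberately tiny, and the sample tweets are invented for the example. Unlike my result above, this tokeniser keeps each URL as a single token instead of splitting off “https”.

```python
import json
import re
from collections import Counter

# Assumption: a deliberately tiny stop-word list for illustration only;
# a real run would use a fuller list (for example from an NLP library).
STOP_WORDS = {"the", "a", "an", "to", "of", "in", "is", "and",
              "for", "on", "rt", "via", "i", "my", "it", "this"}

# Keep @mentions, #hashtags, URLs and dotted numbers (e.g. 7.1) as single tokens.
TOKEN_RE = re.compile(
    r"(?:@\w+)"              # @mentions
    r"|(?:#\w+)"             # hashtags
    r"|(?:https?://\S+)"     # URLs
    r"|(?:\d+(?:\.\d+)+)"    # dotted numbers such as version 7.1
    r"|(?:\w+(?:'\w+)?)"     # ordinary words, optionally with an apostrophe
)


def tokenize(text):
    """Split a tweet into lower-cased tokens."""
    return [token.lower() for token in TOKEN_RE.findall(text)]


def top_terms(texts, n=5, stop_words=STOP_WORDS):
    """Return the n most frequent tokens that are not stop words."""
    counts = Counter()
    for text in texts:
        counts.update(t for t in tokenize(text) if t not in stop_words)
    return counts.most_common(n)


def load_texts(path):
    """Yield the "text" field of each tweet, assuming one JSON object per line."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            tweet = json.loads(line)
            if "text" in tweet:
                yield tweet["text"]


# Invented sample tweets, used only to show the calls.
sample = [
    "RT New #Hearthstone patch 7.1 is live",
    "Playing Arena in #Hearthstone",
    "#Hearthstone 7.1 arena changes",
]
top = top_terms(sample, n=3)
print(top)
```

On the collected file, the same count would be `top_terms(load_texts("hearthstone.json"))`. Passing `stop_words=set()` shows the unfiltered ranking from step 6 for comparison.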
That is what I have done in mining Twitter data. Because of limited time, I performed word counts only. I believe there is room to do much more, for example term co-occurrences, sentiment analysis, geolocation and interactive maps. I feel that this is quite similar to our project. This is my first step in handling data from social media. I believe that analysing it can improve people's lives by making things more and more convenient.