/random/blog/

My random Ramblings

Month: December, 2011

Hacking Twitter Data – Fun with Visualizing Twitter Data

So then, after playing around with the NetworkX graph library and Ubigraph visualization, I couldn’t help but feel excited about visualizing Twitter data. Twitter has a pretty neat API that gives you access to a wide variety of information. Suppose you want to search the Twitter universe for posts that contain the word ‘Metallica’: no problem, you can query it programmatically and retrieve the relevant results.

Tweets can be retweeted, and the number of retweets can be a measure of how much impact a tweet has. We can also identify tweeters who are highly influential in the Twitter sphere. Pretty recently, the song “Why this Kolaveri di” went viral beyond imaginable proportions, and I just wanted to visualize the tweeters who have tweeted about it (not that I am a fan of this song :)). So I used the Twitter API to search for some 10,000 tweets that had the word “kolaveri” and extracted all the tweets that had been retweeted. On Twitter, a tweet that has been retweeted is easy to identify, and the retweet relationship can be extracted with a simple regex (e.g. matching RT@David). I then modelled the retweet relationship using the NetworkX graph library in Python.
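The extraction step can be sketched roughly like this. It is a minimal sketch, not the original script: the (author, text) tweet format, the sample tweets, the function names, and the edge direction (retweeter pointing to the original author) are all assumptions made for illustration.

```python
import re

# Matches "RT @user" as well as "RT@user" and captures the user name.
RETWEET_RE = re.compile(r"\bRT\s*@(\w+)")


def retweet_edges(tweets):
    """Return (retweeter, original_author) pairs for every retweet found.

    `tweets` is an iterable of (author, text) pairs. That shape is an
    assumption for this sketch, not what the Twitter API returns.
    """
    edges = []
    for author, text in tweets:
        match = RETWEET_RE.search(text)
        if match:
            edges.append((author, match.group(1)))
    return edges


def build_graph(edges):
    """Model the retweet relationship as a NetworkX directed graph."""
    # Third-party import kept local so the regex part runs without NetworkX.
    import networkx as nx

    graph = nx.DiGraph()
    graph.add_edges_from(edges)
    return graph


# Made-up sample tweets, only to show the input and output shapes.
sample = [
    ("alice", "RT @fakingnews: kolaveri headline"),
    ("bob", "RT@fakingnews kolaveri again"),
    ("carol", "just heard kolaveri on the radio"),
]
edges = retweet_edges(sample)
```

Pointing each edge from the retweeter to the original author means a heavily retweeted user ends up with many incoming edges, which is handy later when looking for influential accounts.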

So then, let’s visualize what we have extracted and see if we can find something interesting in it. I had been blogging about how awesome Ubigraph is for visualizing complex networks and how easily it integrates with Python-based models. I wrote a simple script to connect the Ubigraph server (it runs as an XML-RPC server) with the NetworkX library. The visualized tweeter-retweeter relationship looked something like this :)))
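For the curious, the integration could look something like the sketch below. It is not the original script: it assumes Ubigraph is listening on its default XML-RPC endpoint (http://127.0.0.1:20738/RPC2), it uses Python 3's xmlrpc.client, and the helper names connect and draw_edges are made up for illustration.

```python
import xmlrpc.client

# Assumed default endpoint of a locally running Ubigraph server.
UBIGRAPH_URL = "http://127.0.0.1:20738/RPC2"


def connect(url=UBIGRAPH_URL):
    """Return the `ubigraph` namespace of a running Ubigraph server."""
    return xmlrpc.client.ServerProxy(url).ubigraph


def draw_edges(ubigraph, edges):
    """Send (source, target) name pairs to Ubigraph.

    Returns a dict mapping each user name to its Ubigraph vertex id.
    """
    ubigraph.clear()  # start from an empty scene
    vertex_ids = {}
    for source, target in edges:
        for name in (source, target):
            if name not in vertex_ids:
                vertex_id = ubigraph.new_vertex()
                ubigraph.set_vertex_attribute(vertex_id, "label", name)
                vertex_ids[name] = vertex_id
        ubigraph.new_edge(vertex_ids[source], vertex_ids[target])
    return vertex_ids


# Usage, with the Ubigraph server already running:
#   draw_edges(connect(), graph.edges())
# where `graph` is a NetworkX graph of the retweet relationship.
```

Passing the proxy in as an argument keeps the drawing logic separate from the network connection, so it can be tried out with a stand-in object when no server is running.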

Twitter Data Visualization

Looks pretty cool, huh :)). Now, if we focus on the topmost section of the graph, we can see a pretty dense node with a lot of edges going to it. That tweet originates from a user called @fakingnews, a famous satire website, and we can see that many users have retweeted fakingnews’s original tweet. In our case, the number of extracted nodes is pretty small compared to Twitter’s huge microblogosphere. Still, we can see how much fun visualizing complex networks is, as it gives great insight into the underlying interactions in continually evolving networks. Hopefully, in the future, I will post about some statistical measures of influence among users in Twitter networks.
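A dense node like that can also be found without looking at the picture, simply by counting incoming edges. Here is a minimal sketch, assuming the retweet relationship is stored as (retweeter, original_author) pairs; the sample pairs and the function name most_retweeted are made up for illustration.

```python
from collections import Counter


def most_retweeted(edges, top=3):
    """Count incoming retweet edges per user and return the `top` users."""
    in_degree = Counter(target for _source, target in edges)
    return in_degree.most_common(top)


# Made-up sample edges: three users retweet fakingnews, one retweets alice.
edges = [
    ("alice", "fakingnews"),
    ("bob", "fakingnews"),
    ("carol", "fakingnews"),
    ("dave", "alice"),
]
ranking = most_retweeted(edges)
```

On a NetworkX DiGraph built from the same pairs, the in_degree of each node gives the same counts.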

God Bless Python :)))


Playing around with Ubigraph-NetworkX Library in Python

I have been interested in social media visualization for quite some time, and I was looking for some great visualization tools for Python-based graphs. I was extremely impressed by Ubigraph’s visualization tools (http://ubietylab.net/ubigraph/). It has a pretty awesome graph visualization UI. The coolest thing, which Python users will love, is its integration with another awesome Python graph library, NetworkX (http://networkx.lanl.gov/), a great tool for modeling complex networks. These two tools are gonna make hacking social media data so much fun 🙂

In my next post we’ll use a simple script to visualize Twitter data as interactive graphs.