An overview of the AAS221 tweet stream

Here are a few features of the tweets from the 221st meeting of the American Astronomical Society.

The data collection has now stopped.

Time range (PST): to

There have been tweets and re-tweets made by people, of which only sent retweets ().

The most retweeted tweets

The retweet numbers are close to the values that Twitter has for the tweets, which means that I can't have messed up too much (I wouldn't expect perfect agreement since Twitter does not guarantee that any search will return all matching tweets).

A look at what's popular

The most re-tweeted accounts.

Twitter account | Number of re-tweets

The accounts that are mentioned the most.

Twitter account | Number of mentions

It is not really that surprising that the most re-tweeted accounts also appear in the most-mentioned table.

The accounts that are replied to the most (this excludes retweets).

Twitter account | Number of replies received

The accounts that have replied to the most tweets (this excludes retweets).

Twitter account | Number of replies made

What are the most-used hash tags?

HashTag | Number of mentions

Note that #aas221 and #hackaas have an advantage here since the search I used was for 'aas221', 'aas 221', or 'hackaas'.

What URLs are mentioned the most?
URL Number of mentions

The counts in this table jumped significantly (at least for the top URL) on February 12th, when I updated my code to calculate the "true" URL, i.e. the address reached after following through all the link-shorteners you find on Twitter. Some query strings that are common in URLs on Twitter can lead to "missing" counts, because the same page then appears under several different addresses, so I removed the following query terms for most URLs: utm_source, utm_medium, utm_campaign, goback and cid. For YouTube links I dropped the feature term, which is why the "Fund Me, Maybe" video just sneaks in at number 6. These terms were chosen after reviewing the URLs.
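As a rough illustration of the query-term clean-up described above, here is a minimal Python sketch. It is not the code used for this page: the names clean_url, TRACKING_TERMS and YOUTUBE_HOSTS are hypothetical, the list of YouTube host names is an assumption, and dropping feature in addition to the other terms for YouTube links is also an assumption. The step that follows link-shorteners to the final address is not shown.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Query terms named in the text; the constant and function names are illustrative.
TRACKING_TERMS = {"utm_source", "utm_medium", "utm_campaign", "goback", "cid"}

# Assumed set of host names that count as "YouTube links".
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def clean_url(url):
    """Return url with the tracking query terms removed.

    For YouTube hosts the 'feature' term is dropped as well
    (an assumption about how the two rules combine).
    """
    parts = urlsplit(url)
    drop = set(TRACKING_TERMS)
    if parts.netloc.lower() in YOUTUBE_HOSTS:
        drop.add("feature")
    # Keep every query term that is not in the drop list, in its original order.
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key not in drop]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

With this sketch, two tweets that link to the same page with and without utm_source=twitter are counted under one URL.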

What program or web site was used to tweet?
Publisher Number of tweets

Note that this table is created from the tweets () that included this information.

How many different programs were used by a user?
Number of programs Number of tweets

I have not (yet?) looked into the "multi-program" cases to see if it is people with multiple devices - perhaps a smart phone during the conference and a browser/desktop application back in the hotel room - or something else.
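The "number of programs" table can be built by collecting the distinct programs seen for each account and then counting how many accounts used one program, two programs, and so on. The sketch below assumes each tweet is a dictionary with hypothetical "user" and "source" keys; tweets without the program information are skipped, as in the publisher table above.

```python
from collections import Counter, defaultdict


def programs_per_user(tweets):
    """Map 'number of distinct programs' -> 'number of accounts'.

    Each tweet is assumed to be a dict with 'user' and 'source' keys;
    both key names are illustrative.
    """
    sources = defaultdict(set)
    for tweet in tweets:
        source = tweet.get("source")
        if source:  # skip tweets that lack the program information
            sources[tweet["user"]].add(source)
    return Counter(len(used) for used in sources.values())
```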

How chatty were people?

Below I show the distribution of the number of times an account tweeted (this excludes retweets made by the account). The graph has been cut off to focus on the low numbers, since there are some users with upwards of 100 tweets. The number of tweets used in this analysis is and the graph accounts for of this total.
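A distribution like this one can be sketched in a few lines of Python. The example assumes each tweet is a dictionary with hypothetical "user" and "is_retweet" keys, and the cutoff argument stands in for the cut-off used in the graph; none of these names come from the code behind this page.

```python
from collections import Counter


def tweets_per_account(tweets):
    """Count the original (non-retweet) tweets made by each account."""
    return Counter(t["user"] for t in tweets if not t["is_retweet"])


def count_distribution(per_account, cutoff):
    """Histogram of tweets-per-account, keeping only counts <= cutoff.

    Returns (histogram, tweets_shown, tweets_total) so that the fraction
    of tweets covered by the truncated graph can be reported.
    """
    hist = Counter(n for n in per_account.values() if n <= cutoff)
    total = sum(per_account.values())
    # Each bin n holds hist[n] accounts, which together made n * hist[n] tweets.
    shown = sum(n * accounts for n, accounts in hist.items())
    return dict(sorted(hist.items())), shown, total
```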

This plot shows the number of retweets by a user; as can be seen, most people made 1 retweet, although there is a significant fraction () who made none. There are users that have been excluded by the cut off at ; the maximum number of retweets by a user is .

How popular is retweeting?

Below I show the distribution of the number of times a tweet was re-tweeted. The graph has been cut off to focus on the low numbers; as shown above, there are tweets that have more than 70 retweets, but they are only of the population. The number of tweets used in this analysis is .

How long did it take for retweets to occur?

Here we look at the distribution of "retweet times" - i.e. the time between the original tweet and when it was re-tweeted. There are two graphs; the first is limited to one day, and so excludes of the distribution. The second graph shows all the tweets, but uses a much larger bin size. The number of retweets used in this analysis is .
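The two graphs differ only in the bin width and in whether a one-day limit is applied, which the following Python sketch tries to make concrete. It assumes the data are available as (original time, retweet time) pairs of datetime objects; the function names and the bin widths in the usage note are illustrative and are not the values used for the graphs.

```python
from datetime import datetime, timedelta


def retweet_delays(pairs):
    """Seconds between each original tweet and its retweet."""
    return [(retweeted - original).total_seconds()
            for original, retweeted in pairs]


def bin_delays(delays, bin_width, limit=None):
    """Histogram the delays (in seconds) into bins of bin_width seconds.

    Delays at or beyond limit are left out and counted separately.
    Returns (counts, n_excluded), where counts maps bin index -> number
    of retweets and bin i covers [i * bin_width, (i + 1) * bin_width).
    """
    counts = {}
    excluded = 0
    for delay in delays:
        if limit is not None and delay >= limit:
            excluded += 1
            continue
        index = int(delay // bin_width)
        counts[index] = counts.get(index, 0) + 1
    return counts, excluded
```

For example, bin_delays(delays, 3600, limit=86400) gives hourly bins restricted to the first day, while bin_delays(delays, 86400) bins every retweet by day.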

Would people be interested in seeing the same distribution but for replies to tweets (i.e. conversations)?

This is not a popularity contest

I am interested in seeing whether we can split up the population, so I wonder if the distribution of the number of followers versus the number of accounts followed ("follows") for each person may tell us something. Given the Twitter etiquette of reciprocity, I expected there to be a trend along the y=x line, shown here as the diagonal line. I have excluded accounts that have either no follows or no followers. Since the number of follows and followers for an account varies with time, I have used the maximum value for each account reported by the Twitter search.

The soft limit of 2000 follows can be seen as the upper bound for most of the users; I have added a blue rectangle to highlight the area where the number of follows is more than 2000 but the number of followers is less than 2000. The limit is explained in the Twitter documentation under "Are there additional limits if you are following 2000+ accounts?".
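The selection rules for this plot (take the maximum follows and followers seen for each account, drop accounts with a zero in either, and flag the blue-rectangle region) can be sketched as below. The record layout with "user", "follows" and "followers" keys, and both function names, are assumptions made for the example.

```python
def follow_points(records):
    """Map account -> (max follows, max followers) over all its tweets.

    Accounts with no follows or no followers are dropped. Each record is
    assumed to be a dict with 'user', 'follows' and 'followers' keys.
    """
    best = {}
    for record in records:
        follows, followers = best.get(record["user"], (0, 0))
        best[record["user"]] = (max(follows, record["follows"]),
                                max(followers, record["followers"]))
    return {user: point for user, point in best.items()
            if point[0] > 0 and point[1] > 0}


def in_blue_rectangle(follows, followers, limit=2000):
    """True when follows exceeds the limit but followers is below it."""
    return follows > limit and followers < limit
```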

Circle area is proportional to the number of


This visualization was created using the d3.js JavaScript library.