The 100 Most Influential Twitter Users in Economics

News · 2015-04-09

Author: Anonymous

How can online influence on Twitter be measured? And how can we identify the 100 most influential Twitter accounts related to economics? To address these questions, we follow a methodology inspired by the working paper "Measuring User Influence in Twitter: The Million Follower Fallacy". We wrote a fairly simple Python program to extract data about follower relationships on Twitter, and we use an algorithm close to Google's PageRank to rank accounts by influence. We cluster the data using the Force Atlas layout algorithm and use Gephi to draw the wonderful graph you'll see at the end of this article. But how does it work, more precisely? Let's explain it step by step, in non-geek terms (if you just want to see the graph and/or the final list, you can skip directly to the end of this article).

Step 1 - Identify a list of five influential economists on Twitter: We first need to define by hand, and therefore somewhat subjectively, a closed list of five influential economists. We tried to be as objective as possible and finally chose the following five accounts (three Nobel Prize winners, and two economists with 70K+ and 300K+ followers respectively). This list can be criticized, but we find that our results are robust to the initial list used.

·Paul Krugman (@NYTimeskrugman)

·Joseph Stiglitz (@JosephEStiglitz)

·Robert Shiller (@RobertJShiller)

·Justin Wolfers (@JustinWolfers)

·Nouriel Roubini (@Roubini)

Step 2 - Extract all the accounts followed by those five accounts: We use the Twitter API to extract the Twitter IDs of all the accounts followed by the users in the list above, and we insert them into a database. For example, Justin Wolfers follows 587 other users, so we add all those users to our database. We did the same for Krugman (who follows only 2 users), Stiglitz (78), Shiller (23) and Roubini (381).
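
As a minimal sketch of this storage step, the follow relations can be kept as (follower, followed) pairs in a small SQLite table. The article does not describe the actual database or schema, so the table layout here is an assumption, and `fetch_following` with its made-up numeric IDs is a hypothetical stand-in for a real Twitter API call.

```python
import sqlite3

def fetch_following(user_id):
    """Hypothetical stand-in for a Twitter API call returning followed IDs."""
    fake_api = {1: [10, 11], 2: [10, 12, 13]}  # made-up IDs, for illustration only
    return fake_api.get(user_id, [])

def store_following(conn, seed_ids):
    """Insert one (follower, followed) row per follow relation."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS follows ("
        "follower INTEGER, followed INTEGER, "
        "PRIMARY KEY (follower, followed))"
    )
    for follower in seed_ids:
        rows = [(follower, followed) for followed in fetch_following(follower)]
        # INSERT OR IGNORE keeps the table free of duplicates on re-runs.
        conn.executemany("INSERT OR IGNORE INTO follows VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
store_following(conn, [1, 2])
edge_count = conn.execute("SELECT COUNT(*) FROM follows").fetchone()[0]
```

Storing one row per directed link means the same table can later be exported as an edge list for a graph tool.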

Step 3 - Identify the nineteen "most commonly followed users": We hypothesize that when influential users commonly follow another user, that user should also be influential. We identify the nineteen most commonly followed Twitter accounts and add them to our user database. Why 19? Simply because we will use 5 iterations (adding the 19 most commonly followed accounts 5 times), and 19*5 + 5 = 100. For example, Branko Milanovic (@BrankoMilan) is followed by Stiglitz, Roubini and Wolfers, so we add him to the list. Other influential accounts identified this way during the first iteration include Bradford DeLong (@delong), Austan Goolsbee (@Austan_Goolsbee), Richard Thaler (@R_Thaler), Jason Furman (@CEAChair), Project-Syndicate (@Prosyn) and the National Bureau of Economic Research (@nberpubs). The graph below shows the links between the 24 users after the first iteration. Links are directed (showing when one account follows another) and the size of a node depends on its number of incoming links.
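
The selection in this step can be sketched as a simple count of how many listed accounts follow each candidate. This is an illustration, not the original program: the function name, the alphabetical tie-breaking and the rule of skipping accounts already in the list are assumptions, and apart from the Milanovic example the toy data is invented.

```python
from collections import Counter

def most_commonly_followed(following, known, k):
    """Return the k accounts followed by the most accounts in `known`.

    `following` maps each account to the set of accounts it follows.
    Accounts already in `known` are skipped so each round adds new names.
    """
    counts = Counter()
    for account in known:
        for followed in following.get(account, ()):
            if followed not in known:
                counts[followed] += 1
    # Sort by descending count, then by name, so ties break deterministically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:k]]

# Toy data: only the Milanovic example comes from the article; the rest is invented.
following = {
    "JosephEStiglitz": {"BrankoMilan", "account_a"},
    "Roubini": {"BrankoMilan", "account_a", "account_b"},
    "JustinWolfers": {"BrankoMilan", "account_b", "account_c"},
}
seeds = set(following)
top_two = most_commonly_followed(following, seeds, k=2)
```

With this toy data, @BrankoMilan comes first because all three listed accounts follow him.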

Step 4 - Go back to Step 2 and extract information about the nineteen new accounts

Step 5 - Go back to Step 3 and add the "nineteen most commonly followed users": This time we use the list of 24 accounts from the previous graph instead of the initial list of 5 accounts. We repeat this again and again until we reach 100 accounts.
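
Steps 2 to 5 together form a loop, sketched below under stated assumptions. The `expand` function and its parameters are hypothetical names, `get_following` stands in for the Twitter API call, and the network is synthetic (every numbered account follows the next thirty), chosen only so that the 19*5 + 5 = 100 arithmetic can be checked.

```python
from collections import Counter

def expand(seeds, get_following, per_round, rounds):
    """Grow the seed list by adding the most commonly followed accounts each round."""
    known = list(seeds)
    for _ in range(rounds):
        # Count, for every account not yet listed, how many listed accounts follow it.
        counts = Counter(
            followed
            for account in known
            for followed in get_following(account)
            if followed not in known
        )
        # Highest count first; ties broken by account ID (an assumption).
        ranked = sorted(counts, key=lambda name: (-counts[name], name))
        known.extend(ranked[:per_round])
    return known

def get_following(account):
    """Synthetic stand-in for the API: account i follows i+1 .. i+30."""
    return range(account + 1, account + 31)

accounts = expand(seeds=range(5), get_following=get_following, per_round=19, rounds=5)
```

Starting from 5 seeds and adding 19 accounts in each of 5 rounds, `accounts` ends up with 100 entries.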

Step 6 - Open Gephi to apply a "PageRank-like" ranking, cluster the data and create a graph: And we're done! Here is the final graph, and below it the list of the 100 most influential accounts identified with our methodology. The size of a node represents influence, and the distance between two nodes depends on the similarity between accounts. It's far from perfect, of course: our initial goal was to identify "economists", and we ended up with quite a lot of journalists who write about economics and finance. But journalists are very active and influential on Twitter, so it's not a big surprise. All in all, we are pretty happy with the final list, as it is consistent with our (limited and biased) view of the "Twitter economic network". And the winners are...