Elon Musk has partly delivered on his promise of making Twitter’s algorithm open source. On April 1, the micro-blogging site’s recommendation code was made public. Developers can now modify and make improvements to the code. Mr. Musk’s long-awaited decision has brought him both boos and cheers from different corners of the developer community.
At a high level, Twitter is a social media platform where millions of users post tweets and like, share and often reply to them. The company makes money by selling ad space to advertisers. But a deeper look into the platform reveals a complex system at play.
At the heart of Twitter’s business model is the recommendation algorithm. It is a set of rules that enables the platform to deliver content based on users’ interests and preferences. It is with the help of this system that advertisers promote their brands. Content flows into a user’s timeline through two channels. One pipeline channels content posted by people that the user follows and the other is filled with posts flowing from accounts that could potentially interest the user.
The latter timeline, called ‘For You’, acts as a predictive tool to suggest tweets that a user may be interested in. It helps the micro-blogging site answer questions such as: what is the probability that a user will engage with another user in the future, which communities on Twitter might a user be interested in, and what tweets are trending within them? Answers to such questions help the platform recommend relevant content.
The recommendation system
Twitter’s recommendation algorithm runs on a three-step process. First, it fetches tweets from multiple recommendation sources. The platform calls this process ‘candidate sourcing’. Next, a machine learning model ranks the sourced tweets. Finally, the ranked tweets are filtered to remove those from accounts that a user has blocked and those the user has already seen.
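The three steps can be pictured as a small pipeline. The sketch below is illustrative only and is not Twitter's code: the `Tweet` class, the function names and the precomputed `score` field are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tweet:
    tweet_id: int
    author: str
    score: float  # hypothetical stand-in for the ranking model's output


def source_candidates(sources):
    """Step 1: gather candidate tweets from every recommendation source."""
    return [tweet for source in sources for tweet in source]


def rank(candidates):
    """Step 2: order candidates by score, highest first."""
    return sorted(candidates, key=lambda t: t.score, reverse=True)


def filter_tweets(ranked, blocked_authors, seen_ids):
    """Step 3: drop tweets from blocked accounts and tweets already seen."""
    return [
        t for t in ranked
        if t.author not in blocked_authors and t.tweet_id not in seen_ids
    ]


def build_timeline(sources, blocked_authors, seen_ids):
    """Run the three steps in order: source, rank, filter."""
    candidates = source_candidates(sources)
    return filter_tweets(rank(candidates), blocked_authors, seen_ids)


# Toy data: two sources, one blocked account, one tweet already seen.
in_network = [Tweet(1, "alice", 0.9), Tweet(2, "bob", 0.4)]
out_network = [Tweet(3, "carol", 0.7), Tweet(4, "mallory", 0.8)]
timeline = build_timeline([in_network, out_network], {"mallory"}, {2})
print([t.tweet_id for t in timeline])  # [1, 3]
```

In this toy run, tweet 4 is dropped because its author is blocked and tweet 2 because it has already been seen, leaving tweets 1 and 3 in ranked order.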
At the sourcing stage, the algorithm mixes tweets that eventually flow into a user’s ‘For You’ timeline. To mix, it picks ‘candidates’ from people a user follows, and from those they do not. It calls these two sources ‘in-network’ and ‘out-network’, and each makes up roughly half of the mix. Twitter says that for each request it attempts to extract the top 1,500 tweets from a pool of hundreds of millions. The in-network part is relatively easy to build, as tweets are picked in real time from people a user follows. But out-network sourcing is trickier, as the platform must pick content from accounts the user does not follow.
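A rough way to picture the mixing step is to take the best-scoring candidates from each source until a cap is reached. The sketch below is an assumption-laden illustration, not Twitter's method: the `mix_candidates` function, the per-candidate relevance number and the strict per-source quota are all hypothetical.

```python
import heapq


def mix_candidates(in_network, out_network, limit=1500, in_share=0.5):
    """Pick up to `limit` (tweet_id, relevance) pairs, split between sources.

    The fixed quota per source is an assumption made for this sketch.
    """
    in_quota = int(limit * in_share)
    out_quota = limit - in_quota
    top_in = heapq.nlargest(in_quota, in_network, key=lambda c: c[1])
    top_out = heapq.nlargest(out_quota, out_network, key=lambda c: c[1])
    return top_in + top_out


# Toy pools of five candidates each, with a cap of four instead of 1,500.
in_pool = [(f"in-{i}", i / 10) for i in range(5)]
out_pool = [(f"out-{i}", i / 10) for i in range(5)]
mixed = mix_candidates(in_pool, out_pool, limit=4)
print([tweet_id for tweet_id, _ in mixed])
```

With a cap of four and an even split, the example keeps the two most relevant candidates from each pool.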
To do this, Twitter uses what it calls the social graph and embedding spaces. The former creates a stream of candidates based on the content that people the user follows, or users with similar tastes, engage with. The latter matches the profile of a user with a cluster of users that exhibits similar interests and preferences.
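Matching a user to a cluster of similar interests is commonly done by representing both as numeric vectors and comparing them. The sketch below uses cosine similarity for this; the three-dimensional vectors, the cluster names and the choice of cosine similarity are assumptions for illustration and say nothing about how Twitter actually builds its clusters.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def closest_cluster(user_vector, clusters):
    """Return the name of the cluster whose vector is most similar to the user's."""
    return max(
        clusters,
        key=lambda name: cosine_similarity(user_vector, clusters[name]),
    )


# Hypothetical interest clusters and a hypothetical user profile.
clusters = {
    "news": [0.9, 0.1, 0.0],
    "pop": [0.1, 0.9, 0.2],
    "cricket": [0.0, 0.2, 0.9],
}
user = [0.2, 0.1, 0.8]
best = closest_cluster(user, clusters)
print(best)  # cricket
```

Tweets popular within the matched cluster could then be offered as out-network candidates.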
Once this is done, the platform ranks the candidate tweets using a 48-million-parameter neural network that is continuously trained on tweet engagement.
“This ranking mechanism takes into account thousands of features and outputs ten labels to give each tweet a score, where each label represents the probability of an engagement,” Twitter notes in its blog post.
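One plausible way to turn several engagement probabilities into a single score is a weighted sum. The sketch below assumes that approach; the three label names and their weights are invented for the example (the model described above outputs ten labels), and none of the numbers come from Twitter.

```python
# Hypothetical weights: positive engagements raise the score, a report lowers it.
WEIGHTS = {"like": 0.5, "reply": 2.0, "report": -10.0}


def tweet_score(probabilities, weights=WEIGHTS):
    """Combine per-label engagement probabilities into one score."""
    return sum(weights[label] * p for label, p in probabilities.items())


# Invented model outputs for three tweets.
predictions = {
    "t1": {"like": 0.6, "reply": 0.1, "report": 0.0},
    "t2": {"like": 0.2, "reply": 0.3, "report": 0.0},
    "t3": {"like": 0.8, "reply": 0.2, "report": 0.05},
}
ranked = sorted(predictions, key=lambda t: tweet_score(predictions[t]), reverse=True)
print(ranked)  # ['t2', 't1', 't3']
```

Here the tweet most likely to draw replies ranks first, while the tweet with the most likes falls to last place because of its small chance of being reported.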
After Twitter open sourced its recommendation algorithm, many people flocked to GitHub to view the code. Some see this reveal as “a step in the right direction for the future of humanity.” Others note that the code does not reveal much about how it is used by the platform. They also highlight that important bits of information have been left out. For instance, the absence of information on the data it uses to build these pipelines prevents one from having a complete picture of the platform’s recommendation system. A report by Fortune, citing a former Twitter executive, points out that open sourcing any algorithm requires its training set to be open sourced as well. And that is impossible for Twitter to do. “Every effort in open-sourcing the algorithm without the data is completely dishonest,” the executive said.
Musk’s business plan?
Mr. Musk is no philanthropist. He is gradually building Twitter as a place for privileged users who can pay for verification tick marks and get additional features, including a higher ranking in the feeds. Additionally, he is making these changes at a time when he has fired most of the company’s technical staff.
Social media platforms need experienced developers to keep building new features and deploying them successfully. So, perhaps, Mr. Musk thinks opening the source code to external developers could solve the human resource bottleneck. But it will be a tough road as Mr. Musk has damaged Twitter’s reputation in the open-source community. As Will Norris, Twitter’s former open-source lead, told ZDNet, “They’ve lost all credibility as a serious engineering organisation, I don’t care how much you call yourself ‘hardcore’. Open-source communities are built on relationships and trust, and now Twitter has neither with these groups. They’ve lost any ability to participate meaningfully in those communities.”