It’s a few words restricted to 140 characters, but the reach and impact of a single tweet go far beyond that. With 313 million monthly active users, and 1 billion unique monthly visits to sites with embedded tweets in over 40 languages, Twitter’s influence is phenomenal, to say the least.
But given the laissez-faire nature of the medium, and its effect on impressionable minds, there is a real danger of it being misused, misconstrued and, even worse, used to create communal animosity.
IIIT-Hyderabad’s Information Retrieval and Extraction Lab (IREL) has been researching this quandary, applying natural language processing and semantics to develop an automated system using Artificial Intelligence chatterbots that can detect hate speech in tweets. As of now, the researchers are able to detect abusive language and sexist and racist speech, and to flag offensive content. This is useful not only for automatically filtering such content, but also for analysing public sentiment in user-generated content to get to the root of the problem.
To detect hate speech, they use a popular approach in machine learning called supervised learning. Essentially, a computer algorithm is fed many examples of text from each form of hate, such as tweets labelled ‘racist’ or ‘sexist’. The algorithm is designed in such a way that it ‘learns’ as it sees the data, and once training is complete, the programme is able to recognise racism or sexism in text it has not seen before. The algorithm uses neural networks, an approach more popularly known as deep learning. These algorithms are inspired by the human brain, and they try to simulate how humans learn from examples.
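The idea of learning from labelled examples can be illustrated with a minimal sketch. This is not the team’s system: it swaps the deep neural network for a much simpler multiclass perceptron over word counts, and the training data, the placeholder tokens (`term_r1`, `term_s1`, `group_a`) and the function names are all hypothetical, chosen only to show the train-then-predict cycle.

```python
from collections import defaultdict

# Hypothetical toy data. Placeholder tokens stand in for real abusive terms;
# a real system would be trained on thousands of annotated tweets.
TRAIN = [
    ("group_a people are term_r1", "racist"),
    ("term_r1 group_a go away", "racist"),
    ("women are term_s1", "sexist"),
    ("term_s1 girls cannot code", "sexist"),
    ("lovely weather in hyderabad today", "neither"),
    ("great poster session today", "neither"),
]


def features(text):
    """Split a tweet into lowercase word tokens (a bag-of-words view)."""
    return text.lower().split()


def predict(weights, labels, text):
    """Return the label whose weights give the tweet the highest score."""
    tokens = features(text)
    return max(labels, key=lambda lab: sum(weights[lab][t] for t in tokens))


def train(examples, epochs=10):
    """Multiclass perceptron: adjust weights only when a guess is wrong."""
    weights = defaultdict(lambda: defaultdict(float))  # label -> token -> weight
    labels = sorted({label for _, label in examples})
    for _ in range(epochs):
        for text, gold in examples:
            guess = predict(weights, labels, text)
            if guess != gold:
                for tok in features(text):
                    weights[gold][tok] += 1.0   # pull towards the correct label
                    weights[guess][tok] -= 1.0  # push away from the wrong one
    return weights, labels


weights, labels = train(TRAIN)
# Classify text that was not in the training set.
print(predict(weights, labels, "term_r1 people"))
print(predict(weights, labels, "term_s1 women"))
```

Each mistake during training nudges the word weights towards the correct label, which is the same learn-from-examples principle the article describes, only on a far smaller scale. Words the model has never seen contribute nothing to any score, which is one reason real systems need large training sets and richer models.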
This line of research is highly relevant in the current wave of social media, but it has its own challenges and complexities. For example, how does natural language processing decipher the various forms of hatred, identify the targets of such hatred and deconstruct the double entendres of a language?
Vasudeva Varma, Professor and Dean (R & D) at IIIT-Hyderabad, together with his students Pinkesh Badjatiya and Shashank Gupta and adjunct faculty member Manish Gupta, worked on this topic for close to a year and presented it at WWW 2017 in Perth earlier this month, where their poster on Deep Learning for Hate Speech Detection in Twitter was voted the best poster presentation among 166 submissions from around the world.