This page outlines the high-level questions we are exploring. Further related research can be found on the WikiDetox outline of related work.
We also have a page that provides a brief introduction to ML to help make this research easier to understand.
One of our contributions is to produce datasets to support research:
We also have various public GitHub projects to support research into having better conversations online:
We also have some smaller projects that provide specific technical demonstration code for working with the Perspective API:
Lots more hacks built using our API can be found at the Perspective Hacks Gallery site, including:
The following are our key research questions.
We hypothesise that machine learning can help focus human attention more effectively. Our initial research is summarized in our blog post Algorithms And Insults: Scaling Up Our Understanding Of Harassment On Wikipedia.
In Ex Machina: Personal Attacks Seen at Scale, we outline how crowdsourcing and machine learning can be used to analyze personal attacks at scale, and apply these methods to help understand the challenge on Wikipedia.
Can machine learning methods understand the emotional impact of language?
How much of the structure of a conversation can machine learning approaches uncover?
What unintended and unfair biases might machine learning models contain? What impact might such biases have? What are the best ways to identify these biases? And what can be done to mitigate them?
See Attacking discrimination with smarter machine learning for a great introduction to the problem. Challenges related to machine learning bias, fairness, and algorithmic bias are outlined further on our unintended bias page.
In Measuring and Mitigating Unintended Bias for Text Classification we developed methods for measuring the unintended bias in a text classifier with respect to terms that appear in the text, as well as approaches to help mitigate it.
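The core measurement idea can be sketched with a toy template test: fill neutral sentence templates with different identity terms and compare the scores a classifier assigns to each variant. The templates, terms, and the `score_toxicity` stand-in below are invented for illustration; a real evaluation would query a trained model.

```python
# Hypothetical sketch of template-based bias measurement. A fair model
# should score all fillings of a neutral template similarly; a large gap
# between terms indicates unintended bias tied to those terms.

TEMPLATES = ["I am a {} person.", "Being {} is normal."]
IDENTITY_TERMS = ["tall", "short", "gay", "straight"]

def score_toxicity(text):
    # Dummy scorer standing in for a trained classifier: it over-flags one
    # identity term, which is exactly the failure the measurement surfaces.
    return 0.9 if "gay" in text else 0.1

def per_term_scores(templates, terms, scorer):
    """Average score each identity term receives across neutral templates."""
    return {
        term: sum(scorer(t.format(term)) for t in templates) / len(templates)
        for term in terms
    }

scores = per_term_scores(TEMPLATES, IDENTITY_TERMS, score_toxicity)
print(scores)  # the biased term stands out against the others
```

Comparing per-term score distributions on such synthetic, non-toxic sentences is one simple way to make unintended bias visible before deploying a model.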
Word-based models, including the CNN we developed for toxicity, can be tricked easily by creative misspellings. Using character-level models can help address this, but such models require more data and their training suffers from the vanishing gradient problem.
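A minimal illustration of the brittleness (not our production model): in a word-level pipeline, a creatively misspelled token falls outside the vocabulary and is mapped to an unknown-token id, so the toxic signal never reaches the model. The tiny vocabulary here is invented for the example.

```python
# Toy word-level tokenizer: out-of-vocabulary words collapse to UNK,
# which is how a single character swap can hide a toxic word entirely.

VOCAB = {"you": 1, "are": 2, "an": 3, "idiot": 4}
UNK = 0  # id assigned to any out-of-vocabulary token

def tokenize(text):
    return [VOCAB.get(word, UNK) for word in text.lower().split()]

print(tokenize("you are an idiot"))  # [1, 2, 3, 4] -- toxic word visible
print(tokenize("you are an idi0t"))  # [1, 2, 3, 0] -- signal lost to UNK
```

A character-level model still sees most of the original characters of "idi0t", which is why it is more robust to this kind of perturbation.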
Models based on character-level ngrams fed into feed-forward networks, like our TOXICITY_FAST model, can be easily gamed by appending additional ngrams after the initial comment that counter the signal from the problematic ngrams. This can be addressed by using RNNs and CNNs, like our TOXICITY model, which take into account more of the textual context.
The practical impact of gaming ML models is an open research question, and is likely to depend on the way the ML is applied. Moreover, there are different threat models for different applications of ML: gaming an authorship experience is quite different from gaming a system that receives suggestions and considers retraining on them.
ML is a very general technology and can be used in many ways. Part of our research is into possible misuses of such technology; for example, we have published a survey paper on Network Traffic Obfuscation and Automated Internet Censorship.