Conversation AI

Conversation AI is a collaborative research effort exploring ML as a tool for better discussions online.

Research resources

This page outlines the high-level questions we are exploring. Further related research can be found on the WikiDetox outline of related work.

We also have a page that provides a brief introduction to ML to help make this research easier to understand.

Public datasets

One of our contributions is to produce datasets to support research:

Open-source code

We also have various public GitHub projects to support research into having better conversations online:

We also have some smaller projects that provide specific technical demonstration code for working with the Perspective API:

Lots more hacks built using our API can be found at the Perspective Hacks Gallery site, including:

Research questions

The following are our key research questions.

How might machine learning methods help online conversations?

We hypothesise that machine learning can help focus human attention more effectively. Our initial research is summarized in our blog post Algorithms And Insults: Scaling Up Our Understanding Of Harassment On Wikipedia.

How can machine learning help us understand harassment at scale?

In Ex Machina: Personal Attacks Seen at Scale, we outline how crowdsourcing and machine learning can be used to analyze personal attacks at scale, and we apply these methods to help understand the challenge on Wikipedia.

What role does toxic language have in reducing the number of viewpoints in a discussion?

What tools are needed to make robust open debate on important issues easier at scale?

How might machine learning based tools be used by communities, commenters, and authors?

We’ve developed an open-source moderation tool to enable the New York Times to spend more moderator time supporting their community.

What aspects of a conversation can machine learning understand?

Can machine learning methods understand the emotional impact of language?

How much of the structure of a conversation can machine learning approaches uncover?

What are the risks and challenges of using machine learning to assist online conversations?

What unintended and unfair biases might machine learning models contain? What impact might such biases have? What are the best ways to identify these biases, and what can be done to mitigate them?

How might machine learning based tools be gamed?

How might ML be misused to censor or reduce viewpoints in a conversation?

ML is a very general technology and can be used in many ways. Part of our research looks into possible misuses of such technology; for example, we’ve published a survey paper on Network Traffic Obfuscation and Automated Internet Censorship.