On April 10, 2018, Facebook CEO Mark Zuckerberg testified before the U.S. Senate Judiciary and Commerce Committees. The hearing was prompted by the revelation that Cambridge Analytica, a political consulting firm linked to the Trump campaign, had collected the data of an estimated 87 million Facebook users to psychologically profile voters during the 2016 election. During rigorous questioning from lawmakers, Zuckerberg was finally forced to answer questions about Facebook users’ privacy, the proliferation of “fake news,” Russian interference in the 2016 presidential election, and censorship of conservative media. He was also asked about one more hotly debated topic: hate speech.
The topic of hate speech has been gaining traction in international conversations as political elections become increasingly contentious (e.g., the 2016 Clinton-Trump U.S. presidential election) and as nationalist groups rise (e.g., the July 2017 white supremacist gathering in Charlottesville, Virginia, and anti-refugee language in Europe). In general, hate speech can be defined as speech that offends, insults, or degrades groups or individuals based on race, religion, nationality, sexual orientation, disability, gender, or other characteristics.
The online world, and social media platforms in particular, have become breeding grounds for the spread of hate speech and harmful content because they give users a global stage from which to distribute their messages widely. Facebook’s management is not blind to how the platform is being used, and thanks to a line of questioning from Sen. Ben Sasse (R., Neb.), this hearing finally created a space for Zuckerberg to address Facebook’s hate speech policies and content removal practices.
When first asked to define hate speech, Zuckerberg said it would be hard to pin down a specific definition and mentioned speech “calling for violence” as something Facebook does not tolerate. Well, Zuckerberg should be slightly miffed with whoever came up with his talking points, because Facebook does indeed define what it identifies as hate speech in its Community Standards. This definition includes “content that directly attacks people based on their: race, ethnicity, national origin, religious affiliation, sexual orientation, sex, gender, or gender identity, or serious disabilities and diseases.”
Zuckerberg continued to stumble and seemed to lack the necessary knowledge of his company’s hate speech policies and the complexities of its content removal process (shocker!). This is another situation where Zuckerberg should be side-eyeing whoever prepped him, because Facebook discusses this in its Statement of Rights and Responsibilities under Section Five, labeled “Protecting Other People’s Rights.” This states that Facebook “can remove any content or information [users] post on Facebook if [Facebook] believes that it violates the statement of [their] policy.”
Now, I will cut Marky Mark a little slack. I mean, he did start a company in his dorm room! However, he’s not the only one who doesn’t know all the facts. Like many other platforms, Facebook is not transparent about the specific guidelines it uses in its hate speech removal endeavors. Understandably, this is in large part because the company does not want people who post hate speech online to learn the specific guidelines in place and then develop strategies to evade them and keep posting hateful content.
Thanks to an investigative report done by ProPublica, any curious Googlers can now learn how Facebook handles hate speech. Facebook’s formula defines hate speech as an attack on a protected category, meaning one of the groups identified in its Community Standards. In the training guide given to content moderators, the types of attacks defined by Facebook include: calling for violence, calling for exclusion, calling for segregation, degrading generalization, dismissing, cursing, and slurs. Now, this is totally understandable, but as the report goes on things start to become a little fishy. These guidelines also note categories that are “unprotected,” which include social class, occupation, political ideology, continental origin, appearance, age, religions, and countries. So, if a post includes both a protected category and an unprotected category plus an attack, the post does not count as hate speech.
These guidelines are obviously flawed, and there have been multiple reported cases in which human reviewers incorrectly flagged frank conversations about sexism or racism as hate speech. Facebook banned comedian Marcia Belsky for 30 days after she commented “men are scum” in response to a friend’s photo album, and it penalized Boston poet and Black Lives Matter activist Didi Delgado for writing in a post, “All white people are racist. Start from this reference point, or you’ve already failed.” Because each of these posts addresses a protected category (gender and race, respectively), they are deemed hate speech. However, the statement “poor black people should still sit at the back of the bus” is not deemed hate speech because of its mention of social class (protected category + unprotected category + attack), despite its clearly hateful, racist sentiment.
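To see how odd this rule is, here is a minimal Python sketch of the formula as this post describes it. It is only an illustration under some loud assumptions: the function name, the idea of handing the rule a pre-labeled set of categories, and the exact category strings are all hypothetical, and nothing here is Facebook’s actual code or training material.

```python
# Toy version of the rule described above. All names are hypothetical;
# the category lists are paraphrased from the post, not from Facebook's code.
PROTECTED = {
    "race", "ethnicity", "national origin", "religious affiliation",
    "sexual orientation", "sex", "gender", "gender identity",
    "serious disability or disease",
}
UNPROTECTED = {
    "social class", "occupation", "political ideology", "continental origin",
    "appearance", "age", "religion", "country",
}


def is_hate_speech(categories, has_attack):
    """Apply the rule: an attack counts only if the targeted group is
    described by protected categories alone.

    `categories` is a set of category labels that a human assigned to the
    group the post targets (an assumption; the rule itself does no language
    understanding). `has_attack` says whether the post contains one of the
    listed attack types.
    """
    if not has_attack:
        return False
    targets = set(categories)
    has_protected = bool(targets & PROTECTED)
    has_unprotected = bool(targets & UNPROTECTED)
    # Adding any unprotected category removes the protection.
    return has_protected and not has_unprotected


# The three examples from the post, with hand-assigned categories.
print(is_hate_speech({"gender"}, has_attack=True))
print(is_hate_speech({"race"}, has_attack=True))
print(is_hate_speech({"race", "social class"}, has_attack=True))
```

Under these assumptions, the first two calls come back `True` and the third comes back `False`, which is exactly the mismatch described above: one extra unprotected label is enough to switch the rule off.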
So, what is the most logical solution to monitor and remove hate speech more effectively? Artificial intelligence, obviously. (This seems to be the answer to every single question these days.) In the April 10 hearing, Zuckerberg said he could see AI taking the primary role in automatically detecting hate speech on the platform in five to ten years. He states, “Hate speech—I am optimistic that over a five-to-10-year period we will have AI tools that can get into some of the linguistic nuances of different types of content to be more accurate, to be flagging things to our systems, but today we’re just not there on that.” 
How can this be done? My guess: deep learning and, more specifically, Facebook’s text understanding engine, DeepText. DeepText is a deep learning-based text understanding engine that uses several deep neural network architectures to “understand with near-human accuracy the textual content of several thousand posts per second, spanning more than 20 languages.” Facebook engineers taught DeepText to teach itself through unsupervised machine learning techniques in order to interpret the meaning of posts and comments. These techniques apply neural networks to unlabeled data sets of posts and comments to understand their meaning and sentiment. This allows Facebook to feed its computers raw text straight from users’ comments and posts, so that the machine can discern the relationships between words on its own and break down the intent, sentiment, and entities of human language.
Facebook has remained secretive about the automated methods it is developing and using to identify and remove hate speech. However, it is reasonable to speculate that the platform is using DeepText to streamline that process, because Instagram (which Facebook acquired in 2012) has revealed that it uses DeepText to combat the hateful and negative comments posted on its platform. After seeing DeepText succeed at removing spam, Instagram hired reviewers to analyze and rate roughly two million comments, determining whether each one was “mean spirited” and violated the platform’s community guidelines.
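For a rough feel of what “learning relationships between words from unlabeled text” can mean, here is a tiny Python sketch. To be clear about the assumptions: this is plain co-occurrence counting, not a neural network and certainly not DeepText, and the function names and the three sample posts are made up for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(posts):
    """Count how often each pair of words appears in the same post.

    No labels are involved: the only input is raw text.
    """
    counts = Counter()
    for post in posts:
        words = sorted(set(post.lower().split()))
        for a, b in combinations(words, 2):
            counts[(a, b)] += 1
    return counts


def related_words(word, counts, top_n=3):
    """Return the words that most often share a post with `word`."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a == word:
            scores[b] += n
        elif b == word:
            scores[a] += n
    return [w for w, _ in scores.most_common(top_n)]


# Made-up toy posts, purely for illustration.
posts = [
    "i love this pizza",
    "i love this movie",
    "this pizza is cold",
]
counts = cooccurrence_counts(posts)
print(related_words("pizza", counts, top_n=1))
```

A real system would learn dense word representations with neural networks over vastly more text, but the starting point is the same: patterns are pulled out of unlabeled posts rather than from human-written rules.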
Now don’t get me wrong, I think AI and machine learning, especially in the removal of hate speech, are truly fascinating, but can we really trust a machine to make calls on something as delicate as the nuances of language? I’m sure the computational linguists of the world would tell me that it’s more than possible and to kindly go home, but I’m still not totally convinced. Although deep learning methods have been argued to be evolving past Polanyi’s Paradox, I still think we need to seriously consider that machine learning systems often have low “interpretability,” especially with regard to extremely complex applications such as nuanced language. Someone who is much smarter and more capable than me echoes this sentiment. In an article from The Washington Post, Data & Society researcher Robyn Caplan states, “AI can’t understand the context of speech, and, since most categories for problematic speech are poorly defined [by necessity], having humans determine context is not only necessary but desirable.”
AI is amazing, and its capabilities and the ways it can be used are mind-blowing. But in some situations the computer scientists and folks from Silicon Valley are understating the problem (in this case, hate speech) and overstating AI’s ability to help. There are already reported instances of human content moderators censoring Facebook users, and depending on how the algorithm is trained, there could be huge censorship consequences, especially for minority voices, on a platform that was initially lauded as a way to connect with others and share ideas. Maybe, just maybe, there are some things that AI can’t fix.
Editor’s Note: To anyone from Mark Zuckerberg’s team who has somehow found this and is wildly impressed by my research abilities and wants to offer me an employment opportunity: I am open to a conversation.
1. Deborah Solomon, “Mark Zuckerberg Testimony: Senators Question Facebook’s Commitment to Privacy,” New York Times (April 10, 2018). https://www.nytimes.com/2018/04/10/us/politics/mark-zuckerberg-testimony.html
2. Meg Leta Jones, “Silencing Bad Bots: Global Legal and Political Questions for Machine Communication,” (2018). Communications Law & Policy, 23(2).
3. Mairead McArdle, “Sasse Grills Zuckerberg: If You’re Going to Police Hate Speech, Can You Define It?” National Review (April 10, 2018). https://www.nationalreview.com/news/ben-sasse-mark-zuckerberg-hate-speech-policy-definitions/
4. Community Standards, Facebook, https://www.facebook.com/communitystandards
5. Statement of Rights and Responsibilities, Facebook, https://www.facebook.com/terms.php; Last revised Jan. 30, 2015
6. Julia Angwin & Hannes Grassegger, “Facebook’s Secret Censorship Rules Protect White Men From Hate Speech But Not Black Children,” ProPublica, June 28, 2017, https://www.propublica.org/article/facebook-hate-speech-censorship-internal-documents-algorithms
8. Maya Kosoff, “Facebook is Losing the War on Hate Speech,” Vanity Fair (Jan. 2, 2018). https://www.vanityfair.com/news/2018/01/facebook-is-losing-the-war-on-hate-speech
9. Dave Gershgorn, “Mark Zuckerberg just gave a timeline for AI to take over detecting internet hate speech,” Quartz (April 10, 2018). https://qz.com/1249273/facebook-ceo-mark-zuckerberg-says-ai-will-detect-hate-speech-in-5-10-years/
10. Ahmad Abdulkader, Aparna Lakshmiratan, Joy Zhang, “Introducing DeepText: Facebook’s text understanding engine,” Facebook, June 1, 2016, https://code.facebook.com/posts/181565595577955/introducing-deeptext-facebook-s-text-understanding-engine/
11. Steven Max Patterson, “Understanding Deep Text, Facebook’s text understanding engine,” Network World (June 1, 2016) https://www.networkworld.com/article/3077998/internet/understanding-deep-text-facebooks-text-understanding-engine.html
13. Jonathan Vanian, “How Facebook Uses Artificial Intelligence to Teach Computers to Read,” Fortune, June 1, 2016, http://fortune.com/2016/06/01/facebook-artificial-intelligence/
14. Josh Constine & Kim-Mai Cutler, “Facebook Buys Instagram for $1 Billion, Turns Budding Rival into its Standalone Photo App,” TechCrunch, (Apr. 9, 2012), https://techcrunch.com/2012/04/09/facebook-to-acquire-instagram-for-1-billion/
15. Nicholas Thompson, “Instagram Unleashes an AI System to Blast Away Nasty Comments,” Wired, June 29, 2017, https://www.wired.com/story/instagram-launches-ai-system-to-blast-nasty-comments/
16. Drew Harwell, “AI Will Solve Facebook’s Most Vexing Problems, Mark Zuckerberg says. Just Don’t Ask When or How,” The Washington Post (April 11, 2018). https://www.washingtonpost.com/news/the-switch/wp/2018/04/11/ai-will-solve-facebooks-most-vexing-problems-mark-zuckerberg-says-just-dont-ask-when-or-how/?utm_term=.a1e6f7324901