Project Light

December 9, 2018

Hate Speech Detector
and Analytic Tool

This tool extracts comments from news articles that match a set of filter keywords.
Each extracted comment is assigned a weighted score.

How Does it Work

The application runs behind the scenes against a collection of articles and their associated comments. Each article is checked against a list of keywords that indicate a topic likely to generate comments of interest. For each article identified this way, the application extracts all comments and ranks them according to the keywords found in them. A list of articles, ranked by the content of their comments, is then returned to the end user on the results page.
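
The flow described above can be pictured with a minimal Python sketch. This is not the application's actual code: the function names, the article dictionary layout (title, text, comments) and the simple "sum of weights" scoring are all assumptions made for illustration, and the real tool ranks with a BM25-based score rather than a plain sum.

```python
import re
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    # Lowercase the text and keep alphabetic tokens only (illustrative choice).
    return re.findall(r"[a-z']+", text.lower())


def article_matches(text: str, filter_keywords: List[str]) -> bool:
    # An article is of interest if it contains at least one filter keyword.
    tokens = set(tokenize(text))
    return any(keyword.lower() in tokens for keyword in filter_keywords)


def score_comment(comment: str, weights: Dict[str, int]) -> int:
    # Hypothetical scoring: add up the weight of every weighted keyword found.
    return sum(weights.get(token, 0) for token in tokenize(comment))


def rank_articles(
    articles: List[dict],
    filter_keywords: List[str],
    weights: Dict[str, int],
) -> List[Tuple[str, int]]:
    # Keep matching articles, score their comments, and sort highest first.
    ranked = []
    for article in articles:
        if not article_matches(article["text"], filter_keywords):
            continue
        total = sum(score_comment(c, weights) for c in article["comments"])
        ranked.append((article["title"], total))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked
```

Calling rank_articles with a list of article dictionaries, a list of filter keywords and a weights dictionary returns (title, score) pairs, with the articles whose comments carry the most keyword weight listed first.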

Currently, the application comes preconfigured to run against a set of articles and comments that have been scraped from the web, so the user can test its functionality.

How to Install

This application combines several technologies customized to the needs of our target users. The installation instructions let an end user download the source code and follow links to set up the application.

Adding Keywords

The "custom brew" of this application is its use of keyword weights. Normal search engines apply a calculated score, for example BM25, to rank documents. With weights, we allow the end user to control the impact of each keyword on a word-by-word basis. The weights are stored in the /text/weights.txt file in the folder structure. The format of this file is:

{'race':100,
'hate':100,
'kill':100, ...}
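
As a rough illustration of how a file in this format could be read, the sketch below parses it with Python's ast.literal_eval. It assumes the file holds one complete Python-style dictionary literal (the "..." above stands for further entries and would not parse as written). The function names parse_weights, load_weights and weighted_score are hypothetical, and summing the weights of matched words is an illustrative stand-in, not the application's actual BM25-based scoring.

```python
import ast
import re
from typing import Dict


def parse_weights(text: str) -> Dict[str, int]:
    # literal_eval only evaluates literals, so it is safer than eval().
    weights = ast.literal_eval(text)
    if not isinstance(weights, dict):
        raise ValueError("weights file must contain a dictionary literal")
    # Normalize keys to lowercase strings and values to integers.
    return {str(word).lower(): int(weight) for word, weight in weights.items()}


def load_weights(path: str) -> Dict[str, int]:
    # The path is supplied by the caller, e.g. the weights.txt file.
    with open(path, encoding="utf-8") as handle:
        return parse_weights(handle.read())


def weighted_score(comment: str, weights: Dict[str, int]) -> int:
    # Hypothetical scoring: each occurrence of a keyword adds its weight.
    tokens = re.findall(r"[a-z']+", comment.lower())
    return sum(weights.get(token, 0) for token in tokens)
```

Under this assumed scoring, raising a keyword's weight in the file raises the score of every comment that contains it, which is the word-by-word control the weights are meant to give.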

Please watch our team video for a demonstration of how to edit keywords, weights, queries and categories.
A link to the video can be found under the "How to Install" section of this user guide.

References

We are grateful to the people listed in our references for making their source code open to the public,
and we want to give a special shout-out to Nick Hirakawa for publishing the code we used as the base of our BM25.

Matching Analytics

Coming Soon!
