Editor’s note: Examples of profanity in this article are represented by the word “profanity” in order to remain inclusive and appropriate for all audiences.
Detecting and filtering profanity is a task you are bound to run into while building applications where users post (or interact with) text. These can be social media apps, comment sections, or game chat rooms, just to name a few.
Being able to detect profanity so you can filter it out is key to keeping communication spaces safe and, if your app requires it, age-appropriate.
This tutorial will guide you through building a GraphQL API to detect and filter profanity with Python and Flask. If you are only interested in the code, you can visit this GitHub repo for the demo application’s source code.
Prerequisites
To follow and understand this tutorial, you will need the following:
- Python 3.7 or later installed on your machine
- Basic knowledge of Flask
- Basic knowledge of GraphQL
- A text editor
What is profanity?
Profanity (also known as curse words or swear words) refers to the offensive, impolite, or rude use of words and language. It is also often used to express strong feelings about something. Profanity can make online spaces feel hostile to users, which is undesirable for an app designed for a wide audience.
Which words qualify as profanity is at your discretion. This tutorial will explain how to filter words individually, so you have control over what type of language is allowed in your app.
What is a profanity filter?
A profanity filter is a software or application that helps detect, filter, or modify words considered profane in communication spaces.
Why do we detect and filter profanity?
- To foster healthy interactions between people, especially when children are involved
- To improve social experiences by creating a positive environment for people to communicate
- To add an extra layer of security to user communities
- To automatically block and filter unwanted content from communication spaces
- To reduce the need for manual user moderation in online communities
Common problems faced when detecting profanity
- Users might start using language subversions to get around filters
- Users might start manipulating the language by replacing letters with numbers and Unicode characters or creatively misspelling words to bypass filters
- Profanity filters might fail to consider context while filtering content
- Profanity filters often create false positives while filtering, e.g., the Scunthorpe problem
Detecting profanity with Python
Using Python, let’s build an application that tells us whether a given string contains profanity, and then filters it out.
Creating a word-list-based profanity detector
To build our profanity filter, we will define a list of unaccepted words, then check whether a given string contains any of them. If profanity is detected, we will replace the profane word with censoring characters.
Create a file named filter.py and save the following code in it:
```python
def filter_profanity(sentence):
    wordlist = ["profanity1", "profanity2", "profanity3", "profanity4",
                "profanity5", "profanity6", "profanity7", "profanity8"]
    for word in sentence.split():
        # Compare case-insensitively so the rest of the sentence keeps its casing
        if word.lower() in wordlist:
            sentence = sentence.replace(word, "****")
    return sentence
```
Testing our word-list-based filter
If you were to pass the following arguments to the function above:
filter_profanity("profane insult") filter_profanity("this is a profane word") filter_profanity("Don't use profane language")
You would get the following results:
```
**** ****
this is a **** word
Don't use **** language
```
However, this approach has many problems, ranging from being unable to detect profanity outside its word list to being easily fooled by misspellings or word padding. It also requires us to maintain the word list ourselves, which adds even more work on top of the problems we already have. How do we improve what we have?
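Before we do, here is a concrete illustration of the weakness: a simple character substitution slips straight past the word list (still using the placeholder word list above):

```python
# "pr0fanity1" is not in the word list, so nothing gets censored
filter_profanity("pr0fanity1 is everywhere")
# 'pr0fanity1 is everywhere'
```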
Using the better-profanity Python library to improve our filter
Better-profanity is a blazingly fast Python library for checking for (and cleaning up) profanity in strings. It supports custom word lists, safelists, detection of modified word spellings (also known as leetspeak, where letters are replaced with numbers) and Unicode characters, and even multilingual profanity detection.
Installing the better-profanity library
To get started with better-profanity, you must first install the library via pip.
In the terminal, type:
pip install better-profanity
Integrating better-profanity into our filter
Now, update the filter.py file with the following code:
```python
from better_profanity import profanity

# Load the library's default list of censor words
profanity.load_censor_words()


def filter_profanity(sentence):
    return profanity.censor(sentence)
```
Testing the better-profanity-based filter
If you were to pass the following arguments once again to the function above:
filter_profanity("profane word") filter_profanity("you are a profane word") filter_profanity("Don't be profane")
You would get the following results, as expected:
```
**** ****
you are a **** ****
Don't be ****
```
As I mentioned previously, better-profanity supports detecting profanity in modified word spellings, so the following examples will be censored accurately:
filter_profanity("pr0f4ne 1n5ult") # ******* ****** filter_profanity("you are Pr0F4N3") # you are *******
Better-profanity can also tell you whether a string contains profanity. To do this, use:
```python
profanity.contains_profanity("Pr0f4ni7y")  # True
profanity.contains_profanity("hello world")  # False
```
Better-profanity also allows us to provide the character to censor profanity with. To do this, use:
profanity.censor("profanity", "@") # @@@@ profanity.censor("you smell like profanity", "&") # you smell like &&&&
Building a GraphQL API for our filter
We have created a Python script to detect and filter profanity, but on its own it isn’t very useful, because no other platform can consume our service. We’ll need to build a GraphQL API for our profanity filter with Flask, so we can call it an actual application and use it somewhere other than a Python environment.
Installing the application requirements
To get started, you must first install a couple of libraries via pip.
In the terminal, type:
pip install Flask Flask_GraphQL graphene
Writing the application’s GraphQL schemas
Next, let’s write our GraphQL schemas for the API. Create a file named schema.py and save the following code in it:
```python
import graphene
from better_profanity import profanity

# Load the default censor word list once, when the module is imported
profanity.load_censor_words()


class Result(graphene.ObjectType):
    # graphene exposes these snake_case fields as camelCase
    # (sentence, isProfane, censoredSentence) in the GraphQL schema
    sentence = graphene.String()
    is_profane = graphene.Boolean()
    censored_sentence = graphene.String()


class Query(graphene.ObjectType):
    detect_profanity = graphene.Field(
        Result,
        sentence=graphene.String(required=True),
        character=graphene.String(default_value="*"),
    )

    def resolve_detect_profanity(self, info, sentence, character):
        is_profane = profanity.contains_profanity(sentence)
        censored_sentence = profanity.censor(sentence, character)
        return Result(
            sentence=sentence,
            is_profane=is_profane,
            censored_sentence=censored_sentence,
        )


schema = graphene.Schema(query=Query)
```
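Before wiring the schema into Flask, you can sanity-check it directly from a Python shell. The snippet below is an optional sketch that uses graphene’s execute method; the exact output depends on your censor character and word list:

```python
# Optional: run a query against the schema without a web server
from schema import schema

result = schema.execute(
    '{ detectProfanity(sentence: "profanity") { isProfane censoredSentence } }'
)
print(result.errors)  # None if the query succeeded
print(result.data)    # e.g. {'detectProfanity': {'isProfane': True, 'censoredSentence': '****'}}
```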
Configuring our application server for GraphQL
After that, create another file named server.py and save the following code in it:
```python
from flask import Flask
from flask_graphql import GraphQLView

from schema import schema

app = Flask(__name__)

# Serve the GraphQL API (and the GraphiQL IDE) at the root URL
app.add_url_rule(
    "/",
    view_func=GraphQLView.as_view("graphql", schema=schema, graphiql=True),
)

if __name__ == "__main__":
    app.run(debug=True)
```
Running the GraphQL server
To run the server, execute the server.py script.
In the terminal, type:
python server.py
Your terminal should look like the following:
Testing the GraphQL API
After running the server.py file in the terminal, head to your browser and open http://127.0.0.1:5000. You should see the GraphiQL interface, similar to the image below:
We can proceed to test the API by running a query like the one below in the GraphiQL interface:
```graphql
{
  detectProfanity(sentence: "profanity!") {
    sentence
    isProfane
    censoredSentence
  }
}
```
The result should be similar to the images below:
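If you’d rather test the endpoint outside of GraphiQL, here is a minimal sketch that sends the same query over HTTP using the requests library (an extra dependency not used elsewhere in this tutorial; install it with pip install requests):

```python
import requests

# Assumes the Flask server from server.py is running locally on port 5000
QUERY = """
{
  detectProfanity(sentence: "profanity!") {
    sentence
    isProfane
    censoredSentence
  }
}
"""

response = requests.post("http://127.0.0.1:5000/", json={"query": QUERY})
print(response.json())
# Expected shape: {"data": {"detectProfanity": {...}}}
```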
Conclusion
This article taught us about profanity detection, its importance, and its implementation. In addition, we saw how easy it is to build a profanity detection API with Python, Flask, and GraphQL.
The source code of the GraphQL API is available on GitHub. You can learn more about the better-profanity Python library from its official documentation.