How do you handle removal of unwanted content (urls, html) from user input?


Mateusz Kurowski-2
I would like to remove URLs and HTML from user input. For HTML there is the bleach library, which someone linked today on IRC, and I made a small form mixin around it (included at the end of this message).
But now I wonder whether there are better ways to do it. Are there any ready-made solutions? I would also like to strip all http and https links from the input. Should I just run a simple regex and remove all occurrences? Like:
result = re.sub(r"http\S+", "", text)
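One caveat with that pattern: it matches any word that merely starts with "http" (for example "httpd"). Below is a minimal sketch of a slightly stricter variant, assuming that only links with an explicit http:// or https:// scheme should go. The helper name strip_urls and the constant URL_RE are made up for illustration and are not part of any library.

```python
import re

# Hypothetical helper: the name and the stricter pattern are assumptions.
# Requiring "://" avoids eating ordinary words such as "httpd".
URL_RE = re.compile(r"https?://\S+", re.IGNORECASE)


def strip_urls(text: str) -> str:
    """Remove http:// and https:// links, then tidy the leftover spacing."""
    without_urls = URL_RE.sub("", text)
    # Removing a link from the middle of a sentence leaves a double space.
    return re.sub(r"[ \t]{2,}", " ", without_urls).strip()
```

This still has limits: punctuation glued to the end of a link is removed with it, and links written without a scheme (such as www.example.com) are left alone.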

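import bleach


class UntrustedFormMixin:
    """Delete unwanted HTML from the values of the listed input fields."""

    html_strip_fields = []  # list of field names to clean HTML from
    html_allowed_tags = []
    html_allowed_attributes = {}
    html_allowed_css_styles = []
    html_allowed_protocols = []

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # The original call was cut off after "bleach.Cleaner(". The arguments
        # below are an assumed mapping of the class attributes above.
        # html_allowed_css_styles is not passed because the matching argument
        # depends on the bleach version (styles= before 5.0, css_sanitizer=
        # from 5.0 on).
        self.HTML_cleaner = bleach.Cleaner(
            tags=self.html_allowed_tags,
            attributes=self.html_allowed_attributes,
            protocols=self.html_allowed_protocols,
            strip=True,  # assumption: drop disallowed tags instead of escaping them
        )

    def clean(self):
        # Let the form (and any other mixins) run their own validation first.
        super().clean()
        for key in self.html_strip_fields:
            value = self.cleaned_data.get(key)
            if value is not None:
                self.cleaned_data[key] = self.HTML_cleaner.clean(value)
        return self.cleaned_data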

