How do you handle removal of unwanted content (urls, html) from user input?


Mateusz Kurowski-2
I would like to remove URLs and HTML from user input. For HTML there is the bleach library, which someone linked today on IRC, and I wrote a small form mixin around it (included further down).
But now I wonder: are there better ways to do this, or ready-made solutions? I would also like to strip all http and https links from the input. Should I just run a simple regex and remove all occurrences? Like:
result = re.sub(r"http\S+", "", text)
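One thing I noticed is that the pattern above also eats ordinary words that merely start with "http" (for example "httpd"). A slightly stricter variant might look like the sketch below. The names URL_RE and strip_urls are just illustrative, and it assumes only http:// and https:// links matter: it will not catch bare "www.example.com" links, and any punctuation glued to the end of a URL is removed along with it.

```python
import re

# Require the "://" part so plain words such as "httpd" are left alone.
# IGNORECASE also catches "HTTP://..." and "Https://...".
URL_RE = re.compile(r"https?://\S+", re.IGNORECASE)


def strip_urls(text):
    """Remove http(s) URLs, then tidy the spaces left behind."""
    without_urls = URL_RE.sub("", text)
    # Collapse runs of spaces/tabs only, so line breaks in the input survive.
    return re.sub(r"[ \t]{2,}", " ", without_urls).strip()


print(strip_urls("see https://example.com/page now"))
print(strip_urls("restart httpd"))
```

This is still only pattern matching on text, so it is no replacement for HTML sanitizing; it would run in addition to it.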


import bleach


class UntrustedFormMixin:
    """
    Sanitize HTML in the form fields listed in html_strip_fields.
    """
    html_strip_fields = []  # list of field names to clean html from
    html_allowed_tags = []
    html_allowed_attributes = {}
    html_allowed_css_styles = []
    html_allowed_protocols = []

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # https://bleach.readthedocs.io/en/latest/clean.html#bleach.sanitizer.Cleaner
        # By default disallowed tags are escaped, not removed; pass strip=True to remove them.
        # Note: the styles argument exists in bleach < 5.0; newer releases use css_sanitizer.
        self.html_cleaner = bleach.Cleaner(
            tags=self.html_allowed_tags,
            attributes=self.html_allowed_attributes,
            styles=self.html_allowed_css_styles,
            protocols=self.html_allowed_protocols,
        )

    def clean(self):
        cleaned_data = super().clean()
        for key in self.html_strip_fields:
            value = cleaned_data.get(key)
            if value is not None:
                # Replace the submitted value with its sanitized version.
                cleaned_data[key] = self.html_cleaner.clean(value)
        return cleaned_data

 

--
Visit this group at https://groups.google.com/group/django-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/5d676cac-6a99-4797-bff8-f2eac3b67d83%40googlegroups.com.