Thursday, September 6, 2012

Paper Reading #5: Profanity Use in Online Communities

Profanity Use in Online Communities

Sara Owsley Sood
Computer Science Dept., Pomona College
185 East Sixth Street
Claremont, CA 91711
sara@cs.pomona.edu

Judd Antin, Elizabeth F. Churchill
Yahoo! Research
4301 Great America Parkway
Santa Clara, CA 95054
<jantin, echu>@yahoo-inc.com

The researchers in this paper studied the statistics behind profanity use in online communities and how well profanity-blocking software succeeds or fails at catching it.


Most online profanity is actually spelled in unconventional ways, such as substituting an '@' for the letter 'A' in a curse word. This means that profanity-blocking software cannot simply block words that match an entry in a database of "bad words"; there are too many ways for curse words to be spelled. The software faces other challenges as well: users misspell innocent words in ways that get them blocked when they shouldn't be, and online profanity keeps evolving.
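To make the problem concrete for myself, here is a minimal sketch of a list-based filter. This is my own illustration, not the software examined in the paper. The word list and the function name `contains_blocked_word` are hypothetical, and I used mild stand-in words in place of real profanity.

```python
import re

# Hypothetical stand-in blocklist; a real list would be far longer.
BLOCKLIST = {"darn", "heck"}


def contains_blocked_word(comment: str) -> bool:
    """Return True if any alphabetic token exactly matches a blocklist entry."""
    # Lowercase the comment and split it into runs of letters.
    tokens = re.findall(r"[a-z]+", comment.lower())
    return any(token in BLOCKLIST for token in tokens)


if __name__ == "__main__":
    print(contains_blocked_word("Well, darn it"))     # conventional spelling
    print(contains_blocked_word("Well, d@rn it"))     # '@' substituted for 'a'
    print(contains_blocked_word("Well, d a r n it"))  # spaces inserted
```

Only the first comment is caught. The '@' substitution and the inserted spaces both break the word into pieces that no longer match any list entry, so those two comments slip through.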

The team of researchers gathered a data set from Yahoo! Buzz (a social news site) over a three-month time frame. This data set contains 1,655,131 comments across 168,973 unique threads. Metadata on the comments was also collected. Most common profanity-blocking software works from a list of profane words because that approach is easy to implement. However, this method relies on a constantly updated list of profane terms, and Internet abbreviations are constantly evolving. Because of this, the lists do not get updated until a profane word or phrase becomes common enough to be a problem, and usually a person has to review the list and make the changes by hand. The data set showed substantial use of curse words, even with a profanity blocker enabled. From this, the researchers concluded that current profanity countermeasures do not hold up well enough today.

I thought this work was interesting. This has been a longstanding issue ever since online communities such as chat rooms and forum boards emerged. As a constant internet user, I come across online profanity frequently. While it doesn't bother me, it has always made me curious about what a profanity blocker will and won't block. It was also interesting to note that some people put a decent amount of effort into using profane words. It seems that a message would get across just fine without the cursing, but people "go out of their way" to use profanity by substituting characters, purposely misspelling words, or even adding spaces to their message to confuse the blocker. To me, this seems like a lot of effort just to use profane words.
