I wrote a shell script for automatically deleting CP spam when it's reported. Before I start using it, I want to check if anybody can find a way to make it misbehave.
It works like this:
- Look through the report queue for a banned regex (typically the URL shortener they use)
- If it's there, use the JSON interface to check the OPs of all reported threads for the regex
- If a thread has the regex in its OP, delete it
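The three steps above could be sketched roughly like this. It is a minimal sketch, not the actual script from this thread: the input files (one `id<TAB>text` line per report or OP), the function names, and the `echo` standing in for the real delete call are all assumptions, and fetching from the JSON interface is left out entirely.

```shell
#!/bin/sh
# Hypothetical sketch: inputs are plain-text files, one "id<TAB>text" per line.
BANNED_REGEX='tr\.im/'   # grep basic regex; the dot is escaped to match literally

# Step 1: succeed if any report in the file mentions the banned regex.
queue_has_spam() {
    grep -q -e "$BANNED_REGEX" "$1"
}

# Steps 2-3: print the ids of reported threads whose OP text matches.
spam_thread_ids() {
    grep -e "$BANNED_REGEX" "$1" | cut -f1
}

# Glue: only look at OPs when the queue matched; echo stands in for deletion.
process_reports() {
    reports=$1
    ops=$2
    queue_has_spam "$reports" || return 0
    spam_thread_ids "$ops" | while IFS= read -r id; do
        echo "would delete thread $id"
    done
}
```

Because the OP text itself has to match, a report alone is not enough to get a thread deleted in this sketch.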
What's to stop me from just reporting a thread I don't like as CP?
Jaxon Ward
What's the plan for when they just use a different shortener? Add them to the banlist one by one?
Nathan Stewart
It checks the text.
Austin Martinez
Fuck, I almost reported this thread
Ian Howard
What are you planning on using for said regex?
Luis Gomez
How do I add this on Holla Forums?
William Miller
For now, tr.im/. I can add other domains with \| as a separator using grep basic regex syntax.
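As a small illustration of that syntax, assuming GNU grep (where `\|` is an extension to basic regex rather than something POSIX guarantees): the second domain below is a made-up placeholder, the function name is hypothetical, and the dots are escaped because an unescaped `.` matches any character.

```shell
#!/bin/sh
# example-shortener.invalid is a placeholder domain, not one named in the thread.
BANLIST='tr\.im/\|example-shortener\.invalid/'

# Succeeds if the text given as $1 contains any banned domain (GNU grep BRE).
is_banned() {
    printf '%s\n' "$1" | grep -q -e "$BANLIST"
}
```

With the dots escaped, `is_banned 'visit trXim/ today'` fails, whereas a bare `tr.im/` pattern would match it.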
If they stop using URLs in the post message (they did that with Lynxchan) I'll experiment with command line OCR tools. OCR got rid of them on Lynxhub and Endchan.
You run it with a cronjob on a server. If you can get me a volunteer account for Holla Forums I can run it there once it's tested.
Ryan Gray
I've been wanting to implement something like this to offer up for months and was too lazy to get to work. Thank you for this so much.
Adrian Phillips
The problem with this is when they don't post any link shorteners in the text, only in the image. It happens a lot, if I recall correctly.
Also this solution is only working around the fact that the site is so fucking broken that you can't moderate it properly, and they apparently have no interest in fixing it, even going so far as to implement an overboard out of the site's scope.
Liam Brown
You wouldn't be the first one trying to crash this picture.
Jordan Martin
What a hot head. BLO BLO BLO BLO BLO BLO BLO BLO
Blake Carter
I noticed /just/ often has the spam links broken. You might want to get in touch with its BO for some common link shorteners.
Jack Hernandez
That's what OCR is for. Text recognition in images. For example:
$ tesseract topbane.jpeg stdout
Topbane . ruPSHC,PSHC-Big Guys Videos4U Yo - Mosqui‘ro Men
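OCR output like this tends to come back with stray spaces and misread characters, so a matching step might normalize it first. A rough sketch under assumptions: the function names and the `topbane\.ru` pattern are illustrative only, and the only part taken from this thread is the `tesseract FILE stdout` invocation.

```shell
#!/bin/sh
# Hypothetical pattern for the fake test image; real patterns would differ.
OCR_REGEX='topbane\.ru'

# Lowercase the text on stdin and strip all whitespace, so that
# "Topbane . ru" becomes "topbane.ru" before matching.
normalize_ocr() {
    tr '[:upper:]' '[:lower:]' | tr -d '[:space:]'
}

# Succeeds if the normalized text on stdin matches the regex.
ocr_text_matches() {
    normalize_ocr | grep -q -e "$OCR_REGEX"
}

# Runs tesseract on an image file and checks its output.
# Requires tesseract to be installed; not exercised without a real image.
image_matches() {
    tesseract "$1" stdout 2>/dev/null | ocr_text_matches
}
```

Stripping whitespace also glues unrelated words together, so a short pattern could produce false positives; that trade-off would need testing against real images.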
You're also relying on people to report every single image because the site is so fucking broken.
Samuel Scott
...
Cameron Taylor
I don't think they'd put that much effort in; after a while they'd just give up.
Alexander James
I could detect it without reports, but then the worst-case scenario for a bug would be deleting every single thread in the catalog, and it would save less than five minutes on average.
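One way to bound that worst case, offered only as a sketch: refuse to act when a single run matches more threads than expected. The `MAX_DELETIONS` limit, the function name, and the one-id-per-line input are assumptions, not part of the script described here.

```shell
#!/bin/sh
MAX_DELETIONS=5   # hypothetical cap; pick something above a normal spam wave

# Reads thread ids (one per line) on stdin. Prints them back only if the
# count is within the cap; otherwise warns on stderr and fails.
guard_deletions() {
    ids=$(cat)
    [ -n "$ids" ] || return 0
    count=$(printf '%s\n' "$ids" | wc -l | tr -d ' ')
    if [ "$count" -gt "$MAX_DELETIONS" ]; then
        echo "refusing to delete $count threads (cap $MAX_DELETIONS)" >&2
        return 1
    fi
    printf '%s\n' "$ids"
}
```

A regex bug that matched the whole catalog would then trip the cap instead of deleting everything.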
Eli Russell
it's no effort really, it's called imagemagick.
Parker Lewis
No, the proper way to totally jury-rig this as a gvol is to check reports, see if it matches, find all posts by global ID (optionally check if each one matches) and delete them all one by one.
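The "find all posts by global ID" step might look something like this. It is only a sketch: the `post_id<TAB>poster_id<TAB>text` input format and the function name are assumptions, since the real data would come from whatever the site exposes.

```shell
#!/bin/sh
# Hypothetical input on stdin: "post_id<TAB>poster_id<TAB>text", one post per line.
# $1 = poster id to match; $2 = optional grep basic regex the text must also match.
posts_by_poster() {
    # Appending "" forces a string comparison, so ids like 1e2 and 100 stay distinct.
    awk -F '\t' -v id="$1" '($2 "") == (id "") { print $1 "\t" $3 }' |
        if [ -n "${2:-}" ]; then grep -e "$2"; else cat; fi |
        cut -f1
}
```

The resulting post ids could then be fed to a loop that deletes them one at a time.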
Why the fuck not, we're already paving over every other deep flaw in this fucking software, might as well work around this because hotwheels can't have 2ch's database help with a big table migration.
Alexander Thompson
database guys*
Jonathan Allen
I wrote an OCR checker that checks all new threads. I haven't tested it on real spam images yet because I don't have any, but it works with the topbane fake.