God damn it, I am fucking tired of fucking around with this shit today. I know this prehistoric captcha type was already cracked by Google itself with just clever algorithms and no neural net, but come on, just let a guy live, for Christ's sake.
btw, these are the kind of captchas that turn up if anyone is wondering
Jack Bell
...
Christopher Carter
Hire some people to enter the captchas for you, or try changing IPs.
Wyatt Hughes
add sum proxies :^)
anyway why are you trying to build this database
Mason Brooks
How many captchas do you need to solve to complete the scan? If it's only like 1k or something, that's only a couple bux, just pay for it.
Ryan Taylor
It is probably triggered by you sending too many requests in short timespan. Try adding some delay between your requests.
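A minimal sketch of what "some delay between your requests" could look like in Python. The `Throttle` class and any interval you pick are hypothetical, nothing here comes from the site in question; the clock and sleep functions are injectable only so the behaviour can be checked without actually waiting.

```python
import time


class Throttle:
    """Enforce a minimum interval between successive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds, hypothetical value chosen by the caller
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first one

    def wait(self):
        """Sleep just long enough that calls are at least min_interval apart."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

You would call `throttle.wait()` right before each request. Whether this helps depends on what the site is actually counting, which nobody here knows yet.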
Camden Wilson
lmao that's nothing, there's freely available python code that bypasses just this
Isaac Kelly
you fucking moron why do you there is captcha in the first place ?
Angel Russell
10 million possible numbers on just one extension, and there are like 20 or 30 extensions for all the different regions for landlines and cellphone providers. I could easily wardial the landlines and then look them up? But paying seems like an absurdly big waste of money.
Not to mention that this is just a coding challenge per se.
Although I believe that they have done this hastily and that this could be done better.
Another method is implementing a deep convolutional neural net, which I have a hard-on for doing. (I think I read a paper a long time ago on how Google succeeded in solving captchas without segmenting symbols or using image manipulation, but with a pure neural net.)
I actually thought that the captchas would be just a one-off thing, but it seems like once you trigger them they continue to pop up. Moreover, it isn't triggered by too many requests in a short timespan but by too many requests in general. A normal person wouldn't look up 100 numbers one after another, no matter how much time it takes them. I guess I will start with changing my IP every few dozen tries and then, on the side, fool around with the neural net.
sure thing buddy
I think you're missing a verb there
Not really, just disappointed at the amount of payoff after fixing one bug after another for 3 or 4 hours, all ending with a cockblock. Also, it's public info so it isn't technically illegal ;)
Christopher Wilson
but I sincerely wanted to hear what you guys have been doing
Logan Nguyen
Why did you use selenium? Seems overkill/the wrong tool for the job. Anyways, I wrote a python script to download images from nhentai only to discover my old perl script is 1000x faster than this snakeshit
Owen Turner
I didn't want to get into the nitty-gritty of low-level communication with the site over sockets, and honestly I don't know what other methods are out there.
Speed isn't really my goal at the moment (I just want to see some results), but I will gladly take suggestions on how to speed it up.
What did you use, user?
Cameron Torres
for the python shit? just bs/requests but i dont understand how its so much slower than perl
Noah Rivera
sudo nano /etc/proxychains.conf
comment out strict_chain
uncomment random_chain
scrape tons of proxies
save
proxychains python yourshit.py
Keep us posted. If you need a ton of proxies, let me know. If you want to do it yourself, look up 'proxybroker' on github or just get a proxy scraper.
10 million possible numbers (under one extension) divided by the 10 numbers I get per IP before the captcha kicks in (I counted today) gives me about 1 million IPs I would need to cycle through to cover all 10 million numbers. Absurdly big number. However, I have seen that the captcha goes away after 30 minutes, which gives me the idea that I could cycle through a few thousand IPs and, once the captcha cooldown for the first IP has passed, cycle back to that IP, meaning I could reuse the same list of IPs again and again? Seems like the best option atm.
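The arithmetic above, written out as a rough Python sketch. The three constants are the figures from this post; `seconds_per_batch` is a hypothetical parameter (how long one batch of 10 lookups takes), since that was never measured here.

```python
import math

TOTAL_NUMBERS = 10_000_000   # numbers under one extension (from the post)
LOOKUPS_PER_IP = 10          # lookups before the captcha appears (from the post)
COOLDOWN_SECONDS = 30 * 60   # captcha goes away after about 30 minutes (from the post)


def ip_sessions_needed(total=TOTAL_NUMBERS, per_ip=LOOKUPS_PER_IP):
    """How many times some IP has to start a fresh batch of lookups."""
    return math.ceil(total / per_ip)


def pool_size(seconds_per_batch, cooldown=COOLDOWN_SECONDS):
    """Distinct IPs needed so the first one has cooled down when the list wraps."""
    return math.ceil(cooldown / seconds_per_batch)


def total_days(seconds_per_batch, sessions=None):
    """Wall-clock days if the batches run strictly one after another."""
    if sessions is None:
        sessions = ip_sessions_needed()
    return sessions * seconds_per_batch / 86_400
```

`ip_sessions_needed()` gives the 1 million figure. If a batch took, say, 30 seconds (an assumption), `pool_size(30)` comes out at 60 and `total_days(30)` at roughly 347 days for a single extension, so reusing IPs shrinks the list but not the total time.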
I will keep you guys posted, this really is an interesting project.
Next time I will write my shit in requests or beautifulsoup, it seems. Thanks very much, user.
Carter Garcia
just a quick question does anyone know what this js means? $(function(){ $('#kaptchaImage').click(function () { $(this).attr('src', 'kaptcha.jpg?' + Math.floor(Math.random()*1000)); }) });
Does that mean that there are only 1000 captchas at their disposal? Does that mean that this js gives me ability to choose what captcha I solve?
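Reading the snippet literally: when `#kaptchaImage` is clicked, its `src` is set to `'kaptcha.jpg?'` followed by a random integer from 0 to 999. The snippet alone does not show whether the server ever looks at that number; a common reason for this pattern is simply to make the URL differ so the browser re-fetches the image instead of using its cache, but that is an assumption about this site. A small Python mirror of the handler (the function names are made up for illustration) shows what does and does not change between clicks:

```python
import random
from urllib.parse import urlsplit


def refreshed_src(rng=random):
    """Mirror of the click handler: 'kaptcha.jpg?' + floor(random() * 1000)."""
    return "kaptcha.jpg?" + str(int(rng.random() * 1000))


def describe(src):
    """Split the generated src into the resource path and the query suffix."""
    parts = urlsplit(src)
    return parts.path, parts.query
```

The path is always `kaptcha.jpg`; only the suffix varies, within 0 to 999. So 1000 is the range of the suffix, not necessarily the number of distinct captchas. The only way to tell is to request the same suffix twice and compare the images.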
Cooper Flores
Install Greasemonkey, and write a script to set the captcha image source to a specific number in that range. If you find that it is indeed loading the same image every time, you're golden.
Kevin Ross
Tried it, nah, it seems like kaptcha.jpg?+number is just some kind of internal code for "give me a new captcha". It seems to spit out a new captcha for the same request. Although, I found the source code for this captcha on some sketchy Chinese site, so I will look into how it works and try the proxychains stuff with dynamic_chain instead of the random one.
Julian Sanders
Have you tried disabling JS in browser while scraping?
Benjamin King
This JS turns on only when you click the captcha image, so that you get a new captcha.
Also, disabling JS on the site will not allow me to search for numbers :/
Carson Williams
Are you serious? If it's using JS to search for numbers, and is also probably using JS to enforce rate limiting, then just find the endpoint it's hitting, you fucking moron.
Jacob Nguyen
The only other JS documents in the POST are adex.dotmetrics and DeviceInfo.dotmetrics, both of which use a device id and session id to identify me. I guess you meant that? Doesn't make any sense, since my did and sid are both tethered to my IP, right?
no need to call each other names
Levi Ramirez
Man, it would be a lot easier for us to help you if you just posted a link to the website you are trying to scrape instead of leaving us to guess where the problems might be.
William Miller
Try using googlebot as user agent.
Daniel Thomas
A quick update: I have downloaded a couple of proxy txt files containing thousands of proxies that don't work, and made a quick python script to test whether some of them are working. By my calculations 60 proxies should be enough for the 30-minute cycle to work, although more than 60 would be highly desirable.
Which I assume doesn't smell too good. Anywho, does anyone know where I can get 100 reliable proxies? Don't mind if you guys do a MITM attack on me, just want this project to be over tbh
Carson Powell
It's a placeholder page.
Daniel Lee
Did you even try changing your useragent? Why are you so hell bent on this proxy scheme?
Xavier Peterson
Post the shitty website if you want advice, user. If you were doing this shit from your actual IP, they already know. Or at least dump the form you think is requesting numbers.
Andrew Stewart
Bumping the thread, coming back to my coding. I still have to get something like 100 proxies to succeed in my plan. Feeling a little like Sisyphus. A song to accompany me in these dark moments: