Preventing google's captcha-requests during "intitle:index.of" searches?


  • Preventing google's captcha-requests during "intitle:index.of" searches?

    Does anyone know if it's possible to prevent/circumvent Google's annoying habit of throwing up captcha-requests on legit "intitle:index.of" searches?

    ie.



    I'm working on a project that parses the HTML of "intitle:index.of" searches to return the URL of relevant websites via pattern-matching - which will work up until the point that Google decides the search appears to violate their TOS and throws up one of these captcha-requests. It's very annoying because these are legit searches.

    So far, the only thing I've found that stops it is to manually clear the browser cache and disconnect/reconnect my internet connection.

    I'm just wondering if anyone has ever come across a way to circumvent this problem, programmatically-speaking? A Lua-based solution would be preferable.

  • #2
    Edit,

    I'll just add this link:

    TL;DR A logic vulnerability, dubbed ReBreakCaptcha, which lets you easily bypass Google’s ReCaptcha v2 anywhere on the web. Overview Back in 2016, I started poking around to see how hard it w…


    ...which makes interesting reading for anyone else curious about this issue.
    Obviously, finding a decent solution is always going to be a bit of a cat-and-mouse game, so I won't get my hopes up.



    • #3
      How are you sending the information to Google? If you're using AMS's HTTP functions, then that's your problem, as AMS's HTTP functions send invalid or out-of-date headers. I would say use curl or wget so you can set up your own headers, including the ones Google wants, so you might have to spoof Chrome's or Firefox's headers.

      I have done things like this with Google services in the past and hit the same walls, and it was down to headers. This might not be the case for you, but I thought I'd say it just in case it's something you overlooked.
      Plugins or Sources MokoX
      BunnyHop Here



      • #4
        Hey Rex - thanks for responding. I'll get back to you on this one - need to look into a couple of things first, but won't have time until late tomorrow.
        Cheers, mate.



        • #5
          Funnily enough, after you posted this, today in the Chrome browser I got that very same error for no reason when searching for a PSU. Very strange.


          • #6
            Originally posted by kingzooly
            Funnily enough, after you posted this, today in the Chrome browser I got that very same error for no reason when searching for a PSU. Very strange.
            Ha-ha, the A.I. is watching you.
            Google's just plain EVIL.


            ...

            Okay. (And sorry for the delay in getting back to you, Rex). I've looked into a couple of things and run some more tests, but it's only left me more confused than ever. I really don't get what's going on with this because I can't seem to get consistent, reproducible results.


            Originally posted by kingzooly
            How are you sending the information to Google
            Yes, I'm using AMS's HTTP.Submit function (and I noted what you said about outdated headers), but I keep getting different results when I run the same tests. It's really weird. It seems I'm no longer getting blocked with CAPTCHA requests, but am instead getting blank strings returned.

            For example, if I run the following code:
            Code:
            tValues = {};
            gHTML = HTTP.SubmitSecure("https://www.google.com/search?q=intitle:index.of?pdf lua.programming", tValues, SUBMITWEB_GET, 20, 443, nil, nil);
            TextFile.WriteFromString(_DesktopFolder.."\\Test.htm", gHTML, false);
            File.Open(_DesktopFolder.."\\Test.htm", "", SW_SHOWNORMAL);
            ...it works just fine for about a half-dozen or so executions. But then it starts returning a nil string. Huh???

            At first, I thought this was just Google blocking repeated requests from the same IP address. But if I reboot my computer, it's then good for another half-dozen or so executions. So, it's not an IP-address thing. Nor is it a cookies/browser-cache thing, because even if I delete my browsing history, it still makes no damn difference. WTF is going on here??? I'm really confused.
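
One thing that may be worth ruling out here is the raw query string: the URL in the code above contains an unescaped space, and how that gets sent on the wire is left to the HTTP layer. The following is a minimal sketch in plain Lua of percent-encoding the query before building the URL. The helper names `urlencode` and `build_search_url` are hypothetical (they are not AMS functions), and whether this has anything to do with the nil results is untested.

```lua
-- Percent-encode a string for use as a URL query value.
-- Unreserved characters (letters, digits, "-", ".", "_", "~") pass through;
-- every other byte becomes %XX.
local function urlencode(s)
    -- The outer parentheses drop gsub's second return value (the match count).
    return (s:gsub("[^%w%-%._~]", function(c)
        return string.format("%%%02X", string.byte(c))
    end))
end

-- Hypothetical helper: build the search URL from a raw query string.
local function build_search_url(query)
    return "https://www.google.com/search?q=" .. urlencode(query)
end

local url = build_search_url("intitle:index.of?pdf lua.programming")
print(url)
-- prints https://www.google.com/search?q=intitle%3Aindex.of%3Fpdf%20lua.programming
```

The resulting string can be passed as the first argument of the existing request call in place of the hand-written URL.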

            ...

            Oh, and a couple of other things if I can ask:

            First, (and I'm almost embarrassed to ask this 'cause it sounds like such a dumb n00b question), but:

            when using HTTP.Submit like this
            Code:
            tValues = {};
            gHTML = HTTP.SubmitSecure("https://www.google.com/search?q=intitle:index.of?pdf lua.programming", tValues, SUBMITWEB_GET, 20, 443, nil, nil);
            ...the table of Values is completely superfluous for this (I just don't need it).
            So, how would I code an HTTP.Submit request without the Values? (Other than the way I'm already doing it) ???

            And lastly,
            Are you by chance familiar with XEvil? https://xevil.net/en



            Just thought I'd ask because it's Lua-based and includes some intriguing scripts in its program folder for dealing with annoying CAPTCHA issues.



            • #7
              Not heard of XEvil until now. As for the table, you can just pass {} to the function itself; you don't need to declare tValues = {} first, just put {} in the table argument. Apart from that, you're doing it how you should, which is sending it a blank table.
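
The inline-table suggestion can be sketched as a small wrapper. This assumes the code runs inside AutoPlay Media Studio, which supplies the `HTTP` table and the `SUBMITWEB_GET` constant used earlier in the thread; `fetch_page` is a hypothetical name, and the empty-response check is an addition for illustration, not something AMS does for you.

```lua
-- Assumption: running inside AutoPlay Media Studio, where HTTP.SubmitSecure
-- and SUBMITWEB_GET are provided. fetch_page is a hypothetical helper name.
local function fetch_page(url)
    -- Pass an empty table literal directly; no separate tValues variable needed.
    local html = HTTP.SubmitSecure(url, {}, SUBMITWEB_GET, 20, 443, nil, nil)
    -- Treat nil and "" alike so the caller never writes out an empty file.
    if html == nil or html == "" then
        return nil, "empty response"
    end
    return html
end
```

Called as `local html, err = fetch_page(url)`, this lets the caller skip the file write whenever `html` is nil, instead of opening an empty Test.htm.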

              I would say try using wget or curl and see if that works better. I am really sleeping well at the moment, but when I'm back with the real world I'll give my examples a look and see if I still have an example for curl or wget.


              • #8
                Yep, I'm familiar with integrating command-line tools like wget & curl (see my old 2007 thread here). But I'd still be keen to get a look at your examples, Rex (if you still have something, & if you can spare the time) - thanks mate.

                And regarding the tValues question ...

                Originally posted by kingzooly
                ...you can just add {} to the function itself ...
                ...Can't believe I missed that one - told ya it was a dumb n00b question!


