Gscraper Question

Discussion in 'Black Hat SEO Tools' started by rodol, Mar 27, 2014.

  1. rodol

    rodol Regular Member

    Joined:
    Mar 10, 2010
    Messages:
    417
    Likes Received:
    83
    Location:
    Earth
    Can someone tell me why my GScraper is scraping a lot of URLs with webcache.googleusercontent.com in the URL, and how I can get rid of that? Thanks.
     
  2. Groen

    Groen Regular Member

    Joined:
    Nov 7, 2009
    Messages:
    397
    Likes Received:
    222
    I wouldn't know why you would get such URLs, but I guess you could use something like Notepad++ to find whatever text you wish to get rid of and replace it with nothing.
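    A minimal sketch of that find-and-replace idea (an illustration, not from the thread), assuming the scraped URLs sit in a plain text file, one per line; the filenames are hypothetical:

        # Mimic Notepad++'s find-and-replace with an empty "Replace with" field.
        with open("scraped.txt", encoding="utf-8") as f:       # hypothetical filename
            text = f.read()

        # Deletes only the literal domain text; the rest of the cache wrapper
        # (the search?q=cache:... part) stays behind, which is the snag raised below.
        text = text.replace("webcache.googleusercontent.com", "")

        with open("scraped_clean.txt", "w", encoding="utf-8") as f:
            f.write(text)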
     
  3. rodol

    rodol Regular Member

    Joined:
    Mar 10, 2010
    Messages:
    417
    Likes Received:
    83
    Location:
    Earth
    Thanks, I was trying that, but the URL looks like this:
    The numbers after =cache are all different and the keyword is also different, so it's complicated to clean the actual scraped URL.
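    Since the hash and the keyword vary, a literal replace won't cover it, but a capture-group regex can. A sketch, assuming the common cache-link shape webcache.googleusercontent.com/search?q=cache:<hash>:<original-url>+<keywords>&... (an assumption worth checking against your own list; filenames are hypothetical):

        import re

        # Assumed cache-link shape (verify against your scraped list):
        #   http://webcache.googleusercontent.com/search?q=cache:<hash>:<original-url>+<keywords>&...
        cache_re = re.compile(
            r"https?://webcache\.googleusercontent\.com/search\?q=cache:"
            r"(?:[A-Za-z0-9_-]{10,}:)?"   # optional hash; the length guard stops it eating a scheme
            r"([^+&\s]+)"                 # capture the original URL (stops at +keyword or &param)
        )

        def unwrap(url):
            """Return the wrapped original URL if this is a cache link, else the URL unchanged."""
            m = cache_re.match(url)
            # The original scheme isn't recoverable from the cache link, so default to http://
            return "http://" + m.group(1) if m else url

        with open("scraped.txt", encoding="utf-8") as f:       # hypothetical filename
            urls = [unwrap(line.strip()) for line in f if line.strip()]

        with open("scraped_clean.txt", "w", encoding="utf-8") as f:
            f.write("\n".join(urls) + "\n")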
     
    Last edited: Mar 27, 2014
  4. BloodyDox

    BloodyDox Newbie

    Joined:
    Dec 5, 2015
    Messages:
    9
    Likes Received:
    1
    Nobody got a solution for this?
     
  5. rere003

    rere003 Newbie

    Joined:
    Sep 22, 2012
    Messages:
    34
    Likes Received:
    17
    Location:
    New Java
    Use Notepad++ to get rid of "webcache.googleusercontent.com" using regex.
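    If the goal is simply to drop the cache results altogether, a regex that deletes the whole line does it. A sketch in Python for concreteness (the filename is hypothetical); roughly the same pattern should work in Notepad++'s "Regular expression" search mode with an empty "Replace with" field:

        import re

        with open("scraped.txt", encoding="utf-8") as f:       # hypothetical filename
            text = f.read()

        # Remove every line containing the cache domain, newline included.
        # (For a Windows-saved file in Notepad++, add \r? before the \n.)
        text = re.sub(r".*webcache\.googleusercontent\.com.*\n?", "", text)

        with open("scraped_no_cache.txt", "w", encoding="utf-8") as f:
            f.write(text)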
     
  6. JustUs

    JustUs Power Member

    Joined:
    May 6, 2012
    Messages:
    626
    Likes Received:
    590
    I could show you the basic method that GScraper uses to filter URLs, but it would not solve your problem, so we will spoon-feed you.

    > Go to the filter tab.

    [IMG]

    > Select "URL include".
    > In the textbox, enter what you want to filter out (webcache); you do not need the googleusercontent.com part, but you can include it if it makes you feel good.
    > Go to the bottom of that group box, next to "it can't post reply", and click "do".
    > Save the list back to file.

    This method will filter out what you do not want.
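    For anyone who wants the same step outside GScraper's interface, the filter reduces to a few lines of script. A sketch (illustrative only; it assumes the exported list holds one URL per line, and the filenames are hypothetical):

        # Rough stand-in for the "URL include" filter: keep only the URLs that do
        # NOT contain the filter text, then save the list back to a file.
        FILTER_TEXT = "webcache"

        with open("scraped.txt", encoding="utf-8") as f:       # hypothetical filename
            urls = [line.strip() for line in f if line.strip()]

        kept = [u for u in urls if FILTER_TEXT not in u]
        print(f"removed {len(urls) - len(kept)} cache URLs, kept {len(kept)}")

        with open("scraped_filtered.txt", "w", encoding="utf-8") as f:
            f.write("\n".join(kept) + "\n")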
     
  7. Unknown Overlord

    Unknown Overlord Junior Member

    Joined:
    Nov 7, 2009
    Messages:
    122
    Likes Received:
    51
    What footprint are you using for the scrape? What JustUs posted will do the job, but it seems like this is coming from something you're using as a footprint.