
Newbie needs help with scraping

Discussion in 'Black Hat SEO Tools' started by Mr Salt, Oct 1, 2013.

  1. Mr Salt

    Mr Salt Newbie

    Joined:
    Sep 17, 2013
    Messages:
    15
    Likes Received:
    0
    Occupation:
    self employed
    Location:
    Paradise
    So I am using irobot to scrape a website. Right now it's just an experiment (for learning), and I'm struggling to understand some key things.

    #1 - I am scraping tables. I want to put this data into a database, but I wonder what to use as a key identifier, especially if there are duplicate values (like a name). I don't know how to get such an identifier from the website. From what I can see, the only unique field in these tables that I can access is the URL behind each person's name.


    #2 - I know how to scrape data from a specific URL, but I'm trying to automate the process. Seriously, I have no clue how to do that with table data. Ideally, I guess I'd get a list of links and follow each of them individually to get the data, rather than from the table they are compiled into.

    Thanks in advance.
     
  2. DarkPixel

    DarkPixel Jr. VIP

    Joined:
    Oct 4, 2011
    Messages:
    1,328
    Likes Received:
    1,239
    Location:
    ↓↓↓↓
    I have never used irobot but I will try to help you:

    1) Use a numeric ID as the identifier. Increment the ID with each new table row.
    2) I would use regex for that (not sure if irobot supports it, but it should). Use one regex to find the big table, and then a second regex search to find each row. Then you can easily scrape everything and do what you want with that info.
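
For anyone who wants to try both suggestions outside irobot, here is a rough sketch in plain Python. The sample HTML, the `people` table, and the two-column layout (name, city) are all made up for illustration; the real site's markup will differ, and regex on HTML only holds up when the markup is simple and regular. The database assigns the incrementing ID, so two rows with the same name still get different keys.

```python
import re
import sqlite3

# Hypothetical sample page; the real site's markup will differ.
SAMPLE_HTML = """
<table id="people">
  <tr><td><a href="/person/101">John Smith</a></td><td>Boston</td></tr>
  <tr><td><a href="/person/102">John Smith</a></td><td>Denver</td></tr>
  <tr><td><a href="/person/103">Ann Lee</a></td><td>Austin</td></tr>
</table>
"""

TABLE_RE = re.compile(r"<table[^>]*>(.*?)</table>", re.S | re.I)  # the big table
ROW_RE = re.compile(r"<tr[^>]*>(.*?)</tr>", re.S | re.I)          # each row in it
CELL_RE = re.compile(r"<td[^>]*>(.*?)</td>", re.S | re.I)         # each cell in a row
TAG_RE = re.compile(r"<[^>]+>")                                   # leftover inner tags


def extract_rows(html):
    """Return a list of rows, each a list of cell texts, from the first table."""
    table = TABLE_RE.search(html)
    if table is None:
        return []
    rows = []
    for row_html in ROW_RE.findall(table.group(1)):
        cells = [TAG_RE.sub("", cell).strip() for cell in CELL_RE.findall(row_html)]
        if cells:
            rows.append(cells)
    return rows


def store_rows(conn, rows):
    """Insert (name, city) rows; SQLite hands out the incrementing id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS people ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT, city TEXT)"
    )
    conn.executemany("INSERT INTO people (name, city) VALUES (?, ?)", rows)
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # use a file path to keep the data
    store_rows(conn, extract_rows(SAMPLE_HTML))
    for record in conn.execute("SELECT id, name, city FROM people ORDER BY id"):
        print(record)
```

SQLite ships with Python, so nothing runs server side and nothing needs to be installed or paid for.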
     
  3. leet0r

    leet0r Newbie

    Joined:
    Sep 27, 2013
    Messages:
    11
    Likes Received:
    1
    Hey, I don't use irobot, but I build a lot of bots, so here is what I think:
    "Especially if say, there are duplicate values": can you set an element offset, i.e. pick the x-th element with the same identifier?
    "From what I can see, the only unique field in these tables that I can access, is the URL of each person's name.": then scrape what you want from the page behind that href, if that's possible ;)
    Hope it helps
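
The href idea could look roughly like this in plain Python (not irobot). This is only a sketch: the class and function names, the `/person/...` paths, and `example.com` are invented for illustration, and it assumes each person's name is a link inside a table cell. The full URL is the key, so two people with the same name stay separate and a repeated link is stored once.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class TableLinkParser(HTMLParser):
    """Collects (href, link text) pairs for <a> tags found inside <td> cells."""

    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.current_href = None
        self.text_parts = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_cell = True
        elif tag == "a" and self.in_cell:
            self.current_href = dict(attrs).get("href")
            self.text_parts = []

    def handle_data(self, data):
        if self.current_href is not None:
            self.text_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.current_href is not None:
            name = "".join(self.text_parts).strip()
            self.links.append((self.current_href, name))
            self.current_href = None
        elif tag == "td":
            self.in_cell = False


def collect_links(html, base_url):
    """Map each absolute profile URL (the unique key) to the linked name."""
    parser = TableLinkParser()
    parser.feed(html)
    records = {}
    for href, name in parser.links:
        records[urljoin(base_url, href)] = name  # same URL twice is stored once
    return records


def fetch(url):
    """Download one page as text (assumes UTF-8; adjust for the real site)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def scrape_each(records, fetch_page=fetch):
    """Follow every collected link and return {url: page html}."""
    return {url: fetch_page(url) for url in records}
```

Usage would be something like `records = collect_links(list_page_html, "http://example.com/list")` followed by `pages = scrape_each(records)`, then pulling the fields you need out of each page in `pages`.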
     
  4. Mr Salt

    Mr Salt Newbie

    OK, I have to take a big step back here... Cause I'm in over my head, at the moment.

    First off, irobot is being run from my desktop. I'm just learning. There is nothing server side. Secondly, I'm really not well versed at all in SQL, or any programming language. We're gonna have to take this down to a dummy level. You guys sound like you really know what you're doing. Only problem is, I don't. :D

    Let me try to get on the same page, here. Without programming from scratch, or spending any money - is there some method I can use that would put me on an even keel with the 2 of you? Programming I can learn... But it would be best to be talking apples and apples.

    Thank you for taking the time!
     
  5. leet0r

    leet0r Newbie

    If you want to do this with irobot, I don't think so. I don't know what's possible with this tool, and I don't know the website ;)
    Damn, I need 15 posts.. I can't send you a PM for Skype or anything. If it's possible on your end, PM me and we can talk.
     
  6. Mr Salt

    Mr Salt Newbie

    Well, this is my 15th post... So I'm off to the races. LOL
     
  7. harubel

    harubel Newbie

    Joined:
    May 26, 2015
    Messages:
    18
    Likes Received:
    0
    I am also facing problems with Scrapebox; all scrapers are the same...!
     
  8. andrew1978

    andrew1978 Regular Member Premium Member

    Joined:
    May 13, 2012
    Messages:
    410
    Likes Received:
    24
    Gender:
    Male
    Occupation:
    Internet Marketer
    SB is great for scraping emails and search engines