Anyone scraped LinkedIn before? I have a few questions...

Discussion in 'Linkedin' started by jamie3000, Dec 9, 2016.

  1. jamie3000

    jamie3000 Supreme Member

    Joined:
    Jun 30, 2014
    Messages:
    1,311
    Likes Received:
    586
    Occupation:
    Finance coder looking for semi-retirement
    Location:
    uk
    So I'm in the early stages of coding a scraper for LinkedIn. It's actually for pretty legit reasons (not spammy in any way) but unfortunately, due to the amount of spamming on LinkedIn, they're very anti-scraper. They use sentinel and all kinds of things.

    My end goal is to be able to scrape all the people's names and positions that work for specific companies.

    So far I've noticed residential IP addresses seem to be able to view more information without logging in, but it's much easier to scrape the data I need by logging in.

    So my question is: if I log in with a brand new account with zero connections from a server IP, how much data can I scrape using something like PhantomJS before they limit my account?

    If anyone has any experience with scraping LinkedIn I'd really like to hear their thoughts.

    Thanks! :)
     
  2. rabbihamster

    rabbihamster Newbie

    Joined:
    Jan 16, 2011
    Messages:
    23
    Likes Received:
    0
    LinkedIn is completely useless without a profile with decent connections.
    If you search with a profile that has few or zero connections, you get generic results like "Linkedin Connection" instead of full names.

    So basically you have to build out a well-connected profile (or several), and that takes time and care.

    P.S. What is sentinel?
     
  3. jamie3000

    jamie3000 Supreme Member

    Joined:
    Jun 30, 2014
    Messages:
    1,311
    Likes Received:
    586
    Occupation:
    Finance coder looking for semi-retirement
    Location:
    uk
    It's part of the technology they use for limiting users' activities and checking for bots.

    I don't have to be connected to a company to browse its employees, do I?
     
  4. bartosimpsonio

    bartosimpsonio Jr. VIP Premium Member

    Joined:
    Mar 21, 2013
    Messages:
    12,055
    Likes Received:
    10,833
    Occupation:
    WHEREZ MA
    Location:
    BITCOINS AT?
    You're gonna need tons of LI accounts, each one with its own private proxy, so that each looks like a legit user always logging in from the same IP.

    Not an easy or cheap operation, but of course it's doable.
     
  5. rabbihamster

    rabbihamster Newbie

    Joined:
    Jan 16, 2011
    Messages:
    23
    Likes Received:
    0
    @jamie3000 - you will need 3rd-degree connections to those people. Just try and see.

    I don't think you will need TONS (it depends on how fast you want results).
    Just a handful should be enough if they are set up to all work together on one search campaign/query.
    But yes, it is a delicate setup.
     
  6. BloodyNinja

    BloodyNinja Power Member

    Joined:
    Oct 28, 2013
    Messages:
    583
    Likes Received:
    547
    Location:
    Deeptown
    Close to zero.

    You will need some aged accounts with networks, and you'll have to figure out how to get into the networks of your targets. Scraping and connection request spamming on LinkedIn go hand in hand.
     
  7. outscrape

    outscrape Jr. VIP

    Joined:
    Nov 23, 2016
    Messages:
    118
    Likes Received:
    76
    Hi jamie,

    I've seen companies with recruiting operations that used outsourcing to send thousands of messages per day and were attempting to switch to automating LI. Because of LI's security, they were measuring every type of variable, from time of login, user agent, and IP to company and number of connections, to try to determine how to send messages inside LI with new accounts without getting banned.

    If you're scraping more than a few hundred pieces of contact info you'll want to start by measuring everything: how many accounts you can hit, what your user agent is, when you made the account, how many connections it has, etc. I wish I could tell you exactly how many accounts you can view/scrape, but unfortunately that's too valuable for most to share even if they did know it. Still, you are on the right track I think.
     
  8. jamie3000

    jamie3000 Supreme Member

    Joined:
    Jun 30, 2014
    Messages:
    1,311
    Likes Received:
    586
    Occupation:
    Finance coder looking for semi-retirement
    Location:
    uk
    That's really interesting, thanks man. They seem far more sensitive to automation than any other website I've scraped, and I've scraped a lot! It's actually only the publicly accessible company profile pages I need now, but there's still a lot of blocking on those. Think a big pool of proxies and rotating user agents should work though :)
     
  9. rogerebert

    rogerebert Registered Member

    Joined:
    Feb 11, 2010
    Messages:
    79
    Likes Received:
    14
    The real trick @jamie3000 is to get someone with a LinkedIn Recruiter account to add you as a "hiring manager." This lifts most limits on the account and gives you free access to up to 50 pages of search results. This works even with the new UI changes.

    So you need to find companies that have LIR accounts. There are some unlisted groups like this one:
    https://www.linkedin.com/groups/6504658/members

    Recruiters will connect with many people who have rare skills, even if the profile is suspiciously shallow.
     
  10. Miiskit

    Miiskit Newbie

    Joined:
    Dec 3, 2015
    Messages:
    15
    Likes Received:
    3
    Please keep us posted on your results
     
  11. Dia77777

    Dia77777 Registered Member

    Joined:
    Mar 13, 2017
    Messages:
    67
    Likes Received:
    12
    Gender:
    Male
    You can scrape data if the information is open. Probably the easiest and best way to do it is by using proper software, but I don't have anything in particular to recommend, tbh.