
[RUBY] LinkedIn Scraper for getting more profile hits and connections

Discussion in 'Other Languages' started by plopster, Apr 26, 2014.

  1. plopster

    plopster Newbie

    Joined:
    Feb 13, 2012
    Messages:
    16
    Likes Received:
    3
    I use this script to search for other IT professionals in my area and view their profiles. It is especially useful for visiting recruiters and similar profiles, who then view your profile in return and may send you job prospects. If you are part of a business that uses LinkedIn and lists its staff, it also helps raise your ranking there.

    With the code below you need to replace the hxxp://xxx.linkedin.xxx part with the correct address because it wouldn't let me post links.

    Code:
    require 'rubygems'
    require 'mechanize'
    require 'open-uri'
    require 'json'
    
    user = 'linkedinusername or email'
    pass = 'sdsdsd'
    delay = 10
    
    keywords = [
      'information technology brisbane',
      'it brisbane',
      'cio brisbane',
      'cto brisbane'
    ]
    
    url = 'hxxp://xxx.linkedin.xxx/vsearch/f?type=all&trk=vsrp_people_sel&keywords='
    profileurl = 'hxxp://xxx.linkedin.xxx/profile/view?id='
    
    results = []
    loggedin = false
    
    a = Mechanize.new { |agent|
      agent.user_agent_alias = 'Mac Safari'
    }
    a.agent.http.verify_mode = OpenSSL::SSL::VERIFY_NONE
    
    a.get('hxxp://xxx.linkedin.xxx') do |page|
      loggedin_page = page.form_with(:id => 'login') do |form|
        form.session_key = user
        form.session_password = pass
      end.submit
      
      if loggedin_page.title == 'Welcome! | LinkedIn' then loggedin = true end
    end
    
    if not loggedin then print "Couldn't logon to LinkedIn.\n"; exit; end
    
    keywords.each { |x|
      a.get(url + URI::encode(x)) do |page|
        pp url + URI::encode(x)
        
        begin
        
          begin
          jsons = page.body.scan(/\{"person"\:\{.*?\}\}\}/i).each do |match|
            json = JSON.parse(match)
            if json["person"]["fmt_name"] == nil then next end
            
            print "Visiting profile for #{json["person"]["fmt_name"]}\n"
            a.get(profileurl + json["person"]["id"].to_s)
            print "Waiting #{delay} seconds before continuing.\n"
            sleep(delay)
          end
          rescue
          end
          
          is_next_page = false
          next_url = ""
          
          begin
          jsons = page.body.scan(/"resultPagination"\:\{.*?\}\]/i).each do |match|
            json = JSON.parse('{' + match + "}}")
            if json["resultPagination"]["nextPage"] != nil then
              is_next_page = true;
              next_url = "hxxp://xxx.linkedin.xxx" + json["resultPagination"]["nextPage"]["pageURL"]
            end
          end
          rescue
          end
    
          if is_next_page then
            sleep(delay)
            print "Getting next page: #{next_url}\n"
            page = a.get(next_url)
          end
        end while (is_next_page)
      end
    }
     
    Last edited: Apr 26, 2014
  2. roibert

    roibert Regular Member

    Joined:
    Mar 21, 2011
    Messages:
    243
    Likes Received:
    18
    Occupation:
    Telemarketer
    Location:
    Montreal
    How do you use this script?
     
  3. srcnix

    srcnix Registered Member

    Joined:
    Oct 3, 2013
    Messages:
    51
    Likes Received:
    19
    Update the following vars with your information:

    Code:
    user = 'linkedinusername or email'
    pass = 'sdsdsd'
    delay = 10
    
    keywords = [
      'information technology brisbane',
      'it brisbane',
      'cio brisbane',
      'cto brisbane'
    ]
    
    url = 'hxxp://xxx.linkedin.xxx/vsearch/f?type=all&trk=vsrp_people_sel&keywords='
    profileurl = 'hxxp://xxx.linkedin.xxx/profile/view?id='
    Then run with ruby:

    Code:
    ruby ./filename.rb
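
Before running the full script, it can help to check the variables by printing the search URLs they produce. The following is a minimal sketch, not part of the original script. It assumes a Ruby 3.x install, where `URI::encode` is no longer available, so `ERB::Util.url_encode` is swapped in. `search_urls` is a hypothetical helper name, and the base URL keeps the placeholder from the original post.

```ruby
require 'erb'

# Placeholder base URL from the original post; replace it with the real address.
BASE_URL = 'hxxp://xxx.linkedin.xxx/vsearch/f?type=all&trk=vsrp_people_sel&keywords='

# Hypothetical helper: builds one search URL per keyword phrase.
# ERB::Util.url_encode percent-encodes spaces as %20.
def search_urls(keywords, base_url = BASE_URL)
  keywords.map { |phrase| base_url + ERB::Util.url_encode(phrase) }
end

keywords = ['information technology brisbane', 'it brisbane']
search_urls(keywords).each { |search_url| puts search_url }
```

If the printed URLs look right once the placeholder is replaced, the same keyword list can go into the full script.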
     
  4. pxoxrxn

    pxoxrxn Supreme Member

    Joined:
    Dec 21, 2011
    Messages:
    1,397
    Likes Received:
    2,070
    1. Name the script link.rb
    2. Replace the variables that OP mentioned
    3. Put it on a server that has Ruby installed
    4. On the command line, run ruby ./link.rb

    Correct?
     
  5. rgambra

    rgambra Newbie

    Joined:
    Mar 12, 2015
    Messages:
    3
    Likes Received:
    0
    After changing line 74 to
    Code:
    print "Getting next page:"
    I'm just getting the keyword URLs printed in my console. Any ideas? (The links are fine; I changed them to "xxx" to post here.)

    C:\Sites>ruby C:\Users\usuario\Desktop\linkedinruby.rb
    "xxx.linkedin.xxx/vsearch/f?type=all&trk=vsrp_people_sel&keywords=information%20technology%20brisbane"
    "xxx.linkedin.xxx /vsearch/f?type=all&trk=vsrp_people_sel&keywords=it%20brisbane"
    "xxx.linkedin.xxx/vsearch/f?type=all&trk=vsrp_people_sel&keywords=cio%20brisbane"
    "xxx.linkedin.xxx/vsearch/f?type=all&trk=vsrp_people_sel&keywords=cto%20brisbane"
     
  6. ragnarkar

    ragnarkar Registered Member

    Joined:
    Aug 19, 2015
    Messages:
    59
    Likes Received:
    10
    I'm getting the following error when running on Linux Mint 17.1:

    $ ruby ./linkedinscript.rb
    ./linkedinscript.rb:28:in `block (2 levels) in <main>': undefined method `session_key=' for nil:NilClass (NoMethodError)
    from (eval):23:in `form_with'
    from ./linkedinscript.rb:27:in `block in <main>'
    from /usr/lib/ruby/vendor_ruby/mechanize.rb:434:in `get'
    from ./linkedinscript.rb:26:in `<main>'
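
The trace shows the block passed to `form_with(:id => 'login')` receiving nil, which means no form with that id was found on the page. A minimal sketch of a guard that turns this into a readable error, assuming only the `form_with` and `forms` page methods from Mechanize. `find_login_form` and `login_url` are hypothetical names, not part of the original script.

```ruby
# Hypothetical helper: looks up the login form and fails with a readable
# message instead of a NoMethodError on nil when the form is missing.
def find_login_form(page, form_id = 'login')
  form = page.form_with(:id => form_id)
  return form if form

  raise "No form with id '#{form_id}' found (page has #{page.forms.size} form(s)); " \
        'the login page markup has probably changed.'
end

# Usage with the original script's Mechanize agent `a` (not run here):
#   form = find_login_form(a.get(login_url))
#   form.session_key = user
#   form.session_password = pass
#   loggedin_page = form.submit
```

The form count in the message gives a first hint of whether the page has no forms at all or only a renamed one.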
     
  7. endoffice

    endoffice Newbie

    Joined:
    Jan 18, 2016
    Messages:
    2
    Likes Received:
    0
    Occupation:
    Server Hosting and Colocation
    Location:
    Boston

    The LinkedIn page has likely changed since the code was written, so the login form lookup no longer matches and you'll need to update that section of the program. This happens with a lot of scrapers eventually.
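
A page change can also fail silently, which would match the earlier report of only the keyword URLs being printed. The script's bare `rescue` blocks discard every error, and `scan` returns an empty list when the pattern finds nothing. A minimal sketch for checking that, using the same person pattern as the script. `extract_people` and the sample body are hypothetical and made up for illustration.

```ruby
require 'json'

# Same person-record pattern as the original script.
PERSON_PATTERN = /\{"person"\:\{.*?\}\}\}/i

# Hypothetical helper: returns the parsed person hashes found in a page body
# and reports, rather than hides, the cases where nothing usable is found.
def extract_people(body, log = $stderr)
  matches = body.scan(PERSON_PATTERN)
  log.puts 'No person records matched; the page markup may have changed.' if matches.empty?

  matches.map do |match|
    begin
      person = JSON.parse(match)['person']
      # Records without a formatted name are skipped, as in the original script.
      person unless person['fmt_name'].nil?
    rescue JSON::ParserError => e
      log.puts "Skipping unparseable record: #{e.message}"
      nil
    end
  end.compact
end

# Hypothetical sample body, not real LinkedIn output.
sample = 'x {"person":{"id":42,"fmt_name":"Jane Doe","extra":{"k":"v"}}} y'
extract_people(sample).each do |person|
  puts "Would visit #{person['fmt_name']} (id #{person['id']})"
end
```

Running the helper against a saved copy of the search page body shows whether the pattern still matches anything.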