Anyone here advanced in JS?

Discussion in 'General Programming Chat' started by spectrejoe, May 4, 2016.

  1. spectrejoe

    spectrejoe Jr. VIP

    Joined:
    Sep 25, 2013
    Messages:
    2,485
    Likes Received:
    647
    I want to create a script that gets the names and prices of items from one site and saves them locally. Then I would do the same on another site and have Node calculate the price difference between them.

    How hard would this be?
     
  2. Frozen27

    Frozen27 Newbie

    Joined:
    Aug 8, 2013
    Messages:
    20
    Likes Received:
    4
    Gender:
    Male
    Occupation:
    Full-stack Developer
    Location:
    Bucharest
    You need a correlation point (something that links a product on site A to the equivalent product on site B). I've done this in the past (a price tracker), though not in JS. PM me for further info.
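
    A minimal sketch of the correlation-point idea, assuming both sites expose some shared identifier. The `sku` field, the sample items, and the `priceDifferences` helper are all made up for illustration; real sites may have no such clean key.

```javascript
// Hypothetical scraped data; the `sku` field is an assumed correlation key.
const siteA = [
  { sku: 'AB-100', name: 'Widget', price: 19.99 },
  { sku: 'AB-200', name: 'Gadget', price: 5.5 },
];
const siteB = [
  { sku: 'AB-100', name: 'Widget (Blue)', price: 17.49 },
  { sku: 'AB-300', name: 'Doohickey', price: 3.0 },
];

// Join the two lists on `sku` and report the price difference (A minus B).
function priceDifferences(listA, listB) {
  const bySku = new Map(listB.map((item) => [item.sku, item]));
  const result = [];
  for (const a of listA) {
    const b = bySku.get(a.sku);
    if (!b) continue; // no correlation point, so nothing to compare
    // Work in integer cents to avoid floating-point noise.
    const diffCents = Math.round(a.price * 100) - Math.round(b.price * 100);
    result.push({ sku: a.sku, priceA: a.price, priceB: b.price, diff: diffCents / 100 });
  }
  return result;
}

const diffs = priceDifferences(siteA, siteB);
console.log(diffs);
```

    Items that appear on only one site are skipped rather than guessed at, which is usually the safer default when money depends on the match.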
     
  3. The Mentalist

    The Mentalist Power Member

    Joined:
    May 8, 2013
    Messages:
    740
    Likes Received:
    271
    Location:
    Inside your head
    This isn't terribly hard. The problem, as anyone who has done lots of web scraping knows, is finding the patterns to consistently and accurately extract data from pages.

    To illustrate, if I am comparing names of a product...

    A simple system will grab the text inside a div on one page and compare it to the text in a span on the other page to check that the products match.

    A robust system will shingle the product names and compute their Jaccard similarity, and also use an algorithm to find likely manufacturer part numbers on each page and compare those between pages. The end result is a metric/probability that you've identified the same product. Based on a threshold, you then decide whether it is or isn't the same product.
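
    A rough sketch of the shingling and Jaccard part only, assuming character 3-grams and a cutoff of 0.5. Both numbers, and the function names, are arbitrary choices for illustration; the part-number matching is not covered here.

```javascript
// Split a name into overlapping k-character shingles after basic normalization.
function shingles(text, k = 3) {
  const normalized = text.toLowerCase().replace(/[^a-z0-9]+/g, ' ').trim();
  const set = new Set();
  if (normalized.length <= k) {
    // Too short to slide a window over; use the whole string as one shingle.
    if (normalized.length > 0) set.add(normalized);
    return set;
  }
  for (let i = 0; i + k <= normalized.length; i++) {
    set.add(normalized.slice(i, i + k));
  }
  return set;
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|, a value between 0 and 1.
function jaccard(a, b) {
  if (a.size === 0 || b.size === 0) return 0; // nothing to compare
  let intersection = 0;
  for (const s of a) if (b.has(s)) intersection++;
  return intersection / (a.size + b.size - intersection);
}

const THRESHOLD = 0.5; // assumed cutoff; tune it against real data

// Decide whether two product names likely refer to the same product.
function sameProduct(nameA, nameB) {
  return jaccard(shingles(nameA), shingles(nameB)) >= THRESHOLD;
}

console.log(sameProduct('Acme Widget 3000', 'ACME widget-3000'));
console.log(sameProduct('Acme Widget 3000', 'Zeta Blender'));
```

    In practice the similarity score would be one signal among several, combined with things like part numbers before a match is trusted.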

    That may seem like overkill, but it's often what it takes to get a web scraper to reliably extract data. If you make buy/sell decisions from those prices, just one bad extraction can cost you big $$$.

    I usually just pray that there is a public API available.
     
  4. MrBlue

    MrBlue Senior Member

    Joined:
    Dec 18, 2009
    Messages:
    975
    Likes Received:
    682
    Occupation:
    Web/Bot Developer
    What are the sites in question? This could be done using either server or client side JS.
     
  5. tryagain2day

    tryagain2day Registered Member

    Joined:
    Jan 26, 2015
    Messages:
    87
    Likes Received:
    4

    Newb question, I apologize in advance: how can you tell if a site has a public API? I have been manually listing a wholesaler's catalogue... it's slowly dimming my soul.
     
  6. The Mentalist

    The Mentalist Power Member

    Joined:
    May 8, 2013
    Messages:
    740
    Likes Received:
    271
    Location:
    Inside your head
    Well, it depends on your skill level with web development. Not sure if this makes sense to you, but you can right-click on a webpage, then select "Inspect" or "Inspect Element". That opens your browser's developer tools. Go to the "Network" tab, reload the page, and filter by XHR to see any AJAX calls the page makes. See where those calls go, and you may find an API endpoint for some of their data. If you don't mind sharing, you could just post the URL here.
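
    If the Network tab does turn up a JSON endpoint, using it could look roughly like this. The response shape (`products`, `title`, `price`) is invented for the example, and `fetchProducts` assumes a runtime with a global `fetch` (a browser or Node 18+); a real endpoint will look different.

```javascript
// Hypothetical response shape; a real endpoint will differ.
const samplePayload = JSON.stringify({
  products: [
    { title: 'Widget', price: '19.99' },
    { title: 'Gadget', price: '5.50' },
  ],
});

// Turn the raw JSON text into { name, price } records.
function parseProducts(jsonText) {
  const data = JSON.parse(jsonText);
  return (data.products || []).map((p) => ({
    name: p.title,
    price: Number(p.price),
  }));
}

// Fetch an endpoint URL copied from the Network tab and parse its body.
async function fetchProducts(url) {
  const response = await fetch(url);
  if (!response.ok) throw new Error(`Request failed: ${response.status}`);
  return parseProducts(await response.text());
}

console.log(parseProducts(samplePayload));
```

    Keeping the parsing separate from the fetching makes it easy to test against a saved response before hitting the site.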
     
  7. tryagain2day

    tryagain2day Registered Member

    Joined:
    Jan 26, 2015
    Messages:
    87
    Likes Received:
    4
    The XHR filter is blank, and there's no AJAX column at all.
     
  8. neuris

    neuris Newbie

    Joined:
    Dec 4, 2017
    Messages:
    5
    Likes Received:
    0
    Gender:
    Male
    That means the page you're viewing isn't loading its data from an API. It doesn't prove the site has no public API, but since the page isn't making any AJAX calls, the results are rendered directly into the HTML on the server, so you'd have to extract them from the HTML itself.
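
    For server-rendered pages like that, the data has to be pulled out of the markup. A bare-bones sketch, assuming made-up markup with `name` and `price` spans; regular expressions are brittle for HTML, so a proper parser is the better choice for anything beyond a quick experiment.

```javascript
// Hypothetical markup; real pages need their own patterns.
const sampleHtml = `
  <ul>
    <li class="item"><span class="name">Widget</span><span class="price">$19.99</span></li>
    <li class="item"><span class="name">Gadget</span><span class="price">$5.50</span></li>
  </ul>`;

// Pull { name, price } pairs out of the assumed name/price span layout.
function extractItems(html) {
  const pattern = /<span class="name">([^<]+)<\/span>\s*<span class="price">\$([\d.]+)<\/span>/g;
  const items = [];
  for (const match of html.matchAll(pattern)) {
    items.push({ name: match[1].trim(), price: Number(match[2]) });
  }
  return items;
}

console.log(extractItems(sampleHtml));
```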