~~ Web Data Extracting Software Needed ~~

Discussion in 'Hire a Freelancer' started by youtalk, Dec 9, 2012.

  1. youtalk

    youtalk Regular Member

    Joined:
    Jul 5, 2012
    Messages:
    337
    Likes Received:
    6
    Occupation:
    Owner
    Location:
    I don't even know anymore
    Here's what I would need. PLEASE DON'T PM ME.

    A software program that takes a batch of part numbers and requested quantities, goes to each of the assigned URLs, logs in (user name and password), and then runs a search on each website for the part numbers and quantities, one by one. The software needs to gather specific data (price, description, quantity on hand, price breaks, etc.) and return this data so it can be imported back into my intranet system. (I'm not sure what formats are available for the data to be compiled into, but it needs to be importable back into my site.)

    The program needs to be able to run on my server, and needs to be able to be triggered to run by my other code that's in place.
    It has to be able to hold a batch of several thousand numbers and quantities; I'd rather it not have a limit on this.
    I also need every part of this program to be editable (user name, password, trigger to run, format the data is compiled to, etc.).
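
The workflow described above (read a batch, look up each part on each site, collect the fields, write an importable file) could be sketched roughly as follows. This is only a minimal sketch in Python, not a finished scraper: the `PartResult` fields, the `PART,QTY` batch layout, and the `lookup` callback (which would perform the real log-in and search for one site) are all assumptions made for illustration. CSV is used for the output because most systems can import it.

```python
import csv
from dataclasses import asdict, dataclass, fields
from typing import Callable, Iterable, Iterator, Sequence, TextIO, Tuple


@dataclass
class PartResult:
    # Hypothetical record layout; the real field list depends on the target sites.
    site: str
    part_number: str
    quantity_requested: int
    price: str
    description: str
    quantity_on_hand: int


def read_batch(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield (part_number, quantity) pairs from 'PART,QTY' lines, skipping blanks.

    Rows are streamed one at a time, so the batch size is not capped by memory.
    """
    for row in csv.reader(lines):
        if not row or not row[0].strip():
            continue
        yield row[0].strip(), int(row[1])


def run_batch(
    batch: Iterable[Tuple[str, int]],
    sites: Sequence[str],
    lookup: Callable[[str, str, int], PartResult],
) -> Iterator[PartResult]:
    """Call the site-specific lookup for every part on every site."""
    for part_number, quantity in batch:
        for site in sites:
            yield lookup(site, part_number, quantity)


def write_csv(results: Iterable[PartResult], out: TextIO) -> int:
    """Write results as CSV with a header row; return the number of data rows."""
    writer = csv.DictWriter(out, fieldnames=[f.name for f in fields(PartResult)])
    writer.writeheader()
    count = 0
    for result in results:
        writer.writerow(asdict(result))
        count += 1
    return count
```

Keeping the per-site logic behind a single `lookup` function means the user name, password, and parsing rules for each site can be changed without touching the batch or export code.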

    I'd like whoever is interested in this project to tell me a few things:
    - What development tools would you use? (scraperwiki, visual web ripper, etc.)
    - Is your program editable by my developers?
    - What are the expansion options and limitations of the program?
    - Is there anything you feel I might be missing for this program?
    - How many of these have you built, or how long have you been working with this type of program? (maybe provide me some samples?)

    What I ask of everyone else is to please review the responses and critique what others are proposing.
    I need everyone's opinion and input on this subject.

    Thank you all in advance.
     
  2. OMGWTFISTHIS

    OMGWTFISTHIS Power Member

    Joined:
    Sep 20, 2011
    Messages:
    575
    Likes Received:
    960
    Occupation:
    IM! :D
    Location:
    Florida
    iMacros could most likely do this. If you know how to script with it, and are able to also combine it with other coding languages, you will be golden.

    I personally don't know the finer details of scripting for it, but I'm sure others do.
     
  3. Shirko

    Shirko Junior Member

    Joined:
    Aug 11, 2012
    Messages:
    193
    Likes Received:
    172
    Location:
    adding monkeys to my papal
    I've just sent you a PM. I'm sorry, that's the way I work, let me know if you need my help.
     
  4. youtalk

    youtalk Regular Member

    Joined:
    Jul 5, 2012
    Messages:
    337
    Likes Received:
    6
    Occupation:
    Owner
    Location:
    I don't even know anymore
    What format do programs like these usually return the data in?
     
  5. buynsell

    buynsell Junior Member

    Joined:
    Mar 23, 2009
    Messages:
    112
    Likes Received:
    10
    I've worked on this type of bot a lot:
    YouTube auto like/comment
    Email responder cycle
    Proxy leecher and traffic sender.

    More info would be appreciated, including the URLs the data has to be extracted from and some sample input data!
     
  6. tompots

    tompots Elite Member Premium Member

    Joined:
    Dec 11, 2011
    Messages:
    4,352
    Likes Received:
    3,955
    Gender:
    Male
    Occupation:
    Full Time Bot Developer
    Location:
    Professional Botters
    I am an iMacros programmer. I can extract your data, return it as a CSV file on your local machine, and upload the data to a form on your site if needed. iMacros can also be set up to run on its own. Please let me know if you are interested in this type of scripting. Here are some of my bots: http://www.blackhatworld.com/blackhat-seo/search.php?searchid=2547899
     
  7. youtalk

    youtalk Regular Member

    Joined:
    Jul 5, 2012
    Messages:
    337
    Likes Received:
    6
    Occupation:
    Owner
    Location:
    I don't even know anymore
    PM'd you.

     
  8. qrazy

    qrazy Senior Member

    Joined:
    Mar 19, 2012
    Messages:
    1,113
    Likes Received:
    1,712
    Location:
    Banana Republic
    What's your current server platform, and what language is your current code written in? This info will be required if you want to integrate the software with your code/setup and scale it in the future as well.

     
  9. jkwilson78

    jkwilson78 Regular Member Premium Member

    Joined:
    Jun 24, 2010
    Messages:
    224
    Likes Received:
    311
    What you want to do is probably possible but there is a lot of missing information we would need.

    How many different sites do you need to scrape?
    If there are multiple sites, we would need to create different scraping rules for each one.
    Are there rate limits per IP, account, day, hour, etc.?
    Do you need to solve captchas?
    Do you need to rotate proxies or accounts or both?
    What is your current system written in that you want to be able to call the scraper from?
    Is your server setup Windows or Linux?

    There are a lot more of these types of questions that need to be answered and considered.

    The amount of work also depends on how automated you want it to be. If you just want to slowly scrape a list of URLs, dump the data to a CSV, and then import it into a database, that requires one level of work.
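
As a rough illustration of the last step in that simpler route, importing a scraped CSV into a database, here is a minimal sketch. It assumes SQLite purely as a stand-in (the thread does not say which database is in use), and the `part_prices` table and its column names are hypothetical. Re-importing the same part for the same site replaces the older row, so repeated runs keep the latest price.

```python
import csv
import sqlite3
from typing import TextIO


def import_csv(conn: sqlite3.Connection, csv_file: TextIO) -> int:
    """Load rows from a CSV with a header line into part_prices; return rows read."""
    # Hypothetical schema: one row per (site, part_number) pair.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS part_prices (
            site             TEXT NOT NULL,
            part_number      TEXT NOT NULL,
            price            TEXT,
            quantity_on_hand INTEGER,
            PRIMARY KEY (site, part_number)
        )
        """
    )
    rows = [
        (r["site"], r["part_number"], r["price"], int(r["quantity_on_hand"]))
        for r in csv.DictReader(csv_file)
    ]
    # The connection as a context manager commits on success, rolls back on error.
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO part_prices VALUES (?, ?, ?, ?)", rows
        )
    return len(rows)
```

The price is stored as text here so that price breaks or currency symbols are not lost; a real import would likely normalise it first.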

    If you want the whole deal to run on autopilot and do everything behind the scenes, with no intervention or manual work on your end to complete certain steps, then that takes things to a whole other level.
     
  10. youtalk

    youtalk Regular Member

    Joined:
    Jul 5, 2012
    Messages:
    337
    Likes Received:
    6
    Occupation:
    Owner
    Location:
    I don't even know anymore
    The current code is in PHP, mememichi, Apache, and some other stuff. I'm not a developer, so I'm not totally sure what else our system uses.


     
  11. qrazy

    qrazy Senior Member

    Joined:
    Mar 19, 2012
    Messages:
    1,113
    Likes Received:
    1,712
    Location:
    Banana Republic
    If you want the program to be under the control of (tightly coupled to) your system, it's essential to understand your current platform and code; otherwise it may be difficult to enhance anything in the future.

     
  12. youtalk

    youtalk Regular Member

    Joined:
    Jul 5, 2012
    Messages:
    337
    Likes Received:
    6
    Occupation:
    Owner
    Location:
    I don't even know anymore
    Great questions, that's what I like.

    I need it to hit 3-4 sites.
    I don't believe there are any limits on the sites. I've used this type of program in the past, and never had an issue.
    No captchas.
    No need for proxies.
    There really isn't a current system. Just the database that's written in PHP.
    The server is a VPS and I'm not sure which is being used. It's with HostGator, level 6.
    And yes, I would want the entire function to run on autopilot. How I envision the system working is that every 10-15 minutes the program looks in a batch file, grabs the data, and then starts doing its thing.
    My developer will shortly be creating the batch setup it would read from, so I would need to know how the creator would want the data formatted.
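
One way the "check for a batch every 10-15 minutes" idea could work is to have a scheduler such as cron start a short script that claims any waiting batch files. The sketch below is an assumption-laden illustration, not a prescribed design: the inbox directory, the `.csv` extension, and the `.processing` suffix are all hypothetical. Renaming a file before working on it keeps a later run from picking up the same batch again.

```python
from pathlib import Path
from typing import List

# Hypothetical crontab entry to start the script every 10 minutes:
#   */10 * * * * /usr/bin/python3 /path/to/run_scraper.py


def claim_pending_batches(inbox: Path) -> List[Path]:
    """Rename every waiting *.csv batch to *.csv.processing and return the new paths."""
    claimed = []
    for path in sorted(inbox.glob("*.csv")):
        target = path.with_name(path.name + ".processing")
        try:
            path.rename(target)
        except FileNotFoundError:
            # Another run claimed this file between the listing and the rename.
            continue
        claimed.append(target)
    return claimed
```

Each returned path can then be handed to whatever reads the part numbers and quantities; files that are not `.csv` are left untouched.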
     
  13. youtalk

    youtalk Regular Member

    Joined:
    Jul 5, 2012
    Messages:
    337
    Likes Received:
    6
    Occupation:
    Owner
    Location:
    I don't even know anymore
    My developer would create the batch file in whatever format is needed, so I would need to know how the creator would want the data formatted.

    Does that help?
     
  14. zelma143

    zelma143 Power Member

    Joined:
    Jun 25, 2010
    Messages:
    571
    Likes Received:
    37
    Occupation:
    PHP programmer,Bot maker,iMacro script maker
    I don't quite follow everything, but I guess you need to scrape data from other places and put it on your site or server.

    If so, I can do it in PHP.

    Sent you a PM.

    Thanks.
     
  15. youtalk

    youtalk Regular Member

    Joined:
    Jul 5, 2012
    Messages:
    337
    Likes Received:
    6
    Occupation:
    Owner
    Location:
    I don't even know anymore
    OK......... Just did some more research on what I have.

    I need this program to run on a CentOS Linux VPS Server.

    Hope this helps everyone help me... haha...
     
  16. youtalk

    youtalk Regular Member

    Joined:
    Jul 5, 2012
    Messages:
    337
    Likes Received:
    6
    Occupation:
    Owner
    Location:
    I don't even know anymore
    It needs to run on Linux from a command file that takes arguments I supply to it.
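
A command-line entry point of that kind might look roughly like the sketch below, written in Python as an example. All flag names and the `SCRAPER_PASSWORD` environment variable are hypothetical choices, not an agreed interface. The password is read from the environment rather than passed as an argument, because command-line arguments are usually visible to other users in the process list.

```python
import argparse
from typing import Mapping


def build_parser() -> argparse.ArgumentParser:
    """Define the (hypothetical) arguments the calling code would supply."""
    parser = argparse.ArgumentParser(
        description="Look up part numbers on supplier sites and export the results."
    )
    parser.add_argument("--batch", required=True, help="path to the input batch file")
    parser.add_argument("--output", required=True, help="path for the exported data")
    parser.add_argument(
        "--site",
        action="append",
        required=True,
        help="site to search; repeat the flag once per site",
    )
    parser.add_argument("--username", default="", help="login name for the sites")
    return parser


def get_password(env: Mapping[str, str]) -> str:
    """Read the login password from the environment instead of the command line."""
    return env.get("SCRAPER_PASSWORD", "")
```

Existing PHP code could start such a command with `exec()`, or a shell script could wrap it, for example as `python3 run_scraper.py --batch parts.csv --output results.csv --site site-a --site site-b` (file and site names are placeholders).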
     
  17. youtalk

    youtalk Regular Member

    Joined:
    Jul 5, 2012
    Messages:
    337
    Likes Received:
    6
    Occupation:
    Owner
    Location:
    I don't even know anymore
    Is anyone good with Linux programming?
     
  18. saturnx08

    saturnx08 Jr. VIP Jr. VIP

    Joined:
    Nov 18, 2012
    Messages:
    332
    Likes Received:
    34
    Gender:
    Male
    PM me
     
  19. youtalk

    youtalk Regular Member

    Joined:
    Jul 5, 2012
    Messages:
    337
    Likes Received:
    6
    Occupation:
    Owner
    Location:
    I don't even know anymore
    It seems as if this hasn't been done on a CentOS Linux server.

    Everyone is only familiar with Windows server setups. Is there a way to create what I need, and then create code that would convert the program to run on my server?
     
  20. olm75

    olm75 Power Member

    Joined:
    Jan 14, 2009
    Messages:
    746
    Likes Received:
    82
    Can you also extract images from websites with iMacros?