~~ Web Data Extracting Software Needed ~~

youtalk

Regular Member
Joined
Jul 5, 2012
Messages
353
Reaction score
6
Here's what I would need. PLEASE DON'T PM ME.

A software program that takes a batch of part numbers and requested quantities, goes to each of the assigned URLs, logs in (user name and password), and runs a search on the website for each part number and quantity, one by one. The software needs to gather specific data (price, description, quantity on hand, price breaks, etc.) and return it so it can be imported back into my intranet system. (I'm not sure what formats are available for compiling this data, but it needs to be importable back into my site.)
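To make the log-in-then-search step concrete, here is a minimal sketch using only the Python standard library. It is an illustration, not a finished scraper: the form field names (`username`, `password`), the `q` search parameter, and the `supplier.example` URLs are all assumptions, and each real supplier site will need its own values.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def make_session() -> urllib.request.OpenerDirector:
    """Return an opener that keeps cookies, so a login persists across requests."""
    jar = CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def build_login_request(login_url: str, username: str, password: str) -> urllib.request.Request:
    """Build the POST that submits the login form."""
    # Field names "username" and "password" are assumptions; check the real form.
    body = urllib.parse.urlencode({"username": username, "password": password})
    return urllib.request.Request(login_url, data=body.encode("utf-8"), method="POST")


def build_search_url(search_url: str, part_number: str) -> str:
    """Build the search URL for one part number."""
    # The "q" query parameter is hypothetical; each supplier will differ.
    return search_url + "?" + urllib.parse.urlencode({"q": part_number})
```

In use, the session would send the login request once (`session.open(build_login_request(...))`) and then open one search URL per part number, reusing the same cookies.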

The program needs to be able to run on my server, and needs to be able to be triggered to run by my other code that's in place.
It has to be able to hold a batch of several thousand part numbers and quantities. I'd rather it not have a limit on this.
I also need this program to be editable so that any part of it can be changed (user name, password, trigger to run, the format the data is compiled to, etc.).
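One way to keep the program editable and free of a batch-size limit is to hold the settings in a plain file and read the batch one row at a time. The sketch below assumes a JSON settings file and a CSV batch with `part_number` and `quantity` columns; both layouts are hypothetical choices, not something specified in this thread.

```python
import csv
import io
import json
from dataclasses import dataclass


@dataclass
class PartRequest:
    part_number: str
    quantity: int


def load_config(text: str) -> dict:
    """Parse the JSON settings (sites, user names, passwords, output format)."""
    return json.loads(text)


def read_batch(stream):
    """Yield one PartRequest per CSV row, lazily, so batch size is not capped."""
    for row in csv.DictReader(stream):
        yield PartRequest(row["part_number"].strip(), int(row["quantity"]))
```

Because `read_batch` is a generator, a batch of several thousand rows is never loaded into memory all at once, and changing a user name or password means editing the settings file rather than the code.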

I'd like whoever is interested in this project to tell me a few things:
- What development tools would you use? (ScraperWiki, Visual Web Ripper, etc.)
- Is your program editable by my developers?
- What are the expansion options and limitations of the program?
- Is there anything you feel I might be missing for this program?
- How long have you been working with this type of program, and how many have you built? (Maybe provide me some samples?)

What I ask from everyone else is to please review the responses and critique what others are proposing.
I need everyone's opinion and input on this subject.

Thank you all in advance.
 

OMGWTFISTHIS

Power Member
Joined
Sep 20, 2011
Messages
575
Reaction score
964
iMacros could most likely do this. If you know how to script with it, and are able to also combine it with other coding languages, you will be golden.

I personally don't know how to get into the extensive detailing of scripting for it, but I'm sure others can.
 

Shirko

Regular Member
Joined
Aug 11, 2012
Messages
200
Reaction score
175
I've just sent you a PM. I'm sorry, that's the way I work, let me know if you need my help.
 

youtalk

What format do programs like these usually return the data in?
 

buynsell

Junior Member
Joined
Mar 23, 2009
Messages
112
Reaction score
10
I've worked on this type of bot a lot:
YouTube auto like/comment
Email responder cycle
Proxy leecher and traffic sender

More info would be appreciated, including the URLs the data has to be extracted from and some sample input data!
 

tompots

Jr. VIP
Premium Member
Joined
Dec 11, 2011
Messages
5,185
Reaction score
4,445
Website
Auto-Bot-Solutions.com
I am an iMacros programmer. I can extract your data, return it as a CSV file on your local machine, and upload it to a form on your site if needed. iMacros can also be set up to run on its own. Please let me know if you are interested in this type of scripting. Here are some of my bots: http://www.blackhatworld.com/blackhat-seo/search.php?searchid=2547899
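As an illustration of what a CSV result file could look like, here is a small Python sketch (not iMacros). The column names follow the fields requested in the first post; the exact names, and the idea of packing price breaks into one `quantity:price` string, are assumptions.

```python
import csv
import io

# Hypothetical column layout based on the fields asked for in the first post.
FIELDS = ["part_number", "description", "price", "quantity_on_hand", "price_breaks"]


def write_results(rows, stream) -> None:
    """Write one header line, then one CSV line per scraped part."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)  # keys missing from a row are written as empty cells
```

A file in this shape can be loaded by most database import tools, and the `csv` module takes care of quoting descriptions that contain commas.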
 

qrazy

Senior Member
Joined
Mar 19, 2012
Messages
1,114
Reaction score
1,734
What's your current server platform, and what language is your current code written in? This info will be required if you want to integrate the software with your code/setup and scale it in the future.

 

jkwilson78

Regular Member
Joined
Jun 24, 2010
Messages
224
Reaction score
313
What you want to do is probably possible but there is a lot of missing information we would need.

How many different sites do you need to scrape?
If there are multiple sites, we need to create different scraping rules for each one.
Are there rate limits per IP, account, day, or hour?
Do you need to solve captchas?
Do you need to rotate proxies or accounts or both?
What is your current system written in that you want to be able to call the scraper from?
Is your server Windows or Linux?

There are a lot more of these types of questions that need to be answered and considered.

The work also depends on how automated you want to go. If you just want to slowly scrape a list of URLs, dump the data to a CSV, and then import it into a database, that requires one level of work.

If you want to let the whole deal run on autopilot and do everything behind the scenes, with no intervention or manual work on your end to complete certain steps, then that takes things to a whole other level.
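To illustrate the points about per-site rules and rate limits, here is a rough Python sketch. The site names and regular expressions are made up for the example; real pages need patterns written against their actual HTML, and a proper HTML parser is usually sturdier than regexes.

```python
import re
import time

# Hypothetical per-site rules: one regex per field. Real pages need real patterns.
SITE_RULES = {
    "supplier_a": {"price": r'class="price">\$([\d.]+)<', "stock": r"In stock: (\d+)"},
    "supplier_b": {"price": r'data-price="([\d.]+)"', "stock": r'data-qty="(\d+)"'},
}


def extract(site: str, html: str) -> dict:
    """Apply the site's rules to a page; a field is None when its pattern misses."""
    result = {}
    for field, pattern in SITE_RULES[site].items():
        match = re.search(pattern, html)
        result[field] = match.group(1) if match else None
    return result


class Throttle:
    """Enforce a minimum gap between requests to stay under a site's rate limit."""

    def __init__(self, min_interval: float):
        self.min_interval = min_interval
        self._last = None  # time of the previous request, if any

    def wait(self) -> None:
        now = time.monotonic()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                time.sleep(remaining)
        self._last = time.monotonic()
```

Adding a site then means adding one entry to `SITE_RULES`, and calling `throttle.wait()` before each request keeps the scraper slow enough not to trip per-IP or per-account limits.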
 

youtalk

The current code is in PHP, mememichi, Apache, and some other stuff. I'm not a developer, so I'm not totally sure what else our system uses.


 

qrazy

If you want the program to be under the control of (tightly coupled to) your system, it's essential to understand your current platform and code. Otherwise it may be difficult to enhance anything in the future.

 

youtalk

Great questions, that's what I like.

I need it to hit 3-4 sites.
I don't believe there are any limits on the sites. I've used this type of program in the past, and never had an issue.
No captchas.
No need for proxies.
There really isn't a current system. Just the database that's written in PHP.
The server is a VPS and I'm not sure which OS is being used. It's with HostGator, level 6.
And yes, I would want the entire function to run on autopilot. How I envision the system working is that every 10-15 minutes the program goes and looks in a batch file, grabs the data, and then starts doing its thing.
My developer will shortly be creating the batch setup it would read from, so I would need to know how the creator would want the data formatted.
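The "look for a batch every 10-15 minutes" idea could be sketched like this in Python. It assumes batches arrive as `.csv` files in an inbox folder and get moved to a done folder once processed; the folder layout and the `handle` callback are hypothetical.

```python
from pathlib import Path


def process_pending(inbox: Path, done: Path, handle) -> int:
    """Run handle() on each waiting batch file, then move it out of the inbox."""
    done.mkdir(parents=True, exist_ok=True)
    count = 0
    for batch in sorted(inbox.glob("*.csv")):
        handle(batch)
        batch.rename(done / batch.name)  # moved only after handle() succeeds
        count += 1
    return count
```

On a Linux server the schedule itself is usually left to cron, for example an entry such as `*/15 * * * * python3 /path/to/poll_batches.py` (the script path is a placeholder). Moving each file after processing means the same batch is not picked up twice on the next run.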
 

youtalk

My developer would create a batch file for whatever format that would be needed.
My developer will shortly be creating the batch setup it would read from, so I would need to know how the creator would want the data formatted.

Does that help?
 

zelma143

Power Member
Joined
Jun 25, 2010
Messages
573
Reaction score
37
I don't fully follow, but I guess you need to scrape data from another site and put it on your site or server.

If so, I can do it in PHP.

Sent you a PM.

Thanks
 

youtalk

OK......... Just did some more research on what I have.

I need this program to run on a CentOS Linux VPS Server.

Hope this helps everyone help me... haha...
 

youtalk

It needs to run on Linux from a command file that takes arguments I supply.
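A command-line entry point that takes supplied arguments might look like the following Python sketch. The argument names (`batch_file`, `--site`, `--output`) are assumptions chosen for the example.

```python
import argparse


def parse_args(argv=None) -> argparse.Namespace:
    """Read the batch file, target site, and output path from the command line."""
    parser = argparse.ArgumentParser(description="Look up a batch of part numbers.")
    parser.add_argument("batch_file", help="CSV of part numbers and quantities")
    parser.add_argument("--site", required=True, help="which supplier rules to use")
    parser.add_argument("--output", default="results.csv", help="where to write results")
    return parser.parse_args(argv)
```

A script built around this could be started from a shell, from cron, or from existing PHP code through a call such as `exec()`, for example `python3 scrape.py parts.csv --site supplier_a` (the script and site names are placeholders).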
 

youtalk

It seems as if this hasn't been done on a CentOS Linux server.

Everyone is only familiar with Windows server setups. Is there a way to create what I need, and then create code that would convert the program to run on my server?
 

olm75

Senior Member
Joined
Jan 14, 2009
Messages
928
Reaction score
110
Can you also extract images from websites with iMacros?
 