New Tool I'm Building

meatro

Well, I've been looking for a specific tool for a long time now and have been unable to find it. I've found things that are close, but no cigar.

The ScrapeBox learning mode, iMacros and some others. However, they just don't do what I'm looking for. ScrapeBox comes close, but its learning mode is just too limited (4 fields).

So here's what I'm building.. Not sure what I'll call it yet, but here's the idea so far:

FORM TRAINING
1. First, the user will need to navigate to the page they'd like to train (registration.)
2. The bot will read all form fields on that page
3. It will then cycle through them all and display the "name" of the current field (and focus on that field, so you can SEE which it's referring to.)
4. While it's displaying a name, the user will be able to designate either a content type or piece of custom content for that field. (Name, email, captcha, etc). Then show the bot where the submit button is.

This will export a line to a text file such as:

Code:
site.com name="username",username name="password",password name="submit",submit

So when it gets to site.com, it knows to type username into name="username", etc and fill the form out... Creating an account. :)

FORM FILLING
This is the magical bit... I believe I've come up with a way to not only create the accounts, but fill out the profiles too! The form filler will have the same "trainer" as the account creator, with a couple of extra options: "click" and "navigate".

1.) The bot will read all form fields on that page.
2.) The bot will display the name attribute of each field (and focus.)
3.) Designate a content type to the form field (about me, url, avatar, etc.)
4.) Show the bot where the submit button is.
5.) Tell the bot where to click or navigate!!!!

It should put out a line like this:

Code:
site.com name="username",username name="password",password name="submit",submit navigate="site.com/edit.php",navigate name="url",homepage name="avatar",avatar name="about",about click="innertext(Save Profile)",click

This means... navigate to site.com, enter the username and password (pulled from the file created by the account creator!) then login. Next, navigate to /edit.php and enter the homepage/avatar and click the "Save Profile" link!
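For illustration, here's how a line in that format could be parsed. This is a Python sketch (the actual tool is built in UBot, not Python); the token format comes from the example line above, but the parser itself is my own assumption, not how the tool actually works:

```python
import re

def parse_instruction_line(line):
    """Split a trainer line into the target URL and an ordered action list.

    Tokens look like attr="value",content — e.g. name="username",username
    or navigate="site.com/edit.php",navigate.
    """
    url, _, rest = line.partition(" ")
    actions = []
    # Each token: an attribute="selector" pair, a comma, then the content label.
    for attr, selector, content in re.findall(r'(\w+)="([^"]*)",(\S+)', rest):
        actions.append({"attr": attr, "selector": selector, "content": content})
    return url, actions

line = ('site.com name="username",username name="password",password '
        'navigate="site.com/edit.php",navigate name="about",about '
        'click="innertext(Save Profile)",click')
url, actions = parse_instruction_line(line)
# url -> "site.com"; actions is the five steps in order, so the bot can
# fill the login form, navigate to /edit.php, fill the profile, and click.
```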

If this all works the way I'm thinking, I could be making a very powerful trainable form filler! This means that you could show this thing how to create an account on virtually any website, then show it how to edit the profile and save what it learns to a text file.

What's that mean? Anytime you or a friend wants to create a profile on that site, fill the profile and get your back link, all you've got to do is run the bot! I'm very excited, I was worried that UBot wouldn't be able to pull something like this off, but I believe the trick is simply text instructions.

Some of the functions I'm integrating...

Spinable usernames "username{0|1|2}{3|4|5}" = username14
Spintax custom fields "{01|02|03}/{29|30}/19{7|8}{1|2}" = 01/30/1972
Random/set dropdown/radio name="radio",radio:random,NY,CA,FL or name="radio",radio:NY

Yes.. In any field you'll be able to enter spinable custom text and it will save that to the account creator. So if you get a form field that's not covered by normal signup content (name, email, password, etc) then you can insert your own text.. SPUN!
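The spintax above can be resolved with a small replacer that always expands the innermost {a|b|c} group first. This is just an illustrative Python sketch (UBot is not Python), and nested-group support is my addition:

```python
import random
import re

SPIN = re.compile(r"\{([^{}]*)\}")  # innermost {a|b|c} group, no nested braces

def spin(text, rng=random):
    """Repeatedly replace the innermost {a|b|c} group with a random option."""
    while True:
        m = SPIN.search(text)
        if not m:
            return text
        choice = rng.choice(m.group(1).split("|"))
        text = text[:m.start()] + choice + text[m.end():]

# spin("username{0|1|2}{3|4|5}") might yield e.g. "username14" or "username03",
# one option picked per group.
```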

PS: If you have any ideas for registration form fields (name, email, password) and form field types (text box, check box, radio, dropdown), please let me know. I'd like for this to be as universal as possible.
 
I was working on something very much like this. The plan was to integrate this same idea into a portable DB and keep track of/control over the web2.0 profiles, so that you could post to an account at any time. Furthermore, you could control the linking structure of your tiered accounts.

The problem I ran into was that this was kind of a massive undertaking. With RL work and IM work (still a noob), it was hard to devote time to the project.

The framework of the project:

1. Build a tool to manually capture the web elements needed for a registration and put them in XML format.

2. Build an account creator that goes to the site and uses the respective XML file to pull auto-generated user information and populate the page.

3. Run the account creator in a custom web browser component (this can be done off screen) so that you can handle different spam protections and give JavaScript protections a place to run. A simple HTTP request with string parsing does not allow the JavaScript to run, and the registration fails on sites with this.

4. Integrate the account creator with a proxy list. This would help prevent sites from mass-banning same-IP accounts.

There was a lot more that I had in the works on this, and I was even going to pull from the web2.0 site that phpbuilt created. The main reason I stopped was that I ran into some lower-level coding that I was not prepared for. I am still new in the coding world, and that, along with the magnitude of the project in my head, gave me second thoughts.

My work was done in C# and it is very rough in its current state. I may scrap it and start again with less of a "Grand Idea". However, it sounds like you are on the right track with yours, and I wish you the best of luck.

P.S. - To give you my thoughts on the form fill-out part: you have to deal with 3 aspects of a needed web element.
  1. Web Element Type - textbox, radio button, button, blah blah
  2. Web Element Name - You need this to know how to find it on the page.
  3. Web Element Population - You need to know what this element represents.
The element type you have to know in order to deal with it correctly. Populating a textbox is different than populating a radio button or a drop-down list. At least in C# there is a whole API for dealing with web elements. I don't remember it off the top of my head, but MSDN can help you there.

Because I had already pregenerated a fake person, I already had the data for the elements. In testing, I hardcoded the elements to the respective data of the user. This is a crappy solution, because you need dynamic user resolution: not all sites are built the same. "Name" on one site may mean first name, and on another site it may mean full name. You don't want to hard-code that in the main app, so you need to map that distinction in your page layout tool.

That's it for now, as the post is getting a bit long.
 
I don't mind long posts. :)

The bot basically does the same thing as that, it scrapes the page for specific elements using regular expressions. So far I have this:

Code:
((?<=<input\s)(.*?)(?=(>|\s/>))|(?<=<)select\s(.*?)(?=(>|\s/>))(\n|.)+(</|</\s)select>|<button(.*?)(?=(>|\s/>)).+(?<=(</|</\s)button>))

This will return any <input>, <select> and <button> tags (the reason why I'm asking for additional field types.) I'm dealing with different element types using simple IF-THEN statements..

If <button>, then display <button*>THIS TEXT</button>. If <select>, then add the options to a list, pop it up and display the options. It's pretty well on track so far.
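As an alternative to the regex, Python's built-in HTMLParser can do the same scrape and is less brittle on odd markup. A sketch (I've also included <textarea>, which the regex above doesn't cover but registration forms often use):

```python
from html.parser import HTMLParser

class FormFieldScraper(HTMLParser):
    """Collect the tag and name attribute of every form field on a page."""
    def __init__(self):
        super().__init__()
        self.fields = []

    def handle_starttag(self, tag, attrs):
        # Same targets as the regex, plus <textarea>.
        if tag in ("input", "select", "button", "textarea"):
            self.fields.append((tag, dict(attrs).get("name")))

page = '''<form><input name="username"><input name="password" type="password">
<select name="state"><option>NY</option></select>
<button name="submit">Sign up</button></form>'''
scraper = FormFieldScraper()
scraper.feed(page)
# scraper.fields -> [('input', 'username'), ('input', 'password'),
#                    ('select', 'state'), ('button', 'submit')]
```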
 
Ahh OK, I see what you mean. I took a bit different approach. Instead of my tool scraping for elements, I had a browser component tool in which I manually captured the elements. This was a one-time operation, which was good until the page elements changed.

I would select the element in the browser with the mouse and hit a hotkey that would grab the element's name or id. Then I could link the element to a data source which returned the needed user information.

I basically built an element handler class that I passed the element object into. The class looked at what the element type was, then handled it the way it should. This helped me because I could extend it for each new element without too much change.

With automated discovery of elements on the page, I was worried about picking up fields that were not needed. Also, I am not sure every element that needs input is marked for input. I could be wrong (still a noob). LOL, now you have me wanting to go rip open a web2.0 site to look a bit deeper at it.
 
This sounds like something that would be very useful... I don't remember, did Magic Submitter have something similar in one of its modules?
 
So if someone uses a customised theme, then this bot will not work because the field name will be different.

How about doing this for specific platforms? Like vBulletin, phpBB, SMF, Drupal etc?

You can create separate modules for each, thus rendering more power and accuracy to each one of them.
 
Ahh OK, I see what you mean. I took a bit different approach. Instead of my tool scraping for elements, I had a browser component tool in which I manually captured the elements. This was a one-time operation, which was good until the page elements changed.

I would select the element in the browser with the mouse and hit a hotkey that would grab the element's name or id. Then I could link the element to a data source which returned the needed user information.

I basically built an element handler class that I passed the element object into. The class looked at what the element type was, then handled it the way it should. This helped me because I could extend it for each new element without too much change.

With automated discovery of elements on the page, I was worried about picking up fields that were not needed. Also, I am not sure every element that needs input is marked for input. I could be wrong (still a noob). LOL, now you have me wanting to go rip open a web2.0 site to look a bit deeper at it.

That's what I was hoping for, but UBot doesn't support things like that (I don't think.)

I was hoping to simply scrape all the form fields and
If -> mouse focus = "field its scraped"
Then -> assign content to "field"

It can't recognize that focus, though. So instead it will just run through a list of the fields on the page and assign a value based on what the user selects from a dropdown.. (Username, Password, Random, etc)
 
Did you buy the pro version? Are you actually able to get it to really do multi-threading? Still no POP3 support, right?

How do you plan to include the custom sites in it? From what I understand ubot needs all the structure implemented in the bot before and can just read inputs.

So you basically try to find every possible form element for tons of sites and have it loop through all of them and fill the info from a file accordingly?
 
Holy crap.. What geeks................!! (intended as a compliment)
 
I don't know anything about programming and that stuff. But wouldn't this be easier if you could save that form archive "in the cloud"? So if anyone creates it for a website, it could be used by another user on the same website.
 
There has to be a way other than proxies to register without allowing JavaScript, even on sites that demand it... and I'm all ears if someone has any info, because nothing's worked so far. If this tool gets it... then insta-buy.
 
Did you buy the pro version? Are you actually able to get it to really do multi-threading? Still no POP3 support, right?

How do you plan to include the custom sites in it? From what I understand ubot needs all the structure implemented in the bot before and can just read inputs.

So you basically try to find every possible form element for tons of sites and have it loop through all of them and fill the info from a file accordingly?

Haha, I'm big cheesing with the 14 day free trial. :)

1.) I will allow the bot to read inputs by using regular expressions...

site.com name="username",username name="password",password

What I will do is... The profile filler will search for the URL using RegEx... Then add everything else in the line to a list, so I'll end up with this:

site.com
name="username",username
name="password",password

Here I will use RegEx to find the form fields (find *=* will result in name="password") and then use RegEx to find everything on that line AFTER the form field (password) which will be the content variable.

And this way... it knows that on site.com, it should look for the form field name="password" and, in that form field, type in a password (from C:\passwords.txt).

:)

2.) Sort of. Instead of finding ALL the form fields for a bunch of sites, it will simply contain instructions... What to do with THIS field at THIS url.

site.com name="username",username name="password",password
site2.com name="avatar",avatar name="submit",submit
site3.com name="homepage",custom:{Write|Say|Type} {something|words}.

The bot will read these instructions and know to type a username from the list and a password from the list on site.com. On site2.com, it knows to enter an avatar in the avatar field and click submit. On site3.com, it knows to enter the Write something spintax in the homepage field.

:)
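One wrinkle with the format above: a custom value like the spintax on site3.com can contain spaces, so splitting the line on whitespace alone isn't enough. Here's a Python sketch of a splitter that keys on the attr="..." boundaries instead (the pattern is my assumption, not the tool's actual RegEx):

```python
import re

# A token is attr="selector", then content that runs until the next token.
TOKEN = re.compile(r'(\w+="[^"]*"),(.*?)(?=\s+\w+="|$)')

def split_line(line):
    """Split an instruction line into its URL and (field, content) pairs.

    Splitting on whitespace would break custom spintax content that
    contains spaces, so we key on the attr="..." boundaries instead.
    """
    url, _, rest = line.partition(" ")
    return url, TOKEN.findall(rest)

url, pairs = split_line(
    'site3.com name="homepage",custom:{Write|Say|Type} {something|words}.')
# pairs -> [('name="homepage"', 'custom:{Write|Say|Type} {something|words}.')]
```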
 
There has to be a way other than proxies to register without allowing JavaScript, even on sites that demand it... and I'm all ears if someone has any info, because nothing's worked so far. If this tool gets it... then insta-buy.

Something like that, I think, would require some low-level coding, which is kind of where I got stuck. It is also why I chose to build the app on a browser component (which gives the JavaScript a home).

My stumble was in multi-threading this operation while letting the browser use multiple different proxies. You don't want multiple web2.0 accounts showing up under the same IP, or you run the risk of them getting deleted. By default the web browser component uses a single point of reference for its proxy. There are some low-level C++ libraries where you have to update some structures and flags to change that browser instance's proxy. That was above my head at the time. I'm working on understanding it now, though.

The browser component also helps get past sites with script protections, for example a script that reports whether the request came from a real browser. If you are just making string-type HTTP requests, the JavaScript will fail because you have no environment to house it in, and hence the registration fails.

Now waiting on a 20-year coder to blast me for being a noob coder.
 
Something like that, I think, would require some low-level coding, which is kind of where I got stuck. It is also why I chose to build the app on a browser component (which gives the JavaScript a home).

My stumble was in multi-threading this operation while letting the browser use multiple different proxies. You don't want multiple web2.0 accounts showing up under the same IP, or you run the risk of them getting deleted. By default the web browser component uses a single point of reference for its proxy. There are some low-level C++ libraries where you have to update some structures and flags to change that browser instance's proxy. That was above my head at the time. I'm working on understanding it now, though.

The browser component also helps get past sites with script protections, for example a script that reports whether the request came from a real browser. If you are just making string-type HTTP requests, the JavaScript will fail because you have no environment to house it in, and hence the registration fails.

Now waiting on a 20-year coder to blast me for being a noob coder.

Haha, I'm not a coder whatsoever. I had a custom bot built in C# a couple years ago, I configured the GUI and that's it. :P

Wouldn't UBot be able to do all of what you guys are referring to, though? It browses using a preset agent (Chrome/IE/FF) and can also execute JavaScript commands... Couldn't you set up a JavaScript that shut down all the other JavaScript?

Iunno.
 
Yeah, that gets into some areas that I am simply too new in to answer. If UBot uses a browser, then you are fine and don't have to worry about it. The whole point of using a browser was to sidestep the JavaScript stuff.

I have not read up on UBot and don't know anything about it. From the sound of it, they automate the browser, maybe in the open (meaning the browser is up on screen). There is certainly nothing wrong with that. I chose to keep stuff more hidden until you need it.

Yeah, on the coding experience: I got into coding because I wanted to write bots for MMOs. I have not really written anything yet, so I have maybe about 4 months of out-of-school experience. Some of my online friends were into internet marketing, so I am giving that a shot. OMG, I suck at it... Anyway, there are certainly some coding opportunities here for bot/script apps. Sounds like you may well be on your way to having something that others want.
 
indeed :D couldn't understand a thing...

Haha... Here you go...

I'm making a tool where you navigate to a URL (a registration page) and "teach" it to fill the form out (show it which field is the name, email, etc.), and it will "remember" how to do this.

So if you want another account on that site.. Simply run the tool.

One better, it will also be able to "remember" how to edit a profile (to insert your bio, homepage, etc) even if it requires navigating multiple pages to do.

It 'remembers' this by saving a set of commands to a text file... Here's the great thing, if you and a friend are both cruising around "teaching" this thing different websites, you guys can combine your text files.

Poof, now you both have fully edited profiles on both websites. :)

[edit]

Form fields on registration pages that it currently recognizes:
username, first name, last name, email, password, captcha, custom.

Again, it will type in whatever you tell it to in a custom field. Either a set string like "John" or spintax like {John|Jake|Joe}. :)

[edit]

Hmm.. And now I'm wondering if the bot should just be ONE function, fed by TWO files. Especially now that I'm running into a few multi-page registrations. (Fill form, submit, fill some more forms, submit.)

Really, there's no need for a separate account creator bot if the bot itself is capable of being trained on and filling out forms. The same bot that can edit profiles can register accounts. Hmm. *just wasted a day... kinda* :P
 
Well, thanks to Mediacom's wonderful internet service which hasn't worked at my house for 2 days.. I had time last night to come up with a whole new platform for this. :)

I came up with some very cool stuff that I'll be implementing...

Training Fields
This will be the smaller part of the tool, just a very simple screen with 4-5 commands

[DROPDOWN] Field names
- This will contain the "name=" attribute of every form field on the page. You can dropdown and select any of them, it will focus on that field for confirmation.

[DROPDOWN] Content
- This will assign a specific content type to a form field (e.g. assign name="username" to "type in the username"). It will include (so far): firstname, lastname, username, password, email, about me, avatar, company, captcha, custom.
- Pay attention to the "custom" one, it gets very cool.

[TEXT BOX] Custom Content
- When "custom" is selected from the content dropdown, then the text in this box will be assigned to the selected form field. You can type in any spintax content here or A CUSTOM VARIABLE!!
var=Twitter @blackhat
var=Link2 <a href="site.com/link2.html">Link 2</a>

- Just wait, I'll tell you where these variables are going. :)

[DROPDOWNS] Misc.
- URLs - This dropdown will contain all <a> link tags.
- Commands
----- Click, this will click the <a> tag selected from the URL dropdown. (Or activate radio/checkbox selections.)
----- Navigate, this will navigate to the current URL loaded.
- Select Dropdown - This will contain the selections if a dropdown field name is selected.
- Captcha - This will contain the most likely tags that contain the captcha image. (15 attribute tags to the right and left of the field you designate as the captcha text.)

Filling Out Content
This part will appear a bit more complex, but like any SEO tool, once you have it set up correctly, you'll be able to basically leave the tool alone.

1. Assign your content types to specified text files. (Firstnames.txt, lastnames.txt, etc).
2. Tell the tool where to find the text file created by the training tool.
----- Show it where to find the file for creating accounts. (Signup)
----- Show it where to find the file for posting content. (Edit Profile)
3. Show it where to find your Proxies.txt file.
4. Tell it what user-agent to browse with (Chrome, IE, Firefox).
5. Specify minimum and maximum browsing intervals (time spent on pages.)

Once you've set up these options, you'll be able to "Save Profile" so you don't have to set them again. :)

(remember the custom variables???)

Also, your profile will store your custom variables (var=Twitter @blackhat), anytime the tool comes across a page where it's instructed to enter var=Twitter, it will type in your Twitter name... You can use this to set up just about any type of custom content that you find yourself entering repeatedly.. Twitter, Facebook, specific links, some other obscure weird thing that you enter regularly.
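A sketch of how that variable substitution could work. The var= syntax comes from the post above; the function and names here are hypothetical, just to show the lookup-before-fill idea:

```python
# Profile-level variables, loaded once from the saved profile (hypothetical).
profile_vars = {
    "var=Twitter": "@blackhat",
    "var=Link2": '<a href="site.com/link2.html">Link 2</a>',
}

def resolve(content, variables):
    """If a field's content is a var=Name reference, swap in the stored
    value; otherwise pass the content through unchanged."""
    return variables.get(content, content)

# resolve("var=Twitter", profile_vars) -> "@blackhat"
```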

This thing is going to be awesome... Time to get started. :)
 
You're on the right track ;) The guys over at IgniteSeo already did sort of the same thing you are suggesting. Basically, they have a bunch of prebuilt fields containing certain regex values that the program checks against, and they even let the user add their own with a learning mode.

Head over to igniteresearch and watch their vids on how they do what they do. It might give you some inspiration
 
There is an important thing to manage with platforms like WordPress:
For example with WordPress, you must find the root URL of the website to locate the wp-comment.php page (where you send the POST with its parameters).

Beny
 