Scrapejet Videos - How Tos and more

Cool link lock feature.
Is link lock limited to 3 keywords, or can I just put more in the form without loading a file, like this?

Code:
http://www.website.com {kw1|kw2|kw3|kw4|kw5|...|kw20}
 
Link lock is not limited; add as many keywords as you like.
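
For anyone unsure what the {kw1|kw2|...} syntax does, here is a minimal Python sketch of how such a group is usually expanded: one option is picked at random each time the line is used. This only illustrates the general spin syntax and is not ScrapeJet's actual code; the function name `spin` and the example line are made up for illustration.

```python
import random
import re

# Matches one innermost {a|b|c} group (no braces nested inside it).
SPIN_GROUP = re.compile(r"\{([^{}]*)\}")


def spin(text: str, rng: random.Random) -> str:
    """Replace every {a|b|c} group with one randomly chosen option."""
    while True:
        new_text = SPIN_GROUP.sub(
            lambda m: rng.choice(m.group(1).split("|")), text
        )
        if new_text == text:  # no groups left to expand
            return new_text
        text = new_text


# Hypothetical link lock line: a URL, a space, then the keyword group.
line = "http://www.website.com {kw1|kw2|kw3|kw4|kw5}"
url, anchor = spin(line, random.Random()).split(" ", 1)
print(url, "->", anchor)
```

Each call picks one keyword, so across many posts the anchor text varies between all the keywords in the group, however many there are.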
 
It seems that SJ is now integrated with Captcha Sniper.
But how do you set it up? SJ has a check box that activates when it detects CS.

The question is, which settings do we choose in CS (beta)?
There are 3 possible settings:
- Use host file redirect
- Use clean image algorithms
- Use CSSE

:confused:
 
In tests using CS 1.5 Beta, we did not enable any of the 3 you mentioned.
Jet communicates with CS directly, without the host file redirect.

Also note that there have been new platforms added (downloadable).
 
That you shouldn't use private proxies for scraping is a common misconception; you just have to do it right. Keep your connections at 20% or less of your proxies and you can pretty much scrape forever. It's faster than checking public proxies and then dealing with all the failure rates and low speed.

QFT. 1.0.0.17 is a KILLER update for Scrapejet. It's the one that breaks the bank: imported proxy support (so Goblin and Multiply can be used), Captcha Sniper support (excellent), and some extra functionality on captchas (the "add two numbers together" one is solved within the software). Add to this the availability of footprints and platforms, and SJ is now a sodding powerhouse for second and third tier linking.

As for the above quote, it's true.

If you have 10 proxies, use 2 threads. It works just fine. You'll still get 30-40 URLs a second with good keywords.
However, I would add that I get GREAT results running 1 thread per proxy through Yahoo and just ignoring Google altogether. I realize that's not popular, and I maybe get 20% fewer results, but when you run with 500 keywords and get close to half a million results... well, I get all the ones that matter. Enough, anyway. And 450,000 can take well over an hour even when using 50 threads. I imagine it would take half a day or so using 3 or 5.
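
The 20% rule of thumb is simple arithmetic, sketched below in Python. The function name and the 20% default are only illustrative (the figure comes from the quote above, not from any ScrapeJet setting), and the result is rounded down but never below one thread.

```python
def max_threads(proxy_count: int, percent: int = 20) -> int:
    """Threads to run so that connections stay at `percent` of the proxies."""
    if proxy_count <= 0:
        return 0
    # Integer math avoids floating point rounding; always allow one thread.
    return max(1, proxy_count * percent // 100)


for proxies in (3, 10, 50):
    print(proxies, "proxies ->", max_threads(proxies), "threads")
```

With 10 proxies this gives the 2 threads mentioned above; passing `percent=100` corresponds to the 1 thread per proxy setup.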

SJ is now powering on with this, and as an alternative to those $97-a-month SEO deals that promise you "1000 links a day", buying SJ is a no-brainer, especially with Sweetfunny's rep for keeping software working.

Great thread. I'll keep popping back

Scritty
 
It seems the Captcha Sniper integration does not work for me, as no captchas are being solved. But that's when I resume projects. To use Captcha Sniper, do I have to start a project from the beginning?
 
Make sure you have the corresponding platforms checked in CS, like WP and Lifetype (there's a typo in CS; I think it's "lifetime" there).

Does ScrapeJet recognize CS and enable the checkbox on the config screen?

CS integration might not work with some platforms yet, because the definition files have not been updated to include the captcha identifier.

The WordPress def currently includes sicaptcha and recaptcha; other captcha types will be added over time (as soon as they are encountered).
The same goes for BlogEngine and other platforms. There might be no captcha identifier in the def file yet, but identifiers will be added over time (for captchas that CS supports and can solve).
The important thing was to integrate the CS communication first. Adding captchas is then just a matter of tuning the def files.

In some tests yesterday using the LifeType platform, we could double the success rate thanks to the CS integration.
 
Yes, SJ recognized CS and the box is checked.

And the correct checkboxes for blog and pixelpost are checked. Maybe it's the def files not having a captcha identifier.
 
I assume it is the def file. You can check it by looking for "imagecaptchaidentifier" in the def file.

For a simple test, try just the lifetype platform. Having CS and Jet side by side, you should see how the captchas get submitted.
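
If you have many def files, the check can be scripted. The Python sketch below assumes the def files are plain text with a .def extension sitting in one folder; the folder name "definitions" is a placeholder, so point it at wherever your install keeps them. It only searches for the string and does not parse the file format.

```python
from pathlib import Path

IDENTIFIER = "imagecaptchaidentifier"


def defs_with_captcha_identifier(folder: Path) -> dict[str, bool]:
    """Map each .def file name to whether it mentions the identifier."""
    results: dict[str, bool] = {}
    if not folder.is_dir():
        return results
    for path in sorted(folder.glob("*.def")):
        text = path.read_text(encoding="utf-8", errors="ignore")
        # Case-insensitive search, in case the key is written in mixed case.
        results[path.name] = IDENTIFIER in text.lower()
    return results


# "definitions" is a hypothetical folder name, not a known ScrapeJet path.
for name, found in defs_with_captcha_identifier(Path("definitions")).items():
    print(name, "has identifier" if found else "NO identifier")
```

Any platform listed with no identifier will not send captchas to CS until its def file is updated.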
 
I found it only in the WordPress def file, not in the BlogEngine or Movable ones. Do I have to re-download the updated .def files from somewhere?

But even WordPress hasn't solved any captchas after running SJ overnight.

Maybe it doesn't work when resuming a project? Maybe I need to start the project from the beginning?
 
No, it does not matter whether you resume or restart.

If there is ever a new def file, it will appear in the list that pops up when you click "Install definitions" on the config page.

Just do the following test:

1. Download the lifetype definition file.
2. Enable only the lifetype def file on the config screen and uncheck all others.
3. Make sure CS is running and detected by Scrapejet.
4. Create a project with a simple keyword like "test".
5. Add the project and run it. With CS and Jet side by side, you should see lots of captchas sent to CS.

We will add a "Captcha Test" function soon, which will let you send some test captchas to CS to verify that the integration is working.
 
OK, will try it now. Thanks for all the support!

I also see that pixelpost doesn't have a captcha identifier. I guess those things will be added in the future.

The strange thing is, I ran WordPress overnight and no WordPress captchas at all got sent to CS from ScrapeJet...
 
I know ScrapeJet developers have so many things already in their todo list, and I just want to make their life a little harder.

So... Feature request:
It would be great to have "Assign a folder with lists to post". Instead of using one 500,000 list, we could split it into lists of 1,000, and ScrapeJet would post / verify links for each 1,000 list.
Advantages: fewer crashes using small lists, quicker results in the "Backlinks" folder.
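
The splitting half of this can already be done outside ScrapeJet. Below is a rough Python sketch; the file naming scheme (list_0001.txt and so on) and the function names are my own invention, and it assumes the list is plain text with one URL per line.

```python
from pathlib import Path
from typing import Iterator


def split_list(urls: list[str], size: int = 1000) -> Iterator[list[str]]:
    """Yield consecutive chunks of at most `size` URLs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def write_chunks(urls: list[str], out_dir: Path, size: int = 1000) -> list[Path]:
    """Write each chunk to its own numbered text file and return the paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, chunk in enumerate(split_list(urls, size), start=1):
        path = out_dir / f"list_{i:04d}.txt"
        path.write_text("\n".join(chunk) + "\n", encoding="utf-8")
        paths.append(path)
    return paths


# Made-up URLs just to show the chunk sizes.
demo = [f"http://example.com/page{i}" for i in range(2500)]
print([len(chunk) for chunk in split_list(demo)])
```

Posting from a folder of such files would still need the requested feature; this only prepares the small lists.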

Thank you :D

Edit:
Ahh.. I forgot another one: add a tiny option to use the "bad words" list only to filter URLs, not page content. I try to avoid links from adult sites to my child-related sites, but I don't really care about neighbor links.
Now if I experience hiccups I'll know why :)
 
I know ScrapeJet developers have so many things already in their todo list, and I just want to make their life a little harder.

So... Feature request:
It would be great to have "Assign a folder with lists to post". Instead of using one 500,000 list, we could split it into lists of 1,000, and ScrapeJet would post / verify links for each 1,000 list.
Advantages: fewer crashes using small lists, quicker results in the "Backlinks" folder.

Thank you :D

Will be added to the todo list

Edit:
Ahh.. I forgot another one: add a tiny option to use the "bad words" list only to filter URLs, not page content. I try to avoid links from adult sites to my child-related sites, but I don't really care about neighbor links.
Now if I experience hiccups I'll know why :)

If I do not misunderstand you, the blacklist (not bad words list) does exactly what you want.
 
You are great!

For bad words I'm a little confused too. I can't remember now where I read that this option skips pages whose content contains "bad words". If it just filters URLs containing "bad words", it's what I need.
 
Very informative thread, and I would also like to know the answer to the above question.
 
Beside the checkboxes for bad words and black list there is an icon. Hover the mouse over it and it will tell you what the function is for.

But to make it short:
- The bad words list filters pages whose content contains one or more of the words in the bad words list.

- The black list filters URLs containing any of the words listed in the black list (that's what you want).
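
To make the difference concrete, here is a small Python sketch of the two checks as described above. It is not ScrapeJet's code: the function names are made up, and simple case-insensitive substring matching is an assumption about how the lists are applied.

```python
def blocked_by_blacklist(url: str, blacklist: list[str]) -> bool:
    """Black list: looks only at the URL itself."""
    url_lower = url.lower()
    return any(word.lower() in url_lower for word in blacklist)


def blocked_by_bad_words(page_content: str, bad_words: list[str]) -> bool:
    """Bad words list: looks at the content of the fetched page."""
    content_lower = page_content.lower()
    return any(word.lower() in content_lower for word in bad_words)


words = ["adult"]
print(blocked_by_blacklist("http://adult-site.example/blog", words))  # True
print(blocked_by_blacklist("http://toys.example/blog", words))        # False
print(blocked_by_bad_words("A comment mentioning adult stuff", words))  # True
```

So a clean URL whose page happens to contain a listed word is only dropped by the bad words list, which is why the black list is the one to use if you only care about the URL.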
 
Nice thread with lots of info. I wonder if anyone else would find these suggestions for Scrapejet useful.

1. Submit to x number of blogs per day (limit the amount it posts in a 24hr period).
2. Submit to a domain x number of times (so you can limit posting to a domain to 1 or 2 times rather than 5 or 10+).

Would this be something that would make scrapejet even better?
thanks
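
Until something like this exists, a target list could be trimmed beforehand. The Python sketch below shows one way the two limits might work on a plain list of URLs; the function name, parameters, and defaults are all hypothetical and do not correspond to any existing ScrapeJet option.

```python
from collections import Counter
from typing import Optional
from urllib.parse import urlsplit


def limit_targets(
    urls: list[str], per_domain: int = 2, per_day: Optional[int] = None
) -> list[str]:
    """Keep at most `per_domain` URLs per host and `per_day` URLs overall."""
    seen: Counter = Counter()
    selected: list[str] = []
    for url in urls:
        if per_day is not None and len(selected) >= per_day:
            break  # daily quota reached
        domain = urlsplit(url).hostname or ""
        if seen[domain] >= per_domain:
            continue  # this domain already has enough posts
        seen[domain] += 1
        selected.append(url)
    return selected


targets = [
    "http://blog-a.example/post1",
    "http://blog-a.example/post2",
    "http://blog-a.example/post3",
    "http://blog-b.example/post1",
]
print(limit_targets(targets, per_domain=2, per_day=3))
```

Here the third blog-a URL is skipped because of the per-domain cap, and the daily cap would stop the run once 3 targets are selected.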
 
Just bought SJ a day or two ago. As I'm not a heavy user and only want to use it moderately as a supplemental backlink builder, is it necessary to have Captcha Sniper?
 