Is it possible to use ScraperAPI or ScrapingBee to scrape PAAs from Google by passing JavaScript instructions?

Daemon

This article says that you can use ScraperAPI to execute JavaScript via a headless browser. I believe JS rendering costs 10 credits.
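For illustration, here is a minimal sketch of how such a request URL could be assembled. The endpoint and the `api_key`, `url`, and `render` parameter names are my understanding of ScraperAPI's query-string interface and should be checked against their current docs; `build_scraperapi_url` is a hypothetical helper name, not part of any SDK.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against ScraperAPI's documentation.
SCRAPERAPI_ENDPOINT = "https://api.scraperapi.com/"


def build_scraperapi_url(api_key: str, target_url: str, render: bool = True) -> str:
    """Build a request URL that asks the service to fetch target_url.

    render=True adds the flag that requests headless-browser (JS) rendering.
    """
    params = {"api_key": api_key, "url": target_url}
    if render:
        params["render"] = "true"
    # urlencode percent-encodes the target URL so it survives as one parameter.
    return SCRAPERAPI_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_scraperapi_url("YOUR_API_KEY", "https://www.google.com/search?q=what+is+a+proxy"))
```

The resulting URL can be fetched with any HTTP client; the rendered HTML comes back as the response body.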

The ScrapingBee request builder allows you to pass JavaScript instructions. JavaScript rendering costs 5 credits, and if you want to scrape Google you need to pass the custom_google=True parameter, which costs 20 credits.
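As a rough sketch of what the request builder produces, the snippet below assembles a ScrapingBee URL with a JS scenario that clicks a list of elements, pausing after each click. The `render_js`, `custom_google`, and `js_scenario` parameter names and the `click` / `wait` instruction format are my understanding of ScrapingBee's API and should be verified against their docs. The CSS selectors are placeholders: Google's PAA markup changes often, so the real selectors have to be found by inspecting the page.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint; verify against ScrapingBee's documentation.
SCRAPINGBEE_ENDPOINT = "https://app.scrapingbee.com/api/v1/"


def build_scrapingbee_url(api_key, target_url, click_selectors, wait_ms=1000):
    """Build a request URL whose JS scenario clicks each selector in order."""
    instructions = []
    for selector in click_selectors:
        instructions.append({"click": selector})  # e.g. expand one PAA question
        instructions.append({"wait": wait_ms})    # give new questions time to load
    params = {
        "api_key": api_key,
        "url": target_url,
        "render_js": "true",
        "custom_google": "true",
        "js_scenario": json.dumps({"instructions": instructions}),
    }
    return SCRAPINGBEE_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical selectors, for illustration only.
    selectors = ["#paa-question-1", "#paa-question-2"]
    print(build_scrapingbee_url("YOUR_API_KEY", "https://www.google.com/search?q=what+is+a+proxy", selectors))
```

Going by the credit costs quoted above, each such call would be billed at the Google rate, which is worth keeping in mind when expanding many questions per query.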

I'm not entirely sure of the limits of what's possible with these scraping APIs. Is it possible to scrape People Also Ask (PAA) questions with these APIs, the way SEO Minion does at 2:30 to 5:32 of this video?

Is scraping PAAs in bulk something you can do using these scraping APIs?

JS Scenario.jpg
 
It’s also possible without executing the JS at all. This technique would be too expensive for PAA.

Also, I used ScrapingBee before for scraping Instagram and I regretted it. I use a different service now.
 
Is it possible to expand? I know you can get the original 4 questions, but can you expand without executing JS? I didn't think that was possible.
Would you be willing to share any information on this service, and your regrets with ScrapingBee? Thanks :)
 
It is also possible to expand the section, but since I am a lazy guy, I just decided to dig deeper into the related searches in a tree structure. Since most of the PAAs and related searches complement each other, I get a more or less similar result. I also wanted to scrape related searches with that script, so the output was exactly what I needed.
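A minimal sketch of that tree-style traversal, under my own assumptions: the poster did not share code, so this is just a breadth-first walk where `fetch_related` is a hypothetical callable that returns the related searches for one query (however you choose to scrape them). The depth and query caps are there to keep the number of requests bounded.

```python
from collections import deque
from typing import Callable, Iterable, List


def crawl_related(
    seed: str,
    fetch_related: Callable[[str], Iterable[str]],
    max_depth: int = 2,
    max_queries: int = 100,
) -> List[str]:
    """Breadth-first walk over related searches, starting from seed.

    Returns every distinct query discovered, in discovery order.
    """
    seen = {seed}
    order = [seed]
    queue = deque([(seed, 0)])
    while queue and len(order) < max_queries:
        query, depth = queue.popleft()
        if depth >= max_depth:
            continue  # do not expand queries at the depth limit
        for related in fetch_related(query):
            if related not in seen and len(order) < max_queries:
                seen.add(related)
                order.append(related)
                queue.append((related, depth + 1))
    return order


if __name__ == "__main__":
    # Fake data standing in for real scraped results.
    fake = {"a": ["b", "c"], "b": ["d", "a"], "c": ["d"], "d": ["e"]}
    print(crawl_related("a", lambda q: fake.get(q, [])))
```

Because the fetcher is passed in, the same loop works whether the related searches come from raw HTML, a scraping API, or a cache.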

I can share the service with you; I just didn't want to look like I'm advertising or something. The name is ScrapeDo. I met the lead guy behind that service after I subscribed, so I have some technical info about their system, which I liked as a developer.

Most of my ScrapingBee requests failed in my project (not for scraping Google), and the response time was slow as hell. It was impossible for me to run that project with ScrapingBee. That's why I didn't like it. I can't say ScrapeDo is the best out there (since I didn't try all providers), but in my case it gave me the most robust solution: before assigning a proxy to a request, they check when that proxy was last used to hit that specific website, so it's not a random proxy per request. That contributes to my success rate. It is also slightly cheaper, so it helped my numbers game in that project as well. But I believe you should try the free version before buying, since scraping proxy services (all of them) may perform weirdly with some websites.
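To make the "not a random proxy per request" idea concrete, here is a small sketch of per-website proxy selection. It is only an illustration of the heuristic as described above, not ScrapeDo's actual implementation; the class name, the cooldown value, and the tie-breaking rule are all assumptions.

```python
import time


class ProxyPool:
    """Pick the proxy that has been idle longest for a given website."""

    def __init__(self, proxies, cooldown_seconds=60.0, clock=time.monotonic):
        self._proxies = list(proxies)
        self._cooldown = cooldown_seconds
        self._clock = clock  # injectable so the logic can be tested without sleeping
        self._last_used = {}  # (proxy, domain) -> timestamp of last use

    def pick(self, domain):
        """Return a proxy for domain, or None if all were used too recently."""
        if not self._proxies:
            return None
        now = self._clock()

        def idle(proxy):
            last = self._last_used.get((proxy, domain))
            return float("inf") if last is None else now - last

        best = max(self._proxies, key=idle)  # ties go to the earliest in the list
        if idle(best) < self._cooldown:
            return None
        self._last_used[(best, domain)] = now
        return best


if __name__ == "__main__":
    pool = ProxyPool(["10.0.0.1:8080", "10.0.0.2:8080"], cooldown_seconds=30)
    print(pool.pick("example.com"), pool.pick("example.com"), pool.pick("example.com"))
```

Usage is tracked per (proxy, website) pair, so a proxy that just hit one site is still considered fresh for another.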

But for scraping Google, I simply use an AWS function that acts as my rotating proxy, and I pay ridiculously little for my PAAs.
 
Wow, great post! Thanks a lot for taking the time to write such a helpful response :)

Can you share some more about this AWS function? I googled and found these pages, but I have no idea if these have anything to do with what you are referring to:

Something about AWS Lambda proxy integrations?:
https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-set-up-simple-proxy.html
https://docs.aws.amazon.com/apigateway/latest/developerguide/set-up-lambda-proxy-integrations.html
Something about Amazon RDS proxy?:
https://aws.amazon.com/rds/proxy/
 
Well, it will not show up when you search for "Lambda proxy" or similar, as it isn't an official Lambda feature. RDS Proxy is something entirely different, and "Lambda proxy integration" is a misleading term in our context: it refers to how API Gateway integrates with Lambda.

I simply use Lambda to send a request to the desired URL and return the response as the function's own response. Since it uses a different IP every time it cold-starts, I get a rotating proxy with high concurrency for cheap. I can provide more info via direct message; that would be more appropriate.
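A minimal sketch of what such a function could look like, assuming a Python Lambda runtime. The poster did not share their code, so the event shape (`{"url": ...}`, optional `"user_agent"`) and the return format are my assumptions; only the `lambda_handler(event, context)` signature and the standard-library calls are known quantities. The IP-per-cold-start behaviour is the poster's observation, not something this code controls.

```python
import urllib.error
import urllib.request


def lambda_handler(event, context):
    """Fetch event["url"] and hand the upstream response back to the caller."""
    url = event.get("url")
    if not url or not url.startswith(("http://", "https://")):
        return {"statusCode": 400, "body": "missing or invalid 'url'"}

    # Assumed optional field; falls back to a generic browser-like User-Agent.
    headers = {"User-Agent": event.get("user_agent", "Mozilla/5.0")}
    request = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            body = response.read().decode("utf-8", errors="replace")
            return {"statusCode": response.status, "body": body}
    except urllib.error.HTTPError as err:
        # Upstream answered with an error status; pass it through.
        return {"statusCode": err.code, "body": err.read().decode("utf-8", errors="replace")}
    except urllib.error.URLError as err:
        # Network-level failure (DNS, timeout, refused connection).
        return {"statusCode": 502, "body": str(err.reason)}
```

The function would be invoked with a payload such as `{"url": "https://www.google.com/search?q=..."}`, and the caller parses the returned `body` as ordinary HTML.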
 