I can bypass Distil Networks' anti-scraping protection - What's next?

Status
Not open for further replies.

GrandScraperG

BANNED
Joined
Jan 12, 2019
Messages
52
Reaction score
5
Hi Guys!

I am a security researcher, and I currently have a solution that bypasses any scraping protection made by Distil.

ticketmaster
Skyscanner
edreams
finnair
lufthansa
whitepages
similarweb

And many, many more sites (thousands of them).
It's a worldwide solution, and I can run the bypass at very large scale.

What's next? What should I do with it? And mostly, since this solution is expensive, how can I monetise it?
 

GrandScraperG

BANNED
Joined
Jan 12, 2019
Messages
52
Reaction score
5
Add to this ;-)


How will I make money? This solution cost me a lot of time and research.

I was thinking that I could make a website that compares flight tickets, since those sites don't allow this data to be scraped.
But it's not my thing, and I have no clue how much I would earn, if anything at all.
 

Arc717

Newbie
Joined
Aug 7, 2018
Messages
17
Reaction score
10
jamie3000, I've seen you on that other Distil thread for a while. You haven't gotten a solution for scraping them yet? Does that thing I posted in Python not work for you? Or are you looking for a cleaner solution?

Also, GrandScraperG, TripAdvisor already does that. Not saying it isn't possible, just saying it probably isn't a one-man job.
 

GrandScraperG

BANNED
Joined
Jan 12, 2019
Messages
52
Reaction score
5
What's important is not just to have a solution; the solution should handle a high load of requests without being detected.
 

Arc717

Newbie
Joined
Aug 7, 2018
Messages
17
Reaction score
10
Selenium can handle a high load of requests without being detected. I believe all you have to do is give each browser a unique profile and a unique proxy and you are set. Granted, Selenium is a bit heavy on RAM and CPU (even in headless mode) for making basic requests, but it gets the job done. Reverse engineering the JS isn't worth it in my situation. Many people may just need the ability to scrape a Distil site; they may not need a high load of requests.

Personally I'm after about 50 requests/sec. I haven't gotten around to running it yet because I've been busy with other work, but I think that's perfectly doable through Python with multiprocessing (or through any other language with more than one thread).
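For a rough back-of-the-envelope on that 50/sec figure: the number of browsers that have to be running at once is roughly the target rate times how long one request takes (Little's law). The sketch below only does that arithmetic. The 2 seconds per request and 300 MB per browser are placeholder guesses, not measurements, and the function names are made up for illustration, so swap in your own numbers.

```python
import math


def browsers_needed(target_rps: float, seconds_per_request: float) -> int:
    """Little's law: concurrency = throughput * latency, rounded up."""
    return math.ceil(target_rps * seconds_per_request)


def ram_needed_gb(browsers: int, mb_per_browser: float) -> float:
    """Total memory for all concurrent browsers, in GB."""
    return browsers * mb_per_browser / 1024


TARGET_RPS = 50            # the rate mentioned in the post
SECONDS_PER_REQUEST = 2.0  # ASSUMPTION: time for one page load in a browser
MB_PER_BROWSER = 300.0     # ASSUMPTION: memory per headless browser instance

if __name__ == "__main__":
    n = browsers_needed(TARGET_RPS, SECONDS_PER_REQUEST)
    print(f"concurrent browsers: {n}")
    print(f"approx. RAM: {ram_needed_gb(n, MB_PER_BROWSER):.1f} GB")
```

With those guesses it comes out to about 100 browsers running at the same time, which is why the RAM and CPU overhead matters once you go past a handful of requests per second.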

Distil passes a whole lot of JavaScript on to your browser. It may not be worth reverse engineering it all unless you already have the ability to do it and there is a major incentive.

Selenium and browsers aren't ideal, but I think they are good enough here.
 

dankepika

Jr. VIP
Joined
Nov 22, 2017
Messages
118
Reaction score
36
At what scale? How does the cost of operation scale with the number of requests with your method?
 

portina

Newbie
Joined
Feb 16, 2019
Messages
6
Reaction score
3
I replied to you in another thread, but my feeling is that you can take this knowledge and apply it in a couple of ways:

- pick a niche and get the data from it, do some analysis and then sell that on to people who would find it useful. Naturally, that requires marketing/sales
- teach others what you have done and how, so they can benefit and you can move on to another challenge
- 'rent' out your skills to get data for people

I'm personally interested in this, as we have been trying to get the data from a Distil-protected site for some time and failing. That's why I'm even on this forum (though now that I'm here, it's actually very interesting). I'd really like to talk if we can.
 