Can they see the Python source code?

BlueWhale27

Registered Member
Joined
May 10, 2011
Messages
63
Reaction score
6
A hypothetical situation…
Let's say that YOU are a Huge company, with Lots of available money to pay programmers for Anything.
I am a little guy, and I have a Python script that scrapes data from Your website.
You do NOT want me to scrape records from Your website.
Is there a way for You to see the source code of my Python script, as it is running?
Hypothetically, of course…
Thanks.
BlueWhale27
 

Ecodor

Senior Member
Joined
Nov 5, 2017
Messages
858
Reaction score
200
If the owner can track your connection, then he can do whatever he wants with your PC if he has the knowledge, so be careful :)
 

Aprium

BANNED
Joined
Feb 23, 2018
Messages
66
Reaction score
44

They can only see the HTTP(S) requests your Python script sends, and of course the IP address they originate from.

Nothing else. Of course, they can figure out that you're using a Python script if you are not careful and leave the default User-Agent from a library like Requests.

Programmers aren't magicians, and just because a company can add more programmers does not mean that things get built twice as fast or with higher quality.

Trust me, get a good source of proxies and you should be good. Scrape away, and good luck!
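
To make the default User-Agent point concrete, here is a minimal sketch using only the standard library's urllib (Requests behaves similarly and announces itself as `python-requests/<version>` by default). The URL and the browser-style string below are placeholders, not values from this thread, and no network request is actually made.

```python
import urllib.request

# Headers urllib attaches to every request unless you override them.
default_headers = dict(urllib.request.build_opener().addheaders)
default_ua = default_headers["User-agent"]
print(default_ua)  # something like "Python-urllib/3.x"

# Overriding it. The URL and the User-Agent string are placeholders.
req = urllib.request.Request(
    "https://example.com/",
    headers={"User-Agent": "Mozilla/5.0 (X11; Linux x86_64)"},
)
# urllib normalizes stored header names to "User-agent" internally.
custom_ua = req.get_header("User-agent")
print(custom_ua)
```

The default string is the kind of footprint that tells a site "this is a Python script", even though it reveals nothing about the script's source.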
 

davids355

Super Moderator
Staff member
Moderator
Jr. VIP
Joined
Apr 25, 2011
Messages
18,509
Reaction score
24,349
Website
www.blackhatworld.com
If the owner can track your connection, then he can do whatever he wants with your PC if he has the knowledge, so be careful :)

I'm sure a big company is not going to employ hackers to illegally break into the computer of a lone web scraper, though.

So I'd say no: in reasonable circumstances they're not going to be able to see the code behind your script.
 

BlueWhale27

Registered Member
Joined
May 10, 2011
Messages
63
Reaction score
6
Wow. Thanks for the fast replies, people.
"Track my connection"... Do you mean... simply knowing my IP? Or, would he need additional info?
Actually, I wasn't asking about how You (a Great programmer) could get everything on my PC, like my credit card info...
I'm asking if You can see the Source code of my script when it is running?

So they could NOT see that source code, unless they get into my entire computer?
AND That's only IF I don't use a proxy or some other method of hiding...?
 

Javardo69

Junior Member
Joined
Jul 19, 2014
Messages
117
Reaction score
9
They would have to hack your computer and check your scripts for a URL pointing to their website. I believe only LinkedIn tries to sue scrapers, but it's only by IP.
 

Kuem

Power Member
Joined
Dec 29, 2017
Messages
676
Reaction score
220
Just your IP. If you want a secure VPN, Mullvad is the way to go. $5 a month, quick, and I don't think they log.
 

Aprium

BANNED
Joined
Feb 23, 2018
Messages
66
Reaction score
44

The "hiding" part depends on how many layers you have between you and the proxy/IP that makes the request. The source code part is completely irrelevant; it doesn't even come into the equation. Picture it this way: when you talk to a website, you are just sending information over the wire, mostly HTTP(S) requests. Nothing else. It doesn't matter whether you send them with a browser, a bot, or some other application.

In other words, all these programmers or companies see are HTTP(S) requests, nothing else. They have servers that receive connections from clients (you and other people) that send HTTP(S) requests. There is no way to read the source code from those requests, or to know for certain which tool sent them.

The only exception is if you leave footprints that identify the source, such as your User-Agent, and that tool is open source in a public repository where the code can be examined.

If none of that applies to your specific Python script, then you have nothing to worry about. Like I said, just worry about picking a good source of proxies, and you should be fine.
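
A rough way to see this for yourself is to play both sides on your own machine. The sketch below starts a throwaway local server, sends it one request from urllib, and records everything the server side learns. The names `LoggingHandler` and `scrape_once` and the path `/records?page=1` are made up for illustration, and a real site may log more (timing, TLS details), but none of that is your source code.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

seen = []  # everything the "website" learns about each request


class LoggingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        seen.append({
            "client_ip": self.client_address[0],
            "request_line": self.requestline,
            # Lower-case the names so lookups do not depend on header casing.
            "headers": {k.lower(): v for k, v in self.headers.items()},
        })
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default stderr logging


def scrape_once():
    """Start a local server, hit it once, and return what the server saw."""
    server = HTTPServer(("127.0.0.1", 0), LoggingHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        # An empty ProxyHandler ignores any proxy set in the environment.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(f"http://127.0.0.1:{port}/records?page=1") as resp:
            resp.read()
    finally:
        server.shutdown()
        server.server_close()
    return seen[-1]


record = scrape_once()
print(record)
```

What comes out is an IP address, a request line, and a handful of headers. The script that produced them never leaves the client.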
 

BlueWhale27

Registered Member
Joined
May 10, 2011
Messages
63
Reaction score
6
CRAP! Now, THAT opens up an entirely New question...
With ONLY my IP, a Great programmer can view / download Any or ALL of the files on my PC?
REALLY?
 

Ecodor

Senior Member
Joined
Nov 5, 2017
Messages
858
Reaction score
200
I would say there's a 90% chance you won't have any problems.
 

Aprium

BANNED
Joined
Feb 23, 2018
Messages
66
Reaction score
44

Do you have a learning deficit or something?

Almost 7 years ago you were looking for a web scraper:
https://www.blackhatworld.com/seo/i-need-a-bot-php-perl-script-will-pay.324175/

2432 days later, you are still asking some of the most basic questions?

To me this seems more like a troll than a genuine poster.
 

BlueWhale27

Registered Member
Joined
May 10, 2011
Messages
63
Reaction score
6
No, Aprium,... I am a Serious poster.
I've been working on the same project for about 11 years, now.
I've read and learned a lot in that time, but I've also had the project on hold (doing nothing on it) for several years.
I'm now getting back into it, BECAUSE I have always thought that it IS a million-dollar idea.
And the monthly expenses (about $45) to Maintain the atmosphere are a drain... and should be Stopped IF I cannot get this thing Operational.
In learning more, over time, I keep discovering that there are more roadblocks that Need to be overcome.
I'm Serious.
 

Alfon

Regular Member
Joined
Jan 6, 2013
Messages
327
Reaction score
126
Website
127.0.0.1
Short answer: NOPE.

P.S. If they are the NSA, then they can track your public IP address, access your computer through a backdoor, and do whatever they want with it.
 

BlueWhale27

Registered Member
Joined
May 10, 2011
Messages
63
Reaction score
6
A Correction... I've been working on this project for ONLY about 8 years.
Regardless, it's about time to sh1t or get off the pot.
 

BlueWhale27

Registered Member
Joined
May 10, 2011
Messages
63
Reaction score
6
With what I've read / learned about this stuff, Here's what I need…
While hiding the User-Agent, SEARCH for the records that I want from a Huge/Rich company website.
That Search will find several hundred records… not really a lot.
While Scraping info from each of those records, I need to change the IP that I am using,
AND change the MAC of my PC, after the Scrape of EACH individual record.
Yes,… that will slow down the process. That's irrelevant. I want this Bulletproof.
Verify that the new IP (and MAC) being used Has Not been used in the previous 20 Scrapes.
I think that should sufficiently allow the bot to Not be discovered… but I dunno'...
Any ideas / suggestions / PMs / recommendations / etc… are Eagerly Welcome.
PLEASE!... Thanks very much, people.
BlueWhale27
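
The "not used in the previous 20 scrapes" check can be sketched with a fixed-size window. This is only an illustration under assumptions: the class name `RecentlyUsedPicker` is made up, and the addresses are hypothetical ones from the 203.0.113.0/24 documentation range, not real proxies. As far as I know, the MAC address is a link-layer detail that normally does not travel past your own router, so the sketch only tracks IPs.

```python
import random
from collections import deque


class RecentlyUsedPicker:
    """Hands out items from a pool, skipping any used in the last `window` picks."""

    def __init__(self, pool, window=20):
        self.pool = list(dict.fromkeys(pool))  # drop duplicates, keep order
        if len(self.pool) <= window:
            raise ValueError("pool must contain more unique items than the window")
        # A deque with maxlen forgets the oldest pick automatically.
        self.recent = deque(maxlen=window)

    def next(self):
        candidates = [item for item in self.pool if item not in self.recent]
        choice = random.choice(candidates)
        self.recent.append(choice)
        return choice


# Hypothetical proxy addresses, for illustration only.
proxies = [f"203.0.113.{i}:8080" for i in range(1, 31)]
picker = RecentlyUsedPicker(proxies, window=20)
picks = [picker.next() for _ in range(200)]
print(picks[:5])
```

The pool has to be larger than the window, otherwise there would eventually be no unused address left to pick.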
 