Physical Phone Farm — Best Software for Stealth Automation?

saphiron · Dec 19, 2025

Hi everyone,

Setting up a 20-device Physical Phone Farm for my Marketing Agency. Goal: High-Quality Content Distribution (Music/Artists) on TikTok, Reels & Shorts — not spam. Long-term account health is the priority.

2 questions for the veterans:

1. Stealth vs Convenience
My team wants tools like Xiowei or GenFarmer. I refused; I suspect TikTok scans for these package names and flags the device. Currently forcing QtScrcpy (pure ADB) with no agent app installed.
→ Am I being too paranoid, or is raw ADB the only safe way?

2. Best Automation Software
What are you guys actually using for physical farms at scale? Looking for:

TikTok/Reels/Shorts compatible
Minimal detection risk
Centralized control (20+ devices)

If you're running farms targeting Tier 1 (EU/US), what's your stack?

Thanks!

xishuaionline · Dec 19, 2025

I’m running a different approach, mainly because I don’t want any software-level automation footprint on the target devices.
Instead of installing automation tools (Xiowei, GenFarmer, or other agent-based apps), I keep all phones 100% stock — no extra packages, no accessibility services, no resident automation software, and no modified ROMs.
The core idea is to move automation completely outside the phone.
A separate control system handles logic and scheduling
Cameras visually observe each phone’s screen (similar to human eyesight)
Robotic actuators physically tap, swipe, and interact with the devices
From the device and app perspective, it’s just real hardware with real touch input.
On top of basic computer vision, I’m experimenting with using an LLM (e.g. ChatGPT or similar) at the reasoning layer. The camera feed is combined with the LLM to better interpret what’s on the screen, understand UI context, and make more appropriate action decisions, rather than relying purely on fixed rules or templates.
The controlled phones can be any model — Android or iOS — with different OS versions and screen sizes. Since everything is driven by visual recognition and physical interaction, the same control logic works across all devices without writing separate automation scripts for each model or OS version.
Only the master device needs to be rooted or jailbroken. It acts as the “brain,” where all automation logic, decision-making, and learning live.
The master device observes the screens via cameras, decides what action to take, and sends commands to robotic actuators — very similar to how a human brain sends instructions to the hand to operate a touchscreen.
Because nothing runs on the controlled phones themselves, there’s no need to maintain different software stacks for Android vs iOS, or worry about OS updates breaking automation logic.
Just sharing my current thinking — happy to hear how others are approaching this.

Noah123 · Dec 19, 2025

xishuaionline said:
I’m running a different approach, mainly because I don’t want any software-level automation footprint on the target devices.

This looks so cool, nice work bro.

saphiron · Dec 19, 2025

xishuaionline said:
I’m running a different approach, mainly because I don’t want any software-level automation footprint on the target devices.

This is genuinely impressive — and probably the cleanest approach I've seen for avoiding detection.

Moving automation entirely outside the device is smart. No packages, no accessibility abuse, no fingerprint. From TikTok's perspective, it's just a human tapping a screen.

The LLM + camera vision layer is interesting too. I assume it handles edge cases better than hard-coded rules (pop-ups, UI changes, etc.)?

Quick questions:

What's your latency like between screen observation → decision → physical tap?
Are you using off-the-shelf robotic arms or a custom rig?
At what scale are you running this currently?

I went the ADB route for simplicity, but your method is definitely more future-proof. Would love to see how this evolves.

Thanks for sharing!

xishuaionline · Dec 24, 2025

saphiron said:
What's your latency like between screen observation → decision → physical tap?

Are you using off-the-shelf robotic arms or a custom rig?

At what scale are you running this currently?

1. Almost simultaneously
2. Custom rig
3. 20 phones

nzeamos256 · Feb 7, 2026

xishuaionline said:
I’m running a different approach, mainly because I don’t want any software-level automation footprint on the target devices.

How can I get in touch with you? I am a newbie and I need your guidance.

kingkong2222 · Feb 9, 2026

xishuaionline said:
I’m running a different approach, mainly because I don’t want any software-level automation footprint on the target devices.

Is your setup good enough to generate accounts?

bilbo09 · Mar 3, 2026

xishuaionline said:
I’m running a different approach, mainly because I don’t want any software-level automation footprint on the target devices.

Wow... this farm looks good.

NomixGuy · Mar 3, 2026

We use iPhones with SIM cards and containerization. Reddit and IG.

xishuaionline said:
I’m running a different approach, mainly because I don’t want any software-level automation footprint on the target devices.

Hell, this is impressive!

When we started our own approach for iPhone farms we first tried robo-hands and the idea was something similar to what you’ve done. But we checked the numbers and it’s incredibly expensive! I never expected to see this implemented irl. I wonder how much you’ve spent on this setup!

We ended up with AccessibilityTouch + screen recognition. Nothing runs on the controlled phones either; we use external hardware. There's no need for different stacks for Android/iOS, and it's much cheaper in production.

Did you consider this approach before starting your setup? Any specific reasons why you decided to invest in robotic actuators?
