Human-Like Mouse Movement Simulation for CAPTCHA Evasion
At Icebergdata.co, our goal was to develop a sophisticated scraping bot capable of mimicking human mouse movements to evade CAPTCHA detection. This involved creating a system that generates mouse movements indistinguishable from those of real users.
- image taken from https://lnkd.in/eZYPDkWN
Methodology
1. Gaussian-Smoothed Random Movements:
We started by generating basic random movements. These movements were initially based on a simple random walk model, producing erratic and jagged paths. To smooth these movements, we applied a Gaussian filter. This filter convolves the random path with a Gaussian kernel, resulting in fluid and natural-looking movements.
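To make the effect of the smoothing step concrete, here is a minimal sketch that compares a raw random walk with its Gaussian-filtered version. The `roughness` helper, the path length, and the parameter values are assumptions chosen for illustration, not figures from the project.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

rng = np.random.default_rng(0)

# Raw random walk: cumulative sum of independent Gaussian steps (jagged path)
raw_path = np.cumsum(rng.normal(0.0, 10.0, size=200))

# Convolve the path with a Gaussian kernel; larger sigma gives a smoother path
smooth_path = gaussian_filter1d(raw_path, sigma=2)

def roughness(path):
    """Hypothetical jaggedness measure: spread of the second differences."""
    return float(np.std(np.diff(path, n=2)))

raw_roughness = roughness(raw_path)
smooth_roughness = roughness(smooth_path)
print(f"raw: {raw_roughness:.2f}, smoothed: {smooth_roughness:.2f}")
```

The smoothed path keeps the overall drift of the walk while removing most of the point-to-point jitter, which is why the second-difference spread drops.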
2. Training Distribution on Human Movements:
We collected a dataset of 10,000 human mouse movements, focusing on tasks that required interaction with CAPTCHA elements. A thorough statistical analysis of this data was conducted to understand the distribution of movement deltas (the change in position between consecutive points). We modeled these deltas using a Gaussian Mixture Model (GMM) to accurately represent the natural variation in human movements.
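As a rough illustration of the modeling step, the sketch below fits a two-component GMM to movement deltas using scikit-learn's `GaussianMixture`. The deltas here are synthetic stand-ins (the recorded dataset is not reproduced), and the number of components is an assumption, not a value from the project.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic stand-in for recorded (dx, dy) deltas: many small corrections
# plus fewer large sweeps. Real data would come from logged mouse events.
deltas = np.concatenate([
    rng.normal(0.0, 1.0, size=(3000, 2)),
    rng.normal(6.0, 3.0, size=(1000, 2)),
])

# Fit a mixture of Gaussians; n_components=2 is an illustrative choice
gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
gmm.fit(deltas)

# Draw new deltas from the fitted model
sampled_deltas, component_ids = gmm.sample(500)
print(gmm.weights_)
print(gmm.means_)
```

In practice the number of components could be chosen by comparing `gmm.bic(deltas)` across candidate values rather than fixing it up front.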
3. Morphing the Distribution:
Using the initial Gaussian-smoothed random movements, we generated a series of movement deltas. These generated deltas were then morphed to match the trained distribution of human movements. This transformation ensured that the statistical properties of our generated movements (mean, variance, etc.) aligned with those observed in human data.
Techniques such as probability integral transform and inverse transform sampling were employed to achieve this morphing.
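One way to read the two techniques named above is as quantile mapping: the probability integral transform turns each generated delta into a value in (0, 1) through its empirical CDF, and inverse transform sampling pushes that value through the target distribution's quantile function. The sketch below assumes one-dimensional deltas and uses synthetic target samples; `morph_to_target` is a hypothetical helper name.

```python
import numpy as np
from scipy.stats import rankdata

def morph_to_target(generated, target_samples):
    """Map `generated` onto the empirical distribution of `target_samples`."""
    # Probability integral transform: empirical CDF value of each sample,
    # offset by 0.5 so every value lies strictly inside (0, 1)
    u = (rankdata(generated) - 0.5) / len(generated)
    # Inverse transform: evaluate the target's empirical quantile function at u
    return np.quantile(target_samples, u)

rng = np.random.default_rng(0)
generated = rng.normal(0.0, 1.0, size=1000)

# Synthetic stand-in for human deltas (bimodal, unlike the generated ones)
target = np.concatenate([
    rng.normal(0.0, 1.0, size=3000),
    rng.normal(6.0, 3.0, size=1000),
])

morphed = morph_to_target(generated, target)
print(np.mean(target), np.mean(morphed))
```

Because the mapping is monotonic, the ordering of the generated deltas is preserved while their marginal distribution takes on the shape of the target, including features such as multiple modes that a mean and variance rescaling cannot reproduce.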
4. Application to CAPTCHA Responses:
The final step was simulating the mouse movements towards CAPTCHA elements. This involved:
- Starting from a random screen position and moving towards the CAPTCHA target in a manner that mimicked human behavior.
- Introducing time-based variations, including pauses, speed changes, and small, jittery adjustments near the target to simulate the fine corrections humans make when homing in on it.
Simulated interactions included not only mouse movements but also clicks and drag-and-drop actions, performed with realistic timing to avoid detection.
Implementation
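The code below is a simplified version: the morphing step only rescales to a target mean and standard deviation rather than using a fitted GMM.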
import numpy as np
from scipy.ndimage import gaussian_filter1d
import matplotlib.pyplot as plt
import pyautogui
import time
# Generate random movements using a random walk model
def random_walk(length, stddev):
    return np.cumsum(np.random.normal(0, stddev, length))
# Apply Gaussian smoothing to the random movements
def gaussian_smooth(data, sigma):
    return gaussian_filter1d(data, sigma)
# Morph the movement distribution to match the human-like distribution
def morph_distribution(data, target_mean, target_std):
    return (data - np.mean(data)) / np.std(data) * target_std + target_mean
# Step 1: Generate random movements
length = 100
stddev = 10
random_x = random_walk(length, stddev)
random_y = random_walk(length, stddev)
# Step 2: Smooth the movements
smooth_x = gaussian_smooth(random_x, sigma=2)
smooth_y = gaussian_smooth(random_y, sigma=2)
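# Step 3: Morph the per-step deltas (not the absolute positions) to the human-like distribution
human_mean_x, human_std_x = 5, 2 # Example values from trained data
human_mean_y, human_std_y = 5, 2 # Example values from trained data
delta_x = morph_distribution(np.diff(smooth_x), human_mean_x, human_std_x)
delta_y = morph_distribution(np.diff(smooth_y), human_mean_y, human_std_y)
# Rebuild positions by accumulating the morphed deltas from a start point
start_x, start_y = 100, 100 # Assumed example start position in pixels
morphed_x = start_x + np.concatenate(([0], np.cumsum(delta_x)))
morphed_y = start_y + np.concatenate(([0], np.cumsum(delta_y)))
# Combine into final mouse path
mouse_path = list(zip(morphed_x, morphed_y))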
# Function to move the mouse to (x, y)
def move_mouse_to(x, y):
    pyautogui.moveTo(x, y)
# Function to perform a click
def click_mouse():
    pyautogui.click()
# Simulate interaction
for x, y in mouse_path:
    move_mouse_to(x, y) # Move the mouse to (x, y)
    time.sleep(np.random.uniform(0.01, 0.05)) # Random sleep to simulate human delay
click_mouse() # Perform a click at the end of the path
# Draw the mouse movement in a PNG file
plt.figure(figsize=(10, 6))
plt.plot(morphed_x, morphed_y, marker='o', linestyle='-', color='b', markersize=5)
plt.title('Simulated Mouse Movement Path')
plt.xlabel('X Coordinate')
plt.ylabel('Y Coordinate')
plt.grid(True)
plt.savefig('mouse_movement_path.png')
plt.show()
The project successfully created a bot that simulates human-like mouse movements, significantly improving its ability to bypass CAPTCHA challenges.
By employing Gaussian smoothing and statistical morphing techniques, we achieved a high degree of realism in our simulated interactions, making detection by CAPTCHA systems more difficult.