Step-by-Step PyAutoGUI Tutorial: Automate Mouse Recording and Replay for Repetitive GUI Tasks

Key Takeaways
- You can automate repetitive graphical user interface (GUI) tasks by building a simple mouse recorder and replayer with Python.
- This project uses two key libraries:
pynputto listen for and record mouse/keyboard events, andPyAutoGUIto execute and replay those actions.- The core concept involves recording actions (like clicks and coordinates) into a file and then reading that file to perfectly replicate the actions on command.
I once spent an entire afternoon manually resizing and watermarking 300 product images for a client. Three hours of my life vanished into a haze of clicks, drags, and the soul-crushing "Save As..." dialog. By image #250, my brain was numb, and I realized I was just a biological robot performing a task a few lines of code could do in minutes.
That's the moment I dove headfirst into GUI automation. It's not just about saving time; it's about reclaiming your focus for work that actually matters. Today, I'm going to show you how to build your own mouse recorder and replayer using my favorite simple automation tool: PyAutoGUI.
Why Automate GUI Tasks with PyAutoGUI?
What is PyAutoGUI?
Think of PyAutoGUI as a digital puppeteer for your computer. It’s a Python library that lets you write scripts to control your mouse and keyboard. You can tell it to move, click, type text, or press keyboard shortcuts—basically, anything you can do with your hands, PyAutoGUI can do with code.
This kind of automation is a fundamental building block of what the corporate world calls Robotic Process Automation (RPA). While our script will be simple, the rabbit hole goes deep. If you're curious about where this technology is heading when combined with AI, I explored some fascinating possibilities in my post on Smart RPA 2.0: How Python OCR and Deep Learning Will Automate Complex Document Workflows.
The Problem: Repetitive Clicks and Drags
Every day, we perform countless digital chores: filling out the same web forms, exporting reports from legacy software, organizing files into folders. These tasks are mindless, prone to human error (I definitely mis-clicked a few of those 300 images), and a massive drain on productivity.
The Solution: A Record-and-Replay Script
What if you could perform a task once, record every mouse click, and then have a script replay those exact actions perfectly, every single time? That's exactly what we're building today. We'll create two small Python scripts:
recorder.py: Listens to your mouse actions and saves them to a file.replayer.py: Reads that file and perfectly mimics your actions.
Prerequisites: Setting Up Your Automation Environment
Before we write a single line of code, let's get our tools in order.
Ensuring Python is Installed
You'll need Python 3 installed on your system. If you don't have it, head over to the official Python website and grab the latest version.
Installing PyAutoGUI: The Core Engine
This is our main library for executing the automation. Open your terminal or command prompt and run:
pip install pyautogui
Installing Pynput: The Event Listener
PyAutoGUI is great at doing things, but it can't listen for mouse or keyboard events on its own. For that, we need another library called pynput, which will be the "ears" of our recording script.
pip install pynput
CRITICAL: Understanding and Enabling the PyAutoGUI Failsafe
I can't stress this enough. An automation script gone wrong can be a nightmare, clicking uncontrollably and making it hard to regain control.
PyAutoGUI has a built-in safety feature. If you quickly slam your mouse cursor into the top-left corner of the screen, it will trigger a failsafe and stop the script.
This feature is enabled by default and you should never disable it. It's your emergency brake.
Step 1: Building the Mouse Action Recorder
First, we need to capture your actions. This script listens for mouse clicks and key presses, recording them to a file.
The Logic: How to Listen for Mouse Events
We'll use the pynput library to create a "listener" that runs in the background. When you click, the listener will trigger a function that captures the (x, y) coordinates and the type of click. We'll also set it up so that pressing the Esc key will stop the recording.
Full Python Code for 'recorder.py'
# recorder.py
import json
from pynput import mouse, keyboard
# This list will store all the recorded actions
recorded_actions = []
def on_click(x, y, button, pressed):
"""Callback function to handle mouse click events."""
if pressed:
action = {
'type': 'click',
'x': x,
'y': y,
'button': str(button) # e.g., 'Button.left', 'Button.right'
}
recorded_actions.append(action)
print(f"Recorded click at ({x}, {y}) with {button}")
def on_press(key):
"""Callback function to handle key press events."""
# Stop listener if the Escape key is pressed
if key == keyboard.Key.esc:
print("Stopping recorder...")
return False # This stops the listener
# --- Main execution ---
print("Starting mouse recorder. Press 'Esc' to stop.")
# Set up listeners
mouse_listener = mouse.Listener(on_click=on_click)
key_listener = keyboard.Listener(on_press=on_press)
# Start listening in the background
mouse_listener.start()
key_listener.start()
# Wait for the listeners to stop (i.e., when 'Esc' is pressed)
mouse_listener.join()
key_listener.join()
# Save the recorded actions to a file
if recorded_actions:
with open('mouse_actions.json', 'w') as f:
json.dump(recorded_actions, f, indent=4)
print(f"Successfully saved {len(recorded_actions)} actions to mouse_actions.json")
else:
print("No actions were recorded.")
Code Breakdown: Capturing Clicks and Positions
import json, pynput: We import the necessary libraries.jsonis for saving our data in a clean format.recorded_actions = []: An empty list to store each action (which will be a dictionary).on_click(x, y, button, pressed): This is our core function.pynputcalls it every time a mouse button is pressed or released.on_press(key): This function listens for keyboard presses. If the key isEsc, it returnsFalse, which tells the listener to stop.mouse.Listener(...)&keyboard.Listener(...): These lines create the listener objects, telling them which functions to call when an event occurs.listener.start()&listener.join(): This starts the listeners and tells the script to wait until they are stopped before moving on.with open(...): After the recording stops, this block saves ourrecorded_actionslist into a file namedmouse_actions.json.
Running the Recorder and Saving Actions to a File
- Save the code above as
recorder.py. - Run it from your terminal:
python recorder.py. - You'll see the "Starting..." message. Now, perform the mouse clicks you want to automate.
- When you're done, press the
Esckey. - A new file,
mouse_actions.json, will appear in the same directory!
Step 2: Building the Mouse Action Replayer
Now for the fun part. Let's make the computer do our bidding.
The Logic: Reading and Executing Saved Actions
This script is the reverse of the recorder. It will open mouse_actions.json, read each action one by one, and use PyAutoGUI to execute it. We'll add a small delay between actions to ensure the target application has time to respond.
Full Python Code for 'replayer.py'
# replayer.py
import pyautogui
import json
import time
# CRITICAL: PyAutoGUI's failsafe is enabled by default.
# Move the mouse to the top-left corner of the screen to stop the script.
pyautogui.FAILSAFE = True
def replay_actions(file_path='mouse_actions.json'):
"""Reads actions from a file and replays them using pyautogui."""
try:
with open(file_path, 'r') as f:
actions = json.load(f)
except FileNotFoundError:
print(f"Error: The file '{file_path}' was not found.")
return
print(f"Starting replay of {len(actions)} actions in 3 seconds...")
time.sleep(3) # Give yourself time to switch to the target window
for action in actions:
if action['type'] == 'click':
# PyAutoGUI needs 'left', 'right', 'middle' instead of 'Button.left'
button_name = action['button'].split('.')[-1]
pyautogui.moveTo(action['x'], action['y'], duration=0.25)
pyautogui.click(x=action['x'], y=action['y'], button=button_name)
print(f"Clicked {button_name} at ({action['x']}, {action['y']})")
time.sleep(0.5) # A small delay after each click
print("Replay finished.")
# --- Main execution ---
if __name__ == "__main__":
replay_actions()
Code Breakdown: Reading the File and Triggering PyAutoGUI
import pyautogui, json, time: We import our automation engine, the JSON library, and thetimelibrary for delays.replay_actions(...): The main function that handles the logic.time.sleep(3): A 3-second pause so you have time to click on the window you want to automate before the script takes over.for action in actions:: The script loops through each recorded action.button_name = ...: A small data cleaning step.pynputsaves"Button.left", butpyautoguineeds just"left".pyautogui.moveTo(...): Moves the mouse to the correct location, making the automation look more human.pyautogui.click(...): This command performs the actual click at the saved coordinates.
Running the Replayer and Watching the Magic Happen
- Save the code as
replayer.py. - Get your target application ready (the one you recorded the actions on).
- Run the script from your terminal:
python replayer.py. - Quickly click on the target application window.
- After 3 seconds, sit back and watch as your mouse magically retraces its steps!
Enhancing Your Automation Script
What we've built is a fantastic starting point, but it's brittle. Here are a few ways to make it more robust.
Adding Delays for Reliability
Sometimes, an application needs a moment to load or a window needs to pop up. For real-world tasks, you might need longer, more strategic delays (time.sleep(2)) after actions that trigger slow processes.
Considering Different Screen Resolutions
A huge weakness of this script is that it relies on exact (x, y) coordinates. If you run it on a different monitor or if a window is in a different position, the script will fail.
The more advanced solution is to use PyAutoGUI's image recognition feature. You can take a screenshot of a button and have PyAutoGUI find that button on the screen, no matter where it is. This is the next level of robust automation.
Expanding to Include Keyboard Events
Our script only records the mouse. You could easily expand it by adding keyboard recording to the recorder.py and keyboard playback to the replayer.py using functions like pyautogui.typewrite().
Conclusion: You're Now a GUI Automator
Recap of What We Built
In just a few dozen lines of Python, you built a powerful automation duo: a script that records your mouse movements and another that replays them flawlessly. You learned how to use pynput to listen for events and pyautogui to execute them.
Next Steps and Further PyAutoGUI Features to Explore
This is just the beginning. I highly encourage you to explore the official PyAutoGUI documentation to automate almost anything with features like pyautogui.typewrite(), pyautogui.hotkey(), and pyautogui.screenshot().
Go find a boring, repetitive task you do every day and try to automate it. You'll be amazed at how much time and mental energy you can get back.
Recommended Watch
💬 Thoughts? Share in the comments below!
Comments
Post a Comment