Step-by-Step PyAutoGUI Tutorial: Automate Mouse Recording and Replay for Repetitive GUI Tasks



Key Takeaways

  • You can automate repetitive graphical user interface (GUI) tasks by building a simple mouse recorder and replayer with Python.
  • This project uses two key libraries: pynput to listen for and record mouse/keyboard events, and PyAutoGUI to execute and replay those actions.
  • The core concept involves recording actions (like clicks and coordinates) into a file and then reading that file to perfectly replicate the actions on command.

I once spent an entire afternoon manually resizing and watermarking 300 product images for a client. Three hours of my life vanished into a haze of clicks, drags, and the soul-crushing "Save As..." dialog. By image #250, my brain was numb, and I realized I was just a biological robot performing a task a few lines of code could do in minutes.

That's the moment I dove headfirst into GUI automation. It's not just about saving time; it's about reclaiming your focus for work that actually matters. Today, I'm going to show you how to build your own mouse recorder and replayer using my favorite simple automation tool: PyAutoGUI.

Why Automate GUI Tasks with PyAutoGUI?

What is PyAutoGUI?

Think of PyAutoGUI as a digital puppeteer for your computer. It’s a Python library that lets you write scripts to control your mouse and keyboard. You can tell it to move, click, type text, or press keyboard shortcuts—basically, anything you can do with your hands, PyAutoGUI can do with code.

This kind of automation is a fundamental building block of what the corporate world calls Robotic Process Automation (RPA). While our script will be simple, the rabbit hole goes deep. If you're curious about where this technology is heading when combined with AI, I explored some fascinating possibilities in my post on Smart RPA 2.0: How Python OCR and Deep Learning Will Automate Complex Document Workflows.

The Problem: Repetitive Clicks and Drags

Every day, we perform countless digital chores: filling out the same web forms, exporting reports from legacy software, organizing files into folders. These tasks are mindless, prone to human error (I definitely mis-clicked a few of those 300 images), and a massive drain on productivity.

The Solution: A Record-and-Replay Script

What if you could perform a task once, record every mouse click, and then have a script replay those exact actions perfectly, every single time? That's exactly what we're building today. We'll create two small Python scripts:

  1. recorder.py: Listens to your mouse actions and saves them to a file.
  2. replayer.py: Reads that file and perfectly mimics your actions.

Prerequisites: Setting Up Your Automation Environment

Before we write a single line of code, let's get our tools in order.

Ensuring Python is Installed

You'll need Python 3 installed on your system. If you don't have it, head over to the official Python website and grab the latest version.

Installing PyAutoGUI: The Core Engine

This is our main library for executing the automation. Open your terminal or command prompt and run:

pip install pyautogui

Installing Pynput: The Event Listener

PyAutoGUI is great at doing things, but it can't listen for mouse or keyboard events on its own. For that, we need another library called pynput, which will be the "ears" of our recording script.

pip install pynput

CRITICAL: Understanding and Enabling the PyAutoGUI Failsafe

I can't stress this enough. An automation script gone wrong can be a nightmare, clicking uncontrollably and making it hard to regain control.

PyAutoGUI has a built-in safety feature. If you quickly slam your mouse cursor into the top-left corner of the screen, it will trigger a failsafe and stop the script.

This feature is enabled by default and you should never disable it. It's your emergency brake.

Step 1: Building the Mouse Action Recorder

First, we need to capture your actions. This script listens for mouse clicks and key presses, recording them to a file.

The Logic: How to Listen for Mouse Events

We'll use the pynput library to create a "listener" that runs in the background. When you click, the listener will trigger a function that captures the (x, y) coordinates and the type of click. We'll also set it up so that pressing the Esc key will stop the recording.

Full Python Code for 'recorder.py'

# recorder.py
import json
from pynput import mouse, keyboard

# This list will store all the recorded actions
recorded_actions = []

def on_click(x, y, button, pressed):
    """Callback function to handle mouse click events."""
    if pressed:
        action = {
            'type': 'click',
            'x': x,
            'y': y,
            'button': str(button) # e.g., 'Button.left', 'Button.right'
        }
        recorded_actions.append(action)
        print(f"Recorded click at ({x}, {y}) with {button}")

def on_press(key):
    """Callback function to handle key press events."""
    # Stop listener if the Escape key is pressed
    if key == keyboard.Key.esc:
        print("Stopping recorder...")
        return False # This stops the listener

# --- Main execution ---
print("Starting mouse recorder. Press 'Esc' to stop.")

# Set up listeners
mouse_listener = mouse.Listener(on_click=on_click)
key_listener = keyboard.Listener(on_press=on_press)

# Start listening in the background
mouse_listener.start()
key_listener.start()

# Wait for the listeners to stop (i.e., when 'Esc' is pressed)
mouse_listener.join()
key_listener.join()

# Save the recorded actions to a file
if recorded_actions:
    with open('mouse_actions.json', 'w') as f:
        json.dump(recorded_actions, f, indent=4)
    print(f"Successfully saved {len(recorded_actions)} actions to mouse_actions.json")
else:
    print("No actions were recorded.")

Code Breakdown: Capturing Clicks and Positions

  • import json, pynput: We import the necessary libraries. json is for saving our data in a clean format.
  • recorded_actions = []: An empty list to store each action (which will be a dictionary).
  • on_click(x, y, button, pressed): This is our core function. pynput calls it every time a mouse button is pressed or released.
  • on_press(key): This function listens for keyboard presses. If the key is Esc, it returns False, which tells the listener to stop.
  • mouse.Listener(...) & keyboard.Listener(...): These lines create the listener objects, telling them which functions to call when an event occurs.
  • listener.start() & listener.join(): This starts the listeners and tells the script to wait until they are stopped before moving on.
  • with open(...): After the recording stops, this block saves our recorded_actions list into a file named mouse_actions.json.

Running the Recorder and Saving Actions to a File

  1. Save the code above as recorder.py.
  2. Run it from your terminal: python recorder.py.
  3. You'll see the "Starting..." message. Now, perform the mouse clicks you want to automate.
  4. When you're done, press the Esc key.
  5. A new file, mouse_actions.json, will appear in the same directory!

Step 2: Building the Mouse Action Replayer

Now for the fun part. Let's make the computer do our bidding.

The Logic: Reading and Executing Saved Actions

This script is the reverse of the recorder. It will open mouse_actions.json, read each action one by one, and use PyAutoGUI to execute it. We'll add a small delay between actions to ensure the target application has time to respond.

Full Python Code for 'replayer.py'

# replayer.py
import pyautogui
import json
import time

# CRITICAL: PyAutoGUI's failsafe is enabled by default.
# Move the mouse to the top-left corner of the screen to stop the script.
pyautogui.FAILSAFE = True

def replay_actions(file_path='mouse_actions.json'):
    """Reads actions from a file and replays them using pyautogui."""
    try:
        with open(file_path, 'r') as f:
            actions = json.load(f)
    except FileNotFoundError:
        print(f"Error: The file '{file_path}' was not found.")
        return

    print(f"Starting replay of {len(actions)} actions in 3 seconds...")
    time.sleep(3) # Give yourself time to switch to the target window

    for action in actions:
        if action['type'] == 'click':
            # PyAutoGUI needs 'left', 'right', 'middle' instead of 'Button.left'
            button_name = action['button'].split('.')[-1]

            pyautogui.moveTo(action['x'], action['y'], duration=0.25)
            pyautogui.click(x=action['x'], y=action['y'], button=button_name)

            print(f"Clicked {button_name} at ({action['x']}, {action['y']})")
            time.sleep(0.5) # A small delay after each click

    print("Replay finished.")

# --- Main execution ---
if __name__ == "__main__":
    replay_actions()

Code Breakdown: Reading the File and Triggering PyAutoGUI

  • import pyautogui, json, time: We import our automation engine, the JSON library, and the time library for delays.
  • replay_actions(...): The main function that handles the logic.
  • time.sleep(3): A 3-second pause so you have time to click on the window you want to automate before the script takes over.
  • for action in actions:: The script loops through each recorded action.
  • button_name = ...: A small data cleaning step. pynput saves "Button.left", but pyautogui needs just "left".
  • pyautogui.moveTo(...): Moves the mouse to the correct location, making the automation look more human.
  • pyautogui.click(...): This command performs the actual click at the saved coordinates.

Running the Replayer and Watching the Magic Happen

  1. Save the code as replayer.py.
  2. Get your target application ready (the one you recorded the actions on).
  3. Run the script from your terminal: python replayer.py.
  4. Quickly click on the target application window.
  5. After 3 seconds, sit back and watch as your mouse magically retraces its steps!

Enhancing Your Automation Script

What we've built is a fantastic starting point, but it's brittle. Here are a few ways to make it more robust.

Adding Delays for Reliability

Sometimes, an application needs a moment to load or a window needs to pop up. For real-world tasks, you might need longer, more strategic delays (time.sleep(2)) after actions that trigger slow processes.

Considering Different Screen Resolutions

A huge weakness of this script is that it relies on exact (x, y) coordinates. If you run it on a different monitor or if a window is in a different position, the script will fail.

The more advanced solution is to use PyAutoGUI's image recognition feature. You can take a screenshot of a button and have PyAutoGUI find that button on the screen, no matter where it is. This is the next level of robust automation.

Expanding to Include Keyboard Events

Our script only records the mouse. You could easily expand it by adding keyboard recording to the recorder.py and keyboard playback to the replayer.py using functions like pyautogui.typewrite().

Conclusion: You're Now a GUI Automator

Recap of What We Built

In just a few dozen lines of Python, you built a powerful automation duo: a script that records your mouse movements and another that replays them flawlessly. You learned how to use pynput to listen for events and pyautogui to execute them.

Next Steps and Further PyAutoGUI Features to Explore

This is just the beginning. I highly encourage you to explore the official PyAutoGUI documentation to automate almost anything with features like pyautogui.typewrite(), pyautogui.hotkey(), and pyautogui.screenshot().

Go find a boring, repetitive task you do every day and try to automate it. You'll be amazed at how much time and mental energy you can get back.



Recommended Watch

📺 Python Automation - Pyautogui Convenient Mouse Click Program
📺 PyAutoGUI - Locate anything on your screen | Simple Pyautogui project

💬 Thoughts? Share in the comments below!

Comments