A Strategy That Builds Strategies: My Automated Crypto Strategy Generator (Dev Diary + How to Run It)
I didn’t start this project because I believed I could “outsmart the market” with one perfect idea.
I started it because I kept catching myself doing the same dangerous thing: falling in love with a backtest curve.
You know the pattern.
You tweak a threshold. You change one indicator window. You rerun the backtest. The curve improves.
And quietly, without noticing, you stop doing research and start doing cosmetic surgery on your results.
At some point I wanted a system that would argue with me.
A system that would force every strategy idea to survive outside the exact period it was tuned on.
So instead of writing one strategy, I built a program that can generate, mutate, evaluate, reject, and evolve strategies automatically.
This is my developer diary—how I built the loop, what it’s meant to prevent, and how you can run it yourself.
The core idea: stop “handcrafting” and start “evolving”
This project is an AI-assisted strategy evolution framework for crypto research.
At a high level it does this:
Start from a candidate strategy file (strategy_candidate.py)
Evaluate it with walk-forward testing (train → test, multiple folds)
Score it by rewarding ROI and punishing drawdown (and filtering out fragile, small-sample results)
Keep an elite pool of the best candidates
Mutate the strategy logic (optionally with AI)
Repeat and save the best output as best_strategy.py
The goal is not one lucky run.
The goal is a process that makes overfitting hard.
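To make that loop concrete, here is a minimal sketch of the outer cycle, simplified to keep only a single best candidate (the real run keeps an elite pool, described later). The names evaluate, score, and mutate are placeholders, not the framework's actual API:
def evolve(source, evaluate, score, mutate, iterations=30):
    """Illustrative outer loop (simplified to a single best candidate, no elite pool)."""
    best_source = source
    best_score = score(evaluate(source))           # walk-forward results -> one number
    for _ in range(iterations):
        candidate = mutate(best_source)            # parameter tweak, logic change, or AI rewrite
        candidate_score = score(evaluate(candidate))
        if candidate_score > best_score:           # keep only improvements
            best_source, best_score = candidate, candidate_score
    return best_source, best_score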
Why walk-forward testing became non-negotiable for me
I used to trust “one big backtest.” Now I don’t.
A single backtest can hide all kinds of illusions:
a strategy that only works in one market regime
parameters tuned to a specific volatility pattern
a lucky sequence of trades that never repeats
Walk-forward testing is my antidote:
Split history into train and test
Tune only on train (optional)
Lock the configuration
Measure performance on test
Roll forward and repeat
If a strategy can’t survive that loop, it’s not robust—it’s just decorated.
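As a rough illustration of the rolling split, train/test windows can be generated like this (the window sizes are arbitrary example values, not the project's defaults):
def walk_forward_folds(n_bars, train_size, test_size, step):
    """Yield (train, test) index windows that roll forward through the history."""
    start = 0
    while start + train_size + test_size <= n_bars:
        train = slice(start, start + train_size)
        test = slice(start + train_size, start + train_size + test_size)
        yield train, test
        start += step

# Example: 1,000 hourly bars, 600-bar train, 200-bar test, rolling 200 bars per fold
# -> fold 1 trains on bars 0-599 and tests on 600-799, fold 2 trains on 200-799 and tests on 800-999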
What makes this system realistic (and not just a toy)
Backtests aren’t useful unless they include the boring, painful stuff.
So the engine intentionally models real execution friction:
fees
slippage
leverage and margin accounting
limits on maximum open positions (portfolio risk control)
optional “one position per symbol” constraint
This matters because many strategies look great only because they assume magical fills.
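For a sense of how much friction matters, here is a toy fill-and-PnL model with assumed fee and slippage numbers; the engine's actual accounting (margin, leverage, position limits) is more detailed:
def fill_price(mid_price, side, slippage_bps=5.0):
    """Buys fill slightly above the quoted price, sells slightly below (5 bps slip is an assumed value)."""
    slip = mid_price * slippage_bps / 10_000
    return mid_price + slip if side == "buy" else mid_price - slip

def net_pnl(entry_price, exit_price, qty, side, fee_rate=0.0004):
    """Gross PnL minus taker fees on both legs (0.04% per leg is an assumed rate)."""
    if side == "buy":
        gross = (exit_price - entry_price) * qty
    else:
        gross = (entry_price - exit_price) * qty
    fees = (entry_price + exit_price) * qty * fee_rate
    return gross - fees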
I’m not trying to create perfection.
I’m trying to create research that doesn’t collapse the moment it meets reality.
How the evolver works (the loop I actually run)
The evolver iterates like this:
1) Verify the strategy
Before anything runs, the system checks:
the strategy file is valid Python
the allowed edit block is intact
required features are declared correctly
nothing unsafe is being imported or executed
This is the part that keeps the system from slowly destroying itself as it evolves.
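A minimal version of that pre-flight check, assuming a hypothetical import whitelist, could look like this:
import ast

EDIT_MARKERS = ("# === AI_EDIT_START ===", "# === AI_EDIT_END ===")
ALLOWED_IMPORTS = {"math", "numpy", "pandas"}      # assumed whitelist, not the project's real one

def verify_strategy(source):
    """Pre-flight check: markers present, valid Python, no unexpected imports."""
    if not all(marker in source for marker in EDIT_MARKERS):
        return False, "edit markers missing"
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return False, f"syntax error: {exc}"
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            module = node.module if isinstance(node, ast.ImportFrom) else node.names[0].name
            if (module or "").split(".")[0] not in ALLOWED_IMPORTS:
                return False, f"disallowed import: {module}"
    return True, "ok"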
2) Evaluate with walk-forward (and optionally regime split)
Then it runs walk-forward evaluation across multiple folds.
Optionally, it can split evaluation by market regime (up / down / range), forcing strategies to survive more than one type of market.
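One simple way to label a window as up, down, or range is a threshold on its net move. This is only an illustration of the idea, not the project's regime logic:
def classify_regime(closes, trend_threshold=0.05):
    """Label a price window 'up', 'down', or 'range' by its net move (the 5% threshold is arbitrary)."""
    change = closes[-1] / closes[0] - 1.0
    if change > trend_threshold:
        return "up"
    if change < -trend_threshold:
        return "down"
    return "range"

# Example: classify_regime([100, 103, 108]) -> "up" (an 8% move over the window)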
3) Score (ROI vs drawdown, plus “don’t trust tiny samples”)
A strategy isn’t “good” because it has high ROI.
It’s good if ROI is high without unacceptable drawdown, and if it traded enough times to be meaningful.
So scoring is designed to:
reward return
penalize max drawdown
discard strategies with too few trades (tiny samples are how you fool yourself)
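In sketch form, with assumed weights and a made-up minimum trade count:
def score(roi, max_drawdown, n_trades, dd_weight=2.0, min_trades=30):
    """Reward return, penalize drawdown, reject tiny samples outright (weights are assumed)."""
    if n_trades < min_trades:
        return float("-inf")               # too few trades to mean anything
    return roi - dd_weight * max_drawdown

# Example: 25% ROI with 10% max drawdown over 50 trades -> 0.25 - 2.0 * 0.10 = 0.05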
4) Maintain an elite pool
Instead of only keeping “the current best,” the system maintains a top-K elite pool.
This matters because evolution can get stuck in local optima.
Sometimes the next breakthrough comes from mutating the third-best candidate, not the current #1.
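A toy version of that pool, with hypothetical names:
import heapq
import random

class ElitePool:
    """Toy top-K pool: keep the K best candidates and sample parents from all of them."""
    def __init__(self, k=5):
        self.k = k
        self.pool = []                                     # list of (score, source) pairs

    def add(self, score, source):
        self.pool.append((score, source))
        self.pool = heapq.nlargest(self.k, self.pool, key=lambda e: e[0])

    def sample_parent(self):
        # Mutating a random elite (not always the #1) is what lets the search escape local optima.
        return random.choice(self.pool)[1]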
5) Mutate and repeat
The next iteration is generated by mutation:
parameter tweaks
logic variations
optional AI rewrite of only the editable block
The best candidate is saved as best_strategy.py, and the run artifacts go into a timestamped folder under runs/.
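A parameter tweak can be as simple as nudging one value inside its declared range. The parameter names below are invented for illustration:
import random

def mutate_params(params, space):
    """Toy mutation: move one parameter to a new value inside its declared range."""
    name = random.choice(list(space))                      # pick one parameter to change
    low, high = space[name]
    if isinstance(low, int) and isinstance(high, int):
        new_value = random.randint(low, high)
    else:
        new_value = random.uniform(low, high)
    return {**params, name: new_value}

# Hypothetical example (parameter names are invented):
# space  = {"rsi_window": (7, 28), "stop_loss": (0.01, 0.05)}
# params = {"rsi_window": 14, "stop_loss": 0.02}
# mutate_params(params, space) might return {"rsi_window": 21, "stop_loss": 0.02}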
AI rewriting: I constrain it on purpose
I don’t let the AI edit everything.
That’s not “safety theater”; it’s survival.
In strategy_candidate.py, there’s a clearly marked section:
# === AI_EDIT_START ===
# === AI_EDIT_END ===
If AI is enabled, it is allowed to rewrite only that section, under strict constraints:
Python code only (no commentary)
no redefining the entire Strategy class
only use supported feature names
keep indentation correct
if it introduces parameters, it must also define the parameter space
Then the program validates the output before it’s accepted.
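Conceptually, the splice looks like this: replace only the text between the markers, then refuse anything that is not valid Python. This is a simplified sketch, not the project's validator:
AI_EDIT_START = "# === AI_EDIT_START ==="
AI_EDIT_END = "# === AI_EDIT_END ==="

def splice_edit_block(source, new_block):
    """Simplified splice: swap only the text between the markers, then syntax-check the result."""
    start = source.index(AI_EDIT_START) + len(AI_EDIT_START)   # raises ValueError if a marker is missing
    end = source.index(AI_EDIT_END)
    if end < start:
        raise ValueError("edit markers are out of order")
    candidate = source[:start] + "\n" + new_block.rstrip() + "\n" + source[end:]
    compile(candidate, "strategy_candidate.py", "exec")         # reject anything that is not valid Python
    return candidate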
This is the difference between “AI as a tool” and “AI as chaos.”
About sharing: I removed my personal strategy logic
Yes—my original repository included parts of my own strategy logic (entry/exit rules, parameter ranges, etc.).
For sharing, I created a public template:
The framework remains intact
The strategy file is replaced with an empty shell (no trading logic)
No run artifacts are included
API keys are never included (environment variables only)
That way you can explore the system without inheriting my edge or my private research trail.
How to run it (friendly setup guide)
Step 1) Install Python and create a virtual environment
Inside the project folder:
python -m venv .venv
Activate it:
Windows (PowerShell):
.venv\Scripts\activate
macOS/Linux:
source .venv/bin/activate
Install dependencies:
pip install -r requirements.txt
Step 2) Set environment variables (only if you use AI mutation)
You only need this if you want the AI to mutate the strategy block.
Windows (PowerShell):
setx OPENAI_API_KEY "YOUR_KEY_HERE"
(setx only takes effect in new terminal sessions, so open a fresh PowerShell window afterwards.)
macOS/Linux:
export OPENAI_API_KEY="YOUR_KEY_HERE"
(If you don’t set this, you can still run without AI—see below.)
Step 3) Configure run_settings.py
Open run_settings.py and adjust:
data source: binance or csv
timeframe (e.g., 1h)
number of symbols by volume (Top N)
total history days loaded
walk-forward windows (train / test / step)
scoring weights and risk caps
This file is the control panel.
Everything else should “just run.”
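To give a feel for the knobs, here is a hypothetical shape for that file; the actual field names in run_settings.py will differ:
# Hypothetical shape of run_settings.py; the real field names will differ:
DATA_SOURCE = "binance"        # or "csv"
TIMEFRAME = "1h"
TOP_N_SYMBOLS = 30             # number of symbols, ranked by volume
HISTORY_DAYS = 365             # total history to load
TRAIN_DAYS = 90                # walk-forward train window
TEST_DAYS = 30                 # walk-forward test window
STEP_DAYS = 30                 # how far each fold rolls forward
DRAWDOWN_WEIGHT = 2.0          # scoring: penalty per unit of max drawdown
MIN_TRADES = 30                # scoring: discard candidates that trade less than this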
Step 4) Run the evolver
The simplest run:
python run_main.py
Each run creates a new folder like:
./runs/evolve_YYYYMMDD_HHMMSS/
Inside you’ll find logs, evaluation summaries, and the best discovered candidate:
best_strategy.py
Step 5) Run without AI (deterministic mode)
If you want pure evaluation and selection without AI rewriting:
python -m hf_lite.cli --no-ai --base-dir . --source binance --timeframe 1h --top 30 --iters 30
CSV mode (controlled experiments):
python -m hf_lite.cli --no-ai --base-dir . --source csv --csv-folder ./my_csv --csv-symbols BTCUSDT,ETHUSDT --timeframe 1h --iters 20
If you want the project files, leave a comment
I’m happy to share the public template.
If you’re interested:
leave a comment below
include your email address (or email me directly if you don’t want to post it publicly)
I’ll send the ZIP + a short setup checklist.
What I learned building this
The biggest change wasn’t technical. It was psychological.
I stopped asking:
“Can I make a backtest look great?”
And started asking:
“Can I build a process that makes it hard to lie to myself?”
That’s the entire point of this program.
It doesn’t guarantee profit.
It doesn’t eliminate risk.
But it’s my attempt to build research that can survive reality.
(Not financial advice. Research and engineering only.)