It could be helpful to specify which regex flavor/style Curlex supports in the documentation (Curlex - Beeminder Help). It’s likely Perl-style, but making this explicit might help users, especially with GPT requests.
Speaking of GPT, creating your own regexes might be relatively straightforward, @shanaqui.
- Go to the site you care about. I used your first example (Eirian Evanna | FINAL FANTASY XIV, The Lodestone), but clicked on “Mounts,” so this is the URL I used: Eirian Evanna | FINAL FANTASY XIV, The Lodestone.
- Right click and Inspect the number you care about. It should then look something like this:
- Now, right click on an outer tag that includes your target number and select “Copy outer HTML.” You have to use your best judgment to decide which tag to use. <span> is probably not enough context, but class="minion__sort__total" looks like enough context to unambiguously identify the position.
- It might then look like this when you paste it:
<p class="minion__sort__total">Total: <span>201</span></p>
- Now, you can ask GPT to create the regex for you:
Please create a Perl-style regex that matches the following HTML code. Please include a match group that matches the number that is part of the HTML. The number might change and there should be only one match group for that number.
<p class="minion__sort__total">Total: <span>201</span></p>
Claude 3.5 Sonnet then gives me this:
<p class="minion__sort__total">Total: <span>(\d+)</span></p>
And GPT 4o this:
/<p class="minion__sort__total">Total: <span>(\d+)<\/span><\/p>/
Here, you will have to remove the enclosing slashes (the / at the very start and the very end) before pasting it into Beeminder.
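If you want to check a generated regex locally before pasting it into Beeminder, here is a minimal sketch in Python. Big assumption: Python's `re` module is not the engine Curlex uses, so this only tells you the pattern is sane for simple cases like this one. The helper names `strip_delimiters` and `extract_number` are made up for this example.

```python
import re
from typing import Optional

# Sample HTML copied from the page (the number will change over time).
html = '<p class="minion__sort__total">Total: <span>201</span></p>'

# The two model outputs from above, pasted verbatim.
claude_pattern = r'<p class="minion__sort__total">Total: <span>(\d+)</span></p>'
gpt_pattern = r'/<p class="minion__sort__total">Total: <span>(\d+)<\/span><\/p>/'


def strip_delimiters(pattern: str) -> str:
    """Drop one leading and one trailing '/' if both are present (hypothetical helper)."""
    if len(pattern) >= 2 and pattern.startswith("/") and pattern.endswith("/"):
        return pattern[1:-1]
    return pattern


def extract_number(pattern: str, text: str) -> Optional[int]:
    """Return the first capture group as an int, or None if nothing matches."""
    match = re.search(pattern, text)
    return int(match.group(1)) if match else None


print(extract_number(claude_pattern, html))
print(extract_number(strip_delimiters(gpt_pattern), html))
print(extract_number(gpt_pattern, html))  # delimiters left in place
```

With the enclosing slashes left in, the pattern looks for a literal `/` in front of `<p`, which is why that version finds nothing in this sample.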
As a sanity check, I will try your second example: Eirian Evanna | FINAL FANTASY XIV, The Lodestone.
Right click, inspect:
Select the marked HTML tag for enough context, and copy the outer HTML:
<div class="select-pulldown en-us">
<form action="?">
<select name="order" onchange="this.form.submit()" class="select-pulldown__open">
<option value="1">Sort by most recent</option>
<option value="2">Sort by oldest</option>
<div class="parts__total">2117 Total</div>
Paste into Claude with the prompt; I guess it didn’t agree that we need that much context:
<div class="parts__total">(\d+) Total</div>
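To see whether the shorter pattern is still unambiguous, you can count how often it matches in the HTML you copied. This is only a rough sketch: it again assumes Python's `re` behaves like Curlex for a pattern this simple, and it only checks the copied fragment above (as pasted, closing tags and all), not the full page.

```python
import re

# The fragment copied from the inspector, exactly as pasted above.
copied_html = """<div class="select-pulldown en-us">
<form action="?">
<select name="order" onchange="this.form.submit()" class="select-pulldown__open">
<option value="1">Sort by most recent</option>
<option value="2">Sort by oldest</option>
<div class="parts__total">2117 Total</div>"""

pattern = r'<div class="parts__total">(\d+) Total</div>'

# With a single capture group, findall returns the captured strings.
matches = re.findall(pattern, copied_html)

print(len(matches))  # exactly one match means the pattern is unambiguous here
print(matches)
```

If the count were higher than one, that would be the sign to ask for more surrounding context in the regex.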
Test in Beeminder:
For achievement points: <p class="achievement__point">(\d+)</p>
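The prompt asked for exactly one match group, so it may be worth confirming that before trusting a generated pattern. A small sketch, assuming Python's `re` counts groups the same way for these patterns; the dictionary labels are just names I picked for this example.

```python
import re

# The three patterns from this post; the keys are hypothetical labels.
patterns = {
    "minion total": r'<p class="minion__sort__total">Total: <span>(\d+)</span></p>',
    "parts total": r'<div class="parts__total">(\d+) Total</div>',
    "achievement points": r'<p class="achievement__point">(\d+)</p>',
}


def group_counts(named_patterns):
    """Map each label to the number of capture groups in its pattern."""
    return {name: re.compile(p).groups for name, p in named_patterns.items()}


counts = group_counts(patterns)
for name, count in counts.items():
    print(name, count)
```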
Also: obligatory disclaimer. I don’t endorse “parsing” HTML with regexes and this shouldn’t be used for anything serious. (See the first answer here: html - RegEx match open tags except XHTML self-contained tags - Stack Overflow)