It could be helpful to specify which regex flavor/style Curlex supports in the documentation (Curlex - Beeminder Help). It’s likely Perl-style, but making this explicit might help users, especially when asking GPT to write a regex.
Speaking of GPT, creating your own regexes might be relatively straightforward, @shanaqui.
- Go to the site you care about. I used your first example (Eirian Evanna | FINAL FANTASY XIV, The Lodestone), but clicked on “Mounts,” so this is the URL I used: Eirian Evanna | FINAL FANTASY XIV, The Lodestone.
- Right click and Inspect the number you care about. It should then look something like this:
- Now, right click on an outer tag that includes your target number and select copy outer HTML. You have to apply your best judgement to know which tag to use. <span> is probably not enough context, but class="minion__sort__total" kind of looks like we have enough context to unambiguously identify the position.
- It might then look like this when you paste it:
<p class="minion__sort__total">Total: <span>201</span></p>
- Now, you can ask GPT to create the regex for you:
Please create a Perl-style regex that matches the following HTML code. Please include a match group that matches the number that is part of the HTML. The number might change and there should be only one match group for that number.
<p class="minion__sort__total">Total: <span>201</span></p>
Claude 3.5 Sonnet then gives me this:
<p class="minion__sort__total">Total: <span>(\d+)</span></p>
And GPT 4o this:
/<p class="minion__sort__total">Total: <span>(\d+)<\/span><\/p>/
Here, you will have to remove the leading and trailing slashes (the regex delimiters) before pasting it into Beeminder.
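If you want to check both answers locally before pasting anything into Beeminder, something like the following rough Python sketch might help. It assumes Python's re module behaves like Curlex's regex engine for these simple patterns, which I haven't verified, and strip_delimiters is just a made-up helper name for removing the surrounding slashes.

```python
import re

# Sample HTML copied from the page (see above).
html = '<p class="minion__sort__total">Total: <span>201</span></p>'

# Pattern as returned by Claude (no delimiters).
claude_pattern = r'<p class="minion__sort__total">Total: <span>(\d+)</span></p>'

# Pattern as returned by GPT 4o, still wrapped in /.../ delimiters.
gpt_pattern_raw = r'/<p class="minion__sort__total">Total: <span>(\d+)<\/span><\/p>/'


def strip_delimiters(pattern: str) -> str:
    """Hypothetical helper: drop one leading and one trailing slash if both are present."""
    if len(pattern) >= 2 and pattern.startswith("/") and pattern.endswith("/"):
        return pattern[1:-1]
    return pattern


gpt_pattern = strip_delimiters(gpt_pattern_raw)

# Both patterns should capture the same number in their single match group.
claude_match = re.search(claude_pattern, html)
gpt_match = re.search(gpt_pattern, html)

print(claude_match.group(1), gpt_match.group(1))
```

The escaped slashes (\/) inside the GPT 4o pattern can stay; in this sketch they still match a plain slash.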
As a sanity check, I will try your second example: Eirian Evanna | FINAL FANTASY XIV, The Lodestone.
Right click, inspect:
Select the marked HTML tag for enough context, and copy outer HTML:
<div class="select-pulldown en-us">
<form action="?">
<select name="order" onchange="this.form.submit()" class="select-pulldown__open">
<option value="1">Sort by most recent</option>
<option value="2">Sort by oldest</option>
</select>
</form>
</div>
<div class="parts__total">2117 Total</div>
</div>
Paste into Claude with prompt; I guess it didn’t agree that we need that much context:
<div class="parts__total">(\d+) Total</div>
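To see whether the shorter pattern is still unambiguous, one option is to run it against the whole copied snippet and count the matches. This is only a sketch in Python, assuming its re module is close enough to whatever Curlex uses; the HTML is the snippet pasted above.

```python
import re

# The outer HTML copied from the page, pasted verbatim.
html = """<div class="select-pulldown en-us">
<form action="?">
<select name="order" onchange="this.form.submit()" class="select-pulldown__open">
<option value="1">Sort by most recent</option>
<option value="2">Sort by oldest</option>
</select>
</form>
</div>
<div class="parts__total">2117 Total</div>
</div>"""

# The regex Claude suggested, with one match group for the number.
pattern = r'<div class="parts__total">(\d+) Total</div>'

# findall returns every captured number; exactly one entry means the
# pattern is unambiguous within this snippet.
matches = re.findall(pattern, html)
print(matches)
```

This only checks the copied snippet, not the full page, so it is still worth testing in Beeminder itself.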
Test in Beeminder:
Edit:
For achievement points: <p class="achievement__point">(\d+)</p>
Also: obligatory disclaimer. I don’t endorse “parsing” HTML with regexes and this shouldn’t be used for anything serious. (See the first answer here: html - RegEx match open tags except XHTML self-contained tags - Stack Overflow)