Beeminder Forum

TaskRatchet Programming Language Wars

This is like saying you haven’t quite decided whether to build your new bicycle out of aluminum or duct tape.

1 Like

That’s actually super fair. :stuck_out_tongue_winking_eye:

I think this is mostly just me trying to escape the mess which seems to be Python’s package publishing tooling… :expressionless:

I may well end up writing my own CLI (for my own use, but maybe I’ll open source it if there is interest), so it’s likely that I personally won’t be affected whether you take this advice or not, but I do have quite a bit of experience in building CLI tools, and I have a very strong suggestion to make:

Python is a great language. I like it a lot. But I have learned not to write CLI tools in it, at least not short-running ones which are supposed to be instantaneous. (Things like TUIs or background services can be OK.)

The reason not to is the startup time. It’s not quite as bad as some alternatives (e.g. Java), but Python’s startup time is, well, not the best.

This almost never matters. Who cares if your program takes an amount of extra time measured in milliseconds to start? If it were a performance-critical domain you wouldn’t be writing it in Python in the first place, after all. Python deliberately trades off performance for ease-of-use, and that’s almost always a good thing: the kind of things you write in Python are typically bottlenecked not by raw execution speed, but by things like I/O (including network I/O.) And even if it takes milliseconds more, most often that doesn’t matter. Even hundreds of milliseconds more.

All this is true! Normally this doesn’t matter at all! But for command line tools it does. 200-300ms is a pretty typical startup time with a reasonable set of imports. That’s nothing for a long running program. But for a command line tool? It’s definitely more than enough to make it feel jerky. Try running sleep 0.25: a quarter second is definitely a noticeable pause.

Does it really take a quarter second? Yes, approximately that, if you’re importing anything moderately sized. Here are some benchmarks I’ve done, importing just the requests library, and running no other code:

[Screenshot from 2019-07-13 13-33-25: benchmark results]

That’s a quarter second all on its own. Note that this is after 30 warmup runs to warm up the disk cache! On a cold cache it can take significantly longer.
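For anyone who wants to reproduce this kind of measurement without a dedicated benchmarking tool, here is a rough sketch. It is not how the numbers above were produced: `startup_seconds` is a made-up helper name, and `json` is only a stand-in for `requests` so that the snippet runs without third-party packages.

```python
import statistics
import subprocess
import sys
import time


def startup_seconds(statement: str, runs: int = 5) -> float:
    """Median wall-clock time to start a fresh interpreter and run `statement`."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        # A new process each time, so interpreter startup is part of the cost.
        subprocess.run([sys.executable, "-c", statement], check=True)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


if __name__ == "__main__":
    baseline = startup_seconds("pass")
    with_import = startup_seconds("import json")  # swap in "import requests" if installed
    print(f"bare interpreter: {baseline * 1000:.0f} ms")
    print(f"with import:      {with_import * 1000:.0f} ms")
```

On Python 3.7 and later, `python -X importtime -c "import requests"` gives a per-module breakdown of where the import time goes.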

This is not to say you should write it in Bash, which is, not to put too fine a point on it, not the nicest language to write programs in.

But this is the reason why I tend away from Python (and similar languages) when writing CLI tools, unless there is a strong reason otherwise (such as, e.g. it being a frontend to an existing Python library.) In the olden days maybe you’d write it in Python (or Perl, Ruby, etc), if only because the alternative would be what, C? You really want to avoid that if you can. You almost certainly don’t want to write something like this in C. (Not that C is a bad language per se, and of course many CLI tools historically were written in C, but it’s a lot less suited as a language for things like this.)

Nowadays we’ve got alternatives, though. There’s Go, Rust, and a number of others along those lines that make a good alternative to scripting languages for things like this, and produce fast-starting binaries. (And fast-running, etc, but as discussed that really isn’t remotely noticeable here. And better type-safety and so forth, but that’s a whole 'nother issue.)

Anyway: if you really do want to write it in Python, there are things you can do to make it a bit better. See also this email on the Python mailing list for more discussion of the issue. It’s a thorny issue, with real UX consequences, and it hasn’t really got a good solution so far. Python is a good language, all in all, so it’s a shame to do something as drastic as avoiding it completely for writing CLI tools. And it’s not that I never write any CLI tools in Python: as mentioned before, things like a frontend to an existing Python library, or things where the startup time matters less can happily be written in Python. But for most CLI tools I tend to shy away from it: the UX hit is annoying, and the reason I like CLI tools so much is that the user experience for things I do a lot is quite important to me.
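One commonly suggested mitigation (not necessarily the one the linked material describes) is to defer expensive imports until the subcommand that needs them actually runs. A minimal sketch, assuming a hypothetical tool called `example-cli` with two made-up subcommands, and using `json` as a stand-in for a heavier dependency:

```python
import argparse


def cmd_version(args: argparse.Namespace) -> str:
    # Cheap command: needs nothing beyond what is already loaded.
    return "example-cli 0.1.0"


def cmd_dump(args: argparse.Namespace) -> str:
    # Deferred import: the cost is paid only when this subcommand runs.
    # `json` is a stand-in for a heavier dependency such as `requests`.
    import json

    return json.dumps({"task": args.task})


def run(argv: list) -> str:
    """Parse `argv` and dispatch to the matching subcommand."""
    parser = argparse.ArgumentParser(prog="example-cli")
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("version").set_defaults(func=cmd_version)
    dump = sub.add_parser("dump")
    dump.add_argument("task")
    dump.set_defaults(func=cmd_dump)
    args = parser.parse_args(argv)
    return args.func(args)
```

With this layout, `run(["version"])` never touches the deferred import. It doesn’t remove the interpreter’s own startup cost, so it only softens the problem.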

1 Like

Love the thought you put into this!

I’m open to considering using Rust or Go, though I don’t know much about those languages. Currently, my biggest concerns:

  1. How easy would it be for someone who wants to use the API to look at some Rust or Go code and mentally translate that to their language of choice? This is one way Python really shines (and is another reason Bash is probably a horrendous choice for this).
  2. How would a CLI in Rust or Go be distributed? Ideally it could be published in such a way that a new user could run a single command like npm install -g <package_name> or brew install <package_name> which would install the tool in their path so it would be available everywhere.
  3. Do these languages have mature testing frameworks?
1 Like

More like duct tape and glass fibre reinforced duct tape :roll_eyes:

I take it you don’t like Python much? :stuck_out_tongue_winking_eye:

1 Like

If you want, once it’s finished I can send you a copy of my M.Sc. project documentation, which is more or less a 60-page rant about Python, or more specifically about my attempt to implement SPIHT in it.
It’s my personal Yellow Brick Half-Plane and was meant to be done by it’s why I started beeminding.

Admittedly the rant itself isn’t all that’s in there. But it is the centerpiece :wink:

Care to share a TL;DR? It’s pretty obvious why people would hate on Bash, but it’s a little less clear to me why people would dismiss Python. Certainly, static typing and true private attributes would be nice to have, but you can be very productive without them. This list of websites built entirely or partially with Python, including YouTube and Netflix, would seem to indicate that Python is a serious, production-ready language.

My tl;dr is: don’t.
I’d actually go into detail but I have piled up a few goals that are about to derail and time is running out.
My suggestion would indeed be Rust for the CLI. Or Scala with Scala Native, or using the GraalVM ahead-of-time compiler.
Even without native compilation, if done right the JVM’s startup time is no worse than Python’s, and it certainly beats the hell out of it performance-wise.
Not that that matters. What matters is type safety and very useful concepts in Scala vs a pile of bloody hacks in Python.

I’m all for type safety, though in my limited experience a good test suite seems to make a lack of language-level type safety a lot less of a problem than it could be.

I don’t mean to argue with you, though. I’m open to building the CLI in something other than Python. And type safety + performance gains would be wonderful.

If you ever feel like sharing more details on your concerns with Python, I’d love to hear them. :slight_smile:

I’m wondering about two things here.

Python is strongly typed but dynamically typed, as opposed to statically typed.

The article above on optimizing Python is about Python 2.4/2.5. January 2020 marks the sunsetting of Python 2.7.

I’m quite partial to Python but I’m not a CS person and just wanted to point out those two peculiarities I noticed.

1 Like

Good catches. I don’t know that I have all these terms down, but I think what @phi (correct me if I’m wrong) is referring to by type safety are languages that can guarantee that all type errors are caught at compile time. A language like Python, which doesn’t require types to be explicitly declared and doesn’t check types at compile time (or even require a compile step), isn’t able to make that guarantee.
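To make the strong-versus-dynamic distinction concrete, here is a small sketch (the function names are made up for illustration). Nothing is checked until the line runs, but when it does run, Python refuses to silently mix a string and an integer:

```python
def add_one(value):
    # No declared types: any object is accepted, and the check happens
    # only when this line actually executes.
    return value + 1


def try_add_one(value):
    """Return the result, or the name of the exception raised at runtime."""
    try:
        return add_one(value)
    except TypeError as exc:
        return type(exc).__name__


x = 1      # x refers to an int
x = "one"  # rebinding to a str is fine: names have no fixed type

print(try_add_one(1))    # int + int works
print(try_add_one("1"))  # str + int is refused rather than coerced
```

A statically typed language would reject the second call before the program ever started, which is the guarantee being discussed here.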

I love the idea of type safety, and I’ve wanted to learn a language that has this attribute for a long time. At this point, though, my goal is to be productive at writing good code. And doing so doesn’t necessarily mean choosing the perfect language. I already know Python, and I’m confident that I can write solid, well-tested code in that language without spending a lot of time learning something entirely new. That’s why I’m writing the API in Python.

I’m open to writing the CLI in something other than Python. I expect it to be a rather light wrapper for the API. If it starts doing a lot of heavy lifting, we’re doing something wrong. Most features should be implemented in the API, not the CLI.

So, if I were to revise my previous list of criteria for the language the CLI should be written in, they might look something like this:

  1. The language should meet some minimal level of performance.
  2. It should have a mature testing framework.
  3. Ideally it should support publishing the CLI in such a way that a new user could run a single command like npm install -g <package_name> or brew install <package_name> which would install the tool in their path so it would be available everywhere.
  4. Given the CLI will be open source and MIT licensed, the language shouldn’t create unnecessary barriers to new contributors.
  5. A bonus would be that the code should be relatively easy for mildly technical people to re-purpose for their own uses.

Python shines on points 2, 4, and 5. I’m no performance expert, but it seems there are questions in this thread on Python’s ability to meet 1. And I expect it can meet 3, but I don’t know how to do it.

How would something like Rust or Go, or another language you think I should consider, stack up to these criteria?

1 Like

Just as a note: I’m trying hard to provide a balanced view of the state of the ecosystem, and avoid shilling for any one language. So in what follows I’m going to perhaps be overly evenhanded, even despite any personal preferences I have for one language or another.

I agree, Python shines at readability. So much so that sometimes it’s praised as “executable pseudocode”, which is, when you think about it, really quite impressive.

Nothing else quite reaches the heights of Python, of course, but neither Rust nor Go has the abstruseness of C or the boilerplate of Java, so I think they do pretty OK at this metric. Go especially has readability as one of its highlighted goals as a language. (In my opinion it doesn’t quite perfectly live up to that, at least in comparison to Python, but it’s still pretty decent.)

Rust has cargo, which is a package manager along the lines of Node’s npm or Python’s pip. It’s really really good. Package managers are often something you get frustrated at: cargo goes above and beyond at solving those problems. Anyway: cargo install <package_name> does exactly what you’d expect. (The same as pip install <package_name> in Python, or npm install -g <package_name> in Node.)

Go is a bit less good at this. Still, there is go get, which is roughly equivalent. But there are thorny issues around stuff like the GOPATH, which you really don’t want to get into. (OK, it’s not quite that bad. And I hear they’ve improved things somewhat to move away from the GOPATH. But I’d still be a little bit wary about it.) That said, one advantage of Go is that it produces fully statically linked binaries: so what many people do is they compile it (once for each architecture/platform), and host these binaries somewhere, making the equivalent command to fetch the executable be wget https://path.to.hosted/executable. Ultimately this isn’t that much worse: it’s still just a single command, and then everything works, without much fuss. (Although you do need to compile it once for each platform/architecture you want it to run on.)

Very much so. This new generation of languages is from an era when the importance of testing is very well known, and both have in-language support for tests.

In Rust that looks like:

#[test]
fn test_that_one_plus_one_is_two() {
    assert_eq!(1 + 1, 2);
}

In Go it looks like:

package main

import "testing"

// Lives in a file whose name ends in _test.go; run with `go test`.
func TestOnePlusOneIsTwo(t *testing.T) {
    if 1+1 != 2 {
        t.Error("1 + 1 should be 2")
    }
}

Both of these examples use the very mature, very high quality testing frameworks that are treated as an integral part of the language itself.
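For comparison, since Python is the baseline in this thread, the same test with the standard library’s `unittest` module might look like this (the class name is arbitrary):

```python
import unittest


class TestArithmetic(unittest.TestCase):
    def test_one_plus_one_is_two(self):
        # Same assertion as the Rust and Go versions above.
        self.assertEqual(1 + 1, 2)
```

Saved in a file, this can be run with `python -m unittest <file>`. The difference is mostly one of convention: here the framework is a library module rather than something the build tool knows about directly.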

Type safety is, quite honestly, amazing. But building a product isn’t necessarily the time to invest in learning a new language. That’s not to say you absolutely shouldn’t, but it’s a cost, and one it seems you’re well aware of. Good.

You can give it a shot, if you’d like, and if you can afford to do so without giving up valuable time from your runway. But if you’re building a startup here, everything comes at a cost, everything is a tradeoff. It may well be worth it to build a worse product(!) with serious UX issues than to end up running out of time and not building anything at all.

Take my advice in the previous post as technical advice, not business advice. (Of course, building a good technical product is often part of building a good business.) I don’t know what you should actually do, other than to think very hard and to weigh all these options seriously.

I’ll reiterate: it may well be worth building a product with UX issues if it means you can actually build it. Don’t bite off more than you can chew. (But maybe you can chew this bite. I don’t know, that’s for you to decide for yourself.)

Back to the technical discussion: In this list, Python actually almost meets all five. The only one that maybe it doesn’t is 1, but it’s not really a question of performance as you normally think of it that’s an issue here. It’s specifically a question of startup time. (Which perhaps is an aspect of performance. Whatever.)

It’s pretty OK at 3: Users can just run pip install --user <package_name>, if you publish your package to PyPI. This is a good tutorial on how to do that. It’s not quite as simple as in e.g. Rust, but it’s pretty decent.

2 Likes

Thank you for this, @zzq! This is very informative.

The other advantage Python has in this respect is that most users will already have Python installed (Mac and Linux users at least; not sure about Windows users?). I’d assume most people would have to do something extra to get Rust + Cargo on their machine.

1 Like

Unfortunately, Python is typically distributed without pip by default, with that being a separate package to install. That means that in practice this is more or less the same between Python, Node, Rust, etc: in all these cases the user is going to need to install something (if they don’t already use the language in question.)

In some ways it can actually be a disadvantage that Python often comes preinstalled: often the version that ships with the OS is outdated, and the user will need to install a newer version anyway (for instance, on a Mac, using Homebrew to install Python instead of the system Python.) You can avoid that by writing Python code that runs on old versions, but that’s hard to do and easy to make a mistake with.

(This isn’t even mentioning the Python 2 vs Python 3 split, which is a separate matter. At least there the executables have different names, python vs python3, so it’s relatively simple to tell which one needs to be used. The “old versions” described above might for instance be Python 3.4 or 3.5, while the latest is 3.7.)
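One small thing that helps with outdated interpreters is failing early with a clear message instead of a confusing syntax or import error. A sketch, where the minimum version and the function name are assumptions picked for illustration:

```python
import sys

MINIMUM = (3, 5)  # hypothetical oldest supported version


def version_problem(version_info=sys.version_info, minimum=MINIMUM):
    """Return an error message if the interpreter is too old, else None."""
    current = tuple(version_info[:2])
    if current < minimum:
        # %-formatting on purpose: f-strings would not even parse before 3.6.
        return "Python %d.%d or newer is required, found %d.%d" % (minimum + current)
    return None
```

The entry point of the tool could then call `version_problem()` first and pass any message to `sys.exit`. The check itself has to be written in syntax the old interpreter understands, which is a small taste of why supporting old versions is easy to get wrong.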

Note that one additional advantage of a compiled language like Rust or Go though is that if you want you can distribute precompiled binaries, meaning you can skip all runtime requirements completely. (Well, there are still platform/architecture requirements, so you’d need to compile several versions, one for Mac, one for Linux, one for Windows, etc. (And depending on what you want to support, 32 bit and 64 bit versions of each, versions for ARM, etc. But that’s probably going a bit too far. It’s fair to ask people on more exotic architectures to compile it from source, in my opinion.))

Unfortunately, distributing binaries means that you miss out on using the language’s package manager. So then you either make the binaries available using mechanisms like e.g. GitHub releases and have the installation instructions be to download them with wget or curl, and/or you can package it for a different package manager, such as system package managers like Homebrew or APT.
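If you do host one binary per platform, the install instructions (or a tiny installer script) need to pick the right file. A sketch of that selection logic, where the `mytool-<os>-<arch>` naming scheme and the `asset_name` helper are entirely made up for illustration:

```python
import platform

# Normalize the spellings reported by different operating systems.
ARCH_ALIASES = {"x86_64": "amd64", "amd64": "amd64", "aarch64": "arm64", "arm64": "arm64"}
OS_ALIASES = {"darwin": "macos"}


def asset_name(system=None, machine=None, tool="mytool"):
    """Build the (hypothetical) release file name for a platform."""
    system = (system or platform.system()).lower()
    machine = (machine or platform.machine()).lower()
    os_name = OS_ALIASES.get(system, system)
    arch = ARCH_ALIASES.get(machine, machine)
    suffix = ".exe" if os_name == "windows" else ""
    return "{}-{}-{}{}".format(tool, os_name, arch, suffix)
```

Called with no arguments it describes the machine it runs on. Anything not in the alias tables passes through unchanged, which is where the “compile it from source” fallback would come in.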

Yes, software distribution is genuinely hard. Unfortunately there’s no free ride here: even the seemingly easy ways to do it (like trying to use the user’s OS’s version of Python) have some severe pitfalls. (Both what I described above, and things like Windows users not having it installed by default. Although actually now on Windows 10, you can get Python through the Microsoft Store apparently.)

Sigh.

Regarding Python backwards compatibility, I can set up my IDE to flag compatibility issues, and also set up CI to run the test suite using multiple versions of Python. So that might mitigate the issue somewhat…

Honestly, though, I’m currently envisioning the CLI to be for power users. So one of the reasons to make sure the code is very easy to understand is so that a user whose setup isn’t supported can build their own tool if needed.

Though, it would be nice to know that distribution & compatibility are problems that can at least be theoretically solved with whatever stack we choose, even if we don’t support everything out of the gate. That way the tool could move towards broader compatibility over time.

What about an even crazier idea: plan to include multiple implementations in different languages? Then people could pick and choose what works for them and build from there. This probably doesn’t really solve the issues though.

Also: Does Node stack up well on these criteria?

  1. Performance: No idea.
  2. Test framework: More challenging than something like Python, but probably has some good options.
  3. Distribution: Installing Node can be annoying, but may not be worse than anything else.
  4. New-contributor friendly: Lots of people know JavaScript.
  5. Easy to understand / repurpose: Probably a mixed bag. JavaScript has its eccentricities.

Sorry, @phi, if Python makes you cringe, I hate to imagine your reaction to considering JavaScript for this. :wink:

To me: JavaScript = Assembler pretty much.
As long as you use something else to compile to it, I see no problem. E.g. Scala or Purescript.

If you want to learn something new (as in: expand your horizon new vs. just learn some new syntax for the same old stuff), do Scala, Haskell, Rust, Clojure. Of all of these I would argue Scala is the most accessible.

Scala has top-notch support for asynchronous programming with predictable concurrency (aka Futures and Promises) all in a nice, clean and concise syntax. Combined with a rich static type system, good type inference (so you don’t actually have to write out the types) and excellent unit testing support (specs2.org and scalacheck.org) it’s a really solid language. It also has a really good build system. And a very mature and rich ecosystem.

Python has… hacks and hacks and more hacks, no useful typing no nothing. And the linters are garbage for type checking. It is SLOW as hell. Seriously it is not just not fast, it is painfully slow.
And besides that, the problem with dynamically typed languages is that you can not refactor in them.

2 Likes

you can’t refactor at all in a dynamically typed language? what? refactoring is just looking over your code and consolidating and rewriting.

Fair point. And there are definitely good options for JS precompilers.

Learning a functional language is definitely on my bucket list. :nerd_face:

Again, test-driven development makes this way less of an issue. When 90+% of your code is under test, you definitely can refactor, even in a dynamically-typed language.

Yes and no. 90% of your unit tests will consist of ridiculous tests such as “what happens if I pass an Int in here instead of a String?”. I prefer to have meaningful unit tests and have a compiler ensure there are no super trivial errors.
Also you can not automatically refactor even with unit tests. You can just move the pain of manually refactoring from “hard to diagnose and to reproduce error in production” to “one amongst a dozen other errors being thrown into your face during refactoring. And that’s not counting the ones you forgot to write testcases for”.
The bulk of work is still left to you. Manually renaming things. Moving things. Respecting the order of imports. Seriously. The import system in Python is SO broken and I had issues with this SO often and they are incredibly annoying to diagnose.
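To make the Int-instead-of-String case concrete, here is a small sketch with made-up names. The annotation documents the intent, but Python does not enforce it at runtime, so the mistake only surfaces when the call actually happens:

```python
def shout(name: str) -> str:
    # The annotation documents intent but is not enforced at runtime.
    return name.upper() + "!"


def failure_kind(func, *args):
    """Return the exception class name raised by func(*args), or None."""
    try:
        func(*args)
    except Exception as exc:
        return type(exc).__name__
    return None


print(shout("hi"))
print(failure_kind(shout, 42))  # the wrong type only fails once the call runs
```

An optional static checker such as mypy can report a call like `shout(42)` before the program runs. How far that gets you in practice is exactly what is disputed in this thread.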

1 Like