You're right in the absolute sense, but I've yet to see a Dockerfile that (with a little thinking and elbow grease) I couldn't "easily" port or update, even after years.
It's basically the best and easiest "I am documenting how it works now" mechanism I have found yet, without any arcane "works on my machine" quirks.
So I'm still agreeing that it's a very good approximation of this idea.
Real reproducibility is miles better, but usually cannot be formulated in a ~20-line single-file "recipe". (And before anyone mentions Nix: no, there's so much inherent complexity involved that it doesn't count as "apt-get install docker && docker build .".)
A container can very rarely be reproduced by a dockerfile.
I imagine with a lot of discipline (no apt update, no “latest” tag, no internet access) you can make a reproducible Dockerfile… but it is far from normal.
Well sure, making a 100% reproducible build is hard - but Docker makes it easier, not harder. If 100% reproducible is the goal, what's easier than docker?
A Dockerfile is essentially a shell script with access to the outside world. It has unconstrained network access. It can access local hardware and filesystem if instructed to. However, it doesn't verify that whatever stuff it took from the outside remains the same across builds. Docker doesn't care if the same Dockerfile builds Apache httpd in one build and Nginx in another. It literally can't get more irreproducible than that.
But mysteriously, people say that Docker is reproducible because, uh, you can download gigabyte-sized binary blobs from the Docker registry. I wonder, what's not reproducible by that metric?
Docker images may be portable compared to binaries targeting traditional FHS distros. But it's not reproducible whatsoever.
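That "doesn't verify what it took from the outside" gap can at least be patched around the edges by pinning external inputs. A sketch in Python (the artifact and digest are hypothetical; the digest here happens to be sha256("test")) of refusing to proceed when an input drifts between builds:

```python
import hashlib

# Digest recorded when the build was last known-good.
# (Artifact name and digest are hypothetical examples.)
PINNED_SHA256 = "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08"

def sha256_of(path):
    """Stream the file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def verify(path):
    """Fail loudly if the downloaded input no longer matches the pin."""
    actual = sha256_of(path)
    if actual != PINNED_SHA256:
        raise RuntimeError("artifact drifted: expected %s, got %s" % (PINNED_SHA256, actual))
```

This is the step a plain `RUN curl … | sh` in a Dockerfile skips entirely.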
Full reproducibility isn't easy, there is a cost to it.
However the payoff is rather significant so if you can temper that cost a bit and make it less inconvenient to achieve then you have a winning solution.
Tools that have been designed with reproducibility in mind. Like Guix.
Beware, I am definitely not claiming those are easy to use in general. Just that you can get to reproducibility using them more reliably and maybe easier than with docker.
Presumably if your goal is a reproducible build you just wouldn't do any unconstrained downloading while designing the Dockerfile and building the image. Making a choice to use a tool poorly for your requirements isn't a problem with the tool.
It kind of sounds like you're describing Ansible. You use modules for common tasks like ensuring a package is installed, a file is present or has certain content, etc. It's declarative and idempotent.
I've written some fairly complex stuff in Ansible. It is mostly declarative, but you should be careful with assumptions about its idempotency, especially if you reach for community modules.
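Idempotency in this sense just means re-running converges to the same end state. A minimal Python analogue of an Ansible-style "ensure this line is in the file" task (a sketch, not Ansible's actual implementation):

```python
from pathlib import Path

def ensure_line(path, line):
    """Ensure `line` is present in the file; return True if a change was made."""
    p = Path(path)
    existing = p.read_text().splitlines() if p.exists() else []
    if line in existing:
        return False  # already in desired state: no-op
    existing.append(line)
    p.write_text("\n".join(existing) + "\n")
    return True
```

Running it twice reports changed then unchanged, the way Ansible reports `changed` vs `ok`.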
what happens when you want to tweak something you did in the middle of this process? do you have to go through the whole flow again manually to make a single change?
I imagine you could either A) just modify the dumped state, B) parameterize it, or C) have the program split the state into transactions and modify those. The program will probably have to take more than one step, in order, to accomplish everything. If it fails, you'd want it to try to undo its work, a la transactions. And since it can do all that, it can stop, start, or resume at specific steps.
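The transactions idea above can be sketched in a few lines: each step knows how to apply and undo itself, and on failure the completed steps are rolled back in reverse order (a toy sketch, with resume via a start index):

```python
class Step:
    """One unit of work with a matching undo action."""
    def __init__(self, name, apply, undo):
        self.name, self.apply, self.undo = name, apply, undo

def run(steps, start_at=0):
    """Run steps in order; on failure, undo completed steps in reverse, then re-raise."""
    done = []
    for step in steps[start_at:]:
        try:
            step.apply()
            done.append(step)
        except Exception:
            for s in reversed(done):  # roll back what we managed to do
                s.undo()
            raise
    return [s.name for s in done]
```

`start_at` is what lets you resume at a specific step after a partial run.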
Like, Terraform has always sucked because there was no way to dump existing resources as new code. So a team at Google made a tool to do it (Terraformer). If Terraform had already had that feature, and if it didn't rely on having pre-existing state to manage resources, that would be like 90% of the way to what I'd want. Just dump resources as code, then let me re-run the code, and if I want I can modify the code to ask me for inputs or change things. (People think of Terraform as only working on cloud resources, but you could, for example, make an Ubuntu Linux provider that just configures Ubuntu for you.)
Any notion of state that satisfies requirements like
> Just dump state and later reapply state
is necessarily declarative.
> Just dump resources as code,
What is the code for this resource?
VM foo1
Memory 16GiB
Network mynet1
It depends on the current state of the system where the resource is applied. If VM foo1 already exists, with 16GiB of memory, and connected to network mynet1, then the code is a no-op, no code at all. Right? Anything else would be a mistake. For example if the code would delete any matching VM and re-create it, that would be disastrous to continuity and availability, clearly a non-starter. Or, if VM foo1 exists, with 16GiB of memory, but connected to anothernet3, then the code should just change the network for that VM from anothernet3 to mynet1, and should definitely not destroy and re-create the VM entirely. And so on.
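The point generalizes: "apply" has to diff desired state against actual state and emit only the minimal operations. A toy sketch of that reconciliation, using the VM fields from the example above (not any real provider's logic):

```python
def plan(desired, actual):
    """Return the minimal operations to get from `actual` to `desired`.

    `actual` is None when the resource doesn't exist yet.
    An empty list means the system already matches: a no-op.
    """
    if actual is None:
        return ["create %s" % desired["name"]]
    ops = []
    for key, want in desired.items():
        if actual.get(key) != want:
            ops.append("set %s=%s" % (key, want))
    return ops
```

Note that a differing field produces a targeted `set`, never a destroy-and-recreate.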
It depends what you're talking about; Terraform specifically has a flawed model where it assumes nothing in the world exists that it didn't create itself. Other configuration management tools don't assume that; they assume that you just want an item to exist; if it does exist, great, if it doesn't exist, you create it. But for a moment I'll assume you're talking about the other problem with configuration management tools, which is "which of the existing resources do I actually want to exist or modify?"
That's a solved problem. Anything that you use on a computer that controls a resource, can uniquely identify said resource, through either a key or composite key. This has to be the case, otherwise you could create things that you could never find again :) (Even if you created an array of things with no name, since it exists as an item in a list, the list index is its unique identifier)
Taking Terraform as example again, the provider has code in it that specifies what the unique identifier is, per-resource. It might be a single key (like 'id', 'ASN', 'Name', etc) or a composite key ( {'id' + 'VPC' + 'Region'} ).
If the code you've dumped does not have the unique identifier for some reason, then the provider has to make a decision: either try to look up existing resources that match what you've provided and assume the closest one is the right one, or error out that the unique identifier is missing. Usually the unique identifier is not hard to look up in the first place (yours has a composite identifier: {VM:"foo1", Network:"mynet1"}). But it's also (usually) not fool-proof.
Imagine a filesystem. You actually have two unique identifiers: the fully-qualified file path, and the inode number. The inode number is the actual unique identifier in the filesystem, but we don't tend to reference it, as 1) it's not that easy to remember/recognize an inode number, 2) it can be recycled for another file, 3) it'll change across filesystems. We instead reference the file path. But file paths are subtly complex: we have sym-links, hard-links and bind-mounts, so two different paths can actually lead to the same file, or different files! On top of that, you can remove the file and then create an identically-named file. Even if the file had identical contents, removing it and creating a new one is technically a whole new resource, and has impact on the system (permissions may be different, open filehandles to deleted files are a thing, etc).
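The two-identifiers point is easy to demonstrate from Python: a hard link gives you two paths to one inode, and recreating a deleted file yields a new inode even when the path and contents are identical (the old inode stays pinned here because the hard link still holds it):

```python
import os, tempfile

d = tempfile.mkdtemp()
a = os.path.join(d, "a.txt")
b = os.path.join(d, "b.txt")

with open(a, "w") as f:
    f.write("hello")
os.link(a, b)  # hard link: a second path to the same file

same_file = os.stat(a).st_ino == os.stat(b).st_ino  # two paths, one inode

old_ino = os.stat(a).st_ino
os.remove(a)
with open(a, "w") as f:  # identical path and contents...
    f.write("hello")
new_file = os.stat(a).st_ino != old_ino  # ...but a brand-new inode
```

So "the same file" by path is genuinely a different resource by inode.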
So what all of us do, all day, every day, is lie to ourselves. We pretend we can recognize files, that we have a unique identifier for them. But actually we don't. What we do is use a composite index and guess. We say, "well it looks like the right file, because it's in the right file path, with the right size, and right name, and right permissions, and (maybe) has the right inode". But actually there's no way to know for sure it's the same file we expect. We just hope it is. If it looks good enough, we go with it.
So that's how you automate managing resources. For each type of resource, you use whatever you can as a unique (or composite) identifier, guesstimate, and prompt the user if it's impossible to get a good enough guess. Because that's how humans do it anyway.
The main thing that keeps me from using Jupyter notebooks for anything that's not entirely Python, is Python.
For me, pipenv/pyenv/conda/poetry/uv/requirements.txt and the inevitable "I need to upgrade Python to run this notebook, ugh, well, ok" - two weeks later - "g####m, that upgrade broke that unrelated and old Ansible, and now I cannot fix these fifteen barely-held-together servers" is pure hell.
I try to stay away from Python for foundational stuff, as any Python project that I work on¹ will break at least yearly on some dependency or other runtime woe. That goes for Ansible, build pipelines, deploy.py, or any such thing. I would certainly not use Jupyter notebooks for crucial and foundational automation, as the giant tree of dependencies and requirements they come with makes this far worse.
¹ Granted, my job makes me work on an excessive number of codebases: at least six different Python projects in the last two months, some requiring Python 2.7, some requiring deprecated versions of lib-something.h, some cutting edge, some very strict in practice but not documented (it works on the machine of the one dev who works on it, as long as he never updates anything?). And Puppet or Chef, being Ruby, are just as bad, suffering from the exact same issues; the only difference is that Ruby has had one (and only one!) package management system for decades now.
We recently started using https://marimo.io/ as a replacement for Jupyter notebooks, as it has a number of great improvements, and this seems like a movement in a similar direction.
- I am on a team that oversees a bunch of stuff, some of which I am very hands-on with and comfortable with, and some of which I am vaguely aware exists, but rarely touch
- X, a member of the latter category, breaks
- Everyone who actually knows about X is on vacation/dead/in a meeting
- Fortunately, there is a document that explains what to do in this situation
- It is somehow both obsolete and wrong, a true miracle of bad info
So that is the problem this is trying to solve.
Having discussed this with the creator some[1], the intent here (as I understand it) is to build something like a cross between Jupyter Notebooks and Ansible Tower: documentation, scripts, and metrics that all live next to each other, in a way that makes it easier to know what's wrong, how to fix it, and whether the fix worked.
It shouldn't but often still is... and maybe a runbook like this is easier to handle than a script with possibly 1000 lines and not a single comment.
Of course, in your ideal world maybe nothing of this applies and you never have any incidents ;)
> It is somehow both obsolete and wrong, a true miracle of bad info
How does Atuin solve that problem? It seems to me that inaccurate and obsolete information can be in an Atuin document as easily as in a text document, wiki, etc., but possibly I'm not seeing something?
I'm just a community mod, not a dev on the project, so take this with a grain of salt:
I believe the intent is that you get bidirectional selective sync between your terminal and the docs, so that if what's in the docs is out of date or wrong, then whatever you did to actually fix things can be synced back to the docs to reduce the friction of keeping the docs updated.
To me, it seems like it's because the thing you're fixing is actually the "runbook" that's being run. Instead of separating the documentation from the code, they're married together so it's easier to keep them in sync because you aren't having to remind yourself to go edit this secondary location when you make a quick change.
I'm cautiously curious about something like this, although I haven't tried it personally.
Yes, it seems like right now the pendulum is swinging the other way: separation is no longer in fashion, and the fashionable thing is to have everything in one place.
The idea seems interesting to me because I do not really like terminals, and having something more visually appealing, with better history and comments, is an improvement. Though I am also not sure Atuin is the best way to achieve all of that.
Ok, I think I see where this is coming from. Seeing your description, I actually think it might even be a benefit to non-technical people with no knowledge of what's going on: they can follow instructions and easily execute the relevant code, what with it all sitting together.
However, I don't see how it solves the obsolete-or-wrong-documentation thing. You still have to make sure the runbook is correct; if it's not, you've got the exact same problem.
Having a centralised place for all your scripts is an advantage with inline docs. But then this is a local desktop version...
Well, what is the purpose of deployments being built in ansible or deployer or whatever tooling as a general rule? And then packaging, say, extra python scripts to perform common tasks then dumping it all in a git repo?
Some people just like a particular workflow or tooling flow and build it really. Maybe it works for enough people to have a viable market, maybe not.
I am just using a PHP deployment process for no reason other than feeling like it for personal projects and it handles 60% of the work without me needing to do anything. But any runbooks for it are tasks built into the tool and in the same git repo for the entire server deployment. I'm not gonna put it in some random place or a shell script that I need to remember separate commands for.
Code, for programmers, is inherently self-documenting if you keep a simple functional style without any cleverness, with comments on the occasional section that isn't just the obvious "Create a MySQL user, roll the MySQL user's password, update the related services with the new password/user combination, remove the old user that the fired employee has credentials to (on the off chance we failed to block them at the VPN)" kind of stuff.
My dream tooling is for every tool to have a terminal interface so that I can create comprehensive megabooks capturing all the context that lives in my head: Jira, Datadog, GitHub, etc., all in one pane.
IMHO just an API would be enough, tool could be written on top of that.
My ideal world would be for every service, tool, and application to have an API that I can use, e.g. if the fridge is open too long (API polling or a webhook), I can send the roomba to close it (using the roomba's API). Because why not?!
+1. Personally, I’m a fan of TUIs too that make things a bit more user friendly. Just imagine an internal TUI framework that has components for each internal service that you can lego-build into personalised TUI dashboard. Hmm, seems like something I could work on the side at work. Would be a huge undertaking but very interesting.
That's entirely different to what's being desired by GP.
> My dream tooling is for every tool to have a terminal interface so that I can create comprehensive megabooks to get all the context that lives in my head. i.e. jira, datadog, github etc, all in one pane.
My perspective on this is essentially having jira/datadog/github/etc be pluggable into the CLI, and where standard bash commands & pipes can be used without major restrictions. (Something akin to Yahoo Pipes)
MCP is highly centered around LLMs analyzing user requests & creating queries to be run on MCP servers. What's being desired here doesn't centralize around LLMs in any sense at all.
It’s actually not too far off. Yes, MCP is designed for LLM interactions, but we observed that it’s an invocation API that’s pretty generic. So we built a package format that encapsulates computations and makes them accessible from any of MCP, REST, or JSON-RPC over WS (the generic cousin of MCP).
We build logic once and make it automatically accessible from any of these consumption methods, in a standardized way to our clients, and I am indeed piping some of these directly in the CLI to jq and others for analysis.
It's kind of sad the direction they took. The last thing I want is my runbooks being held hostage by my desktop with proprietary and possibly paid software.
Congratulations on the launch! I've been following Atuin for a bit and, while I'm not necessarily the intended audience for this runbook feature, love seeing people build fun new things.
Using C# as main language this allowed us to have runbooks using code shared as nuget package and so being able to interact with our own APIs and applications as any other code that runs in production.
Not the best experience to review but it worked for us.
Thanks! We're using Tauri (https://v2.tauri.app/) on the client, and Elixir + Phoenix (with a little bit of Rust via Rustler) on the server
Tauri means we can reuse a lot of the Rust we already have, easily do the systems stuff we need, and have something light + fast. Elixir has been awesome and makes a realtime sync backend easier
Not currently open source while it's under heavy early development; we will be opening up the desktop app later on.
This is one place where it might actually make more sense to have an Electron app: with user code, you'd already have a lot of variables out of your control, and a standard browser engine would help. Also, unlike other apps, you hopefully wouldn't have 5 code-notebook apps running.
It is bothersome to see people who obviously don’t believe in free software ideology and software freedoms (otherwise you would never produce nonfree software) (ab)using the open source community in this way.
Software freedoms exist as a concept for a reason, not just a bullet point to get people to click a download link that doesn’t even include source anyway.
I call such projects “open source cosplay”. It’s an outfit you put on for conferences, then take off when back at the office working on the nonfree valuable parts.
Atuin's CLI for shell history is open source, has been free for years, and is a very useful tool. If the author now wants to build a product on top so she can make a living, that's a win for everyone: the author, the open source users (since the project will keep being maintained), and people who get value out of the new product she's building.
The irony of this purist mindset is that it's actually very corporatist, big-tech, and proprietary in its implications. If open source devs are discouraged by the culture from building products and making a living independently, it means that the only people who can devote significant time to open source are employees of established companies (who themselves often sell closed source proprietary products) and people who are wealthy enough to work for free. Is that the world you want?
This kind of attitude is why fewer and fewer people are open-sourcing software.
Why would I waste my time releasing any of my projects for free when people will attack me and call me a poser anyway?
Might as well charge people money (who, by the way, will actually be grateful to pay) instead of trying to keep up with the open source community's purity treadmill.
I'm really confused by products like this and Warp Drive[0]. What does this add over a shell script?
There is a response elsewhere in the comments[1] which claims that this is trying to fix the problem of bad documentation, but it has the same fundamental problem. If you a) are responsible for fixing something, b) are unfamiliar with it, and c) find that the "fixing resources" - whether those are scripts, documentation, or a Runbook/Workflow - provided by the experts are out of date, you're SOL and are going to have to start investigating _anyway_. A runbook and a script are just different points along the spectrum of "how much of this is automated and how much do I have to copy-paste myself?"[2] - both are vulnerable to accuracy-rot.
Kinda related but just the other day I was thinking of the notebook/runbook workflow and wonder if there is a tool like this that also incorporates git checkpoints (either commit or stash) into it. Like top to bottom, associate all the blocks and resulting artifacts with a commit hash. Might be something to vibe code over the weekend.
what are the problems you're talking about? your references seem to refer to reproducing scientific publications, dependency issues, and cell execution ordering.
this project appears to be intended for operational documentation / living runbooks. it doesn't really seem like the same use case.
agreed - we actually have a dependency system in the works too!
you can declare ordering with a dependency specification on the edges of the graph (i.e. A must run before B, but B can run as often as you'd like within 10 mins of A)
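A sketch of how such an edge constraint might be checked (names and API are invented for illustration, not Atuin's actual design): B is runnable only if A completed within the last 10 minutes:

```python
import time

class DependencyError(Exception):
    pass

def check_edge(last_run, upstream, window_s, now=None):
    """Allow a block to run only if `upstream` ran within `window_s` seconds.

    `last_run` maps block name -> timestamp of last successful run.
    The downstream block may run any number of times inside the window.
    """
    now = time.time() if now is None else now
    ran_at = last_run.get(upstream)
    if ran_at is None:
        raise DependencyError("%s has never run" % upstream)
    if now - ran_at > window_s:
        raise DependencyError("%s ran too long ago; re-run it first" % upstream)
```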
There of course should be a way to override the dependency, by explicitly pressing a big scary "[I know what I'm doing]" button.
Another thing is that you'll need branches. As in:
- Run `foo bar baz`
- If it succeeds, run `foo quux`,
Else run `rm -rf ./foo/bar` and rerun the previous command with the `--force` option.
- `ls ./foo/bar/buur` and make certain it exists.
Different branches can be separated visually; one can be collapsed if another is taken.
Writing robust runbooks is not that easy. But I love the idea of mixing the explanatory text and various types of commands together.
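The branch above can be sketched in Python; the commands are the hypothetical ones from the list, so treat this purely as the control-flow structure a runbook tool would need:

```python
import subprocess

def sh(cmd):
    """Run a shell command, returning True on success (exit code 0)."""
    return subprocess.run(cmd, shell=True).returncode == 0

def runbook():
    if sh("foo bar baz"):
        sh("foo quux")                      # success branch
    else:
        sh("rm -rf ./foo/bar")              # recovery branch
        if not sh("foo bar baz --force"):   # rerun previous command with --force
            raise RuntimeError("even --force failed")
    if not sh("ls ./foo/bar/buur"):
        raise RuntimeError("./foo/bar/buur does not exist")
```

A visual tool could render each `if` arm as a collapsible branch, as suggested above.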
The use case this addresses is 'adhoc activities must be performed without being totally chaotic'.
Obviously a nice one-click/trigger based CI/CD deployment pipeline is lovely, but uh, this is the real world. There are plenty of cases where that's simply either not possible, or not worth the effort to setup.
I think this is great; if I have one suggestion it would just be integrated logging so there's an immutable shared record of what was actually done as well. I would love to be able to see that Bob started the 'recover user profile because db sync error' runbook but didn't finish running it, and exactly when that happened.
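The immutable shared record being suggested could be as simple as an append-only JSONL audit log per runbook run. A sketch (my own illustration, not anything Atuin actually ships):

```python
import json, time

def log_event(path, runbook, user, step, status):
    """Append one audit record; append-only JSONL, so history is never rewritten."""
    record = {
        "ts": time.time(),
        "runbook": runbook,
        "user": user,
        "step": step,
        "status": status,  # e.g. "started", "finished", "abandoned"
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")

def history(path):
    """Read back the full audit trail, oldest first."""
    with open(path) as f:
        return [json.loads(line) for line in f]
```

That's enough to answer "Bob started the runbook but didn't finish, and exactly when".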
If you think it's a terrible idea, then uh, what's your suggestion?
I'm pretty tired of copy-pasting commands from confluence. I think that's, I dunno, unambiguously terrible, and depressingly common.
One-time scripts executed in a privileged remote container also work, but at the end of the day, those scripts tend to be specific and have to be invoked with custom arguments, which, guess what, usually turn up as a sequence of operations in a runbook: query db for user id (copy-paste SQL) -> run script with id (copy-paste to terminal) -> query db to check it worked (copy-paste SQL) -> trigger notification workflow with user id if it did (log in to X and click on button Y), etc.
I'm not against this notebook style, I have runbooks in Jupyter notebooks.
I just think it's pretty easy to do things like start a flow back up halfway through the book and not fix some underlying ordering issues.
With scripts that you tend to have to run top to bottom you end up having to be more diligent with making sure the initial steps are still OK because on every test you tend to run everything. Notebook style environments favor running things piecemeal. Also very helpful! It introduces a much smaller problem in the process of solving the larger issue of making it easier to do this kind of work in the first place.
Agreed. The problem with reproducing Jupyter notebooks in academia is that someone thought a Jupyter notebook is a way to convey information from one person to another. They're an awful model for that.
As an on-the-fly debugging tool, they're great: you get a REPL that isn't actively painful to use, a history (roughly, since the state is live and every cell is not run every time) of commands run, and visualization at key points in the program to check as you go your assumptions are sound.
Literate programming really needs the ability to reorder, otherwise it’s just sparkling notebooks. (Except for Haskell, which is order-independent enough as it is that the distinction rarely matters.)
“A reactive notebook for Python — run reproducible experiments, query with SQL, execute as a script, deploy as an app, and version with git. *All in a modern, AI-native editor.*”
Why does it need to be in a “modern, AI-native editor”?
For anyone emacs-curious, you can do a similar thing with org-babel
You can have a plaintext file which is also the program which is also the documentation/notebook/website/etc. It's extremely powerful, and is a compelling example of literate programming.
A good take on it here: https://osem.seagl.org/conferences/seagl2019/program/proposa...
Actually, in terms of capabilities, org-babel is among the most capable systems for literate programming, if not the most capable. I have used it to great effect when learning from programming books: I can now go back to those literate programs and understand them again much faster than when originally reading the books. The literate part answers my "silly" questions, the ones that come from not remembering 100% of the reasoning or my own thoughts. That said, there is of course a learning curve, and people unwilling to learn something like that are better off not going that route.
Thanks for the shout-out! I think org-babel is really well suited for this task, and can make some really great documentation. You can check out the video[0] from the talk and a git repo[1] with a more advanced demonstration.
[0]: https://www.youtube.com/watch?v=0g9BcZvQbXU
[1]: https://gitlab.com/spudlyo/orgdemo2
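For the org-curious, here is roughly what that plaintext file looks like: prose and a runnable source block side by side, where `C-c C-c` on the block executes it and inserts the results back into the file (the command and its output below are illustrative, not from the linked demo):

```org
* Check disk usage on the web host
  Some prose explaining why we care about /var/log here.

  #+begin_src shell :results output
  df -h /var/log | tail -n 1
  #+end_src

  #+RESULTS:
  : /dev/sda1  20G  4.2G  15G  23% /var
```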
Similar with BBEdit's Shell Worksheets, which mingle prose with commands you can run with a keypress.
I took a stab at this ~7 years ago - https://nurtch.com/
The idea has a lot of merit. We even gave a talk about it in JupyterCon Paris 2023 - https://www.youtube.com/watch?v=TUYY2kHrTzs
When you have executable code in the documentation, folks want to follow PR-review workflow with the docs as well - which is a bit more team investment than editing a wiki.
Good luck!
My first thought was also "why not jupyter"? Nice to see someone else had the same thought!
This is exactly what I wanted for our team when I was at AWS. There are so many versions of operations which are just slightly too dangerous to automate, and this provides a path to iteratively building that up. Congratulations!
Preface: My opinions are my own and not my employer’s.
Curious, how long ago were you at AWS? For context, I spent the last few years at AWS working on an internal platform service whose entire purpose was to reduce operational toil by helping you codify your operational runbooks and execute them safely and automatically. Atuin Desktop is similar to that service in some sense, but that service offered many more features.
When I was at Amazon (pre covid), Eider could've been used for that.
(Hosted notebooks with IAM integration.)
If it's local-first then it's already subject to rot. Unless they're running it all in containers? In which case local doesn't matter.
If you want to record a runbook, then record a runbook. You can do that a million ways. Text file, confluence doc, screen recording, shell script, etc. People already don't do that; they're not gonna suddenly start doing it more because your UI is fancier.
Personally, I don't want to sit around all day writing code (or docs) to try to get the system to be like X state. I want to manually make it have X state, and then run a tool to dump the state, and later re-run the tool to create (or enforce) that state again. I do not want to write code to try to tell the computer how to get to that state. Nor do I want to write "declarative configuration", which is just more code with a different name. I want to do the thing manually, then snapshot it, then replay it. And I want this to work on any system, anywhere, without dependence on monitoring a Bash shell for commands or something. Just dump state and later reapply state.
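The dump-then-reapply idea can be sketched for one narrow kind of state, file contents under a directory; real system state is much wider (packages, services, permissions), but the shape is the same:

```python
import json, os

def dump_state(root):
    """Snapshot: map of relative path -> file contents for everything under root."""
    state = {}
    for dirpath, _, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            with open(full) as f:
                state[rel] = f.read()
    return state

def apply_state(root, state):
    """Replay: recreate every file from the snapshot (missing files reappear)."""
    for rel, contents in state.items():
        full = os.path.join(root, rel)
        os.makedirs(os.path.dirname(full), exist_ok=True)
        with open(full, "w") as f:
            f.write(contents)
```

The snapshot is just data (serializable with `json.dumps`), so "enforce that state again" is re-running `apply_state`.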
So you then have binary blobs of state without any documentation of how or why it is the way it is? That doesn't seem maintainable.
Dockerfiles are basically this, but with a file documenting the different steps you took to get to that state.
That’s not what they are saying. They are saying that a system where you have to declare everything manually is annoying (which it is); ideally it would record the changes as you make them, then deduplicate them and remove unnecessary ones to arrive at a final playbook that can be replayed if needed.
yes it would be nice to have a computer that could read your mind flawlessly.
Sounds like you want autoexpect!
https://linux.die.net/man/1/autoexpect
Such a process is rarely portable though, and will need to be repeated for each different system, at which point it would be great to already have a declarative description, that can automatically be translated into those steps required to get to state X.
> If it's local-first then it's already subject to rot.
Can you expand on this?
That was the Docker manifesto.
> That was the Docker manifesto.
It essentially still is.
Unless the Dockerfiles are kept secret, any container can be replicated from the given Dockerfile. Barring extreme (distro/system/hardware)-level quirks, a Docker container should be able to run anywhere that Linux can.
You are mixing build-time reproduction with run-time reproduction.
Docker images (not files) help with run-time consistency.
Dockerfiles barely scratch the surface of build reproducibility. Most applications depend on the distribution package manager (apt, apk, etc.) and a language package manager (npm, cargo, etc.), and both have various challenges in consistent dependency resolution.
In addition, build steps might have ordering challenges, RPC calls to remote services that are no longer running, and so on.
Anyone trying to build a Docker image from 10 years back experiences this problem.
You're right in the absolute form, but I've yet to see a Dockerfile where (with a little thinking and elbow grease) I couldn't "easily" port it or update it, even after years.
It's basically the best and easiest "I am documenting how it works now" thing without any arcane "works on my machine" quirks I have yet found.
So I'm still agreeing here that it's a very good approximation of this idea.
Real reproducibility is miles better, but usually cannot be formulated in a ~20-line single-file "recipe". (And before anyone mentions Nix: no, there's so much inherent complexity involved that it doesn't count as simple the way "apt-get install docker && docker build ." does.)
A container can very rarely be reproduced by a dockerfile.
I imagine with a lot of discipline (no apt update, no "latest" tag, no internet access) you can make a reproducible Dockerfile… but it is far from normal.
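One way to keep that discipline honest is to lint Dockerfiles for unpinned base images. A rough sketch (the check is illustrative and far from exhaustive; it only looks for digest pins on FROM lines):

```python
import re

def unpinned_images(dockerfile_text):
    """Return the base images on FROM lines that aren't pinned to a digest."""
    bad = []
    for line in dockerfile_text.splitlines():
        m = re.match(r"\s*FROM\s+(\S+)", line, re.IGNORECASE)
        if m and "@sha256:" not in m.group(1):
            bad.append(m.group(1))
    return bad

# 'FROM ubuntu:latest' gets flagged; 'FROM ubuntu@sha256:...' passes.
```

A real linter would also need to handle build args in FROM lines, multi-stage aliases, and pinning inside RUN steps, but even this much catches the most common source of drift.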
Well sure, making a 100% reproducible build is hard - but Docker makes it easier, not harder. If 100% reproducible is the goal, what's easier than docker?
A Dockerfile is essentially a shell script with access to the outside world. It has unconstrained network access. It can access local hardware and filesystem if instructed to. However, it doesn't verify that whatever stuff it took from the outside remains the same across builds. Docker doesn't care if the same Dockerfile builds Apache httpd in one build and Nginx in another. It literally can't get more irreproducible than that.
But mysteriously, people say that Docker is reproducible because, uh, you can download gigabyte-sized binary blobs from the Docker registry. I wonder, what's not reproducible by that metric?
Docker images may be portable compared to binaries targeting traditional FHS distros. But it's not reproducible whatsoever.
Full reproducibility isn't easy, there is a cost to it.
However the payoff is rather significant so if you can temper that cost a bit and make it less inconvenient to achieve then you have a winning solution.
I have cooked this up based on Bazel, rules_oci and rules_distroless: https://github.com/josephglanville/images Specifically this file is a busybox based image with some utilities included from a Debian snapshot: https://github.com/josephglanville/images/blob/master/toolbo...
More difficult than Dockerfile? Sure. However better in pretty much every way otherwise including actual simplicity.
Vagrant creates reproducible VMs. Not quite the same thing of course.
https://developer.hashicorp.com/vagrant
Tools that have been designed with reproducibility in mind. Like Guix.
Beware, I am definitely not claiming those are easy to use in general. Just that you can get to reproducibility using them more reliably and maybe easier than with docker.
> but Docker makes it easier, not harder
Incorrect. Step one of reproducibility is "disable unconstrained downloading from the internet". Docker does the opposite.
Presumably if your goal is a reproducible build you just wouldn't do any unconstrained downloading in the process of designing the Dockerfile and building the image. Making a choice to use a tool poorly for your requirements isn't a problem with the tool.
It kind of sounds like you're describing Ansible. You use modules for common tasks like ensuring a package is installed, a file is present or has certain content, etc. It's declarative and idempotent.
I've written some fairly complex stuff in Ansible. It is mostly declarative but you should be careful with assumptions about its idempotency, especially if you reach out for community modules.
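For anyone unfamiliar, the core idea behind those modules is convergence: check current state first, change only on a diff, and report whether anything changed. A toy "file content" module might look like this (names are mine, not Ansible's API):

```python
import os

def ensure_file(path, content):
    """Converge a file to the desired content; return True if anything changed."""
    current = None
    if os.path.exists(path):
        with open(path) as f:
            current = f.read()
    if current == content:
        return False  # already converged: no-op
    with open(path, "w") as f:
        f.write(content)
    return True       # changed something
```

Running it twice in a row is safe by construction: the second run is a no-op. The caveat upthread is that community modules don't always hold this property as strictly as the core ones.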
what happens when you want to tweak something you did in the middle of this process? do you have to go through the whole flow again manually to make a single change?
I imagine you could either A) just modify the dumped state, B) parameterize it, or C) have the program split the state into transactions and modify those. The program will probably have to take more than one step, in order, to accomplish everything. If it fails, you'd want it to try to undo its work, a la transactions. And since it can do all that, it can stop, start, or resume at specific steps.
Like, Terraform has always sucked because there was no way to dump existing resources as new code. So a team at Google made a tool to do it (Terraformer). If Terraform had already had that feature, and if it didn't rely on having pre-existing state to manage resources, that would be like 90% of the way to what I'd want. Just dump resources as code, then let me re-run the code, and if I want I can modify the code to ask me for inputs or change things. (People think of Terraform as only working on Cloud resources, but you could, for example, make an Ubuntu Linux provider that just configures Ubuntu for you, if you wanted.)
Any notion of state that satisfies requirements like
> Just dump state and later reapply state
is necessarily declarative.
> Just dump resources as code,
What is the code for this resource?
It depends on the current state of the system where the resource is applied. If VM foo1 already exists, with 16GiB of memory, and connected to network mynet1, then the code is a no-op, no code at all. Right? Anything else would be a mistake. For example, if the code would delete any matching VM and re-create it, that would be disastrous to continuity and availability, clearly a non-starter. Or, if VM foo1 exists, with 16GiB of memory, but connected to anothernet3, then the code should just change the network for that VM from anothernet3 to mynet1, and should definitely not destroy and re-create the VM entirely. And so on.

It depends what you're talking about; Terraform specifically has a flawed model where it assumes nothing in the world exists that it didn't create itself. Other configuration management tools don't assume that; they assume that you just want an item to exist: if it does exist, great; if it doesn't, you create it. But for a moment I'll assume you're talking about the other problem with configuration management tools, which is "which of the existing resources do I actually want to exist or modify?"
That's a solved problem. Anything that you use on a computer that controls a resource, can uniquely identify said resource, through either a key or composite key. This has to be the case, otherwise you could create things that you could never find again :) (Even if you created an array of things with no name, since it exists as an item in a list, the list index is its unique identifier)
Taking Terraform as example again, the provider has code in it that specifies what the unique identifier is, per-resource. It might be a single key (like 'id', 'ASN', 'Name', etc) or a composite key ( {'id' + 'VPC' + 'Region'} ).
If the code you've dumped does not have the unique identifier for some reason, then the provider has to make a decision: either try to look up existing resources that match what you've provided and assume the closest one is the right one, or error out that the unique identifier is missing. Usually the unique identifier is not hard to look up in the first place (yours has a composite identifier: {VM:"foo1", Network:"mynet1"}). But it's also (usually) not fool-proof.
Imagine a filesystem. You actually have two unique identifiers: the fully-qualified file path, and the inode number. The inode number is the actual unique identifier in the filesystem, but we don't tend to reference it, as 1) it's not that easy to remember/recognize an inode number, 2) it can be recycled for another file, 3) it'll change across filesystems. We instead reference the file path. But file paths are subtly complex: we have sym-links, hard-links and bind-mounts, so two different paths can actually lead to the same file, or different files! On top of that, you can remove the file and then create an identically-named file. Even if the file had identical contents, removing it and creating a new one is technically a whole new resource, and has impact on the system (permissions may be different, open filehandles to deleted files are a thing, etc).
So what all of us do, all day, every day, is lie to ourselves. We pretend we can recognize files, that we have a unique identifier for them. But actually we don't. What we do is use a composite index and guess. We say, "well it looks like the right file, because it's in the right file path, with the right size, and right name, and right permissions, and (maybe) has the right inode". But actually there's no way to know for sure it's the same file we expect. We just hope it is. If it looks good enough, we go with it.
So that's how you automate managing resources. For each type of resource, you use whatever you can as a unique (or composite) identifier, guesstimate, and prompt the user if it's impossible to get a good enough guess. Because that's how humans do it anyway.
How is this different from a local Jupyter notebook? Can we not do this with ! or % in a .ipynb?
Genuine question. Not familiar with this company or the CLI product.
The main thing that keeps me from using Jupyter notebooks for anything that's not entirely Python, is Python.
For me, pipenv/pyenv/conda/poetry/uv/dependencies.txt and the inevitable "I need to upgrade Python to run this notebook, ugh, well, ok" -- two weeks later -- "g####m, that upgrade broke that unrelated and old Ansible and now I cannot fix these fifteen barely-held-up servers" is pure hell.
I try to stay away from Python for foundational stuff, as any Python project that I work on¹ will break at least yearly on some dependency or other runtime woe. That goes for Ansible, Build Pipelines, deploy.py or any such thing. I would certainly not use Jupyter notebooks for such crucial and foundational automation, as the giant tree of dependencies and requirements it comes with, makes this far worse.
¹ Granted, my job makes me work on an excessive number of codebases: at least six different Python projects in the last two months, some requiring Python 2.7, some requiring deprecated versions of lib-something.h, some cutting edge, some very strict in practice but not documented (it works on the machine of the one dev that works on it, as long as he never updates anything?). And Puppet or Chef, being Ruby, are just as bad, suffering from the exact same issues, only that Ruby has had one (and only one!) package management system for decades now.
100% same question.
Usually, I feel like Jupyter gives both worlds: flexible scripting and support for OS commands (either through !/% or even os.system()).
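For what it's worth, you don't even need the magics: a plain cell can shell out through the subprocess module. A small helper along these lines (the helper itself is just a sketch, not a Jupyter API):

```python
import subprocess

def run(cmd):
    """Run a shell command and return (exit code, stdout), like a `!cmd` cell."""
    proc = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return proc.returncode, proc.stdout

# code, out = run("df -h | head -5")
```

Unlike `!`, this gives you the exit code and output as ordinary Python values, which matters when a runbook step needs to branch on whether the previous command succeeded.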
Jupyter Notebooks have always felt a bit hacky for terminal purposes to me, so I'm excited to give this a shot.
How about marimo?
Looks interesting!
We recently started using https://marimo.io/ as a replacement for Jupyter notebooks, as it has a number of great improvements, and this seems like a movement in a similar direction.
This looks super similar to https://runme.dev
This is amazing!
Exactly what I was looking for, thanks!
I can't say I see the point in this. Can someone explain what I'm missing? Why would I use this over a simple shell script?
My experience with runbooks has been:
- I am on a team that oversees a bunch of stuff, some of which I am very hands-on with and comfortable with, and some of which I am vaguely aware exists, but rarely touch
- X, a member of the latter category, breaks
- Everyone who actually knows about X is on vacation/dead/in a meeting
- Fortunately, there is a document that explains what to do in this situation
- It is somehow both obsolete and wrong, a true miracle of bad info
So that is the problem this is trying to solve.
Having discussed this with the creator some[1], the intent here (as I understand it) is to build something like a cross between Jupyter Notebooks and Ansible Tower: documentation, scripts, and metrics that all live next to each other in a way that makes it easier to know what's wrong, how to fix it, and if the fix worked
[1]Disclosure: I help mod the atuin Discord
If the fix/solution would be easily describable and automate-able, it wouldn't/shouldn't be a problem anyway. I don't see how this solves anything.
It shouldn't but often still is... and maybe a runbook like this is easier to handle than a script with possibly 1000 lines and not a single comment. Of course, in your ideal world maybe nothing of this applies and you never have any incidents ;)
> It is somehow both obsolete and wrong, a true miracle of bad info
How does Atuin solve that problem? It seems to me that inaccurate and obsolete information can be in an Atuin document as easily as in a text document, wiki, etc., but possibly I'm not seeing something?
I'm just a community mod, not a dev on the project, so take this with a grain of salt:
I believe the intent is that you get bidirectional selective sync between your terminal and the docs, so that if what's in the docs is out of date or wrong, then whatever you did to actually fix things can be synced back to the docs to reduce the friction of keeping the docs updated.
Thanks for this explanation. This makes sense.
To me, it seems like it's because the thing you're fixing is actually the "runbook" that's being run. Instead of separating the documentation from the code, they're married together so it's easier to keep them in sync because you aren't having to remind yourself to go edit this secondary location when you make a quick change.
I'm cautiously curious about something like this, although I haven't tried it personally.
Yes, it seems like right now the pendulum is swinging the other way: separation is no longer in fashion, and the fashionable thing now is to have everything in one place.
The idea seems interesting to me just because I do not really like terminals, and having something more visually appealing with better history and comments is an improvement, though I am also not sure if Atuin is the best way to achieve all of that.
Ok, I think I see where this is coming from. Seeing your description, I actually think it might even be a benefit to non-technical people with no knowledge of what's going on. They can follow instructions and easily execute the relevant code, what with it all sitting together.
However I don't see how it solves the obsolete or wrong documentation thing. You still have to make sure the runbook is correct, if it's not you've got the exact same problem.
Having a centralised place for all your scripts is an advantage with inline docs. But then this is a local desktop version...
Seems like this is literate programming for shell scripts.
Thus “Runbooks That Run.”
Because it's written in Rust and this is Hacker News.
I was going to talk about using PowerShell, but to stay on the Rust theme: I also really like Nushell. I personally would take either one over this...
Well, what is the purpose of deployments being built in ansible or deployer or whatever tooling as a general rule? And then packaging, say, extra python scripts to perform common tasks then dumping it all in a git repo?
Some people just like a particular workflow or tooling flow and build it really. Maybe it works for enough people to have a viable market, maybe not.
I am just using a PHP deployment process for no reason other than feeling like it for personal projects and it handles 60% of the work without me needing to do anything. But any runbooks for it are tasks built into the tool and in the same git repo for the entire server deployment. I'm not gonna put it in some random place or a shell script that I need to remember separate commands for.
Code, for programmers, is inherently self-documenting if you keep a simple functional style without unnecessary complexity, with comments on the occasional section that isn't just "create a MySQL user, roll the MySQL user's password, update the related services with the new password/user combination, remove the old user that the fired employee has credentials to on the off chance we failed to block them at the VPN" kind of stuff.
Will this be open source like Atuin CLI and the sync server are? Is this going to be productized?
It'll be Open Source'd: https://news.ycombinator.com/item?id=43766200#43766584
Are you worried about getting rug pulled by the platform?
Most likely not free. Regardless, happy to see this be announced!
My dream tooling is for every tool to have a terminal interface so that I can create comprehensive megabooks to get all the context that lives in my head, e.g. Jira, Datadog, GitHub, etc., all in one pane.
IMHO just an API would be enough; the tool could be written on top of that. My ideal world would be for every service, tool, and application to have an API that I can use, e.g. if the fridge is open too long (API polling or a webhook), I can send the roomba to close it (using the roomba's API). Because why not?!
World of API...
+1. Personally, I’m a fan of TUIs too that make things a bit more user friendly. Just imagine an internal TUI framework that has components for each internal service that you can lego-build into personalised TUI dashboard. Hmm, seems like something I could work on the side at work. Would be a huge undertaking but very interesting.
Not sure what you mean- github and datadog already have official CLI tools.
Jira has official CLI as well https://appfire.atlassian.net/wiki/spaces/JCLI/overview
and plenty of unofficial ones
Maybe something like wtfutil? (Although wtf development has been stuck for a year, but I guess that's the general idea...)
https://wtfutil.com/
You might like MCP then.
> You might like MCP then.
That's entirely different to what's being desired by GP.
> > My dream tooling is for every tool to have an terminal interface so that I can create comprehensive megabooks to get all the context that lives in my head. i.e. jira, datadog, github etc, all in one pane.
My perspective on this is essentially having jira/datadog/github/etc be pluggable into the CLI, and where standard bash commands & pipes can be used without major restrictions. (Something akin to Yahoo Pipes)
MCP is highly centered around LLMs analyzing user requests & creating queries to be run on MCP servers. What's being desired here doesn't centralize around LLMs in any sense at all.
It’s actually not too far off. Yes, MCP is designed for LLM interactions, but we observed that it’s an invocation API that’s pretty generic. So we built a package format that encapsulates computations and makes them accessible from any of MCP, REST, or JSON-RPC over WS (the generic cousin of MCP).
We build logic once and make it automatically accessible from any of these consumption methods, in a standardized way to our clients, and I am indeed piping some of these directly in the CLI to jq and others for analysis.
[dead]
It's kind of sad the direction they took. The last thing I want is my runbooks being held hostage by my desktop with proprietary and possibly paid software.
Congratulations on the launch! I've been following Atuin for a bit and, while I'm not necessarily the intended audience for this runbook feature, love seeing people build fun new things.
Our team used polyglot notebooks https://marketplace.visualstudio.com/items?itemName=ms-dotne...
Using C# as the main language, this allowed us to have runbooks using code shared as a NuGet package, and so we were able to interact with our own APIs and applications like any other code that runs in production.
Not the best experience to review but it worked for us.
Have been following along with the development, glad to see it announced!
Looks neat. What tech stack is used for this? Is it open source by chance?
Thanks! We're using Tauri (https://v2.tauri.app/) on the client, and Elixir + Phoenix (with a little bit of Rust via Rustler) on the server
Tauri means we can reuse a lot of the Rust we already have, easily do the systems stuff we need, and have something light + fast. Elixir has been awesome and makes a realtime sync backend easier
Not currently open source while it's under heavy early development, we will be opening up the desktop app later on
Are there any plans to add an integration to something like Phoenix LiveBook?
> we will be opening up the desktop app later on
This leaves room for stuff like the Functional Software License.
Amazing. I'm very happy this is not yet another Electron app.
Tauri wraps around the system's web view, so it's semantically equivalent to Electron.
(nb: system web views are very inconsistent, so they're considering adding a Chromium renderer, which will bring everything full circle)
This is one place where it would be more likely to make sense to have an electron app, because with user code, you'd already have a lot of variables out of your control, and having a standard browser engine would help. Also unlike other apps, you hopefully wouldn't have 5 code notebook apps running.
It is bothersome to see people who obviously don’t believe in free software ideology and software freedoms (otherwise you would never produce nonfree software) (ab)using the open source community in this way.
Software freedoms exist as a concept for a reason, not just a bullet point to get people to click a download link that doesn’t even include source anyway.
I call such projects “open source cosplay”. It’s an outfit you put on for conferences, then take off when back at the office working on the nonfree valuable parts.
Atuin's CLI for shell history is open source, has been free for years, and is a very useful tool. If the author now wants to build a product on top so she can make a living, that's a win for everyone: the author, the open source users (since the project will keep being maintained), and people who get value out of the new product she's building.
The irony of this purist mindset is that it's actually very corporatist, big-tech, and proprietary in its implications. If open source devs are discouraged by the culture from building products and making a living independently, it means that the only people who can devote significant time to open source are employees of established companies (who themselves often sell closed source proprietary products) and people who are wealthy enough to work for free. Is that the world you want?
This kind of attitude is why less and less people are open sourcing software
Why would I waste my time releasing any of my projects for free when people will attack me and call me a poser anyway
Might as well charge people money, who by the way will actually be grateful to do so, that try to keep up with the open source community's purity treadmill
Do you want it to be open source because of the price or because you’re afraid of being rug pulled by the platform or you want to contribute?
If I use something I like the idea that I can fix bugs should the need arise.
I'm really confused by products like this and Warp Drive[0]. What does this add over a shell script?
There is a response elsewhere in comments[1] which claims that this is trying to fix the problem of bad documentation, but this has the same fundamental problem. If you a) are responsible for fixing something, b) are unfamiliar with it, and c) the "fixing resources" - whether those are scripts, documentation, or a Runbook/Workflow - you were provided with by the experts are out-of-date; you're SOL and are going to have to get to investigating _anyway_. A runbook and a script are just different points along the spectrum of "how much of this is automated and how much do I have to copy-paste myself?"[2] - both are vulnerable to accuracy-rot.
[0]: https://www.warp.dev/warp-drive
[1]: https://news.ycombinator.com/item?id=43766842
[2]: https://blog.danslimmon.com/2019/07/15/do-nothing-scripting-...
> I'm really confused by products like this and Warp Drive[0]. What does this add over a shell script?
Because everything is a start-up now.
Oh, that's really neat! Thanks for sharing!
This sort of slogan says nothing about what actually makes it worth looking into.
What more do you need than "written in Rust"?
This makes me think of using org mode to build runbooks.
Cool name, a reference to well known books
The waitlist's "share on social media to jump the list" mechanic is kinda sus; regardless, joined the waitlist.
This looks so dope!
Kinda related but just the other day I was thinking of the notebook/runbook workflow and wonder if there is a tool like this that also incorporates git checkpoints (either commit or stash) into it. Like top to bottom, associate all the blocks and resulting artifacts with a commit hash. Might be something to vibe code over the weekend.
All the problems of reproducibility in Python notebooks (https://arxiv.org/abs/2308.07333, https://leomurta.github.io/papers/pimentel2019a.pdf) with the power of a terminal.
"Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something."
https://news.ycombinator.com/newsguidelines.html
what are the problems you're talking about? your references seem to refer to reproducing scientific publications, dependency issues, and cell execution ordering.
this project appears to be intended for operational documentation / living runbooks. it doesn't really seem like the same use case.
I mean it feels pretty obvious to me that cell execution order is a pretty real issue for a runbook with a bunch of steps if you're not careful.
I do think that given the fragile nature of shell scripts people tend to write their operation workflows in a pretty idempotent way, though...
agreed - we actually have a dependency system in the works too!
you can define and declare ordering with dependency specification on the edges of the graph (i.e. A must run before B, but B can run as often as you'd like within 10 mins of A)
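To make that concrete, here's a tiny sketch of that kind of freshness-window edge. The API is invented for illustration and isn't Atuin's actual design:

```python
import time

class Runbook:
    """Track block runs and enforce 'B needs A within N seconds' edges."""

    def __init__(self):
        self.last_run = {}  # block name -> timestamp of last run
        self.deps = {}      # block name -> [(prereq, max_age_seconds)]

    def depends(self, block, prereq, max_age):
        self.deps.setdefault(block, []).append((prereq, max_age))

    def can_run(self, block, now=None):
        now = time.time() if now is None else now
        for prereq, max_age in self.deps.get(block, []):
            ran = self.last_run.get(prereq)
            if ran is None or now - ran > max_age:
                return False  # prereq never ran, or its run is too stale
        return True

    def mark_ran(self, block, now=None):
        self.last_run[block] = time.time() if now is None else now
```

The "big scary override button" mentioned below would just be a caller ignoring `can_run` and calling `mark_ran` anyway, with the refusal logged.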
There of course should be a way to override the dependency, by explicitly pressing a big scary "[I know what I'm doing]" button.
Another thing is that you'll need branches. As in:
Different branches can be separated visually; one can be collapsed if another is taken.

Writing robust runbooks is not that easy. But I love the idea of mixing the explanatory text and various types of commands together.
I mean, is it worse than having it:
- in excel
- in a confluence document
- in a text file on your desktop
The use case this addresses is 'ad-hoc activities must be performed without being totally chaotic'.
Obviously a nice one-click/trigger based CI/CD deployment pipeline is lovely, but uh, this is the real world. There are plenty of cases where that's simply either not possible, or not worth the effort to setup.
I think this is great; if I have one suggestion it would just be integrated logging so there's an immutable shared record of what was actually done as well. I would love to be able to see that Bob started the 'recover user profile because db sync error' runbook but didn't finish running it, and exactly when that happened.
If you think it's a terrible idea, then uh, what's your suggestion?
I'm pretty tired of copy-pasting commands from confluence. I think that's, I dunno, unambiguously terrible, and depressingly common.
One-time scripts that are executed in a privileged remote container also work, but at the end of the day, those scripts tend to be specific and have to be invoked with custom arguments, which, guess what, usually turn up as a sequence of operations in a runbook: query the db for a user id (copy-paste SQL) -> run the script with the id (copy-paste to terminal) -> query the db to check it worked (copy-paste SQL) -> trigger the notification workflow with the user id if it did (log in to X and click on button Y), etc.
I'm not against this notebook style, I have runbooks in Jupyter notebooks.
I just think it's pretty easy to do things like start a flow back up halfway through the book and not fix some underlying ordering issues.
With scripts that you tend to have to run top to bottom you end up having to be more diligent with making sure the initial steps are still OK because on every test you tend to run everything. Notebook style environments favor running things piecemeal. Also very helpful! It introduces a much smaller problem in the process of solving the larger issue of making it easier to do this kind of work in the first place.
Agreed. The problem with reproducing Jupyter runbooks in academia is that someone thought a Jupyter runbook is a way to convey information from one person to another. Those are an awful model for that.
As an on-the-fly debugging tool, they're great: you get a REPL that isn't actively painful to use, a history (roughly, since the state is live and every cell is not run every time) of commands run, and visualization at key points in the program to check as you go your assumptions are sound.
This is more like literate programming (but for shells) than jupyter notebooks.
Literate programming really needs the ability to reorder, otherwise it’s just sparkling notebooks. (Except for Haskell, which is order-independent enough as it is that the distinction rarely matters.)
Give marimo a try, it's much better for reproducibility.
linky https://github.com/marimo-team/marimo#:~:text=all%20in%20a%2... (Apache 2)
From their repo:
“A reactive notebook for Python — run reproducible experiments, query with SQL, execute as a script, deploy as an app, and version with git. *All in a modern, AI-native editor.*
Why does it need to be in a “modern, AI-native editor”?
(Closing tab, flashing marimo out of brain)
[dead]