In the context of traditional SaaS, use dynamic secrets loaded at runtime (KMS+Dynamo, etc.).
For agentic tools and pure agents, a proxy is the safest approach. The agent can even think it has a real API key, but said key is worthless outside of the proxy setting.
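A minimal sketch of the proxy idea, assuming a hypothetical placeholder key and upstream host (neither is from this thread): the agent only ever holds the placeholder, and the proxy swaps in the real key just before forwarding, and only for an allow-listed host.

```python
# Hypothetical values, for illustration only.
PLACEHOLDER_KEY = "sk-agent-placeholder"
ALLOWED_HOSTS = {"api.example.com"}


def rewrite_headers(host: str, headers: dict[str, str], real_key: str) -> dict[str, str]:
    """Return the headers the proxy should forward upstream.

    The real key is injected only when the request targets an allow-listed
    host and carries the placeholder; anything else is rejected.
    """
    if host not in ALLOWED_HOSTS:
        raise PermissionError(f"host not allowed: {host}")
    if headers.get("Authorization") != f"Bearer {PLACEHOLDER_KEY}":
        raise PermissionError("request does not carry the expected placeholder key")
    forwarded = dict(headers)  # copy so the agent-side headers stay untouched
    forwarded["Authorization"] = f"Bearer {real_key}"
    return forwarded
```

If the placeholder leaks, it is useless anywhere except through this proxy, which is the whole point.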
It surprises me how often I see some Dockerfile, Helm, Kubernetes, Ansible, etc. write .env files to disk in some production-like environment.
The OS, especially Linux (the most common for hosting production software), is perfectly capable of setting and providing ENV vars. Almost all common devops and older sysadmin tooling can set ENV vars. There is really no need to ever write these to disk.
I think this comes from unaware developers that think a .env file, and runtime logic that reads this file (dotenv libs) in the app are required for this to work.
I certainly see this misconception a lot with (junior) developers working on Windows.
- You don't need dotenv libraries searching for files, parsing them, etc. in your app's runtime. Please just leave it to the OS to provide the ENV vars and read those in your app.
- Yes, also on your development machine. Plenty of tools from direnv to the bazillion "dotenv" runners will do this for you. But even those aren't required, you could just set env vars in .bashrc, /etc/environment (Don't put them there, though) etc.
- Yes, even on Windows there are plenty of options, even when developers refuse to or cannot use WSL. Various tools, but in the end, just `set foo=bar`.
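To illustrate the points above, here is roughly what "just read the ENV vars in your app" looks like, with no dotenv library involved. The variable names `DATABASE_URL` and `LOG_LEVEL` are made up for this sketch.

```python
import os


def load_config(env=os.environ):
    """Read settings straight from the process environment; no file parsing."""
    try:
        database_url = env["DATABASE_URL"]  # required: fail fast if missing
    except KeyError:
        raise RuntimeError("DATABASE_URL is not set") from None
    return {
        "database_url": database_url,
        "log_level": env.get("LOG_LEVEL", "INFO"),  # optional, with a default
    }
```

Whatever launches the process (systemd, Docker, direnv, a plain shell) is responsible for setting the variables; the app never looks for a file.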
Environment variables are -by far- the securest AND most practical way to provide configuration and secrets to apps.
Any other way is less secure: files on disk, (CLI) arguments, a database, etc. Or about as secure but far more complex and convoluted. I've seen enterprise hosting with a (virtual) mount (NFS, etc.) that provides config files, read-only, with tight permissions, served from a secure vault. That is a lot of indirection for getting secrets into an app that will still just read them as plain text. More secure than env vars? How?
Or some encrypted database/vault that the app can read from using - a shared secret provided as env var or on-disk config file.
Disagree, the best way to pass secrets is by using mount namespaces (systemd and docker do this under /run/secrets/) so that the program can access the secrets as needed but they don't exist in the environment. The process is not complicated, and many systems already implement it. By keeping them out of ENV variables you no longer have to worry about the entire ENV getting written out during a crash or debugging and exposing the secrets.
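For reference, consuming a secret mounted this way is only a few lines. This is a sketch, assuming one file per secret under `/run/secrets`; the helper name `read_secret` and the secret name in the usage note are hypothetical.

```python
from pathlib import Path


def read_secret(name: str, secrets_dir: str = "/run/secrets") -> str:
    """Read one secret from a mounted secrets directory."""
    # Reject names that could escape the directory (e.g. "../etc/passwd").
    if name in {"", ".", ".."} or "/" in name:
        raise ValueError(f"invalid secret name: {name!r}")
    # Strip the trailing newline that most tools add when writing the file.
    return (Path(secrets_dir) / name).read_text(encoding="utf-8").rstrip("\n")
```

Usage would be something like `read_secret("db_password")`, called at the point the value is needed rather than at import time.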
How does a mounted secret (vault) protect against dumping secrets on crash or debugging?
The app still has it. It can dump it. It will dump it. Django for example (not a security best practice in itself, btw) will indeed dump ENV vars but will also dump its settings.
The solution to this problem lies not in how you get the secrets into the app, but in prohibiting them getting out of it.
E.g. builds removing/stubbing tracing, dumping entirely. Or with proper logging and tracing layers that filter stuff.
There really is no difference, security wise, between logger.debug(system.env) and logger.debug(app.conf)
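One way to build such a filtering layer, sketched with Python's standard `logging` module. It redacts by value rather than by key name, so it does not matter whether the secret arrived via ENV or via a config object. The class name `RedactSecrets` is made up for this example.

```python
import logging


class RedactSecrets(logging.Filter):
    """Replace known secret values in every log message passing through."""

    def __init__(self, secrets):
        super().__init__()
        # Skip empty strings: replacing "" would corrupt every message.
        self._secrets = [s for s in secrets if s]

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()  # msg with args already interpolated
        for secret in self._secrets:
            message = message.replace(secret, "[REDACTED]")
        record.msg = message
        record.args = None  # prevent a second interpolation downstream
        return True  # keep the record, now scrubbed
```

Attach it to the handler (`handler.addFilter(RedactSecrets([...]))`) so it applies to everything that handler emits. It only catches exact values, so encoded or truncated copies of a secret would still get through.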
Really depends on your threat model and use case. The problems with .env files: plain text on disk, no access control, no rotation mechanism, no audit trail, trivial to leak accidentally, secrets go into env variables (which are exposed and often leak). Which of those do you care about? What are you trying to prevent?
At the simplest level, keeping .env-ish files, use sops + age [1] or dotenvx [2] (or similar) to encrypt just the values. You keep the .env file approach, the actual secrets are encrypted, and now you can check the file in and track changes without leaking your secrets. You still have the env variable problems.
There are some options that'll use virtual files to get your secrets from a vault to your process's env variables, or you can read the secrets from a secret manager yourself into env variables, but that feels like more complexity without a lot more gain to me. YMMV.
You could use a regular password manager (your OS's keychain, 1Password and its ilk, etc) if you're just working on your own. Also in the more complexity without much gain category for me.
If you want to use a local file on disk, you could use a config file with locked down permissions, so at least it's not readable by anything that comes along. ssh style.
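A rough sketch of that ssh-style check, assuming a POSIX system: refuse to load the file when anyone other than the owner has access to it. The function name is hypothetical.

```python
import os
import stat


def read_private_file(path: str) -> str:
    """Read a secrets file, refusing if group or others have any access."""
    mode = os.stat(path).st_mode
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        raise PermissionError(
            f"{path} is accessible by group/others (mode {stat.S_IMODE(mode):o}); "
            "run chmod 600 on it"
        )
    with open(path, encoding="utf-8") as handle:
        return handle.read()
```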
Better is to have your code (because we're talking about your code, I assume) read from secret managers itself. Whether that's Bitwarden, AWS / GCP / Azure (well, maybe not Azure), Hashicorp, or one of the many other enterprisey options. That way you get an audit trail and easy rotation, plus no env variables and no plain text at rest. You can still leak them, but you have fewer ways to do so.
Speaking of leaking accidentally, the two most common paths: Logging output and Docker files. The first is self explanatory, though don't forget about logging HTTP requests with auth headers that you don't want exposed. The second is missed by a lot of people. If you inject secrets into your Dockerfile via `ARG` or `ENV`, that gets baked into the image and is easy to get back out. Use `--mount=type=secret` etc. (Never use the old Docker base64 stored secrets in config. That's just silly.)
There are other permutations and in-between steps, these are just the big ones. Like all security stuff, the details really depend on your specific needs. It is easy to say, though, that plain text .env files injected into env variables are at the bad end of the spectrum. Passing the secrets in as plain text args on the command line is worse, so at least you're not doing that!
This is a great breakdown. Particularly the point about Docker ARG/ENV baking secrets into images — that catches so many teams.
On the "read from secret managers directly" option — that's the ideal but the friction is what kills adoption. Most small teams look at Vault's setup guide and go back to .env files. Doppler and Infisical lowered that bar but they're still priced for enterprise ($18/user/mo for Doppler's team plan).
I've been building secr (https://secr.dev) to try to hit the sweet spot: real encryption (AES-256-GCM, envelope encryption, KMS-wrapped keys) with a CLI that feels as simple as dotenv. `secr run -- npm start` and your app reads process.env like normal. Plus deployment sync so you can `secr push --target render` instead of copy-pasting into dashboards.
The env variable leakage problem you mention is real and something I don't think any tool fully solves without the proxy approach hardsnow described. But removing the plaintext-file-on-disk vector and the sharing-over-Slack vector covers the majority of real-world leaks.
You can get lots of tokens per second on the CPU if the entire network fits in L1 cache. Unfortunately the sub 64 kiB model segment isn't looking so hot.
But actually ... 3000? Did GP misplace one or two zeros there?
I wondered the same, but the rendering seems right, the output was almost instant. I'll recheck the token counter; anyway as you say, fast isn't practical. Actually I had to develop my own tiny model https://huggingface.co/xaskasdf/brandon-tiny-10m-instruct to fit something "usable", and it's basically a liar or disinformation machine haha
What if someone deploys an agent with the aim of creating cleverly hidden back doors which only align with weaknesses in multiple different projects? I think this is going to be very bad and then very good for open source.
I had asked some more Harry Potter questions before regarding Book 4. I was listening to the Full Cast Edition audiobook and was using ChatGPT for clarifications. I rechecked that this was the response and not the thinking.
“There’s Plenty of Room at the Bottom” only really took off in popularity decades later. Feynman’s accomplishments are undeniable, Nobel prize and all, but his celebrity status is given by other aspects of his personality. No Feynman equivalent I can think of is alive today. Perhaps Geoffrey Hinton and his views on the risk of AGI? He’s far from the only one of course.
I tried to vibe code a technical not so popular niche and failed. Then I broke down the problem as much as I could and presented the problem in clearer terms and Gemini provided working code in just a few attempts. I know this is an anecdote, but try to break down the problem you have in simpler terms and it may work. Niche industry specific frameworks are a little difficult to work with in vibe code mode. But if you put in a little effort, AI seems to be faster than writing code all on your own.
by the time you're coding your problem should be broken down to atoms; that isn't needed anymore if you break it down to pieces which LLMs can break down to atoms instead.
> I know this is an anecdote, but try to break down the problem you have in simpler terms
This should be the first thing you try. Something to keep in mind is that AI is just a tool for munging long strings of text. It's not really intelligent and it doesn't have a crystal ball.
To add on to this, I see many complaints that "[AI] produced garbage code that doesn't solve the problem" yet I have never seen someone say "I set up a verification system where code that passes the tests and criteria and code that does not is identified as follows" and then say the same thing after.
To me it reads like saying "I typed pseudocode into a JS file and it didn't compile, JS is junk". If people learn to use the tool, it works.
Anecdotally, I've been experimenting with migrations between languages and found LLMs taking shortcuts. I then added a step that converts the source code to an AST and the transformed code to another AST, designed a diff algorithm to check that the logic is equivalent in the converted code, and had it retry until the match was within X tolerance. After that it stopped outputting shortcuts, because it would simply continue until no shortcuts were left. I suspect complainants are not doing this.
I feel that the devil is in the edge cases and this allows you to have the freedom to say "ok I want to try for 1.0 match between everything, I can accept 0.98 match, and files which have less of a match it can detail notes for and I can manually approve them". So for things where the languages differ too much for specific patterns, such as maybe an event handling module, you can allow more leniency and tell it to use the target language's patterns more easily, without having to be so precise as to define every single transformation as you would with a transpiler.
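The retry-and-triage loop described above could be sketched like this. It is not the commenter's actual code: `convert` stands in for the LLM call and `similarity` for the AST diff, both hypothetical callables supplied by the caller.

```python
from typing import Callable


def convert_with_retries(
    source: str,
    convert: Callable[[str], str],
    similarity: Callable[[str, str], float],
    accept_at: float = 0.98,
    max_attempts: int = 5,
) -> tuple[str, float, str]:
    """Retry conversion, aiming for a 1.0 match and keeping the best attempt.

    Returns (best_output, best_score, status), where status is "exact",
    "accepted" (score >= accept_at), or "needs_review".
    """
    best_output, best_score = "", -1.0
    for _ in range(max_attempts):
        candidate = convert(source)
        score = similarity(source, candidate)
        if score > best_score:
            best_output, best_score = candidate, score
        if best_score >= 1.0:
            return best_output, best_score, "exact"  # nothing left to improve
    status = "accepted" if best_score >= accept_at else "needs_review"
    return best_output, best_score, status
```

Files that come back as "needs_review" are the ones to write notes for and approve by hand; `accept_at` can be lowered per module where the languages legitimately differ.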
It's called problem decomposition and agentic coding systems do some of this by themselves now: generate a plan, break the tasks into subgoals, implement first subgoal, test if it works, continue.
That's nice if it works, but why not look at the plan yourself before you let the AI have its go at it? Especially for more complex work where fiddly details can be highly relevant. AI is no good at dealing with fiddly.
That's what you can do. Tell the AI to make a plan in an MD file, review and edit it, and then tell another AI to execute the plan. If the plan is too long, split it into steps.
This has been a well integrated feature in Cursor for six months.
As a rule of thumb, almost every solution you come up with after thirty seconds of thought for an online discussion has been considered by people doing the same thing for a living.
There’s nothing stopping you from reviewing the plan or even changing it yourself. In the setup I use the plan is just a markdown file that’s broken apart and used as the prompt.
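A minimal sketch of "breaking apart" such a plan file, assuming each step sits under its own `## ` heading (that layout is an assumption for this example, not something the tools above require):

```python
import re


def split_plan(markdown: str) -> list[str]:
    """Split a plan into one chunk per '## ' heading, heading included.

    Anything before the first '## ' heading (title, preamble) is dropped.
    """
    chunks = re.split(r"(?m)^(?=## )", markdown)
    return [chunk.strip() for chunk in chunks if chunk.strip().startswith("## ")]
```

Each returned chunk can then be reviewed, edited, and handed over as a separate prompt.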
> I know this is an anecdote, but try to break down the problem you have in simpler terms and it may work.
This is an expected outcome of how LLMs handle large problems. One of the "scaling" results is that the probability of success depends inversely on the problem size / length / duration (leading to headlines like "AI can now automate tasks that take humans [1 hour/etc]").
If the problem is broken down, however, then it's no longer a single problem but a series of sub-problems. If:
* The acceptance criteria are robust, so that success or failure can be reliably and automatically determined by the model itself,
* The specification is correct, in that the full system will work as-designed if the sub-parts are individually correct, and
* The parts are reasonably independent, so that complete components can be treated as a 'black box', without implementation detail polluting the model's context,
... then one can observe a much higher overall success rate by taking repeated high-probability shots (on small problems) rather than long-odds one-shots.
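To put rough numbers on that, here is a toy model. It assumes steps are independent, every step has the same success probability, and the checker never misjudges a result, so treat it as an illustration of the argument and not as a measurement.

```python
def one_shot_success(p_step: float, n_steps: int) -> float:
    """All n steps must succeed in a single attempt with no verification."""
    return p_step ** n_steps


def decomposed_success(p_step: float, n_steps: int, attempts: int) -> float:
    """Each step may be retried; a step fails only if every attempt fails."""
    p_with_retries = 1 - (1 - p_step) ** attempts
    return p_with_retries ** n_steps
```

With `p_step=0.9` and `n_steps=20`, the one-shot figure comes out at roughly 0.12, while allowing 3 verified attempts per step gives roughly 0.98.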
To be fair, this same basic intuition is also true for humans, but the boundaries are a lot fuzzier because we have genuine long-term memory and a lifetime of experience with conceptual chunking. Nobody is keeping a million-line codebase in their working memory.
There is almost zero credible evidence I think you could point to that this even vaguely resembles a credible path that we are on in reality. Sometimes theoretical models don’t match reality and this sure seems to be a good example of that.