One thing that's worth remembering about randomly generated tokens is that it's important to always use safe comparison methods when comparing them to the stored one - otherwise you could be vulnerable to timing attacks.
I am wondering how that relates to searching for the token in a database (index). Does it still matter? I would assume the time might depend on where the token sits in the index, or even on whether it is in the index at all — I'm not sure how much, though, or what to do about it.
Yes, it matters - timing attacks could absolutely work against an indexed column in a database.
The trick I've used for this is to have tokens that look like this:
"234523:7a561002f780e23853bdfbd89ae79bf2"
Then you have database entries like this:
id = 234523
token = 7a561002f780e23853bdfbd89ae79bf2
When a token comes in you use the bit before the : to look up the database record by its indexed ID, then perform a safe comparison between the rest of the token and the value you retrieved from the database.
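A minimal sketch of that lookup-then-compare flow in Python, using an in-memory dict as a stand-in for the indexed table (the names and the dict itself are illustrative, not from the comment):

```python
import hmac

# Hypothetical stand-in for the database table: indexed id -> stored secret part.
TOKENS = {
    "234523": "7a561002f780e23853bdfbd89ae79bf2",
}

def verify(presented: str) -> bool:
    """Split on ':', look up by the indexed id, then constant-time-compare
    the secret part against what's stored."""
    try:
        token_id, secret = presented.split(":", 1)
    except ValueError:
        return False  # malformed token, no ':' present
    stored = TOKENS.get(token_id)
    if stored is None:
        return False
    # Constant-time comparison of the secret halves
    return hmac.compare_digest(stored, secret)
```

The index is only ever consulted with the non-secret id, so any timing variation in the lookup reveals nothing about the secret half.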
That way you’re hashing the user input, which sanitizes it. It also protects against timing attacks.*
And most importantly it protects against credentials leaking. Even insiders who can read a token from the database can’t use it to authenticate, because the app logic expects a pre-image of that value.
*HMAC resists timing attacks because the attacker can no longer control the bit pattern on either side of the equality check.
Give me a break -- we're just arguing semantics at this point. My whole premise was that tokens are a secret, just like passwords are a secret, and that tokens should be treated like a first-class secret. Not that they're absolutely "the same." The article gives the illusion that Simple Random Tokens are safe to store as plain text. I was arguing that it's not, and that the token should be ran through an HMAC function at the very least to prevent plain text storage and to harden against timing attacks when querying. Nowhere was I implying that passwords and tokens are strictly identical in all aspects. I've worked at places where tokens are stored in plain text (i.e. almost everywhere I've worked) and it's a bad practice.
This is not true at all. This whole discussion is pretty confused. All of the mitigations we apply to passwords are premised on passwords being (1) cryptographically weak, (2) irrevocable, and (3) infectious to other systems. Nerds are, for reasons I will never understand, hyperfixated on "salting". Randomized password hashes primarily address an old attack that is more or less irrelevant to any modern hash or KDF. But that attack, too, mattered because (1), (2), and (3) hold for passwords.
Same here. These are all examples of the same thing: if you get your hands on the secret, you can do anything.
The only thing the post explains is that there is a difference between checking a secret against the database vs. message validation, which is also based on a secret. But in both cases, if you have the secret, you're doomed.
Expiration / immediate revocation is more difficult with message validation than with a distributed cache, which supports key eviction/expiry/removal.
Also, an asymmetric solution has similar problems: if you get the private key from the client, you're doomed. So it does protect the server part, but if people have access there, you're even more doomed. But at least it'll be easier to identify who had a security leak.
I think the difference may be in the "They’re easily revoked and expired" part.
Passwords are usually one-per-resource, e.g. a user has a single password. If that password is compromised you can reset it, but all the consumers that used it will need the new credential.
Whereas tokens are typically one-per-consumer, so if one is compromised you can revoke/expire just that token, without affecting other consumers.
Same with expiry times -- you can set that on a token without it expiring all access to that resource.
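A toy illustration of that per-consumer model (the token names and store are hypothetical): each consumer holds its own token, so one can be revoked or expired without touching the others.

```python
import time

# Illustrative per-consumer token store
tokens = {
    "tok-ci":     {"consumer": "ci",     "expires": time.time() + 3600},
    "tok-mobile": {"consumer": "mobile", "expires": time.time() + 3600},
}

def revoke(token: str) -> None:
    # Remove just this one credential
    tokens.pop(token, None)

def valid(token: str) -> bool:
    entry = tokens.get(token)
    return entry is not None and entry["expires"] > time.time()

revoke("tok-ci")  # only the CI consumer loses access; mobile keeps working
```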
Not sure if that's what the author had in mind, but it's a difference in how these things are often used (even if fundamentally they are otherwise very similar).
Part of the problem here is that these discussions mean a bunch of different things when they draw the equivalence between passwords, tokens, and keys. Sometimes we're talking about the secret storage problem; sometimes we're talking about the bearer token problem; sometimes we're talking about brute-force, human-recall, and reuse problems.
Frustratingly, different tokens have different levels of exposure to all of these problems. A well-confined Macaroon has essentially none of the secret storage problems of a password. An API request authenticating key has none of the bearer token problems. No API token has the human-recall, reuse, and brute-force problems.
People hyperfixate on the secret storage problem, I think because they feel like they can get their heads around it. As you can probably tell, I'm loath to relitigate the debate about whether we should store passwords with secret salts. Even when the discussion is legitimately focussed on that problem, I find discussions about it go to weird, incoherent places that aren't really informed by real threat models, and are really hair-splitting arguments about the security of environment variables vs. the security of filesystems, just dressed up as top-3 security considerations.
Even for trivial basic-auth API keys, the secret storage problem is not the same as that of a password. Part of the problem with passwords is that they effectively cannot be revoked; "revoked" passwords creep back into systems, and, worse, infect other systems. That's not how API keys work; the threat model is different and so are the countermeasures that are profitable to deploy for them.
Another thing that I feel happens in discussions like this is that there's no notion of cost-benefit. There is option A, and option B, and option B is on some axis superior to option A, even if that benefit is marginal. Since there's no engineering cost to consider, there's no meaningful discussion. But in reality, there's always a cost: deploying a countermeasure at a minimum incurs the opportunity cost of not deploying some other countermeasure that could have been built with the same (finite) engineering resources. Since these discussions always seem to get mired in secret-salt double-hashing hmac(hmac(root-secret, measurement_1 || measurement_2), secret) stuff, I hope you can see the concern: the discussion essentially advocates for ever-more-marginal wins that "fit" the message board thread, rather than seriously considering the real problem.
But of course, the other problem is that the arc of the thread bends towards HMAC'ing your Biscuit token. "Tokens are just passwords", after all.
At any rate: this thread says, "passwords are secrets, tokens are secrets, ergo passwords are tokens". Signing keys are also secrets. Nobody HMACs them. Something is wrong with the syllogism. I'm not that interested in picking apart what.
Serious question - what makes the database serialization not vulnerable to timing attacks in the same vein? I wouldn't expect those to be purely constant time implementations.
You can’t perform a timing attack for a token “foo” in SELECT WHERE token = :token if the token stored in the DB is the HMAC of “foo”. E.g. trying “f” and then “fo” produce 2 entirely different, random tokens from the query’s POV. The attacker could never deduce that the correct token is “foo.”
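A small sketch of that query pattern using SQLite in memory (the key and table are made up for illustration) — the `WHERE` clause only ever sees HMAC digests, so near-miss guesses like "f" and "fo" map to unrelated values:

```python
import hashlib
import hmac
import sqlite3

KEY = b"hypothetical-server-key"  # illustrative only

def mac(token: str) -> str:
    return hmac.new(KEY, token.encode(), hashlib.sha256).hexdigest()

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sessions (token TEXT PRIMARY KEY)")
db.execute("INSERT INTO sessions VALUES (?)", (mac("foo"),))

def lookup(presented: str):
    # "f", "fo", and "foo" hash to unrelated digests, so the index walk
    # leaks nothing about how close the guess is to the real token.
    return db.execute(
        "SELECT 1 FROM sessions WHERE token = ?", (mac(presented),)
    ).fetchone()
```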
Most naive string comparison methods (including typical language equality expressions) will compare one character at a time until a mismatch is found. This allows an attacker to build up the correct password one character at a time: the correct letter will take longer to check than an incorrect one.
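What such a naive comparison looks like spelled out — this hand-rolled version mirrors what many `==` implementations do internally, returning at the first mismatch so the running time grows with the length of the matching prefix:

```python
def naive_equals(a: str, b: str) -> bool:
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            # Early exit: time spent so far reveals how long the
            # matching prefix was.
            return False
    return True
```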
But in the "real world" wouldn't any such time difference be very small and fluctuations in network delay and measurement error so large in comparison that this isn't typically exploitable?
Sure, better be safe and use a safe comparison method and that's simple enough, but still, how realistic is such an attack over the public internet?
The trick with timing attacks is that you don't measure the time taken for a single request. You send the same request thousands of times and take an average of the response time, which lets you pick away at the secret one character at a time.
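A sketch of that sampling idea (function names are mine; no claim this succeeds against a real network target) — repeat the same comparison many times and average, so per-character timing differences can emerge from the noise:

```python
import statistics
import time

def naive_equals(a: str, b: str) -> bool:
    # Early-exit comparison, the kind a timing attack targets
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False
    return True

def avg_ns(guess: str, secret: str, n: int = 10_000) -> float:
    # Average n timings of the same comparison; an attacker would compare
    # this average across candidate guesses for each character position.
    samples = []
    for _ in range(n):
        t0 = time.perf_counter_ns()
        naive_equals(guess, secret)
        samples.append(time.perf_counter_ns() - t0)
    return statistics.mean(samples)
```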
Sure, that's what you'd attempt to address errors, but that still doesn't convince me. That only gets you so far. My intuition says that you won't be able to distill any meaningful results in practice across long distances and requests being routed to different backend servers under varying load. You are trying to isolate a very specific duration in the order of microseconds.
Of course, I'm not in the field and am happy to learn more. However, my intuition also tells me we should have seen more realistic experiments and results over the years to get a clear picture of the extent to which such an attack is actually feasible and when it ceases to be.
The difference is probably below 1 microsecond. A modern CPU running at 3GHz roughly performs 3 instructions every nanosecond. It varies a lot, but a character-by-character comparison, especially one that's happening often and is thus cached, is one of the most trivial use cases, so let's assume this nominal throughput.
I’d estimate a naive comparison loop to be around 20-30 cycles, for compare and control? And let’s wrap it in 250 cycles more because someone decided to use Python. 300 cycles, then, are about 100ns (tenth of a microsecond).
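The arithmetic behind that estimate, taking the comment's assumed numbers at face value:

```python
cycles = 300        # assumed cycles for one naive comparison (incl. Python overhead)
clock_hz = 3e9      # assumed 3 GHz clock
seconds = cycles / clock_hz
nanoseconds = seconds * 1e9   # 300 / 3e9 s = 1e-7 s = 100 ns
```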
Not saying it can’t be done with a timing attack over the internet but you’d need a huge sample size.
A real attacker would try to minimize all those constraints by getting as close to the target as possible — for example, colocating in the same datacenter, or even the same rack if possible.
In Python you can use secrets.compare_digest(a, b) for a safe (constant-time) comparison: https://docs.python.org/3/library/secrets.html#secrets.compa...
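Minimal usage of that function — unlike `==`, it runs in time independent of where the inputs first differ:

```python
import secrets

token = secrets.token_hex(16)       # a fresh 32-hex-char random token
wrong = secrets.token_hex(16)

# Constant-time comparison; avoid `==` when one side is attacker-controlled.
assert secrets.compare_digest(token, token)
assert not secrets.compare_digest(token, wrong)
```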