Profiling Python in Production

mau · on Dec 30, 2015

Here is a profiler developed by Dropbox that uses the same technique described in the article:

- https://blogs.dropbox.com/tech/2012/07/plop-low-overhead-pro...

- https://github.com/bdarnell/plop

sitkack · on Dec 30, 2015

Reducing CPU load also

* reduces power usage, wear and tear on hardware

* gives more capacity for traffic surges

* gives more headroom for new features

* enables running on a smaller instance

This isn't an argument against slow code, more of a suggestion to tune out unnecessary work.

valarauca1 · on Dec 30, 2015

Slow code is just paying for developer productivity in CPU time instead of labor time.

It's great in theory until the AWS bill arrives. Oftentimes it's still fine; it just depends on where the bottom line is.

sitkack · on Dec 30, 2015

Agreed. I love slow interpreted languages like Python, and writing _fast_ Python is a different kind of optimization than one would do in C. In cases like the article, the biggest gains are in just not doing something, or doing it less often.
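A toy illustration of "doing it less often", not taken from the article: `normalize_address` is a made-up function standing in for any pure, repeatedly called helper, and the cache size is an arbitrary assumption. Memoizing it with the standard library's `functools.lru_cache` means repeated inputs skip the work entirely.

```python
import functools


# Hypothetical helper: any pure function that gets called with the same
# arguments over and over is a candidate for this treatment.
@functools.lru_cache(maxsize=1024)
def normalize_address(raw):
    return raw.strip().lower()


# Simulate a hot loop that keeps seeing the same input.
for _ in range(1000):
    normalize_address("  Alice@Example.com ")

# cache_info() reports how many calls actually ran the function body
# (misses) versus how many were answered from the cache (hits).
info = normalize_address.cache_info()
print(info.hits, info.misses)
```

This only helps when the function is pure and its arguments are hashable; a profiler is what tells you whether the call is hot enough to be worth caching at all.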

nitin_flanker · on Dec 30, 2015

This also reduces maintenance time and effort down the line, which in turn increases productivity. Overall they did a great job.

kissgyorgy · on Dec 30, 2015

That might not be true for optimizations, because optimized code is usually less readable.

sanxiyn · on Dec 30, 2015

Here is another Python profiler intended for live use: https://github.com/what-studio/profiling

iamspoilt · on Dec 30, 2015

This question might be a bit naive, but how is this approach any better than or different from monitoring tools like New Relic, which do the profiling for you?

emfree · on Dec 30, 2015

Author of the post here. That's a good question. I don't know if this approach is objectively better, but it has a few nice features.

* We generally favor free/open source solutions where practical.

* It is quite a bit cheaper in dollar terms.

* The actual code to make this work is very lightweight. By doing it yourself, you have total control, and can extend or tweak to get exactly the data you want. Being able to easily add bespoke instrumentation is really powerful. To give an example from one of our use cases (IMAP sync), let's say you wanted to cohort your data by mail provider. I.e., you suspect that the workload profile when syncing against server A is significantly different than syncing against server B, and you want to know for sure. It's pretty easy to take your codebase and your instrumentation, and add that by inspecting some thread-local context at runtime. Might be hard to do with an off-the-shelf commercial tool.
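A minimal sketch of what that cohorting could look like, under stated assumptions: this is not the code from the post. `CohortSampler`, the module-level `context` object, and its `provider` attribute are all hypothetical names, and the sampler relies on `signal.setitimer`, which is only available on Unix. The handler runs in the main thread, so it can read thread-local state set by whatever code it interrupted there; an application built on greenlets would need a greenlet-local equivalent.

```python
import collections
import signal
import sys
import threading

# Hypothetical per-thread context: the sync code would set
# `context.provider` before doing work for a given account.
context = threading.local()


class CohortSampler:
    """Counts sampled stacks, keyed by (provider label, collapsed stack)."""

    def __init__(self, interval=0.005):
        self.interval = interval  # seconds of CPU time between samples
        self.counts = collections.Counter()

    def sample(self, signum, frame):
        # Read the cohort label from thread-local state, if one was set.
        provider = getattr(context, "provider", "unknown")
        # Walk from the interrupted frame up to the outermost caller.
        stack = []
        while frame is not None:
            code = frame.f_code
            stack.append(f"{code.co_name}({code.co_filename}:{frame.f_lineno})")
            frame = frame.f_back
        self.counts[(provider, ";".join(reversed(stack)))] += 1

    def start(self):
        # ITIMER_VIRTUAL fires SIGVTALRM based on CPU time used (Unix only).
        signal.signal(signal.SIGVTALRM, self.sample)
        signal.setitimer(signal.ITIMER_VIRTUAL, self.interval, self.interval)

    def stop(self):
        signal.setitimer(signal.ITIMER_VIRTUAL, 0)

    def by_provider(self):
        # Total number of samples per cohort, ignoring the stack.
        totals = collections.Counter()
        for (provider, _stack), n in self.counts.items():
            totals[provider] += n
        return totals
```

In use, the sync loop would set `context.provider = "server_a"` (or whatever label applies) before talking to a server, call `start()` once at process startup, and periodically export `counts` or `by_provider()` to wherever the profiles are aggregated.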

iamspoilt · on Jan 1, 2016

I completely agree about the bespoke instrumentation you have discussed here and this approach has started making a lot more sense to me now. I am gonna use it for one of our micro-services soon. Cheers for sharing this amazing post.

PudgePacket · on Dec 30, 2015

Cost, perhaps?

moonchrome · on Dec 30, 2015

>It’s a large Python application (~30k LOC) which handles syncing via IMAP, SMTP, ActiveSync, and other protocols.

In what context is 30k LOC a large application? 30k LOC is small enough that one programmer can write it and easily keep an overview of the entire codebase. Maybe it's a typo and it's 300k LOC.

jonesb6 · on Dec 30, 2015

I'd also consider a poorly written and documented 30k LOC to be a "large" codebase, and it would probably take multiple people to wrestle with it.

kashif · on Dec 31, 2015

In Python, 30K is not as small as, say, 30K in Java.