Hacker News
Profiling Python in Production (nylas.com)
177 points by mau on Dec 30, 2015 | hide | past | favorite | 14 comments


Here is a profiler developed by Dropbox using the same technique described in the article:

- https://blogs.dropbox.com/tech/2012/07/plop-low-overhead-pro...

- https://github.com/bdarnell/plop
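The core idea shared by plop and the article is signal-based statistical sampling: an interval timer periodically interrupts the process, and the handler records the current call stack. A minimal sketch of that approach (illustrative only, not plop's actual code; plop stores frame tuples rather than formatted strings):

```python
import collections
import signal
import traceback

# Aggregated sample counts, keyed by the formatted call stack
stack_counts = collections.Counter()

def _sample(signum, frame):
    # Record the stack that was executing when the signal fired
    stack = "".join(traceback.format_stack(frame))
    stack_counts[stack] += 1

def start_sampling(interval=0.005):
    # SIGPROF fires every `interval` seconds of CPU time consumed
    signal.signal(signal.SIGPROF, _sample)
    signal.setitimer(signal.ITIMER_PROF, interval, interval)

def stop_sampling():
    # Disarm the timer; accumulated counts remain in stack_counts
    signal.setitimer(signal.ITIMER_PROF, 0, 0)
```

Because ITIMER_PROF counts CPU time rather than wall time, an idle process costs essentially nothing to profile, which is what makes this viable in production.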


Reducing CPU load also

* reduces power usage, wear and tear on hardware

* gives more capacity for traffic surges

* gives more headroom for new features

* enables running on a smaller instance

This isn't an argument against slow code, more of a suggestion to tune out unnecessary work.


Slow code is just paying for developer productivity in CPU time instead of labor time.

It's great in theory until the AWS bill arrives. Oftentimes it's still fine; it just depends where the bottom line is.


Agreed. I love slow interpreted languages like Python, and writing _fast_ Python is a different kind of optimization than one would do in C. In cases like the article, the biggest gains come from simply not doing something, or doing it less often.


This also reduces maintenance time and effort, which in turn increases productivity. Overall they did a great job.


That might not be true for optimizations, because optimized code is usually less readable.


Here is another Python profiler intended for live use: https://github.com/what-studio/profiling


This question might be a bit naive, but how is this approach better than or different from monitoring tools like New Relic, which do the profiling for you?


Author of the post here. That's a good question. I don't know if this approach is objectively better, but it has a few nice features.

* We generally favor free/open source solutions where practical.

* It is quite a bit cheaper in dollar terms.

* The actual code to make this work is very lightweight. By doing it yourself, you have total control, and can extend or tweak it to get exactly the data you want. Being able to easily add bespoke instrumentation is really powerful. To give an example from one of our use cases (IMAP sync), let's say you wanted to cohort your data by mail provider. I.e., you suspect that the workload profile when syncing against server A is significantly different than syncing against server B, and you want to know for sure. It's pretty easy to take your codebase and your instrumentation, and add that by inspecting some thread-local context at runtime. That might be hard to do with an off-the-shelf commercial tool.
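Cohorting samples via thread-local context might look like the sketch below (all names are hypothetical; the actual Nylas instrumentation differs). Each sync worker tags its thread, and the profiler's sample handler reads the tag when recording a stack.

```python
import threading

# Per-thread storage for the current cohort tag (e.g. mail provider)
_context = threading.local()

def set_provider(name):
    # Called by a sync worker when it starts handling a provider
    _context.provider = name

def current_provider():
    # Read by the profiler's sample handler; threads without a tag
    # fall back to "unknown"
    return getattr(_context, "provider", "unknown")

# Inside the sampling handler, samples would then be keyed as:
#   stack_counts[(current_provider(), stack)] += 1
# so the aggregated profile can be split per provider afterwards.
```

Because `threading.local` isolates the tag per thread, concurrent workers syncing against different providers never clobber each other's cohort.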


I completely agree about the bespoke instrumentation you have discussed here and this approach has started making a lot more sense to me now. I am gonna use it for one of our micro-services soon. Cheers for sharing this amazing post.


Cost, perhaps?


>It’s a large Python application (~30k LOC) which handles syncing via IMAP, SMTP, ActiveSync, and other protocols.

In what context is 30k LOC a large application? 30k LOC is small enough that one programmer can write it and easily keep an overview of the entire codebase. Maybe it's a typo and it's 300k LOC.


I'd also consider a poorly written and documented 30k LOC to be a "large" codebase, and it would probably take multiple people to wrestle with it.


In Python, 30k LOC is not as small as, say, 30k LOC in Java.



