Is it just me or is PHP easily the best choice for quickly implementing something like this? The Ruby example required the framework Sinatra, Python isn't a language most servers support out-of-the-box (it is beautiful though) and Clojure is the ugliest language I've ever seen (at least in this example). I look forward to the day PHP is standardised (it will happen), because it cops a lot of flak when it gets the job done without needing usually anything installed on the server to run it. Upload your scripts and run, BAM!
PHP just has a different deployment model: in most cases it runs under Apache + mod_php out of the box, with PHP embedded inside the Apache process.
This model is memory intensive and does not scale well.
A more accurate comparison would be something like Nginx + PHP-FPM or even HipHop VM for PHP.
If your only measuring stick is "Upload your scripts and run, BAM!" then it's no real argument. But if you're at the scale of having professional systems engineers and scaling to millions of users, then deployment isn't an issue, as you have your processes running smoothly.
If you can deploy the same set up 10,000 times easily, it doesn't matter that the initial set up took you 2 hours vs. 1 hour.
Software is infinitely reproducible. The effective deployment cost then becomes 0.
Essentially, at scale, you are no longer running Apache + mod_php.
> If you can deploy the same set up 10,000 times easily
If you had 10,000 web servers, are you telling me it's difficult to create a script that will automatically pull files from a centralised network store?
In the above case, that's how I would set it up, and it's trivial.
To be fair, the beast we call the PHP standard library is gargantuan and tailored to web development. Much larger than Ruby + Sinatra or Python + Flask, and arguably worse at general purpose programming. So, different strokes for different folks.
> I look forward to the day PHP is standardised (it will happen), because it cops a lot of flak when it gets the job done without needing usually anything installed on the server to run it. Upload your scripts and run, BAM!
What do you mean by standardized? PHP was quite popular in the past, and serves about 35% of web traffic (Wikipedia and Facebook alone would constitute a significant amount).
I don't know about you, but my requirements are to get the job done reasonably, not "get the job done by copying a script on a cheap-ass shared hosting and you can't install any extra library. lol".
Naming conventions, as pointed out, are a big issue. "strtolower" and "str_replace" are both string-manipulating functions with different naming conventions, and don't get me started on the arguments of the functions either. Sometimes the string comes first, sometimes the needle comes first, etc. Once PHP has its naming conventions down it'll be much better.
Part of the reason the Clojure code is ugly is because he's basically doing imperative code in a functional language. Here's my (highly annotated) take:
    (ns nurblizer.core
      (:gen-class)
      (:use compojure.core)
      (:require
        [clojure.string :as str]
        [clostache.parser :as clostache]
        [ring.adapter.jetty :refer [run-jetty]]
        [compojure.handler :refer [site]]))

    ;; main nurble stuff

    (def nouns
      (->> (-> (slurp (clojure.java.io/resource "nouns.txt")) ; read in nouns.txt
               (str/split #"\n"))                             ; split by line
           (map (comp str/trim str/upper-case))               ; feed the lines through upper-case and trim
           set))                                              ; transform into a set

    (def nurble-replacement-text "<span class=\"nurble\">nurble</span>")

    (defn nurble-word [word]
      (get nouns (str/upper-case word) nurble-replacement-text)) ; return word if word in set else nurble

    (defn nurble [text]
      (str/replace text #"\n|\w+" #(case %             ; using anon func literal, switch on argument
                                     "\n" "<br>"        ; when arg is newline replace with br
                                     (nurble-word %)))) ; otherwise nurble the argument (a word)

    ;; webserver stuff

    (defn read-template [template-file]
      (slurp (clojure.java.io/resource (str "templates/" template-file ".mustache"))))

    (defn render
      ([template-file params]
       (clostache/render (read-template template-file) params
                         {:_header (read-template "_header")
                          :_footer (read-template "_footer")})))

    (defroutes main-routes
      (GET "/" [] (render "index" {}))
      (POST "/nurble" [text] (render "nurble" {:text (nurble text)})))

    (defn -main []
      (run-jetty (site main-routes) {:port 9000}))
The result is the same but the approach is different: build a set of uppercase nouns, define how each word in the text is handled, and do a single string replacement pass over the text to produce the output. Doing things like tokenizing the text and processing it as a seq or keeping the nouns lower case and producing upper case output are easy but seem like unnecessary complications so I left them out.
Note: In the grand tradition of the exercise, the webserver stuff is untested. I did repl-check the nurble part but was too lazy to set up a project.clj and have leiningen pull in the webserver deps.
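For anyone who doesn't read Clojure, here is a rough Python sketch of the same single-pass idea. The tiny NOUNS set is a hypothetical stand-in for the contents of nouns.txt, and the names are only illustrative; none of this is taken from the article.

```python
import re

# Hypothetical stand-in for the upper-cased contents of nouns.txt.
NOUNS = {"CAT", "MAT"}

NURBLE_HTML = '<span class="nurble">nurble</span>'


def nurble_token(match):
    """Decide what a single regex match (a newline or a word) turns into."""
    token = match.group(0)
    if token == "\n":
        return "<br>"
    upper = token.upper()
    # Nouns survive (upper-cased, as in the Clojure version); everything else is nurbled.
    return upper if upper in NOUNS else NURBLE_HTML


def nurble(text):
    """Run one replacement pass over the text, handling newlines and words."""
    return re.sub(r"\n|\w+", nurble_token, text)


print(nurble("the cat sat\non the mat"))
```

Spaces and punctuation are never matched by the pattern, so they pass through untouched, which is what keeps this to a single pass.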
PHP Notice: Undefined variable: nouns in nurble.php on line 12
PHP Notice: Undefined variable: nouns in nurble.php on line 12
PHP Notice: Undefined variable: nouns in nurble.php on line 12
PHP Notice: Undefined variable: nouns in nurble.php on line 12
PHP Notice: Undefined variable: nouns in nurble.php on line 12
;-)
(Also: check out explode() instead of preg_split() and with a simple pattern like this, str_ireplace might be faster, as might be searching for the word in the whole text file instead of converting it into an array.)
Interesting article, except that the examples don't line up exactly. For example, in PHP it's trivial to store the file contents in a var and refer to that over and over (as OP does in Ruby), but instead he chooses to read from the file over and over again and then calls that a shortcoming of PHP vs Ruby when that's purely an implementation decision.
Ruby is clean (but I dislike the stack), and Python looks good, but ye gads is Clojure ugly... Uglier than PHP, even.
My favorite part about PHP is how easy it is to get a simple Nurble-like app up and running... but that's just me.
OK, I'll bite: how would you store the array of words in memory across requests in PHP? As far as I'm aware, the runtime deliberately doesn't provide a way to do that.
The closest you could get is to store it in $_SESSION, but that's per-user rather than global, and it still has to read the data from your session handler. Which by default just uses flat files, so we're no better off than we started.
The APC opcode cache extension also has a user cache which has an API similar to memcache but a bit faster since it's using shared memory and not going over the network. It's shared between all visitors across all requests.
Please do, I'll update and edit as appropriate. I wasn't aware a trivial way existed of storing a persistent variable in PHP between requests. I bet I'll feel dumb when I find out what it is, though.
Disk IO is a massive problem at scale. This obviously isn't "at scale", so it works fine here, but avoiding hitting the disk (edit: any slow IO, really) is Performance 101.
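To make the "read it once, then serve from memory" point concrete, here is a minimal Python sketch for a long-running process. It is only an analogue of the idea, not PHP's or APC's API; load_nouns and the temporary file standing in for nouns.txt are assumptions made up for the demo.

```python
import functools
import os
import tempfile


@functools.lru_cache(maxsize=None)
def load_nouns(path):
    """Read the word list from disk once; later calls reuse the cached frozenset."""
    with open(path, encoding="utf-8") as handle:
        return frozenset(line.strip().upper() for line in handle if line.strip())


# Throwaway file standing in for nouns.txt (hypothetical contents).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, encoding="utf-8") as tmp:
    tmp.write("cat\nmat\n")
    demo_path = tmp.name

first = load_nouns(demo_path)   # hits the disk
os.remove(demo_path)            # the file is gone now...
second = load_nouns(demo_path)  # ...but the cached set is served from memory

print(first is second, sorted(second))
```

The second call never touches the disk, which is the whole point; the trade-off is that the process has to be restarted (or the cache cleared) to pick up a changed word list.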
"full disclosure: I didn’t want to install Apache+PHP on my dev machine, so I haven’t tested this one"
Something rubbed me wrong with this statement. If you are going to make a remark like '[Clojure's] performance should be the highest of the above,' shouldn't you at least try them first?
Yeah, that was kind of odd. Apache and PHP are probably the easiest stack to install out of these four, so why bother including it in your article?
Full quote:
"Ok, that seems to work (full disclosure: I didn’t want to install Apache+PHP on my dev machine, so I haven’t tested this one). Deployment in PHP works like this:
Make sure your server is running apache and mod_php
Put the files where the server expects"
So which one is it, did you not install Apache+PHP or do you just check out and put files where the server expects?
I laughed at that too. Let's take the hard way out and test these admittedly-ridiculous-to-deploy web stacks instead of spending a few minutes installing a LAMP stack.
PHP can scale too: look at Facebook as an example, or any hugely popular website running WordPress/Drupal. Where do you want to spend your time: scaling your PHP production environment or configuring your JVM/rvm/Python/Passenger?
To be honest, I never used one. It came installed on my stack and all I did was disable Apache and enable NGINX and PHP-FPM (CGI-compiled PHP, quite a bit faster).
Look at php-fpm (FastCGI process manager) which has been bundled with PHP 5.3+. On the nginx side, you would configure it to use fastcgi_pass. An example configuration is available in the Silex documentation:
The inevitable ruby style niggling, feel free to ignore: personally I like something a little more concise and inlined, some parts seem to have used more verbose idioms, (like Regexp.new instead of a literal), and also I think you need to escape the backreferences or use single quotes, something like:
    (words - @@nouns).each do |w|
      text.gsub!(/(\b)#{w}(\b)/i, '\1<span class="nurble">nurble</span>\2')
    end
EDIT: oops, backslash. Also, in the example, line 19, where you have 'sub', don't you mean 'pattern'?
I submitted this article expecting to learn a few things about my style. In this case, I didn't actually know you could interpolate into regexes like that, so thanks!
Cool stuff. The danger of this, though, is that you risk doing unidiomatic examples a disservice.
I thought you were just going for line-for-line parity until I came across the Clojure example. I think you could've written that same `for` loop in Clojure and either yield the original word or yield "nurble" if it intersected a noun. But this is put up or shut up territory and I'll take another look.
Also, could you clarify why you make multiple assertions that Clojure is "too much" for this task?
I'm a fulltime Ruby developer learning Clojure on the side and Clojure's Ring+Compojure feels like Rack+Sinatra to me. In fact, the choice between Ruby and Clojure to me is as trivial as using Python or PHP. Here's ClojureDoc's tutorial on making a DB-backed webapp in Clojure: http://clojure-doc.org/articles/tutorials/basic_web_developm.... It's just Sinatra/Flask in another flavor all over again.
The only downside of Clojure so far for me is that I might already be drunk on that special Lisp koolaid.
It's hard to argue that the Clojure code is as easy to read as the Ruby or Python, and if you're working with a team that matters. Not to mention the relative dearth of Clojure-fluent developers. Neither of those hurdles is impossible to overcome, but if you're choosing between Ruby and Clojure, you have to justify that decision somehow.
Of course, if I'm the only developer I choose Clojure every time. That's probably the Lisp koolaid doing its thing.
I think people like to over-exaggerate the difficulty of learning Clojure. If you have one person on the team who is fluent in it, the rest of the team can pick up enough of it to be productive in days.
The place I work at used to be exclusively a Java shop. We introduced Clojure for a project a year ago and everybody loved it. So far we've had co-ops, contractors, and the whole team learn it. I'm just not sure who these people who can't learn Clojure are exactly.
It might be more difficult to read initially if you've only worked with one family of languages, but that doesn't make it more difficult to read inherently. In fact, there are many benefits to readability once you're familiar with the syntax. It's a lot more regular, there are fewer special cases than in most languages, and you can see the relations in your code visually, since you're seeing the same AST that the compiler sees.
Weird that performance was the first reason he gave for picking Clojure. Ease of development is usually at the top of my list, then ease of hiring developers to maintain the system. Clojure is hard to dive into and hard to hire for right now. Two big minuses in my book.
I sent you a pull request. I think the Clojure version becomes more performant still if you slurp the nouns into a set so you don't have to do a linear traversal.
Also, I think the regex usage makes the code a little harder to digest and comes at the cost of further unnecessary cycles.
I think this code gets even easier if you don't have to preserve the whitespace and filter non-alpha characters. (see danneu's suggestions)
Finally, what are your thoughts on passing along the wordlist (nouns) as an arg to nurble? This would offer referential transparency and also make the test easier.
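A small Python sketch of what that could look like, under the assumption that the text has already been split into words. nurble_words and the sample noun sets are hypothetical names for illustration, not anything from the gist.

```python
def nurble_words(words, nouns):
    """Return a new list where every word that is not in `nouns` becomes 'nurble'.

    `nouns` is passed in rather than read from a global, so the result depends
    only on the arguments. Passing a set keeps each lookup constant-time
    instead of a linear scan over a list.
    """
    return [word if word.lower() in nouns else "nurble" for word in words]


# The same input with two different word lists; no file or global state needed.
print(nurble_words(["The", "cat", "sat"], frozenset({"cat"})))
print(nurble_words(["The", "cat", "sat"], frozenset()))
```

Because the word list is just an argument, a test can hand in a two-word set instead of shipping a nouns.txt fixture.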
Btw, I think your update broke your gist (where you changed (some (partial = w) nouns) to (not (some (partial = w) nouns))). Check that out again.
What are you considering the core? The nurble method?
Ruby:
    configure do
      @@nouns = Set.new File.open('nouns.txt').map {|noun| noun.strip.downcase }
    end

    def nurble(text)
      words = Set.new text.downcase.split
      # Replace words which are not nouns with nurble.
      (words - @@nouns).each do |word|
        text.gsub! /(\b)#{word}(\b)/i, '\1<span class="nurble">nurble</span>\2'
      end
      text.gsub(/\n/, '<br />')
    end
Python is similar.
Personally, I find it far more readable than the JSP sample.
Yeah - the nurble method. It's really the Clojure that looks the worst.
Taking just the boolean query to figure out whether the word is in the noun list, I'd place them in this order, from simplest to understand through to most obfuscated:
Python: if word not in NOUNS
Java: if (!nouns.contains(word))
PHP: if(!in_array($word, $nouns))
Ruby: if not @@nouns.include? word
Clojure: if (not (some (partial = word) nouns))
The Python is pretty much just English
The Java is almost English except for the !
PHP is starting to get more obtuse - ! and $ and you have to know the order of the parameters
Ruby (from the article) has @@ and ? but isn't too bad - might go above the PHP
Clojure is just weird - looking at the others, I reckon a person could easily modify and write a similar program without knowing much of the language. With Clojure I think it requires deeper understanding.
I haven't included your Ruby in that list because it's doing a different (much neater :) ) thing.
Perhaps I am missing something, but at least the PHP example is not the most efficient. Seems like you could easily build a nice regex with all your nouns and replacing them in the string, rather than splitting up the text and looping through each word. The code would be considerably shorter and likely quite a bit more efficient.
As for your comment that you have to load up the nouns file each time: assuming you have PHP 5.2 or later, the file will stay in memory when using APC and is reused. You can opt to extract the content and store that in memory also.
Lastly, for performance, the code could be improved as I mentioned above, but Apache is definitely not the way to go for performance. NGINX is the way to go: the two are very similar, but NGINX is super lightweight, fast and handles a heck of a lot more connections.
You could have investigated a PHP microframework like Silex tbh (silex.sensiolabs.org), with which you could have used Twig templates (twig.sensiolabs.org). A bit of overkill for such a trivial task, though.
Heh. Our solutions came out very similar. I clearly did something wrong, as I thought backreferences didn't work in Go's Regexp package. Oh well. I'll remember for next time.
- Using anything more than PHP for such a task is overkill.
To be clear, I'm not saying that PHP is the best language of them all. Like all languages it has its ups and downs and there's a lot of subjectivity on what you like and dislike. I'm just saying that such comparison isn't really fair.
> - You're comparing raw PHP with frameworks from other languages. Try not using frameworks in other languages.
Then try using Perl instead of PHP, maybe, so you can have fun importing CGI libs? Because PHP is a web dev framework. Don't argue with me, argue with PHP's own history page:
"... the very first incarnation of PHP was a simple set of Common Gateway Interface (CGI) binaries written in the C programming language... Rasmus rewrote PHP Tools [so the] new model was capable of database interaction and more, providing a framework upon which users could develop simple dynamic web applications..." -- http://php.net/manual/en/history.php.php
> I'm just saying that such comparison isn't really fair.
As presented, it's absolutely fair, otherwise you're comparing a web app framework (PHP) to languages that don't have, for example, HTTP POST handling unless you import those libs.
There are some minor differences (due to Go lacking backreferences with its Regexp package, as far as I can tell) that I don't care to fix. Still, in ~74 lines of code, you get something pretty decent (and no external web server required, either).
Just a quick note: although the PHP version didn't use a microframework (because it didn't need to), you could've used one of the many that already exist. Someone even wrote a clone of Sinatra for PHP called Frank[1], although the code is old now and there are many newer, maintained microframeworks to choose from.
Note you are very nice with the PHP implementation by using a function instead of doing the foreach inside the HTML ;)
May I suggest improving the article by including:
1- an HTML template for each framework (since each seems to have its own syntax)
2- the JS and Go examples (since it's the trend on YC)
3- Use the Framework name instead of the language name
"The bit about dependencies is most interesting to me. In PHP, the problem of dependencies is offloaded to Apache, and by extension the server host. PEAR helps a lot, but even that’s not guaranteed to be available everywhere."
Most modern PHP projects (PHP 5.3+) are using namespaces, autoloaders, and Composer to handle dependencies.
i'm just running the php nurble() function from command line, but it seems you should reference global $nouns within the function and use either "/\W/" (upper case "W") or "/\s/" in the preg_split. am i missing something?
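A quick way to see why the case of the "W" matters, sketched in Python with re.split as a stand-in for preg_split (for plain ASCII text the \w, \W and \s classes mean the same thing in both). The sample string is made up.

```python
import re

text = "hi there"

# Lower-case \w splits ON every word character, so the words themselves are thrown away.
split_on_word_chars = re.split(r"\w", text)

# Upper-case \W (non-word characters) and \s (whitespace) split BETWEEN the words.
split_on_non_word = re.split(r"\W", text)
split_on_whitespace = re.split(r"\s", text)

print(split_on_word_chars)
print(split_on_non_word)
print(split_on_whitespace)
```

With the lower-case pattern all that is left is the separators and a pile of empty strings, so no lookup against the noun list can ever match.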
Any "language/framework comparison app" that does not include user authentication and permissions, sessions, user specific persistent data, https communication, queueing of long lived operations, unit tests, and scaling across multiple instances is worse than useless.
Build three versions of an app with these characteristics using Ruby on Rails, Python, and Node and I pledge I would buy the resulting book for thirty five dollars.
Try KickStarter. I funded this book http://s831.us/105pQB8 and I am glad I did. I would pay at least five dollars up front to fund the writing of such a book.