Why Markdown Is Not My Favourite Language (wilfred.me.uk)
139 points by rachbelaid on April 26, 2013 | 84 comments


A lot of the author's reasons have to do with lack of a spec. This topic was brought up by Jeff Atwood (creator of StackOverflow) last year -- http://www.codinghorror.com/blog/2012/10/the-future-of-markd...

I had to build some markdown functionality into a project the other day, and so I googled around to see if there has been any update on this. Didn't find anything official, but a few random comments and blog posts here and there seem to indicate that StackOverflow, Github, and Reddit are working together to come up with a spec.

(Apparently John Gruber has disavowed any responsibility for leading the community in this regard, so now it's up to the big users of markdown to figure something out on their own).


I was with him until he recommended one based on Wiki markup. Wiki markup is pretty universally reviled for good reasons.

He rightly complained about the heavyweight notation required for a simple link in Markdown, but everything is that bad or worse in most Wiki markup.

Really, the complexity of the markup you need depends on your application and any attempt to make a "one size fits all" markup is going to fail. The needs of a simple commenting system are vastly different from a user-editable CMS that doesn't want to give access to full HTML, but wants robust formatting options.


I am really happy to see this article. I was just talking to a friend about how much I dislike Markdown and how it has become the de facto markup.

Can you explain to me why wiki markup is universally reviled?

I find Creole to be great text markup. It feels really natural to use and it expresses the intended meaning whether it's rendered using HTML or displayed in plain text. I really like this statement by Christoph Sauer on the Creole website: http://wikicreole.org/wiki/ChristophSauer

I agree that Creole or any other formatting markup is not going to solve everyone's problems. However, I think that whenever Markdown is a good choice, Creole is a better one.


That statement is nothing special, both Markdown and asciidoc claim to be a formalisation of the plain text conventions you already use. IMHO most of the alternatives mentioned in this comment thread understand their goals and the design space well enough, and many are definitive improvements over Creole's MediaWiki-inspired, committee-designed syntax. They also win in featurefulness (Markdown and Textile have html embedding, asciidoc and ReST have extension mechanisms), which is important for a wiki.


For me, the reason is that Creole (and formats like it), feels like markup. Writing Markdown feels like writing text.

I was used to most of the conventions from Markdown mostly as-is from writing plain text for a couple of decades before I first saw Markdown.

Whenever I come across Creole-style formats, I cringe - it means I need to think about the markup instead of thinking about just writing.


Markdown is basically another flavor of wiki markup. It just happens to be the most popular iteration at the moment.

I'm not sure how you can say wiki markup is universally reviled without lumping markdown into that statement. You must have a particular flavor of wiki markup in mind.


Wiki markup is recognisable by link conventions [[like this internal link]] and [[like this|http://external/link]] (MediaWiki and Creole; earlier wikis relied solely on CamelCase). Markdown's design is inspired by conventions used in plain-text email, with links [like this][1].

[1] http://site/


That is not how I write plain-text email, and I don't think most people do either.

I write links in plain-text emails as: www.example.com/awesome-video-of-kittens.jsp?referrer=whatever

Note in particular that I don't put http in front, nor do I provide an alternative title.


I don't get it, do you type the URLs by hand, or do you manually cut the http:// from what the browser gives you? Both options seem odd to me.


I didn't notice the browser adds the http.

Anyway it should still understand links without http because if I am typing the link out by hand (say because it is a site I remember) then I would still want it to be a link.


> I didn't notice the browser adds the http.

This is baffling to me. You are not new to the Internet, but until quite recently, every single browser always showed the protocol in the address bar. Only of late did Google Chrome and then e.g. Mobile Safari remove “http://” from the display to save space; and yes, naturally they include this string when copy-pasting, as doing otherwise would result in an incompletely pasted URI (protocols are not optional, except by current convention in non-hypertextual media).


> Markdown's design is inspired by conventions used in plain-text email, with links [like this][1].

And [title](http://url.goes.here) for in-line URLs.


Of course. Either way, it's close to the plain text convention, though one wouldn't bracket the link-name part.


> Markdown is basically another flavor of wiki markup.

One clear advantage of markdown is that its markup is designed to be as unobtrusive as possible to the point that its text/plain rendering is its own source. That's clearly not the case of MediaWiki or even Textile.


    Stack Overflow supports an alternative syntax for
    this use case:
    
      Our primary landing page is <http://www.example.com>
   
    Again, this is not knowledge that the user can take
    with them to other Markdown based sites.
Actually, that is part of the markdown spec. See: http://daringfireball.net/projects/markdown/syntax#autolink


The fact that the other ambiguity references a stackoverflow article suggests that the author may not have actually looked at the spec.

TBH the article reads like an elaborate advertisement for Creole.


The material you quote is confusing, but what Stack Overflow and GitHub do that Markdown does not define is allowing auto-linking of URLs without the angle brackets. That is, http://ajh.us/ on SO or GH is the same as <http://ajh.us/> on any Markdown-based system.


Oh, he didn't mention half the stuff I thought he would. He's right about a lot of things though.

Take this for example:

    Hello world, there isn't
    1. There are 4.
The original markdown treats the second line as a continuation of a line in a single paragraph. Github treats the second line as the start of an ordered list. Why? Because then you can do things like this:

    # My list
    1. foo
    ...
I've been so torn about how to implement so many of these things. I've had a lot of people ask me about that particular issue, and when I tell them the original markdown behavior requires 2 newlines after a line before starting a list, it was the last thing they expected.

The main problem with markdown is not the fact that these counterintuitive things exist, but the fact that they're not consistently implemented. I've spent nearly 2 years trying to reconcile all these differences. See https://github.com/chjj/marked for more crazy markdown nonsense in the test suite.


Next time people ask, just explain that markdown always reflows consecutive lines.

GitHub's change (for comment fields only) isn't list specific, they just never reflow. They document it here:

https://help.github.com/articles/github-flavored-markdown#ne...


Having newlines=line-breaks does not necessitate what I mentioned above. This is actually a common issue across multiple markdown engines, including ones that do not implement github's newline behavior. I suppose you can think of it that way logically, but I don't consider this a part of GFM since it is actually so easily mistakenly implemented. It is list-specific.

If I did agree with you on this, I could edit my post and change "github" to any of a dozen markdown implementations that supposedly implement markdown accurately and it would be just as true.

But, like I said, GFM could obey the markdown spec with respect to the list and still obey its own spec.

To explain this concept:

Why doesn't

    Four
    + Five
    + Six
Simply yield

    <p>Four<br>+ Five<br>+ Six</p>
And instead yields

    <p>Four</p><ul><li>Five</li><li>Six</li></ul>
It mistakenly thinks that line is a list - that is the bug, not the newline behavior. You could have both behaviors at the same time without conflict. It could return the first output and still be true to the newline part of the GFM spec as well as the original markdown. Instead, it chooses to do something ridiculous.
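
A minimal Python sketch of the behavior argued for above: newlines become `<br>`, but a `+` at the start of a later line does not interrupt a paragraph. The `render_block` helper and its simplified rule (a block is a list only if every line is a list item) are assumptions for illustration, not how marked or any other engine is implemented.

```python
import html
import re

# Hypothetical helper for illustration; not the API of any real Markdown engine.
LIST_ITEM = re.compile(r"^[+*-]\s+(.*)$")

def render_block(block: str) -> str:
    """Render one blank-line-delimited block with hard line breaks."""
    lines = block.strip("\n").split("\n")
    matches = [LIST_ITEM.match(line) for line in lines]
    # Simplification: the block is a list only if every line is a list item.
    if all(matches):
        items = "".join(f"<li>{html.escape(m.group(1))}</li>" for m in matches)
        return f"<ul>{items}</ul>"
    # Otherwise it is a paragraph, and a leading "+" on a later line is literal text.
    return "<p>" + "<br>".join(html.escape(line) for line in lines) + "</p>"

print(render_block("Four\n+ Five\n+ Six"))
print(render_block("+ Five\n+ Six"))
```

With this rule the first call keeps everything in one paragraph, and only the second call, where the list starts the block, produces a `<ul>`.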


The spec deviation for newlines could be done in multiple ways, but the implementations you mention all choose to trigger block constructs (which includes lists, and also headings) on any start of line. It's because the syntax tweak is intended to avoid confusing people with context-dependent syntax (syntax appears not to work if there wasn't an empty line before), not to give an alternative way to generate <pre>.


My favorite text based markup language is asciidoc [1] because I think it has the best looking source text syntax [2]. The goal is to make it completely readable in plaintext and not super markup-y. If someone were to "less README.asciidoc" they should not even notice that it's markup--it's just a nicely formatted text file.

That said, it's really only made for documentation and is based on the Docbook XML toolchain which makes it an absolute bear to use.

And like all of these text-y markup languages, it has corner cases that don't work well.

[1] http://asciidoc.org/

[2] http://asciidoc.org/index.txt


  > The goal is to make it completely readable in plaintext and not super markup-y.
If that was the goal, then I think it failed. I have used it in the past, and it has more weird rules and syntax than even reST. It is powerful though, I will grant it that.


Collapsing single newlines is a love-it-or-hate-it feature. Personally I like it because it allows me to place one sentence (or clause) per line, which makes version control much easier. But a lot of people seem to hate it with a passion, and I can also understand their frustration because it just isn't intuitive to anyone who doesn't already know how HTML treats whitespace.

Things that actually do annoy me from time to time:

- In the default implementation, an underscore in the middle of a word causes the rest of the word, and everything until the next underscore, to be italic. Fortunately, most of the actual implementations have fixed this, but that ain't standard. I also don't think Reddit has fixed it yet.

- Using one asterisk OR one underscore for <em> and two asterisks OR two underscores for <strong> feels redundant. There is no need for two different syntaxes to achieve the same effect. I would prefer one underscore for <em> and one asterisk for <strong> (email clients have been doing this for plain-text emails for a long time), but maybe that's just me.

- AFAIK there is no built-in syntax for underline, strikethrough, superscript, and subscript. Maybe we should repurpose some symbols to handle such cases.
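
The intra-word underscore fix mentioned in the first item can be sketched in Python roughly as follows. This is an assumed rule, not the one any particular implementation uses: an underscore only opens or closes emphasis when it is not glued to a word character on its outer side. It handles single underscores only and does no HTML escaping.

```python
import re

# Assumed rule for illustration: "_" counts as emphasis only at a word boundary.
EMPHASIS = re.compile(r"(?<!\w)_(?=\S)(.+?)(?<=\S)_(?!\w)")

def render_emphasis(text: str) -> str:
    """Wrap _emphasized_ spans in <em>, leaving snake_case words alone."""
    return EMPHASIS.sub(r"<em>\1</em>", text)

print(render_emphasis("call my_function_name now"))
print(render_emphasis("this is _really_ odd"))
```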

On the other hand, there are things the article complains about that I absolutely don't think need to be "fixed":

- HTML filtering/escaping. Seriously? There are plenty of excellent HTML filters in every language. No need to reinvent the wheel. Just pipe the stdout of your Markdown parser to the stdin of your HTML filtering library. In addition, some of us use Markdown to write our own blogs, where we have every right to insert an occasional <script> or <iframe>. You know, like embedding a YouTube video. Different people should have different rights to use one or another HTML feature, and managing such rights should be the job of your app, not your Markdown parser.

- Any sort of Wiki markup, like Creole. The distinguishing feature of a Wiki markup is that it makes it easy for users to cross-reference documents. But in order to cross-reference documents, we need to agree upon some sort of document organization system. But given the very diverse situations in which Markdown is used, is it even feasible to agree upon a single such system? It would seem that the only cross-reference mechanism that is compatible with all the things we use Markdown for is the good old URL, and Markdown can already handle URLs pretty well. Converting URLs to something that references another place within your app should be the job of a pre-processor/post-processor. Don't be afraid to write one.


My biggest issue with markdown involves links. Parentheses are allowed in URLs and those confuse most markdown parsers:

http://blog.nig.gl/post/48802013022/although-parentheses-are...


I don't know if this is standard or not but there is a way of dealing with that on reddit.

[test](http://msdn.microsoft.com/en-us/library/dd904817\(v=office.1...

That'll work fine, just insert a backslash before the character and it'll be treated purely as a character instead of a formatting marker.


escaping makes parsing too complex. i'd rather just be forced to use %29, which is the closing parenthesis URL encoded.
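
A small Python sketch of the percent-encoding approach, assuming a hypothetical `markdown_link` helper and a made-up example.com URL. It encodes only the parentheses in the URL and leaves the link text untouched.

```python
def markdown_link(text: str, url: str) -> str:
    """Build an inline Markdown link whose URL cannot close the (...) early."""
    # "(" is %28 and ")" is %29 when percent-encoded.
    safe_url = url.replace("(", "%28").replace(")", "%29")
    return f"[{text}]({safe_url})"

print(markdown_link("test", "http://example.com/a_(b)"))
```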


Consecutive blockquotes also tend to break for me, leaving me to write all of them in HTML, which is pretty silly.


It sounds like Creole is the same as Markdown, in the sense that it's "trying to find a balance between different languages." It seems that it's not a standard that everyone supports, it's a common denominator that eventually other people will add more features to.

Also, he says that StackOverflow supports an "alternative" link style such as <http://www.google.com>, but I'm pretty sure that's a standard. http://daringfireball.net/projects/markdown/syntax#autolink


I assume a lot of implementers ignore that autolink syntax because it's much easier to just escape all less-than and greater-than signs to prevent any HTML.


Who ignores the <link> syntax?

The html is generally handled like this:

parse the Markdown syntax and render as html, then parse, whitelist, and render the html.


It's been sad to watch Markdown eclipse Textile http://txstyle.org in popularity. Textile provides a more complete mapping to HTML than Markdown, while remaining easy to use.

Its primary flaw is translating line breaks into HTML line breaks. Per the article's criticisms, it too lacks a formal spec and supports embedded HTML.


I like Textile too, it has the cleanest "link syntax":http://txstyle.org/doc/12/links. It's supported in Redmine and ikiwiki.


Textile does tables too.

http://redcloth.org/hobix.com/textile/

For styled web content, I prefer Textile. For beginners who don't need to CSS style content inline, Markdown is better.


I think its primary flaw is the link syntax.

As far as I remember, github's wiki pages used to be in Textile format. I might be mistaken here, but I remember being annoyed all the time by the link syntax.


My beef with Markdown (and Textile and any of the other current markup formats) is that it is geared towards technical people.

Non-techies have problems remembering the format, and with typing it correctly, and many will never realize (or bother to look into) that there is a format, even if you provide helper toolbars and prominent links to help pages.

One of our current sites uses Textile because it started up long before Markdown existed. Textile is much worse than Markdown. For example, a single starting space means <pre>. And the link syntax is awful.

People understand "*" and "_" very well. Bullets are fairly intuitive. Everything else requires looking at the format code help sheet.

My suggestion for a better link syntax is this: Anchor in brackets before or after the link. So this:

    [Some link] http://example.com/
and this:

    http://example.com/ [Some link]
would both become a link with "Some link" as the text. If there are brackets-text both before and after, choose the one after.

This is unambiguous for almost all kinds of text, and anyone who is oblivious to the syntax is unlikely to stumble into it by mistake. And it's quite easy to remember, especially since the order is unimportant.

You don't need much more than that, because people who really care about formatting are also capable of learning HTML. I see plenty of non-tech people -- many of them writers and journalists, but also non-professionals -- who have learned the basics of HTML because they are really punctilious about formatting and the aesthetics of text.
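
The bracket-before-or-after link syntax proposed above could be sketched in Python like this. The `render_links` name is hypothetical, and two details are assumptions the proposal does not settle: when both labels are present, the unused one before the URL is kept as literal text, and any trailing punctuation is treated as part of the URL. Escaping of the surrounding text is left out.

```python
import html
import re

LINK = re.compile(
    r"(?:\[([^\]\n]+)\][ \t]+)?"   # optional [label] before the URL
    r"(https?://[^\s\[\]]+)"       # the URL itself
    r"(?:[ \t]+\[([^\]\n]+)\])?"   # optional [label] after the URL
)

def render_links(text: str) -> str:
    """Turn '[label] URL' or 'URL [label]' into an HTML link."""
    def replace(match: re.Match) -> str:
        before, url, after = match.groups()
        # If both labels exist, the one after the URL wins.
        label = after or before or url
        # Assumption: the unused leading label stays in the text as-is.
        prefix = f"[{before}] " if before and after else ""
        return f'{prefix}<a href="{html.escape(url)}">{html.escape(label)}</a>'
    return LINK.sub(replace, text)

print(render_links("[Some link] http://example.com/"))
print(render_links("http://example.com/ [Some link]"))
```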


Interestingly, he mentions the line break problem, and later recommends Creole, which also has the line break problem.


I personally like the pandoc or the gitit version which also has a working title/authors/categories syntax.

I use that to write nearly everything which I can easily convert to latex/html/whatever which is especially nice with different templates.


As to the security concerns, I strongly recommend passing the results of Markdown compilation into a separate HTML sanitizer library. Preferably one based on whitelisting. You should do this even if your Markdown compiler claims to protect against malicious HTML. The way I see it, a dedicated HTML sanitizer library is less likely to let something through than a Markdown compiler that offers sanitization as an afterthought.

For Ruby, I like Sanitize (https://github.com/rgrove/sanitize). I'm sure you can find something similar for any major language.
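
To make the whitelisting idea concrete, here is a rough Python sketch built only on the standard library's `html.parser`. It is not the Sanitize API and not production-grade; the tag list, the attribute list and the `sanitize` function name are all assumptions for illustration. In practice you would feed it the HTML that your Markdown compiler produced, or better, use a maintained sanitizer library.

```python
from html import escape
from html.parser import HTMLParser

# Illustrative whitelist; a real one depends on the application.
ALLOWED_TAGS = {"p", "br", "em", "strong", "ul", "ol", "li", "a",
                "code", "pre", "blockquote"}
ALLOWED_ATTRS = {"a": {"href"}}          # only URL-valued attributes here
SAFE_SCHEMES = ("http://", "https://", "mailto:")
VOID_TAGS = {"br"}

class WhitelistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # drop the tag itself; its text content is still kept
        kept = []
        for name, value in attrs:
            allowed = name in ALLOWED_ATTRS.get(tag, ())
            if allowed and value and value.strip().lower().startswith(SAFE_SCHEMES):
                kept.append(f' {name}="{escape(value)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if not self.skip_depth and tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data))

def sanitize(html_text: str) -> str:
    """Return html_text with everything outside the whitelist removed."""
    parser = WhitelistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)

print(sanitize('<p>Hi <script>alert(1)</script>'
               '<a href="javascript:x" onclick="y">z</a></p>'))
```

The point of the design is that anything not explicitly listed is dropped, so a construct the Markdown compiler failed to anticipate does not get through by default.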


Another alternative not previously mentioned is MakeDoc by Carl Sassenrath (of AmigaOS & Rebol fame) - http://en.wikipedia.org/wiki/MakeDoc


The link syntax is verbose, and there is no way to do inline links. The official page was last updated in 2009.


re: 2009 - Probably because makedoc2 isn't going to change. Its replacement (makedoc3) seems to have hibernated but it does show up in some places - http://www.rebol.com/r3/docs/markup.html

re: inline links - No, it's not currently part of the makedoc2 spec or implementation. However, inline links are present in make-doc-pro (http://www.robertmuench.ch/development/projects/mdp/), an implementation that includes some extensions to the makedoc2 spec.

Here's an example with inline url in make-doc-pro:

  Click on =url http://news.ycombinator.org Hacker News= to get latest tech info


I have never met a markup language with an intuitive way to make links. I wish they would all just support a href HTML and convert it internally to whatever their weird option is.


Markdown doesn't convert it backwards for you, but a big feature is that it passes html through.

The linked article is a little wrong though, the "Stack Overflow" alternative link syntax is a mainline feature.


This always bugged me too. When I had the occasion to create my own, I just did it this way:

Any link like http://example.com/ is converted into a link.

http://example.com|Anchor text| is converted into <a href="http://example.com">Anchor text</a>

I thought that was as intuitive as I could make it.

When using markdown, I can never remember whether it's [http://www.example.com](anchor text) or (http://example.com)[anchor text]


My suggestion: Anchor in brackets before or after the link. So this:

    [Some link] http://example.com/
and this:

    http://example.com/ [Some link]
would both become a link with "Some link" as the text. If there are brackets-text both before and after, choose the one after.

This is unambiguous for almost all kinds of text, and anyone who is oblivious to the syntax is unlikely to stumble into it by mistake.


TWiki's isn't horrible. [[URL][text]] (or just leave out the text part)


I still inevitably have to look up the syntax, like which comes first, URL or text.


nice to see creole being compatible with org-mode tables and lists. most of my notes are in org-mode and i always wondered how to make everything markdown compatible; now i just switch to creole instead of being religious about markdown. good to see a ruby converter as well.


If github tables supported '+' as a field delimiter in the horizontal dividing line, in addition to '|' as they support currently, then github would support org-mode tables. It sounds like a small change. github, if you're listening -- any chance?


Markdown's embedded html is a strength, especially for something long-form and structured like a wiki. Of course it requires whitelisting, but that's not terribly hard to do (with the understanding that a markdown parser can be much more anal about bad syntax than a browser). The other wiki syntaxes either allow an html subset (MediaWiki[1]) or aren't used as is (plain creole is about as expressive as markdown-without-html). ReST on the other hand is extensible from the start, but the extensions are specific to the build context and the rendering target (mostly html, epub and latex).

[1] https://meta.wikimedia.org/wiki/Help:HTML_in_wikitext


Markdown is in the same bucket as make, bash, and C. While many people today (thanks to the benefit of hindsight) could make an improvement over any of those, it's virtually impossible to make a big enough improvement to outweigh the cost of giving up ubiquity.

Already being installed on the machines and minds of millions of users is incredibly valuable for a system. It's fantastically hard to compete with that, which means that many successful systems end up being a local maximum.

You could make a new system that's strictly better if you could get everyone to use it. But before everyone is using it, it's not better enough to get them to switch.


I don't think markdown is anywhere close to this.


Why are we even talking about wiki markup in the year 2013? I'm including all the variants, including markdown and creole and MediaWiki syntax.

Wiki markup exists to bridge the gap between textareas and fully formatted pages. It was a great idea in 2001-2004 or so. We have decent HTML editors now.

Wiki markup makes it difficult for non-geeks to contribute. Wiki markup is non-WYSIWYG, slowing every edit down with some 'parse' or 'compile' cycle. At some point you will have to allow raw HTML and JavaScript to be embedded within it, and then you have the various nightmares that ensue from context-dependent errors in parsing, output, or even security.

There are a few benefits that markup provides over a decent HTML editor - certain changes can be more semantic, and it's easier to diff. As for semantic changes, I would suggest making up your own tags and attributes, as HTML has always allowed, and expanding them as needed to the desired target formats. As for diffs, this is harder, but some focused attention to the problem could solve it. And diffs have their own usability issues anyway; they are barely readable for computer languages. We probably need a better paradigm for tracking changes to human-readable documents.


HTML is good for layout, but bad for formatting. Markdown et al. are good for formatting, but useless for layout. WYSIWYG-giness is not a part of the problem; you can have a WYSIWYG editor for markdown.

> At some point you will have to allow raw HTML and JavaScript to be embedded within it

{{citation needed}}

> There are a few benefits that markup provides over a decent HTML editor - certain changes can be more semantic, and it's easier to diff

You forget about indexing and sending plaintext emails. You'll have to do something about the soup of tags surrounding typical WYSIWYG-generated HTML.

Besides, you don't need to think about sanitization and malicious inputs: markdown output tends to be mostly harmless. You'll have to disable inline html and custom element classes, though, otherwise it might break your page layout.


i have created a light-markup format called "z.m.l.", which is short for "zen markup language".

i started a decade ago on project gutenberg e-texts, so i've had a lot of time to evaluate my decisions, and to tighten up both the format and my converter, which i have ported to several different languages, most recently javascript, where it runs quite crisply.

i've also coded a number of authoring-tools, including online sites, and offline apps which are cross-platform. some of these authoring-tools have a pedagogical bent, making the z.m.l. format easy to learn, especially since it was specially formulated to be simple to grok quickly. (my converter has no complicated regex black-box magic.) and all of my programs have an instant preview facility.

z.m.l. puts its focus on long-form documents, like books, so output-formats include .epub and .mobi, not just .html. in addition, viewer-apps with high-powered functionalities have been created, again online and cross-platform offline.

for an advance preview, send a tweet to me, @bbirdiman...

-bowerbird


You should just put a preview here.


i will be taking it wide very soon. awaiting some kickstarter decisions.

-bowerbird


MultiMarkdown [1] is an adaptation of Markdown that claims to solve the sticky bits and adds additional goodies. The latest version is forked from peg-markdown.

[1] https://github.com/fletcher/peg-multimarkdown


I like reStructuredText since it is very readable in plain text form, which I think Markdown sources are very often not (and it bugs me).

Just like the other markups, there are nice tools like rst2man, rst2pdf etc, and you can use it for making both slides and reports.


Multi-paragraph lists are exactly the same problem I'm having with Markdown. It's so annoying that I have to dig in the HTML-code afterwards and fix it manually.


As far as I know I don't think it has nested lists either.


Of course it does. Just indent by four spaces:

  1. Foo
      1. Alice
      2. Bob
  2. Bar


I'm pretty sure that's a third-party enhancement of markdown (maybe GitHub Flavored Markdown). Some markdown editors do not support this. See the official markdown guide; there is no entry on nested lists. http://daringfireball.net/projects/markdown/syntax#list



I've always struggled to find what is supported in a HN text box(MD, reST or whatever markup language) and what is not and somehow that YC or HN post/official page has eluded me till now!


This thread isn't about HN, but you'll find this https://news.ycombinator.com/formatdoc linked when you try to edit your comments.


Markdown is flawed, no doubt, but the brilliance of it is the resemblance of markdown text to nicely formatted ASCII vs. the suggested alternatives.

I agree that markdown's flaws are significant, but the suggested alternatives don't address them either (except turning off or carefully dealing with inline HTML).


Fragmentation is a bigger problem than any other. Markdown has its warts, as do all the alternatives (and IMO Creole is far worse), but it has enough traction that users will have a decent chance of knowing how to do what they want, which means it's the best choice.


org-mode blows all these terrible pseudo-markup syntaxes out of the water.


Org is my favorite markup too (mostly because of emacs' org-mode). I certainly hope it picks up adoption in the future. One downside, however, is that its syntax for verbatim paragraphs is a bit awkward (it requires a colon at the start of each line).


And if people browsed the web with w3m, they could even use it easily in input fields such as this one :)



Not all of us have memorized all the XKCD numbers->comic title yet, so if your comment is nothing more than a link, could you at least include the title?


Anyone know what happened to the effort behind https://github.com/markdown


My ideal markup language would be something like markdown but with a nicer image syntax and support for latex style math equations.


Pandoc's version of markdown has both of those (along with lots of other extensions).

http://johnmacfarlane.net/pandoc/demo/example9/pandocs-markd...


org-mode also does it.


The problem with code blocks following lists makes the choice of markdown for literate CoffeeScript particularly frustrating.


Why was reST not supported and contributed to by the community, BTW?

I saw a comparison here[1] and good that GitHub supports[2] other markups too. I don't think there's much of a need to move to Markdown other than just for the sake of using something else.

I remember reading Jeff's post The Future of Markdown[3] a few months ago and then what ensued was really frustrating [4].

[1]http://www.unexpected-vortices.com/doc-notes/markdown-and-re...

[2]https://github.com/github/markup

[3]http://www.codinghorror.com/blog/2012/10/the-future-of-markd...

[4]https://twitter.com/gruber/status/262287246953164800


Even with the issues of markdown, I still prefer it over restructured text. For some reason I have never liked reST. I use Sphinx[1] for documentation, and while I love it as a tool, I still dislike reST.

I would love it if Sphinx (or if there was a decent sphinx alternative) supported creole or markdown, but it is quite tightly tied to reST.

[1]: sphinx-doc.org


I feel exactly the same. For whatever reason, it seems like at every point where markdown and reST diverge, the way that I naturally want to write lines up with markdown (so I rarely have to look at the docs) and reST does something completely unintuitive (to me). So writing reST just feels like it's fighting me on every little thing while markdown almost always Just Works (for me). Sphinx and ReadTheDocs are great, but I dread having to write reST to use them.


The problem is that there is nothing that can replace rst currently. The reason Sphinx uses rst is because it's incredibly expressive and extensible. This is incredibly important if you want to write good documentation.

For instance you can use `.. versionadded:: 0.2` to indicate that something was added in a specific version. The builder then renders a nice and consistent block that can be styled in whatever way necessary. You can use :ref:`bar` to reference something in Sphinx, :kbd:`alt + k` to indicate a key sequence etc.

We could not have used Markdown for this without making a new dialect of Markdown. Also unlike rst Markdown is very ambiguous and restrictive. There are certain elements you can't use below others. That very, very rarely applies to rst. For instance you can without a problem have code blocks in tables or code lists in tables etc.


Ha. Good to hear I am not the only one that feels that way!



