PDF Killed the Programming Language

It’s a slow Sunday morning so I was going to browse around a new language I’d been hearing rumors of, and maybe send them a little link love if I liked what I saw. However, it seems all their tutorials, manuals, white papers, and almost everything else are in PDF. Yuck. Not worth my time.

They’re complaining that they can’t get any thought leaders to pay attention to them. If they insist on publishing on the Web in a format designed for paper books, it’s no wonder no one has noticed them. Write back when you start noticing this little thing called HTML, guys. I’ve got a feeling it’s going to be big one of these days.

Seriously folks, PDF is wrong for the Web and always has been. Unless you’re distributing fliers to your salespeople that they’re going to take to a print shop and have printed up for their next meeting, there’s really no reason to publish PDF on the Web. Whenever I see a link to a PDF, I know that the author was so enamored of their tool that they didn’t stop to consider the desires or convenience of the reader. They put their needs ahead of their readers.

HTML has been around for almost 20 years now. The time has long since passed for using HTML-incapable tools that can only print to PDF. If you’re going to publish on the Web, then you need to publish in HTML; and that means you need to author in HTML or a format that can be easily converted to HTML. PDF is a sure sign that a site is being put up for the ego-gratification of its authors, and has little interest in communicating with anyone else.

For most documents today, the Web is the most important medium. It is more important to have a good web site that people can read than it is to have attractive dead trees. If a publisher/author does not have the time or budget to manually convert printed formats into HTML for the Web, then they should author in HTML first and print that, rather than authoring in Word and posting it as PDF on the Web. HTML is a much better paper format than PDF is a web format. With a little CSS or XSLT trickery, an HTML-authored book is indistinguishable from one written in Word.

PDF is nothing more than a prepress format in the same family as raw PostScript, dvi files, XSL-FO, and a dozen other technologies designed for 1980s-era Linotypes. It has a niche, but that niche is not the Web. If you want people to read what you write, then publish in HTML, never PDF.

23 Responses to “PDF Killed the Programming Language”

  1. Andrzej Taramina Says:

    I’m not so sure I agree 100% with you this time.

    Sure…html is great on the web. But whenever I want to read something long and involved, I invariably end up printing it rather than reading it on a screen. So for longer documents, PDF makes sense to me.

    Like most things in life, it’s not a black and white issue. Shades of grey rule.

  2. dvholten Says:

    no – don’t buy that. html requires a site-grabber or being always online. i am not aware of any html-website which looks good when printed.
    pdf is good.

  3. John Cowan Says:

    PDF and PostScript are the normal way to publish journal articles on the Web, and journal articles, or would-be articles, are the normal way to distribute new (and even old) information about programming languages. Buck that trend, and you don’t get anywhere in the PL world, so those of us interested in the subject acquire usable PDF readers (not meaning Adobe Reader; in my case, Foxit) and PS readers (GSview for me). It’s really not about arrogant authors, really.

    I also strongly sympathize with Andrzej, and I’m feeling the increasing stress of not having a printer right now (hopefully next month). Nominal 72-dpi screens just don’t cut it for middle-aged eyes, and reading while scrolling horizontally is extremely unnatural. And when you are printing, HTML is really problematic; I don’t know of any browser that DTRT at all.

    Finally, even if you don’t want to give the language in question a precious link, could you at least mention the name, so that those of us who are interested in PLs and are not format bigots could go off to see if there’s any there there?

  4. Torsten Curdt Says:

    Absolutely disagree. What’s so different in using a PDF reader vs a browser? …except that you can give PDF documents a proper layout more easily. Maybe the whole web should turn from HTML to PDF. I know designers would love it. Just because HTML has been around for ages does not make it good. It’s legacy – that’s all it is.

  5. Joost Says:

    So you’re boycotting that language because of the pdfs?
    Regardless of the merits of pdf I think you’re taking it or yourself too seriously.
    Cheers.

  6. Elliotte Rusty Harold Says:

    PDF and PostScript are the normal way to publish journal articles on the web only because journals and authors are just reprinting material intended for paper without reorganizing the data in a more accessible, sensible format. However, within some forward-looking communities the papers are indeed published in HTML. This is the case for most of the refereed work on XML, for example, or at least the only work anyone actually reads. (That is my point: if you care about people reading this stuff you make the effort to write it in HTML. Of course a lot of journal articles exist for no reason other than to pad a publication count prior to tenure and promotion decisions, so I guess it doesn’t really matter if these are only published in PDF.) The PLoS journals are all publishing in HTML (and XML, and PDF, but their main publication format is HTML).

    As to a decent PDF reader, I’ll believe it when I see one. I’ve tried a lot including Acrobat, Preview, Ghostview and some others; and they are all clearly inferior to browsers for onscreen reading. I don’t think this is a result of incompetence on the part of everyone who writes a PDF reader. The bottom line is that the format was never designed for onscreen reading or browsing, and the few hooks that have been added to it for this purpose over the years are insufficient and not actually used by publishers.

    Of course if you want to print and read everything that’s a different story. (Personally I just blow everything up to a larger font. Wouldn’t you know it? That works better in my browser than in PDF. HTML reflows. PDF doesn’t.) However it’s not like it’s all that hard to print HTML. HTML reads on paper much better than PDF reads on the screen. But, yes. PDF is better for print, and if I really want to spend hours reading a book, I might want a printed copy. I’m just browsing along trying to learn something new. A simple tutorial, FAQ list, or white paper doesn’t need PDF.

  7. Todd Ditchendorf Says:

    Elliotte, I agree 100%. Being forced to read PDF on the web makes my blood boil (you know I have high blood ;).

    Surprised to see people disagree.

  8. Ed Davies Says:

    “Personally I just blow everything up to a larger font. Wouldn’t you know it? That works better in my browser than in PDF. HTML reflows. PDF doesn’t.”

    Absolutely. Ctrl-+ is my first reaction to many sites, though ones like this which start with a decent size font typically don’t need it unless I’m, literally, very laid back. The people who really don’t get it are the ones who write HTML which won’t reflow properly – giving the worst of both worlds. Being a “designer” is not the same thing as being a dictator; it’s best if readers can view the contents of a document in whichever way is convenient to them.

    On the other hand, I can’t understand this urge to print stuff. I’ve only used my printer once since last summer – and that was only to try out an old program I wrote in 1990. The colour cartridge was all bunged up so I had to use just black.

    I like the way many W3C documents are published – as a bunch of not-too-long HTML files which you can browse on-line or download in a .zip or .tar.gz. E.g., CSS 2.1. This is good both for quick reference on-line and for off-line reading with reasonable confidence that the two are likely to be the same. (Also, their “design” is pretty minimal but well suited to the job and reflows nicely.)

  9. Ian Phillips Says:

    You *can* get good looking print output from HTML, this article seems pretty well timed for the current discussion:

    http://www.alistapart.com/articles/boom

  10. John Cowan Says:

    Nobody should be allowed to post an opinion to this discussion unless they mention their age. I’m 48, and high-resolution is not a convenience but a necessity. I am a heavy user of both Ctrl-+ (or rather Ctrl-mousewheel) and PDF zooming.

    Anyhow, I don’t think the fact that people who study HTML and XML use HTML and XML to publish in means anything except a laudable desire to eat their own dogfood. The Unicode Consortium, for example, uses a sensible mix of plain text, HTML, and PDF for their documents.

  11. Reg Braithwaite Says:

    I’m also surprised to see push back on this subject.

    First, the academic PL community may like journaled papers, but the programming community shows a strong preference for stuff that works in a browser.

    This can be tested empirically: Simply visit a site like programming.reddit.com. Note how many howls there are when people post links to PDF without warning people in advance. Note the fact that PDF papers need to be incredible, time tested works to get any sort of positive rating.

    Second, what in tarnation is wrong with publishing to both HTML and a printable form? People have been doing this for years, and there is no excuse for restricting your readership to a single, inconvenient format.

    PDF is great for exactly one purpose. HTML is the general-purpose format that does a bunch of things in a mediocre way. But there’s no need to argue which is better: simply provide both when you wish to serve the community that prints out your work.

  12. EntropyFails Says:

    > HTML has been around for almost 20 years now.
    Hmm.. 1993 to 2007… That’s “around 20” in the same way the girl at the party you were hitting on was “around 20”. ;)

    PDFs are a very nice way of getting strange symbols into documents. If you have something technically dense to discuss, the web isn’t always the best choice for making the document easy to write, or read. And the lack of pagination can really distract from understanding the information. Plus PDFs are easy to make from LaTeX and have a wider install base than PostScript. PDF-2-html translation is fairly shoddy at best as well, so you cannot blame people for not wanting to create a crappy version of their hard work. If you care that much, perform the translation yourself.

    Regardless, if you refuse to learn a programming language because you had to read some PDFs to do so, you obviously didn’t care much about the language to begin with. The format is never a barrier to the knowledge hungry.

  13. Chris Coppenbarger Says:

    PDF on the web is rather annoying most of the time, but useful if you want to offer your readers something to download to read later, especially if it’s a long page or file. At least the Acrobat Reader is a memory hog, anyway. What I have done, particularly for articles, is store the information in a database, print it to the web in HTML, and offer both a print button, which prints using the CSS and works pretty well, and a save button, which converts the page into PDF so that the person can read it later from their computer. However, it’s HTML first, with a print.css using media=print. FPDF is a free PHP class for generating PDFs on the fly. Works nicely.

  14. Weirdos Says:

    What’s wrong with using HTML, and adding a link to a PDF for the minority who, apparently, regularly print books for reading while not at their computer?

    HTML can be saved to disk for offline viewing, just like PDFs, and a PDF on the internet can’t be viewed unless you’re online or have saved it to disk.

    Honestly, the idea that the web is just a medium for getting books to a printer before reading is weird.

    And it takes a PDF to do 2 or 3 column output on a screen, thus maximising the amount of scrolling required in order to actually read the damn document. If you really must do that, you need a better reason than “some people might print it out and they’ll really love this format”. (Ok, actually, it can be done in HTML fairly easily, but no one *actually* does it – but every other PDF author seems to delight in doing a magazine-style format.)

  15. Andrew Shebanow Says:

    Cranky Old Man, is that you?

    Let’s boil your argument down, paragraph by paragraph.

    I find PDF inconvenient and annoying, so I won’t read any content published in PDF.
    Anyone who publishes in PDF is a guaranteed failure, since I won’t read what they wrote and therefore no one else will either.
    I don’t need the level of visual fidelity provided by PDF unless I want to print on dead trees. Since I don’t need it, no one else does either. Therefore, anyone who publishes PDF to the web is egotistical and insensitive to the needs of their readers.
    I think HTML/CSS fidelity is “good enough” for all types of content, therefore formats that provide better visual fidelity are unnecessary.
    No one should be allowed to author using tools that don’t output directly to HTML. I don’t, therefore no one else should be able to.
    The Web is the most important medium today. Therefore, all content should be developed in HTML first. Hey, Word-level fidelity is plenty good enough for me, and I can almost reach that level of quality using CSS or XSLT.
    Did I mention that I think PDF sucks?

    Seriously, though: I’m a big fan of your books (I even got a thank you in one of them), but this is the most ridiculous, poorly reasoned rant I’ve ever heard from you. You don’t like reading things in PDF? Fine. But don’t try and generalize your personal preferences and prejudices into some kind of rulebook for the Web.

    Disclaimer: I work for Adobe, but my opinions are my own.

  16. Andrew Shebanow Says:

    Sorry for the duplication in there, the “Edit in Textmate” command did something weird to my text.

  17. Bruce Stephens Says:

    What really sucks is that I can’t bookmark PDF pages. The reader (finally!) remembers where I was in a document, but that’s about it; no way (no way that’s obvious to me, anyway) to remember that some interesting bit is on page 271 of some specific PDF file.

  18. Warren Henning Says:

    If the language in question was Scala, I can point you to HTML stuff.

  19. niggel Says:

    “If you want people to read what you write, then publish in HTML, never PDF.”

    Sorry, you are as thick as the people you are accusing.
    Just because *you* don’t like PDF, that’s not a reason to rant like this, especially making it sound as if you are speaking of facts.

    So you were trying to learn a new language? That’s not a page or two… so not a “FAQ or white paper” but, I imagine, a more complicated matter… the kind you might print to read later.
    I personally don’t care about PDF (which is de facto a multiplatform web standard) or HTML (more complicated to store and read offline if more than one page)
    and you gave up and declared the death of a “programming language” because you were too lazy to read a PDF… tsk tsk…

    The moral of this story as it will remain on the web forever?:

    metalab.unc.edu has some funny people working in it.

  20. rambler Says:

    I’m only 30-odds, but I still like to read printed PDF/PS — immensely more convenient than HTML, especially when you have to go back every so often, so I’m not buying one bit of this rant.

    Anyway, I did wade through all the stuff here only with the hope to find out what PL you were looking at — so could you please be so kind and give us all a hint? (Or do you really take yourself so seriously as to think that the language is indeed dead, once you declared it so for the lack of HTML docs? :-)))))

  21. Alexander Fairley Says:

    MathML is not supported well.
    Most things actually worth reading involve some mathematical notation.
    .
    . . PDF is fine by me.

  22. john melesky Says:

    “HTML has been around for almost 20 years now.”

    Not only is that questionably true, it’s also irrelevant: PDF has been around for almost exactly as long as HTML (PDF 1.0 was put out in 1993, HTML 1.0 also in 1993). Those who publish in PDF are not using some archaic format that was rendered irrelevant by the web — they’re using a format that is just as modern as the web.

  23. Labnotes » Rounded Corners - 110 Says:

    […] The PDF diet. Elliotte Rusty Harold complains, and I wholeheartedly agree: “Whenever I see a link to a PDF, I know that the author was so enamored of their tool that they didn’t stop to consider the desires or convenience of the reader. They put their needs ahead of their readers.” […]
