Stuff Yaron Finds Interesting

Technology, Politics, Food, Finance, etc.

Making LyX Produce Decent Hyperlinked HTML & PDF Files

As someone who writes for a living I have used a whole slew of word processors and generally haven't been all that happy with any of them. When I write I want to focus on the content, not the presentation. So the whole WYSIWYG generation of word processors left me cold. In fact, I spend most of my time in outline mode in Word. LyX takes a different approach. It focuses on What You See Is What You Mean (WYSIWYM). In other words, it doesn't worry about formatting, just content.

When I first reviewed LyX a year ago I decided it was good enough to use (and have done so regularly since) but still painful. With the 1.5.1 release I can revise that review to say that LyX is only mildly irritating to use, but its good points are so numerous that it's more than worth the pain if you need to deal with large documents, math formulas or large bibliographies. And thanks to the efforts of folks like Dr. Richard G. Heck (who has my undying gratitude for fixing HTML generation in LyX), LyX is substantially better at generating HTML.

LyX does have a learning curve and one is well advised to at least read the tutorial (Help->Tutorial). But I believe that the modern versions of LyX have vastly improved over their predecessors and the learning curve is well worth the effort.

1 Starting a New Document

To start a new document I begin by using File->New From Template and then pointing at the locally saved version of my template file. This file sets a few environmental variables that enable URLs and set a few other things the way I like.

2 Math

TeX, the underlying technology for LyX, was originally invented in order to produce beautiful mathematical equations. So, as one would expect, LyX comes with a fairly nifty equation editor. For example, here's an equation I pulled out of my book on retirement planning:

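\[ (1 + \mathit{IBond})^{N} - ((1 + \mathit{IBond})^{N} - 1) * T = (1 + (\mathit{IBondsEquiv} * (1 - \mathit{MarginalTax})))^{N} \]

\[ 1 + (\mathit{IBondsEquiv} * (1 - \mathit{MarginalTax})) = ((1 + \mathit{IBond})^{N} - ((1 + \mathit{IBond})^{N} - 1) * T)^{\frac{1}{N}} \]

\[ \mathit{IBondsEquiv} = \frac{((1 + \mathit{IBond})^{N} - ((1 + \mathit{IBond})^{N} - 1) * T)^{\frac{1}{N}} - 1}{1 - \mathit{MarginalTax}} \]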

What's especially nice is that I can enter the TeX formula commands directly into the dialog and LyX will translate the command on the fly to the right graphical representation. But there is also a graphical formula palette available by right clicking on an equation. The only trick I've had to remember is to press command-return when I want to enter multiple equations at one time.

3 Bibliographies

One of the more unpleasant tasks in writing a paper with lots of references is keeping track of those references, linking to citations and making sure the bibliography is properly formatted and ordered. Thankfully there is a great standard called BibTeX that handles a lot of the drudgery. The first step to using BibTeX is to get a bibliography manager program which can keep track of all the bibliographic entries. For OS X there's BibDesk[Bib(2006)]. It's pretty nifty but I did find perusing its manual worthwhile, especially regarding its AutoFile feature. With AutoFile, BibDesk is designed to keep track not only of reference information but also of the underlying documents being referenced.

To get BibTeX references working I double click on the "BibTeX Generated Bibliography" icon at the bottom of the new file created from the template and navigate to the .bib file generated by BibDesk. To add actual citations I use Insert->Citation, which lists all the citations available in the BibTeX file. It's worth pointing out that BibDesk only updates the BibTeX file when the File->Save command in BibDesk is used.

One of the issues I've noodled on for a while is how to get decent bibliography entries for web pages. BibDesk supports URL, Electronic and Webpage styles but I find that they either don't have enough information or require too much information to properly fill out. I now find myself mostly using the "proceedings" style where I put the web page name as the title, a year and then enter the URL. I find this publishes quite nicely.
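For illustration, here is roughly what such an entry could look like, wrapped in a minimal LaTeX document so it can be compiled on its own. This is only a sketch: the file name webrefs.bib and the cite key BibDesk are made up, and the entry is my guess at a "proceedings" record carrying just a title, a year and a URL.

```latex
% Hypothetical file name and cite key; only the title/year/url layout
% follows the approach described above.
\begin{filecontents}{webrefs.bib}
@proceedings{BibDesk,
  title = {BibDesk},
  year  = {2006},
  url   = {http://bibdesk.sourceforge.net/}
}
\end{filecontents}
\documentclass{article}
\usepackage{natbib}   % plainnat is written for natbib's citation commands
\usepackage{hyperref} % makes the URL in the bibliography a live link
\begin{document}
BibDesk \citep{BibDesk} keeps track of the entries.

\bibliographystyle{plainnat}
\bibliography{webrefs}
\end{document}
```

Compiling it takes the usual latex, bibtex, latex, latex sequence so that the citation and the bibliography both resolve.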

4 URLs

A real weakness in LyX is its URL handling. There is a special URL dialog but unfortunately it can only produce URLs of the form text /lyx. That is, some text followed by a URL. To get around this I use Evil Red Text (ERT). I create an ERT by pressing the TeX button and then entering information of the form \href{http://www.goland.org}{this URL} which is the command I used to generate this URL.
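As a rough sketch of what that ERT turns into once it reaches LaTeX, the minimal document below puts the same \href command next to a plain \url. It assumes the hyperref package is loaded in the preamble (the template described in the appendix does this); the sentences around the links are made up for illustration.

```latex
\documentclass{article}
\usepackage{hyperref} % provides \href and \url
\begin{document}
% \url prints the address itself as the link text.
The site lives at \url{http://www.goland.org}.

% \href{target}{text} hides the address behind arbitrary link text.
See \href{http://www.goland.org}{this URL} for more.
\end{document}
```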

5 Adding Cross References

TeX supports a very simple type of cross referencing using \ref and \label that allows one to reference anything that has been labeled. These references, especially if set to 'reference', will work in HTML, but they always point at the section nearest the label rather than at the labeled text itself. This is fine for printed documents but less than optimal for HTML.

One horrendously painful workaround is to use the \hypertarget{Label}{Text} and \hyperlink{Label}{Text} macros. Both macros are inserted using the TeX button to create ERT. For example, I use \hypertarget{L1}{this text} to link this text to this link that I created using \hyperlink{L1}{this link}.
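To make the difference concrete, here is a small stand-alone sketch that uses both mechanisms side by side. The section names and the label sec:first are invented for the example; the target name L1 is the one used above. It assumes hyperref is loaded, and it needs two LaTeX runs for the \ref to resolve.

```latex
\documentclass{article}
\usepackage{hyperref}
\hypersetup{colorlinks=true, linkcolor=blue}
\begin{document}
\section{First}\label{sec:first}
% \hypertarget{name}{text} marks the text itself as a named destination.
A paragraph with \hypertarget{L1}{this text} marked as a target.

\section{Second}
% \ref prints the number of the section that carries the label.
Section~\ref{sec:first} is referenced by number.

% \hyperlink{name}{text} jumps to the \hypertarget with the same name.
By contrast, \hyperlink{L1}{this link} goes to the marked text.
\end{document}
```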

6 Making Pretty HTML

File->Export->HTML. Thanks to Dr. Heck it's as simple as that.

7 Conclusion

Although LyX isn't necessarily the easiest word processor ever invented, it richly rewards the investment to learn it by providing an environment that lets the author focus on the text above all. I really enjoy editing text in LyX even with its various glitches and issues. I also give a hearty thanks to the LyX developers for doing a great job of continually improving LyX.

References

[Bib(2006)] BibDesk, 2006. URL http://bibdesk.sourceforge.net/.

A Appendix – Creating the Template

A.1 Configuring URL Output

The \href command described above is provided by a package called hyperref. To enable hyperref, and to color hyperlinks in PDF output (a useful feature since PDF viewers like Apple's Preview do not highlight URLs by default), the text below needs to be added to the preamble of the document. The preamble is accessed via Document->Settings->LaTeX Preamble.

\usepackage{hyperref}

\hypersetup{colorlinks=true, linkcolor=blue, filecolor=blue, pagecolor=blue, urlcolor=blue}

A.2 Configuring BibTeX

I added a BibTeX-based bibliography by going to the end of the document and using Insert->Lists/TOC->BibTeX Bibliography. As described above (section 3), I point it at the appropriate .bib file. I set the style to "plainnat" because I like how it outputs citations (author's last name + year) and because it also outputs URLs listed in the bibliography. I also check "add Bibliography to TOC" so the bibliography shows up in the table of contents.

A.3 Making Simple Quotes

I think the default quote format in LyX is a bit odd. I prefer simple quotes. To enable this I set Document->Settings->Language->Quote Style->"text".

A.4 Avoiding Problems With Fonts

Leaving the fonts (available via Document->Settings->Fonts) all on default will produce reasonable output. Messing around with the choices can be a bit dicey in terms of generating HTML, as some of the fonts produce bizarre ligatures. For example, set Roman to Bera Serif and generate an HTML file that contains the strings "fi" and "fj". The HTML output will swap them: "fi" will come out as "fj" and "fj" will come out as "fi". If you keep all the fonts on default then all will be well.

B Historical Cruft

Before Dr. Heck fixed LyX, it wasn't possible to generate HTML from LyX that honored bibliographies. The dance below fixed that problem. It's no longer necessary but I paid a really painful price to figure out the exact right set of commands so I'm keeping a copy here, just in case.

Let's assume that the LaTeX file is called "myfile.tex". I next open a terminal window, navigate to the directory containing myfile.tex and type in the following commands:

latex myfile

bibtex myfile

latex myfile

latex myfile

htlatex myfile xhtml

Notice that the extension ".tex" is not used. The end result will be a file called myfile.html that contains a full HTML representation, including the bibliography, with GIFs generated for any math formulas.

17 Responses to Making LyX Produce Decent Hyperlinked HTML & PDF Files

  1. Anonymous says:

    You need a good editor — (or is that a “bleditor” for blogs?)

  2. Administrator says:

    It’s actually an interesting problem. I’ve been running a website for over five years now and in that time lots has changed. As I discussed in a previous article I have changed OS’s, Client Programs, Server Programs, etc. So at this point I don’t want something blog specific. I just want a good editor. OpenOffice does a great job on generic HTML (which is what I run the site off, I have a program I hacked together that translates the HTML into WordPress posts) but when math and bibliographies are involved I find LyX much easier to use than OpenOffice.

  3. Stephen Harris says:

    LyX has the purpose of making it easier to write
    LaTeX code. LyX does not need a spellchecker or
    LaTeX to function, though these augment the functioning
    of LyX. TeX and LaTeX were invented before HTML so
    they have no inherent compatibility. LaTeX is a
    more complex structure and does not always have
    simpler translations into the limits of HTML. The
    LaTeX developers are not responsible for add-on
    packages to LaTeX. So it is far removed to think
    that LyX developers are responsible for LaTeX
    add-on packages. The user who wants customization
    should edit the config or init file or extend the
    (Perl) module in the case of Latex2html.

  4. Administrator says:

    LyX will be what its authors and community want it to be. I hope that the community recognizes that the future of publication is on the web and with HTML and therefore focuses on providing better HTML support. TeX (and by extension LyX) has solved many real world problems for document authors and while not all of TeX’s features translate well into HTML many do and I believe we would all be richer if the transition were made.

  5. The Dude says:

    Your PDF font problem is probably caused by the use of “T1” fonts. T1 fonts are designed to be used by languages that use lots of diacritical marks (or so I’ve been told). Since English doesn’t use diacritical marks, you can change T1 to “default”. Look in the “lyxrc” file. Change the attribute \font_encoding to “default”. The PDF that lyx generates will now contain scalable fonts.

  6. bard says:

    i dont think that people are using lyx in order to publish the final output on web as html (or they do it just as sideeffect). if you started use lyx with this idea, you went basically wrong. btw some googling reveals, that export to html can be made even without evil red ERT for urls ;)

  7. Administrator says:

    I can’t speak for people, just myself and I do use LyX to publish to the web. Perhaps the LyX community thinks that an illegitimate goal but if so then why bother producing html output mechanisms?

  8. Joe GArsia says:

    LyX is a nice frontend to LaTeX. There are several packages for LaTeX hyperlinked PDFs, like Prosper and hyperref. You should be able to do pretty well with a combination of those.

    The LaTeX that LyX creates is neat and readable, so it shouldn’t be a problem to hand tweak it if you need to.

  9. greg says:

    Can you please help me? I’ve looked everywhere for the answer but can’t find it.

    I am also having the same problem with URLs. I’ve tried electronic, url, and webpage as the citation types, and none of these show the url when I click preview in Bibdesk (and of course the citation doesn’t have the url in lyx). You used proceedings to solve this problem, but this still didn’t show me any urls.

    Any ideas?

    Thanks,
    Greg

  10. Administrator says:

    Hum… that is odd. I’m honestly not sure but I do know who can help. http://www.lyx.org/internet/mailing.php gives the contact data to join the lyx-users mailing list. They are incredibly active and extremely good about answering questions.

  11. Carl Turner says:

    Sorry, but this page is so messed up in appearance that I can only hope it wasn’t written this badly by hand – the anchor tags started to mark sections aren’t closed, so great big wads of text think they’re links, when they’re not. And as for the “It validates in XHTML and CSS minus plugins.”… heh. For starters, #sidebox2 contains ‘s, but there’s no previous , which is presumably why the bullet points are so weird.

    Anyway, do check WordPress/your tools/your HTML because something went wrong here, and on a page about HTML generation – indeed, links – it’s sort of unforgivable ;-)

  12. Carl Turner says:

    Oh s**t :D sorry, I typed the tags with the <> bits… damn. Anyway, the point is that the <li>’s aren’t in a <ul>.

  13. Administrator says:

    Carl – The problem started when I fixed another bug in the software that runs my blog. I just haven’t had time yet to find out where the new bug was introduced and fix it. :(

  14. Administrator says:

    So the bug in my blog was in some code I wrote to output HTML. The code translates from XML and it was outputting tags with no child as <foo/> instead of <foo></foo>.

    In XML <foo/> == <foo></foo> but HTML processors treat the anchor tag in particular differently. They see <a/> == <a>. In other words they don’t honor the closing / and just see an open anchor. That’s what caused all the ugliness.

    I fixed the code so I now always output the <foo></foo> form and that seems to have fixed things.

  15. Carl Turner says:

    You also forgot to use “&lt;” / “&gt;” for < and >! Irritating, isn’t it? And yes, there is an improvement! The nav menu on the side’s still wonky though – as I mentioned above, the <li> tags appear to have no <ul>

  16. Administrator says:

    The mess in my comment was due to the fact that the commenting system in WordPress uses completely different escaping rules than the posts do. So I went in and manually edited the comment to use proper escaping. But I don’t believe you will find similar problems in any of my blog entries.

    I also cleaned up a bunch of other annoying problems. For example, WordPress likes to drop “\” when you upload articles to SQL and there is a bug in WordPress that if you have a <br></br> in a <pre> then it turns the <br></br> into </br> which caused a lot of formatting errors. Both are now fixed.

    As for the screwed up nav menu, I’m too wimpy to take that one on. That is a bug in the theme I’m using and the code for the theme is more complex than my brain can grok. So for now I’ll just have to live with it. Some day I’ll probably just switch themes.

  17. Pingback: Latex and Math In WP « BUmath
