Wednesday, June 13, 2007

Converting LaTeX into Word...

I write most of my research in LaTeX. But journals often demand .rtf or even .doc formats for the final version of my paper. Sometimes by speaking to them very nicely you can get them to accept tex versions (Phil Studies and Phil Perspectives both did this). But sometimes that's just not an option.

This leads to hours of heartache and potentially lots of typos, as I try ten ways of transferring the stuff over to my word processor. And I have to deal with getting logic into Word, which is never nice. I used to use a special compiler to get it into HTML format, and then "save as" Word. But that didn't actually save much time, so I've recently begun to just cut-and-paste the raw tex file, reformat it, and rewrite any code I've put in. I've downloaded a couple of trial applications that promise to convert stuff directly into .doc, but with no success (they throw a wobbly whenever they meet any dollar signs, it seems).

Does anyone know what the best way to do this is? Would it help to get Scientific Word? (More money to the man, I know, but at this stage I'm desperate.)

20 comments:

Alex S. said...

Hey Robbie,

If you have a program that will convert .txt files into .pdf files, then the task is easy: just open up the .pdf file with Adobe Acrobat, and then cut-and-paste into a Word document. (You will have to reformat a bit, but it should involve nothing as mind-numbing as snipping out all of the LaTeX syntax. Ugh.)

Let me know if that helps.

Ásta said...

Hi Robbie,

I have this problem as well. There is some tex-to-rtf software that I have not been able to use successfully. (Commercial) PDF converters are available, though: apparently in Acrobat Pro you can save a PDF file as a .doc file. I just tried PDFtoOffice (for Mac), which seems to work fine. It costs money, but might be worth it to avoid the hassle.

Best, Ásta

Ant said...

This is my preferred hack. I use geometry to set the page width to something really huge: huge enough that every paragraph in the resulting PDF is on just one line (so no linebreaks to delete). I suppress page numbers. I use Times. I do a global search to replace ff with f{f} (etc., so no ligatures, which Word doesn't much like).

And then I copy and paste from the PDF to a Word file. Word seems to handle the text nicely enough and preserves italics and bold. Logic is always going to be a problem, but at least for me hand coding that is still faster than fiddling with trying to get a mechanical global conversion to work.
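For anyone who would rather script the ligature step than do the search-and-replace by hand, here is a rough Python sketch. The function name break_ligatures and the list of protected commands are illustrative assumptions, not part of the recipe above. The idea is only to apply the ff to f{f} style of replacement to prose, while leaving control words such as \fi and name-like arguments such as \begin{figure} alone. Math and comments are not treated specially, so check the output before compiling.

```python
import re

# Spans that must not be edited. The command list is an assumption chosen
# for illustration; extend it for whatever packages your paper uses.
PROTECTED = re.compile(
    r"\\(?:begin|end|label|[A-Za-z]*ref|cite[A-Za-z]*|input|"
    r"include(?:graphics)?|usepackage|documentclass)"
    r"(?:\[[^\]]*\])*\{[^}]*\}"   # command plus its name-like argument
    r"|\\[A-Za-z]+"               # any other control word, e.g. \fi
)

# An f, i, or l that directly follows an f (covers ff, fi, fl, ffi, ffl).
AFTER_F = re.compile(r"(?<=f)([fil])")


def _brace_after_f(prose: str) -> str:
    """Wrap the second letter of each f-pair in braces: ff -> f{f}."""
    return AFTER_F.sub(r"{\1}", prose)


def break_ligatures(tex: str) -> str:
    """Apply the f{f}-style replacement outside the protected spans."""
    pieces = []
    pos = 0
    for match in PROTECTED.finditer(tex):
        pieces.append(_brace_after_f(tex[pos:match.start()]))
        pieces.append(match.group(0))  # copied through unchanged
        pos = match.end()
    pieces.append(_brace_after_f(tex[pos:]))
    return "".join(pieces)
```

For example, break_ligatures("different offices") gives dif{f}erent of{f}{i}ces. Run it on a copy of the .tex file, since the change is only wanted for the throwaway PDF that gets pasted into Word.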

raul said...

for those who use linux, and are able to run kde applications, kword might help. you can import both .tex and .pdf files, and then export them in either .rtf or .doc format. as with alex s's and ant's suggestions above, it's not perfect (esp. with logic stuff). but i find it preferable to doing things manually.

on the other hand, it's weird that journals require papers in either .rtf or .doc format. most philosophy journals' publishing houses also publish some math/statistics/physics journals (a few examples: blackwell, who publishes analysis, phil quarterly, nous, ppr, etc., also publishes over 30 math/statistics journals. springer, who publishes phil studies, erkenntnis, synthese, etc., also publishes over 50 math journals (not counting many logic journals) and over 50 physics journals. oup, who publishes mind and the british journal for the phil of science, also publishes over 35 math/physics journals. duke university press, who publishes the phil review, also publishes the duke mathematical journal). and i'd have supposed that many, if not most, in the math/statistics/physics community use latex. so could it be that many editors of philosophy journals are not aware of the capabilities of their parent publishing companies?

Robbie said...

Thanks all! I'll give these a go. Alex: I think I've tried that, but found that it put carriage returns at the end of each line in the pdf document. The sort of hack that Ant suggests sounds like it might deal with that problem, though. I'll be giving it a go.
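A rough Python sketch of one way to clean up those end-of-line carriage returns after pasting from a PDF. It assumes the pasted text keeps a blank line between paragraphs, which will not be true of every PDF viewer, and the name unwrap_paragraphs is made up for illustration. It does not try to rejoin words hyphenated across lines, because that cannot be told apart from a genuine hyphen.

```python
import re


def unwrap_paragraphs(pasted: str) -> str:
    """Join the hard-wrapped lines of each paragraph into one line."""
    # Normalise Windows and old Mac line endings to "\n" first.
    text = pasted.replace("\r\n", "\n").replace("\r", "\n")
    paragraphs = []
    # Assumption: a blank line (possibly containing spaces) ends a paragraph.
    for block in re.split(r"\n\s*\n", text):
        lines = [line.strip() for line in block.split("\n") if line.strip()]
        if lines:
            paragraphs.append(" ".join(lines))
    return "\n\n".join(paragraphs)
```

Save the pasted text to a file, run it through the function, and paste the result into Word. If the viewer drops the blank lines between paragraphs, everything collapses into a single paragraph, and the huge-page-width trick is the better route.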

Ásta: I'll see if I can find out what happens with the full version of Acrobat... would be really nice if it worked. I'll experiment with PDFtoOffice too. I've just tried it, and one thing I notice is that footnotes become part of the main body of text. I guess that's going to be a problem for all these ideas.

Raul: I used to use Linux, but recently have gone back to Microsoft. I agree that it seems crazy that journals don't take tex files. To be fair, some are better than others: I was able to send a txt+pdf to Phil Studies, and to Phil Perspectives. But others haven't been so accommodating. E.g. AJP and Phil Review each insisted on rtf. It'd be nice to know why they can't handle this...

Dan López de Sa said...

LaTeX2RTF works pretty well for me.

I think I've submitted .txt files to Phil Studies, through the "Editorial Manager". Do they ask for .rtf versions of accepted papers?

Robbie said...

Just to report I'm still struggling with this... especially when I've got miles of formulae in the paper. Asking journal editors really nicely to check whether they can accept this format has had some success.

But mostly it looks like I'm going to be spending research support money on getting people to re-typeset it when it proves necessary.

Dale Dorsey said...

I just stumbled across this post. I use a pre-built Word macro that you can download. Basically, it follows the strategy you outline, but shaves lots of time off. I've had no problems. It's called "tex2doc". Type that into Google, and it's the first link. It's not perfect, but it's darn close.

tomstafford said...

Just to say I am struggling with this too, have just tried LaTeX2RTF, and am very impressed.

Carlos Cortissoz said...

Hi! I have been trying to deal with this problem for some time, and I definitely don't want to stop using LaTeX.

Recently I learned that my LaTeX distribution (which is TeX Live 2007; I have a Mac) includes a package called "wordlike".

This package produces a PDF resembling a Word document, which is kind of downgrading it (the package's author himself recognizes this), but could be very useful if you want to copy and paste into Word.

teesid said...

Have you tried GrindEQ? It can convert almost every math expression in my paper.

Rubby said...

I cannot convert \mathbb{S} into a .doc file using GrindEQ. Is there a way to do it?

Anonymous said...

i'm on Ubuntu Intrepid
i did
latex filename.tex
then
xdvi filename.dvi
which gave the option to save as txt

Anonymous said...

You really should try out TeX4ht. It is very good at exporting LaTeX, as it is actually a TeX engine itself. Very handy with complicated bibliography packages etc. TeX4ht does not export directly to Word, but does export to OpenOffice, which then can take things further. Another option is to produce HTML optimized for Word, which does not handle formulae, though.

mnajem said...

I am having the same problem with my 10-15 page technical paper.

What I did was use "tth" on Linux and convert the paper using the e2 flag, so that it puts pictures and figures inline with the text.

Afterwards I opened up the file using OO Writer and did some editing ... some fonts were garbled.

But it is a lot better than any other way that I have found so far.

KWord hung during the import... so OO Writer is better for this HTML import.

Liz Quarterman said...

Great posts - and very helpful as a copyeditor who has to deal with a LaTeX file from an author for a psychology journal. (But can someone delete that last post please. Ridiculous.)

Robbie Williams said...

Hi all,

Thanks for all these suggestions! Sorry for not posting here more---I moved this blog to wordpress, so this one is somewhat out of date. But it looks like this thread continues to be useful to folks (and I get email updates when they appear).

Liz---thanks for the comment; I've removed the most obvious spam.

Anonymous said...

Has anyone tried http://www.verypdf.com/app/pdf-to-word/try-and-buy.html ?
I was wondering if you had an example of a latex-written pdf with formulas, tables, plots (especially multi-panel ones) and references which you generally thought was hard to translate into Word. Unfortunately I don't have anything like that handy at the moment. If you could upload it somewhere and make the link available we could all have a go with the free trial version and share our opinions on the result! It seems relatively cheap but I'd like to have a proper go!

Thanks

P.S. By the way I'm not trying to promote any software just wanted to have a go at something which has all the difficult bits in it.

Anonymous said...

I have tried the above software (or at least the free trial) to translate PDF to Word and it works OK for text and images. Tables are also translated fairly well. Formulas are not translated, not even pasted in as an image, so it's no good for formulas.

PDFsToWord Email said...

Till now, I have knowledge about pdf to word conversion tools but latex is a new term. Can you post something related to this new format so that things can get clarified.