Have you ever thought of using a grammar checker on LaTeX files?
If so, you probably know that the process is far from simple. Since LaTeX documents contain special commands and keywords (the so-called "markup") that are not part of the "real" text, you cannot run a grammar checker directly on these files: it cannot tell the difference between markup and text. The other option is to remove all this markup, leaving only the "clear" text; however, when a grammar tool points to a problem at a specific line in this clear text, it becomes hard to retrace that location in the original LaTeX file.
TeXtidote solves this problem; it can read your original LaTeX file and perform various sanity checks on it: for example, making sure that every figure is referenced in the text, enforcing the correct capitalization of titles, etc. In addition, TeXtidote can remove markup from the file and send it to the Language Tool library, which performs a verification of both spelling and grammar in a dozen languages. What is unique to TeXtidote is that it keeps track of the relative position of words between the original and the "clean" text. This means that it can translate the messages from Language Tool back to their proper location directly in your source file.
You can see the list of all the rules checked by TeXtidote at the end of this file.
TeXtidote also supports spelling and grammar checking of files in the Markdown format.
You can install TeXtidote either by downloading it manually or by installing it from a package.
Under Debian systems (Ubuntu and derivatives), you can install TeXtidote using dpkg. Download the latest .deb file from the Releases page; suppose it is called textidote_X.Y.Z_all.deb. You can install TeXtidote by typing:
$ sudo apt-get install ./textidote_X.Y.Z_all.deb
The ./ is mandatory; otherwise the command won't work.
You can also download the TeXtidote executable manually: this works on all operating systems. Simply make sure you have Java version 8 or later installed on your system. Then, download the latest release of TeXtidote; put the JAR in the folder of your choice.
TeXtidote is run from the command line. The TeXtidote repository contains a sample LaTeX file called example.tex. Download this file and save it to the folder where TeXtidote resides. You then have the choice of producing two types of "reports" on the contents of your file: an "HTML" report (viewable in a web browser) and a "console" report.
To run TeXtidote and perform a basic verification of the file, run:
java -jar textidote.jar --output html example.tex > report.html
In Linux, if you installed TeXtidote using apt-get, you can also call it directly by typing:
textidote --output html example.tex > report.html
Here, the --output html option tells TeXtidote to produce a report in HTML format; the > symbol indicates that the output should be saved to a file, whose name is report.html. TeXtidote will run for some time, and print:
TeXtidote v0.8 - A linter for LaTeX documents
(C) 2018-2019 Sylvain Hallé - All rights reserved
Found 23 warnings(s)
Total analysis time: 2 second(s)
Once the process is over, switch to your favorite web browser, and open the file report.html (using the File/Open menu). You should see something like this:
As you can see, the page shows your original LaTeX source file, where some portions have been highlighted in various colors. These correspond to regions in the file where an issue was found. You can hover your mouse over these colored regions; a tooltip will show a message that describes the problem.
If you don't write any filename (or write -- as the filename), TeXtidote will attempt to read one from the standard input.
To run TeXtidote and display the results directly in the console, simply omit the --output html option (you can also use --output plain), and do not redirect the output to a file:
java -jar textidote.jar example.tex
TeXtidote will analyze the file like before, but produce a report that looks like this:
* L25C1-L25C25 A section title should start with a capital letter. [sh:001]
\section{a first section}
^^^^^^^^^^^^^^^^^^^^^^^^^
* L38C1-L38C29 A section title should not end with a punctuation symbol.
[sh:002]
\subsection{ My subsection. }
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
* L15C94-L15C99 Add a space before citation or reference. [sh:c:001]
things, like a citation\cite{my:paper} .The text
Each element of the list corresponds to a "warning", indicating that something in the text requires your attention. For each warning, the position in the original source file is given: LxxCyy indicates line xx, column yy. The warning is followed by a short comment describing the issue, and an excerpt from the line in question is displayed. The range of characters where the problem occurs is marked by the "^^^^" symbols below the text. Each of these warnings results from the evaluation of some "rule" on the text; an identifier of the rule in question is also shown between brackets.
Another option to display the results directly in the console is the single line report:
java -jar textidote.jar --output singleline example.tex
TeXtidote will analyze the file like before, but this time the report looks like this:
example.tex(L25C1-L25C25): A section title should start with a capital letter. "\section{a first section}"
example.tex(L38C1-L38C29): A section title should not end with a punctuation symbol. "\subsection{ My subsection. }"
example.tex(L15C94-L15C99): Add a space before citation or reference. "things, like a citation\cite{my:paper} .The text"
Each line corresponds to a warning and is easy to parse with regular expressions, e.g., for further processing in another tool. The file is given at the beginning of the line, followed by the position in parentheses. Then, the warning message is given, and the excerpt causing the warning is printed in double quotes (""). Note that a position sometimes cannot be determined. In this case, ? is printed instead of LxxCyy.
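As an illustration of such further processing, here is a minimal Python sketch that splits one line of the single line report into its parts. The regular expression is inferred only from the three sample lines above, and the names LINE_RE and parse_report_line are hypothetical; in particular, how an undetermined position is laid out inside the parentheses is an assumption, and a message that itself contains a double quote may be split in the wrong place.

```python
import re

# Pattern inferred from the sample output above (an assumption, not an
# official grammar): file(START-END): message "excerpt"
LINE_RE = re.compile(
    r'^(?P<file>.+?)'
    r'\((?P<start>L\d+C\d+|\?)(?:-(?P<end>L\d+C\d+|\?))?\): '
    r'(?P<message>.*?) "(?P<excerpt>.*)"$'
)


def _position(text):
    """Turn 'L25C1' into (25, 1); return None for '?' or a missing value."""
    if text is None:
        return None
    m = re.fullmatch(r"L(\d+)C(\d+)", text)
    return (int(m.group(1)), int(m.group(2))) if m else None


def parse_report_line(line):
    """Parse one report line into a dict, or return None if it does not match."""
    m = LINE_RE.match(line.rstrip("\n"))
    if m is None:
        return None
    return {
        "file": m.group("file"),
        "start": _position(m.group("start")),
        "end": _position(m.group("end")),
        "message": m.group("message"),
        "excerpt": m.group("excerpt"),
    }


if __name__ == "__main__":
    sample = ('example.tex(L25C1-L25C25): A section title should start '
              'with a capital letter. "\\section{a first section}"')
    print(parse_report_line(sample))
```

The output of textidote --output singleline can be piped into a script like this one line at a time; lines that do not match are simply reported as None so that the caller can decide what to do with them.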
You can disable the use of color in any form of command-line output using the --no-color switch.
You can perform further checks on spelling and grammar by passing the --check option at the command line. For example, to check text in English, you run:
java -jar textidote.jar --check en example.tex
The --check parameter must be accompanied by a two-letter code indicating the language to be used. Language Tool is a powerful library that can verify spelling, grammar, and even provide suggestions regarding style. TeXtidote simply passes a cleaned-up version of the LaTeX file to Language Tool, retrieves the messages it generates, and converts the line and column numbers associated with each message back into line/column numbers of the original source file. For more information about the kind of verifications made by Language Tool, please refer to its website.
Additionally, the --firstlang lang option can be used to make Language Tool check for false friends in your first language. For example, to check a text in English when your first language is German, you may run:
java -jar textidote.jar --check en --firstlang de example.tex
The language codes you can use are:
- de: (Germany) German, and the variants de_AT (Austrian) and de_CH (Swiss)
- en: (US) English, and the variants en_CA (Canadian) and en_UK (British)
- es: Spanish
- fr: French
- nl: Dutch
- pt: Portuguese
- pl: Polish
If you have a list of words that you want TeXtidote to ignore when checking spelling, you can use the --dict parameter to specify the location of a text file:
java -jar textidote.jar --check en --dict dico.txt example.tex
The file dico.txt must be a plain text file containing a list of words to be ignored, with each word on a separate line. (The list is case sensitive.)
If you already spell checked your file using Aspell and saved a local dictionary (as is done for example by the PaperShell environment), TeXtidote can automatically load this dictionary when invoked. More specifically, it will look for a file called .aspell.XX.pws in the folder where TeXtidote is started (this is the filename Aspell gives to local dictionaries). The characters XX are to be replaced with the two-letter language code. If such a file exists, TeXtidote will load it and mention it at the console:
Found local Aspell dictionary
You may want to ignore some of TeXtidote's advice. You can do so by specifying rule IDs to ignore with the --ignore command line parameter. For example, the ID of the rule "A section title should start with a capital letter" is sh:001 (rule IDs are shown between brackets in the reports given by TeXtidote); to ignore warnings triggered by this rule, you call TeXtidote as follows:
java -jar textidote.jar --ignore sh:001 myfile.tex
If you want to ignore multiple rules, separate their IDs with a comma (but no space).
TeXtidote can be instructed to remove user-specified environments using the --remove command line parameter. For example:
$ java -jar textidote.jar --remove itemize myfile.tex
This command will remove all text lines between \begin{itemize} and \end{itemize} before further processing the file.
The same can be done with macros:
$ java -jar textidote.jar --remove-macros foo myfile.tex
This command will remove all occurrences of the user-defined command \foo in the text. Alternate syntaxes like \foo{bar} and \foo[x=y]{bar} are also recognized and deleted.
Before TeXtidote analyzes a file, you can ask it to apply a set of find/replace operations (for example, to replace a macro by some predefined character string). You can write these patterns into a text file and pass them to the program at the command line:
$ java -jar textidote.jar --replace replacements.txt
Here, replacements.txt is the file that contains the find/replace patterns, formatted as follows:
# Empty lines and lines beginning with a pound sign are ignored
# Search and replace patterns are separated by a tab
find replace
foo bar
# Patterns can also be regular expressions
abc\d+[^x] 123
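To make the file format concrete, here is a small Python sketch that reads patterns laid out as described above and applies them to a string. It is not TeXtidote's own code: the function names load_patterns and apply_patterns are hypothetical, the sketch uses Python's regular expression dialect rather than Java's, and it inserts each replacement literally (whether TeXtidote supports backreferences in the replacement is not stated here).

```python
import re


def load_patterns(text):
    """Read 'find<TAB>replace' pairs, one per line, skipping blanks and comments."""
    patterns = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # empty line or comment line
        find, sep, replace = line.partition("\t")
        if not sep:
            continue  # no tab separator: ignored in this sketch
        patterns.append((re.compile(find), replace))
    return patterns


def apply_patterns(patterns, source):
    """Apply every pattern in order; replacements are inserted literally."""
    for regex, replacement in patterns:
        # Using a function avoids any special treatment of backslashes
        source = regex.sub(lambda _match, r=replacement: r, source)
    return source


if __name__ == "__main__":
    rules = "# a comment\n\nfoo\tbar\nabc\\d+[^x]\t123\n"
    print(apply_patterns(load_patterns(rules), "foo abc42y"))
```

Running a check like this on a few sample sentences is a cheap way to confirm that a pattern matches what you expect before handing the file to --replace.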
By default, TeXtidote ignores everything before the \begin{document} command. If you have a large document that consists of multiple included LaTeX "sub-files", and you want to check one such file that does not contain a \begin{document}, you must tell TeXtidote to read the whole file using the --read-all command line option. Otherwise, TeXtidote will ignore the whole file and give you no advice.
TeXtidote also automatically follows sub-files that are embedded from a main document using \input{filename} and \include{filename} (braces are mandatory). Any such non-commented instruction will add the corresponding filename to the running queue. If you want to exclude an \input from being processed, you must surround the line with ignore begin/end comments (see below, Helping TeXtidote).
You can also use TeXtidote just to remove the markup from your original LaTeX file. This is done with the option --clean:
java -jar textidote.jar --clean example.tex
By default, the resulting "clean" file is printed directly at the console. To save it to a file, use a redirection:
java -jar textidote.jar --clean example.tex > clean.txt
You will see that TeXtidote performs a very aggressive deletion of LaTeX markup:
- figure, table and tabular environments are removed
- Inline math expressions ($...$) are replaced by "X"
- \cite commands are replaced by "0"
- \ref commands are replaced by "[0]"
- Commands that wrap text (\textbf, \emph, \uline, \footnote) are removed (but the text is kept)
Surprisingly, the result of applying these modifications is a text that is clean and legible enough for a spelling or grammar checker to provide sensible advice.
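To give a feel for what these substitutions do to a sentence, here is a toy Python sketch that imitates four of them with regular expressions. It is only an illustration under simplifying assumptions: the function toy_clean is hypothetical, it does not handle nested braces or environments, and unlike TeXtidote it does not keep track of positions in the original file.

```python
import re


def toy_clean(text):
    """Imitate a few of the substitutions listed above on a single string."""
    # Inline math such as $x+y$ becomes "X"
    text = re.sub(r"\$[^$]*\$", "X", text)
    # \cite{...} becomes "0"
    text = re.sub(r"\\cite\{[^}]*\}", "0", text)
    # \ref{...} becomes "[0]"
    text = re.sub(r"\\ref\{[^}]*\}", "[0]", text)
    # Wrapping commands are dropped but their (brace-free) argument is kept
    text = re.sub(r"\\(?:textbf|emph|uline|footnote)\{([^{}]*)\}", r"\1", text)
    return text


if __name__ == "__main__":
    line = r"See \textbf{Figure} \ref{fig:a}, where $x+y$ holds \cite{my:paper}."
    print(toy_clean(line))
```

The placeholders keep the sentence grammatical: a grammar checker sees a noun-like token where the formula, citation or reference used to be.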
As was mentioned earlier, TeXtidote keeps a mapping between character ranges in the "cleaned" file, and the same character ranges in the original LaTeX document. You can get this mapping by using the --map option:
java -jar textidote.jar --clean --map map.txt example.tex > clean.txt
The --map parameter is given the name of a file. TeXtidote will put in this file the list of correspondences between character ranges. This file is made of lines that look like this:
L1C1-L1C24=L1C5-L1C28
L1C26-L1C28=L1C29-L1C31
L2C1-L2C10=L3C1-L3C10
...
The first entry indicates that characters 1 to 24 in the first line of the clean file correspond to characters 5 to 28 in the first line of the original LaTeX file, and so on. This mapping can have "holes": for example, character 25 line 1 does not correspond to anything in the original file (this happens when the "cleaner" inserts new characters, or replaces characters from the original file by something else). Conversely, it is also possible that characters in the original file do not correspond to anything in the clean file (this happens when the cleaner deletes characters from the original).
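A mapping of this kind can be read back with a few lines of Python. The sketch below is a hypothetical helper, not part of TeXtidote: it assumes every line has the form LaCb-LcCd=LeCf-LgCh, that each range stays on a single line, and that characters inside a range correspond one to one in order, which is how the example entries above read.

```python
import re

RANGE_RE = re.compile(r"L(\d+)C(\d+)-L(\d+)C(\d+)")


def parse_map_line(line):
    """Return ((clean_start, clean_end), (orig_start, orig_end)) or None."""
    clean, sep, original = line.strip().partition("=")
    mc = RANGE_RE.fullmatch(clean)
    mo = RANGE_RE.fullmatch(original)
    if not sep or mc is None or mo is None:
        return None  # not a mapping line (e.g. blank or malformed)

    def as_range(m):
        return ((int(m.group(1)), int(m.group(2))),
                (int(m.group(3)), int(m.group(4))))

    return as_range(mc), as_range(mo)


def to_original(mapping, line, col):
    """Translate a (line, column) of the clean file; None if it falls in a hole."""
    for (c_start, c_end), (o_start, _o_end) in mapping:
        if c_start[0] == c_end[0] == line and c_start[1] <= col <= c_end[1]:
            return (o_start[0], o_start[1] + (col - c_start[1]))
    return None


if __name__ == "__main__":
    lines = ["L1C1-L1C24=L1C5-L1C28", "L1C26-L1C28=L1C29-L1C31", "L2C1-L2C10=L3C1-L3C10"]
    mapping = [entry for entry in map(parse_map_line, lines) if entry]
    print(to_original(mapping, 1, 3))   # inside the first range
    print(to_original(mapping, 1, 25))  # a hole in the mapping
```

A position that falls in a hole yields None, which mirrors the case where a tool reports a problem on text that the cleaner inserted.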
TeXtidote uses the OS default encoding when reading files (e.g. utf-8 in Linux, cp1252 in Windows). You can override this setting using the --encoding command line option:
java -jar textidote.jar --encoding cp1252 example.tex
If you need to run TeXtidote with many command line arguments (for example: you load a local dictionary, ignore a few rules, apply replacements, etc.), it may become tedious to invoke the program with a long list of arguments every time. TeXtidote can be "configured" by putting those arguments in a text file called .textidote in the directory from which it is called. Here is an example of what such a file could contain:
--output html --read-all
--replace replacements.txt
--dict mydict.txt
--ignore sh:001,sh:d:001
--check en mytext.tex
As you can see, arguments can be split across multiple lines. You can then call TeXtidote without any arguments like this:
textidote > report.html
If you call TeXtidote with command line arguments, they will be merged with whatever was found in .textidote. You can also tell TeXtidote to explicitly ignore that file and only take into account the command line arguments using the --no-config argument.
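The behavior described above can be pictured with the following Python sketch. It is a simplified model, not TeXtidote's implementation: the function effective_args is hypothetical, and the way the two sources are merged (file arguments first, then command line arguments) is an assumption, since the exact merge order is not specified here.

```python
import shlex


def effective_args(config_text, cli_args):
    """Model how a .textidote file and command line arguments could combine."""
    if "--no-config" in cli_args:
        # The configuration file is ignored entirely
        return [arg for arg in cli_args if arg != "--no-config"]
    # shlex.split treats newlines like spaces, so arguments may span lines
    return shlex.split(config_text) + list(cli_args)


if __name__ == "__main__":
    config = "--output html --read-all\n--ignore sh:001,sh:d:001\n--check en mytext.tex\n"
    print(effective_args(config, []))
    print(effective_args(config, ["--no-config", "--check", "fr", "other.tex"]))
```

The first call shows why line breaks in the file do not matter; the second shows the effect of --no-config.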
TeXtidote also supports files in the Markdown format. The only difference is that rules specific to LaTeX (references to figures, citations) are not evaluated.
Simply call TeXtidote with a Markdown input file instead of a LaTeX file. The format is auto-detected by looking at the file extension. However, if you pass a file through the standard input, you must tell TeXtidote that the input file is Markdown by using the command line parameter --type md. Otherwise, TeXtidote assumes by default that the input file is LaTeX.
In order to get the best results when using TeXtidote, it is advisable that you follow a few formatting conventions when writing your LaTeX file:
- Avoid putting multiple \begin{environment} and/or \end{environment} on the same line.
- Commands (such as \title{}) that have their opening and closing braces on different lines are not recognized by TeXtidote and will result in garbled output and nonsensical warnings.
- Put commands such as \section or \paragraph alone on their line and separate them from the text below by a blank line.
As a rule, it is advisable to first see what your text looks like using the --clean option, to make sure that TeXtidote is performing checks on something that makes sense.
If you realize that a portion of LaTeX markup is not handled properly and messes up the rest of the file, you can tell TeXtidote to ignore a region using a special LaTeX comment:
% textidote: ignore begin
Some weird LaTeX markup that TeXtidote does not
understand...
% textidote: ignore end
The lines between textidote: ignore begin and textidote: ignore end will be handled by TeXtidote as if they were comment lines.
When you are using Markdown, you can also selectively ignore parts of the document:
<!-- textidote: ignore begin -->
This should be ignored
<!-- textidote: ignore end -->
To make using TeXtidote easier, you can create shortcuts on your system. Here are a few recommended tips.
First, we recommend you create a folder called /opt/textidote and put the big textidote.jar file there (this requires root privileges). This step is already taken care of if you installed the TeXtidote package using apt-get.
(This step is not necessary if TeXtidote has been installed with apt-get.) In /usr/local/bin, create a file called textidote with the following contents:
#! /bin/bash
java -jar /opt/textidote/textidote.jar "$@"
Make this file executable by typing at the command line:
sudo chmod +x /usr/local/bin/textidote
(These two operations also require root privileges.) From then on, you can invoke TeXtidote on the command line from any folder by simply typing textidote, e.g.:
textidote somefile.tex
If you use a desktop environment such as Gnome or Xfce, you can automate this even further by creating a TeXtidote icon on your desktop. First, create a file called /opt/textidote/textidote-desktop.sh with the following contents, and make this file executable:
#!/bin/bash
if [ -x /usr/bin/notify-send ]; then
  err() { notify-send -a TeXtidote -i /opt/textidote/textidote-icon.svg "$*"; }
else
  err() { printf "%s\n" "$*" >&2; }
fi
# Braces group the message and the exit, so that exit only runs on failure
[ $# -lt 1 ] && { err "At least one file should be provided as input"; exit 1; }
dir=$(dirname "$1")
pushd "$dir" || { err "$dir does not exist"; exit 1; }
java -jar /opt/textidote/textidote.jar --check en --output html "$@" > /tmp/textidote.html
popd || exit
xdg-open /tmp/textidote.html &
This script enters the directory of the file passed as an argument, calls TeXtidote, sends the HTML report to a temporary file, and opens the default web browser to show that report.
Then, on your desktop (typically in your ~/Desktop folder), create another file called TeXtidote.desktop with the following contents:
[Desktop Entry]
Version=1.0
Type=Application
Name=TeXtidote
Comment=Check text with TeXtidote
Exec=/opt/textidote/textidote-desktop.sh %F
Icon=/opt/textidote/textidote-icon.svg
Path=
Terminal=false
StartupNotify=false
This will create a new desktop shortcut; make this file executable. From then on, you can drag LaTeX files from your file manager with your mouse and drop them on the TeXtidote icon. After the analysis, the report will automatically pop up in your web browser. Voilà!
You can auto-complete the commands you type at the command-line using the TAB key (as you are probably used to). If you installed TeXtidote using apt-get, auto-completion for Bash comes built-in. You can also enable auto-completion for other shells as follows.
Users of Zsh can also enable auto-completion; in your ~/.zshrc file, add the line
source /opt/textidote/textidote.zsh
(Create the file if it does not exist.) You must then restart your Zsh shell for the changes to take effect.
Users of Visual Studio Code can integrate TeXtidote by calling it with the --output singleline and --no-color options and parsing its results. Moreover, user cphyc also wrote a nice build task.
Emacs users can benefit from TeXtidote through flycheck. A dedicated flycheck-checker can be defined as in the following init.el/.emacs snippet (by user soli).
(flycheck-define-checker tex-textidote
  "A LaTeX grammar/spelling checker using textidote.
  See https://github.com/sylvainhalle/textidote"
  :modes (latex-mode plain-tex-mode)
  :command ("java" "-jar" (eval (expand-file-name "~/PATH/TO/textidote.jar")) "--read-all"
            "--check" (eval (if ispell-current-dictionary (substring ispell-current-dictionary 0 2) "en"))
            "--no-color" source-inplace)
  :error-patterns (
                   (warning line-start "* L" line "C" column "-" (one-or-more alphanumeric) " "
                            (message (one-or-more (not (any "]"))) "]"))))
(add-to-list 'flycheck-checkers 'tex-textidote)
Here is a list of the rules that are checked on your LaTeX file by TeXtidote. Each rule has a unique identifier, written between square brackets.
In addition to all the rules below, the --check xx option activates all the rules verified by Language Tool (more than 2,000 grammar and spelling errors). Note that the verification time is considerably longer when using that option.
If the --check option is used, you can add the --languagemodel xx option to find errors using n-gram data. In order to do so, xx must be a path pointing to an n-gram-index directory. Please refer to the LanguageTool page (link above) on how to use n-grams and what this directory should contain.
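- Do not mix \cite and \citep or \citet in the same document. [sh:c:mix]
- Do not use multiple \cite commands; put all references in the same \cite. [sh:c:mul, sh:c:mulp]
- Do not skip section levels (e.g. a \section followed by a \subsubsection without a \subsection in between). [sh:secskip]
- Do not hard-code references to figures, tables, sections or chapters; use \ref instead. [sh:hcfig, sh:hctab, sh:hcsec, sh:hccha]
- Do not break lines manually with \\. Either start a new paragraph or stay in the current one. [sh:nobreak]
- Do not force page breaks with \newpage. [sh:nonp]
First make sure you have the following installed: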
Download the sources for TeXtidote from GitHub or clone the repository using Git:
git clone git@github.com:sylvainhalle/textidote.git
First, download the dependencies by typing:
ant download-deps
Then, compile the sources by simply typing:
ant
This will produce a file called textidote.jar in the folder. This file is runnable and stand-alone, or can be used as a library, so it can be moved around to the location of your choice.
In addition, the script generates the Javadoc documentation for using TeXtidote in the docs/doc folder.
TeXtidote can test itself by running:
ant test
Unit tests are run with jUnit; a detailed report of these tests in HTML format is available in the folder tests/junit, which is automatically created. Code coverage is also computed with JaCoCo; a detailed report is available in the folder tests/coverage.
TeXtidote was written by Sylvain Hallé, Full Professor in the Department of Computer Science and Mathematics at Université du Québec à Chicoutimi, Canada.
TeXtidote is free software licensed under the GNU General Public License 3. It is released as postcardware: if you use and like the software, please tell the author by sending a postcard of your town to the following address:
Sylvain Hallé
Department of Computer Science and Mathematics
Université du Québec à Chicoutimi
555, boulevard de l'Université
Chicoutimi, QC
G7H 2B1 Canada
If you like TeXtidote, you might also want to look at PaperShell, a template environment for writing scientific papers in LaTeX.
TeXtidote is a play on Antidote, which is a spelling/grammar checker well known to French-speaking users and works with word processors. So TeXtidote is like a version of Antidote for TeX.