Pandoc 3.6.1

If you need to convert files from one markup format into another, pandoc is your Swiss Army knife. Pandoc is a Haskell library for converting from one markup format to another, and a command-line tool that uses this library.

Tags utilities haskell
License GNU GPLv3
State initial

Recent Releases

3.6.1 25 Dec 2024 04:05 minor feature:
  Allow YAML bibliographies to be arrays of references. Previously, they had to be YAML objects with a references key.
  Change --template to allow use of extensionless templates. The intent is to allow bash process substitution: e.g., --template <(echo "foo"). Previously pandoc always added an extension based on the output format, which caused problems with the absolute filenames used by bash process substitution (e.g. /dev/fd/11). Now, if the template has no extension, pandoc will first try to find it without the extension, and then add the extension if it can't be found. So, in general, extensionless templates can now be used. This has been implemented in a way that should not cause problems for existing uses, unless you are using a template NAME.FORMAT but happen to have an extensionless file NAME in the template search path.
  Allow --shift-heading-level-by=-1 to work in Djot in the same way it works for other formats (with the top-level heading being promoted to metadata title). This needed special treatment because of the way djot surrounds sections with Divs.
  RST reader: Handle explicit reference links. This case was missed when changing the reference link strategy for RST to allow a single pass. (It is a regression in pandoc 3.6.)
  Markdown reader: Use T.P.URI's pBase64DataURI in parsing data URIs. More efficient base64 data URI parsing. This should yield dramatic performance improvements for markdown documents containing large data URIs in images.
  HTML reader: Don't canonicalize data: URIs. It can be very expensive to call network-uri's URI parser on these.
  LaTeX reader: Handle figure* environment as a figure.
  MediaWiki reader: Allow empty quoted attributes. Allow cells starting with +.
  Textile reader: Improve parsing of spans. The span needs to be separated from its surroundings by spaces. Also, a span can have attributes, which we now attach.
  Inline constructors shouldn't trigger if r is prece
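The extensionless-template lookup described in the 3.6.1 notes can be pictured with a small Python sketch. This is only an illustration of the order of attempts, not pandoc's implementation: the function name `resolve_template` and the `exists` callback are hypothetical, and the template search path is left out.

```python
from pathlib import PurePosixPath
from typing import Callable, Optional


def resolve_template(name: str, output_format: str,
                     exists: Callable[[str], bool]) -> Optional[str]:
    """Return the template file to use, or None if nothing is found.

    Hypothetical helper: `exists` stands in for a file-system lookup.
    """
    if PurePosixPath(name).suffix:
        # A name that already has an extension is used as given.
        return name if exists(name) else None
    # Extensionless name: try it verbatim first (this is what makes
    # process-substitution paths such as /dev/fd/11 work) ...
    if exists(name):
        return name
    # ... and only then fall back to adding the output format.
    with_ext = f"{name}.{output_format}"
    return with_ext if exists(with_ext) else None


files = {"/dev/fd/11", "custom.html"}
print(resolve_template("/dev/fd/11", "html", files.__contains__))
print(resolve_template("custom", "html", files.__contains__))
```

The caveat in the notes shows up directly: if both NAME and NAME.FORMAT exist, the extensionless file is found first.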
3.6 09 Dec 2024 11:25 minor feature:
  Add mdoc as input format (Evan Silberman). This change introduces a reader for mdoc, a roff-derived semantic markup language for manual pages. This reader has been developed almost exclusively against mandoc's documentation and implementation of mdoc as a reference, and the real-world manual pages tested against are those from the OpenBSD base system. Of 3500 manuals in mdoc format shipped with a fresh OpenBSD install, 17 cause the mdoc reader to exit with a parse error. Any further chasing of edge cases is deferred to future work. New module: Text.Pandoc.Readers.Mdoc, exporting readMdoc [API change].
  Issue warnings for duplicate YAML metadata keys.
  Ensure that --sandbox affects --embed-resources. Previously it did not (contrary to what was implied by the manual), which means that an image with URL /etc/passwd would leak an encoded version of that file to HTML output with --self-contained or --embed-resources, even if --sandbox was used. Thanks to Samuel Mortenson for pointing out the bug.
  Text.Pandoc.App.OutputSettings: add sandbox' function. This computes the sandboxed files from Opt and avoids code repetition.
  Docx reader: Parse index references as empty spans with attributes. Attributes included are entry and, optionally, bold, italic, yomi, see. Don't create multiple paragraphs for title or subtitle. If there are multiple paragraphs with Title or Subtitle style, use only the first for metadata. Handle case where Zotero itemData has different id from the citationItem id. In this case we use the citationItemId in the bibliography as well, overriding the referenceId in the itemData.
  LaTeX reader: Put parsed minipage in specially marked Div.
  HTML reader: Parse footnotes defined by dpub-aria roles.
  MediaWiki reader: Fix indented tables with caption. Fix parsing of col/rowspan.
  Typst reader: Avoid generating empty paragraphs. Support underparen, overparen. Fix #quote attribution. If attribution is not present, don't print the --.
3.5 26 Nov 2024 10:45 minor feature:
  Add command-line options --list-of-figures/--lof and --list-of-tables/--lot. Only docx, latex, and context are affected by these options currently. Setting the lof and lot variables will also work for the formats that are currently supported.
  Defaults files: interpolation of environment variables now works for to and from fields. This is needed because these fields can contain paths of custom readers/writers.
  Docx reader: Reset lists after headers in same list numId. To accomplish this, we add a Heading constructor to BodyPart and include on it all the information list items have.
  DocBook reader: Parse id, class, and tabstyle on tables. Add parsing of id (xml:id), class, and tabstyle XML attributes for table and informaltable in the DocBook reader. The tabstyle value is put in the custom-style attribute.
  DokuWiki reader: Be more forgiving about misaligned lists, like DokuWiki itself. Improve blockquote parsing. Allow for quoted code blocks. Enable smart extension. Properly parse -- and --- as dashes. Fix block quote behavior. Blockquotes are not really block containers in DokuWiki; the lines are interpreted literally (so, e.g., you can't start a list), and line breaks are added at the ends.
  EPUB reader: Fix links to other files in the EPUB, making them internal links to a fragment derived from the filename. There was already code to handle links like #foo, but not to handle links like ch0001.html#foo.
  LaTeX reader: Add em, ex, px, mu to list of units for dimension args.
  ANSI writer: Fix subscripts (Evan Silberman).
  DokuWiki writer: Don't emit <HTML> tags. The use of these tags is now strongly discouraged for security reasons, and will be removed. We previously used them as a fallback for lists that could not be represented using DokuWiki syntax, e.g. ordered lists with fancy numbers or lists with multiple blocks in their items.
  We also used them for block quotes with multiple blocks as their conte
3.4 23 Nov 2024 21:25 minor feature:
  New output format: ansi (for formatted console output) (Evan Silberman). Most Pandoc elements are supported and printed in a reasonable way, if not always ideally. This version does no detection of terminal capabilities, nor does it fall back to different output styles for less-capable terminals.
  Add command line options --table-caption-position and --figure-caption-position. These allow the user to specify whether to put captions above or below tables and figures, respectively. The following output formats are supported: HTML (and related formats such as EPUB), LaTeX (and Beamer), Docx, ODT/OpenDocument, Typst.
  Change default --pdf-engine via HTML to WeasyPrint. wkhtmltopdf is deprecated. weasyprint is the easiest-to-install, maintained alternative. For better results, one might prefer pagedjs-cli.
  Org reader: Fix parsing of src blocks with an -i flag. Tabs are now preserved in the contents of src blocks if the block has the -i flag.
  RTF reader: Handle images inside shp contexts.
  RST reader: Improve simple table support. Multiline rows occur only when the first cell is empty; we were previously treating lines with any empty cell as row continuations. In addition, we no longer wrap multiline cells in Para if they can be represented as Plain. This is consistent with docutils behavior.
  LaTeX reader: Math environments don't have bracketed options. Parse nested tabular environments.
  Typst reader: Change how block elements are handled. Previously they were always parsed as divs. But actually they can occur in some inline contexts. Now we first try to parse them as inlines, and only as blocks if that fails. A surrounding Div or Span element is added only if there is an identifier.
  HTML reader: Only parse main element's contents (if present). If main has an id or class, we include a div with that id or class; otherwise just the contents. Read TeX annotation in MathML content if present.
  Better handle KaTeX-generated math. KaTeX.
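The RST simple-table rule in the 3.4 notes (a line continues the previous row only when its first cell is empty) can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the reader's code: rows are assumed to be pre-split into equally long lists of cell strings, and `merge_continuations` is a hypothetical name.

```python
from typing import List


def merge_continuations(rows: List[List[str]]) -> List[List[str]]:
    """Fold continuation lines into the row they belong to.

    A line is a continuation only if its FIRST cell is blank; an
    empty cell elsewhere does not make the line a continuation.
    """
    merged: List[List[str]] = []
    for row in rows:
        if merged and row[0].strip() == "":
            prev = merged[-1]
            for i, cell in enumerate(row):
                if cell.strip():
                    # Append the extra text as a new line in the cell.
                    prev[i] = (prev[i] + "\n" + cell.strip()).strip()
        else:
            merged.append([cell.strip() for cell in row])
    return merged


rows = [
    ["apple", "red or"],
    ["", "green"],       # first cell empty: continues the "apple" row
    ["pear", ""],        # empty cell, but not the first: a new row
    ["plum", "purple"],
]
for row in merge_continuations(rows):
    print(row)
```

Under the earlier behavior described in the notes, the "pear" line would also have been treated as a continuation because it contains an empty cell.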
3.3 25 Aug 2024 03:25 minor feature:
  New cli option: --link-images. This causes images to be linked rather than embedded in ODT.
  Allow --number-sections to take an optional true or false argument.
  RTF reader: Handle * shppict without dropping image.
  TWiki reader: Recognize WikiWords as internal links. Avoid partial function.
  Typst reader: Ignore pad and just parse its body. Use typst 0.5.0.5. Fix parsing of equations like 1.
  Docx writer: Fix regression with nested lists. The bug affects e.g. ordered lists with bullet sublists; after the sublist the top-level list reverts to bullets instead of being properly numbered. This is a regression introduced in version 3.2.1.
  BibTeX writer: Ensure that literal names are enclosed in braces.
  Man writer: Use default middle header when metadata does not include header. This change causes pandoc to omit the middle header parameter when header is not set, rather than emitting "". The parameter is optional and man will use a default based on the section if it is not specified.
  HTML templates: don't load polyfill. This was added in a period when MathJax required polyfill. MathJax no longer recommends this and polyfill should no longer be necessary on any reasonably modern browser.
  Translations: Add ua.yaml (Jens Oehlschlägel). Add a script (tools/update-translations.py) and Makefile target (update-translations) to update translation data automatically from babel and polyglossia upstream (Stephen Huan). Use this script to update language data, increasing the number of languages we cover (Stephen Huan). Fix a few small issues in existing translations.
  Fix some mistakes with Japanese language code. In several places we were mistakenly assuming that the BCP 47 code for Japanese language was jp. It is ja.
  Text.Pandoc.Options: New field in WriterOptions: writerLinkImages [API change].
  Text.Pandoc.App.Opt: New field in Opt: optLinkImages [API change].
  Lua subsystem: Keep lpeg and re as loaded modules (Albert Krewinkel).
  The mo
3.2.1 25 Jun 2024 12:05 minor feature:
  Fix gfm_auto_identifiers to replace emojis with their aliases, as documented.
  CSV reader: Turn line breaks into LineBreaks, not SoftBreaks.
  Docx reader: Support task lists. Fix a small bug in parsing delimiters in numbered lists, which led to the default delimiter being used wrongly in some cases. Improve handling of captions: turn captioned images into Figure elements (#9391); improve the logic for associating elements with captions; ensure that captions that can't be associated with an element aren't just silently dropped. Support HorizontalRule. We support both pandoc-style and the style described on a Microsoft support page, an empty paragraph with a bottom border. React to "left" value on jc attribute. Handle column and cell alignments. We take the column alignments from the first body row. Fix a bug that caused comments inside insertions or deletions to be ignored.
  HTML reader: Better handle non-li elements in ul and ol. For example, a p after a li will be incorporated into the previous li. This mirrors what browsers do with this invalid HTML.
  LaTeX reader: Fix parsing of dimensions beginning with ., e.g. kern.1pt.
  Markdown reader: Allow author-only textual citations outside of brackets.
  RST reader: Tighten up rules for when emphasis can start. Support :cite: role with citeproc. A subset of the functionality of the sphinxcontrib-bibtex extension to Sphinx is supported.
  Textile reader: Don't let spans begin right after a symbol.
  Texinfo writer: Ensure proper escaping in all node/link contexts. Target node rather than anchor when possible in internal links. Remove illegal characters from internal link anchors. Use two commas, not one. Don't add anchors to headings. We don't need them, now that we make internal links use the node. Avoid duplicate node names. Improve menus. Properly handle the case where the node name is different from the descriptive title.
  Texinfo template: add variables for file
3.2 13 May 2024 02:25 minor feature:
  Change to --file-scope behavior: Previously a Div with an identifier derived from the filename would be added around the contents of each file. This caused problems for chunking files into chapters, e.g. in EPUB. We no longer add the surrounding Div. This cooperates better with chunking. Note, however, that if you have relied on the old behavior to link to the beginning of the contents of a file using its filename as identifier, that will no longer work.
  Markdown reader: Allow repeated labels in numbered example lists. Previously if you tried to use the same label as an earlier example list item, you'd get a new number, not the old one, and references to the label would go to the second occurrence. Now an existing label will be reused, and no new number will be generated. Caveat: this only works reliably when the re-used example list item occurs by itself in a list, or occurs in a list of previously used example list items that occur in exactly the same order as previously. Fix normalCite so it doesn't consume past a closing boundary. This was causing an exponential performance issue on long lists of links containing potential emphasis characters. Generalize inlinesInBalancedBrackets to inBalancedBrackets, with a parameter for the inner parser. Auto-close unclosed divs. This applies to both fenced and HTML-ish varieties. Otherwise we face an exponential performance problem with backtracking. A warning is issued when a div is implicitly closed.
  RST reader: Fix figclass and align annotations for figures.
  LaTeX writer: Use polytonicgreek instead of polutonikogreek with babel. polutonikogreek is outdated. Also recognize both in the LaTeX reader. Improve treatment of math inside soul commands. soul commands (ul, hl, st) are very fragile and the math must be handled specially.
  LaTeX reader: Fix over-eager macro expansion in conditionals. Parse flalign, flalign* math environments. We parse these as Math elements with an aligned environment.
  Semantically it's not exactly
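The label-reuse rule for numbered example lists in the 3.2 notes boils down to a lookup table from labels to numbers. A minimal Python sketch, assuming each list item is represented only by its label (or None for an unlabeled item); `number_examples` is a hypothetical name and the caveat about item order is not modeled:

```python
from typing import Dict, List, Optional


def number_examples(labels: List[Optional[str]]) -> List[int]:
    """Assign example numbers, reusing the number of a repeated label."""
    seen: Dict[str, int] = {}
    numbers: List[int] = []
    next_number = 1
    for label in labels:
        if label is not None and label in seen:
            # Existing label: reuse its number, generate no new one.
            numbers.append(seen[label])
            continue
        numbers.append(next_number)
        if label is not None:
            seen[label] = next_number
        next_number += 1
    return numbers


print(number_examples(["good", "bad", None, "good"]))
```

Before this change, the second "good" item would have received a fresh number and captured later references to the label.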
3.1.13 09 Apr 2024 00:25 minor feature:
  Org reader: Fix treatment of id property under heading.
  DocBook reader: Add empty title to admonition div if not present. This allows admonition elements (e.g. <note>) to work with gfm admonitions even if the <title> is not present.
  DokuWiki reader: Link text cannot contain formatting (e.g., // is not italics). An explicitly empty link text ( url ) works the same as an omitted link text.
  Typst reader: Support Typst 0.11 table features: col/rowspans, table head and foot. Parse cell col/rowspans.
  CSLJson writer: Put delimiters around math in csljson output.
  ConTeXt writer: Fix options order with externalfigure. The dimensions should come before the class if both are present.
  Typst writer: Put label after Span, not before. Labels get applied to the preceding markup item. Support Typst 0.11 table features: colspans, rowspans, cell alignment overrides, relative column widths, header and footer, multiple table bodies with intermediate headers. Row heads are not yet supported. The default typst template has been modified so that tables don't have lines by default. As is standard with pandoc, we only add a line under a header or over a footer. However, a different default stroke pattern can easily be added in a template. More reliable escaping in inline .. contexts. For example, we need to escape 1. April or it will be treated as an ordered list. Handle unnumbered on headings.
  LaTeX writer: Fix math inside strikeout.
  Text.Pandoc.Writers.Shared: Export isOrderedListMarker [API change].
  Change lhs tests so they don't use --standalone. This will avoid test failures due to minor changes in skylighting versions, e.g. #9589.
  Use latest texmath, typst.
  Require pandoc-lua-marshal 0.2.6. Fixes an issue arising when the value of content properties on BlockQuote, Figure, and Div elements was an empty list.
  Update lua-filters.md.
3.1.12.3 20 Mar 2024 00:05 minor bugfix:
  Markdown reader: Fix issue with footnotes at end of fenced div.
  LaTeX reader: Improve tokenization of @. Make tokenization sensitive to makeatletter/makeatother. Previously we just always treated @ as a letter, which led to bad results in some sequences involving @. Make withRaw work inside parseFromToks. This is needed for raw environments to work inside table cells. Better handling of table colwidths. Previously the parser just failed if the column width specified in p wasn't a multiple of linewidth. This led to cases where content was skipped.
  Typst writer: Add kind parameter to figures with tables. Avoid unnecessary box around image in figure. Omit width/height in images unless explicitly specified. Previously we computed width/height for images that didn't have size information, because otherwise typst would expand the image to fit page width. This typst behavior has changed in 0.11. This change fixes a bug in which images would sometimes overflow page margins, depending on their intrinsic size. Don't add hard-coded inset to tables. Instead, set this globally in the default template, allowing it to be customized.
  LaTeX template: Fix block headings support for unnumbered paragraphs.
  HTML templates: Replace polyfill provider. Replace polyfill.io with cdnjs.cloudflare.com/polyfill. polyfill.io has been acquired by Funnull, and the service has become unstable.
  Korean translations: delete colon in translation for to. This was invalid YAML, and not desired anyway, since a colon is added.
  Use latest commonmark, commonmark-extensions. This fixes a 3.12 regression in parsing of commonmark/gfm autolinks (jgm/commonmark-hs#151).
  Depend on djot 0.1.1.3, which fixes a serious parsing bug affecting regular paragraphs after lists.
  Depend on latest skylighting, skylighting-core, typst-hs, texmath.
  MANUAL.txt: Change broken link to IDML cookbook.
3.1.12.2 02 Mar 2024 21:05 minor bugfix:
  Docx reader: Ensure that table captions are counted. Detect caption by style name, not id. The styleId can change depending on the localization. Avoid emitting empty paragraph where caption was.
  Markdown reader: Fix regression in link parsing with wikilinks extensions. This fixes a regression introduced in 3.1.12.
  Org reader/writer: support admonitions.
  Org writer: omit extra blank line at end of quote block.
  Typst writer: ensure that -, +, etc. are escaped at beginning of block. Our recent relaxing of escaping caused problems for things like emphasized - characters.
  LaTeX writer: Fix bug when a language is specified in two different ways. If you used lang: de-DE but then had a span or div with lang=de, the preamble would try to load ngerman twice, leading to an error. This fix ensures that a language is only loaded once.
  Docx writer: Don't copy over footnotePr in settings.xml from reference.docx.
  EPUB writer: omit EPUB2-specific meta tag on EPUB3. This caused a validation failure in epubs with cover images.
  Lua: avoid crashing when an error message is not valid UTF-8 (Albert Krewinkel).
  Text.Pandoc.SelfContained: Add role="img" to svgs. Add aria-label to svg elements with alt text if present. Screen readers ignore alt attributes on svg elements but do pay attention to aria-label.
  Text.Pandoc.Shared: Fix regression in section numbering in makeSections. Starting with pandoc 3.1.12, unnumbered sections incremented the section number.
  Text.Pandoc.Class: Fix openUrl TLS negotiation. With the release of TLS 2.0.0, the TLS library started requiring Extended Main Secret for the TLS handshake. This caused problems connecting to zotero's server and others that do not support TLS 1.3. This commit relaxes this requirement.
  Depend on djot 0.1.1.0.
  Use new releases of skylighting-format-blaze-html. Fixes auto-wrapping of long source lines in HTML print media.
  Use new commonmark-extensions (wit
3.1.12.1 20 Feb 2024 21:45 minor bugfix:
  EPUB writer: omit EPUBv3-specific accessibility features on epub2. Fixes a regression in 3.1.12.
  More fixes for SVG ids with --self-contained. This generalizes the fix to #9420 so it applies to things like style="fill(url(#..." and should fix problems with SVGs including gradients.
  Powerpoint writer: properly handle math in headings and tables. This ensures that paragraphs containing math are wrapped in a mc:AlternateContent node as required.
  Makefile: make validate-epub check v2 output too.
3.1.12 16 Feb 2024 07:25 minor feature:
  Add djot as input and output format. Djot is a light markup syntax (https://djot.net). New module Text.Pandoc.Readers.Djot [API change]. The function readDjot is also exported by Text.Pandoc.Readers. New module Text.Pandoc.Writers.Djot [API change]. The function writeDjot is also exported by Text.Pandoc.Writers.
  --number-sections now uses the first digit for the number of the top-level section, no matter what its level. So if the top-level section is level-2, numbers will be 1, 2, etc. rather than 0.1, 0.2, as in the past. For some backwards compatibility, we revert to the old behavior when the --number-offset option is used.
  DocBook reader: Better handling of <procedure> and <substeps>: <procedure> now gets parsed as an ordered list, and <substeps> as a sublist.
  Man reader: Move spaces outside of emph/strong.
  MediaWiki reader: Don't make leading blanks underscores in image links. Allow lowercase image:.
  BibTeX reader: Support pagetotal in converting BibLaTeX.
  Markdown reader: Fix wikilinks extensions to allow newlines in titles.
  EPUB reader: Don't put # characters in identifiers.
  LaTeX reader: Improve treatment of cref, Cref. Use the reference-type ref+label and ref+Label. Also, associate vref with the reference type ref instead of ref+page. Limited support for Cref. Generate relative widths for linewidth, textheight.
  Typst reader: Fix handling of overline. Due to a typo, it was being incorrectly rendered as an underset. Improve handling of inline #quote. Fix handling of dot(), tilde(), ddot() (jgm/typst-hs#38). Fix character used for norm (jgm/typst-hs#38).
  Typst writer: Use reference form for citations when possible. Use #ref for links with reference-type="ref". This attribute is added to LaTeX cref, for example. Improve citation support. Emit form: "prose" or form: "year" qualifiers if the citation is author-in-text or suppress-author.
  Strip initial comma from suffix, since typst will add an
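The --number-sections change in the 3.1.12 notes is easiest to see on a document whose top-level headings are level 2. The following Python sketch is an illustration only, not pandoc's algorithm: it assumes that "top-level" means the smallest heading level present, and `number_sections` with its `legacy` flag is hypothetical.

```python
from typing import List


def number_sections(levels: List[int], legacy: bool = False) -> List[str]:
    """Number headings given their levels.

    New behavior: the first digit counts the top-level sections,
    whatever their level. legacy=True mimics the old output, where
    level 1 always owns the first digit.
    """
    base = 1 if legacy else min(levels, default=1)
    counters: List[int] = []
    numbers: List[str] = []
    for level in levels:
        depth = level - base + 1
        while len(counters) < depth:
            counters.append(0)       # open deeper counters at zero
        del counters[depth:]         # leaving a subsection resets it
        counters[depth - 1] += 1
        numbers.append(".".join(str(n) for n in counters))
    return numbers


print(number_sections([2, 3, 3, 2]))
print(number_sections([2, 3, 3, 2], legacy=True))
```

For heading levels 2, 3, 3, 2 the sketch gives 1, 1.1, 1.2, 2 in the new scheme and 0.1, 0.1.1, 0.1.2, 0.2 in the old one, matching the example in the notes.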
3.1.11 17 Dec 2023 03:15 minor feature:
  Typst writer: Emit ; after typst code, unless followed by space. Otherwise there's the potential that the typst code will swallow up a following character.
  Text.Pandoc.Logging: Add MakePDFWarning constructor to LogMessage [API change]. Add MakePDFInfo constructor to LogMessage [API change].
  Text.Pandoc.PDF: LaTeX warnings are passed on to the user as warnings. Use report with MakePDFWarning and MakePDFInfo to relay verbose information and warnings, instead of writing directly to stderr. Parse logs to determine whether additional runs are needed, instead of running a fixed number of times. (The number of times that was appropriate given pandoc's default templates didn't always work for custom templates, and thus pandoc 3.1.10's change in the number of runs led to some regressions in PDF production.)
  Makefile: in make prelease, add checks that pandoc-cli and pandoc have the same version, that pandoc-cli depends on this exact version of pandoc, that there is an entry for this version in the changelog, and that the version numbers in the generated man pages are correct.
  Regenerate man pages with pandoc 3.1.10. This properly escapes hyphens and version numbers in man pages for pandoc-server and pandoc-lua.
  Depend on texmath 0.12.8.6. This omits unneeded lrs in typst math output.
  Depend on typst 0.5. This allows the typst reader to support multiline strings, the version type, and the as keyword with import.
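The idea of parsing logs to decide on extra LaTeX runs, mentioned for Text.Pandoc.PDF in 3.1.11, can be sketched in Python. The notes do not say which log messages pandoc looks for, so the pattern below is an assumption based on the standard LaTeX "Rerun to get cross-references right" warning; `needs_rerun`, `compile_until_stable`, and the `max_runs` cap are all hypothetical.

```python
import re
from typing import Callable

# Assumed marker: standard LaTeX warnings ask for another pass with
# wording such as "Rerun to get cross-references right."
RERUN_PATTERN = re.compile(r"Rerun to get", re.IGNORECASE)


def needs_rerun(log_text: str) -> bool:
    """True if the log asks for another LaTeX pass."""
    return bool(RERUN_PATTERN.search(log_text))


def compile_until_stable(run_once: Callable[[], str], max_runs: int = 4) -> int:
    """Call run_once (which returns the log text) until the log no
    longer requests a rerun; return the number of runs performed."""
    runs = 0
    while runs < max_runs:
        log = run_once()
        runs += 1
        if not needs_rerun(log):
            break
    return runs


# Stand-in for real engine invocations: two canned log files.
logs = iter([
    "LaTeX Warning: Label(s) may have changed. "
    "Rerun to get cross-references right.",
    "Output written on out.pdf (3 pages).",
])
print(compile_until_stable(lambda: next(logs)))
```

Compared with a fixed run count, this adapts to custom templates that need more or fewer passes, which is the motivation given in the notes.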
3.1.10 13 Dec 2023 13:01 major feature:
  Link pandoc-cli version to pandoc version. Henceforth pandoc-cli's version will be synchronized with pandoc's, and pandoc-cli will depend on an exact pandoc version. This will avoid confusion by ensuring that cabal install pandoc-cli-X.Y.Z installs pandoc version X.Y.Z. It will make things more straightforward for upstream packagers (see #9232). This scheme does not follow the Haskell PVP, but that should cause no harm, because this package does not expose a library.
  Add alerts markdown extension. This enables GitHub style markdown alerts as a commonmark extension. This extension is now default for gfm. It can't be used with markdown, only with commonmark and variants.
  Markdown reader: Preserve newlines in math instead of changing to spaces. Otherwise we can get unwanted results if there's a comment (#9193). Make attributes work with reference links (#9171).
  HTML reader: Improve handling of invalidly nested sublists (#9187, cf. #8150).
  MediaWiki reader: Allow attribute keys with hyphens (#9178).
  ODT reader: Support attr text:continue-numbering (#8979, Stephan Meijer).
  Typst reader: Allow references (e.g. @foo) to become citations if there is no corresponding label in the document. Collapse adjacent cite elements. Handle supplements in cite. Change cite (only one key allowed, a label) (typst 0.9 breaking change). Support quote element (typst 0.9).
  LaTeX reader: Handle otherlanguage environment and language-name environments like begin french ... end french (#9202). Fix theorem label parsing (#8872, Hikaru Ibayashi).
  Docx reader: Unwrap content of shaped textboxes (Stephan Meijer, #9214). Improve handling of w:sym (#9220). We now look up symbols in symbol fonts using the table defined at Text.Pandoc.Readers.Docx.Symbols. Add unexported module Text.Pandoc.Readers.Docx.Symbols. This gives us a table to