Writing a Pandoc Filter
Structure of a pandoc filter
A pandoc Lua filter consists of one or more filter functions. A filter function takes an element object as its argument and is assigned to a global variable named after a pandoc element type, such as Str or Para. Pandoc then calls the function for every element of that type.
Of course, you can define other functions for the filter function to call. But pandoc only picks up functions that are assigned to variables named after a pandoc type.
The following is a sample markdown file.
---
title: TITLE NAME
---
# title

content fadf a

PARA 2
Save this Markdown document in a file called ex1.md. In the same directory, run
pandoc ex1.md -t json | python -m json.tool
in your terminal to get the JSON output converted from ex1.md.
The following are two snippets from the JSON output.
Element 1
{
    "t": "Para",
    "c": [
        {
            "t": "Str",
            "c": "content"
        }
    ]
},
Element 2
{
    "t": "Para",
    "c": [
        {
            "t": "Str",
            "c": "PARA"
        },
        {
            "t": "Space"
        },
        {
            "t": "Str",
            "c": "2"
        }
    ]
}
In pandoc's JSON output, the type of an element is the value of the key t, and the content of an element is the value of the key c. The content can be a string, as in Str, or, recursively, a list of other pandoc elements, as in Para. Some elements, such as Space, have no content at all.
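As a small illustration of the t and c keys, the following Python sketch walks a fragment like Element 2 and collects the words it contains. The fragment is pasted in by hand, and the function name collect_text is made up for this example; it is not part of pandoc.

```python
import json

# Element 2 from the JSON output above, pasted in as a string.
ELEMENT_2 = """
{"t": "Para",
 "c": [{"t": "Str", "c": "PARA"},
       {"t": "Space"},
       {"t": "Str", "c": "2"}]}
"""


def collect_text(node):
    """Return the plain text of an element by recursing through its "c" values."""
    if isinstance(node, list):
        return "".join(collect_text(child) for child in node)
    if isinstance(node, dict) and "t" in node:
        if node["t"] == "Str":
            return node["c"]                    # content is a plain string
        if node["t"] == "Space":
            return " "                          # Space has no "c" key
        return collect_text(node.get("c", []))  # content is more elements
    return ""                                   # anything else carries no text here


para = json.loads(ELEMENT_2)
print(para["t"])           # Para
print(collect_text(para))  # PARA 2
```

The recursion follows the data: a Para holds a list of inline elements, and each Str inside it holds a string.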
Para, Str and Space are only a few of pandoc's element types. You can find the full list of types in the official pandoc documentation.
Pandoc Lua filter in practice: convert all words to “meow”
You can convert all words in a document to meow using a pandoc Lua filter. After completing the Lua filter, save it as filter.lua in the same directory and run
pandoc ex1.md --lua-filter=filter.lua -t html
to get the filtered HTML output converted from ex1.md.
Hint: a “word” has type Str.
Expected output
<h1 id="title">meow</h1>
<p>meow meow meow</p>
<p>meow meow</p>
Answer:
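-- Replace the text of a Str element with "meow".
-- No type check is needed: the function is only bound to Str below.
function main(elm)
  elm.text = "meow"  -- the string content of a Str is its text field
  return elm
end
-- Pandoc calls the function stored in the global named after an element type.
Str = main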
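To see why the single assignment Str = main is enough to reach every word, here is a toy Python model of that dispatch, working on the JSON form shown earlier. It is only a sketch of the idea, not how pandoc is implemented; the names walk, meow and handlers are hypothetical.

```python
def walk(node, handlers):
    """Return a copy of node with handlers applied to every matching element."""
    if isinstance(node, list):
        return [walk(child, handlers) for child in node]
    if isinstance(node, dict) and "t" in node:
        node = dict(node)                          # copy, leave the input untouched
        if "c" in node:
            node["c"] = walk(node["c"], handlers)  # visit nested elements first
        handler = handlers.get(node["t"])          # look up a function by type name
        return handler(node) if handler else node
    return node                                    # strings, numbers: unchanged


def meow(elem):
    """Counterpart of the Lua function main: overwrite the word."""
    elem["c"] = "meow"
    return elem


doc = [
    {"t": "Para",
     "c": [{"t": "Str", "c": "PARA"}, {"t": "Space"}, {"t": "Str", "c": "2"}]},
]

# The dict key "Str" plays the role of the Lua global named Str.
result = walk(doc, {"Str": meow})
print([inline.get("c", " ") for inline in result[0]["c"]])  # ['meow', ' ', 'meow']
```

Elements whose type has no entry in handlers, such as Para and Space here, pass through unchanged. That matches the expected output above, where only the words change.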