Can't convert HTML MathJax to markdown #10673

Answered by jgm
maxima120 asked this question in Q&A

First, note that --mathjax is an option for HTML output; it won't affect parsing of HTML at all.

For parsing HTML pages that include math in $..$ or \(..\) and are meant to be processed with MathJax, you can use -f html+tex_math_dollars or -f html+tex_math_single_backslash, respectively.
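
As a rough illustration of how those reader extensions fit into a full command line, here is a minimal Python sketch. The helper names `pandoc_args` and `convert` are made up for this example, and `-t markdown` is assumed as the target because the question is about converting to markdown; only the `-f html+...` values come from the answer itself.

```python
import subprocess

# Reader extensions named in the answer above.
MATH_EXTENSIONS = {"tex_math_dollars", "tex_math_single_backslash"}


def pandoc_args(input_path, math_extension="tex_math_dollars"):
    """Build a pandoc argument list (hypothetical helper, not part of pandoc)."""
    if math_extension not in MATH_EXTENSIONS:
        raise ValueError(f"unexpected extension: {math_extension}")
    # The extension is attached to the *input* format, because it changes
    # how the HTML is parsed; --mathjax would only affect HTML output.
    return ["pandoc", "-f", f"html+{math_extension}", "-t", "markdown", input_path]


def convert(input_path, math_extension="tex_math_dollars"):
    """Run pandoc and return the markdown text (assumes pandoc is on PATH)."""
    result = subprocess.run(
        pandoc_args(input_path, math_extension),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Calling `convert("page.html", "tex_math_single_backslash")` would be the variant for pages that write math as \(..\).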

For parsing HTML pages that include raw MathML, you don't need to do anything; pandoc will recognize this as math.

Your input is quite strange. I assume this is not the raw page source but the JavaScript-processed output? You will have better luck with the raw page source.

In any case, this has the following structure:

  • span with class MathJax_Preview - empty
  • span with class mjx-chtml - a huge mathml in a data-mathml attr…
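
If you are stuck with the JavaScript-processed output, one possible workaround (a sketch under assumptions, not necessarily what the answer goes on to recommend) is to pull the MathML back out of the `data-mathml` attributes so that it can be handed to pandoc as raw MathML. The class and function names below are hypothetical, and the sketch assumes the MathML sits on `span` elements whose class list contains `mjx-chtml`, as in the structure listed above.

```python
from html.parser import HTMLParser


class MathMLCollector(HTMLParser):
    """Collect MathML strings stored in data-mathml attributes (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.formulas = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        mathml = attr_map.get("data-mathml")
        # Assumption: only spans with the mjx-chtml class carry the formula.
        # HTMLParser has already unescaped entities such as &lt; in the value.
        if tag == "span" and "mjx-chtml" in classes and mathml:
            self.formulas.append(mathml)


def extract_mathml(html_text):
    """Return the MathML strings found in the processed page, in document order."""
    collector = MathMLCollector()
    collector.feed(html_text)
    collector.close()
    return collector.formulas
```

Each returned string is a MathML fragment; substituting it for the corresponding span should give pandoc input it can treat as math, though converting the raw page source is likely to be simpler.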

Answer selected by maxima120