Inconsistent children contents for link vs wikilink node value #536

Open
mfontanini opened this issue Feb 13, 2025 · 5 comments

@mfontanini
Contributor

The link and wikilink node values have inconsistent children shapes when no title is present. For a normal link without a title, the node has no children. For a wikilink without a title, the node has one child that contains the link itself. This makes it hard for code consuming the AST to distinguish a wikilink with a title from one without.
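
To illustrate the ambiguity, here is a minimal sketch that uses a simplified stand-in for the tree (plain strings for the text children rather than comrak's actual node types). The function name `looks_untitled` is hypothetical and not part of comrak. The only check available to a consumer is to compare the single text child against the URL, and that cannot tell an untitled wikilink apart from one whose explicit title happens to equal its link.

```rust
/// Heuristic over a simplified model: `text_children` stands in for the
/// text of a wikilink node's children. This is an illustration only and
/// does not use comrak's real `AstNode` type.
fn looks_untitled(url: &str, text_children: &[String]) -> bool {
    match text_children {
        // Exactly one text child equal to the URL: probably no explicit
        // title, but an explicit title identical to the link looks the same.
        [only] => only.as_str() == url,
        // Zero children or several children: treat as explicitly titled.
        _ => false,
    }
}
```

With the shapes from this report, `[[link3]]` (child text `link3`) is classified as untitled and `[[title|link4]]` (child text `title`) as titled, but a wikilink written with a title equal to its link is indistinguishable from the untitled form.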

To reproduce, run this code:

use comrak::{nodes::NodeValue, parse_document, Arena, Options};

fn main() {
    let arena = Arena::new();
    let mut options = Options::default();
    // Parse wikilinks as [[title|link]] rather than [[link|title]].
    options.extension.wikilinks_title_before_pipe = true;

    // One link and one wikilink without a title, and one of each with a title.
    let root = parse_document(
        &arena,
        "[](link1)\n[title](link2)\n[[link3]]\n[[title|link4]]",
        &options,
    );
    // Dump every link and wikilink node, including its children.
    for node in root.descendants() {
        let data = node.data.borrow();
        match data.value {
            NodeValue::Link(_) | NodeValue::WikiLink(_) => println!("{node:#?}"),
            _ => (),
        };
    }
}

Notice the output below. I would have expected link1 and link3 to have the same shape (no children), but they don't.

Node {
    data: RefCell {
        value: Ast {
            value: Link(
                NodeLink {
                    url: "link1",
                    title: "",
                },
            ),
            sourcepos: Sourcepos {
                start: LineColumn {
                    line: 1,
                    column: 1,
                },
                end: LineColumn {
                    line: 1,
                    column: 9,
                },
            },
            internal_offset: 0,
            content: "",
            open: false,
            last_line_blank: false,
            table_visited: false,
            line_offsets: [],
        },
    },
    children: [],
}
Node {
    data: RefCell {
        value: Ast {
            value: Link(
                NodeLink {
                    url: "link2",
                    title: "",
                },
            ),
            sourcepos: Sourcepos {
                start: LineColumn {
                    line: 2,
                    column: 1,
                },
                end: LineColumn {
                    line: 2,
                    column: 14,
                },
            },
            internal_offset: 0,
            content: "",
            open: false,
            last_line_blank: false,
            table_visited: false,
            line_offsets: [],
        },
    },
    children: [
        Node {
            data: RefCell {
                value: Ast {
                    value: Text(
                        "title",
                    ),
                    sourcepos: Sourcepos {
                        start: LineColumn {
                            line: 2,
                            column: 2,
                        },
                        end: LineColumn {
                            line: 2,
                            column: 6,
                        },
                    },
                    internal_offset: 0,
                    content: "",
                    open: false,
                    last_line_blank: false,
                    table_visited: false,
                    line_offsets: [],
                },
            },
            children: [],
        },
    ],
}
Node {
    data: RefCell {
        value: Ast {
            value: WikiLink(
                NodeWikiLink {
                    url: "link3",
                },
            ),
            sourcepos: Sourcepos {
                start: LineColumn {
                    line: 3,
                    column: 1,
                },
                end: LineColumn {
                    line: 3,
                    column: 9,
                },
            },
            internal_offset: 0,
            content: "",
            open: false,
            last_line_blank: false,
            table_visited: false,
            line_offsets: [],
        },
    },
    children: [
        Node {
            data: RefCell {
                value: Ast {
                    value: Text(
                        "link3",
                    ),
                    sourcepos: Sourcepos {
                        start: LineColumn {
                            line: 3,
                            column: 3,
                        },
                        end: LineColumn {
                            line: 3,
                            column: 7,
                        },
                    },
                    internal_offset: 0,
                    content: "",
                    open: false,
                    last_line_blank: false,
                    table_visited: false,
                    line_offsets: [],
                },
            },
            children: [],
        },
    ],
}
Node {
    data: RefCell {
        value: Ast {
            value: WikiLink(
                NodeWikiLink {
                    url: "link4",
                },
            ),
            sourcepos: Sourcepos {
                start: LineColumn {
                    line: 4,
                    column: 1,
                },
                end: LineColumn {
                    line: 4,
                    column: 15,
                },
            },
            internal_offset: 0,
            content: "",
            open: false,
            last_line_blank: false,
            table_visited: false,
            line_offsets: [],
        },
    },
    children: [
        Node {
            data: RefCell {
                value: Ast {
                    value: Text(
                        "title",
                    ),
                    sourcepos: Sourcepos {
                        start: LineColumn {
                            line: 4,
                            column: 3,
                        },
                        end: LineColumn {
                            line: 4,
                            column: 7,
                        },
                    },
                    internal_offset: 0,
                    content: "",
                    open: false,
                    last_line_blank: false,
                    table_visited: false,
                    line_offsets: [],
                },
            },
            children: [],
        },
    ],
}

I can probably create a PR to fix this myself, but this would be a breaking change, so I would like someone to give it the green light before I do so.

@kivikakk
Owner

@digitalmoksha Thoughts on such a change?

This is a bit of a philosophical matter, since CommonMark specifies that a link without a title should be empty, and therefore it must have no contents. The closest thing we have to a spec for wikilinks, on the other hand, specifies that a wikilink without an explicit title should contain the link as its title.

Whether that happens at the rendering stage or the parsing stage is indeed up to us; I sympathise with wanting to preserve the information.

@digitalmoksha
Collaborator

> since CommonMark specifies that a link without a title should be empty

@mfontanini The thing with wikilinks is that if there is no title specified, then the link and the title are the same thing. So when that link gets rendered, we expect the text of the HTML link to be that link, which is usually just a page name like "How to contribute" or something.

So [[How to contribute]] should render as <p><a data-sourcepos="5:1-5:21" href="How%20to%20contribute" data-wikilink="true">How to contribute</a></p>. If you don't have the text in the HTML, then the link would not be seen. So that child text node must be there.
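
As a rough sketch of the rendering behaviour described above, the following standalone function builds the anchor for a wikilink and uses the link itself as the visible text when no title is given. This is not comrak's renderer: `render_wikilink`, `encode_href`, and `escape_text` are hypothetical helpers, the percent-encoding is deliberately simplistic, and the `data-sourcepos` attribute is omitted.

```rust
/// Percent-encodes every byte outside a small safe set (simplified).
fn encode_href(url: &str) -> String {
    let mut out = String::new();
    for b in url.bytes() {
        match b {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9'
            | b'-' | b'_' | b'.' | b'~' | b'/' | b'#' => out.push(b as char),
            _ => out.push_str(&format!("%{:02X}", b)),
        }
    }
    out
}

/// Escapes the characters that are unsafe in HTML text content.
fn escape_text(text: &str) -> String {
    text.replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;")
        .replace('"', "&quot;")
}

/// Renders a wikilink; with no title, the link doubles as the visible text.
fn render_wikilink(url: &str, title: Option<&str>) -> String {
    let text = title.unwrap_or(url);
    format!(
        "<a href=\"{}\" data-wikilink=\"true\">{}</a>",
        encode_href(url),
        escape_text(text)
    )
}
```

Calling `render_wikilink("How to contribute", None)` yields `<a href="How%20to%20contribute" data-wikilink="true">How to contribute</a>`, so the visible text is present even though no title was written.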

And CommonMark, as @kivikakk mentioned, specifies that if there is no title, then there is no text in the link, and therefore no child text node. You can see this in https://spec.commonmark.org/dingus/?text=%5B%5D(http%3A%2F%2Fexample.com)%0A

I suppose you could argue that we could put an empty text node in markdown links without a title, but I'm not really sure that would buy us anything.

@kivikakk
Owner

I think @mfontanini is suggesting we could parse the wikilink with an empty title (so you can tell, in the AST, whether or not there was in fact a title specified with a |), but then in the renderer choose to render an empty-titled wikilink with the link as the title. This way the two forms can be distinguished at the AST level, without necessarily affecting the output.

There's another question about what to do if a title is specified but empty ([[example|]]); we could make the title an optional string, such that this case would be Some("") vs [[example]] having None.
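
The optional-title idea could look roughly like the sketch below. `WikiLinkSketch` and `parse_inner` are hypothetical names used only for illustration; comrak's `NodeWikiLink` shown in the output above has only a `url` field, and the real parser is not structured this way. The sketch takes the text between `[[` and `]]` and keeps `None` for "no pipe written" separate from `Some("")` for "pipe written, title empty".

```rust
/// Hypothetical stand-in for a wikilink node with an optional title.
#[derive(Debug, PartialEq)]
struct WikiLinkSketch {
    url: String,
    /// `None` if no `|` was written; `Some(..)` if one was, even when empty.
    title: Option<String>,
}

/// Parses the text between `[[` and `]]` (assumption: at most one
/// meaningful `|`; everything after the first one belongs to the right side).
fn parse_inner(inner: &str, title_before_pipe: bool) -> WikiLinkSketch {
    match inner.split_once('|') {
        // No pipe: the whole content is the link and no title was specified.
        None => WikiLinkSketch { url: inner.to_string(), title: None },
        // Pipe present: which side is the title depends on the option.
        Some((left, right)) => {
            let (title, url) = if title_before_pipe {
                (left, right)
            } else {
                (right, left)
            };
            WikiLinkSketch { url: url.to_string(), title: Some(title.to_string()) }
        }
    }
}
```

Under these assumptions, `parse_inner("example", false)` gives a `title` of `None`, while `parse_inner("example|", false)` gives `Some("")`, so the two forms stay distinguishable in the AST and a renderer remains free to fall back to the link for display.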

@digitalmoksha
Collaborator

> suggesting we could parse the wikilink with an empty title (so you can tell, in the AST, whether or not there was in fact a title specified with a |), but then in the renderer choose to render an empty-titled wikilink with the link as the title.

Ah ok, I guess that makes sense.

@kivikakk
Owner

@mfontanini I think we're all good for this change! Please do open a PR, and feel free to give us a ping if you get stuck. :)
