A simple application built in React/TypeScript to demonstrate highlighting of text which:
- Displays structured text as HTML, from a sample text block
- Allows users to make multiple overlapping selections of that text
- Shows all selections separately on the page
Clone the repo, and then from the eigen_highlights directory:
npm install
npm start
Browse to localhost:3000
window.getSelection() gives the highlighted text, but not any intervening HTML tags. The simple text.replace() does not deal with these, meaning cross-paragraph highlights do not remain on the page (although they do appear in the snippets panel).
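To make the failure concrete, here is a minimal sketch of the mismatch. The sample HTML, the `highlight` class name and the `highlightNaive` helper are hypothetical stand-ins, not the app's actual code, and the exact whitespace a browser puts between paragraphs in the selection text may vary.

```typescript
// Hypothetical stand-in for the app's content string.
const sampleHtml = "<p>First paragraph.</p><p>Second paragraph.</p>";

// Wrap the first occurrence of `selected` in a highlight span.
// A replacer function is used so "$" sequences in the selection are not
// interpreted as replacement patterns.
function highlightNaive(html: string, selected: string): string {
  return html.replace(
    selected,
    () => `<span class="highlight">${selected}</span>`
  );
}

// Within one paragraph the selected text appears verbatim in the HTML,
// so the replace finds it and wraps it.
const withinParagraph = highlightNaive(sampleHtml, "First");

// Across paragraphs the selection text contains no "</p><p>" (browsers
// typically put line breaks there instead), so nothing matches and the
// HTML comes back unchanged.
const acrossParagraphs = highlightNaive(sampleHtml, "paragraph.\n\nSecond");

console.log(withinParagraph);
console.log(acrossParagraphs === sampleHtml);
```

The second call is the cross-paragraph case: the snippet text exists, but it never occurs as a contiguous substring of the HTML.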
Potential solution: use a regex to identify HTML tags within the selection; where one is found, close the highlight span in the first paragraph and reopen it in the next paragraph (along with the prefix/suffix spans). This may need further thought for highlights that cross two or more paragraphs or other HTML tags. I did experiment with xpath-range, which looked promising because it gives the XPath of the start and end of the selection, but I didn't find a clean way of parsing the selection within the range and converting it into the HTML content string.
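The close-and-reopen idea could be sketched roughly as follows. This is an illustration under several assumptions rather than the project's implementation: the selection is given as start/end offsets into the tag-free text (how those offsets are obtained is left open), tags are found with a simple regex, text never contains a literal `<` or `>`, and entities are counted as raw characters. `wrapTextRange` and the span markup are hypothetical names.

```typescript
const HIGHLIGHT_OPEN = '<span class="highlight">';
const HIGHLIGHT_CLOSE = "</span>";

// Wrap the characters in [start, end) of the tag-free text in highlight
// spans, closing the span before every tag and reopening it afterwards.
function wrapTextRange(html: string, start: number, end: number): string {
  const tagPattern = /<[^>]+>/g;
  let out = "";
  let textPos = 0; // offset into the text with tags removed
  let htmlPos = 0; // offset into the raw HTML string

  // Append one run of text, wrapping only the part that overlaps the range.
  const emitText = (text: string): void => {
    const from = Math.max(start - textPos, 0);
    const to = Math.min(end - textPos, text.length);
    if (from < to) {
      out +=
        text.slice(0, from) +
        HIGHLIGHT_OPEN + text.slice(from, to) + HIGHLIGHT_CLOSE +
        text.slice(to);
    } else {
      out += text;
    }
    textPos += text.length;
  };

  let match: RegExpExecArray | null;
  while ((match = tagPattern.exec(html)) !== null) {
    emitText(html.slice(htmlPos, match.index)); // text before this tag
    out += match[0];                            // the tag itself, untouched
    htmlPos = match.index + match[0].length;
  }
  emitText(html.slice(htmlPos)); // trailing text after the last tag
  return out;
}

const twoParagraphs = "<p>First paragraph.</p><p>Second paragraph.</p>";
// Offsets 6..22 cover "paragraph." in the first <p> and "Second" in the next.
console.log(wrapTextRange(twoParagraphs, 6, 22));
```

Because the span is opened and closed per text run, the same loop covers selections that cross any number of tags. Working from offsets rather than matching text also avoids depending on which occurrence of a phrase a replace happens to find.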
The browser ::selection formatting is shown, and the text is added to the list of snippets. However, the replace does not properly handle content containing an intervening HTML tag, so the text is not formatted correctly.
Potential solution: this needs more time. It needs much more than a regex to resolve; at the very least, the entire HTML would have to be parsed to keep count of the open and closed tags.
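As a rough illustration of what keeping count of open and closed tags might involve, the sketch below reports which tags a fragment leaves unbalanced. It is still a regex-based tokenizer, not a real HTML parser, so it only shows the bookkeeping: void elements written without a slash (such as `<br>`) would be reported as unclosed, and `tagBalance` is a hypothetical helper name.

```typescript
interface TagBalance {
  unclosed: string[]; // opened inside the fragment but never closed
  unopened: string[]; // closed inside the fragment but never opened
}

// Walk the tags in a fragment and track them on a stack.
function tagBalance(fragment: string): TagBalance {
  // Groups: 1 = "/" for a closing tag, 2 = tag name, 3 = "/" if self-closing.
  const tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*?(\/?)>/g;
  const stack: string[] = [];
  const unopened: string[] = [];

  let match: RegExpExecArray | null;
  while ((match = tagPattern.exec(fragment)) !== null) {
    const isClosing = match[1] === "/";
    const name = (match[2] ?? "").toLowerCase();
    const isSelfClosing = match[3] === "/";

    if (isSelfClosing) {
      continue; // e.g. <br />, neither opens nor closes anything
    }
    if (!isClosing) {
      stack.push(name);
    } else if (stack.length > 0 && stack[stack.length - 1] === name) {
      stack.pop(); // closes the most recently opened tag
    } else {
      unopened.push(name); // closes something opened outside the fragment
    }
  }
  return { unclosed: stack, unopened };
}

// A selection that starts inside <b> and ends inside <i>.
console.log(tagBalance("bold</b> and <i>ital"));
```

A fragment with a non-empty `unclosed` or `unopened` list cannot simply be wrapped in a single span without producing mis-nested markup, which is the situation described above.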
The simple text.replace() replaces only the first instance of a match. When selecting a single word or a repeated short phrase that is not the first instance of that word or phrase, the wrong snippet is highlighted.
Potential solution: prevent the selection if it is a single word (no spaces) and the snippet occurs more than once.
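A minimal sketch of that guard, assuming it runs against the plain text of the content; `countOccurrences` and `isAmbiguousSelection` are hypothetical helper names, not part of the app:

```typescript
// Count how many times `snippet` occurs in `text` (overlaps included).
function countOccurrences(text: string, snippet: string): number {
  if (snippet.length === 0) {
    return 0;
  }
  let count = 0;
  let index = text.indexOf(snippet);
  while (index !== -1) {
    count += 1;
    index = text.indexOf(snippet, index + 1);
  }
  return count;
}

// True when the snippet is a single word that appears more than once,
// i.e. when a first-match replace could highlight the wrong instance.
function isAmbiguousSelection(text: string, snippet: string): boolean {
  const trimmed = snippet.trim();
  const isSingleWord = trimmed.length > 0 && !/\s/.test(trimmed);
  return isSingleWord && countOccurrences(text, trimmed) > 1;
}

const sentence = "the cat sat on the mat";

// String.replace with a string pattern only touches the first match.
console.log(sentence.replace("the", "[the]"));

console.log(isAmbiguousSelection(sentence, "the")); // repeated single word
console.log(isAmbiguousSelection(sentence, "cat")); // unique single word
```

Note that a repeated multi-word phrase would still pass this guard, so it narrows the problem rather than removing it.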
Snippets need to be stored more cleverly. While developing this, I looked at XPaths as a possible way of finding the current location, but even these would need to be dynamically updated for each new highlight, as each highlight introduces new spans.
I have always assumed that tags are original HTML tags, i.e. <tag>content</tag>. This does not account for self-closing tags, e.g. <tag children={content} />, as seen in JSX/TSX.
This is relatively safe in this context, as the content in selections is based on a known object, but it is possible to insert malicious code using dangerouslySetInnerHTML.
Potential solution: Proper sanitisation of HTML with tools such as DOMPurify or even an altogether different solution.
I wasn't sure of a quick way to highlight snippets of text programmatically. Ideally I would simulate this, and then check for snippets being added to the snippet drawer.
It would be nice to add a selection mode option, e.g. auto-select on highlight, or a popup to toggle highlighting of the current selection. This would give imprecise highlighters a chance to retry and get an accurate selection before adding it to the list.
I've only put in very basic styling to position components and to support the selection functionality. I have used Semantic UI React as a styling framework in previous work, but have intentionally kept this project lightweight.