Add in-image book reader #222
base: master
…r (dynamic updates!)
Open todo:
Impressive work! :) I can barely comment on the details, but what I reviewed looked good to me.
@codeZeilen Thank you! :-) I'm not in a hurry to merge this and will point students to this branch in the meantime. In the long term, would you agree with including the extractor in the CI and mentioning the in-image reader in the readme and in the preface of the book? If yes, would we want to do this before or after archiving the 6.0 version on a separate branch (similar to #130)? :)
Same question for #225. My $0.02: Given that we still ship the 6.0 version to students, it makes sense to me to keep it maintained in the default branch until we release the next edition. Maybe we would even want to release a 6.0.1 edition at some point, but not necessarily. Wdyt?
Quickstarter to try out the result (works on both Squeak 6.0 and Squeak 6.1Alpha):
Implementation
This PR adds two (and a half) packages:

- `SBE-Book` contains the DOM for the parsed book: a node hierarchy of parts, chapters, etc.; a `HelpTopic` adapter; and a couple of custom `TextAttribute`s for UI-theme-dependent styling and serialization.
- `SBE-BookCompatibility-Squeak60` provides extension methods for compatibility.
- `SBE-ExtractBook` uses Sandblocks' `DomainCode` interface for parsing the SBE LaTeX sources using a tree-sitter grammar and compiling them into the DOM structure.

A note on cost-benefit:
LaTeX parsing (to some extent) works. As described in `SBELatexBookExtractor class>>#todo`, all of this is very challenging because the grammar used is imperfect and LaTeX is non-context-free by nature. E.g., a single `\ct{$}` anywhere in a file will result in an unsound (yet accessible thanks to sb-tree-sitter) AST for the entire chapter.

I personally do not set any expectations wrt internal quality for this component. It does not seem worth the effort to me; for now, I was mainly interested in the output.
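
To illustrate the `\ct{$}` problem, here is a minimal toy sketch in Python. It is not the tree-sitter grammar or the extractor. It is a hypothetical scanner (`math_spans` is a made-up name) that, like a context-free grammar unaware of the argument semantics of `\ct`, treats every unescaped `$` as a math-mode toggle. It assumes that `\ct{...}` is meant to contain literal code text.

```python
def math_spans(source: str) -> list[tuple[int, int]]:
    """Toy scanner: treat every unescaped '$' as a math-mode toggle.

    Returns (start, end) index pairs of the text the scanner believes
    is math. An unclosed span runs to the end of the input.
    """
    spans = []
    start = None
    i = 0
    while i < len(source):
        ch = source[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character after it (e.g. \$)
            continue
        if ch == "$":
            if start is None:
                start = i  # open a math span
            else:
                spans.append((start, i + 1))  # close the math span
                start = None
        i += 1
    if start is not None:
        spans.append((start, len(source)))  # unclosed: swallows the rest
    return spans


clean = r"Use \ct{x} and $a+b$ here. More prose follows."
broken = r"Use \ct{$} and $a+b$ here. More prose follows."

for text in (clean, broken):
    print([text[s:e] for s, e in math_spans(text)])
```

In the `broken` input, the `$` inside `\ct{...}` flips the parity of all later delimiters: the scanner pairs it with the opening `$` of the real formula, and the formula's closing `$` then opens a span that swallows the rest of the text. The same kind of shift is presumably what makes the AST unsound for the entire chapter.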
For running the extractor yourself, some dependencies are required that have not yet been merged upstream. The extractor is not 6.0 compatible.
I did a lot of fine-tuning to get the extractor to parse this particular book into a somewhat acceptable form. Still, this is very time-consuming, and there are several things that I left out for now. This includes:
Nevertheless, the large majority of pages render correctly and look somewhat nice, and it means we can finally read/search/analyze Squeak by Example in the image!
I will maybe do some very minor tweaking of the UI but 80% of the project is done and I am currently not planning to address the last 19.5%. However, if you find any urgent issues, I will be happy to take a look at them. Otherwise, I'd like to release this soon. Wdyt, can we merge this and ship it to our students? :-)