Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix newline handling and whitespace trimming #442

Merged
merged 2 commits into from
Jan 21, 2025

Conversation

JimBobSquarePants
Copy link
Member

Prerequisites

  • I have written a descriptive pull-request title
  • I have verified that there are no overlapping pull-requests open
  • I have verified that I am following matches the existing coding patterns and practice as demonstrated in the repository. These follow strict Stylecop rules 👮.
  • I have provided test coverage for my change (where applicable)

Description

Fixes

This PR resolves the following issues:

  • Line wrapping issue with explicit newline #441: Breaking at a mandatory break could sometimes overshoot a break that should be made to fit the wrapping length.
  • Improved line counting and whitespace trimming to ensure consistent and accurate text measurement and layout.

Summary

This PR addresses two primary issues:

  1. Mandatory Break Overshoot:
    Breaking at a mandatory break could occasionally overshoot, causing wrapping to occur incorrectly. This is now fixed, ensuring that lines break correctly within the wrapping length.

  2. Line Counting and Whitespace Trimming Improvements:
    Additional fixes improve the handling of trailing whitespace and newline characters during text measurement and layout creation. These changes affect the number of GlyphLayout instances produced when breaking multiple concurrent newlines, ensuring consistent cross-platform behavior. These additional glyphs are not passed to the renderer.


Key Changes

  1. Trailing Whitespace Trimming

    • Trailing whitespace at the end of lines is now trimmed under all circumstance, ensuring it does not affect layout or measurement.
    • If a line consists entirely of whitespace, only the leading whitespace character is preserved for measurement.

    Example:

    Input:
    "\t\t\t\t\t"
    
    Trimmed Output:
    "\t"

    Explanation:

    • Only the first tab character is preserved for measurement. Subsequent whitespace characters are trimmed because they do not affect layout after wrapping.

    Example:

    Input:
    "\t\t\t\t\t\nHello"
    
    Trimmed and Broken into Lines:
    "\t"
    "Hello"

    Explanation:

    • The leading tab character is preserved on the first line.
    • The word Hello starts on the second line after the break.
  2. Newline Preservation

    • Line-breaking characters (\r, \n, \r\n) are now processed according to Unicode Standard Rule 13.
    • Each \r\n sequence is collapsed into a single \r, where the break occurs at \r and the \n is trimmed.

    Example:

    Input:
    "Line1\r\nLine2\r\n\r\nLine4\n"
    
    Broken into Lines:
    "Line1"
    "Line2"
    "\r"
    "Line4"

    Explanation:

    • Each \r\n becomes \r, ensuring consistent line-breaking semantics.
    • Consecutive line breaks (e.g., \r\n\r\n) correctly result in one empty line.
  3. Improved Line Counting

    • Accurate line counting ensures proper text layout, particularly when trailing whitespace or redundant line breaks were previously causing misalignment.

Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 7 out of 14 changed files in this pull request and generated 2 comments.

Files not reviewed (7)
  • tests/Images/ReferenceOutput/LineWrappingWithExplicitNewLine__WrappingLength_800_.png: Language not supported
  • tests/Images/ReferenceOutput/LineWrappingWithImplicitNewLine__WrappingLength_800_.png: Language not supported
  • tests/Images/ReferenceOutput/RenderingTextIncludesAllGlyphs__WrappingLength_1900_.png: Language not supported
  • tests/SixLabors.Fonts.Tests/Issues/Issues_35.cs: Evaluated as low risk
  • tests/SixLabors.Fonts.Tests/Issues/Issues_27.cs: Evaluated as low risk
  • tests/SixLabors.Fonts.Tests/Issues/Issues_36.cs: Evaluated as low risk
  • tests/SixLabors.Fonts.Tests/Issues/Issues_434.cs: Evaluated as low risk
Comments suppressed due to low confidence (2)

src/SixLabors.Fonts/TextLayout.cs:410

  • [nitpick] The use of goto statements can make the code harder to read and maintain. Consider refactoring the code to avoid using goto.
goto end;

src/SixLabors.Fonts/TextLayout.cs:1222

  • Ensure that the changes are covered by tests to verify the new behavior.
textLine = remaining;

src/SixLabors.Fonts/TextLayout.cs Outdated Show resolved Hide resolved
src/SixLabors.Fonts/TextLayout.cs Outdated Show resolved Hide resolved
@jez9999
Copy link

jez9999 commented Jan 16, 2025

Explanation:

  • Each \r\n becomes \r, ensuring consistent line-breaking semantics.
  • Consecutive line breaks (e.g., \r\n\r\n) correctly result in one empty line.

Out of interest, what happens to \n? Is that also converted to \r?

@JimBobSquarePants
Copy link
Member Author

JimBobSquarePants commented Jan 16, 2025

Explanation:

  • Each \r\n becomes \r, ensuring consistent line-breaking semantics.
  • Consecutive line breaks (e.g., \r\n\r\n) correctly result in one empty line.

Out of interest, what happens to \n? Is that also converted to \r?

No. That would remain as \n. The output is the result of trimming not conversion. Both are considered line breaks by the Unicode standard so the behavioural result is the same despite the different code points.

tocsoft
tocsoft previously approved these changes Jan 19, 2025
Copy link
Member

@tocsoft tocsoft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generally approved, i'm not 100% sure we need to gotos but happy to leave them you you believe they are the right construct to use

boxLocation.X = originX;
boxLocation.Y += advanceY;
continue;
goto end;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

surely this would be the same as

Suggested change
goto end;
return glyphs;

for this particular case? (there's a good chance i'm missing something that make the goto worth it)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I shouldn't code late at night.

@JimBobSquarePants JimBobSquarePants merged commit a82862f into main Jan 21, 2025
8 checks passed
@JimBobSquarePants JimBobSquarePants deleted the js/additional-linebreak-fixes branch January 21, 2025 01:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants