Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BrotliEncoder.GetMaxCompressedLength returns invalid value #35142

Closed
TechPizzaDev opened this issue Apr 17, 2020 · 16 comments · Fixed by #108043
Closed

BrotliEncoder.GetMaxCompressedLength returns invalid value #35142

TechPizzaDev opened this issue Apr 17, 2020 · 16 comments · Fixed by #108043
Labels
area-System.IO.Compression in-pr There is an active PR which will close this issue when it is merged
Milestone

Comments

@TechPizzaDev
Copy link
Contributor

image
Project target: netcoreapp3.1

Test code
using System;
using System.IO.Compression;

namespace Sandbox
{
    class Program
    {
        static void Main(string[] args)
        {
            const int quality = 5;
            const int window = 10;

            byte[] source = new byte[256000];
            var rng = new Random(1234);
            rng.NextBytes(source);

            int maxLength = BrotliEncoder.GetMaxCompressedLength(source.Length);
            var resultBuffer = new byte[maxLength + 10];

            if (!BrotliEncoder.TryCompress(
                source, resultBuffer, out int written1, quality, window))
                throw new Exception();

            if (!BrotliEncoder.TryCompress(
                source, resultBuffer.AsSpan(0, maxLength), out int written2, quality, window))
                throw new Exception();
        }
    }
}
dotnet --info
.NET Core SDK (reflecting any global.json):
 Version:   3.1.201
 Commit:    b1768b4ae7

Runtime Environment:
 OS Name:     Windows
 OS Version:  10.0.18363
 OS Platform: Windows
 RID:         win10-x64
 Base Path:   C:\Program Files\dotnet\sdk\3.1.201\

Host (useful for support):
  Version: 3.1.3
  Commit:  4a9f85e9f8

.NET Core SDKs installed:
  2.1.700 [C:\Program Files\dotnet\sdk]
  2.1.701 [C:\Program Files\dotnet\sdk]
  2.1.801 [C:\Program Files\dotnet\sdk]
  2.1.802 [C:\Program Files\dotnet\sdk]
  2.2.300 [C:\Program Files\dotnet\sdk]
  3.1.201 [C:\Program Files\dotnet\sdk]

.NET Core runtimes installed:
  Microsoft.AspNetCore.All 2.1.11 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.All 2.1.12 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.All 2.1.13 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.All 2.2.5 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.App 2.1.11 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 2.1.12 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 2.1.13 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 2.2.5 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.AspNetCore.App 3.1.3 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 2.1.11 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.1.12 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.1.13 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.2.5 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.NETCore.App 3.1.3 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
  Microsoft.WindowsDesktop.App 3.1.3 [C:\Program Files\dotnet\shared\Microsoft.WindowsDesktop.App]
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added area-System.IO.Compression untriaged New issue has not been triaged by the area owner labels Apr 17, 2020
@ghost
Copy link

ghost commented Apr 17, 2020

Tagging subscribers to this area:
Notify danmosemsft if you want to be subscribed.

@carlossanlop carlossanlop added this to the Future milestone Jun 29, 2020
@carlossanlop carlossanlop removed the untriaged New issue has not been triaged by the area owner label Jul 8, 2020
@lcsondes
Copy link

Just hit this one in my code, can confirm. I'm getting a compressed buffer (using new BrotliEncoder(9, 22).Compress(...)) of 0x0040000c length (the input is exactly 0x00400000 bytes) when the expected maximum is 0x00400007.

@danmoseley
Copy link
Member

Interested in debugging some @lcsondes ?

@lcsondes
Copy link

@danmosemsft What do you need?

@danmoseley
Copy link
Member

@lcsondes it may be some time before someone can take a look, so sometimes I ask folks whether they’re interested in investigating to figure out the root cause.

@lcsondes
Copy link

From skimming the source my best guess is this line hardcoding the largest possible large block size while the BrotliEncoder constructor lets you select any window size. The guesswork here is whether the two are related, RFC 7932 doesn't seem to talk about large blocks but it does mention meta-blocks that have a variable size which just so happens to range between 0 and 16M and its header contains the size of the window so there could be a connection.

On the other hand if I take my input size which is 4M, 4M>>24 is 0 (which also looks incorrect BTW, surely there must be at least one block if the input is meaningful?) but that only explains 4 bytes of difference. The most I've observed was 6.

@neuecc
Copy link

neuecc commented Dec 8, 2022

I've encounted same issue.
brotli/encode.c has BrotliEncoderMaxCompressedSize method, that returns different value of current C# impl.
https://github.com/google/brotli/blob/3914999fcc1fda92e750ef9190aa6db9bf7bdb07/c/enc/encode.c#L1200
I've port this code to C# and compare result.

int BrotliEncoderMaxCompressedSize(int input_size)
{
    var num_large_blocks = input_size >> 14;
    var overhead = 2 + (4 * num_large_blocks) + 3 + 1;
    var result = input_size + overhead;
    if (input_size == 0) return 2;
    return (result < input_size) ? 0 : result;
}

My test data,

Input 10000005
GetMaxCompressedLength 10000012
BrotliEncoderMaxCompressedSize 10002451
Compressed 10000021

I think this is a critical issue of compression library, could you fix it?

@neuecc
Copy link

neuecc commented Jul 23, 2023

@carlossanlop
.NET 8 doesn't seem to have any changes, should this remain a TODO?

@lcsondes
Copy link

For the poor souls who are still affected by this, I recommend switching over to zstd, which is unaffected by this bug, and generally has a better runtime performance and compression ratio.

It's unlikely that this one will be fixed in anytime soon, it's been open for more than 3 years.

@neuecc
Copy link

neuecc commented Sep 12, 2024

.NET 9 RC1 supports new BrotliCompressionOptions in #105430 , however this bug is not cared,
I've tested in latest VS-preview but still exists.
@buyaa-n , @carlossanlop , @stephentoub (mentioned to #105430 reviewers)

@buyaa-n
Copy link
Contributor

buyaa-n commented Sep 16, 2024

.NET 9 RC1 supports new BrotliCompressionOptions in #105430 , however this bug is not cared,

That was not about for fixing this issue.

Anyway the managed implementation of GetMaxCompressedLength(int inputSize) was the same as the native implementation until native code updated with Update brotli to v1.0.5. Wonder why did not use the native implementation directly, anyway this is a regression happened a few years back.

Question is how to fix it:

  • Should we just apply the native changes to the managed implementation
  • Or should we update it to call native method directly (it could prevent from similar issue happen again)

CC @carlossanlop @ericstj @stephentoub

@stephentoub
Copy link
Member

Should we just apply the native changes to the managed implementation
Or should we update it to call native method directly (it could prevent from similar issue happen again)

Let's just P/Invoke to the native implementation, since it's the source of truth, and this size needs to remain in sync with what the encoder actually does.

@buyaa-n buyaa-n added the help wanted [up-for-grabs] Good issue for external contributors label Sep 16, 2024
@ericstj
Copy link
Member

ericstj commented Sep 17, 2024

@buyaa-n you marked this as regression-from-last-release - is this bug tracking a regression made in .NET 9.0?

@buyaa-n
Copy link
Contributor

buyaa-n commented Sep 17, 2024

@buyaa-n you marked this as regression-from-last-release - is this bug tracking a regression made in .NET 9.0?

No, it's regression from several releases, just there were no other label with regression

@TechPizzaDev
Copy link
Contributor Author

TechPizzaDev commented Sep 17, 2024

It was introduced with .NET Core 3.0 judging by the mentioned brotli 1.0.5 update in 2018.

I'm also surprised the milestone was set to .NET 10 considering this is a fix, not a new feature.
.NET 9 should still receive it even if it's in RC phase now, but maybe it's too low priority considering no one asked for a backport to previous LTS versions. Unless the fix is deemed as a breaking change...

@buyaa-n buyaa-n removed the help wanted [up-for-grabs] Good issue for external contributors label Sep 19, 2024
@dotnet-policy-service dotnet-policy-service bot added the in-pr There is an active PR which will close this issue when it is merged label Sep 19, 2024
@buyaa-n
Copy link
Contributor

buyaa-n commented Sep 19, 2024

I'm also surprised the milestone was set to .NET 10 considering this is a fix, not a new feature.

Currently all PRs should go to main (10.0), based on the customer feedback and the risk of the fix management will decide if we port the fix to previous versions and which versions will get fixed

@github-actions github-actions bot locked and limited conversation to collaborators Nov 8, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.IO.Compression in-pr There is an active PR which will close this issue when it is merged
Projects
None yet
9 participants