JSON escape regression from EF Core 8 to 9 #35400

Open
sveinungf opened this issue Jan 2, 2025 · 5 comments

@sveinungf

I'm in the process of upgrading an application from EF Core 8 to EF Core 9, but I'm encountering a change in how JSON is being escaped.

I have an entity with my own LocalizableString type that I'm storing as JSON by using ToJson() in the model configuration. For an existing row, if I update a property on LocalizableString to a value containing a non-ASCII character, the value now appears to be escaped twice. When I read the value back from the database with EF, it differs from the value I assigned.

Include your code

Here is code to reproduce the problem:

using Microsoft.EntityFrameworkCore;

var options = new DbContextOptionsBuilder<MyDbContext>()
    .UseSqlServer("Server=.;Database=EF9JsonEscapeRegression;Trusted_Connection=True;Encrypt=False")
    .Options;

await using var ctx = new MyDbContext(options);
await ctx.Database.EnsureDeletedAsync();
await ctx.Database.EnsureCreatedAsync();

var name = new LocalizableString { En = "Door" };
var item = new Item { Name = name };

ctx.Items.Add(item);
await ctx.SaveChangesAsync();

item.Name.No = "Dør"; // Here is the crucial part
await ctx.SaveChangesAsync();

await using var ctx2 = new MyDbContext(options);
var actualItem = await ctx2.Items.SingleAsync();

// EF 8.0.11: "Dør"
// EF 9.0.0:  "D\\u00F8r"
Console.WriteLine(actualItem.Name.No);


public class MyDbContext(DbContextOptions<MyDbContext> options) : DbContext(options)
{
    public DbSet<Item> Items { get; init; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Item>().OwnsOne(x => x.Name, b => b.ToJson());
    }
}

public class Item
{
    public int Id { get; init; }
    public required LocalizableString Name { get; set; }
}

public class LocalizableString
{
    public required string En { get; set; }
    public string? No { get; set; }
}

The generated update query is the same in both EF versions except for the value of the @p0 parameter.
Here is the query for EF 8:

exec sp_executesql N'SET IMPLICIT_TRANSACTIONS OFF;
SET NOCOUNT ON;
UPDATE [Items] SET [Name] = JSON_MODIFY([Name], ''strict $.No'', @p0)
OUTPUT 1
WHERE [Id] = @p1;
',N'@p0 nvarchar(4000),@p1 int',@p0=N'Dør',@p1=1

Here is the query for EF 9:

exec sp_executesql N'SET IMPLICIT_TRANSACTIONS OFF;
SET NOCOUNT ON;
UPDATE [Items] SET [Name] = JSON_MODIFY([Name], ''strict $.No'', @p0)
OUTPUT 1
WHERE [Id] = @p1;
',N'@p0 nvarchar(4000),@p1 int',@p0=N'D\u00F8r',@p1=1

It seems to only be a problem when updating a property on the JSON serialized type. Replacing the instance works as expected.

Include provider and version information

EF Core version: 9.0.0
Database provider: Microsoft.EntityFrameworkCore.SqlServer
Target framework: .NET 9.0
Operating system: Windows 11
IDE: Visual Studio Professional 2022 17.12

@roji
Member

roji commented Jan 2, 2025

/cc @maumar, IIRC you did the work in this area.

@maumar
Contributor

maumar commented Jan 6, 2025

related/dupe: #30315

@maumar
Contributor

maumar commented Jan 6, 2025

The Utf8JsonWriter that we use (starting in EF9) to construct JSON objects escapes string values by default for security reasons (https://learn.microsoft.com/en-us/dotnet/standard/serialization/system-text-json/use-utf8jsonwriter#customize-character-escaping and https://learn.microsoft.com/en-us/dotnet/standard/serialization/system-text-json/character-encoding). This is the source of the break.
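
The default escaping can be reproduced outside EF with a few lines. This is a minimal sketch that uses only System.Text.Json; the helper name WriteJsonString is made up for illustration and is not part of EF Core. The second call passes JavaScriptEncoder.UnsafeRelaxedJsonEscaping only to show the contrast, not as a recommendation.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.Encodings.Web;
using System.Text.Json;

// Hypothetical helper: writes one string value with Utf8JsonWriter
// and returns the raw JSON text, including the surrounding quotes.
static string WriteJsonString(string value, JavaScriptEncoder? encoder = null)
{
    using var stream = new MemoryStream();
    // A null Encoder means the writer uses its default escaping rules.
    using (var writer = new Utf8JsonWriter(stream, new JsonWriterOptions { Encoder = encoder }))
    {
        writer.WriteStringValue(value);
    } // Disposing the writer flushes the output to the stream.
    return Encoding.UTF8.GetString(stream.ToArray());
}

var escaped = WriteJsonString("Dør");
var relaxed = WriteJsonString("Dør", JavaScriptEncoder.UnsafeRelaxedJsonEscaping);

Console.WriteLine(escaped); // "D\u00F8r"
Console.WriteLine(relaxed); // "Dør"
```

SQL Server's JSON_MODIFY escapes special characters in an nvarchar value, so when the already-escaped text is sent as the @p0 parameter in the EF 9 query above, the backslash is presumably escaped a second time. That would match the value read back in the repro.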

@maumar
Contributor

maumar commented Jan 7, 2025

@sveinungf you can work around the issue by using https://learn.microsoft.com/en-us/dotnet/api/system.text.regularexpressions.regex.unescape?view=net-9.0

#30744 is tracking the work to add global customization options for the JSON reader/writer, which includes the encoder.

For now, you can also try to replace the JsonReaderWriter with a custom implementation that unescapes the string when reading it from the JSON reader.

We only have a metadata API for this at the moment. It's called SetJsonValueReaderWriterType. You would need to copy the implementation of the current JsonStringReaderWriter and change FromJsonTyped to something like:

public override string FromJsonTyped(ref Utf8JsonReaderManager manager, object? existingObject = null)
{
    var result = manager.CurrentReader.GetString()!;

    return Regex.Unescape(result);
}

But keep in mind that Utf8JsonWriter escapes everything for security reasons, so make sure you are not exposing your app to problems, e.g. if the inputs come from an untrusted source.
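
Put together, the workaround described above might look roughly like the following. Treat it as an untested sketch under several assumptions: the class name UnescapingStringReaderWriter and the ModelConfiguration helper are made up for illustration, Item and LocalizableString are the types from the repro in this issue, and the ConstructorExpression override is included on the assumption that EF Core 9 requires it (it would need to be removed on EF Core 8).

```csharp
using System.Linq.Expressions;
using System.Text.Json;
using System.Text.RegularExpressions;
using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Storage.Json;

// Hypothetical replacement for the built-in string reader/writer.
// Writing is unchanged; reading unescapes the value as suggested above.
public sealed class UnescapingStringReaderWriter : JsonValueReaderWriter<string>
{
    public override string FromJsonTyped(ref Utf8JsonReaderManager manager, object? existingObject = null)
    {
        var result = manager.CurrentReader.GetString()!;

        // Turns a literal \u00F8 sequence back into the character it encodes.
        return Regex.Unescape(result);
    }

    public override void ToJsonTyped(Utf8JsonWriter writer, string value)
        => writer.WriteStringValue(value);

    // Assumption: needed on EF Core 9; builds the instance via the default constructor.
    public override Expression ConstructorExpression
        => Expression.New(typeof(UnescapingStringReaderWriter));
}

public static class ModelConfiguration
{
    // Hypothetical helper, meant to be called from OnModelCreating in place of
    // the OwnsOne(...) call shown in the repro.
    public static void ConfigureItem(ModelBuilder modelBuilder)
    {
        modelBuilder.Entity<Item>().OwnsOne(x => x.Name, b =>
        {
            b.ToJson();

            // The metadata API mentioned above, applied per string property.
            b.Property(x => x.En).Metadata
                .SetJsonValueReaderWriterType(typeof(UnescapingStringReaderWriter));
            b.Property(x => x.No).Metadata
                .SetJsonValueReaderWriterType(typeof(UnescapingStringReaderWriter));
        });
    }
}
```

One caveat with this approach: Regex.Unescape processes every backslash sequence it finds, so a stored value that legitimately contains a backslash (for example a Windows path such as C:\temp) would be altered on read as well.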

@akselkvitberg

Is this breaking change documented anywhere? This broke a lot of data in my application.
And I don't really understand the workaround. How do I replace the JsonReaderWriter?
