Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race Condition with rendering parts in goroutines with templ-initialized context #977

Open
kanstantsin-chernik opened this issue Oct 31, 2024 · 6 comments · May be fixed by #1034
Open

Race Condition with rendering parts in goroutines with templ-initialized context #977

kanstantsin-chernik opened this issue Oct 31, 2024 · 6 comments · May be fixed by #1034

Comments

@kanstantsin-chernik
Copy link

kanstantsin-chernik commented Oct 31, 2024

In the current implementation, there is an assumption that all the parts are rendered in a single thread. Context value for children is modified for the whole context. If snippets need to be rendered in goroutines, it creates a race condition where children go missing or sporadically added

To Reproduce
https://github.com/kanstantsin-chernik/templ/blob/suspense-async-rendered/examples/suspense-async-rendered/main.templ

I also added comments to the generated code

Expected behavior
Each templ component is independent and can be rendered in a goroutine

Screenshots
Screenshot 2024-11-01 at 1 31 38 PM

Logs

WARNING: DATA RACE
Read at 0x00c01e4a13f0 by goroutine 1053:
  github.com/a-h/templ.GetChildren()
      external/com_github_a_h_templ/runtime.go:78 +0x186

Previous write at 0x00c01e4a13f0 by goroutine 1054:
  github.com/a-h/templ.ClearChildren()
      external/com_github_a_h_templ/runtime.go:68 +0x44

templ info output
% templ info
(✓) os [ goos=linux goarch=amd64 ]
(✓) go [ location=/home/user/go-code/bin/go version=go version go1.23.2 X:nocoverageredesign linux/amd64 ]
(✓) gopls [ location=/opt/go/path/bin/gopls version=golang.org/x/tools/gopls v0.14.2
golang.org/x/tools/[email protected] h1:sIw6vjZiuQ9S7s0auUUkHlWgsCkKZFWDHmrge8LYsnc= ]
(✓) templ [ location=/home/user/go-code/bin/templ version=v0.2.778 ]

Desktop (please complete the following information):

  • OS: all
  • templ CLI version (templ version): v0.2.778
  • Go version (go version): go version go1.23.2 X:nocoverageredesign linux/amd64
  • gopls version (gopls version):golang.org/x/tools/gopls v0.14.2

Additional context

@kanstantsin-chernik
Copy link
Author

A couple of questions:

  • Should Templ support this example by not reusing context or sync rendering is by design?
  • If it is by design, can we at least introduce something like WithoutContext to reset it before we launch go routines?

@kanstantsin-chernik kanstantsin-chernik changed the title Race Condition with rendering parts in goroutine sharing same context Race Condition with rendering parts in goroutine with templ-initialized context Nov 1, 2024
@kanstantsin-chernik kanstantsin-chernik changed the title Race Condition with rendering parts in goroutine with templ-initialized context Race Condition with rendering parts in goroutines with templ-initialized context Nov 4, 2024
@a-h
Copy link
Owner

a-h commented Nov 12, 2024

Hi, just looking into this. Wouldn't you also want to put a mutex around the io.Writer that you're writing to, in order to avoid the danger of multiple components rendering to the writer at the same time?

If you did that, I don't think you would have a race condition. The suspense example in the repo uses a for loop over a channel which means that although data is fetched in parallel, the rendering to the response is sequential, which is typically what you want to prevent multiple writers writing data at the same time.

So, I don't really understand what you're trying to do. Are you thinking that you could get better performance by rendering multiple parts of a page in parallel, or something else?

If it's that you think you'll get better performance by rendering individual sections, then I think you could avoid the race condition by having some HTTP middleware that initializes the context as it receives a request, to enable that, but I suspect that the overhead of creating buffers and copying them around would negate any benefits.

@kanstantsin-chernik
Copy link
Author

kanstantsin-chernik commented Nov 12, 2024

Hi, thanks for taking a look at it!

Lets start with why we are doing this. In our case, we have a lot of heavy snippets rendered in parallel but written sequentially. We saw a 50-60ms saving on paralleling the rendering. Mutex around writer could be added in the async renderer in case of suspence, but for us the order matters. So we do see better performance with rendering in parallel.

If it's that you think you'll get better performance by rendering individual sections, then I think you could avoid the race condition by having some HTTP middleware that initializes the context as it receives a request, to enable that

Unfortunately, it is exactly what doesn't work here. If the context is initialized, than every goroutine would receive the same initialized version of context with the same pointer to Children. Then each of the parallel goroutines starts reading and cleaning Children in a shared pointer, hence the race condition. What we need is to set a tombstone in the context before handing it over to goroutines and make all the goroutines initialize their context again with their own Children pointer

@a-h a-h linked a pull request Jan 1, 2025 that will close this issue
@a-h
Copy link
Owner

a-h commented Jan 1, 2025

So, the objective is to render various sections to sub-buffers, then to render them out in the expected sequence.

I can see how that could, in some cases, improve the performance.

However, @joerdav's comments on the PR are correct. The context design currently expects single threaded use, and Once, script rendering and others would require some sort of mutex on the existing context value to function correctly.

For example:

https://github.com/a-h/templ/blob/8e2db03531d9f905c69136dac1b64ee99fc9aa0f/runtime.go#L502C1-L515C2

It expects that it can read and write to the context without a mutex.

templ/once.go

Lines 53 to 62 in 8e2db03

return ComponentFunc(func(ctx context.Context, w io.Writer) (err error) {
_, v := getContext(ctx)
if v.getHasBeenRendered(o) {
return nil
}
v.setHasBeenRendered(o)
if o.c != nil {
return o.c.Render(ctx, w)
}
return GetChildren(ctx).Render(ctx, w)

We could consider updating the way that the context works so that children are separate.

This would allow us to have something like this:

package main

import (
	"bytes"
	"context"
	"errors"
	"io"
	"os"
	"sync"

	"github.com/a-h/templ"
)

func main() {
	// Create a new component.
	page := Page()

	// Render the component.
	page.Render(context.Background(), os.Stdout)
}

func Concurrent(components ...templ.Component) templ.Component {
	return ConcurrentComponent{Components: components}
}

type ConcurrentComponent struct {
	Components []templ.Component
}

func (c ConcurrentComponent) Render(ctx context.Context, w io.Writer) error {
	waitGroups := make([]sync.WaitGroup, len(c.Components))
	buffers := make([]*bytes.Buffer, len(c.Components))
	errs := make([]error, len(c.Components))

	// Start rendering.
	for i := range c.Components {
		waitGroups[i].Add(1)
		buffers[i] = new(bytes.Buffer)
		go func(i int) {
			defer waitGroups[i].Done()
			errs[i] = c.Components[i].Render(ctx, buffers[i])
		}(i)
	}

	// Render the results.
	for i := range c.Components {
		waitGroups[i].Wait()
		if errs[i] != nil {
			break
		}
		_, errs[i] = io.Copy(w, buffers[i])
	}

	return errors.Join(errs...)
}

And use it like this:

templ Page() {
	<!DOCTYPE html>
	<html>
		<head>
			<title>Page</title>
		</head>
		<body>
			<h1>Page</h1>
			@Concurrent(ExpensiveChild(1), ExpensiveChild(2), ExpensiveChild(3), ExpensiveChild(4), ExpensiveChild(5), ExpensiveChild(6), ExpensiveChild(7), ExpensiveChild(8), ExpensiveChild(9), ExpensiveChild(10))
		</body>
	</html>
}

templ ExpensiveChild(i int) {
	@templ.Flush() {
		<div>
			<h2>Expensive Child { strconv.Itoa(i) }</h2>
			@ExpensiveComponent()
		</div>
	}
}

func ExpensiveComponent() templ.Component {
	return templ.ComponentFunc(func(ctx context.Context, w io.Writer) error {
		time.Sleep(1 * time.Second)
		_, err := w.Write([]byte("Expensive Component"))
		return err
	})
}

With the expected output.

<!doctype html><html><head><title>Page</title></head><body><h1>Page</h1><div><h2>Expensive Child 1</h2>Expensive Component</div><div><h2>Expensive Child 2</h2>Expensive Component</div><div><h2>Expensive Child 3</h2>Expensive Component</div><div><h2>Expensive Child 4</h2>Expensive Component</div><div><h2>Expensive Child 5</h2>Expensive Component</div><div><h2>Expensive Child 6</h2>Expensive Component</div><div><h2>Expensive Child 7</h2>Expensive Component</div><div><h2>Expensive Child 8</h2>Expensive Component</div><div><h2>Expensive Child 9</h2>Expensive Component</div><div><h2>Expensive Child 10</h2>Expensive Component</div></body></html>

Really, we should also adjust the children behaviour. At the moment, you get one child component, but it would probably be nicer syntax to do:

@Concurrent() {
  <div>Level 1</div>
  <div>Level 2</div>
}

i.e. we return a slice of components (as a type that implements templ.Component).

I've added a draft PR above. Using a replace directive in the go.mod file of that example, I could run go run -race . and it didn't trigger the race check.

There's probably some performance implication to this change.

@joerdav - you commented on the PR, just wanted to check you had seen this issue. 😁 Would love your thoughts on it. Might be one to discuss on a call.

@joerdav
Copy link
Collaborator

joerdav commented Jan 2, 2025

It all looks good to me. I'd expect the performance implications to be minimal and only really come into play in these very niche scenarios where concurrent renders are using the same ctx.

A couple of things come to mind, on a bit of a tangent, I do think that children being a []Component is a useful idea...

  • I wonder if then the @ syntax should be able to render a slice of components?
  • Is then the { children... } syntax unneeded? We can just do @templ.GetChildren(ctx)

In general though, the addition of the mutex, I can't see any issues here besides one.. this could mean that the rendered result is not going to always be consistend. Imagine you have 2 components Foo and Bar, both render a component which contains a Once, our page renders Foo above Bar but does so concurrently. Templ will correctly only render our Once one time, however there is a bit of a race, will the Once render in Foo or Bar, it's undetermined. Maybe this is okay?

@a-h
Copy link
Owner

a-h commented Jan 2, 2025

Yes, I think the { children... } syntax could be retired, which would be a good thing, because it's one less piece of magic. An update to templ fmt could remove it from files where it exists.

I thought templ.GetChildren could return a type ComponentSlice []Component which would implement templ.Component by having a Render function, but your idea would work too.

With ComponentSlice, it's still a templ.Component, so there would be no breaking changes, but it would be possible to inspect the type, e.g. components, isComponentSlice := c.(templ.ComponentSlice) if specific handling of the children was wanted.

Good point on the race condition for Once, I hadn't thought of that! Similarly, there will be cases where a script needs to be rendered to the output before the HTML, and if happens out of order, that would mess with it...

A concurrent renderer would need to take that into account somehow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants